Changeset 3513


Timestamp: Sep 15, 2013, 8:04:16 PM
Author: cameron
Message: Final edits
Location: docs/Working/re
Files: 2 edited

  • docs/Working/re/abstract.tex

    r3469 r3513  
     8  8   128-bit SSE2 SIMD technology, our algorithm implementations can substantially
     9  9   outperform traditional grep implementations based on NFAs, DFAs or
    10    - backtracking.  The algorithms are also designed to scale with the availability
       10 + backtracking.  5X or better performance advantage against the
       11 + best of available competitors is not atypical.
       12 + The algorithms are also designed to scale with the availability
    11 13   of additional parallel resources such as the wider SIMD facilities (256-bit)
    12    - of Intel AVX2 or future 512-bit extensions.   Our GPU implementations show
    13    - further acceleration limited only by data transfer speed.
       14 + of Intel AVX2 or future 512-bit extensions.   Our AVX2 implementation
       15 + showed dramatic reduction in instruction count and significant
       16 + improvement in speed.   Our GPU implementations show
       17 + further acceleration limited primarily by data transfer speed.
    14 18
    15 19
  • docs/Working/re/conclusion.tex

    r3486 r3513  
     8  8   found in Perl-compatible backtracking implementations.
     9  9   Taking advantage of the SIMD features available on commodity
    10    - processors, its implementation in a grep too offers consistently good performance in
    11    - contrast to available alternatives.   While lacking some
       10 + processors, its implementation in a grep offers consistently
       11 + good performance in contrast to available alternatives.
       12 + For moderately complex expressions, 10X or better
       13 + performance advantages over GNU grep and 5X performance
       14 + advantage over nrgrep were frequently seen.
       15 + While lacking some
    12 16   special optimizations found in other engines to deal with
    13 17   repeated substrings or to perform skipping actions based
    14 18   on fixed substrings, it nevertheless performs competitively
    15    - in all cases.  The algorithm tends to scale very well with regular
    16    - expression complexity, often with order-of-magnitude
    17    - performance advantage over even the best of its competitors.
       19 + in all cases.
    18 20
    19 21   A parallelized algorithm for long-stream addition has also