Changes between Version 1 and Version 2 of ICgrep


Ignore:
Timestamp:
Feb 28, 2015, 9:15:34 AM (4 years ago)
Author:
cameron
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ICgrep

    v1 v2  
    1 == ICgrep:  Blazingly Fast Regular Expression Search ==
     1== ICgrep:  The Modern Grep Replacement ==
    22
    3 Building off the Parabix transform representation of text, icGrep embodies a
    4 completely new algorithmic approach to high-performance regular expression
    5 matching.   In contrast to the byte-at-a-time approach of NFA, DFA, and backtracking matchers, icGrep
    6 processes UTF-8 input streams 128 code units at a time (using SSE2 technology, or 256 code
    7 units at a time with Intel's new AVX2 instructions).
     3Building off the Parabix transform representation of text as well as
     4the highly-regarded LLVM compiler infrastructure, icGrep is
     5the modern replacement for the venerable grep utility, offering
     6fundamentally parallel and consistently high-performance search
     7as well as broad support for Unicode regular expressions.
     8
     9ICgrep embodies a completely new algorithmic approach to high-performance regular expression
     10matching.   In contrast to the byte-at-a-time approach of NFA, DFA, and backtracking matchers,
     11icGrep 1.0 processes UTF-8 input streams 128 code units at a time using the SSE2 technology available on
     12all x86-64 processors.   The fundamental notion underlying icGrep is that of bitwise
     13data parallelism as described in our 2014 PACT paper:
     14Cameron, Robert D., Thomas C. Shermer, Arrvindh Shriraman, Kenneth S. Herdy, Dan Lin, Benjamin R. Hull, and Meng Lin. "Bitwise data parallelism in regular expression matching." In Proceedings of the 23rd international conference on Parallel architectures and compilation, pp. 139-150. ACM, 2014.
     15
     16== Predictable, Consistent Performance ==
     17
     18=== ICgrep 1.0 Performance ===
     19
     20While GNU grep and other grep utilities often exhibit dramatic slowdown
     21for complex regular expressions involving ambiguous subexpressions, case
     22insensitive matching or large Unicode character classes, icGrep
     23offers consistent high performance in all cases.   In many such cases,
     24icGrep is more than 100X faster than the competition, although a 5-10X ratio is
     25more common.   
     26
     27ICgrep has a significant compilation overhead that may dominate searches in
     28short files of less than 10MB in size.
     29
     30ICgrep also has the overhead of applying the Parabix transform to input text,
     31roughly about 1 CPU cycle per byte on modern processors.   However, once
     32transformed, the bitwise data parallel methods of icGrep often allow searches
     33to conclude within an additional 1 CPU cycle per byte, much faster than is
     34possible with a traditional byte-at-a-time loop.
     35
     36For simple, fixed string searches, grep and other utilities can employ Boyer-Moore type
     37algorithms to sometimes outperform icGrep.   But not by much.
     38
     39=== Performance Roadmap ===
     40
     41ICgrep 1.0 is a single-threaded search tool relying on widely available SSE2 technology
     42and the LLVM 3.5 compiler infrastructure.   However, our research work has shown that
     43we can accelerate icGrep using multithreading with pipeline parallelism, wider SIMD technologies
     44such as the 256-bit AVX2 technology on current Intel processors, and improved compiler support
     45for the Parabix primitives in our custom LLVM version.
    846
    947== Unicode Support ==
    1048
    11 icGrep accepts ASCII or UTF-8 input files and provides a full suite of
     49=== ICgrep 1.0 ===
     50
     51ICgrep 1.0 accepts ASCII or UTF-8 input files and provides a full suite of
    1252Unicode processing features meeting the requirements of Unicode Level 1 support
    1353of [http://www.unicode.org/reports/tr18/ Unicode Technical Standard #18].
    1454See [wiki:IcGrepUnicodeLevel1 Unicode Level 1 Support in icGrep] for details.
    1555
    16 == Linux comparison: icGrep vs egrep ==
     56=== ICgrep Unicode Roadmap ===
    1757
    18 Here's an interesting example of >100X speedup vs. egrep.
     58We are pleased to be moving forward with implementation of Unicode Level 2 support
     59within icGrep with the support of a Google Faculty Research Award.
     60
     61== ICgrep Downloads ==
     62
     63Browse the [source:tags/icgrep1.0 icgrep 1.0 source] code!
     64
     65Download precompiled icgrep software from [icgrep.com http://icgrep.com].
     66
     67Download and build icgrep 1.0 (Linux or Mac OS X) using svn.
    1968
    2069{{{
    21 perf stat ./icgrep -c '[ ](0x)?([a-fA-F0-9][a-fA-F0-9])+[.:,?! ]' ../performance/data/howto
    22 130243
    23 
    24  Performance counter stats for './icgrep -c [ ](0x)?([a-fA-F0-9][a-fA-F0-9])+[.:,?! ] ../performance/data/howto':
    25 
    26          33.084992 task-clock                #    0.989 CPUs utilized         
    27                  6 context-switches          #    0.000 M/sec                 
    28                  0 CPU-migrations            #    0.000 M/sec                 
    29             12,132 page-faults               #    0.367 M/sec                 
    30        120,678,424 cycles                    #    3.648 GHz                     [75.94%]
    31         47,567,214 stalled-cycles-frontend   #   39.42% frontend cycles idle    [84.44%]
    32         40,414,969 stalled-cycles-backend    #   33.49% backend  cycles idle    [75.91%]
    33        182,642,376 instructions              #    1.51  insns per cycle       
    34                                              #    0.26  stalled cycles per insn [87.96%]
    35         14,480,982 branches                  #  437.690 M/sec                   [87.99%]
    36            509,098 branch-misses             #    3.52% of all branches         [80.41%]
    37 
    38        0.033460665 seconds time elapsed
     70svn co http://parabix.costar.sfu.ca/svn/tags/icgrep1.0
    3971}}}
    4072
     73Alternatively, download the development version.
     74
    4175{{{
    42 perf stat egrep -c '[ ](0x)?([a-fA-F0-9][a-fA-F0-9])+[.:,?! ]' ../performance/data/howto
    43 130243
    44 
    45  Performance counter stats for 'egrep -c [ ](0x)?([a-fA-F0-9][a-fA-F0-9])+[.:,?! ] ../performance/data/howto':
    46 
    47        4019.404953 task-clock                #    0.996 CPUs utilized         
    48                342 context-switches          #    0.000 M/sec                 
    49                  5 CPU-migrations            #    0.000 M/sec                 
    50                321 page-faults               #    0.000 M/sec                 
    51     14,845,377,498 cycles                    #    3.693 GHz                     [83.31%]
    52      2,246,155,037 stalled-cycles-frontend   #   15.13% frontend cycles idle    [83.34%]
    53        699,469,381 stalled-cycles-backend    #    4.71% backend  cycles idle    [66.69%]
    54     36,936,781,254 instructions              #    2.49  insns per cycle       
    55                                              #    0.06  stalled cycles per insn [83.34%]
    56      6,793,009,741 branches                  # 1690.054 M/sec                   [83.34%]
    57         20,084,331 branch-misses             #    0.30% of all branches         [83.34%]
    58 
    59        4.034369628 seconds time elapsed
     76svn co http://parabix.costar.sfu.ca/svn/icGREP/icgrep-devel
    6077}}}
    6178
    62 Browse the [source:icGREP/icgrep-devel/ source] code!