Changeset 3522


Ignore:
Timestamp:
Sep 18, 2013, 4:30:04 PM (6 years ago)
Author:
ksherdy
Message:

General cleanup and editting of portions of the intro section.

Location:
docs/Working/re
Files:
3 edited

Legend:

Unmodified
Added
Removed
  • docs/Working/re/ppopp-re.tex

    r3516 r3522  
    9191Thompson \cite{thompson1968} is credited with the first construction to convert regular expressions
    9292to nondeterministic finite automata (NFA).
    93 Following Thompson's approach, a regular expression of length $m$ is first converted
    94 to an NFA with $O(m)$ nodes. It is then possible to search a text of length $n$ using the
     93Following Thompson's approach, a regular expression of length $m$ is converted
     94to an NFA with $O(m)$ states. It is then possible to search a text of length $n$ using the
    9595NFA in worst case $O(mn)$ time. Often, a more efficient choice
    96 is to convert the NFA into a DFA. A DFA has only a single active state at any time
    97 in the matching process and
    98 hence it is possible to search a text at of length $n$ in $O(n)$ time.
    99 However, it is well known that the conversion of an NFA to an equivalent DFA may result
    100 in state explosion. That is, the number of resultant DFA states may increase exponentially.
    101 In \cite{baeza1992new}  text searching was
    102 proposed based on bit-parallelism.
    103 The technique takes advantage of the intrinsic parallelism of bitwise operations
    104 within a computer word. Given a $w$-bit word, the Shift-Or algorithm \cite{Baeza-yates_anew} algorithm uses the
    105 bit-parallel approach to
    106 simulate an NFA in $O(nm/w)$ worst-case time.
    107 
    108 A disadvantage of the bit-parallel Shift-Or pattern matching approach
    109 in comparison to simple string matching algorithms is an inability to skip input characters.
    110 For example, the Boyer-Moore family of algorithms \cite{boyer1977fast} skip input characters
    111 to achieve sublinear times in the average case. Backward Dawg Matching
    112 (BDM) string matching algorithms \cite{crochemore1994text} based on suffix automata are able to skip characters.
     96is to convert an NFA into a DFA. A DFA has a single active state at any time
     97throughout the matching process and
     98hence it is possible to search a text at of length $n$ in $O(n)$ time
     99\footnote{It is well known that the conversion of an NFA to an equivalent DFA may result
     100in state explosion. That is, the number of resultant DFA states may increase exponentially.}.
     101
     102A significant portion of the research in fast regular expression matching can be
     103regarded as the ``quest for efficient automata'' \cite{navarro98fastand}.
     104In \cite{baeza1992new}, Baeza-Yates and Gonnet
     105presented a new approach to string search based on bit-parallelism.
     106This technique takes advantage of the intrinsic parallelism of bitwise operations
     107within a computer word.
     108Given a $w$-bit word, the number of operations that a string search algorithms
     109performs can be reduced by a factor $w$.
     110Using this fact, the Shift-Or algorithm simulates an NFA using
     111bitwise operations and achieves $O(nm/w)$ worst-case time \cite{navarro2000}.
     112A disadvantage of the bit-parallel Shift-Or pattern matching approach
     113is an inability to skip input characters.
     114Simple string matching algorithms,
     115such as the Boyer-Moore family of algorithms \cite{boyer1977fast, horspool1980practical} skip input characters
     116to achieve sublinear times in the average case.
     117Backward Dawg Matching (BDM) string matching algorithms \cite{crochemore1994text}
     118based on suffix automata are able to skip characters.
    113119The Backward Nondeterministic Dawg Matching (BNDM) pattern matching algorithm \cite{wu1992fast}
    114 combines the bit-parallel advantages of Shift-Or and with the character skipping advantages of the BDM algorithm.
    115 The nrgrep pattern matching tool is built over the BNDM algorithm,
    116 and hence the name nrgrep \cite{navarro2000}.
    117 
    118 There has been considerable interest in using parallelization techniques
    119 to improve the performance of regular expression matching on parallel hardware
     120combines the bit-parallel advantages of the Shift-Or approach
     121with the character skipping property of BDM algorithms. The nrgrep pattern matching tool,
     122is built over the BNDM algorithm. Prior to the data-parallel approach to simultaneous
     123processing of data stream elements as discussed herein, nrgrep
     124was by far the fastest grep tool
     125for matching complex patterns and achieved similar performance
     126to that of the fastest existing string
     127matching tools for simple patterns \cite{navarro2000}.
     128
     129There has been considerable
     130interest in accelerating regular expression matching
     131on parallel hardware
    120132such as multi-core processors (CPUs), graphics processing units (GPUs),
    121133field-programmable gate arrays (FPGAs), and specialized architectures such as
     
    128140performs small-ruleset regular expression matching at a rate of
    1291412.88 Gbps per chip on Intel Xeon E5472 hardware \cite{pasetto2010}.
    130 Naghmouchi et al. demonstrated that the Aho-Corasick (AC)
     142Naghmouchi et al. \cite{scarpazza2011top, naghmouchi2010} demonstrated that the Aho-Corasick (AC)
    131143string matching algorithm \cite{aho1975} is well suited for parallel
    132 implementation on multi-core CPUs, GPUs and the Cell BE \cite{scarpazza2011top, naghmouchi2010}.
    133 On each hardware, both thread-level parallelism (additional cores) and data-level parallelism
    134 (wide SIMD units) are leveraged for performance.
    135 Salapura et. al., advocated the use of vector-style processing for regular expressions
     144implementation on multi-core CPUs, GPUs and the Cell BE.
     145On each hardware, both thread-level parallelism (cores) and data-level parallelism
     146(SIMD units) were leveraged for performance.
     147Salapura et. al. \cite{salapura2012accelerating} advocated the use of vector-style processing for regular expressions
    136148in business analytics applications and leveraged the SIMD hardware available
    137 on multi-core processors to acheive a speedup of better than 1.8 over a
    138 range of data sizes of interest \cite{salapura2012accelerating}.
     149on multi-core processors to acheive a speedup of greater than 1.8 over a
     150range of data sizes of interest.
    139151%Cell
    140152In \cite{scarpazza2008}, Scarpazza and Russell presented a SIMD tokenizer
    141 that delivered 1.00–1.78 Gbps on a single
     153that delivered 1.00--1.78 Gbps on a single
    142154Cell BE chip and extended this approach for emulation on the Intel Larrabee
    143155instruction set \cite{scarpazza2009larrabee}.
    144156On the Cell BE, Scarpazza \cite{scarpazza2009cell} described a pattern matching
    145157implementation that delivered a throughput of 40
    146 Gbps for a small dictionary of approximately 100 patterns, and a throughput of 3.3-3.4
     158Gbps for a small dictionary of approximately 100 patterns and a throughput of 3.3-3.4
    147159Gbps for a larger dictionary of thousands of patterns. Iorio and van Lunteren \cite{iorio2008}
    148 presented a string matching implementation for automata that achieves
     160presented a string matching implementation for automata that achieved
    1491614 Gbps on the Cell BE.
    150162% GPU
  • docs/Working/re/reference.bib

    r3490 r3522  
    323323 address = {Berlin, Heidelberg},
    324324}
     325
     326@article{navarro98fastand,
     327    author = {Gonzalo Navarro and Mathieu Raffinot},
     328    title = {Fast and Flexible String Matching by Combining Bit-parallelism and Suffix Automata},
     329    journal = {ACM JOURNAL OF EXPERIMENTAL ALGORITHMICS (JEA)},
     330    year = {1998},
     331    volume = {5},
     332    pages = {2000}
     333}
     334
     335@article{horspool1980practical,
     336  title={Practical fast searching in strings},
     337  author={Horspool, R Nigel},
     338  journal={Software: Practice and Experience},
     339  volume={10},
     340  number={6},
     341  pages={501--506},
     342  year={1980},
     343  publisher={Wiley Online Library}
     344}
Note: See TracChangeset for help on using the changeset viewer.