Sep 13, 2013, 1:16:01 AM (6 years ago)

Added a brief history section. Partially added a review of state-of-the-art section.

1 edited


  • docs/Working/re/ppopp-re.tex

    r3467 r3468  
    8585of string patterns has a long history and
    8686remains a pervasive technique throughout computing applications today.
    87 {\em a brief history}
     87% {\em a brief history}
     88The origins of regular expression matching date back to automata theory
     89developed by Kleene in the 1950s \cite{kleene1951}.
     90Thompson \cite{thompson1968} is credited with the first construction to convert regular expressions
     91to nondeterministic finite automata (NFA).
     92Following Thompson's approach, a regular expression of length m is first converted
     93to an NFA with O(m) nodes. It is then possible to search a text of length n using the
     94NFA in worst case O(mn) time. Often, a more efficient choice
     95is to convert the NFA into a DFA. A DFA has only a single active state at any time
     96in the matching process and
     97hence it is possible to search a text at of length n in worst-case O(n) optimal.
     98However, it is well known that the conversion of an NFA to an equivalent DFA may result
     99in state explosion. That is, the number of resultant DFA states may increase exponentially.
     100In \cite{Baeza-yates_anew} a new approach to text searching was proposed based on bit-parallelism \cite{baeza1992new}.
     101This technique takes advantage of the intrinsic parallelism of bitwise operations
     102within a computer word. Given a w-bit word, the Shift-Or algorithm \cite{Baeza-yates_anew} algorithm uses the
     103bit-parallel approach to
     104simulate an NFA in O($nm/w$) worst-case time.
     106A disadvantage of the bit-parallel Shift-Or pattern matching approach
     107in comparison to simple string matching algorithms is an inability to skip input characters.
     108For example, the Boyer-Moore family of algorithms \cite{boyer1977fast} skip input characters
     109to achieve sublinear times in the average case. Backward Dawg Matching
     110(BDM) string matching algorithms \cite{crochemore1994text} based on suffix automata are able to skip characters.
     111The Backward Nondeterministic Dawg Matching (BNDM) pattern matching algorithm \cite{wu1992fast}
     112combines the bit-parallel advantages of Shift-Or and with the character skipping advantages of the BDM algorithm.
     113The nrgrep pattern matching tool is built over the BNDM algorithm,
     114and hence the name nrgrep \cite{navarro2000}.
    89116There has been considerable interest in using parallelization techniques
    90117to improve the performance of regular expression matching.
    91 {\em a brief review}  \cite{scarpazza2011top}
    92 GPUs\cite{lin2013accelerating}
    93 FPGAs\cite{lee2007high}
     118{\em a brief review} 
     120Scarpazza and Braudaway demonstrate that irregular,
     121control-flow dominated text processing algorithms can be efficiently
     122executed on multicore hardware \cite{scarpazza2008fast}.
     123In related work, Pasetto et al. present a flexible tool that
     124performs small-ruleset regular expression matching at a rate of
     1252.88 Gbps per chip on Intel Xeon E5472 hardware \cite{pasetto2010}.
     126Naghmouchi et al. demonstrate that the Aho-Corasick
     127string matching algorithm is well suited for implementation on commodity multicore, the
     128Cell Broadband Engine and GPGPU hardware \cite{scarpazza2011top, naghmouchi2010}. In each case, both thread-
     129level (additional cores) and data-level (SIMD units) hardware features are leveraged for performance.
     130On the Cell Broadband Engine, Scarpazza’s implementation delivers a throughput of 40
     131Gbps for a small dictionary size of approximately 100 patterns, and a throughput of 3.3-3.4
     132Gbps for a larger dictionary of tens of thousands of patterns.
     133Scarpazza and Russell present a SIMD tokenizer that delivers 1.00–1.78 Gbps on a single
     134Cell Broadband Engine chip \cite{scarpazza2008} and extend this approach for emulation on the Intel Larrabee
     135instruction set \cite{scarpazza2009larrabee}. Iorio and van Lunteren describe a string matching implementation for
     136automata that achieves 4 Gbps on the Cell Broadband Engine \cite{iorio2008}.
     137In more recent work,
    95 Cell BE\cite{scarpazza2011top}
     144% GPUs\cite{lin2013accelerating}
    99147Whereas the existing approaches to parallelization have been
Note: See TracChangeset for help on using the changeset viewer.