Changeset 3522 for docs/Working/re/ppoppre.tex
 Timestamp:
 Sep 18, 2013, 4:30:04 PM (6 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

docs/Working/re/ppoppre.tex
r3516 r3522 91 91 Thompson \cite{thompson1968} is credited with the first construction to convert regular expressions 92 92 to nondeterministic finite automata (NFA). 93 Following Thompson's approach, a regular expression of length $m$ is firstconverted94 to an NFA with $O(m)$ nodes. It is then possible to search a text of length $n$ using the93 Following Thompson's approach, a regular expression of length $m$ is converted 94 to an NFA with $O(m)$ states. It is then possible to search a text of length $n$ using the 95 95 NFA in worst case $O(mn)$ time. Often, a more efficient choice 96 is to convert the NFA into a DFA. A DFA has only a single active state at any time 97 in the matching process and 98 hence it is possible to search a text at of length $n$ in $O(n)$ time. 99 However, it is well known that the conversion of an NFA to an equivalent DFA may result 100 in state explosion. That is, the number of resultant DFA states may increase exponentially. 101 In \cite{baeza1992new} text searching was 102 proposed based on bitparallelism. 103 The technique takes advantage of the intrinsic parallelism of bitwise operations 104 within a computer word. Given a $w$bit word, the ShiftOr algorithm \cite{Baezayates_anew} algorithm uses the 105 bitparallel approach to 106 simulate an NFA in $O(nm/w)$ worstcase time. 107 108 A disadvantage of the bitparallel ShiftOr pattern matching approach 109 in comparison to simple string matching algorithms is an inability to skip input characters. 110 For example, the BoyerMoore family of algorithms \cite{boyer1977fast} skip input characters 111 to achieve sublinear times in the average case. Backward Dawg Matching 112 (BDM) string matching algorithms \cite{crochemore1994text} based on suffix automata are able to skip characters. 96 is to convert an NFA into a DFA. A DFA has a single active state at any time 97 throughout the matching process and 98 hence it is possible to search a text at of length $n$ in $O(n)$ time 99 \footnote{It is well known that the conversion of an NFA to an equivalent DFA may result 100 in state explosion. That is, the number of resultant DFA states may increase exponentially.}. 101 102 A significant portion of the research in fast regular expression matching can be 103 regarded as the ``quest for efficient automata'' \cite{navarro98fastand}. 104 In \cite{baeza1992new}, BaezaYates and Gonnet 105 presented a new approach to string search based on bitparallelism. 106 This technique takes advantage of the intrinsic parallelism of bitwise operations 107 within a computer word. 108 Given a $w$bit word, the number of operations that a string search algorithms 109 performs can be reduced by a factor $w$. 110 Using this fact, the ShiftOr algorithm simulates an NFA using 111 bitwise operations and achieves $O(nm/w)$ worstcase time \cite{navarro2000}. 112 A disadvantage of the bitparallel ShiftOr pattern matching approach 113 is an inability to skip input characters. 114 Simple string matching algorithms, 115 such as the BoyerMoore family of algorithms \cite{boyer1977fast, horspool1980practical} skip input characters 116 to achieve sublinear times in the average case. 117 Backward Dawg Matching (BDM) string matching algorithms \cite{crochemore1994text} 118 based on suffix automata are able to skip characters. 113 119 The Backward Nondeterministic Dawg Matching (BNDM) pattern matching algorithm \cite{wu1992fast} 114 combines the bitparallel advantages of ShiftOr and with the character skipping advantages of the BDM algorithm. 115 The nrgrep pattern matching tool is built over the BNDM algorithm, 116 and hence the name nrgrep \cite{navarro2000}. 117 118 There has been considerable interest in using parallelization techniques 119 to improve the performance of regular expression matching on parallel hardware 120 combines the bitparallel advantages of the ShiftOr approach 121 with the character skipping property of BDM algorithms. The nrgrep pattern matching tool, 122 is built over the BNDM algorithm. Prior to the dataparallel approach to simultaneous 123 processing of data stream elements as discussed herein, nrgrep 124 was by far the fastest grep tool 125 for matching complex patterns and achieved similar performance 126 to that of the fastest existing string 127 matching tools for simple patterns \cite{navarro2000}. 128 129 There has been considerable 130 interest in accelerating regular expression matching 131 on parallel hardware 120 132 such as multicore processors (CPUs), graphics processing units (GPUs), 121 133 fieldprogrammable gate arrays (FPGAs), and specialized architectures such as … … 128 140 performs smallruleset regular expression matching at a rate of 129 141 2.88 Gbps per chip on Intel Xeon E5472 hardware \cite{pasetto2010}. 130 Naghmouchi et al. demonstrated that the AhoCorasick (AC)142 Naghmouchi et al. \cite{scarpazza2011top, naghmouchi2010} demonstrated that the AhoCorasick (AC) 131 143 string matching algorithm \cite{aho1975} is well suited for parallel 132 implementation on multicore CPUs, GPUs and the Cell BE \cite{scarpazza2011top, naghmouchi2010}.133 On each hardware, both threadlevel parallelism ( additionalcores) and datalevel parallelism134 ( wide SIMD units) are leveraged for performance.135 Salapura et. al. ,advocated the use of vectorstyle processing for regular expressions144 implementation on multicore CPUs, GPUs and the Cell BE. 145 On each hardware, both threadlevel parallelism (cores) and datalevel parallelism 146 (SIMD units) were leveraged for performance. 147 Salapura et. al. \cite{salapura2012accelerating} advocated the use of vectorstyle processing for regular expressions 136 148 in business analytics applications and leveraged the SIMD hardware available 137 on multicore processors to acheive a speedup of better than 1.8 over a138 range of data sizes of interest \cite{salapura2012accelerating}.149 on multicore processors to acheive a speedup of greater than 1.8 over a 150 range of data sizes of interest. 139 151 %Cell 140 152 In \cite{scarpazza2008}, Scarpazza and Russell presented a SIMD tokenizer 141 that delivered 1.00 â1.78 Gbps on a single153 that delivered 1.001.78 Gbps on a single 142 154 Cell BE chip and extended this approach for emulation on the Intel Larrabee 143 155 instruction set \cite{scarpazza2009larrabee}. 144 156 On the Cell BE, Scarpazza \cite{scarpazza2009cell} described a pattern matching 145 157 implementation that delivered a throughput of 40 146 Gbps for a small dictionary of approximately 100 patterns ,and a throughput of 3.33.4158 Gbps for a small dictionary of approximately 100 patterns and a throughput of 3.33.4 147 159 Gbps for a larger dictionary of thousands of patterns. Iorio and van Lunteren \cite{iorio2008} 148 presented a string matching implementation for automata that achieve s160 presented a string matching implementation for automata that achieved 149 161 4 Gbps on the Cell BE. 150 162 % GPU
Note: See TracChangeset
for help on using the changeset viewer.