Changeset 3466 for docs/Working/re/ppoppre.tex
 Timestamp:
 Sep 12, 2013, 12:09:26 PM (6 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

docs/Working/re/ppoppre.tex
r3464 r3466 96 96 approach and its implementation on SIMD and GPU architectures. 97 97 This approach relies on a bitwise data parallel view of text 98 streams as well as a surprising use of addition to implement99 matching operations. The closest previous work is that100 underlying bitparallel XML parsing using 128bit SSE2 SIMD98 streams as well as a surprising use of addition to match 99 runs of characters in a single step. The closest previous 100 work is that underlying bitparallel XML parsing using 128bit SSE2 SIMD 101 101 technology together with a parallel scanning primitive also 102 102 based on addition \cite{cameron2011parallel}. … … 108 108 addition technique involving a further application of MatchStar 109 109 that enables us to scale the technique to $n$bit addition 110 in $log_{64} n$ steps. 110 in $log_{64} n$ steps. We ultimately apply this technique, 111 for example, to perform 112 synchronized 4096bit addition on GPU warps of 64 threads. 111 113 112 114 There is also a strong keyword match between the bitparallel 113 data streams used in our approach and the bitparallelism of 114 regular expression using the bitap and WuManber NFA 115 (nondeterministic finite automata) algorithms \cite{}. 115 data streams used in our approach and the bitparallelism 116 used for NFA state transitions in the classical algorithms of 117 Wu and Manber \cite{wu1992agrep}, BaezYates and Gonnet \cite{baeza1992new} 118 and Navarro and Raffinot \cite{navarro1998bit}. 116 119 However those algorithms use bitparallelism in a fundamentally 117 different way: to represent all possible current NFA states 118 as a bit vector and to perform byteatatime transitions 119 using the bitwise operations. Nevertheless, the agrep and 120 nrgrep programs implemented using these techniques remain 121 among the strongest competitors in overall performance 122 to our implementations. 123 120 different way: representing all possible current NFA states 121 as a bit vector and performing parallel transitions to a new 122 set of states using table lookups and bitwise logic. Whereas 123 our approach can match multiple characters per step, bitparallel 124 NFA algorithms proceed through the input one byte at a time. 125 Nevertheless, the agrep \cite{wu1992agrep} and 126 nrgrep \cite{navarro2000} programs implemented using these techniques remain 127 among the strongest competitors in regular expression matching 128 performance, so we include them in our comparative evaluation. 124 129 125 130 126 131 The remainder of this paper is organized as follows. 127 Section \ref{sec:bitwise} 132 Section \ref{sec:unbounded} presents our basic algorithm using 133 a model of unbounded bitwise data parallelism. 128 134 Section \ref{sec:regexp} 129 135 Section \ref{sec:analysis} … … 134 140 135 141 136 \section{ Unbounded Bitwise Data Parallelism}\label{sec:bitwise}142 \section{An Unbounded Model for Bitwise Regular Expression Matching}\label{sec:unbounded} 137 143 138 144 Whereas all traditional regular expression matching engines are … … 158 164 UTF16. In the case of a byte stream, the first step is to transpose 159 165 the byte stream into eight parallel bit streams, such that bit stream 160 $i$ comprises the $i $\thbit of each byte. These streams form166 $i$ comprises the $i^\text{th}$ bit of each byte. These streams form 161 167 a set of basis bit streams from which many other parallel bit 162 168 streams can be calculated, such as character class bit
Note: See TracChangeset
for help on using the changeset viewer.