Changeset 4447
 Timestamp:
 Feb 2, 2015, 12:40:45 PM (4 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

docs/Working/icGrep/paradigm.tex
r4446 r4447 1 1 \section{Bitwise Data Parallel Paradigm and Methods}\label{sec:seqpar} 2 2 3 The introduction of the method of bitwise data parallelism adds a 4 fundamentally new paradigm for regular expression matching to 5 complement the traditional approaches using DFAs, NFAs or backtracking. 6 Whereas the traditional approaches are all sequential in nature, 7 performing some form of state transition processing on one input 8 code unit at a time, the bitwise data parallel approach takes a 9 conceptually parallel view of the input stream. Rather than 10 parallelizing existing sequential approaches, the method introduces 11 parallelism from the ground up. 12 13 As a new paradigm, there is much research to do in building upon 14 the basic framework, characterizing performance depending on 15 input patterns and texts, developing methods for special cases and 16 so on. Here we make some small contributions to the general 17 framework and methods before moving on to discuss Unicode issues. 18 19 One important aspect of the bitwise data parallel approach 20 is transposition of input data. In previous work, the Parabix transform 21 has been reported as imposing an amoritized cost of 1.1 CPU cycles/input byte, 22 when working with SSE2 \cite{}. This is consistent with {\tt icGrep} 23 results. However, the cost of the cost of this transposition can 24 be hidden through multithreading and pipeline parallelism, having one 25 core perform work ahead performing transposition, while another 26 comes behind to perform matching. We discuss this further in 27 Section \ref{architecture}. 28 29 The treatment of character and character class recognition is 30 another area of fundamental difference between the traditional 31 sequential methods and the bitwise approach. It is here that 32 the clearest separation of the sequential and parallel approaches 33 occurs. In the sequential approaches, characters are processed 34 sequentially with table lookups or jump tables used for each 35 transition. In the bitwise data parallel approach, all calculations 36 of character class bit streams are done completely in parallel using 37 bitwise logic. 38 39 In the bitwise paradigm, the MatchStar operation elegantly finds 40 all possible matches for Kleene* repetitions of characters or 41 character classes using a single longstream addition operation. 42 Interestingly, the MatchStar operation also has application to 43 parallelized longstream addition\cite{}, as well as use in Myers 44 bitparallel edit distance algorithm\cite{}. In the next section, 45 we show how MatchStar can be extended for UTF8 sequences. 46 47 We have incorporated an elegant technique for bounded repetitions 48 in icGrep. This technique allows the matches to $C{m,n}$ for some 49 character class $C$, lower bound $m$ and upper bound $n$ to be determined 50 in $\lceil{\log_2 m}\rceil + \lceil{\log_2{nm}}\rceil$ steps. 51 Let $C_k$ be the bit stream identifying positions at which the $k$ 52 prior input bytes are all in $C$. Then the observation that 53 $C_2k = C_k \wedge (C_k \mbox{\tt <<} k)$ enables positions meeting the lower 54 bound to be determined. An upper bound $k$ similarly involves excluding 55 those positions not within $k$ positions of the pending markers 56 (from the previous match step). 57 58 A final general technique worth mentioning is that related to 59 input skipping. For sequential matching, 60 the BoyerMoore method is well known for the 61 possible skipping through input positions of up to the length of the 62 search string for fixedstring search. NRgrep \cite{navarro2001nr} 63 extends this skipping to regular expression search using the BNDM 64 (backward nondeterminstic dawg matching) method. Is there an 65 inputskipping method for the bitwise parallel paradigm? The answer 66 is yes: whenever the bit vector of match positions in play for the current 67 input block reduce to all zero, the remainder of the pattern can be 68 skipped for processing the block. This method has been implemented in icGrep. 69 70 71 72
Note: See TracChangeset
for help on using the changeset viewer.