Changeset 4447 for docs

Feb 2, 2015, 12:40:45 PM (3 years ago)

Section 3

1 edited


  • docs/Working/icGrep/paradigm.tex

    r4446 r4447  
    11\section{Bitwise Data Parallel Paradigm and Methods}\label{sec:seqpar}
     3The introduction of the method of bitwise data parallelism adds a
     4fundamentally new paradigm for regular expression matching to
     5complement the traditional approaches using DFAs, NFAs or backtracking.
     6Whereas the traditional approaches are all sequential in nature,
     7performing some form of state transition processing on one input
     8code unit at a time, the bitwise data parallel approach takes a
     9conceptually parallel view of the input stream.   Rather than
     10parallelizing existing sequential approaches, the method introduces
     11parallelism from the ground up.   
     13As a new paradigm, there is much research to do in building upon
     14the basic framework, characterizing performance depending on
     15input patterns and texts, developing methods for special cases and
     16so on.   Here we make some small contributions to the general
     17framework and methods before moving on to discuss Unicode issues.
     19One important aspect of the bitwise data parallel approach
     20is transposition of input data.   In previous work, the Parabix transform
     21has been reported as imposing an amoritized cost of 1.1 CPU cycles/input byte,
     22when working with SSE2 \cite{}.  This is consistent with {\tt icGrep}
     23results.  However, the cost of the cost of this transposition can
     24be hidden through multithreading and pipeline parallelism, having one
     25core perform work ahead performing transposition, while another
     26comes behind to perform matching.  We discuss this further in
     27Section \ref{architecture}.
     29The treatment of character and character class recognition is
     30another area of fundamental difference between the traditional
     31sequential methods and the bitwise approach.   It is here that
     32the clearest separation of the sequential and parallel approaches
     33occurs.   In the sequential approaches, characters are processed
     34sequentially with table lookups or jump tables used for each
     35transition.   In the bitwise data parallel approach, all calculations
     36of character class bit streams are done completely in parallel using
     37bitwise logic. 
     39In the bitwise paradigm, the MatchStar operation elegantly finds
     40all possible matches for Kleene-* repetitions of characters or
     41character classes using a single long-stream addition operation.
     42Interestingly, the MatchStar operation also has application to
     43parallelized long-stream addition\cite{}, as well as use in Myers
     44bit-parallel edit distance algorithm\cite{}.  In the next section,
     45we show how MatchStar can be extended for UTF-8 sequences.
     47We have incorporated an elegant technique for bounded repetitions
     48in icGrep.   This technique allows the matches to $C{m,n}$ for some
     49character class $C$, lower bound $m$ and upper bound $n$ to be determined
     50in $\lceil{\log_2 m}\rceil + \lceil{\log_2{n-m}}\rceil$ steps. 
     51Let $C_k$ be the bit stream identifying positions at which the $k$
     52prior input bytes are all in $C$.   Then the observation that
     53$C_2k = C_k \wedge (C_k \mbox{\tt <<} k)$ enables positions meeting the lower
     54bound to be determined.   An upper bound $k$ similarly involves excluding
     55those positions not within $k$ positions of the pending markers
     56(from the previous match step).
     58A final general technique worth mentioning is that related to
     59input skipping.  For sequential matching,
     60the Boyer-Moore method is well known for the
     61possible skipping through input positions of up to the length of the
     62search string for fixed-string search.   NR-grep \cite{navarro2001nr}
     63extends this skipping to regular expression search using the BNDM
     64(backward non-determinstic dawg matching) method.  Is there an
     65input-skipping method for the bitwise parallel paradigm?  The answer
     66is yes: whenever the bit vector of match positions in play for the current
     67input block reduce to all zero, the remainder of the pattern can be
     68skipped for processing the block.   This method has been implemented in icGrep.
Note: See TracChangeset for help on using the changeset viewer.