Changeset 4448 for docs

Ignore:
Timestamp:
Feb 2, 2015, 1:49:07 PM (4 years ago)
Message:

References

Location:
docs/Working/icGrep
Files:
3 edited

Legend:

Unmodified
 r4447 One important aspect of the bitwise data parallel approach is transposition of input data.   In previous work, the Parabix transform has been reported as imposing an amoritized cost of 1.1 CPU cycles/input byte, when working with SSE2 \cite{}.  This is consistent with {\tt icGrep} has been reported as imposing an amoritized cost of 1 CPU cycle/input byte, when working with SSE2 \cite{lin2012parabix}.  This is consistent with {\tt icGrep} results.  However, the cost of the cost of this transposition can be hidden through multithreading and pipeline parallelism, having one core perform work ahead performing transposition, while another comes behind to perform matching.  We discuss this further in Section \ref{architecture}. Section \ref{sec:architecture}. The treatment of character and character class recognition is character classes using a single long-stream addition operation. Interestingly, the MatchStar operation also has application to parallelized long-stream addition\cite{}, as well as use in Myers bit-parallel edit distance algorithm\cite{}.  In the next section, parallelized long-stream addition\cite{cameron2014bitwise}, as well as use in Myers bit-parallel edit distance algorithm\cite{myers1999fast}.  In the next section, we show how MatchStar can be extended for UTF-8 sequences. Let $C_k$ be the bit stream identifying positions at which the $k$ prior input bytes are all in $C$.   Then the observation that $C_2k = C_k \wedge (C_k \mbox{\tt <<} k)$ enables positions meeting the lower $C_{2k} = C_k \wedge (C_k \mbox{\tt <<} k)$ enables positions meeting the lower bound to be determined.   An upper bound $k$ similarly involves excluding those positions not within $k$ positions of the pending markers those positions not within $k$ of the pending markers (from the previous match step).