# Changeset 4553 for docs/Working

Ignore:
Timestamp:
May 14, 2015, 2:37:24 PM (4 years ago)
Message:

matching equations

Location:
docs/Working/icGrep
Files:
2 edited

### Legend:

Unmodified
 r4506 simultaneously.   A set of parallel bit streams is computed, with each bit position corresponding to one code-unit position within input character stream. Each bit stream may be considered an $L$-bit integer, where $L$ is the length of the input stream. {\em Character class streams}, such as {\tt [d]} for the stream that marks the position of d'' characters and {\tt [a-z]} for the stream of lower case process a {\em marker} bit stream identifies the full set of positions within the input data stream that match the regular expression to this point. Consider the following simplified definition of regular expressions A regular expression may be a character class $C$ or formed by composition.  If $R$ and $S$ are regular expressions, then the concatenation $RS$ is the regular expression standing for the set of all strings formed by concatenating one string from $R$ and one string from $S$, $R|S$ is the alternation standing for the union of the sets of strings from $R$ and $S$, and $R*$ is the Kleene-star closure consists of the set of strings formed from 0 or more occurrences of strings from $R$. Bitwise data parallel search for substring occurrences matching a regular expression $R$ proceeds as follows.  We start with a marker stream $m_0$ consisting of all ones, i.e., $m_0 = 2^{L} - 1$. Matches are then computed according to the following equations. \begin{eqnarray*} \mbox{\rm Match}(m, C) & = &  \mbox{\rm Advance}(\mbox{\rm CharClass}(C) \wedge m) \\ \mbox{\rm Match}(m, RS) & = &  \mbox{\rm Match}(\mbox{\rm Match}(m, R), S) \\ \mbox{\rm Match}(m, R|S) & = &  \mbox{\rm Match}(m, R) \vee \mbox{\rm Match}(m, S)) \\ \mbox{\rm Match}(m, C*) & = &  \mbox{\rm MatchStar}(m, \mbox{\rm CharClass}(C)) \\ \mbox{\rm Match}(m, R*) & = &  m \vee \mbox{\rm Match}(\mbox{\rm Match}(m, R) \wedge (\neg m), R*) \end{eqnarray*}