Ignore:
Timestamp:
Feb 22, 2014, 3:37:03 PM (5 years ago)
Author:
cameron
Message:

Introduce ambiguity into MatchStar? examples

File:
1 edited

Legend:

Unmodified
Added
Removed
  • docs/Working/re/re-main.tex

    r3634 r3636  
    280280\begin{center}
    281281\begin{tabular}{cr}\\
    282 input data  & \verb`a4534q--b29z---az---a4q--bca22z--`\\
     282input data  & \verb`a453z--b3z--az--a12949z--ca22z7--`\\
    283283$B_7$ & \verb`.................................`\\
    284 $B_6$ & \verb`1....1..1..1...11...1.1..111..1..`\\
     284$B_6$ & \verb`1...1..1.1..11..1.....1..11..1...`\\
    285285$B_5$ & \verb`111111111111111111111111111111111`\\
    286 $B_4$ & \verb`.11111...111....1....11.....111..`\\
    287 $B_3$ & \verb`......11..11111.1111...11.....111`\\
    288 $B_2$ & \verb`.11.1.11....111..111.1.11......11`\\
    289 $B_1$ & \verb`...1....11.1....1........11.111..`\\
    290 $B_0$ & \verb`1.11.111..1.1111.1111.111.11...11`\\
    291 \verb:[a]: & \verb`1..............1....1......1.....`\\
    292 \verb:[z]: & \verb`...........1....1.............1..`\\
    293 \verb:[0-9]: & \verb`.1111....11..........1......11...`
     286$B_4$ & \verb`.1111...11...1...111111....1111..`\\
     287$B_3$ & \verb`....111..111.111...1.1111....1.11`\\
     288$B_2$ & \verb`.11..11...11..11....1..11.....111`\\
     289$B_1$ & \verb`...11..111...1....1...1..1.1111..`\\
     290$B_0$ & \verb`1.11.11.1.111.1111.1.1.1111...111`\\
     291\verb:[a]: & \verb`1...........1...1.........1......`\\
     292\verb:[z9]: & \verb`....1....1...1.....1.11......1...`\\
     293\verb:[0-9]: & \verb`.111....1........11111.....11.1..`
    294294\end{tabular}
    295295
     
    317317transposed representation, and the character class bit streams
    318318\verb:[a]:,
    319 \verb:[z]:, and
     319\verb:[z9]:, and
    320320\verb:[0-9]:
    321321that may be computed from the basis bit streams using bitwise logic.
     
    327327\begin{center}
    328328\begin{tabular}{cr}\\
    329 input data  & \verb`a4534q--b29z---az---a4q--bca22z--`\\
    330 $M_1$ & \verb`.1..............1....1......1....`\\
    331 $M_2$ & \verb`.11111..........1....11.....111..`\\
    332 $M_3$ & \verb`.................1.............1.`
     329input data  & \verb`a453z--b3z--az--a12949z--ca22z7--`\\
     330$M_1$ & \verb`.1...........1...1.........1.....`\\
     331$M_2$ & \verb`.1111........1...111111....111...`\\
     332$M_3$ & \verb`.....1........1.....1.11......1..`
    333333\end{tabular}
    334334
    335335\end{center}
    336 \caption{Marker Streams in Matching {\tt a[0-9]*z}}
     336\caption{Marker Streams in Matching {\tt a[0-9]*[z9]}}
    337337\label{fig:streams2}
    338338\end{figure}
     
    342342the problem of searching the input stream of Figure \ref{fig:streams}
    343343to finding occurrence of strings matching
    344 the regular expression \verb:a[0-9]*z:.
     344the regular expression \verb:a[0-9]*[z9]:. 
     345Note that this is an ambiguous regular expression, which could match
     346texts such as \verb:a12949z: in multiple ways.
    345347The matching process involves the concept of {\em marker streams}, that
    346348is streams that mark the positions of current matches during the
     
    351353reachable from positions marked by $M_1$ by further matching zero or
    352354more digits (\verb:[0-9]*:) and finally $M_3$ the stream
    353 marking positions after a final \verb:z: has been found.
     355marking positions after a final \verb:z: or \verb:9: has been found.
    354356Without describing the details of how these streams are computed
    355357for the time being, Figure \ref{fig:streams2} shows what each
    356358of these streams should be for our example matching problem.
    357 Note our convention that a marker stream contains a 1 bit
     359Our convention that a marker stream contains a 1 bit
    358360at the next character position to be matched, that is,
    359361immediately past the last position that was matched.
     362Note that all three matches from the third occurrence of \verb:a:
     363are correctly marked in $M_3$.
    360364
    361365
     
    366370\begin{center}
    367371\begin{tabular}{cr}\\
    368 input data  & \verb`a4534q--b29z---az---a4q--bca22z--`\\
    369 $M_1$ & \verb`.1..............1....1......1....`\\
    370 $D = \text{\tt [0-9]}$ & \verb`.1111....11..........1......11...`\\
    371 $T_0 = M_1 \wedge D$ & \verb`.1...................1......1....`\\
    372 $T_1 = T_0 + D$ & \verb`.....1...11...........1.......1..`\\
    373 $T_2 = T_1 \oplus D$ & \verb`.11111...............11.....111..`\\
    374 $M_2 = T_2 \, | \, M_1$ & \verb`.11111..........1....11.....111..`
     372input data  & \verb`a453z--b3z--az--a12949z--ca22z7--`\\
     373$M_1$ & \verb`.1...........1...1.........1.....`\\
     374$D = \text{\tt [0-9]}$ & \verb`.111....1........11111.....11.1..`\\
     375$T_0 = M_1 \wedge D$ & \verb`.1...............1.........1.....`\\
     376$T_1 = T_0 + D$ & \verb`....1...1.............1......11..`\\
     377$T_2 = T_1 \oplus D$ & \verb`.1111............111111....111...`\\
     378$M_2 = T_2 \, | \, M_1$ & \verb`.1111........1...111111....111...`
    375379\end{tabular}
    376380
     
    402406all possible positions that can be reached by 0 or more occurrences
    403407of characters in class $C$ from each position in $M$. 
     408
     409MatchStar differs from ScanThru of the Parabix tool chain in that it
     410finds all matches, not just the longest match.  This is necessary
     411for general matching involving possibly ambiguous regular
     412expressions.   
    404413
    405414\input{compilation}
Note: See TracChangeset for help on using the changeset viewer.