# Changeset 1298 for docs

Timestamp:
Aug 8, 2011, 2:45:08 PM
Message:

Minor edits.

File:
1 edited

r1186

Parabix1 processes source XML in a manner functionally equivalent to that of a traditional recursive descent XML parser. That is, Parabix1 moves sequentially through the source document, maintains a single parser cursor position, and parses recursively and depth-first. Where Parabix1 differs from the traditional parser is that it scans for key markup characters using a series of bit streams. A bit stream is simply a sequence of $0$s and $1$s. A $1$-bit marks the position of each key character in the corresponding source data stream. One stream is generated for each of the key markup characters.

In Parabix1, basis bit streams are used to generate character-class streams for key markup characters. Basis bit streams are defined as the set of bit streams that represent the transposed data format of the source XML byte data. In other words, $M$-bit source characters are represented in transposed form using $M$ basis bit streams. Figure \ref{fig:BitstreamsExample} presents an example of the basis bit stream representation of 8-bit ASCII characters. $B_0 \ldots B_7$ are the individual bit streams. The $0$ bits in the bit streams are represented by periods so as to emphasize the $1$ bits.

\begin{figure}[h]