\subsection {Sequential vs. Parallel Paradigm}

Xerces, like all traditional XML parsers, processes XML documents sequentially.
Each character is examined to distinguish between
XML-specific markup, such as a left angle bracket ``\verb`<`'', and the
content held within the document.
As the parser progresses through the document, it alternates between markup scanning,
validation, and content processing modes.

In other words, Xerces belongs to an equivalence class of applications termed finite-state machine (FSM) applications\footnote{
  Herein, FSM applications are considered to be software systems whose behaviour is defined by their inputs,
  current state, and the events associated with state transitions.}.
Each state transition determines the processing context of subsequent characters.
Unfortunately, textual data tends to be unpredictable, and any character could induce a state transition.
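As an illustrative model only, the sequential dependency can be written as a transition function.
The symbols $Q$ (set of states), $\Sigma$ (input alphabet), $\delta$, $q_i$ and $s_i$ are notation
introduced here for the sketch and are not taken from the Xerces implementation:
\[
  q_{i+1} = \delta(q_i, s_i), \qquad \delta : Q \times \Sigma \rightarrow Q,
\]
where $s_i$ is the $i$-th input character and $q_i$ is the state in which it is processed.
Under this model, $q_{i+1}$ cannot be computed until $q_i$ is known,
so the characters of a document are consumed one after another.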

Parabix-style XML parsers use a concept of layered processing.
A block of source text is transformed into a set of lexical \bitstream{}s,
which undergo a series of operations that can be grouped into logical layers,
e.g., transposition, character classification, and lexical analysis.
Each layer is pipeline parallel and requires neither speculation nor pre-parsing stages~\cite{HPCA2012}.
To meet the API requirements of the document-ordered Xerces output,
the results of the Parabix processing layers must be interleaved to produce equivalent behaviour.
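
As a small worked illustration of character classification, assume that a lexical \bitstream{} $B_C$
for a character class $C$ holds one bit per input position, with
$B_C[i] = 1$ if the $i$-th character $s_i \in C$ and $B_C[i] = 0$ otherwise.
The names $B_C$ and $s_i$, the left-to-right bit order, and the use of ``\verb`.`'' for a zero bit
are conventions chosen for this sketch; they are not meant to reflect the actual in-memory layout.
For the eight-character fragment \verb`<a>b</a>`, the streams for the classes
$\{\verb`<`\}$, $\{\verb`>`\}$ and $\{\verb`/`\}$ would be:
\begin{verbatim}
source text : <a>b</a>
B_<         : 1...1...
B_>         : ..1....1
B_/         : .....1..
\end{verbatim}
Each stream can be computed for every position of a block independently of the others,
in contrast to the character-at-a-time state transitions of a sequential parser.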