Last change on this file since 3633 was 2872, checked in by nmedfort, 7 years ago
\subsection{Sequential vs. Parallel Paradigm}

Xerces, like all traditional XML parsers, processes XML documents sequentially.
Each character is examined to distinguish between
XML-specific markup, such as a left angle bracket ``\verb`<`'', and the
content held within the document.
As the parser progresses through the document, it alternates between markup scanning,
validation, and content processing modes.

In other words, Xerces belongs to an equivalence class of applications termed FSM applications\footnote{
  Herein, FSM applications are software systems whose behaviour is defined by their inputs,
  their current state, and the events associated with state transitions.}.
Each state transition indicates the processing context of subsequent characters.
Unfortunately, textual data tends to be unpredictable, and any character could induce a state transition.
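
As a rough illustration of the FSM style described above, the following minimal sketch scans a string one character at a time and switches between a content state and a markup state.
It is a toy model only: the names \verb|State| and \verb|scan_sequential| are hypothetical, the code is not taken from Xerces, and it ignores attributes, quoting, comments, and validation.

```python
from enum import Enum, auto


class State(Enum):
    # Two illustrative processing contexts; a real parser has many more.
    CONTENT = auto()
    MARKUP = auto()


def scan_sequential(text):
    """Split text into ("markup", ...) and ("content", ...) runs,
    examining exactly one character per loop iteration."""
    state = State.CONTENT
    start = 0
    runs = []
    for i, ch in enumerate(text):
        if state is State.CONTENT and ch == "<":
            # '<' ends a content run and begins markup.
            if i > start:
                runs.append(("content", text[start:i]))
            state, start = State.MARKUP, i
        elif state is State.MARKUP and ch == ">":
            # '>' ends the markup run and returns to content.
            runs.append(("markup", text[start:i + 1]))
            state, start = State.CONTENT, i + 1
    if start < len(text):
        # Flush whatever run was still open at end of input.
        kind = "content" if state is State.CONTENT else "markup"
        runs.append((kind, text[start:]))
    return runs
```

Every iteration depends on the state left by the previous one, so the loop cannot be split across characters without knowing in advance where transitions occur.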

Parabix-style XML parsers use a concept of layered processing.
A block of source text is transformed into a set of lexical \bitstream{}s,
which undergo a series of operations that can be grouped into logical layers,
e.g., transposition, character classification, and lexical analysis.
Each layer is pipeline parallel and requires neither speculation nor pre-parsing stages~\cite{HPCA2012}.
To meet the API requirements of the document-ordered Xerces output,
the results of the Parabix processing layers must be interleaved to produce equivalent behaviour.
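
The first two layers named above, transposition and character classification, can be sketched as follows.
This is a minimal illustration under stated assumptions, not the Parabix or icXML implementation: the helper names \verb|transpose|, \verb|match_byte|, and \verb|positions| are hypothetical, arbitrary-precision Python integers stand in for SIMD registers, and the transposition is written as a plain loop rather than the SIMD operations a real implementation would use.

```python
def transpose(block):
    """Return 8 basis bitstreams for a block of bytes.
    Bit i of basis[k] equals bit k of block[i]."""
    basis = [0] * 8
    for i, byte in enumerate(block):
        for k in range(8):
            if (byte >> k) & 1:
                basis[k] |= 1 << i
    return basis


def match_byte(basis, value, length):
    """Character classification: return a bitstream whose bit i is set
    exactly where block[i] == value, using only bitwise logic on the
    basis streams (no per-character state)."""
    mask = (1 << length) - 1
    stream = mask
    for k in range(8):
        if (value >> k) & 1:
            stream &= basis[k]
        else:
            stream &= ~basis[k] & mask
    return stream


def positions(stream):
    """List the indices of the set bits, lowest first."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


block = b"<a>hi</a>"
basis = transpose(block)
lt = match_byte(basis, ord("<"), len(block))  # marks every '<'
gt = match_byte(basis, ord(">"), len(block))  # marks every '>'
```

For this block, \verb|positions(lt)| yields \verb|[0, 5]| and \verb|positions(gt)| yields \verb|[2, 8]|.
Each classification is computed for the whole block at once, and a later layer can consume these streams without revisiting individual characters.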