Changeset 1010 for docs/PACT2011


Timestamp:
Mar 25, 2011, 5:58:35 PM (8 years ago)
Author:
ksherdy
Message:

Minor edit.

File:
1 edited

  • docs/PACT2011/02-background.tex

    r1009 r1010  
    3838% However, textual data tends to consist of variable-length items in generally unpredictable patterns \cite{Cameron2010}.
    3939
    40 Traditional XML parsers process XML sequentially, a single byte at a time. Following this approach, an XML parser processes a source document serially, from the first to the last byte in the source file in a top-down manner. Each character of the source text is examined in turn to distinguish between the XML-specific markup, such as an opening angle bracket `<', and the content held within the document. The current character that the parser is processing is referred to as its cursor position. As the parser moves the cursor through the source document, the parser alternates between markup scanning, and data validation and processing operations. At each processing step, the parser scans the source document and either locates the expected markup, or reports an error condition and terminates.
     40Traditional XML parsers process XML sequentially, a single byte at a time. Following this approach, an XML parser processes a source document serially, from the first to the last byte of the source file. Each character of the source text is examined in turn to distinguish between the XML-specific markup, such as an opening angle bracket `<', and the content held within the document. The current character that the parser is processing is referred to as its cursor position. As the parser moves the cursor through the source document, the parser alternates between markup scanning, and data validation and processing operations. At each processing step, the parser scans the source document and either locates the expected markup, or reports an error condition and terminates.
    4141
    4242In other words, traditional XML parsers are complex finite-state machines that use byte comparisons to transition between data and metadata states. Each state transition indicates the context in which to interpret the subsequent characters. Unfortunately, textual data tends to consist of variable-length items in generally unpredictable patterns \cite{Cameron2010}; thus any character could be a state transition until deemed otherwise.
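
The byte-at-a-time behaviour described in the paragraphs above can be sketched as a two-state machine. This is a minimal illustration only, not the parser the paper discusses: the function name `scan` and the state labels are hypothetical, and the sketch ignores attributes, comments, CDATA sections, and entities. It shows a cursor advancing one byte per step, a byte comparison against `<` or `>` driving each state transition, and an error terminating the scan when the expected markup is not found.

```python
def scan(source: bytes):
    """Split a byte string into (state, bytes) items, one byte at a time.

    Hypothetical sketch: only two states, content and markup.
    """
    CONTENT, MARKUP = "content", "markup"
    state = CONTENT
    start = 0      # offset where the current item began
    items = []

    # The cursor visits every byte in order, from first to last.
    for cursor in range(len(source)):
        byte = source[cursor:cursor + 1]
        if state == CONTENT and byte == b"<":
            # Transition content -> markup; emit any pending content.
            if cursor > start:
                items.append((CONTENT, source[start:cursor]))
            state, start = MARKUP, cursor
        elif state == MARKUP and byte == b">":
            # Transition markup -> content; emit the completed tag.
            items.append((MARKUP, source[start:cursor + 1]))
            state, start = CONTENT, cursor + 1
        elif state == MARKUP and byte == b"<":
            # Expected '>' before another '<': report an error and stop.
            raise ValueError(f"unexpected '<' at byte {cursor}")

    if state == MARKUP:
        raise ValueError(f"unterminated markup starting at byte {start}")
    if start < len(source):
        items.append((CONTENT, source[start:]))
    return items
```

Every byte is compared even when it turns out to be ordinary content, which is the cost the text attributes to unpredictable, variable-length items: no byte can be skipped until it has been examined.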