Changeset 1017 for docs


Timestamp:
Mar 25, 2011, 6:05:18 PM (8 years ago)
Author:
ksherdy
Message:

Minor edit.

File:
1 edited

  • docs/PACT2011/02-background.tex

    r1016 r1017  
    3535
    3636\subsection{Traditional XML Parsers}
    37 
    3837% However, textual data tends to consist of variable-length items in generally unpredictable patterns \cite{Cameron2010}.
    39 
    4038Traditional XML parsers process XML sequentially, one byte at a time. Following this approach, an XML parser processes a source document serially, from the first to the last byte of the source file. Each character of the source text is examined in turn to distinguish between XML-specific markup, such as an opening angle bracket `<', and the content held within the document. The character that the parser is currently processing is referred to as its cursor position. As the parser moves the cursor through the source document, it alternates between markup scanning and data validation and processing operations. At each processing step, the parser scans the source document and either locates the expected markup or reports an error condition and terminates. In other words, traditional XML parsers are complex finite-state machines that use byte comparisons to transition between data and metadata states. Each state transition indicates the context in which to interpret the subsequent characters. Unfortunately, textual data tends to consist of variable-length items in generally unpredictable patterns \cite{Cameron2010}; thus any character could mark a state transition until determined otherwise.
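The byte-at-a-time model described above can be illustrated with a minimal sketch. It is not taken from any particular parser: the function name `scan_markup`, the two-state machine, and the item labels are assumptions made for illustration, and real XML parsers track many more states (attributes, comments, CDATA sections, entity references).

```python
def scan_markup(source: bytes):
    """Split source into ('markup', bytes) and ('text', bytes) items.

    Hypothetical two-state scanner: every byte is compared in turn,
    and '<' / '>' drive the transitions between text and markup.
    """
    items = []
    state = "text"   # current context: "text" or "markup"
    start = 0        # byte offset where the current item began
    for cursor, byte in enumerate(source):
        if state == "text" and byte == ord("<"):
            # Leaving a text run: emit it if it is non-empty.
            if cursor > start:
                items.append(("text", source[start:cursor]))
            state, start = "markup", cursor
        elif state == "markup" and byte == ord(">"):
            # Expected closing bracket found: emit the markup item.
            items.append(("markup", source[start:cursor + 1]))
            state, start = "text", cursor + 1
        elif state == "markup" and byte == ord("<"):
            # Expected markup not found: report an error and terminate.
            raise ValueError(f"unexpected '<' at byte {cursor}")
    if state == "markup":
        raise ValueError("unterminated markup at end of input")
    if start < len(source):
        items.append(("text", source[start:]))
    return items
```

Calling `scan_markup(b"<a>hi</a>")` yields three items: the markup `<a>`, the text `hi`, and the markup `</a>`. Note that every byte passes through at least one comparison and a data-dependent branch, which is the cost the parallel approaches below try to avoid.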
    4139
     
    4947
    5048\subsection {Parallel XML Parsing}
    51 
    5249In general, parallel XML acceleration methods come in one of two forms: multithreaded approaches and SIMD-ized techniques. Multithreaded XML parsers take advantage of multiple cores by first quickly preparsing the XML file to locate key partitioning points. The XML workload is then divided and processed independently across the available cores \cite{ZhangPanChiu09}. A serial join step typically follows. SIMD XML parsers leverage SIMD registers to overcome the performance limitations of the byte-at-a-time sequential processing paradigm and its inherent data-dependent branch mispredictions \cite{Cameron2010}. SIMD instructions allow the processor to perform the same operation on multiple pieces of data simultaneously. To our knowledge, the only SIMD-based XML parsers are Parabix1 and Parabix2, both of which were designed and developed by Cameron et al. \cite{CameronHerdyLin2008}. We discuss both versions of Parabix in Section~\ref{section:reserach}.
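To make "the same operation on multiple pieces of data simultaneously" concrete, the following sketch emulates an 8-byte register with a Python integer and tests all eight byte lanes for a match using a handful of bitwise operations. This is a SIMD-within-a-register emulation written for illustration only; it is not the Parabix method and uses no actual SIMD instructions. The names `match_lanes` and `find_byte_positions` and the 8-lane width are assumptions.

```python
LANES = 8                    # bytes per emulated register
ONES = 0x0101010101010101    # 0x01 in every byte lane
LOW7 = 0x7F7F7F7F7F7F7F7F    # low seven bits of every lane
HIGH = 0x8080808080808080    # high bit of every lane


def match_lanes(word: int, byte: int) -> int:
    """Return a word whose lane high bits mark lanes of `word` equal to `byte`."""
    x = word ^ (byte * ONES)       # matching lanes become 0x00
    t = ((x & LOW7) + LOW7) | x    # high bit set in every nonzero lane
    return ~t & HIGH               # high bit set only in matching lanes


def find_byte_positions(source: bytes, byte: int) -> list:
    """Offsets of `byte` in `source`, comparing eight lanes per step."""
    positions = []
    for base in range(0, len(source), LANES):
        chunk = source[base:base + LANES]
        # Pad the final chunk; padded lanes are never reported below.
        word = int.from_bytes(chunk.ljust(LANES, b"\x00"), "little")
        marks = match_lanes(word, byte)
        for lane in range(len(chunk)):
            if (marks >> (8 * lane + 7)) & 1:
                positions.append(base + lane)
    return positions
```

For example, `find_byte_positions(b"<a>hi</a><b/>", ord("<"))` returns `[0, 5, 9]`. The comparison in `match_lanes` involves no per-byte branch; only the extraction of positions from the resulting mask loops over lanes, and a hardware implementation would typically replace that loop with bit-scan instructions.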
    5350
    5451\subsection {SIMD Operations}
    55 
    56 
    57 
    5852% Two such SIMD XML parsers, Parabix1 and Parabix2, utilize parallel bit stream processing technology.
    59 
    60 
    6153% Extract section 2.2 and merge into 3.   Add a new subsection
    6254% in section 2 saying a bit about SIMD.   Say a bit about pure SIMD vertical
     
    6557% Also note that the SIMD registers support bitwise logic across
    6658% their full width and that this is extensively used in our work.
    67 
    68 
    6959% \subsection{Parallel XML Parsing}
    7060%