Changeset 2301 for docs


Timestamp:
Aug 14, 2012, 3:43:54 PM (7 years ago)
Author:
cameron
Message:

Revise parabix background.

File:
1 edited

  • docs/Working/PPoPP/background-parabix.tex

    r2300 r2301  
    3737depending on the character immediately following the
    3838opener (i.e., ``\verb:/:'') or not.  The remaining three
    39 lines show streams that may be computed in subsequent
     39lines show streams that can be computed in subsequent
    4040parsing, namely streams marking the element names,
    4141attribute names and attribute values of tags. 
    4242
    43 Two intuitions may help explain how the Parabix approach might lead
     43Two intuitions may help explain how the Parabix approach can lead
    4444to improved XML parsing performance.   The first is that
    4545the use of the full register width offers a considerable
     
    5151often just computing a single bit of information per iteration:
    5252is the scan complete at this position yet or not?  Rather than
    53 computing these bits one at a time, why not attempt to compute
    54 many (e.g., 128 with SSE registers) at once?
     53computing these bits one at a time, an approach that computes
     54many of them in parallel (e.g., 128 with SSE registers) should
     55provide substantial benefit.
    5556
    56 In fact, the Parabix approach has been used to fully implement
    57 many aspects of XML processing, including transcoding\cite{Cameron2008},
    58 character classification and validation, and tag parsing.  The first
    59 such Parabix parser used processor bit scan instructions to dramatically
    60 accelerate sequential scanning loops for individual characters \cite{CameronHerdyLin2008},
    61 while recent work has incorporated a method of parallel scanning using bitstream
    62 addition \cite{cameron-EuroPar2011}.   Building on these methods, a prototype compiler
    63 and portable run-time library have been developed and subsequently evaluated
    64 to show substantial performance and energy benefits across a range of
    65 processors \cite{HPCA2012}.
     57Previous studies have shown the performance benefits of the
     58Parabix approach in many aspects of XML processing, including transcoding\cite{Cameron2008},
     59character classification and validation, tag parsing and well-formedness
     60checking.  The first Parabix parser used processor bit scan instructions
     61to considerably accelerate sequential scanning loops for individual
     62characters \cite{CameronHerdyLin2008}.   Recent work has incorporated a method of parallel
     63scanning using bitstream addition \cite{cameron-EuroPar2011}, as
     64well as combining SIMD methods with 4-stage pipeline parallelism to further improve
     65throughput \cite{HPCA2012}.
    6666
     67Although these research prototypes handle the full syntax of
     68DTDless XML documents including well-formedness checking, they fall
     69short of the functionality required in a full XML parser for several reasons.
     70Commercial XML processors include a number of additional facilities such
     71as support for transcoding of multiple character sets,
     72the ability to parse and validate against DTDs (document type definitions),
     73both internal and external,
     74facilities for handling different XML vocabularies through namespace
     75processing, as well as validation against XML schema.  In addition,
     76commercial parsers can be expected to provide a number of API
     77facilities beyond those found in research prototypes, including
     78full implementations of the widely used SAX and DOM interfaces.
     79
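
The revised text mentions parallel scanning using bitstream addition. As a minimal illustrative sketch (not the Parabix implementation itself), the idea can be modelled in Python, with arbitrary-precision integers standing in for bit streams and bit i corresponding to character position i. The helper names (`char_class_stream`, `scan_thru`, `positions`) and the sample input are assumptions made only for this illustration.

```python
def char_class_stream(text, predicate):
    """Build a bit stream: bit i is set iff predicate(text[i]) is true."""
    stream = 0
    for i, ch in enumerate(text):
        if predicate(ch):
            stream |= 1 << i
    return stream


def scan_thru(markers, char_class):
    """Advance each marker past the run of char_class positions it sits on.

    The addition makes a carry ripple through each run of 1 bits in
    char_class; masking with ~char_class keeps only the position just
    after each run.
    """
    return (markers + char_class) & ~char_class


def positions(stream):
    """List the bit positions that are set in a stream."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


# Hypothetical sample input: two tags with purely alphabetic names.
text = "<ab><cde/>"

openers = char_class_stream(text, lambda ch: ch == "<")
name_chars = char_class_stream(text, str.isalpha)

# Element names start one position after each opener.
name_starts = openers << 1

# One addition moves every marker to the end of its name at once.
name_ends = scan_thru(name_starts, name_chars)

print(positions(name_starts))  # [1, 5]
print(positions(name_ends))    # [3, 8]
```

Here a single addition advances both markers together, rather than one character at a time. A real implementation would work on fixed-width SIMD blocks (e.g., 128 bits with SSE registers) and would have to carry between blocks, which this sketch sidesteps by using unbounded integers.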