Changeset 1289


Timestamp:
Aug 8, 2011, 1:06:13 PM
Author:
ksherdy
Message:

Minor edits to improve readability and flow.

File:
1 edited

  • docs/PACT2011/00-abstract.tex

    --- r1088
    +++ r1289
    @@ -1,31 +1,31 @@
    -XML is a data format designed for documents as well as the
    -representation of data structures.  The simplicity and generality of
    -the rules make it widely used in web services and database
    -systems.  Traditional XML parsers have been built around the
    -byte-at-a-time model, in which they process every character token in
    -the file in a sequential fashion.  Unfortunately, the byte-at-time
    -sequential model is a performance barrier in demanding applications,
    -and is also energy-inefficient, making poor use of the wide registers
    -and other parallelism features in modern processors.
    +XML is a set of rules for the encoding documents in machine-readable form.
    +The simplicity and generality of the rules make it widely used in web services and database
    +systems.  Traditional XML parsers are built around a
    +byte-at-a-time processing model where each character token
    +of an XML document is examined in sequence.  Unfortunately, the byte-at-a-time
    +sequential model is a performance barrier in more demanding applications,
    +is energy-inefficient, and makes poor use of the wide SIMD registers
    +and other parallelism features of modern processors.
     
     This paper assesses the energy and performance of a new approach
    -to XML parsing based on parallel bit stream technology.  This method
    -first converts the character steams into sets of parallel
    -bitstreams and then exploits SIMD operations prevalent on modern CPUs.
    -The first generation Parabix1 parser then uses bit-scan instructions
    -over these streams to make multibyte moves in an otherwise sequential
    +to XML parsing, based on parallel bit stream technology, and as implemented on successive
    +software generations of the Parabix XML parser.
    +This method first converts the character streams into sets of parallel
    +bit streams and then exploits SIMD operations prevalent on commodity-level hardware.
    +The first generation Parabix1 parser exploits the processor built-in bit-scan instructions
    +over these streams to make multibyte moves but follows an otherwise sequential
     approach.  The second generation Parabix2 technology adds further
     parallelism by replacing much of the sequential
    -bit scanning with a parallel scanning approach based on bit-stream
    +bit scanning with a parallel scanning approach based on bit stream
     addition.  We evaluate Parabix1 and Parabix2
    -against two widely-used XML parsers, James Clark's Expat and Apache's Xerces
    -on three generations of x86 machines, including the new Intel
    +against two widely used XML parsers, James Clark's Expat and Apache's Xerces, and
    +across three generations of x86 machines, including the new Intel
     \SB{}.  We show that Parabix2's speedup is 2$\times$--7$\times$
     over Expat and Xerces.  In stark contrast to the energy expenditures necessary
     to realize performance gains through multicore parallelism, we also show
    -that our Parabix parsers deliver energy savings directly in proportion
    -to performance gains.  We also assess the scalability advantages
    -of SIMD processor improvements the different Intel machine generations,
    +that our Parabix parsers deliver energy savings in direct proportion
    +to the gains in performance.  In addition, we assess the scalability advantages
    +of SIMD processor improvements across Intel processor generations,
     culminating with an evaluation of the 256-bit AVX technology in
    -\SB{} vs. the now legacy 128-bit SSE technology.
    +\SB{} versus the now legacy 128-bit SSE technology.
     
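
Illustrative note (not part of the changeset): the abstract describes converting character streams into parallel bit streams and replacing sequential bit scanning with a scan based on bit stream addition. The following is a minimal Python sketch of that idea, not the Parabix implementation. It assumes each bit stream can be modelled as an arbitrary-precision integer whose bit i corresponds to byte i of the input, and it builds character-class streams with a plain loop instead of the SIMD transposition a real parser would use. The helper names `char_class_stream`, `scan_thru`, and `positions` are hypothetical and chosen for this example only.

```python
def char_class_stream(data: bytes, members: bytes) -> int:
    """Return an integer whose bit i is 1 iff data[i] is one of `members`."""
    stream = 0
    for i, byte in enumerate(data):
        if byte in members:
            stream |= 1 << i
    return stream


def scan_thru(markers: int, char_class: int) -> int:
    """Advance every marker past the run of char_class positions it sits on.

    Adding the markers to the class stream makes a carry ripple through each
    run of 1 bits; masking with ~char_class keeps only the positions where
    the carries come to rest, one past the end of each run. All markers move
    in a single addition rather than one scan per marker.
    """
    return (markers + char_class) & ~char_class


def positions(stream: int) -> list:
    """List the bit positions that are set in a stream (for inspection)."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


# Assumed toy input: two tags whose names consist of lowercase letters.
data = b"<ab><cde/>"
name_chars = char_class_stream(data, b"abcdefghijklmnopqrstuvwxyz")
open_angle = char_class_stream(data, b"<")

# A tag name starts one position after each '<'.
name_starts = open_angle << 1
# One addition moves both markers to the first position after their names.
name_ends = scan_thru(name_starts, name_chars)
```

Here `positions(name_starts)` gives the byte offsets where the two tag names begin, and `positions(name_ends)` gives the offsets just past each name, both found without a per-byte scanning loop over the names themselves.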