Changeset 994

Mar 25, 2011, 6:31:41 AM

Tighten and correct abstract.

1 edited


  • docs/PACT2011/00-abstract.tex

    --- docs/PACT2011/00-abstract.tex (r984)
    +++ docs/PACT2011/00-abstract.tex (r994)
     systems. Traditional XML parsers have been built around the
     byte-at-a-time model, in which they process every character token in
    -the file in a sequential fashion. Unfortunately, the byte-at-time
    -sequential model is a fundamental hindrance on performance and in
    -some cases can add up 100\% overhead to the database queries
    -themselves.
    +the file in a sequential fashion.  Unfortunately, the byte-at-time
    +sequential model is a performance barrier in demanding applications,
    +and is also energy-inefficient, making poor use of the wide registers
    +and other parallelism features in modern processors.
    -In this paper, we propose a new XML parser, Parabix, based on parallel
    -bit stream technology, which converts the character strings into
    +This paper assesses the energy and performance of a new approach
    +to XML parsing based on parallel bit stream technology.  This method
    +first converts the character streams into sets of parallel
     bitstreams and then exploits SIMD operations prevalent on modern CPUs.
    -The first generation parser that we developed, Parabix1, uses the
    -bitscan and bit level sequencing SIMD operations to emulate much of the
    -parsers functions. Unfortunately operations like bitscan are
    -inherently sequential in nature and Parabix1's speedup is limited. We
    -present a second generation parser, Parabix2, that fully parallelizes
    -the parsing operations using using parallel bit level logic provided in
    -modern SIMD extensions like SSE2.  We evaluate Parabix1 and Parabix2
    +Our first generation Parabix1 parser then uses bitscan instructions
    +over these streams to make multibyte moves in an otherwise sequential
    +approach.   Our second generation Parabix2 technology further
    +parallelizes our parsers by replacing much of the sequential
    +bit scanning with a parallel scanning approach based on bitstream
    +addition.    We evaluate Parabix1 and Parabix2
     against two widely-used XML parsers, James Clark's Expat and Apache's Xerces
     on three generations of x86 machines, including the new Intel
    -Sandy Bridge. We show that Parabix2's speedup is 2$\times$---8$\times$
    -over Expat and Xerces. Across the different Intel machine generations,
    -Parabix rides the scalability curve of SIMD operations whose
    -performance inherently scales better than traditional sequential
    -thread performance. Comparing Intel's new Sandy Bridge core with the Core
    -i3 we observed performance improvement between 20---60\% for our
    -Parabix parsers while sequential parsers like Xerces improve by
    -$<$20\%. We measure real CPU power to demonstrate that Parabix also
    -brings with itself significant energy efficiency. On the core i3,
    -Parabix consumes $\simeq$4nJ per byte parsed while Xerces consumes
    -$\simeq$20nJ per byte parsed. Finally, we perform a case study of the
    -Intel's new 256-bit wide AVX instructions, and demonstrate that it
    -provides X speedup over 128 bit SSE2 instruction set.
    +Sandy Bridge.    We show that Parabix2's speedup is 2$\times$--8$\times$
    +over Expat and Xerces.  In stark contrast to the energy expenditures necessary
    +to realize performance gains through multicore parallelism, we also show
    +that our Parabix parsers deliver energy savings directly in proportion
    +to performance gains.   We also assess the scalability advantages
    +of SIMD processor improvements across the different Intel machine generations,
    +culminating with an evaluation of the 256-bit AVX technology in
    +SandyBridge vs. the now legacy 128-bit SSE technology.
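
The revised abstract contrasts sequential bit scanning with "a parallel scanning approach based on bitstream addition". The following is a minimal sketch of that idea, not the Parabix implementation: it assumes the commonly described formulation in which a marker stream `M` is advanced through a character-class stream `C` by computing `(M + C) & ~C`, and it stands in Python's unbounded integers for SIMD registers. The helper names (`char_class_stream`, `scan_thru`, `positions`) are hypothetical, and the transposition step here is deliberately naive (one byte at a time), since only the scanning step is being illustrated.

```python
def char_class_stream(data: bytes, chars: bytes) -> int:
    """Build a bitstream: bit i is 1 when data[i] is one of `chars`.

    Hypothetical helper. A real parser would derive such streams with
    SIMD logic over transposed bit planes, not a per-byte loop.
    """
    stream = 0
    for i, b in enumerate(data):
        if b in chars:
            stream |= 1 << i
    return stream


def scan_thru(markers: int, char_class: int) -> int:
    """Advance every marker past the run of class characters it sits on.

    Assumed formulation: the carry produced by the addition ripples
    through each run of 1 bits in `char_class`, so all markers move in
    a single operation instead of one bitscan per marker.
    """
    return (markers + char_class) & ~char_class


def positions(stream: int) -> list[int]:
    """List the indices of the set bits in a bitstream."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


data = b"<abc> <de>"
letters = bytes(range(ord("a"), ord("z") + 1))

name_chars = char_class_stream(data, letters)  # bits 1,2,3,7,8
name_starts = char_class_stream(data, b"<") << 1  # one past each '<': bits 1,7
name_ends = scan_thru(name_starts, name_chars)

print(positions(name_ends))  # indices of the '>' that end each tag name
```

Bit `i` corresponds to byte `i`, so the low-to-high carry direction of integer addition matches the left-to-right direction of the text; that choice is what lets one addition move both markers at once in this sketch.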