source: docs/PACT2011/00-abstract.tex @ 996

Last change on this file since 996 was 994, checked in by cameron, 9 years ago

Tighten and correct abstract.

XML is a data format designed for documents as well as the
representation of data structures. The simplicity and generality of
its rules make it widely used in web services and database
systems. Traditional XML parsers have been built around the
byte-at-a-time model, in which they process every character token in
the file in a sequential fashion.  Unfortunately, the byte-at-a-time
sequential model is a performance barrier in demanding applications,
and is also energy-inefficient, making poor use of the wide registers
and other parallelism features in modern processors.

This paper assesses the energy and performance of a new approach
to XML parsing based on parallel bit stream technology.  This method
first converts the character streams into sets of parallel
bit streams and then exploits SIMD operations prevalent on modern CPUs.
Our first-generation Parabix1 parser then uses bit scan instructions
over these streams to make multibyte moves in an otherwise sequential
approach.  Our second-generation Parabix2 technology further
parallelizes our parsers by replacing much of the sequential
bit scanning with a parallel scanning approach based on bit stream
addition.  We evaluate Parabix1 and Parabix2
against two widely used XML parsers, James Clark's Expat and Apache's Xerces,
on three generations of x86 machines, including the new Intel
Sandy Bridge.  We show that Parabix2's speedup is 2$\times$--8$\times$
over Expat and Xerces.  In stark contrast to the energy expenditures necessary
to realize performance gains through multicore parallelism, we also show
that our Parabix parsers deliver energy savings directly in proportion
to performance gains.  We also assess the scalability advantages
of SIMD processor improvements across the different Intel machine generations,
culminating with an evaluation of the 256-bit AVX technology in
Sandy Bridge vs.\ the now-legacy 128-bit SSE technology.
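
The two steps named above, conversion of a character stream into parallel bit streams and scanning by bit stream addition, can be illustrated with a small sketch. This is not the Parabix implementation: it assumes Python's unbounded integers as a stand-in for SIMD registers, it stores position $i$ of the input in bit $i$ of each stream, and every function name in it is hypothetical. The name-character class is built with a plain loop for brevity, where a real implementation would derive it from the basis streams with bitwise logic.

```python
def basis_bit_streams(data: bytes) -> list[int]:
    """Transpose bytes into 8 streams: stream k has bit i set
    when bit k of data[i] is set."""
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def less_than_stream(b: list[int]) -> int:
    """Positions holding '<' (0x3C = 0b00111100), computed for all
    positions at once with bitwise logic on the basis streams."""
    return b[2] & b[3] & b[4] & b[5] & ~(b[0] | b[1] | b[6] | b[7])


def name_char_stream(data: bytes) -> int:
    """Positions holding an ASCII letter (a simplified notion of an
    XML name character, assumed here for illustration only)."""
    stream = 0
    for i, byte in enumerate(data):
        if chr(byte).isalpha():
            stream |= 1 << i
    return stream


def scan_thru(markers: int, char_class: int) -> int:
    """Advance every marker past the run of char_class positions it
    sits on. The carry of one addition moves all markers at once."""
    return (markers + char_class) & ~char_class


def name_end_positions(data: bytes) -> list[int]:
    """Return the position just past each tag name that follows '<'."""
    basis = basis_bit_streams(data)
    starts = less_than_stream(basis) << 1  # step past each '<'
    ends = scan_thru(starts, name_char_stream(data))
    return [i for i in range(len(data) + 1) if (ends >> i) & 1]
```

For the input \verb|<abc> <de>|, a single addition moves both markers, from positions 1 and 7 to positions 4 and 9, rather than one scan per tag as in the sequential bit scan approach.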