source: docs/PACT2011/00-abstract.tex @ 1088

XML is a data format designed for documents as well as the
representation of data structures.  The simplicity and generality of
its rules have made it widely used in web services and database
systems.  Traditional XML parsers have been built around the
byte-at-a-time model, in which they process every character of
the file in a sequential fashion.  Unfortunately, the byte-at-a-time
sequential model is a performance barrier in demanding applications,
and is also energy-inefficient, making poor use of the wide registers
and other parallelism features in modern processors.

This paper assesses the energy and performance of a new approach
to XML parsing based on parallel bit stream technology.  This method
first converts the character streams into sets of parallel
bit streams and then exploits SIMD operations prevalent on modern CPUs.
The first-generation Parabix1 parser then uses bit-scan instructions
over these streams to make multibyte moves in an otherwise sequential
approach.  The second-generation Parabix2 technology adds further
parallelism by replacing much of the sequential
bit scanning with a parallel scanning approach based on bit-stream
addition.  We evaluate Parabix1 and Parabix2
against two widely used XML parsers, James Clark's Expat and Apache's Xerces,
on three generations of x86 machines, including the new Intel
\SB{}.  We show that Parabix2's speedup is 2$\times$--7$\times$
over Expat and Xerces.  In stark contrast to the energy expenditures necessary
to realize performance gains through multicore parallelism, we also show
that our Parabix parsers deliver energy savings directly in proportion
to performance gains.  We also assess the scalability advantages
of SIMD processor improvements across the different Intel machine generations,
culminating with an evaluation of the 256-bit AVX technology in
\SB{} versus the now-legacy 128-bit SSE technology.