# Changeset 994 for docs

Timestamp:
Mar 25, 2011, 6:31:41 AM
Message:

Tighten and correct abstract.

File:
1 edited

r984 (text before this changeset):

... systems. Traditional XML parsers have been built around the byte-at-a-time model, in which they process every character token in the file in a sequential fashion. Unfortunately, the byte-at-a-time sequential model is a fundamental hindrance to performance and in some cases can add up to 100\% overhead to the database queries themselves. In this paper, we propose a new XML parser, Parabix, based on parallel bit stream technology, which converts the character strings into sets of parallel bitstreams and then exploits SIMD operations prevalent on modern CPUs. The first generation parser that we developed, Parabix1, uses the bitscan and bit level sequencing SIMD operations to emulate much of the parser's functions. Unfortunately, operations like bitscan are inherently sequential in nature and Parabix1's speedup is limited. We present a second generation parser, Parabix2, that fully parallelizes the parsing operations using parallel bit level logic provided in modern SIMD extensions like SSE2. We evaluate Parabix1 and Parabix2 against two widely-used XML parsers, James Clark's Expat and Apache's Xerces, on three generations of x86 machines, including the new Intel Sandy Bridge. We show that Parabix2's speedup is 2$\times$---8$\times$ over Expat and Xerces. Across the different Intel machine generations, Parabix rides the scalability curve of SIMD operations whose performance inherently scales better than traditional sequential thread performance. Comparing Intel's new Sandy Bridge core with the Core i3 we observed performance improvement between 20---60\% for our Parabix parsers while sequential parsers like Xerces improve by $<$20\%. We measure real CPU power to demonstrate that Parabix also brings with it significant energy efficiency. On the Core i3, Parabix consumes $\simeq$4nJ per byte parsed while Xerces consumes $\simeq$20nJ per byte parsed. Finally, we perform a case study of Intel's new 256-bit wide AVX instructions, and demonstrate that it provides X speedup over 128 bit SSE2 instruction set.

r994 (text after this changeset):

... systems. Traditional XML parsers have been built around the byte-at-a-time model, in which they process every character token in the file in a sequential fashion. Unfortunately, the byte-at-a-time sequential model is a performance barrier in demanding applications, and is also energy-inefficient, making poor use of the wide registers and other parallelism features in modern processors. This paper assesses the energy and performance of a new approach to XML parsing based on parallel bit stream technology. This method first converts the character streams into sets of parallel bitstreams and then exploits SIMD operations prevalent on modern CPUs. Our first generation Parabix1 parser then uses bitscan instructions over these streams to make multibyte moves in an otherwise sequential approach. Our second generation Parabix2 technology further parallelizes our parsers by replacing much of the sequential bit scanning with a parallel scanning approach based on bitstream addition. We evaluate Parabix1 and Parabix2 against two widely-used XML parsers, James Clark's Expat and Apache's Xerces, on three generations of x86 machines, including the new Intel Sandy Bridge. We show that Parabix2's speedup is 2$\times$--8$\times$ over Expat and Xerces. In stark contrast to the energy expenditures necessary to realize performance gains through multicore parallelism, we also show that our Parabix parsers deliver energy savings directly in proportion to performance gains. We also assess the scalability advantages of SIMD processor improvements across the different Intel machine generations, culminating with an evaluation of the 256-bit AVX technology in Sandy Bridge vs. the now legacy 128-bit SSE technology.
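The two ideas the abstract names, converting a character stream into parallel bit streams and scanning by bitstream addition, can be illustrated with a small sketch. This is not the Parabix source: the function names (`transpose`, `match_byte`, `scan_thru`, `positions`) are hypothetical, and Python's arbitrary-precision integers stand in for the SIMD registers a real implementation would use. Bit `i` of every stream corresponds to byte `i` of the input, so a carry in an addition moves forward through the text.

```python
def transpose(data: bytes) -> list[int]:
    """Build 8 basis bit streams: bit i of streams[k] is bit k of data[i]."""
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_byte(streams: list[int], length: int, target: int) -> int:
    """Return a stream with bit i set where data[i] == target.

    Uses only bitwise logic over the basis streams, so all positions
    are tested at once.
    """
    mask = (1 << length) - 1
    result = mask
    for k in range(8):
        s = streams[k]
        result &= s if (target >> k) & 1 else (~s & mask)
    return result


def scan_thru(cursors: int, run: int) -> int:
    """Advance every cursor past the run of 1 bits it sits on.

    Adding the cursor stream to the run stream makes a carry ripple
    through each marked run; masking with ~run keeps only the first
    position after each run.
    """
    return (cursors + run) & ~run


def positions(stream: int) -> list[int]:
    """List the indices of the set bits in a stream (for display only)."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


if __name__ == "__main__":
    data = b"<abc><de>"
    n = len(data)
    mask = (1 << n) - 1
    basis = transpose(data)
    lt = match_byte(basis, n, ord("<"))
    gt = match_byte(basis, n, ord(">"))
    cursors = (lt << 1) & mask      # first byte after each '<'
    name_run = ~(lt | gt) & mask    # bytes that are neither '<' nor '>'
    ends = scan_thru(cursors, name_run)
    print(positions(ends))          # [4, 8], the two '>' positions
```

A single addition moves both cursors to the end of their tag names at once, whereas a bitscan-based loop would handle one cursor per step.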