source: docs/HPCA2012/00-abstract.tex @ 1358

Last change on this file since 1358 was 1358, checked in by cameron, 8 years ago

Text files are widely used in modern applications. For example,
XML files provide data storage in a human-readable format and are
used in applications ranging from database systems to mobile phone
SDKs.  Traditional text processing tools are built around a
byte-at-a-time processing model in which each character token of a
document is examined. The byte-at-a-time model is highly challenging
for commodity processors. It includes many unpredictable
input-dependent branches, which cause pipeline squashes and
stalls. Furthermore, typical text processing tools perform few
operations per processed character and experience high cache miss
rates. Overall, parsing text in important domains like XML processing
requires high performance, motivating hardware designers to adopt ASIC
% In this paper on commodity.
% We expose through a toolchain.
% We demonstrate what can be achieved with branches etc.
% We study various tradeoffs.
% Finally we show the benefits can be stacked
In this paper, we enable text processing applications to effectively
use commodity processors. We introduce Parabix (Parallel Bitstream)
technology, a software runtime and execution model that allows
applications to exploit modern SIMD instruction extensions for
high-performance text processing. Parabix enables the application developer
to write constructs assuming unlimited SIMD data parallelism, and
Parabix's bitstream translator generates code based on machine specifics
(e.g., SIMD register widths).  The key insight into efficient text
processing in Parabix is the data organization. Parabix transposes the
sequence of character bytes into sets of 8 parallel bit streams, which
then enables us to operate on multiple characters with single
bit-parallel SIMD operators. We demonstrate the features and
efficiency of Parabix with an XML parsing application. We evaluate a
Parabix-based XML parser against two widely used XML parsers, Expat
and Apache's Xerces, across three generations of x86 processors,
including the new Intel \SB{}.  We show that Parabix's speedup is
2$\times$--7$\times$ over Expat and Xerces. We observe that Parabix
overall makes efficient use of intra-core parallel hardware on
commodity processors and supports significant gains in energy efficiency. Using
Parabix, we assess the scalability advantages of SIMD processor
improvements across Intel processor generations, culminating with a
look at the latest 256-bit AVX technology in \SB{} versus the now-legacy
128-bit SSE technology. Finally, we partition the XML
program into pipeline stages and demonstrate that thread-level
parallelism exploits SIMD units scattered across the different cores
and improves performance (2$\times$ on 4 cores) at the same energy levels
as the single-thread version.
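
The transposition into 8 parallel bit streams can be illustrated with a minimal sketch. It is not the Parabix implementation: Python integers stand in for SIMD registers of unlimited width, the function names (transpose, match_byte, positions) are hypothetical, and the convention that stream k holds bit k counted from the least significant bit is an assumption made for this example. The sketch shows how one sequence of bitwise operations marks every occurrence of a character at once, with no branch per input character.

```python
def transpose(data: bytes) -> list:
    """Split the input into 8 bit streams.

    Stream k collects bit k (0 = least significant, an assumed convention)
    of every byte; bit i of each stream corresponds to byte i of the input.
    """
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_byte(streams: list, target: int, length: int) -> int:
    """Return a bit stream with bit i set where input byte i equals target.

    Each step is a single bitwise operation over the whole stream, so all
    input positions are tested together.
    """
    all_ones = (1 << length) - 1
    result = all_ones
    for k in range(8):
        if (target >> k) & 1:
            result &= streams[k]              # bit k must be 1
        else:
            result &= ~streams[k] & all_ones  # bit k must be 0
    return result


def positions(stream: int) -> list:
    """List the indices of the set bits, lowest index first."""
    out = []
    i = 0
    while stream:
        if stream & 1:
            out.append(i)
        stream >>= 1
        i += 1
    return out


text = b"<a><b/></a>"
streams = transpose(text)
lt_marks = match_byte(streams, ord("<"), len(text))
print(positions(lt_marks))  # positions of '<' in the input: [0, 3, 7]
```

In this sketch the transposition itself is still a byte-at-a-time loop; it is written that way only for readability, and the point of the example is the matching step that follows it.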