source: docs/HPCA2012/00-abstract.tex @ 1379

Last change on this file since 1379 was 1360, checked in by cameron, 8 years ago

More abstract changes.

Text files are widely used in modern applications. For example,
XML files provide data storage in a human-readable format and are
ubiquitous in applications ranging from database systems to mobile phone
SDKs.  Traditional text processing tools are built around a
byte-at-a-time processing model in which each character token of a
document is examined. The byte-at-a-time model is highly challenging
for commodity processors. It includes many unpredictable
input-dependent branches, which cause pipeline squashes and
stalls. Furthermore, typical text processing tools perform few
operations per processed character and experience high cache miss
rates. Overall, parsing text in important domains like XML processing
requires high performance, motivating hardware designers to adopt ASIC
solutions.

% In this paper on commodity.
% We expose through a toolchain.
% We demonstrate what can be achieved with branches etc.
% We study various tradeoffs.
% Finally we show the benefits can be stacked

In this paper, we enable text processing applications to effectively
use commodity processors. We introduce Parabix (Parallel Bitstream)
technology, a software toolkit and execution framework that allows
applications to exploit modern SIMD instruction extensions for
high-performance text processing. Parabix enables the application developer
to write constructs assuming unlimited SIMD data parallelism, while
Parabix's bitstream translator generates code based on machine specifics
(e.g., SIMD register widths).  The key insight behind efficient text
processing in Parabix is its data organization. Parabix transposes the
sequence of character bytes into sets of 8 parallel bit streams, which
then enables us to operate on multiple characters with single
bit-parallel SIMD operations. We demonstrate the features and
efficiency of Parabix with an XML parsing application. We evaluate a
Parabix-based XML parser against two widely used XML parsers, Expat
and Apache's Xerces, and across three generations of x86 processors,
including the new Intel \SB{}.  We show that Parabix achieves a speedup of
2$\times$--7$\times$ over Expat and Xerces. We observe that Parabix
overall makes efficient use of intra-core parallel hardware on
commodity processors and supports significant gains in energy efficiency. Using
Parabix, we assess the scalability advantages of SIMD processor
improvements across Intel processor generations, culminating in a
look at the latest 256-bit AVX technology in \SB{} versus the
now-legacy 128-bit SSE technology. We also examine Parabix on mobile platforms
using ARM processors with Neon SIMD extensions.  Finally, we partition the XML
program into pipeline stages and demonstrate that thread-level
parallelism exploits SIMD units scattered across the different cores
and improves performance (2$\times$ on 4 cores) at the same energy level
as the single-threaded version.