source: docs/HPCA2012/final_ieee/00-abstract.tex @ 1747

Last change on this file since 1747 was 1747, checked in by cameron, 8 years ago

Abstract/title/intro cleanups.

% In modern applications text files are employed widely. For example,
% XML files provide data storage in human readable format and are
% ubiquitous in applications ranging from database systems to mobile
% phone SDKs.
% Traditional text processing tools are built around a
% byte-at-a-time processing model where each character token of a
% document is examined. The byte-at-a-time model is highly challenging
% for commodity processors. It includes many unpredictable
% input-dependent branches which cause pipeline squashes and
% stalls. Furthermore, typical text processing tools perform few
% operations per character and experience high cache miss
% rates. Overall, parsing text in important domains like XML processing
% requires high performance motivating the adoption of custom hardware
% solutions.
% % In this paper on commodity.
% % We expose through a toolchain.
% % We demonstrate what can be achieved with branches etc.
% % We study various tradeoffs.
% % Finally we show the benefits can be stacked
% In this paper, we enable text processing applications to effectively
% use commodity processors. We introduce Parabix (Parallel Bit Stream)
% technology, a software toolchain and execution framework that allows
% applications to exploit modern SIMD instructions for high performance
% text processing. Parabix enables the application developer to write
% constructs assuming unlimited SIMD data parallelism and Parabix's
% bit stream translator generates code based on machine specifics (e.g.,
% SIMD register widths).  The key insight into efficient text processing
% in Parabix is the data organization. Parabix transposes the sequence
% of character bytes into sets of 8 parallel bit streams which then
% enables us to operate on multiple characters with bit-parallel SIMD
% operations. We demonstrate the features and efficiency of Parabix with
% an XML parsing application. We evaluate the Parabix-based parser
% against two widely used XML parsers, Expat and Apache's
% Xerces. Parabix makes efficient use of intra-core SIMD hardware and
% demonstrates 2$\times$--7$\times$ speedup and 4$\times$ improvement in
% energy efficiency compared to the conventional parsers. We assess the
% scalability of SIMD implementations across three generations of x86
% processors including the new \SB{}. We compare the 256-bit AVX
% technology in Intel \SB{} versus the now legacy 128-bit SSE technology
% and analyze the benefits and challenges of using the AVX
% extensions.  Finally, we partition the XML program into pipeline stages
% and demonstrate that thread-level parallelism enables the application
% to exploit SIMD units scattered across the different cores and
% improves performance (2$\times$ on 4 cores) at the same energy levels as
% the single-thread version for the XML application.
Modern applications employ text files widely to store data in
human-readable form, in settings ranging from database systems to
mobile phones. Traditional text processing tools are built around a
byte-at-a-time sequential processing model, which incurs significant
branch and cache miss penalties.  Recent work has explored a
transposed representation of text, Parabix (Parallel Bit Streams),
to accelerate scanning and parsing using SIMD facilities.
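
As a rough illustration of the transposed representation mentioned above, the following Python sketch splits a byte sequence into 8 parallel bit streams and then locates every occurrence of one character using only bitwise operations on whole streams. It is not Parabix code: the names `transpose` and `match_byte` are hypothetical, Python integers stand in for SIMD registers of unbounded width, and the bit numbering (bit 0 is the least significant bit of each byte, stream position i corresponds to input byte i) is an assumption made for this example.

```python
def transpose(data: bytes) -> list[int]:
    """Return 8 bit streams; stream k collects bit k of every input byte.

    Assumption for this sketch: k = 0 is the least significant bit, and
    bit i of each stream corresponds to byte i of the input.
    """
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_byte(streams: list[int], length: int, target: int) -> int:
    """Return a bit stream marking every position whose byte equals target.

    Each AND/NOT below acts on all positions at once, which is the
    bit-parallel step that SIMD registers would perform in hardware.
    """
    mask = (1 << length) - 1          # keep only the valid positions
    result = mask
    for k in range(8):
        if (target >> k) & 1:
            result &= streams[k]
        else:
            result &= ~streams[k] & mask
    return result


text = b"<a><b/></a>"
basis = transpose(text)
lt = match_byte(basis, len(text), ord("<"))
positions = [i for i in range(len(text)) if (lt >> i) & 1]
print(positions)  # [0, 3, 7]
```

The matching loop runs 8 times regardless of input length, so no branch depends on the input data; only the final extraction of positions walks the result.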

This paper advocates and develops Parabix as a general framework
and toolkit, describing the software toolchain and run-time support
that allow applications to exploit modern SIMD instructions for high
performance text processing. The toolchain enables the application
developer to write constructs assuming unbounded character streams,
and Parabix's code translator generates code based on machine
specifics (e.g., SIMD register widths).   The general argument
in support of Parabix technology is made by a detailed performance
and energy study of XML parsing across a range of processor architectures.
Parabix exploits intra-core SIMD hardware and demonstrates
2$\times$--7$\times$ speedup and 4$\times$ improvement in energy
efficiency compared to two widely used conventional software parsers,
Expat and Apache-Xerces. We study SIMD implementations across three
generations of x86 processors, including the new \SB{}. We compare the
256-bit AVX technology in Intel \SB{} with the now-legacy 128-bit
SSE technology and analyze the benefits and challenges of 3-operand
instruction formats and wider SIMD hardware.  Finally, we partition
the XML program into pipeline stages and demonstrate that thread-level
parallelism enables the application to exploit SIMD units scattered
across the different cores and improves performance (2$\times$ on 4
cores) at the same energy levels as the single-thread version for the XML
application.