Changeset 1726


Timestamp:
Nov 21, 2011, 5:56:15 PM
Author:
lindanl
Message:

shorten abstract for HPCA

File:
1 edited

  • docs/HPCA2012/00-abstract.tex

    r1421 r1726  
    1 In modern applications text files are employed widely. For example,
    2 XML files provide data storage in human readable format and are
    3 ubiquitous in applications ranging from database systems to mobile
    4 phone SDKs.  Traditional text processing tools are built around a
    5 byte-at-a-time processing model where each character token of a
    6 document is examined. The byte-at-a-time model is highly challenging
    7 for commodity processors. It includes many unpredictable
    8 input-dependent branches which cause pipeline squashes and
    9 stalls. Furthermore, typical text processing tools perform few
    10 operations per character and experience high cache miss
    11 rates. Overall, parsing text in important domains like XML processing
    12 requires high performance motivating the adoption of custom hardware
    13 solutions.
     1 % In modern applications text files are employed widely. For example,
     2 % XML files provide data storage in human readable format and are
     3 % ubiquitous in applications ranging from database systems to mobile
     4 % phone SDKs. 
     5 % Traditional text processing tools are built around a
     6 % byte-at-a-time processing model where each character token of a
     7 % document is examined. The byte-at-a-time model is highly challenging
     8 % for commodity processors. It includes many unpredictable
     9 % input-dependent branches which cause pipeline squashes and
     10 % stalls. Furthermore, typical text processing tools perform few
     11 % operations per character and experience high cache miss
     12 % rates. Overall, parsing text in important domains like XML processing
     13 % requires high performance motivating the adoption of custom hardware
     14 % solutions.
     15 %
     16 % % In this paper on commodity.
     17 % % We expose through a toolchain.
     18 % % We demonstrate what can be achieved with branches etc.
     19 % % We study various tradeoffs.
     20 % % Finally we show the benefits can be stacked
     21 %
     22 % In this paper, we enable text processing applications to effectively
     23 % use commodity processors. We introduce Parabix (Parallel Bit Stream)
     24 % technology, a software toolchain and execution framework that allows
     25 % applications to exploit modern SIMD instructions for high performance
     26 % text processing. Parabix enables the application developer to write
     27 % constructs assuming unlimited SIMD data parallelism and Parabix's
     28 % bit stream translator generates code based on machine specifics (e.g.,
     29 % SIMD register widths).  The key insight into efficient text processing
     30 % in Parabix is the data organization. Parabix transposes the sequence
     31 % of character bytes into sets of 8 parallel bit streams which then
     32 % enables us to operate on multiple characters with bit-parallel SIMD
     33 % operations. We demonstrate the features and efficiency of Parabix with
     34 % an XML parsing application. We evaluate the Parabix-based parser
     35 % against two widely used XML parsers, Expat and Apache's
     36 % Xerces. Parabix makes efficient use of intra-core SIMD hardware and
     37 % demonstrates 2$\times$--7$\times$ speedup and 4$\times$ improvement in
     38 % energy efficiency compared to the conventional parsers. We assess the
     39 % scalability of SIMD implementations across three generations of x86
     40 % processors including the new \SB{}. We compare the 256-bit AVX
     41 % technology in Intel \SB{} versus the now legacy 128-bit SSE technology
     42 % and analyze the benefits and challenges of using the AVX
     43 % extensions.  Finally, we partition the XML program into pipeline stages
     44 % and demonstrate that thread-level parallelism enables the application
     45 % to exploits SIMD units scattered across the different cores and
     46 % improves performance (2$\times$ on 4 cores) at same energy levels as
     47 % the single-thread version for the XML application.
    14 48
    15 % In this paper on commodity.
    16 % We expose through a toolchain.
    17 % We demonstrate what can be achieved with branches etc.
    18 % We study various tradeoffs.
    19 % Finally we show the benefits can be stacked
    20 49
    21 In this paper, we enable text processing applications to effectively
    22 use commodity processors. We introduce Parabix (Parallel Bit Stream)
    23 technology, a software toolchain and execution framework that allows
    24 applications to exploit modern SIMD instructions for high performance
    25 text processing. Parabix enables the application developer to write
    26 constructs assuming unlimited SIMD data parallelism and Parabix's
     50 Traditional text processing tools are built around a byte-at-a-time
     51 sequential processing model, which is hard to parallelize without special hardware.
     52 However, Parabix (Parallel Bit Stream) technology
     53 enables text processing applications to effectively use commodity processors.
     54 In this paper, we generalize Parabix into a software toolchain and execution
     55 framework that allows applications to exploit modern SIMD instructions for high
     56 performance text processing. This toolchain enables the application developer
     57 to write constructs assuming unlimited SIMD data parallelism and Parabix's
    27 58 bit stream translator generates code based on machine specifics (e.g.,
    28 SIMD register widths).  The key insight into efficient text processing
    29 in Parabix is the data organization. Parabix transposes the sequence
    30 of character bytes into sets of 8 parallel bit streams which then
    31 enables us to operate on multiple characters with bit-parallel SIMD
    32 operations. We demonstrate the features and efficiency of Parabix with
     59 SIMD register widths). We demonstrate the features and efficiency of Parabix with
    33 60 an XML parsing application. We evaluate the Parabix-based parser
    34 61 against two widely used XML parsers, Expat and Apache's
     
    45 72 improves performance (2$\times$ on 4 cores) at same energy levels as
    46 73 the single-thread version for the XML application.
    47 
    48 
    49 
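
The earlier revision of the abstract (r1421, old lines 29-32) describes transposing character bytes into 8 parallel bit streams so that bit-parallel operations act on many characters at once. The following is only a minimal sketch of that idea, not Parabix code: the function names `transpose`, `match_char`, and `positions` are hypothetical, and Python's arbitrary-precision integers stand in for SIMD registers of unlimited width.

```python
def transpose(data):
    """Return 8 bit streams as integers.

    streams[k] has bit i set iff bit k (0 = least significant) of data[i] is set.
    """
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_char(streams, ch, length):
    """Return a bit stream marking every position whose byte equals ch.

    Each step is one bitwise operation applied to all positions at once,
    with no per-character branch.
    """
    mask = (1 << length) - 1          # limit the result to valid positions
    result = mask
    for k in range(8):
        if (ch >> k) & 1:
            result &= streams[k]          # bit k must be 1
        else:
            result &= ~streams[k] & mask  # bit k must be 0
    return result


def positions(stream):
    """List the indices of the set bits in a bit stream."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


text = b"<a><b/></a>"
streams = transpose(text)
print(positions(match_char(streams, ord("<"), len(text))))  # [0, 3, 7]
```

The transposition loop here is written byte by byte for clarity; the point of the sketch is the matching step, which touches every character position with a fixed number of bitwise operations.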