Timestamp:
Aug 23, 2011, 11:42:04 AM (8 years ago)
Author:
ashriram
Message:

New conclusion

File:
1 edited

  • docs/HPCA2012/00-abstract.tex

--- docs/HPCA2012/00-abstract.tex (r1349)
+++ docs/HPCA2012/00-abstract.tex (r1350)
@@ -1,15 +1,15 @@
 In modern applications text files are employed widely. For example,
 XML files provide data storage in human readable format and are widely
-used in web services, database systems, and mobile phone SDKs.
-Traditional text processing tools are built around a byte-at-a-time
-processing model where each character token of a document is
-examined. The byte-at-a-time model is highly challenging for commodity
-processors. It includes many unpredictable input-dependent branches
-which cause pipeline squashes and stalls. Furthermore, typical text
-processing tools perform few operations per processed character and
-experience high cache miss rates when parsing the file. Overall,
-parsing text in important domains like XML processing requires high
-performance motivating hardware designers to adopt customized hardware
-and ASIC solutions.
+used in applications ranging from database systems to mobile phone
+SDKs.  Traditional text processing tools are built around a
+byte-at-a-time processing model where each character token of a
+document is examined. The byte-at-a-time model is highly challenging
+for commodity processors. It includes many unpredictable
+input-dependent branches which cause pipeline squashes and
+stalls. Furthermore, typical text processing tools perform few
+operations per processed character and experience high cache miss
+rates. Overall, parsing text in important domains like XML processing
+requires high performance motivating hardware designers to adopt ASIC
+solutions.
 
 % In this paper on commodity.
@@ -21,31 +21,29 @@
 In this paper, we enable text processing applications to effectively
 use commodity processors. We introduce Parabix (Parallel Bitstream)
-technology, a software runtime and execution model that allows applications
-to exploit modern SIMD instructions extensions for high performance
-text processing. Parabix enables the application developer to write
-constructs assuming unlimited SIMD data parallelism. Our runtime
-translator generates code based on machine specifics (e.g., SIMD
-register widths) to realize the programmer specifications.  The key
-insight into efficient text processing in Parabix is the data
-organization. It transposes the sequence of 8-bit characters into
-sets of 8 parallel bit streams which then enables us to operate on
-multiple characters with single bit-parallel SIMD operators. We
-demonstrate the features and efficiency of parabix with a XML parsing
-application. We evaluate a Parabix-based XML parser against two widely
-used XML parsers, Expat and Apache's Xerces, and across three
-generations of x86 processors, including the new Intel \SB{}.  We show
-that Parabix's speedup is 2$\times$--7$\times$ over Expat and
-Xerces. We observe that Parabix overall makes efficient use of
-intra-core parallel hardware on commodity processors and supports
-significant gains in energy. Using Parabix, we assess the scalability
-advantages of SIMD processor improvements across Intel processor
-generations, culminating with a look at the latex 256-bit AVX
-technology in \SB{} versus the now legacy 128-bit SSE technology. As
-part of this study we also preview the Neon extensions on ARM
-processors. Finally, we partition the XML program into pipeline stages
-and demonstrate that thread-level parallelism exploits SIMD units
-scattered across the different cores and improves performance
-(2$\times$ on 4 cores) at same energy levels as the single-thread
-version.
+technology, a software runtime and execution model that allows
+applications to exploit modern SIMD instructions extensions for high
+performance text processing. Parabix enables the application developer
+to write constructs assuming unlimited SIMD data parallelism and
+Parabix's runtime translator generates code based on machine specifics
+(e.g., SIMD register widths).  The key insight into efficient text
+processing in Parabix is the data organization. Parabix transposes the
+sequence of character bytes into sets of 8 parallel bit streams which
+then enables us to operate on multiple characters with single
+bit-parallel SIMD operators. We demonstrate the features and
+efficiency of parabix with a XML parsing application. We evaluate a
+Parabix-based XML parser against two widely used XML parsers, Expat
+and Apache's Xerces, and across three generations of x86 processors,
+including the new Intel \SB{}.  We show that Parabix's speedup is
+2$\times$--7$\times$ over Expat and Xerces. We observe that Parabix
+overall makes efficient use of intra-core parallel hardware on
+commodity processors and supports significant gains in energy. Using
+Parabix, we assess the scalability advantages of SIMD processor
+improvements across Intel processor generations, culminating with a
+look at the latex 256-bit AVX technology in \SB{} versus the now
+legacy 128-bit SSE technology. Finally, we partition the XML
+program into pipeline stages and demonstrate that thread-level
+parallelism exploits SIMD units scattered across the different cores
+and improves performance (2$\times$ on 4 cores) at same energy levels
+as the single-thread version.
 
 
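
For readers unfamiliar with the data organization the abstract mentions, the following is a minimal sketch of transposing a byte sequence into 8 parallel bit streams and then locating a character with bitwise operations only. It is not Parabix code: the function names `transpose`, `match_char`, and `positions` are hypothetical, and Python's arbitrary-precision integers stand in for SIMD registers purely for illustration.

```python
def transpose(data: bytes) -> list[int]:
    """Return 8 bit streams for data (illustrative helper, not a Parabix API).

    Bit i of streams[k] equals bit k (0 = least significant) of data[i].
    """
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_char(streams: list[int], length: int, ch: str) -> int:
    """Return a bit stream whose bit i is set where the input byte i equals ch.

    Every input position is tested at once by the same AND / AND-NOT
    operations; a SIMD register would cover a fixed-width slice of the stream.
    """
    code = ord(ch)
    mask = (1 << length) - 1  # limit the result to valid positions
    result = mask
    for k in range(8):
        if (code >> k) & 1:
            result &= streams[k]
        else:
            result &= ~streams[k] & mask
    return result


def positions(stream: int) -> list[int]:
    """List the indices of the set bits in a bit stream."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


text = b"<a><b/></a>"
streams = transpose(text)
print(positions(match_char(streams, len(text), "<")))  # [0, 3, 7]
```

The point of the sketch is that `match_char` contains no branch that depends on the input text: the same eight bitwise steps run regardless of content, which is the property the abstract contrasts with byte-at-a-time parsing.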