Changeset 1349 for docs/HPCA2012


Timestamp:
Aug 23, 2011, 9:55:22 AM
Author:
cameron
Message:

Minor edits in abstract/intro

Location:
docs/HPCA2012
Files:
3 edited

--- docs/HPCA2012/00-abstract.tex (r1348)
+++ docs/HPCA2012/00-abstract.tex (r1349)
@@ -8,7 +8,7 @@
 which cause pipeline squashes and stalls. Furthermore, typical text
 processing tools perform few operations per processed character and
-experience high cache miss rate when parsing the file. Overall,
-parsing text in important domains like XML processing require high
-performance and hardware designers have adopted customized hardware
+experience high cache miss rates when parsing the file. Overall,
+parsing text in important domains like XML processing requires high
+performance motivating hardware designers to adopt customized hardware
 and ASIC solutions.
 
@@ -19,7 +19,7 @@
 % Finally we show the benefits can be stacked
 
-In this paper we enable text processing applications to effectively
+In this paper, we enable text processing applications to effectively
 use commodity processors. We introduce Parabix (Parallel Bitstream)
-technology, a software runtime and execution model that applications
+technology, a software runtime and execution model that allows applications
 to exploit modern SIMD instructions extensions for high performance
 text processing. Parabix enables the application developer to write
@@ -28,9 +28,9 @@
 register widths) to realize the programmer specifications.  The key
 insight into efficient text processing in Parabix is the data
-organization. It transposes the sequence of 8-byte characters into
+organization. It transposes the sequence of 8-bit characters into
 sets of 8 parallel bit streams which then enables us to operate on
-multiple characters with a single bit-parallel SIMD operators. We
+multiple characters with single bit-parallel SIMD operators. We
 demonstrate the features and efficiency of parabix with a XML parsing
-application. We evaluate Parabix-based XML parser against two widely
+application. We evaluate a Parabix-based XML parser against two widely
 used XML parsers, Expat and Apache's Xerces, and across three
 generations of x86 processors, including the new Intel \SB{}.  We show
--- docs/HPCA2012/01-intro.tex (r1348)
+++ docs/HPCA2012/01-intro.tex (r1349)
@@ -8,5 +8,5 @@
 shut off. Chip makers strive to achieve energy efficient computing by
 operating at more optimal core frequencies and aim to increase
-performance with larger number of cores. Unfortunately, given the
+performance with a larger number of cores. Unfortunately, given the
 limited levels of parallelism that can be found in
 applications~\cite{blake-isca-2010}, it is not certain how many cores
@@ -39,9 +39,9 @@
 transposing byte-oriented character data into parallel bit streams for
 the individual bits of each byte, the Parabix framework exploits the
-SIMD extensions (SSE/AVX on x86, Neon on ARM) on commodity processors
+SIMD extensions on commodity processors (SSE/AVX on x86, Neon on ARM)
 to process hundreds of character positions in an input stream
 simultaneously.  To achieve transposition, Parabix exploits
 sophisticated SIMD instructions that enable data elements to be packed
-and unpacked from registers in a regular manner which improve the
+and unpacked from registers in a regular manner which improves the
 overall cache access behavior of the application resulting in
 significantly fewer misses and better utilization.  Parabix also
@@ -57,24 +57,28 @@
 applications ranging from Office Open XML in Microsoft Office to NDFD
 XML of the NOAA National Weather Service, from KML in Google Earth to
-Castor XML in the Martian Rovers, a XML data in Android phones.  XML
+Castor XML in the Martian Rovers, as well as ubiquitous XML data in Android phones.  XML
 parsing efficiency is important for multiple application areas; in
 server workloads the key focus in on overall transactions per second,
-while in applications in the network switches and cell phones, latency
-and the energy are of paramount importance.  Traditional
-software-based XML parsers have many inefficiencies due to complex
-input-dependent branching structures leading to considerable branch
-misprediction penalties as well as poor use of memory bandwidth and
+while in applications in network switches and cell phones, latency
+and energy are of paramount importance.  Traditional
+software-based XML parsers have many inefficiencies including
+considerable branch misprediction penalties due to complex
+input-dependent branching structures as well as poor use of memory bandwidth and
 data caches due to byte-at-a-time processing and multiple buffering.
 XML ASIC chips have been around for over 6 years, but typically lag
 behind CPUs in technology due to cost constraints. Our focus is how
-much can we improve performance of the XML parser on commodity
+much we can improve performance of the XML parser on commodity
 processors with Parabix technology.
 
 In the end, as summarized by
 Figure~\ref{perf-energy} our Parabix-based XML parser improves the
-performance by ?$\times$ and energy efficiency by ?$\times$ compared
+performance by %?$\times$
+and energy efficiency %by ?$\times$
+several-fold compared
 to widely-used software parsers and approaching the performance of
-?$cycles/input-byte$ performance of ASIC XML
-parsers~\cite{}.\footnote{The actual energy consumption of the XML
+%?$cycles/input-byte$
+performance of ASIC XML
+parsers.%~\cite{}.
+\footnote{The actual energy consumption of the XML
   ASIC chips is not published by the companies.}
 
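
Illustration (not part of the changeset): the abstract edited above describes transposing a sequence of 8-bit characters into 8 parallel bit streams so that one bitwise operation covers many character positions. The sketch below shows that idea only. It is not the Parabix implementation; the function names `transpose`, `match_byte`, and `positions` are hypothetical, and arbitrary-precision Python integers stand in for SIMD registers.

```python
def transpose(data: bytes) -> list:
    """Return 8 bit streams; bit i of streams[k] equals bit k of data[i]."""
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def match_byte(streams: list, length: int, target: int) -> int:
    """Return a marker stream whose bit i is 1 exactly where data[i] == target.

    Each loop iteration is one bitwise operation over all positions at once,
    which is the role a SIMD instruction plays in the text above.
    """
    mask = (1 << length) - 1  # all-ones over the valid positions
    result = mask
    for k in range(8):
        if (target >> k) & 1:
            result &= streams[k]
        else:
            result &= ~streams[k] & mask
    return result


def positions(marker: int) -> list:
    """List the positions of the 1 bits in a marker stream."""
    return [i for i in range(marker.bit_length()) if (marker >> i) & 1]


text = b"<a><b/></a>"
streams = transpose(text)
lt = match_byte(streams, len(text), ord("<"))
print(positions(lt))  # positions of '<' in text
```

The per-byte loop in `transpose` is the naive form; the intro text says Parabix performs this step with SIMD pack and unpack instructions instead.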