Changeset 247


Ignore:
Timestamp:
Dec 24, 2008, 6:36:56 PM (10 years ago)
Author:
cameron
Message:

Intro

Location:
docs/ASPLOS09
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • docs/ASPLOS09/asplos094-cameron.tex

    r246 r247  
    6464keyword1, keyword2
    6565
    66 
     66\sloppy
    6767
    6868\section{Introduction}
     
    8282One approach that shows some promise, however, is the
    8383method of parallel bit streams, recently applied to
    84 UTF-8 to UTF-16 transcoding \cite{PPoPP08}, XML parsing \cite{CASCON08}
     84UTF-8 to UTF-16 transcoding \cite{u8u16, PPoPP08}, XML parsing \cite{CASCON08, Herdy}
    8585and amino acid sequencing\cite{Green}.
    8686In this method, byte-oriented character data is first transposed to eight
     
    9292out the work.
    9393
    94 The cited study of UTF-8 to UTF-16 transcoding reports a 3X to 25X
    95 speed-up in using parallel bit stream techniques on SIMD-capable uniprocessors
    96 employing the SSE or Altivec instruction sets.  A full implementation
    97 for each of these platforms is documented using literate programming techniques
    98 and available as an open source code base \cite{u8u16}.
    99 Although this transcoding task represents but a single text processing kernel,
    100 it is widely cited as a critical bottleneck in XML parsing, accounting for 30\% or more
    101 of processing time\cite{NicolaJohn03, Perkins05, Psaila06}.  The example is also interesting in that
    102 it illustrates a text processing task that can largely
    103 be carried out using SIMD operations even though
    104 UTF-8 character sequences are variable length. Indeed, one weakness of the
    105 actual implementation is that it reverts to byte-at-a-time processing
    106 at block boundaries, employing a block shortening strategy to
    107 reduce block lengths to as low as 125 bytes from 128.
    108 With more capable SIMD instruction set architectures providing
    109 better facilities for multiregister shifts and larger register
    110 files, a solution employing SIMD techniques for virtually all
    111 of the main work would maintain better data alignment, avoid
    112 problems of register pressure and be easier to parallelize across
    113 multiple cores.  It should also naturally scale to 256-bit SIMD technology
    114 such as the expected AVX technology of Intel.
    115 
    116 The report addressing the broader problem of XML parsing is perhaps more
    117 interesting, demonstrating the utility of parallel bit stream techniques
     94In application to UTF-8 to UTF-16 transcoding, a 3X to 25X speed-up
     95is achieved in using parallel bit stream techniques on SIMD-capable uniprocessors
     96employing the SSE or Altivec instruction sets\cite{PPoPP08}.
     97In the broader context of XML parsing, further applications of these
     98techniques demonstrate the utility of parallel bit stream techniques
    11899in delivering performance benefits through a significant portion of the
    119100web technology stack.  In an XML statistics gathering application,
     
    121102overall 3X to 10X performance improvement is achieved in using
    122103the parallel bit stream methods in comparison with a similarly
    123 coded application using such well known parsers as Expat and Xerces.
    124 Although still a work in progress, the parser has functional
    125 coverage of XML and related specifications comparable to, but somewhat
    126 beyond Expat.   The study also identifies promising further
    127 work in extending the parallel bit stream methods to parallel
    128 hash value computation and parallel regular expression matching
     104coded application using such well known parsers as Expat and Xerces \cite{CASCON08}.
     105In an application involving transformation between different XML formats
     106(GML and SVG), an implementation using parallel bit stream technology
     107required a mere 15 cycles per byte, while a range of other technologies
     108required from 25 to 200 cycles per byte \cite{Herdy}.
     109Ongoing work is further applying the parallel bit stream methds to
     110parallel hash value computation and parallel regular expression matching
    129111for the purpose of validating XML datatype declarations in
    130 accord with XML Schema.
     112accord with XML Schema \cite{CASCON08}.
    131113
    132114Given these promising initial results in the application of
Note: See TracChangeset for help on using the changeset viewer.