Changeset 1025


Timestamp: Mar 25, 2011, 7:01:42 PM (8 years ago)
Author: cameron
Message: Cleanups - third person form
Location: docs/PACT2011
Files: 3 edited

  • docs/PACT2011/00-abstract.tex

    r1014 r1025  
    13 13   first converts the character steams into sets of parallel
    14 14   bitstreams and then exploits SIMD operations prevalent on modern CPUs.
    15    - Our first generation Parabix1 parser then uses bitscan instructions
       15 + The first generation Parabix1 parser then uses bitscan instructions
    16 16   over these streams to make multibyte moves in an otherwise sequential
    17    - approach.   Our second generation Parabix2 technology further
    18    - parallelizes our parsers by replacing much of the sequential
       17 + approach.   The second generation Parabix2 technology further
       18 + parallelizes these parsers by replacing much of the sequential
    19 19   bit scanning with a parallel scanning approach based on bitstream
    20 20   addition.    We evaluate Parabix1 and Parabix2
  • docs/PACT2011/01-intro.tex

    r998 r1025  
    38 38   while SIMD (single-instruction multiple data) parallelism
    39 39   has been of interest to Intel in designing new SIMD instructions\cite{XMLSSE42}
    40    - as well as to our group in developing parallel bit stream technology
       40 + as well as to the developers of parallel bit stream technology
    41 41   \cite{CameronHerdyLin2008,Cameron2009,Cameron2010}.
    42 42   Each of these approaches has shown considerable performance

    53 53   characteristics of several XML parsers across three generations
    54 54   of x86-64 processor technology.   The parsers we consider are
    55    - the widely used byte-at-a-time parsers Expat and Xerces as well our
    56    - own Parabix1 and Parabix2 parsers
       55 + the widely used byte-at-a-time parsers Expat and Xerces as well the
       56 + Parabix1 and Parabix2 parsers based on parallel bit stream technology
    57 57   A compelling result is that
    58 58   the performance benefits of parallel bit stream technology

    75 75   and traditional parsing methods.   Section 3 then reviews
    76 76   parallel bit stream technology as applied to
    77    - XML parsing in our Parabix1 and Parabix2 parsers.
       77 + XML parsing in the Parabix1 and Parabix2 parsers.
    78 78   Section 4 then introduces our methodology and approach
    79 79   for the performance and energy study tackled in the

    86 86   performance gains through three generations of Intel
    87 87   architecture culminating with performance assessment
    88    - on our one week-old Sandy Bridge test machine.
       88 + on our two week-old Sandy Bridge test machine.
    89 89   Section 7 looks specifically at issues in applying
    90 90   the new 256-bit AVX technology to parallel bit stream
  • docs/PACT2011/02-background.tex

    r1019 r1025  
     1  1   \section{Background}
     2  2   \label{section:background}
     3    - This section provides a brief overview of XML and traditional and parallel XML processing technology. Section \ref{section:reserach} describes the key design and performance aspects of both generations of the Parabix parallel XML processing technology.
     4  3
     5  4   \subsection{XML}

    40 39   Expat and Xerces-C are popular byte-a-time sequential parsers. Both are C/C++ based and open-source. Expat was originally released in 1998; it is currently used in Mozilla Firefox and Open Office \cite{expat}. Xerces-C was released in 1999 and is the foundation of the Apache XML project \cite{xerces}. For example, the main loop of Xerces-C well-formedness scanner contains:
    41 40
    42    - \begin{verbatim}
    43    -    XXXXXXXXXX   XERCES CODE   XXXXXXXXXX
    44    - \end{verbatim}
       41 + %\begin{verbatim}
       42 + %   XXXXXXXXXX   XERCES CODE   XXXXXXXXXX
       43 + %\end{verbatim}
    45 44
    46 45   The major disadvantage of the byte-at-a-time sequential approach to XML parsering is that each character incurs at least one conditional branch. The cummulative effect of branch mispredictions penalties are known to degrade parsing performance in proportion to the markup density of the source document \cite{CameronHerdyLin2008} (i.e., the proportion of XML-markup vs. XML-data).
    47 46
    48 47   \subsection {Parallel XML Parsing}
    49    - In general, parallel XML acceleration methods comes in one of two forms: multithreaded approaches and SIMD-based techniques. Multithreaded XML parsers take advantage of multiple cores via number of strategies. Approaches include preparsing the XML file to locate key partitioning points \cite{ZhangPanChiu09} and speculative P-DFAs \cite{ZhangPanChiu09}. Once divided, the XML workload is processed independently across the available cores. SIMD XML parsers leverage the SIMD registers to overcome the performance limitations of the byte-at-a-time sequential processing paradigm as well as inherent data dependent branch misprediction rates \cite{Cameron2010}. SIMD instructions allow the processor to perform the same operation on multiple pieces of data simultaneously. To our knowledge, the only SIMD-based XML parsers are Parabix1 and Parabix2, both of which were designed and developed by Cameron et al. \cite{CameronHerdyLin2008}. We discuss both versions of Parabix in Section \ref{section:reserach}.
       48 + In general, parallel XML acceleration methods comes in one of two forms: multithreaded approaches and SIMD-based techniques.
       49 + Multithreaded XML parsers take advantage of multiple cores via number of strategies.
       50 + Approaches include preparsing the XML file to locate key partitioning points \cite{ZhangPanChiu09} and speculative p-DFAs \cite{ZhangPanChiu09}.
       51 + Once divided, the XML workload is processed independently across the available cores. SIMD XML parsers leverage the
       52 + SIMD registers to overcome the performance limitations of the byte-at-a-time sequential processing paradigm as well as
       53 + inherent data dependent branch misprediction rates.  SIMD instructions allow the processor to perform the same
       54 + operation on multiple pieces of data simultaneously.  The Parabix1 and Parabix2 parsers studied in this paper
       55 + fall in this class and are described in more detail in Section \ref{section:parabix} following.
    50 56
    51    - \subsection {SIMD Operations}
       57 + %\subsection {SIMD Operations}
    52 58   % Two such SIMD XML parsers, Parabix1 and Parabix2, utilizes parallel bit stream processing technology.
    53 59   % Extract section 2.2 and merge into 3.   Add a new subsection