Ignore:
Timestamp:
May 12, 2011, 7:23:42 PM (8 years ago)
Author:
cameron
Message:

Addressing reviewer comments.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • docs/EuroPar2011/europar-cameron.tex

    r901 r1185  
    9898parallel bitstream technology shows considerable promise for these
    9999%types of applications\cite{PPoPP08,CameronHerdyLin2008,Green2009}.
    100 types of applications \cite{PPoPP08,CameronHerdyLin2008,Green2009}.
     100types of applications \cite{PPoPP08,CameronHerdyLin2008}.
    101101In this approach, character streams are processed $N$ positions at
    102102a time using the $N$-bit SIMD registers commonly found on commodity
     
    117117as GML to SVG conversion \cite{Herdy2008}. 
    118118Other efforts to accelerate XML parsing include the use of custom
    119 XML chips \cite{Leventhal2009}, FPGAs \cite{DaiNiZhu2010}, and
    120 multithread/multicore speedups based on fast preparsing \cite{ZhangPanChiu09}.
     119XML chips \cite{Leventhal2009}, FPGAs \cite{DaiNiZhu2010}, careful coding and schema-based processing\cite{XMLScreamer} and
     120multithread/multicore speedups based on data parallelism\cite{Shah2009,ZhangPanChiu09}.
    121121
    122122In this paper, we further increase the parallelism in our methods
     
    310310from an initial set of marker positions $M$ through
    311311the spans of characters belonging to a character class $C$ found at each position.
    312 \[s(M, C) = (M_0 + C)  \wedge \neg C\]
     312\[s(M, C) = (M + C)  \wedge \neg C\]
    313313
    314314
     
    325325\begin{center}
    326326\begin{tabular}{rcl}
    327 DecRef & ::=   &        '\verb:&#:' Digit+ '\verb:;:'  \\
     327DecRef & ::=   &        '\verb:&#:' Digit${}^{+}$ '\verb:;:'  \\
    328328Digit  & ::=   &         \verb:[0-9]:\\
    329 STag         &  ::=   &        '\verb:<:' Name (WS  Attribute)* WS? '\verb:>:'  \\
    330 Attribute & ::=   &        Name WS? '=' WS? AttValue \\
    331 AttValue  &           ::=   &        `\verb:":' \verb:[^<"]*: `\verb:":' $|$ ``\verb:':'' \verb:[^<']*: ``\verb:':'' \\
     329STag         &  ::=   &        '\verb:<:' Name (W  Attribute)* W${}^{?}$ '\verb:>:'  \\
     330Attribute & ::=   &        Name W${}^{?}$ '=' W${}^{?}$ AttValue \\
     331AttValue  &           ::=   &      (  `\verb:":' \verb:[^<"]*: `\verb:":') $|$ (``\verb:':'' \verb:[^<']*: ``\verb:':'') \\
     332        W       &    ::=   &    (\verb:\x20: $|$ \verb:\x9: $|$ \verb:\xD: $|$ \verb:\xA:)${}^{+}$ \\
    332333%DQuoted & ::= & \verb:[^<"]*:  \\
    333334%SQuoted & ::= & \verb:[^<']*:
     
    427428$N = $ name chars & \verb`11.1.1...111..111.111.1..11..11..1111..111.1.11`\\
    428429$W = $ white space & \verb`....1..1.............1......1..................`\\
    429 $Q = \neg$\verb:[">]: & \verb`11.11111.111.1111.111111.11.1111.1111.1111.1111`\\
     430$Q = \neg$\verb:["<]: & \verb`11.11111.111.1111.111111.11.1111.1111.1111.1111`\\
    430431\\
    431432$M_0$ & \verb`..1..............1........................1....`\\
     
    434435$M_{0,8} = s(M_{0,7}, W) \wedge \neg$\verb:[>]: & \verb`.....1................1........................`\\
    435436\\
    436 $M_{1,1} = s(M_{0,8}$ & \verb`......1................1.......................`\\
     437$M_{1,1} = s(M_{0,8}, N)$ & \verb`......1................1.......................`\\
    437438$M_{1,2} = s(M_{1,1}, W) \wedge$\verb:[=]: & \verb`......1................1.......................`\\
    438439$M_{1,3} = n(M_{1,2})$ & \verb`.......1................1......................`\\
    439 $M_{1,4} = s({1,3}, W) \wedge$\verb:["]: & \verb`........1...............1......................`\\
     440$M_{1,4} = s(M_{1,3}, W) \wedge$\verb:["]: & \verb`........1...............1......................`\\
    440441$M_{1,5} = n(M_{1,4})$ & \verb`.........1...............1.....................`\\
    441442$M_{1,6} = s(M_{1,5}, Q) \wedge$\verb:["]: & \verb`............1..............1...................`\\
     
    467468 tags through the
    468469opening angle brackets and  the element names, up to the first
    469 attribute name, if present.  Note that the there are no attribute names
     470attribute name, if present.  Note that there are no attribute names
    470471in the final tag shown, so the corresponding marker becomes zeroed
    471472out at the closing angle bracket.
     
    644645%\subsection{Document Structure}
    645646
    646 An XML document consists of a single root element with recursively
    647 defined structure together with material in the document
    648 prolog and epilogs.  Verifying this top-level structure and
    649 the structure of the prolog and epilog material is not
    650 well suited to parallel bitstream techniques, in particular, nor
    651 to any form of parallelism, in general.  In essence, the
    652 prolog and epilog materials occur once per document instance
    653 Thus the requirements to check this top-level structure
    654 for well-formedness are relatively minor, with an overhead
    655 that is quite low for sufficiently sized files.
     647An XML document consists of a single root element within
     648which all others contained; this constraint is also
     649checked during post-bitstream processing.   In addition,
     650we define the necessary "miscellaneous" bitstreams
     651for checking the prolog and epilog material before
     652and after the root element.
    656653
    657654%\subsection{Summary}
     
    688685to itself; the bit shifted out in an upshift is in this
    689686case equivalent to the carry generated by the additon.
    690 The only other information flow requirement in the
    691 calculation of parallel bitstreams occurs with the
    692 bitstream subtractions that are used to calculate span streams.
    693 In this case, the information flow is based on borrows
    694 generated, which can be handled in the same way as carries.
    695687
    696688Properly determining, initializing and inserting carry bits
     
    768760
    769761The remaining rows of Table \ref{parsers-cpb} show performance
    770 of parallel bitstream implementations.  The first row shows
     762of parallel bitstream implementations, including post-bitstream
     763processing.  The first row shows
    771764the performance of our Parabix 1 implementation using
    772765bit scan instructions.   While showing a substantial speed-up
    773766over the byte-at-a-time parsers in every case, note also that
    774767the performance advantage increases with increasing markup
    775 density, as expected.   The last two rows show different versions of
    776 the \verb:xmlwf: implemented based on the Parabix 2 technology
    777 as discussed in this paper.   They differ in the carry handling
    778 strategy, with the ``simd'' row referring to carry
     768density, as expected.   The last two rows show Parabix 2
     769implementations using different carry-handling
     770strategies, with the ``simd'' row referring to carry
    779771computations performed with simulated calculation of
    780772propagated and generated carries using SIMD operations, while the
    781 ``adc64'' row refers to an implementation directly employing
     773``adc64'' row referring to an implementation directly employing
    782774the processor carry flags and add-with-carry instructions on
    78377564-bit general registers.  In both cases, the overall
     
    854846
    855847\bibliographystyle{plain}
     848
    856849\bibliography{xmlperf}
    857850
Note: See TracChangeset for help on using the changeset viewer.