Changeset 2478 for docs

Oct 18, 2012, 4:05:50 PM

Rework introduction

1 edited


  • docs/Working/icXML/icxml-main.tex

    r2471 r2478  
    23 \documentclass[preprint]{sigplanconf}
    2525% The following \documentclass options may be useful:
    36 \conferenceinfo{PPoPP '13}{date, City.}
     36\conferenceinfo{EuroSys '13}{date, City.}
    3838\copyrightdata{[to be supplied]}
    4141\preprintfooter{short description of paper}   % 'preprint' option specified.
    43 \title{ICXML:  Accelerating a Commercial XML Parser Using Parallel Technologies}
     43\title{ICXML:  Accelerating a Commercial XML Parser Using SIMD and Multicore Technologies}
    4444%\subtitle{Subtitle Text, if any}
    4545\authorinfo{Anonymous Hackers}
    48 % \authorinfo{Nigel Medforth \and Dan Lin \and Rob Cameron \and Arrvindh Shriraman}
    49 %            {Simon Fraser University}
    50 %            {\{nmedfort,lindanl,cameron,ashriram\}}
     48% \authorinfo{Nigel Medforth \and Dan Lin \and Kenneth S. Herdy \and Arrvindh Shriraman \and Robert D. Cameron }
     49%            {International Characters, Inc., and Simon Fraser University}
     50%            {\{nmedfort,lindanl,ksherdy,ashriram,cameron\}}
    72 Paragraph 1: 
    7472Parallelization and acceleration of XML parsing is a widely
    7573studied problem that has seen the development of a number
    76 of interesting research prototypes.
    77 One possibility to data parallelizing the parsing process is by adding a
    78 pre-parsing step to get the skeleton that symbolized the tree structure of the XML document \cite{GRID2006}.
    79 The pre-parsing stage can also be parallelized using state machines \cite{E-SCIENCE2007, IPDPS2008}.
    80 Methods without pre-parsing require speculation \cite{HPCC2011} or post-processing that
     74of interesting research prototypes using both SIMD and
     75multicore parallelism.   Most work has investigated
     76data parallel solutions on multicore
     77architectures, using various strategies to break input
     78documents into segments that can be allocated to different cores.
     79For example, one possibility for data
     80parallelization is to add a pre-parsing step to compute
     81a skeleton tree structure of an  XML document \cite{GRID2006}.
     82The parallelization of the pre-parsing stage itself can be tackled with
     83state machines \cite{E-SCIENCE2007, IPDPS2008}.
     84Methods without pre-parsing have used speculation \cite{HPCC2011} or post-processing that
    8185combines the partial results \cite{ParaDOM2009}.
    8286A hybrid method that combines data parallelism and pipeline parallelism is proposed to
    8387hide the latency of the ``job'' that has to be done sequentially \cite{ICWS2008}.
    84 Intel introduced new string processing instructions in the SSE 4.2 instruction set extension
    85 and showed how it can be used to improve the performance of XML parsing \cite{XMLSSE42}.
    86 Parabix XML parser exploit the SIMD extensions to process hundreds of XML input characters
    87 simultaneously \cite{Cameron2009, cameron-EuroPar2011}.
    88 Parabix can also be combined with thread-level parallelism to achieve further improvement
    89 on multicore systems \cite{HPCA2012}.
    91 Paragraph 2:
     89Fewer efforts have investigated SIMD parallelism, although this approach
     90has the potential advantage of improving single core performance as well
     91as offering savings in energy consumption.
     92Intel introduced specialized SIMD string processing instructions in the SSE 4.2 instruction set extension
     93and showed how they can be used to improve the performance of XML parsing \cite{XMLSSE42}.
     94The Parabix framework uses generic SIMD extensions and bit parallel methods to
     95process hundreds of XML input characters simultaneously \cite{Cameron2009, cameron-EuroPar2011}.
     96Parabix prototypes have also combined SIMD methods with thread-level parallelism to
     97achieve further acceleration on multicore systems \cite{HPCA2012}.
    9299In this paper, we move beyond research prototypes to consider
    93 the detailed integration of parallel methods into the Xerces-C++
    94 parser of the Apache Software Foundation, an existing
     100the detailed integration of both SIMD and multicore parallelism into the
     101Xerces-C++ parser of the Apache Software Foundation, an existing
    95102standards-compliant open-source parser that is widely used
    96 in commercial practice.    Surprisingly, our results show
    97 that a speed-up of more than 100\% can be achieved in some
    98 applications, in apparent defiance of simple calculations
    99 based on Amdahl's law.  [Write text on these calculations
    100 based on reported costs of XML tokenization  (30\%?), transcoding...]
    102 Symbol table lookup: more than 15\%, compute key:3\% \cite{ZhaoBhuyan06}
    104 Schema valiation double, triple or quadruple the parsing cost. \cite{NicolaJohn03}
    106 Transcoding:  about 50\% \cite{Perkins05}
    108 Paragraph 3:
     103in commercial practice.    The challenge of this work is
     104to parallelize the Xerces parser in such a way as to
     105preserve the existing APIs while offering worthwhile
     106end-to-end acceleration of XML processing.   
    109107To achieve the best results possible, we have undertaken
    110108a comprehensive restructuring of the Xerces-C++ parser,
    115113resolution, bit parallel methods in namespace processing, as well as staged
    116114processing with pipeline parallelism to take advantage of
    117 multiple cores.
     115multiple cores.   
     117The remainder of this paper is organized as follows.   Section 2 discusses
     118the structure of the Xerces and Parabix XML parsers and the fundamental
     119differences between the two parsing models.   Section 3 then presents
     120the icXML design based on a restructured Xerces architecture to
     121incorporate SIMD parallelism using Parabix methods.   Section 4 presents a performance
     122study demonstrating substantial end-to-end acceleration of
     123a GML-to-SVG translation application written against the Xerces API.
     124Section 5 moves on to consider the multithreading of the icXML architecture
     125using the pipeline parallelism model.  Section 6 concludes the
     126paper with a discussion of future work and the potential for
     127applying the techniques discussed herein in other application domains.