Changeset 2259 for docs/Working


Timestamp: Aug 3, 2012, 3:49:27 PM
Author: lindanl
Message: Introduction fillins
Location: docs/Working/PPoPP
Files: 2 edited

  • docs/Working/PPoPP/icxml-main.tex

    r2257 r2259   (columns: old line, new line; "-" removed, "+" added)
      68   68     studied problem that has seen the development of a number
      69   69     of interesting research prototypes.
      70        - 
      71        - 
      72        - [Review the previous literature on various parallelization methods.
      73        -      - Scarpazzi XML tokenization, parabix1 and 2, Intel SSE4.2 \cite{XMLSSE42},
      74        -    Kenneth Chiu's work,  other data parallelism work (Balisage 08)...
      75        - ]
           70   + One approach to data-parallel parsing is to add a pre-parsing step that extracts a skeleton representing the tree structure of the XML document \cite{GRID2006}.
           71   + The pre-parsing stage can also be parallelized using state machines \cite{E-SCIENCE2007, IPDPS2008}.
           72   + Methods without pre-parsing require speculation \cite{HPCC2011} or post-processing that combines the partial results \cite{ParaDOM2009}.
           73   + A hybrid method that combines data parallelism and pipeline parallelism has been proposed to hide the latency of the ``job'' that has to be done sequentially \cite{ICWS2008}.
           74   + Intel introduced new string processing instructions in the SSE 4.2 instruction set extension and showed how they can be used to improve the performance of XML parsing \cite{XMLSSE42}.
           75   + The Parabix XML parser exploits SIMD extensions to process hundreds of XML input characters simultaneously \cite{Cameron2009, cameron-EuroPar2011}.
           76   + Parabix can also be combined with thread-level parallelism to achieve further improvement on multicore systems \cite{HPCA2012}.
      76   77     
      77   78     Paragraph 2:

      85   86     based on Amdahl's law.  [Write text on these calculations
      86   87     based on reported costs of XML tokenization  (30\%?), transcoding...]
      87        - 
           88   + 
           89   + Symbol table lookup: more than 15\%, compute key: 3\% \cite{ZhaoBhuyan06}
           90   + 
           91   + Schema validation can double, triple, or quadruple the parsing cost \cite{NicolaJohn03}.
           92   + 
           93   + Transcoding: about 50\% \cite{Perkins05}
           94   + 
      88   95     Paragraph 3:
      89   96     To achieve the best results possible, we have undertaken
  • docs/Working/PPoPP/reference.bib

    r2258 r2259   (columns: old line, new line; "-" removed, "+" added)
     484  484     }
     485  485     
     486        - @inproceedings{Shah:2009,
     487        -  author = {Shah, Bhavik and Rao, Praveen R. and Moon, Bongki and Rajagopalan, Mohan},
     488        -  title = {A Data Parallel Algorithm for {XML DOM} Parsing},
     489        -  booktitle = {Proc. 6th Int'l XML Database Symposium on Database and XML Technologies},
     490        -  series = {XSym '09},
     491        -  year = {2009},
     492        -  location = {Lyon, France},
     493        -  pages = {75--90},
     494        -  numpages = {16},
     495        -  publisher = {Springer-Verlag},
     496        -  address = {Berlin, Heidelberg},
     497        - }
     498        - 
     499  486     @inproceedings{GRID2006,
     500  487      author = {Lu, Wei and Chiu, Kenneth and Pan, Yinfei},
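
The notes added to icxml-main.tex in this changeset mention calculations based on Amdahl's law and list reported cost fractions for individual XML processing stages. As a rough illustration only, the sketch below shows how such a bound could be computed. The function names and the 4x stage acceleration factor are assumptions made for this example, not part of the changeset. The fractions are taken from the notes, where the tokenization figure is marked as tentative ("30\%?") and the figures come from different cited sources, so each stage is treated independently rather than summed.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the runtime is accelerated by `factor`.

    Amdahl's law: speedup = 1 / ((1 - fraction) + fraction / factor).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def amdahl_limit(fraction):
    """Upper bound on overall speedup as the stage speedup grows without limit."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if fraction == 1.0:
        return float("inf")
    return 1.0 / (1.0 - fraction)


# Cost fractions as listed in the changeset notes (illustrative inputs only;
# "more than 15%" is represented by its lower bound).
REPORTED_FRACTIONS = {
    "tokenization": 0.30,
    "symbol table lookup": 0.15,
    "transcoding": 0.50,
}

if __name__ == "__main__":
    stage_factor = 4.0  # hypothetical acceleration of a single stage
    for stage, fraction in REPORTED_FRACTIONS.items():
        overall = amdahl_speedup(fraction, stage_factor)
        limit = amdahl_limit(fraction)
        print(f"{stage}: {overall:.2f}x overall at {stage_factor:.0f}x, "
              f"at most {limit:.2f}x")
```

Under these assumptions, a stage that accounts for half of the runtime, such as the transcoding figure in the notes, can never yield more than a 2x overall improvement, no matter how much that stage alone is accelerated.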