Timestamp:
Oct 20, 2012, 12:30:32 PM
Author:
nmedfort
Message:

edits

File:
1 edited

  • docs/Working/icXML/background-parabix.tex

    r2516 r2522
       4    4   The Parabix (parallel bit stream) framework is a transformative approach to XML parsing
       5    5   (and other forms of text processing.) The key idea is to exploit the availability of wide
       6      - (e.g., 128-bit) SIMD registers in commodity processors to represent data from long blocks
            6 + SIMD registers (e.g., 128-bit) in commodity processors to represent data from long blocks
       7    7   of input data by using one register bit per single input byte.
       8    8   To facilitate this, the input data is first transposed into a set of basis bit streams.
     
      18   18   Similarly, a character is numeric
      19   19   {\tt [0-9]} if and only if $\lnot(b_0 \lor b_1) \land (b_2 \land b_3) \land \lnot(b_4 \land (b_5 \lor b_6))$.
      20      - % An important observation here is that a range of characters can sometimes
      21      - % take fewer operations and require fewer basis bit streams to compute
      22      - % than individual characters. Finding optimal solutions to all
      23      - % character-classes is non-trivial and goes beyond the scope of this
      24      - % paper.
           20 + An important observation here is that ranges of characters may
           21 + require fewer operations than individual characters and multiple
           22 + classes can sometimes share the classification cost.
      25   23
      26   24   \begin{figure}[tbh]
     
     113  111   well as combining SIMD methods with 4-stage pipeline parallelism to further improve
     114  112   throughput \cite{HPCA2012}.
     115      - Although these research prototypes handle the full syntax of
     116      - DTD-less XML documents, they lacked the functionality required by full XML parsers.
          113 + Although these research prototypes handled the full syntax of
          114 + {\bf grammarless} XML documents, they lacked the functionality required by full XML parsers.
     117  115   Namely, commercial XML processors, such as Xerces,
     118  116   as support for transcoding of multiple character sets,
     119      - the ability to parse and validate against DTDs, both internal and external,
          117 + the ability to parse and validate against DTD grammars, both internal and external,
     120  118   facilities for handling different XML vocabularies through namespace
     121  119   processing, as well validation against XML Schema grammars.
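
The {\tt [0-9]} test quoted in the hunk at lines 18-19 can be checked with a small sketch. The Python below is illustrative only and is not part of this changeset or of the Parabix code base. It models each bit stream as an arbitrary-precision integer rather than a SIMD register, it assumes $b_0$ is the most significant bit of each input byte, and the names `transpose`, `digit_stream`, and `marked_positions` are hypothetical.

```python
def transpose(data):
    """Build 8 basis bit streams from the input bytes.

    Assumption: basis[0] holds the most significant bit of every byte and
    basis[7] the least significant. Bit i of each stream describes byte i.
    """
    basis = [0] * 8
    for pos, byte in enumerate(data):
        for k in range(8):
            if (byte >> (7 - k)) & 1:
                basis[k] |= 1 << pos
    return basis


def digit_stream(data):
    """Evaluate not(b0 | b1) & (b2 & b3) & not(b4 & (b5 | b6)) for all bytes at once."""
    b = transpose(data)
    mask = (1 << len(data)) - 1  # keeps negation inside the input length

    def bit_not(x):
        return ~x & mask

    return bit_not(b[0] | b[1]) & (b[2] & b[3]) & bit_not(b[4] & (b[5] | b[6]))


def marked_positions(stream, length):
    """List the byte positions whose bit is set in the stream."""
    return [i for i in range(length) if (stream >> i) & 1]


if __name__ == "__main__":
    text = b"a1b22:"
    print(marked_positions(digit_stream(text), len(text)))
```

For the input `a1b22:` the sketch marks positions 1, 3, and 4. The colon (0x3A) shares its upper four bits with the digits and is rejected only by the final term, because its $b_4$ and $b_6$ bits are both set.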