Timestamp:
Aug 25, 2011, 1:56:51 PM (8 years ago)
Author:
ashriram
Message:

Done evaluation

File:
1 edited

  • docs/HPCA2012/03b-research.tex

    r1372 r1380  
Figure \ref{parabix_arch} shows the overall structure of the Parabix
XML parser set up for well-formedness checking.  The input file is
processed using 11 functions organized into 7 modules.  In the first
module, the Read\_Data function loads data blocks from an input file
into data\_buffer.  The data is then transposed to eight parallel basis
bitstreams (basis\_bits) in the Transposition module.  The eight
bitstreams are used in the Classification function to generate all the
XML lexical item streams (lex) as well as in the U8\_Validation module
to validate UTF-8 characters.  The lexical item streams and the scope
streams (scope) generated in the Gen\_Scope function are supplied
to the parsing module, which consists of three functions: Parse\_CtCDPI,
Parse\_Ref and Parse\_tag.  These functions parse comments, CDATA
sections, processing instructions, references and tags.  After this,
information is gathered by the Name\_Validation and Err\_Check
functions, producing name check streams and error streams.
These are then passed to the final module for Postprocessing.  All
errors that cannot be conveniently detected with bitstreams are
checked in this last module.  The final output reports any
well-formedness error detected and its position within the input file.

Within this structure, all functions in the four shaded modules
consist entirely of parallel bitstream operations.  Of these, the
Classification function consists of XML character class definitions
that are generated using ccc, while much of U8\_Validation
similarly consists of UTF-8 byte class definitions that are also
generated by ccc.  The remainder of these functions are programmed
using our unbounded bitstream language following the logical
requirements of XML parsing.  All the functions in the four shaded
modules are then compiled to low-level C/C++ code using our Pablo
compiler.  This code is then linked with the general Transposition
code available in the Parabix run-time library, as well as the
hand-written Postprocessing code that completes the well-formedness
checking.
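
To make the Transposition and Classification steps concrete, the following is a minimal Python sketch, not the Parabix implementation. It models each unbounded bitstream as an arbitrary-precision integer in which bit $i$ corresponds to byte $i$ of the input. The names transpose, char_class and positions are hypothetical helpers introduced only for illustration; the sketch works bit by bit on a whole buffer and makes no attempt to reproduce the code emitted by ccc or the Pablo compiler.

```python
def transpose(data):
    """Split a byte sequence into eight basis bitstreams.

    Stream k holds bit k of every byte (k = 0 is the most significant bit).
    Bit i of each returned integer corresponds to byte i of the input.
    """
    basis = [0] * 8
    for pos, byte in enumerate(data):
        for k in range(8):
            if (byte >> (7 - k)) & 1:
                basis[k] |= 1 << pos
    return basis


def char_class(basis, length, ch):
    """Return a bitstream marking every position whose byte equals ch.

    The stream is the AND of each basis stream, complemented wherever the
    corresponding bit of ch is 0. This is a naive per-character definition,
    assumed here only to illustrate the idea of a character class stream.
    """
    mask = (1 << length) - 1          # limits complements to the input length
    result = mask
    for k in range(8):
        if (ch >> (7 - k)) & 1:
            result &= basis[k]
        else:
            result &= ~basis[k] & mask
    return result


def positions(stream):
    """List the input positions at which a bitstream has a 1 bit."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


data = b"<a><b/></a>"
basis_bits = transpose(data)
left_angle = char_class(basis_bits, len(data), ord("<"))
print(positions(left_angle))
```

Each character class costs a fixed number of bitwise operations over whole streams, independent of how often the character occurs, which is the property the shaded modules rely on.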