Changeset 1355


Ignore:
Timestamp:
Aug 23, 2011, 5:40:29 PM (8 years ago)
Author:
cameron
Message:

Use of ccc and Pablo

Location:
docs/HPCA2012
Files:
3 edited

Legend:

Unmodified
Added
Removed
  • docs/HPCA2012/03-research.tex

    r1354 r1355  
    221221Accurate parsing and parallel lexical analysis is done through processor-friendly equations that requires no speculation nor multithreading.
    222222
    223 \subsection{Parallel Bit Stream Compilation}
    224 \label {parallel bit stream compilation}
    225 
    226 While the description of parallel bit stream parsing in the previous section works conceptually on
    227 infinitely-long bit streams, in practice, the implementation of a Parabix parser must process input streams as bit stream blocks
    228 whose size is equal to the SIMD register width of the target processor.
    229 In our work, we leverage the unbounded integer type of the Python programming language to verify our equations
    230 against unbounded bit streams.
    231 Using a restricted subset of Python, we prototype and validate the functionality of our applications,
    232 such as CSV parsing, XML validation, and UTF-8 to UTF-16 transcoding, etc.
    233 To transform the Python code into the equivalent block-at-a-time C/C++ code,
    234 the key question becomes how to efficiently transfer information from one bit stream block to the next whenever the result of an operation
    235 crosses a block boundary.
    236 
    237 The answer lies in carry bit propagation. Since the parallel ScanThru operation relies solely on bit-wise addition and logical operations,
    238 block-to-block information transfer is captured in entirety by the carry bit associated with each underlying addition operation.
    239 Boolean-logical operations do not pass information over block boundaries since each bit value is independent of the state of its surrounding bits.
    240 Properly determining, initializing and inserting carry bits into a block-by-block implementation is tedious and error prone.
    241 Thus we have developed compiler technology to automatically transform parallel bit stream
    242 Python code to block-at-a-time C/C++ implementations. Details are beyond the scope of this paper, but are described in the on-line
    243 source code repository at {\tt http://parabix.costar.sfu.ca}.
    244223
    245224\subsection{Parabix Tool Chain}
     
    267246
    268247\begin{tabular}{r l}
    269 \ttfamily{\bfseries INPUT:} & \verb`digit = [<]` \\
     248\ttfamily{\bfseries INPUT:} & \verb`digit = [0-9]` \\
    270249\ttfamily{\bfseries OUTPUT:} & \verb`temp1 = (basis_bits.bit_0 | basis_bits.bit_1)` \\
    271250& \verb`temp2 = (basis_bits.bit_2 & basis_bits.bit_3)` \\
     
    292271code in C/C++, where the block size is the register
    293272width for SIMD operations on the selected target architecture.
     273The key functionality of Pablo is to arrange for block-to-block
     274carry bit propagation to implement the long bitstream shift
     275and addition operations required by
     276{\tt Advance} and {\tt ScanThru}.
     277
    294278
    295279
  • docs/HPCA2012/03b-research.tex

    r1339 r1355  
    22\label{section:parser}
    33\subsection{Parser Structure}
    4 \begin{figure}
     4
     5\begin{figure}[h]
    56\begin{center}
    67\includegraphics[width=1\textwidth]{plots/parabix_arch.pdf}
     
    2728The final output reports any well-formedness error detected and its position within the input file.
    2829
    29 \subsection{Parallel Bit Stream Compilation}
    30 
    31 
    32 While the description of parallel bit stream parsing in the previous section works conceptually on
    33 unbounded bit streams, in practice, a corresponding C implementation to process input streams into blocks
    34 of size equal to the SIMD register width of the target processor is required. In our work, we leverage the unbounded
    35 integer type of the Python programming language. Using a restricted subset of Python, we prototype and validate the
    36 functionality of applications, such as XML validation and UTF-8 to UTF-16 transcoding. We then compile this Python code
    37 into equivalent block-at-a-time C code. The key question becomes how to transfer information from one block to the next whenever
    38 token scans cross block boundaries.
    39 
    40 The answer lies in carry bit propagation. Since the parallel $scanto$ operation relies solely on bit-wise addition and logical operations,
    41 block-to-block information transfer is captured in entirety by the carry bit associated with each underlying addition operation. Logical operations
    42 do not require information flow across block boundaries. Properly determining, initializing and inserting carry bits into a block-by-block
    43 implementation is tedious and error prone. Thus we have developed compiler technology to automatically transform parallel bit stream
    44 Python code to block-at-a-time C implementations. Details are beyond the scope of this paper, but are described in the on-line
    45 source code repository at parabix.costar.sfu.ca.
    46 
     30Within this structure, all functions in the four shaded modules consist entirely of parallel bit stream
     31operations.  Of these, the Classification function consists of XML character class definitions that
     32are generated using ccc, while much of the U8\_Validation similarly consists of UTF-8 byte class
     33definitions that are also generated by ccc.  The remainder of these functions are programmed using
     34our unbounded bitstream language following the logical requirements of XML parsing.   All the functions
     35in the four shaded modules are then compiled to low-level C/C++ code using our Pablo compiler.   This
     36code is then linked in with the general Transposition code available in the Parabix run-time library,
     37as well as the hand-written Postprocessing code that completes the well-formed checking.
Note: See TracChangeset for help on using the changeset viewer.