# Changeset 1355

Ignore:
Timestamp:
Aug 23, 2011, 5:40:29 PM (8 years ago)
Message:

Use of ccc and Pablo

Location:
docs/HPCA2012
Files:
3 edited

### Legend:

Unmodified
 r1354 Accurate parsing and parallel lexical analysis is done through processor-friendly equations that requires no speculation nor multithreading. \subsection{Parallel Bit Stream Compilation} \label {parallel bit stream compilation} While the description of parallel bit stream parsing in the previous section works conceptually on infinitely-long bit streams, in practice, the implementation of a Parabix parser must process input streams as bit stream blocks whose size is equal to the SIMD register width of the target processor. In our work, we leverage the unbounded integer type of the Python programming language to verify our equations against unbounded bit streams. Using a restricted subset of Python, we prototype and validate the functionality of our applications, such as CSV parsing, XML validation, and UTF-8 to UTF-16 transcoding, etc. To transform the Python code into the equivalent block-at-a-time C/C++ code, the key question becomes how to efficiently transfer information from one bit stream block to the next whenever the result of an operation crosses a block boundary. The answer lies in carry bit propagation. Since the parallel ScanThru operation relies solely on bit-wise addition and logical operations, block-to-block information transfer is captured in entirety by the carry bit associated with each underlying addition operation. Boolean-logical operations do not pass information over block boundaries since each bit value is independent of the state of its surrounding bits. Properly determining, initializing and inserting carry bits into a block-by-block implementation is tedious and error prone. Thus we have developed compiler technology to automatically transform parallel bit stream Python code to block-at-a-time C/C++ implementations. Details are beyond the scope of this paper, but are described in the on-line source code repository at {\tt http://parabix.costar.sfu.ca}. \subsection{Parabix Tool Chain} \begin{tabular}{r l} \ttfamily{\bfseries INPUT:} & \verbdigit = [<] \\ \ttfamily{\bfseries INPUT:} & \verbdigit = [0-9] \\ \ttfamily{\bfseries OUTPUT:} & \verbtemp1 = (basis_bits.bit_0 | basis_bits.bit_1) \\ & \verbtemp2 = (basis_bits.bit_2 & basis_bits.bit_3) \\ code in C/C++, where the block size is the register width for SIMD operations on the selected target architecture. The key functionality of Pablo is to arrange for block-to-block carry bit propagation to implement the long bitstream shift and addition operations required by {\tt Advance} and {\tt ScanThru}.
 r1339 \label{section:parser} \subsection{Parser Structure} \begin{figure} \begin{figure}[h] \begin{center} \includegraphics[width=1\textwidth]{plots/parabix_arch.pdf} The final output reports any well-formedness error detected and its position within the input file. \subsection{Parallel Bit Stream Compilation} While the description of parallel bit stream parsing in the previous section works conceptually on unbounded bit streams, in practice, a corresponding C implementation to process input streams into blocks of size equal to the SIMD register width of the target processor is required. In our work, we leverage the unbounded integer type of the Python programming language. Using a restricted subset of Python, we prototype and validate the functionality of applications, such as XML validation and UTF-8 to UTF-16 transcoding. We then compile this Python code into equivalent block-at-a-time C code. The key question becomes how to transfer information from one block to the next whenever token scans cross block boundaries. The answer lies in carry bit propagation. Since the parallel $scanto$ operation relies solely on bit-wise addition and logical operations, block-to-block information transfer is captured in entirety by the carry bit associated with each underlying addition operation. Logical operations do not require information flow across block boundaries. Properly determining, initializing and inserting carry bits into a block-by-block implementation is tedious and error prone. Thus we have developed compiler technology to automatically transform parallel bit stream Python code to block-at-a-time C implementations. Details are beyond the scope of this paper, but are described in the on-line source code repository at parabix.costar.sfu.ca. Within this structure, all functions in the four shaded modules consist entirely of parallel bit stream operations.  Of these, the Classification function consists of XML character class definitions that are generated using ccc, while much of the U8\_Validation similarly consists of UTF-8 byte class definitions that are also generated by ccc.  The remainder of these functions are programmed using our unbounded bitstream language following the logical requirements of XML parsing.   All the functions in the four shaded modules are then compiled to low-level C/C++ code using our Pablo compiler.   This code is then linked in with the general Transposition code available in the Parabix run-time library, as well as the hand-written Postprocessing code that completes the well-formed checking.