source: docs/HPCA2012/03b-research.tex @ 1333

Last change on this file since 1333 was 1333, checked in by lindanl, 8 years ago

Parabix parser structure

File size: 3.0 KB
Line 
1\section{Parabix}
2
3\subsection{Parabix Structure}
4\begin{figure}
5\begin{center}
6\includegraphics[width=0.5\textwidth]{plots/parabix_arch.pdf}
7\end{center}
8\caption{Parabix2 Structure}
9\label{parabix_arch}
10\end{figure}
11
12
13Figure \ref{parabix_arch} shows the overall structure of the Parabix XML parser set up for
14well-formedness checking.
15The input file is processed using 11 functions organized into 7 modules. 
16In the first module, the Read\_Data function loads data blocks from an input file to data\_buffer.
17The data is then transposed to eight parallel basis bitstreams (basis\_bits) in the Transposition module.
18The eight bitstreams are used in the Classification function to generate all the XML lexical item streams (lex)
19as well as in the U8\_Validation module to validate UTF-8 characters.
20The lexical item streams and scope streams (scope) that are generated in Gen\_Scope function
21are supplied to the parsing module, which consists three functions, Parse\_CtCDPI, Parse\_Ref and Parse\_tag.
22These functions deal with the parsing of
23comments, CDATA sections, processing instructions, references and tags.   After this,
24information is gathered by Name\_Validation and Err\_Check functions, producing
25name check streams and error streams.  These are then passed to the final module for Postprocessing.
26All the possible errors that cannot be conveniently detected by bitstreams are checked in this last module.
27The final output reports any well-formedness error detected and its position within the input file.
28
29\subsection{Parallel Bit Stream Compilation}
30
31
32While the description of parallel bit stream parsing in the previous section works conceptually on
33unbounded bit streams, in practice, a corresponding C implementation to process input streams into blocks
34of size equal to the SIMD register width of the target processor is required. In our work, we leverage the unbounded
35integer type of the Python programming language. Using a restricted subset of Python, we prototype and validate the
36functionality of applications, such as XML validation and UTF-8 to UTF-16 transcoding. We then compile this Python code
37into equivalent block-at-a-time C code. The key question becomes how to transfer information from one block to the next whenever
38token scans cross block boundaries.
39
40The answer lies in carry bit propagation. Since the parallel $scanto$ operation relies solely on bit-wise addition and logical operations,
41block-to-block information transfer is captured in entirety by the carry bit associated with each underlying addition operation. Logical operations
42do not require information flow across block boundaries. Properly determining, initializing and inserting carry bits into a block-by-block
43implementation is tedious and error prone. Thus we have developed compiler technology to automatically transform parallel bit stream
44Python code to block-at-a-time C implementations. Details are beyond the scope of this paper, but are described in the on-line
45source code repository at parabix.costar.sfu.ca.
46
Note: See TracBrowser for help on using the repository browser.