# Changeset 1368 for docs/HPCA2012/03-research.tex

Ignore:
Timestamp:
Aug 24, 2011, 3:55:35 PM (8 years ago)
Message:

large changes to section 2; minor edits to section 3

File:
1 edited

### Legend:

Unmodified
 r1355 Rather than viewing it as a stream of bytes or characters and performing per-byte comparisons to determine when to transition between parsing states, Parabix views data as a set of \emph{bit streams} and uses a mixture of boolean-logic and integer-based math on those streams to effectively parse many bytes in parallel. In this paper, we use the notation $\land$, $\lor$ and $\lnot$ to denote the AND, OR and NOT boolean operations. boolean-logic\footnote{In this paper, we use the notation $\land$, $\lor$ and $\lnot$ to denote the AND, OR and NOT boolean operations.} and integer-based math on those streams to effectively parse many bytes in parallel. Bit streams are the foundation of the Parabix technology. In order to understand how it is possible to perform complex state-transition-based text processing operations in parallel, understanding what bit streams are, The significance of each bit value is dependent on the type of bit stream. We view bit streams in one of two ways: $n$ 1-bit values for boolean-logic operations and 1 $n$-bit value for integer-based math. For simplicity, assume that $n$ is infinitely long w.r.t. the size of the source data. In reality, each bit stream is divided into a series of $w$-bit blocks, where $w$ is equal to the width of the SIMD registers within the system (e.g., 64-bits for MMX, 128-bits for SSE/NEON, and 256-bits for AVX). We discuss how these $\frac{n}{w}$ bit stream segments can be used as infinitely-long bit stream in Section \ref{parallel bit stream compilation}. For simplicity, assume that $n$ is infinitely long (i.e., unbounded) w.r.t. the size of the source data. In reality, each bit stream is divided into a series of $w$-bit blocks, where $w$ is equal to the SIMD register width of the system (e.g., 64, 128, or 256). We discuss how these $\lceil \frac{n}{w} \rceil$ bit blocks can be viewed as an unbounded bit stream in Section \ref{parabix tool chain}. The first type of bit streams used in Parabix are refered to as \emph{basis bit streams}, which contain the input source data. % However, a <' is legal within an XML comment so not every <' necessarily means that we are opening a new tag. Character-class bit streams allow us to perform up to 128 comparisons'' in parallel in a single operation (using Intel SSE/ARM NEON) Character-class bit streams allow us to perform up to 128 comparisons'' in parallel with a single operation by using a series of boolean-logic operations to merge multiple basis bit streams into a single character-class stream that marks the positions of key characters with a $1$. Using this parallel bit stream approach, conditional branch statements used to identify key positions and/or syntax errors at each each parsing position are mostly eliminated, which, as Section XXX shows, minimizes branch misprediction penalties. Accurate parsing and parallel lexical analysis is done through processor-friendly equations that requires no speculation nor multithreading. Accurate parsing and parallel lexical analysis is done through processor-friendly equations and requires neither speculation nor multithreading. \subsection{Parabix Tool Chain} \label{parabix tool chain} To support the Parabix framework, we are developing tool technology to automate various aspects previous subsection as well as if and while control structures. Pablo translates these operations to block-at-a-time code in C/C++, where the block size is the register width for SIMD operations on the selected target architecture. code in C/C++. %, where the block size is the register width for SIMD operations on the selected target architecture. The key functionality of Pablo is to arrange for block-to-block carry bit propagation to implement the long bitstream shift an abstract SIMD machine based on the Inductive Doubling Instruction Set Architecture (IDISA) \cite{CameronLin2009}. Originally developed for 128-bit Altivec operations on Power PC as well as 64-bit MMX and 128-bit SSE operations on Intel, we have recently extended our library support to include These operations were originally developed for 128-bit Altivec operations on Power PC as well as 64-bit MMX and 128-bit SSE operations on Intel but have recently extended to support the new 256-bit AVX operations on Intel as well as the 128-bit NEON operations on the ARM architecture. Further details