Ignore:
Timestamp:
Aug 24, 2011, 3:55:35 PM (8 years ago)
Author:
ksherdy
Message:

large changes to section 2; minor edits to section 3

File:
1 edited

Legend:

Unmodified
Added
Removed
  • docs/HPCA2012/03-research.tex

    r1355 r1368  
    4545Rather than viewing it as a stream of bytes or characters and performing per-byte comparisons to
    4646determine when to transition between parsing states, Parabix views data as a set of \emph{bit streams} and uses a mixture of
    47 boolean-logic and integer-based math on those streams to effectively parse many bytes in parallel.
    48 In this paper, we use the notation $\land$, $\lor$ and $\lnot$ to denote the AND, OR and NOT boolean operations.
     47boolean-logic\footnote{In this paper, we use the notation $\land$, $\lor$ and $\lnot$ to denote the AND, OR and NOT boolean operations.} and integer-based math on those streams to effectively parse many bytes in parallel.
    4948Bit streams are the foundation of the Parabix technology. In order to understand how it is possible to
    5049perform complex state-transition-based text processing operations in parallel, understanding what bit streams are,
     
    7271The significance of each bit value is dependent on the type of bit stream.
    7372We view bit streams in one of two ways: $n$ 1-bit values for boolean-logic operations and 1 $n$-bit value for integer-based math.
    74 For simplicity, assume that $n$ is infinitely long w.r.t. the size of the source data. In reality, each bit stream is divided into
    75 a series of $w$-bit blocks, where $w$ is equal to the width of the SIMD registers within the system
    76 (e.g., 64-bits for MMX, 128-bits for SSE/NEON, and 256-bits for AVX).
    77 We discuss how these $\frac{n}{w}$ bit stream segments can be used as infinitely-long bit stream in Section \ref{parallel bit stream compilation}.
     73For simplicity, assume that $n$ is infinitely long (i.e., unbounded) w.r.t. the size of the source data. In reality, each bit stream is divided into
     74a series of $w$-bit blocks, where $w$ is equal to the SIMD register width of the system (e.g., 64, 128, or 256).
     75We discuss how these $\lceil \frac{n}{w} \rceil$ bit blocks can be viewed as an unbounded bit stream in Section \ref{parabix tool chain}.
    7876
    7977The first type of bit streams used in Parabix are refered to as \emph{basis bit streams}, which contain the input source data.
     
    144142% However, a `<' is legal within an XML comment so not every `<' necessarily means that we are opening a new tag.
    145143
    146 Character-class bit streams allow us to perform up to 128 ``comparisons'' in parallel in a single operation (using Intel SSE/ARM NEON)
     144Character-class bit streams allow us to perform up to 128 ``comparisons'' in parallel with a single operation
    147145by using a series of boolean-logic operations to merge multiple basis bit streams into
    148146a single character-class stream that marks the positions of key characters with a $1$.
     
    219217Using this parallel bit stream approach, conditional branch statements used to identify key positions and/or syntax errors at each
    220218each parsing position are mostly eliminated, which, as Section XXX shows, minimizes branch misprediction penalties.
    221 Accurate parsing and parallel lexical analysis is done through processor-friendly equations that requires no speculation nor multithreading.
     219Accurate parsing and parallel lexical analysis is done through processor-friendly equations and requires neither speculation nor multithreading.
    222220
    223221
    224222\subsection{Parabix Tool Chain}
     223\label{parabix tool chain}
    225224
    226225To support the Parabix framework, we are developing tool technology to automate various aspects
     
    269268previous subsection as well as if and while control structures.
    270269Pablo translates these operations to block-at-a-time
    271 code in C/C++, where the block size is the register
    272 width for SIMD operations on the selected target architecture.
     270code in C/C++.
     271%, where the block size is the register width for SIMD operations on the selected target architecture.
    273272The key functionality of Pablo is to arrange for block-to-block
    274273carry bit propagation to implement the long bitstream shift
     
    286285an abstract SIMD machine based on the Inductive Doubling
    287286Instruction Set Architecture (IDISA) \cite{CameronLin2009}.
    288 Originally developed for 128-bit Altivec operations on Power PC
    289 as well as 64-bit MMX and 128-bit SSE operations on Intel,
    290 we have recently extended our library support to include
     287These operations were originally developed for 128-bit Altivec operations on Power PC
     288as well as 64-bit MMX and 128-bit SSE operations on Intel
     289but have recently extended to support
    291290the new 256-bit AVX operations on Intel as well as the 128-bit
    292291NEON operations on the ARM architecture. Further details
Note: See TracChangeset for help on using the changeset viewer.