Changeset 1368 for docs/HPCA2012/03research.tex
 Timestamp:
 Aug 24, 2011, 3:55:35 PM (8 years ago)
 File:

 1 edited
Legend:
 Unmodified
 Added
 Removed

docs/HPCA2012/03research.tex
r1355 r1368 45 45 Rather than viewing it as a stream of bytes or characters and performing perbyte comparisons to 46 46 determine when to transition between parsing states, Parabix views data as a set of \emph{bit streams} and uses a mixture of 47 booleanlogic and integerbased math on those streams to effectively parse many bytes in parallel. 48 In this paper, we use the notation $\land$, $\lor$ and $\lnot$ to denote the AND, OR and NOT boolean operations. 47 booleanlogic\footnote{In this paper, we use the notation $\land$, $\lor$ and $\lnot$ to denote the AND, OR and NOT boolean operations.} and integerbased math on those streams to effectively parse many bytes in parallel. 49 48 Bit streams are the foundation of the Parabix technology. In order to understand how it is possible to 50 49 perform complex statetransitionbased text processing operations in parallel, understanding what bit streams are, … … 72 71 The significance of each bit value is dependent on the type of bit stream. 73 72 We view bit streams in one of two ways: $n$ 1bit values for booleanlogic operations and 1 $n$bit value for integerbased math. 74 For simplicity, assume that $n$ is infinitely long w.r.t. the size of the source data. In reality, each bit stream is divided into 75 a series of $w$bit blocks, where $w$ is equal to the width of the SIMD registers within the system 76 (e.g., 64bits for MMX, 128bits for SSE/NEON, and 256bits for AVX). 77 We discuss how these $\frac{n}{w}$ bit stream segments can be used as infinitelylong bit stream in Section \ref{parallel bit stream compilation}. 73 For simplicity, assume that $n$ is infinitely long (i.e., unbounded) w.r.t. the size of the source data. In reality, each bit stream is divided into 74 a series of $w$bit blocks, where $w$ is equal to the SIMD register width of the system (e.g., 64, 128, or 256). 75 We discuss how these $\lceil \frac{n}{w} \rceil$ bit blocks can be viewed as an unbounded bit stream in Section \ref{parabix tool chain}. 78 76 79 77 The first type of bit streams used in Parabix are refered to as \emph{basis bit streams}, which contain the input source data. … … 144 142 % However, a `<' is legal within an XML comment so not every `<' necessarily means that we are opening a new tag. 145 143 146 Characterclass bit streams allow us to perform up to 128 ``comparisons'' in parallel in a single operation (using Intel SSE/ARM NEON)144 Characterclass bit streams allow us to perform up to 128 ``comparisons'' in parallel with a single operation 147 145 by using a series of booleanlogic operations to merge multiple basis bit streams into 148 146 a single characterclass stream that marks the positions of key characters with a $1$. … … 219 217 Using this parallel bit stream approach, conditional branch statements used to identify key positions and/or syntax errors at each 220 218 each parsing position are mostly eliminated, which, as Section XXX shows, minimizes branch misprediction penalties. 221 Accurate parsing and parallel lexical analysis is done through processorfriendly equations that requires nospeculation nor multithreading.219 Accurate parsing and parallel lexical analysis is done through processorfriendly equations and requires neither speculation nor multithreading. 222 220 223 221 224 222 \subsection{Parabix Tool Chain} 223 \label{parabix tool chain} 225 224 226 225 To support the Parabix framework, we are developing tool technology to automate various aspects … … 269 268 previous subsection as well as if and while control structures. 270 269 Pablo translates these operations to blockatatime 271 code in C/C++ , where the block size is the register272 width for SIMD operations on the selected target architecture.270 code in C/C++. 271 %, where the block size is the register width for SIMD operations on the selected target architecture. 273 272 The key functionality of Pablo is to arrange for blocktoblock 274 273 carry bit propagation to implement the long bitstream shift … … 286 285 an abstract SIMD machine based on the Inductive Doubling 287 286 Instruction Set Architecture (IDISA) \cite{CameronLin2009}. 288 Originally developed for 128bit Altivec operations on Power PC289 as well as 64bit MMX and 128bit SSE operations on Intel ,290 we have recently extended our library support to include 287 These operations were originally developed for 128bit Altivec operations on Power PC 288 as well as 64bit MMX and 128bit SSE operations on Intel 289 but have recently extended to support 291 290 the new 256bit AVX operations on Intel as well as the 128bit 292 291 NEON operations on the ARM architecture. Further details
Note: See TracChangeset
for help on using the changeset viewer.