# Changeset 1421

Timestamp: Aug 31, 2011, 7:47:40 PM
Location: docs/HPCA2012
Files: 7 edited

• ## docs/HPCA2012/00-abstract.tex

 r1393
      In this paper, we enable text processing applications to effectively
    - use commodity processors. We introduce Parabix (Parallel Bitstream)
    + use commodity processors. We introduce Parabix (Parallel Bit Stream)
      technology, a software toolchain and execution framework that allows applications to exploit modern SIMD instructions for high performance text processing. Parabix enables the application developer to write constructs assuming unlimited SIMD data parallelism and Parabix's
    - bitstream translator generates code based on machine specifics (e.g.,
    + bit stream translator generates code based on machine specifics (e.g.,
      SIMD register widths).  The key insight into efficient text processing in Parabix is the data organization. Parabix transposes the sequence
• ## docs/HPCA2012/01-intro.tex

 r1413
      (parsers/finite state machines) that is considered to be the hardest application class to parallelize~\cite{Asanovic:EECS-2006-183}. We
    - present Parabix, a novel execution framework and software run-time
    + present Parabix, a novel execution framework and software runtime
      environment that can be used to dramatically improve the efficiency of text processing and parsing on commodity processors.  Parabix
      …
      Overall we make the following contributions:
    - 1) We outline the Parabix architecture, tool chain and run-time
    + 1) We outline the Parabix architecture, tool chain and runtime
      environment and describe how it may be used to produce efficient XML parser implementations on a variety of commodity processors.
      …
      parsing and provides insight into the inefficiency of traditional parsers.  Section~\ref{section:parabix} describes the Parabix
    - architecture, tool chain and run-time environment.
    + architecture, tool chain and runtime environment.
      Section~\ref{section:parser} describes the our design of an XML parser based on the Parabix framework.  Section~\ref{section:baseline}
• ## docs/HPCA2012/05-corei3.tex

 r1415
      \subsection{Branch Mispredictions}
      \label{section:XML-branches}
    - It is hard to handle branch mispredictions in traditional XML parsers due to: (1) the variable length nature of the syntactic elements contained within XML documents; (2) a data dependent characteristic, and (3) the extensive set of syntax constraints imposed by the XML 1.0/1.1 specifications.
    + In general, performance is limited by branch mispredictions. Unfortunately, it is difficult to reduce the branch misprediction rate of traditional XML parsers due to: (1) the variable length nature of the syntactic elements contained within XML documents; (2) a data dependent characteristic, and (3) the extensive set of syntax constraints imposed by the XML 1.0/1.1 specifications.
      % Branch mispredictions are known
      % to signficantly degrade XML parsing performance in proportion to the markup density of the source document
• ## docs/HPCA2012/06-scalability.tex

 r1418
      superscalar microprocessor. It includes a 32kB L1 data cache and a 512kB L2 shared cache.  Migration of Parabix-XML to the Android platform
    - only required developing a Parabix runtime library for ARM \NEON{}.
    + only required developing a Parabix run-time library for ARM \NEON{}.
      The majority of the runtime functionality was ported directly. However, a small subset of key SIMD instructions (e.g., bit
• ## docs/HPCA2012/07-avx.tex

 r1418
      In this section, we discuss the scalability and performance advantages of our 256-bit AVX (Advanced Vector Extensions) Parabix-XML port.  The
    - Parabix runtime libraries originally targeted the 128-bit SSE2 SIMD
    + Parabix run-time libraries originally targeted the 128-bit SSE2 SIMD
      technology, available on all modern 64-bit Intel and AMD processors. It was recently been ported to AVX, which is commercially
      …
      to take advantage of the 3-operand form of AVX instructions while retaining a uniform 128-bit SIMD processing width.  The second
    - involved rewriting the Parabix runtime library to
    + involved rewriting the Parabix run-time library to
      leverage the 256-bit AVX instructions wherever possible and to simulate the remaining operations using pairs of 128-bit operations. Figure
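
Reviewer's sketch (not part of the changeset): the 07-avx.tex text above describes simulating some 256-bit operations "using pairs of 128-bit operations". The following is a minimal, portable C model of that idea, assuming a 64-bit field-wise add as the operation being simulated. The type and function names (`simd128_t`, `simd256_t`, `simd128_add64`, `simd256_add64`, `simd256_add64_selfcheck`) are hypothetical and are not taken from the Parabix run-time library; real code would use SIMD intrinsics rather than plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit SIMD register as two 64-bit fields. */
typedef struct { uint64_t f[2]; } simd128_t;

/* Hypothetical 256-bit value stored as a pair of 128-bit halves. */
typedef struct { simd128_t lo, hi; } simd256_t;

/* 128-bit operation: add each 64-bit field independently.
   Carries do not propagate from one field to the next. */
simd128_t simd128_add64(simd128_t a, simd128_t b) {
    simd128_t r;
    r.f[0] = a.f[0] + b.f[0];
    r.f[1] = a.f[1] + b.f[1];
    return r;
}

/* 256-bit operation simulated with a pair of 128-bit operations,
   one per half. This works because the fields are independent. */
simd256_t simd256_add64(simd256_t a, simd256_t b) {
    simd256_t r;
    r.lo = simd128_add64(a.lo, b.lo);
    r.hi = simd128_add64(a.hi, b.hi);
    return r;
}

/* Small self-check: each of the four fields is summed on its own,
   and an overflowing field wraps without touching its neighbour. */
void simd256_add64_selfcheck(void) {
    simd256_t a = { {{1, UINT64_MAX}}, {{10, 20}} };
    simd256_t b = { {{2, 1}},          {{5, 7}}   };
    simd256_t r = simd256_add64(a, b);
    assert(r.lo.f[0] == 3 && r.lo.f[1] == 0);
    assert(r.hi.f[0] == 15 && r.hi.f[1] == 27);
}
```

The split is only this simple for operations whose fields never interact across the 128-bit boundary; operations that move bits between halves (for example, full-width shifts) would need extra steps that this sketch does not cover.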
• ## docs/HPCA2012/11-conclusions.tex

 r1411
      operations in some cases have higher overheads compared to the existing 128 bit SSE operations. We also compare Intel's SIMD
    - extensions against the ARM Neon. Note that Parabix allowed us to
    + extensions against the ARM \NEON{}. Note that Parabix allowed us to
      perform these studies without having to change the application source. Finally, we parallelized the Parabix XML parser to take advantage of fine-grain parallelism we exploit; parallelized Parabix achieves a further 2$\times$ improvement in performance.
• ## docs/HPCA2012/main.tex

 r1411
      \usepackage{wrapfig}
      \usepackage{amssymb}    % for \varnothing (empty set) symbol
      \usepackage{ulem}
      \def\lb{\linebreak[1]}
      \def\CITHREE{Core-i3}