# Changeset 1411 for docs

Timestamp:
Aug 31, 2011, 6:14:20 PM (8 years ago)
Message:

spell checked evaluation

Location:
docs/HPCA2012
Files:
11 edited

• ## docs/HPCA2012/01-intro.tex

 r1407 \footnote{The actual energy consumption of the XML ASIC chips is not published by the companies.} Overall we make the following contributions in this paper. % Overall we make the following contributions: 1) We outline the Parabix architecture, tool chain and run-time While studied in the context of XML parsing, the Parabix framework can be widely applied to many problems in text processing and parsing. parsing.  We have realeased Parabix completely open source and are interested in exploring the applications that can take advantage of our tool chain(\textit{http://anonymous}). 2) We compare Parabix XML parsers against conventional parsers and 2) We compare the Parabix XML parser against conventional parsers and assess the improvement in overall performance and energy efficiency on each platform.  We are the first to compare and contrast SSE/AVX extensions across multiple generation of Intel processors and show that there are performance challenges when using newer generation SIMD extensions. We compare the ARM Neon extensions against the x86 SIMD extensions and comment on the latency of SIMD operations across these architectures. variety of hardware platforms.  We are the first to compare and contrast SSE/AVX extensions across multiple generation of Intel processors and show that there are performance challenges when using newer generation SIMD extensions. We compare the ARM Neon extensions against the x86 SIMD extensions and comment on the latency of SIMD operations. 3) Finally, building on the SIMD parallelism of Parabix technology, Section~\ref{section:background} presents background material on XML parsing and provides insight into the inefficiency of traditional parsers on mainstream processors.  Section~\ref{section:parabix} describes the Parabix architecture, tool chain and run-time environment.  
Section~\ref{section:parser} describes the application of the Parabix framework to the construction of an XML parser enforcing all the well-formedness rules of the XML specification. Section~\ref{section:baseline} presents a detailed performance analysis of Parabix on a \CITHREE\ system using hardware performance counters and compares it against conventional parsers. parsers.  Section~\ref{section:parabix} describes the Parabix architecture, tool chain and run-time environment. Section~\ref{section:parser} describes the our design of an XML parser based on the Parabix framework.  Section~\ref{section:baseline} presents a detailed performance analysis of Parabix on a \CITHREE\ system using hardware performance counters. Section~\ref{section:scalability} compares the performance and energy efficiency of 128 bit SIMD extensions across three generations of
• ## docs/HPCA2012/02-background.tex

 r1393 \cite{xerces}, uses a series of nested switch statements and state-dependent flag tests to control the parsing logic of the program.  Our analysis, which we detail in Section \ref{section:XML-branches}, found that Xerces requires between 6 - 13 branches per byte of XML to support this form of control flow, depending on the fraction of markup in the overall document.  Cache program. Xerces's complex data dependent control flow requires between 6 --- 13 branches per byte of XML input, depending on the markup in the file (details in Section~\ref{section:XML-branches}).  Cache utilization is also significantly reduced due to the manner in which markup and content must be scanned and buffered for future use.  For
• ## docs/HPCA2012/05-corei3.tex

 r1407 \label{section:XML-branches} In general, performance is limited by branch mispredictions. Unfortunetly, it is difficult to reduce the branch misprediction rate of Unfortunately, it is difficult to reduce the branch misprediction rate of traditional XML parsers due to: (1) the variable length nature of the syntactic elements contained within XML documents;
• ## docs/HPCA2012/06-scalability.tex

 r1409 \section{Evaluation of Parabix accross different Hardware} \section{Evaluation of Parabix across different Hardware} \label{section:scalability} \subsection{Performance}
• ## docs/HPCA2012/07-avx.tex

 r1410 the version that only takes advantage of the AVX 3-operand mode is labeled ``128-bit avx,'' and the version uses the 256-bit operations wherever possible is labelled ``256-bit avx.''  The operations wherever possible is labeled ``256-bit avx.''  The instruction counts are divided into three classes: ``non-SIMD'' operations are the general purpose instructions.  The ``bitwise SIMD''
• ## docs/HPCA2012/08-arm.tex

 r1339 Migration of Parabix2 to the Android platform began with the retargetting of a subset of the Parabix2 IDISA SIMD library for ARM re-targeting of a subset of the Parabix2 IDISA SIMD library for ARM NEON.  This library code was cross-compiled for Android using the Android NDK. The Android NDK is a companion tool to the Android SDK
• ## docs/HPCA2012/10-related.tex

 r1407 of numerous multi-threaded and hardware-based approaches: Multithreaded XML techniques include preparsing the XML file to locate key partitioning points \cite{ZhangPanChiu09} and speculative p-DFAs \cite{ZhangPanChiu09}. Hardware methods include custom XML chips \cite{Leventhal2009} and FPGA-based implementations \cite{DaiNiZhu2010}.  Recently Cameron et al.~\cite{CameronHerdyLin2008, cameron-EuroPar2011} accelerated XML parsing using SSE instructions. Finally, other have explored the design of custom hardware for bit parallel operations in network key partitioning points~\cite{ParaDOM2009,LiWangLiuLi2009} and speculative p-DFAs~\cite{ZhangPanChiu09}. Hardware methods include custom XML chips \cite{Leventhal2009} and FPGA-based implementations \cite{DaiNiZhu2010}.  Intel's SSE4 instructions targeted XML parsers, but these have not seen widespread use because of portability concerns and the programming challenges that accompany low level instructions~\cite{sse4}. Recently, Cameron et al.~\cite{CameronHerdyLin2008, cameron-EuroPar2011} designed an accelerated XML parser using widely available SSE2 instructions. Finally, others have explored the design of custom hardware for bit parallel operations for text search in network processors~\cite{tan-sherwood-isca-2005}. % To accelerate XML parsingmost of the recent work has % focused on parallelization through the use of multicore parallelism % for chip multiprocessors \cite{ZhangPanChiu09, },
• ## docs/HPCA2012/11-conclusions.tex

 r1379 reduction in branches, 7$\times$---15$\times$ reduction in branch mispredictions, % ?\times\$ reduction in LLC misses, and increase in data parallelism processing upto 128 characters with a single operation. We used the processing up to 128 characters with a single operation. We used the Parabix framework and XML parsers to study the features of the new 256 bit AVX extension in Intel processors. We find that while the move to
• ## docs/HPCA2012/main.tex

 r1398 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % ACM title header format \title{\vspace{-30pt} Boosting the Efficiency of Text Processing on Commodity Processors: The Parabix Story \title{\vspace{-30pt} Parabix : Boosting the Efficiency of Text Processing on \\ Commodity Processors % % \thanks{% % tighten spacing: \let\oldthebibliography\thebibliography \def\thebibliography#1{\oldthebibliography{#1}\parsep-5pt\itemsep0pt} % \vspace{-\baselineskip} \def\thebibliography#1{\oldthebibliography{#1}\parsep5pt\itemsep0pt} { \setstretch{1} \footnotesize % \scriptsize \bibliographystyle{abbrv} \bibliography{reference}
• ## docs/HPCA2012/reference.bib

 r1405 year = {Aug 2009} } @misc{sse4, author= {Zhai Lei}, title = {XML Parsing Accelerator with Intel Streaming SIMD Extensions 4}, howpublished = "{http://software.intel.com/en-us/articles/xml-parsing-accelerator-with-intel-streaming-simd-extensions-4-intel-sse4/}"}, year = {2008} }