Changeset 1339 for docs

Aug 22, 2011, 11:49:50 AM (8 years ago)

Intro updates; section cross-references

8 edited


  • docs/HPCA2012/01-intro.tex

    r1330 r1339  
    5454We study Parabix technology in application to the problem of XML parsing
    5555and develop several implementations for different computing platforms.
    5856XML is a particularly interesting application; it is a standard of the
    5957web consortium that provides a common framework for encoding and
    6664while in applications in the network switches and cell phones latency
    6765and the energy cost of parsing is of paramount
    68 importance. Software-based XML parsers are particulary inefficient and
    69 consist of giant \textit{switch-case} statements, which waste
    70 processor resources processor since they introduce input-data
    71 dependent branches. They also have poor cache efficiency since they
    72 sift forward and backward through the input-data stream trying to
    73 match the parsed tags.  XML ASIC chips have been around for over 6
     66importance.   Traditional software-based XML parsers have many
     67inefficiencies due to complex input-dependent branching structures
     68leading to considerable branch misprediction penalties as well
     69as poor use of memory bandwidth and data caches due to byte-at-a-time
     70processing and multiple buffering.  XML ASIC chips have been around for over 6
    7471years, but typically lag behind CPUs in technology due to cost
    7572constraints. Our focus is how much can we improve performance of the
    156153The remainder of this paper is organized as follows.
    157 Section~\ref{background} presents background material on XML parsing
     154Section~\ref{section:background} presents background material on XML parsing
    158155and provides insight into the inefficiency of traditional parsers on
    159 mainstream processors.  Section~\ref{parallel-bitstream} reviews
    160 parallel bit stream technology a framework to exploit sophisticated
    161 data parallel SIMD extensions on modern processors.  Section 5
    162 presents a detailed performance evaluation on a \CITHREE\ processor as
     156mainstream processors.  Section~\ref{section:parabix} describes the
     157Parabix architecture, tool chain and run-time environment.
     158Section~\ref{section:parser} describes the application of the
     159Parabix framework to the construction of an XML parser
     160enforcing all the well-formedness rules of the XML
     161specification.  Section~\ref{section:methodology} then describes
     162the overall methodology of our performance and energy study.
     163Section~\ref{section:baseline} presents a detailed
     164performance evaluation on a \CITHREE\ processor as
    163165our primary evaluation platform, addressing a number of
    164166microarchitectural issues including cache misses, branch
    165 mispredictions, and SIMD instruction counts.  Section 6 examines
     167mispredictions, and SIMD instruction counts.  Section~\ref{section:scalability} examines
    166168scalability and performance gains through three generations of Intel
    167 architecture culminating with a performance assessment on our two
    168 week-old \SB\ test machine. We looks specifically at issues in
    169 applying the new 256-bit AVX technology to parallel bit stream
    170 technology and notes that the major performance benefit seen so far
    171 results from the change to the non-destructive three-operand
    172 instruction format.
     169architecture.  Section~\ref{section:avx} examines the extension
     170of the Parabix technology to take advantage of Intel's new
     171256-bit AVX technology, while Section~\ref{section:neon} investigates
     172the applications of this technology on mobile platforms using
     173ARM processors with Neon SIMD extensions.
     174Section~\ref{section:multithread} then looks at the multithreading of the
     175Parabix XML parser using pipeline parallelism.
     176Section~\ref{section:conclusion} concludes the paper.
  • docs/HPCA2012/03b-research.tex

    r1334 r1339  
    1 \section{Parabix}
    3 \subsection{Parabix Structure}
     1\section{The Parabix XML Parser}
     3\subsection{Parser Structure}
  • docs/HPCA2012/04-methodology.tex

    r1335 r1339  
    44In this section we describe our methodology for the measurements and
  • docs/HPCA2012/06-scalability.tex

    r1335 r1339  
    34Figure \ref{Scalability} (a) demonstrates the average XML
  • docs/HPCA2012/07-avx.tex

    r1335 r1339  
    11\section{Scaling Parabix2 for AVX}
    33In this section, we discuss the scalability and performance advantages of our 256-bit AVX (Advanced Vector Extensions) Parabix2 port.
    44Parabix2 originally targeted the 128-bit SSE2 SIMD technology available on all modern 64-bit Intel and AMD processors but
  • docs/HPCA2012/08-arm.tex

    r1335 r1339  
    33\section {Parabix on Mobile Platforms}
    55The Samsung Galaxy Tab GT-P1000M device houses a Samsung S5PC110 ARM
    66\CORTEXA8{} 1 GHz single-core, dual-issue, superscalar
  • docs/HPCA2012/09-pipeline.tex

    r1335 r1339  
    11\section{Multi-threaded Parabix}
    23The general problem of addressing performance through multicore parallelism
    34is the increasing energy cost. As discussed in previous sections,
  • docs/HPCA2012/10-conclusions.tex

    r1327 r1339  
    23This paper has examined energy efficiency and performance
    34characteristics of four XML parsers considered over three