Changeset 966 for docs/PACT2011


Ignore:
Timestamp:
Mar 21, 2011, 8:39:19 PM (9 years ago)
Author:
ksherdy
Message:

Backgound section, initial draft.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • docs/PACT2011/02-background.tex

    r949 r966  
    11\section{Background}
    22\label{section:background}
    3 XML documents tend to be verbose---especially in the case of SOAP and WSDL. Processing these documents typically requires parsing them from a text-based format into an application-specific one.
    4 Cameron et al.'s work in \cite{CameronHerdyLin2008} shows that both parser selection and markup density have a substantial impact in the computational cost of processing XML documents.
     3This section provides a brief overview of XML and traditional XML processing technology and describes the key design and performance aspects of sucessive generations of the Parabix parallel XML processing technology.
     4
     5XML markup encodes a description of an XML documents storage layout and logical structure and is verbose by design \cite{TR:XML} --- this  is especially evident in the case of markup dense SOAP and WSDL documents. Processing XML documents requires a software module called an XML parser to read XML documents and provide access to their content and structure in an application-specific manner \cite{TR:XML}.
     6Cameron et al.'s work in \cite{CameronHerdyLin2008} demonstrates that both XML parser selection and XML document markup density have considerable impact on the computational costs of processing XML documents. Computational costs are reported in processor cycles per XML byte; markup density is defined as the ratio of XML markup bytes to the total number of XML document bytes.
     7
     8\subsection{Sequential XML Parsing}
     9Traditional XML parsers employ a sequential byte-at-a-time approach. In this approach, the XML parser serially scans the source document and processes the input in a top-down manner. As it moves through the source document a typical parser oscillates between markup scanning and data validation operations. At each processing step, a parser initiates a scan of the source document and matches expected markup bytes or reports an error and terminates. For example, in scanning for a particular XML markup delimeter, such as the opening angle bracket, each loop scanning operation scans for the expected markup bytes. This scanning loop provides one bit of information with each iteration whether the character in question is matched or not. On commodity processors, SIMD registers are commonly 128 bits wide, with the advent of the Intel's Sandy Bridge Advanced Vector Extensions (AVX) \cite{Firasta2008} 256 bit registers are now available. Yet traditional XML parsers work with only 8 bits at a time. Thus, there is often a dramatic mismatch between available processor resources and the amount of information that is computed \cite{CameronHerdyLin2008}. Moreover, XML document markup corresponds to conditional branches in XML  processing software, that is, decision points in this sequential parsing process. In the sequential approach, branch mispredictions represent performance degration in proportion to the markup density of the source document \cite{CameronHerdyLin2008}.
     10
     11\subsection{Parallel XML Parsing}
     12Using SIMD registers to process multiple bytes at a time is one possibility to take advantage of SIMD registers and
     13overcome the performance limitations of the sequential paradigm and inherently data dependent branch misprediction rates. However, textual data tends to consist of variable-length items in generally unpredictable patterns \cite{Cameron2010}. Our approach breaks this sequential processing model and is based on parallel bit stream processing technology. In this method, byte-oriented character data is first transposed to eight parallel bit streams, one for each bit position within the character code units (bytes). Loading bit stream data into SIMD registers, then, allows data from SIMD register width W (128-bit,256-bit,512-bit) consecutive code units to be represented and processed at once. Bitwise logic and shift operations, bit scans, population counts and other bit-based operations are then used to carry out the work in parallel \cite{CameronLin2009}.
     14
     15\subsubsection{Parabix1}
     16Our first generation parallel XML parser -- Parabix1 -- uses employs a less conventional approach of SIMD technology to represent text in parallel bitstreams. Bits of each stream are in one-to-one-correspondence with the bytes of a character stream. A transposition step first transforms sequential byte stream data into eight basis bitstreams for the bits of each byte. Bitwise logical combinations of these basis bitstreams can then be used to classify bytes in various ways, while the bit scan operations common to commodity processors can be used for fast sequential scanning. At a high level, Parabix1 processes source XML in a functionally equivalent manner as a traditional processor. That is, Parabix1 moves sequentially through the source documen, maintaining a single cursor scanning position throughout the parse. However, this scanning operation itself is accelerated significantly which leads to dramatic performance improvements, since bit scan operations can perform up to general register width (32-bit,64-bit) finite state transitions per clock cycle. This approach has recently been applied to Unicode transcoding and XML parsing to good effect, with research prototypes showing substantial speed-ups over even the best of byte-at-a-time alternatives \cite{Cameron2010} \cite{CameronHerdyLin2008} \cite{CameronLin2009}.
     17
     18\subsubsection{Parabix2}
     19In our second generation XML parser -- Parabix2 -- we address the replacement of sequential parsing using bit scan instructions with a parallel parsing method using bitstream addition. Unlike the single cursor approach of Parabix1 and conceptually of traditional sequential approach, in Parabix2 multiple cursors positions are processed in parallel. To deal with these parallel cursors, three additional categories of bitstreams are introduced. Marker bitstreams are used to represent positions of interest in the parsing of a source data stream \cite{Cameron2010}. The appearance of a 1 at a position in a marker bitstream could, for example, denote the starting position an XML tag in the data stream. In general, the set of bit positions in a marker bitstream may be considered to be the current parsing positions of multiple parses taking place in parallel throughout the source data stream. A further aspect of the parallel method is that conditional branch statements used to identify syntax error at each each parsing position are eliminated. Instead, error bitstreams are used to identify the position of parsing or well-formedness errors during the parsing process. Error positions are gathered and processed in as a final post processing step. Hence, Parabix2 offers additional parallelism over Parabix1 in the form of multiple cursor parsing as well as significanlty reduces branch misprediction penalty.
     20
     21%
     22% --- Not Sure ---
     23%
     24% The first two parsers employ traditional byte-at-a-time methods of
     25% parsing; these parsers were selected based on their popularity in the
     26% marketplace and the availability of source code for deeper analysis.
     27% The Fluke i410 current clamp is a digital multimeter that reads the
     28% magnetic field of a live electrical cable to determine the current
     29% passing through it without affecting the underlying hardware.
     30
     31% \textbf{No Need to talk about resource usage until  later}
     32
     33% In order to determine how and which performance factors influence energy consumption,
     34% we intend to use the Fluke i410 current clamp in conjunction with PMCs to compare the per parser invocation and per source XML byte energy % usage of three XML parsers:
     35% Expat 2.0.1, Xerces-C++ 3.1.1 (SAX2), and Parabix2. All three parsers are C/C++ based, event-driven, stream-oriented XML parsers.
    536
    637
    7 In order to determine how and which performance factors influence energy consumption,
    8 we intend to use the Fluke i410 current clamp in conjunction with PMCs to compare the per parser invocation and per source XML byte energy usage of three XML parsers:
    9 Expat 2.0.1, Xerces-C++ 3.1.1 (SAX2), and Parabix2. All three parsers are C/C++ based, event-driven, stream-oriented XML parsers.
    10 
    11 
    12 
    13 Think the person reading doesn't know much about XML parsers.
    14 
    15 
    16 Need gory details on byte at a time parsers. Pictures.
    17 Xerces. Explain overall dataflow and control flow for these parsers.
    18 Briefly highlight inefficiencies.
    19 
    20 
    21 
    22 Talk about the usage of SIMD instructions and how it might help. Lead
    23 onto briefly describe the key technology behind parabix.
    24 
    25 
    26 
    27 
    28 
    29 
    30 
    31 
    32 
    33 
    34 The first two parsers employ traditional byte-at-a-time methods of
    35 parsing; these parsers were selected based on their popularity in the
    36 marketplace and the availability of source code for deeper analysis.
    37 The Fluke i410 current clamp is a digital multimeter that reads the
    38 magnetic field of a live electrical cable to determine the current
    39 passing through it without affecting the underlying hardware.
    40 
    41 
    42 \textbf{No Need to talk about resource usage until  later}
    43 
     38%
     39% --- Section Advice ---
     40
     41% 1. Think the person reading doesn't know much about XML parsers.
     42%
     43% 2. Need gory details on byte at a time parsers. Pictures.
     44% Xerces. Explain overall dataflow and control flow for these parsers.
     45% Briefly highlight inefficiencies.
     46%
     47% 3. Talk about the usage of SIMD instructions and how it might help. Lead
     48% onto briefly describe the key technology behind parabix.
     49%
Note: See TracChangeset for help on using the changeset viewer.