# source:docs/HPCA2011/02-background.tex@1302

Last change on this file since 1302 was 1302, checked in by lindanl, 8 years ago

Create a directory for HPCA

File size: 11.7 KB
Line
1\section{Background}
2\label{section:background}
3
4\subsection{XML}
5In 1998, the W3C officially adopted XML as a standard. The defining characteristics of XML are that it can represent virtually any type of information through the use of self-describing markup tags and can easily store semi-structured data in a descriptive fashion. XML markup encodes a description of an XML document's storage layout and logical structure. Because XML was intended to be human-readable, XML markup tags are often verbose by design \cite{TR:XML}.
6
7% <ProductName Language="Chinese">éšä»¶</ProductName> % can't represent in verbose and not really sure if the google auto-translater is correct
8
9XML files can be classified as document-oriented'' or data-oriented'' \cite{DuCharme04}. Documented-oriented XML is designed for human readability, such as shown in Figure \ref{fig:sample_xml}; data-oriented XML files are intended to be parsed by machines and omit human-friendly'' formatting techniques, such as the use of whitespace and descriptive natural language'' naming schemes.  Although the XML specification itself does not distinguish between XML for documents'' and XML for data'' \cite{TR:XML}, the latter often requires the use of an XML parser to extract the information within. The role of an XML parser is to transform the text-based XML data into application ready data.
10
11\begin{figure}[h]
12{
13\scriptsize
14\begin{verbatim}
15 <?xml version="1.0"?>
16 <Products>
17  <Product ID="0001">
18   <ProductName Language="English">Widget</ProductName>
19   <ProductName Language="French">Bitoniau</ProductName>
20   <Company>ABC</Company>
21   <Price>$19.95</Price> 22 </Product> 23 </Products> 24\end{verbatim} 25} 26\caption{Example XML Document}\label{fig:sample_xml} 27\end{figure} 28 29 30%For example, an XML parser for a web browser may take a XML file, apply a style sheet to it, and display it to the end user in an attractive yet informative way; an XML database parser may take a XML file and construct indexes and/or compress the tree into a proprietary format to provide the end user with efficient relational, hierarchical, and/or object-based query access to it. 31 32%Cameron et al.'s work in \cite{CameronHerdyLin2008} demonstrates that both XML parser selection and XML document markup density have considerable impact on the computational costs of processing XML documents. Computational costs are reported in processor cycles per XML byte; markup density is defined as the ratio of XML markup bytes to the total number of XML document bytes. 33 34\subsection{Traditional XML Parsers} 35% However, textual data tends to consist of variable-length items in generally unpredictable patterns \cite{Cameron2010}. 36Traditional XML parsers process XML sequentially a single byte-at-a-time. Following this approach, an XML parser processes a source document serially, from the first to the last byte of the source file. Each character of the source text is examined in turn to distinguish between the XML-specific markup, such as an opening angle bracket `<', and the content held within the document. The current character that the parser is processing is commonly referred to using the concept of a current cursor position. As the parser moves the cursor through the source document, the parser alternates between markup scanning, and data validation and processing operations. At each processing step, the parser scans the source document and either locates the expected markup, or reports an error condition and terminates. In other words, traditional XML parsers operate as complex finite-state machines that use byte comparisons to transition between data and metadata states. Each state transition indicates the context in which to interpret the subsequent characters. Unfortunately, textual data tends to consist of variable-length items sequenced in generally unpredictable patterns \cite{Cameron2010}; thus any character could be a state transition until deemed otherwise. 37 38Expat and Xerces-C are popular byte-a-time sequential parsers. Both are C/C++ based and open-source. Expat was originally released in 1998; it is currently used in Mozilla Firefox and provides the core functionality of many additional XML processing tools \cite{expat}. Xerces-C was released in 1999 and is the foundation of the Apache XML project \cite{xerces}. 39 40% For example, the main loop of Xerces-C well-formedness scanner contains: 41%\begin{verbatim} 42% XXXXXXXXXX XERCES CODE XXXXXXXXXX 43%\end{verbatim} 44 45A major disadvantage of the sequential byte-at-a-time approach to XML parsing is that each XML character incurs at least one conditional branch. The cumulative effect of branch mispredictions penalties are known to degrade XML parsing performance in proportion to the markup density of the source document \cite{CameronHerdyLin2008} (i.e., the proportion of XML-markup to XML-data). 46 47\subsection {Parallel XML Parsing} 48In general, parallel XML acceleration methods come in one of two forms: multithreaded approaches and SIMD-based techniques. 49Multithreaded XML parsers take advantage of multiple cores via number of strategies. 50Common strategies include preparsing the XML file to locate key partitioning points \cite{ZhangPanChiu09} and speculative p-DFAs \cite{ZhangPanChiu09}. 51SIMD XML parsers leverage the SIMD registers to overcome the performance limitations of the sequential byte-at-a-time processing model and its 52inherently data dependent branch misprediction rates. Further, data parallel SIMD instructions allow the processor to perform the same 53operation on multiple pieces of data simultaneously. The Parabix1 and Parabix2 parsers studied in this paper 54fall under the SIMD classification. The Parabix parser versions studied are described in further detail in Section \ref{section:parabix}. 55 56%\subsection {SIMD Operations} 57% Two such SIMD XML parsers, Parabix1 and Parabix2, utilizes parallel bit stream processing technology. 58% Extract section 2.2 and merge into 3. Add a new subsection 59% in section 2 saying a bit about SIMD. Say a bit about pure SIMD vertical 60% operations and then mention the pack operations that allow 61% us to implement transposition efficiently in parallel. 62% Also note that the SIMD registers support bitwise logic across 63% their full width and that this is extensively used in our work. 64% \subsection{Parallel XML Parsing} 65% 66% Parallel XML processing generally comes in one of two forms: multithreading and SIMD. Multithreaded XML parsers take advantage of parallelism by first quickly preparsing the XML file to locate the key markup entities and determine the best workload distribution in which process the XML file using$n$-cores \cite{ZhangPanChiu09}. SIMD XML parsers leverage the SIMD registers to overcome the performance limitations of the sequential paradigm and inherently data dependent branch misprediction rates \cite{Cameron2010}. Two such SIMD XML parsers, Parabix1 and Parabix2, utilizes parallel bit stream processing technology. With this method, byte-oriented character data is first transposed to eight parallel bit streams, one for each bit position within the character code units (bytes). These bit streams are then loaded into SIMD registers of width$W$(e.g., 64-bit, 128-bit, 256-bit, etc). This allows$W\$ consecutive code units to be represented and processed at once. Bitwise logic and shift operations, bit scans, population counts and other bit-based operations are then used to carry out the work in parallel \cite{CameronLin2009}.
67%
68% \subsubsection{Parabix1}
69%
70% Our first generation parallel bit stream XML parser---Parabix1---uses employs a less conventional approach of SIMD technology to represent text in parallel bit streams. Bits of each stream are in one-to-one-correspondence with the bytes of a character stream. A transposition step first transforms sequential byte stream data into eight basis bit streams for the bits of each byte. Bitwise logical combinations of these basis bit streams can then be used to classify bytes in various ways, while the bit scan operations common to commodity processors can be used for fast sequential scanning. At a high level, Parabix1 processes source XML in a functionally equivalent manner as a traditional processor. That is, Parabix1 moves sequentially through the source document, maintaining a single cursor scanning position throughout the parse. However, this scanning operation itself is accelerated significantly which leads to dramatic performance improvements, since bit scan operations can perform up to general register width (32-bit, 64-bit) finite state transitions per clock cycle. This approach has recently been applied to Unicode transcoding and XML parsing to good effect, with research prototypes showing substantial speed-ups over even the best of byte-at-a-time alternatives \cite{CameronHerdyLin2008, CameronLin2009, Cameron2010}.
71%
72% \subsubsection{Parabix2}
73%
74% In our second generation XML parser---Parabix2---we address the replacement of sequential parsing using bit scan instructions with a parallel parsing method using bitstream addition. Unlike the single cursor approach of Parabix1 and conceptually of traditional sequential approach, in Parabix2 multiple cursors positions are processed in parallel. To deal with these parallel cursors, three additional categories of bit streams are introduced. Marker bit streams are used to represent positions of interest in the parsing of a source data stream \cite{Cameron2010}. The appearance of a 1 at a position in a marker bitstream could, for example, denote the starting position an XML tag in the data stream. In general, the set of bit positions in a marker bitstream may be considered to be the current parsing positions of multiple parses taking place in parallel throughout the source data stream. A further aspect of the parallel method is that conditional branch statements used to identify syntax error at each each parsing position are eliminated. Instead, error bit streams are used to identify the position of parsing or well-formedness errors during the parsing process. Error positions are gathered and processed in as a final post processing step. Hence, Parabix2 offers additional parallelism over Parabix1 in the form of multiple cursor parsing as well as significanlty reduces branch misprediction penalty.
75
76%
77% --- Not Sure ---
78%
79% The first two parsers employ traditional byte-at-a-time methods of
80% parsing; these parsers were selected based on their popularity in the
81% marketplace and the availability of source code for deeper analysis.
82% The Fluke i410 current clamp is a digital multimeter that reads the
83% magnetic field of a live electrical cable to determine the current
84% passing through it without affecting the underlying hardware.
85
86% \textbf{No Need to talk about resource usage until  later}
87
88% In order to determine how and which performance factors influence energy consumption,
89% we intend to use the Fluke i410 current clamp in conjunction with PMCs to compare the per parser invocation and per source XML byte energy % usage of three XML parsers:
90% Expat 2.0.1, Xerces-C++ 3.1.1 (SAX2), and Parabix2. All three parsers are C/C++ based, event-driven, stream-oriented XML parsers.
91
93%
94%
95% Need gory details on byte at a time parsers. Pictures.
96% Xerces. Explain overall data flow and control flow for these parsers.
97% Briefly highlight inefficiencies.
98%
99%
100%
101% Talk about the usage of SIMD instructions and how it might help. Lead
102% onto briefly describe the key technology behind parabix.
103
104%