% Source: docs/HPCA2012/04-methodology.tex @ 1375
% Last change on this file since 1375 was 1375, checked in by cameron, 8 years ago

\section{Evaluation Framework}
\label{section:methodology}

\paragraph{XML Parsers}\label{parsers}

We evaluate the Parabix XML parser described above
against two widely available open-source
parsers, Xerces-C++ and Expat.
Each parser is evaluated on the task of implementing the
parsing and well-formedness checking requirements of the full
XML 1.0 specification~\cite{TR:XML}.  Xerces-C++
version 3.1.1~\cite{xerces} is a validating open-source XML
parser written in C++ and available as part of the Apache project.
To avoid the cost of validation, we use the WFXML scanner of Xerces,
and to avoid the cost of DOM tree construction, we use its SAX interface.
Expat version 2.0.1~\cite{expat} is a non-validating XML parser
library written in C.


\paragraph{XML Workloads}\label{workloads}
XML is used for a variety of purposes, ranging from databases to
configuration files in mobile phones. A key feature of these XML files
that affects overall parsing performance is the \textit{markup
  density}, defined as the ratio of the
total markup contained within an XML file to the total XML document
size.  This metric has a substantial influence on the performance of
traditional recursive-descent XML parser implementations.  We use a
mixture of document-oriented and data-oriented XML files in our study
to provide workloads that span a wide range of markup densities.

Table~\ref{XMLDocChars} shows the characteristics of the XML
input files selected for this performance study.  The jawiki.xml and
dewiki.xml files represent document-oriented XML inputs and
contain the multibyte UTF-8 sequences required to encode Japanese and
German characters, respectively.  The
remaining files are data-oriented XML documents and consist
entirely of single-byte 7-bit ASCII characters.

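As a sketch of the markup density definition used for these workloads,
let $B_{\mathit{markup}}$ denote the number of bytes of markup in a
document and $B_{\mathit{total}}$ its total size in bytes.  Both symbols,
and the choice of bytes as the unit of measure, are assumptions
introduced here for illustration.  The markup density is then
\[
  \rho = \frac{B_{\mathit{markup}}}{B_{\mathit{total}}},
  \qquad 0 \le \rho \le 1 .
\]
Under this reading, a density of $0.57$ corresponds to a document in
which roughly 57\% of the bytes are markup and the remaining 43\% are
text content.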
\begin{table*}
\begin{center}
{
\footnotesize
\begin{tabular}{|l||l|l|l|l|l|}
\hline
File Name         & dewiki.xml & jawiki.xml & roads.gml & po.xml  & soap.xml \\ \hline
File Type         & document   & document   & data      & data    & data     \\ \hline
File Size (kB)    & 66240      & 7343       & 11584     & 76450   & 2717     \\ \hline
Markup Item Count & 406792     & 74882      & 280724    & 4634110 & 18004    \\ \hline
Markup Density    & 0.07       & 0.13       & 0.57      & 0.76    & 0.87     \\ \hline
\end{tabular}
}
\end{center}
\caption{XML Document Characteristics}
\label{XMLDocChars}
\end{table*}


\paragraph{Platform Hardware}
SSE extensions have been available on commodity Intel processors for
over a decade, since the Pentium III. They have steadily evolved with
improvements in instruction latency, cache interface, and register
resources, and with the addition of domain-specific instructions. Here we
investigate SIMD extensions across three generations of
Intel processors. Table~\ref{hwinfo} describes the Intel multicore
processors we investigate. We compare the energy and performance profile of
Parabix across these platforms.  We also analyze the implementation
specifics of the SIMD extensions under each microarchitecture, and we
evaluate both the legacy SSE and the newer AVX extensions supported by
Sandy Bridge.

We investigate the execution profile of each XML parser
using the performance monitoring counter (PMC) hardware events
available in the processor. We have chosen several key hardware
performance events that provide insight into the profile of our
application and indicate whether the processor is doing useful
work~\cite{bellosa2001, bertran2010}.  The performance events
included in our study are branch instructions, branch mispredictions,
integer instructions, SIMD instructions, and cache misses. In
addition, we characterize the SIMD operations by type and
class using the Intel Pin binary instrumentation
framework.

\begin{table*}[h]
\begin{center}
\footnotesize
\begin{tabular}{|l||l|l|l|}
\hline
Processor  & Core2 Duo (2.13GHz) & i3-530 (2.93GHz) & Sandy Bridge (2.80GHz) \\ \hline
L1 D Cache & 32KB                & 32KB             & 32KB                   \\ \hline
L2 Cache   & Shared 2MB          & 256KB/core       & 256KB/core             \\ \hline
L3 Cache   & ---                 & 4MB              & 6MB                    \\ \hline
Bus or QPI & 1066MHz Bus         & 1333MHz QPI      & 1333MHz QPI            \\ \hline
Memory     & 2GB                 & 4GB              & 6GB                    \\ \hline
Max TDP    & 65W                 & 73W              & 95W                    \\ \hline
\end{tabular}
\caption{Platform Hardware Specs}
\label{hwinfo}
\end{center}
\end{table*}


\paragraph{Energy Measurement}

A key benefit of the Parabix parser is its more efficient use of the
processor pipeline, which is reflected in its overall energy usage.  We
measure the energy consumption of the processor directly using a
current clamp. We apply the Fluke i410 current clamp~\cite{clamp} to the 12V wires
that supply power to the processor sockets. The clamp detects the
magnetic field created by the flowing current and converts it into
voltage levels (1mV per 1A of current). The voltage levels are then
monitored by an Agilent 34410A digital multimeter at a rate
of 100 samples per second. This measurement captures the instantaneous
power delivered to the processor package, including the cores, caches, northbridge
memory controller, and QuickPath interconnects. We obtain samples
throughout the entire execution of the program and then calculate the
total energy as
$12\,\mathrm{V} \times \Delta t \times \sum_{i=1}^{N_{\mathit{samples}}} \mathit{Sample}_i$,
where $\mathit{Sample}_i$ is the $i$-th current reading in amperes and
$\Delta t = 10$\,ms is the sampling interval.
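
As a worked illustration of this calculation, consider a hypothetical
run that lasts $4$\,s and draws a constant current; the numbers here are
invented for illustration and are not measurements from this study.  At
100 samples per second, the multimeter records
$N_{\mathit{samples}} = 400$ readings spaced $\Delta t = 0.01$\,s apart,
and including $\Delta t$ in the product is what yields units of joules.
If every reading is $2.5$\,mV, the clamp's conversion factor of 1mV per
1A gives $\mathit{Sample}_i = 2.5$\,A, so the instantaneous power is
$12\,\mathrm{V} \times 2.5\,\mathrm{A} = 30$\,W and the total energy is
\[
  E = 12\,\mathrm{V} \times 0.01\,\mathrm{s} \times \sum_{i=1}^{400} 2.5\,\mathrm{A}
    = 12 \times 0.01 \times 1000\ \mathrm{J}
    = 120\ \mathrm{J},
\]
which agrees with $30\,\mathrm{W} \times 4\,\mathrm{s}$.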