% Source: docs/HPCA2012/04-methodology.tex @ 1768

\section{Evaluation Framework}
\label{section:methodology}

\paragraph{XML Parsers:}\label{parsers}
We evaluate the Parabix XML parser described above
against two widely available open-source parsers: Xerces-C~\cite{xerces} and Expat~\cite{expat}.
Each parser is evaluated on the task of implementing the
parsing and well-formedness validation requirements of the full
XML 1.0 specification~\cite{TR:XML}.
Xerces-C version 3.1.1 (SAX) is a validating XML
parser written in C++ and is available as part of the Apache project.
Expat version 2.0.1 is a stream-oriented, non-validating XML parser library written in C.
To ensure a fair comparison, we restricted our analysis of Xerces-C to its WFXML scanner,
which excludes the cost of validation beyond well-formedness checking, and used the SAX interface to avoid
the memory cost of DOM tree construction.

\paragraph{XML Workloads:}\label{workloads}
XML is used for a variety of purposes, ranging from databases to configuration
files in mobile phones.
A key predictor of the overall parsing performance of an XML file is its \textit{markup density}, i.e., the ratio of markup size to total XML document size.
This metric has a substantial influence on the performance of
traditional recursive descent XML parsers. We use a
mixture of document-oriented and data-oriented XML files
to analyze performance over a spectrum of markup densities.
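
As a rough illustration of this metric, markup density can be written as a simple ratio. The symbols $B_{markup}$ (the number of bytes occupied by markup) and $B_{total}$ (the total document size in bytes) are introduced here only for illustration, and measuring both quantities in bytes is an assumption:
\[
\mathit{density} = \frac{B_{markup}}{B_{total}}, \qquad 0 \le \mathit{density} \le 1 .
\]
Under this reading, a hypothetical 1000-byte document containing 250 bytes of markup would have a markup density of $250/1000 = 0.25$.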

Table~\ref{XMLDocChars} shows the characteristics of the XML
input files selected for this performance study. The jawiki.xml and
dewiki.xml files represent document-oriented XML inputs and
contain the three-byte and four-byte UTF-8 sequences required for the
UTF-8 encoding of Japanese and German characters, respectively. The
remaining files are data-oriented XML documents and consist
entirely of single-byte encoded ASCII characters.

\begin{table*}[htbp]
\begin{center}
{
\footnotesize
\begin{tabular}{|l||l|l|l|l|l|}
\hline
File Name         & dew.xml  & jaw.xml  & roads.gml & po.xml  & soap.xml \\ \hline
File Type         & document & document & data      & data    & data     \\ \hline
File Size (kB)    & 66240    & 7343     & 11584     & 76450   & 2717     \\ \hline
Markup Item Count & 406792   & 74882    & 280724    & 4634110 & 18004    \\ \hline
Markup Density    & 0.07     & 0.13     & 0.57      & 0.76    & 0.87     \\ \hline
\end{tabular}
}
\end{center}
\caption{XML Document Characteristics}
\label{XMLDocChars}
\end{table*}

\paragraph{Platform Hardware:}
SSE extensions have been available on commodity Intel processors for
over a decade, since the Pentium III. They have steadily evolved with
improvements in instruction latency, cache interface, register
resources, and the addition of domain-specific instructions. Here we
investigate SIMD extensions across three different generations of
Intel processors (hardware details in Table~\ref{hwinfo}). We compare
the energy and performance profiles of Parabix across these platforms.
We also analyze the implementation specifics of SIMD extensions under
various microarchitectures and the newer AVX extensions supported by
Sandy Bridge.

We investigated the execution profile of each XML parser
using the performance counters found in the processor.
We chose several key hardware events that provide insight into the profile of each
application and indicate whether the processor is doing useful
work~\cite{bellosa2001, bertran2010}.
The events included in our study are branch instructions, branch mispredictions,
integer instructions, SIMD instructions, and cache misses. In
addition, we characterize the SIMD operations and study their type and
class using the Intel Pin binary instrumentation
framework.

\begin{table*}[htbp]
\begin{center}
\footnotesize
\begin{tabular}{|l||l|l|l|}
\hline
Processor  & Core2 Duo (2.13GHz) & i3-530 (2.93GHz) & Sandy Bridge (2.80GHz) \\ \hline
L1 D Cache & 32KB                & 32KB             & 32KB                   \\ \hline
L2 Cache   & Shared 2MB          & 256KB/core       & 256KB/core             \\ \hline
L3 Cache   & ---                 & 4MB              & 6MB                    \\ \hline
Bus or QPI & 1066MHz Bus         & 1333MHz QPI      & 1333MHz QPI            \\ \hline
Memory     & 2GB                 & 4GB              & 6GB                    \\ \hline
Max TDP    & 65W                 & 73W              & 95W                    \\ \hline
\end{tabular}
\caption{Platform Hardware Specs}
\label{hwinfo}
\end{center}
\vspace{-20pt}
\end{table*}

\paragraph{Energy Measurement:}
A key benefit of the Parabix parser is its more efficient use of the
processor pipeline, which is reflected in its overall energy usage. We
measure the energy consumption of the processor directly using a
current clamp. We apply the Fluke i410 current clamp~\cite{clamp} to the 12V wires
that supply power to the processor sockets. The clamp detects the
magnetic field created by the flowing current and converts it into
voltage levels (1mV per 1A of current). The voltage levels are then
monitored by an Agilent 34410A digital multimeter at a rate
of 100 samples per second. This measurement captures the instantaneous
power delivered to the processor package, including the cores, caches, northbridge
memory controller, and QuickPath interconnects. We obtain samples
throughout the entire execution of the program and then calculate the
total energy as $12V \times \sum^{N_{samples}}_{i=1} Sample_i$.
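
To make the units in this expression explicit, the calculation can be sketched as follows. This is an interpretation under stated assumptions, not a formula taken from the measurement setup itself: $Sample_i$ is taken to be the $i$-th reading converted to amperes (the multimeter voltage divided by 1\,mV/A), and the samples are assumed to be evenly spaced at the stated rate of 100 samples per second, so that the sampling interval $\Delta t$ (a symbol introduced here for illustration) is $0.01$\,s:
\[
E \approx 12\,\mathrm{V} \times \sum_{i=1}^{N_{samples}} Sample_i \, \Delta t .
\]
For example, under these assumptions a steady reading of 5\,mV corresponds to 5\,A, or 60\,W at 12\,V. Over one second (100 samples) the sum evaluates to $12 \times (5 \times 100) \times 0.01 = 60$\,J.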