\section{Methodology}

In this section, we describe our methodology for measuring and
investigating XML parsing energy consumption and performance.  In
brief, for each of the XML parsers under study, we propose to measure
and evaluate the energy consumed in carrying out XML
well-formedness checking under a variety of workloads, as
executed on three different Intel cores.

To begin our study, we propose to first investigate each of the XML
parsers in terms of the PMC hardware events listed in
Section~\ref{events}. Based on the recommendations of previous
proposals \cite{bellosa2001, bertran2010, bircher2007}, we have chosen
several key hardware performance events that the authors indicate
correlate strongly with energy consumption.  We also measure
other runtime counts, such as the number of SIMD instructions and
bitwise operations, using the Intel Pin binary instrumentation
framework. From these data, we hope to gain insight into XML
parser execution characteristics and to compare and contrast different
industrial parsers.

The foundational work by Bellosa \cite{bellosa2001}, as well as more
recent work \cite{bircher2007, bertran2010}, shows that
hardware-usage patterns have a significant impact on the energy
consumption of a particular application. These studies further show
that there is a strong
correlation between specific performance events and energy usage, but
the authors differ slightly in opinion as to which performance
monitoring counters\footnote{Performance monitoring counters (PMCs)
  are special-purpose registers that are included in most modern
  microprocessors; they store the running count of specific hardware
  events, such as retired instructions, cache misses, branch
  mispredictions, and arithmetic-logic unit operations, to name a few.
  They can be used to capture information about any program at
  run-time, under any workload, at a very fine granularity.} (PMCs) to
use.

The following subsections describe the XML parsers under study, the XML
workloads, the hardware architectures, the PMC hardware events selected
for measurement, and the energy measurement setup. We analyze the
performance of the different parsers based on hardware performance
counter measurements and contrast their energy consumption
as obtained by direct measurement.

\subsection{Parsers}\label{parsers}

The XML parsing technologies selected for this study are the Parabix2,
Xerces-C++, and Expat XML parsers.  Parabix2 \cite{parabix2} (parallel
bit streams for XML) is the second-generation Parabix parser. Parabix2
is an open-source XML parser that leverages the SIMD capabilities of
modern commodity processors; it employs new parallelization
techniques, namely parallel parsing with bit stream addition, to deliver
dramatic performance improvements over traditional byte-at-a-time
parsing technology.  Xerces-C++ version 3.1.1 (SAX) \cite{xerces} is a
validating open-source XML parser written in C++ by the Apache
project.  Expat version 2.0.1 \cite{expat} is a non-validating XML
parser library written in C.

\begin{table*}
\begin{center}
\begin{tabular}{|c||r|r|r|r|r|}
\hline
File Name               & dewiki.xml            & jawiki.xml            & roads.gml     & po.xml        & soap.xml \\ \hline
File Type               & document              & document              & data          & data          & data \\ \hline
File Size (kB)          & 66240                 & 7343                  & 11584         & 76450         & 2717 \\ \hline
Markup Item Count       & 406792                & 74882                 & 280724        & 4634110       & 18004 \\ \hline
Markup Density          & 0.07                  & 0.13                  & 0.57          & 0.76          & 0.87 \\ \hline
\end{tabular}
\end{center}
\caption{XML Document Characteristics}
\label{XMLDocChars}
\end{table*}

\subsection{Workloads}\label{workloads}

Markup density is defined
as the ratio of the total markup contained within an XML file to the
total XML document size.  This metric may have substantial influence
on the performance of XML parsing.
We use a mixture of document-oriented and data-oriented XML
files in our study to provide workloads with a spectrum of
markup densities.

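As a sketch of the markup density definition, and assuming that both
quantities are counted in bytes (the unit and the symbols $d$,
$B_{\mathrm{markup}}$, and $B_{\mathrm{total}}$ are illustrative
assumptions rather than part of the definition), the metric may be
written as
\begin{equation}
d = \frac{B_{\mathrm{markup}}}{B_{\mathrm{total}}}, \qquad 0 \le d \le 1.
\end{equation}
Under this reading, the po.xml entry in Table~\ref{XMLDocChars}, with
$d = 0.76$ and a file size of 76450~kB, would contain roughly
$0.76 \times 76450 \approx 58100$~kB of markup; the figure is only
approximate because the tabulated density is rounded.
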
Table~\ref{XMLDocChars} shows the document characteristics of the XML
input files selected for this performance study.  The jawiki.xml and
dewiki.xml files represent document-oriented XML inputs
containing three-byte and four-byte UTF-8 sequences.  The remaining
files are data-oriented inputs and consist of only ASCII
characters.

\subsection{Platform Hardware}
\paragraph{Intel \CO{}}
The Intel \CO{} is a Conroe-based processor produced by
Intel. Table~\ref{core2info} gives the hardware description of the
Intel \CO{} machine selected.
\begin{table}[h]
\begin{center}
\begin{tabular}{|c||c|}
\hline
Processor & Intel Core2 Duo processor 6400 (2.13GHz) \\ \hline
L1 Cache & 32KB I-Cache, 32KB D-Cache \\ \hline
L2 Cache & 2MB \\ \hline
Front Side Bus & 1066 MHz \\ \hline
Memory & 2GB \\ \hline
Hard disk & 80GB SCSI \\ \hline
Max TDP & 65W \\ \hline
\end{tabular}
\end{center}
\caption{\CO{}}
\label{core2info}
\end{table}

\paragraph{Intel \CITHREE{}}
The Intel \CITHREE\ is a Nehalem-based processor produced by Intel.
This processor is intended to serve as an example of a low-end server
processor. Table~\ref{i3info} gives the hardware description of the
Intel \CITHREE\ machine selected.

\begin{table}[h]
\begin{center}
\begin{tabular}{|c||c|}
\hline
Processor & Intel i3-530 (2.93GHz) \\ \hline
L1 Cache & 32KB I-Cache, 32KB D-Cache \\ \hline
L2 Cache & 256KB \\ \hline
L3 Cache & 4MB \\ \hline
Front Side Bus & 1333 MHz \\ \hline
Memory & 4GB \\ \hline
Hard disk & 1TB SCSI \\ \hline
Max TDP & 73W \\ \hline
\end{tabular}
\end{center}
\caption{\CITHREE{}}
\label{i3info}
\end{table}

\paragraph{Intel \CIFIVE{}}
The Intel \CIFIVE\ is a \SB{}-based processor produced by
Intel. Table~\ref{sandybridgeinfo} gives the hardware description of the
Intel \CIFIVE\ machine selected.

\begin{table}[h]
\begin{center}
\begin{tabular}{|c||c|}
\hline
Processor & Intel Sandy Bridge i5-2300 (2.80GHz) \\ \hline
L1 Cache & 192KB \\ \hline
L2 Cache & 4 $\times$ 256KB \\ \hline
L3 Cache & 6MB \\ \hline
Front Side Bus & 1333 MHz \\ \hline
Memory & 6GB DDR \\ \hline
Hard disk & 1TB SATA \\ \hline
Max TDP & 95W \\ \hline
\end{tabular}
\end{center}
\caption{\SB{}}
\label{sandybridgeinfo}
\end{table}

\subsection{PMC Hardware Events}\label{events}

Each of the hardware events selected relates to performance
and energy features associated with
one or more hardware units.  For example, total branch mispredictions
relate to the branch predictor and branch target buffer capacity.

The set of PMC events used includes the following.
\begin{itemize}
\item Processor Cycles
\item Branch Instructions
\item Branch Mispredictions
\item Integer Instructions
\item SIMD Instructions
\item Cache Misses
\end{itemize}

\subsection{Energy Measurement}
To measure energy, we use a Fluke i410 current
clamp applied to the 12V wires that supply power to the processor
sockets. The clamp detects the magnetic field created by the flowing
current and converts it into a voltage level (1mV per 1A of
current). The voltage levels are then monitored by an Agilent 34410A
multimeter at a rate of 100 samples per second. This
measurement captures the power delivered to the processor package, including
the cores, caches, Northbridge memory controller, and QuickPath
interconnects \cite{clamp}.
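
A minimal sketch of how these samples could be converted into energy
follows. It assumes a nominal supply voltage of $V_{s} = 12$~V on the
clamped wires and simple rectangular integration; the symbols and the
integration rule are illustrative assumptions, not a description of the
exact procedure used. With clamp sensitivity $k = 1$~mV/A, multimeter
reading $v_i$ at sample $i$, and sampling interval
$\Delta t = 1/100$~s,
\begin{equation}
I_i = \frac{v_i}{k}, \qquad
P_i = V_{s}\, I_i, \qquad
E \approx \sum_{i=1}^{N} P_i \,\Delta t .
\end{equation}
For example, a steady reading of $v_i = 2$~mV corresponds to
$I_i = 2$~A and $P_i = 24$~W, or 24~J over one second of execution.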