Changeset 1103 for docs


Timestamp: Apr 9, 2011, 1:16:28 AM
Author: ksherdy
Message: Edit Methodology

Location:
docs/PACT2011
Files:
3 edited

  • docs/PACT2011/04-methodology.tex

    r1085 r1103  
@@ -2,56 +2,54 @@
 
 
-In this section, we describe our methodology for the measurements and
-investigation of XML parsing energy consumption and performance.  In
-brief, for each of the XML parsers under study we propose to measure
+In this section we describe our methodology for the measurements and
+investigation of XML parser energy consumption and performance.  In
+brief, for each of the four XML parsers under study we propose to measure
 and evaluate the energy consumption required to carry out XML
 well-formedness checking, under a variety of workloads, and as
-executed on three different Intel cores.
+executed on three different Intel processors.
 
-To begin our study, we propose to first investigate each of the XML
-parsers in terms of the PMCs hardware events as listed in the
-following subsection. Based on the recommendation of previous
-proposals \cite{bellosa2001, bertran2010, bircher2007}, we have chosen
+To begin our study we propose to first investigate each of the XML
+parsers in terms of the Performance Monitoring Counter \footnote{Performance Monitoring Counters
+ are special-purpose registers available with most modern
+ microprocessors. PMCs store the running count of specific hardware
+ events, such as retired instructions, cache misses, branch
+ mispredictions, and arithmetic-logic unit operations.
+ PMCs can be used to capture information about any program at
+ run-time and under any workload at a fine granularity.} (PMC) hardware events listed in
+the PMC Hardware Events subsection. Based on the findings of previous
+work \cite{bellosa2001, bertran2010, bircher2007} we have chosen
 several key hardware performance events for which the authors indicate
-have a strong correlation to energy consumption.  We also measure
-other runtime counts such as the number of SIMD instructions and
+a strong correlation with energy consumption. In addition, we measure
+the runtime counts of SIMD instructions and
 bitwise operations using the Intel Pin binary instrumentation
-framework. From these data, we hope to gain insight into the XML
-parser execution characteristics and compare and constrast different
-industrial parsers.
+framework. Based on these data we gain further insight into XML
+parser execution characteristics and compare and constrast each of the Parabix parser versions
+against the performance of standard industry parsers.
 
 The foundational work by Bellosa in \cite{bellosa2001} as well as more
-recent work in \cite {bircher2007, bertran2010} show that
-hardware-usage patterns has a significant impact in the energy
-consumption of a particular application; \cite{bellosa2001,
-  bircher2007, bertran2010} further show that there is a strong
-correlation between specific performance events and energy usage---but
-the authors of each differ slightly in opinion as to which performance
-monitoring counters\footnote{Performance monitoring counters (PMCs)
-  are special-purpose registers that are included in most modern
-  microprocessors; they store the running count of specific hardware
-  events, such as retired instructions, cache misses, branch
-  mispredictions, and arithmetic-logic unit operations to name a few.
-  They can be used to capture information about any program at
-  run-time, under any workload, at a very fine granularity.} (PMCs) to
-use.
-
+recent work in \cite {bircher2007, bertran2010} demonstrate that
+hardware-usage patterns have a significant impact on the energy
+consumption characteristics of an application \cite{bellosa2001,
+  bircher2007, bertran2010}. Further, the authors demonstrate a strong
+correlation between specific PMC events and energy usage. However, each
+author differs slightly in their opinion of the exact set of PMCs to use.
 
 The following subsections describe the XML parsers under study, XML
 workloads, the hardware architectures, PMC hardware events selected
-for measurement, and the energy measurement set up. We analyze the
-performance of the different parsers based on the hardware performance
-counter measurements and contrast their energy consumption
-measurements based on direct measurement.
+for measurement, and the energy measurement instrumentation set up. We analyze the
+performance of each of the XML parsers under study based on PMC hardware event counts and contrast their energy consumption
+measurements based on direct measurements.
 
 
 \subsection{Parsers}\label{parsers}
 
-The XML parsing technologies selected for this study are the Parabix2,
-Xerces-C++, and Expat XML parsers.  Parabix2 \cite{parabix2} (parallel
-bit streams for XML) is the second generation Parabix parser. Parabix2
-is an open-source XML parser that leverages the SIMD capabilities of
-modern commodity processors; it employs the new parallelization
-techniques using parallel parsing with bit stream addition to deliver
+The XML parsing technologies selected for this study are the Parabix1, Parabix2,
+Xerces-C++, and Expat XML parsers. Parabix1 (parallel bit Streams for XML) is our first generation SIMD and Parallel Bit Stream technology based XML parser \cite{Parabix1}.
+Parabix1 leverages the processor built-in {\em bitscan} operation for high-performance XML character scanning as well as the
+SIMD capabilities of modern commodity processors to achieve high performance.
+Parabix2 \cite{parabix2} represents the second generation of the Parabix1 parser. Parabix2
+is an open-source XML parser that also leverages Parallel Bit Stream technology and the SIMD capabilities of
+modern commodity processors. However, Parabix2 differs from Parabix1 in that it employs new parallelization
+techniques, such as a multiple cursor approach to parallel parsing together with bit stream addition techniques to advance multiple cursors independently and in parallel. Parabix2 delivers
 dramatic performance improvements over traditional byte-at-a-time
 parsing technology.  Xerces-C++ version 3.1.1 (SAX) \cite{xerces} is a
     
@@ -79,23 +77,22 @@
 Markup density is defined
 as the ratio of the total markup contained within an XML file to the
-total XML document size.  This metric may have substantial influence
-on the performance of XML parsing
+total XML document size.  This metric has substantial influence
+on the performance of traditional recursive descent XML parser implementations
 We use a mixture of document-oriented and data-oriented XML
-files in our study to provide workloads with a spectrum of
+files in our study to provide workloads with a full spectrum of
 markup densities.
 
 Table \ref{XMLDocChars} shows the document characteristics of the XML
 input files selected for this performance study.  The jawiki.xml and
-dewiki.xml XML files represent document-oriented XML inputs,
-containing three-byte and four-byte UTF8 sequence.  The remaining
-files are data-oriented inputs and consist of only ASCII
-characters.
+dewiki.xml XML files represent document-oriented XML inputs
+and contain the three-byte and four-byte UTF-8 sequence required for the UTF-8 encoding of Japanese and German characters respectively.  The remaining
+data files are data-oriented XML documents and consist entirely of single byte $7$-bit encoded ASCII characters.
 
 
 \subsection{Platform Hardware}
 \paragraph{Intel \CO{}}
-The Intel \CO{} is a Conroe based processor produced by
+Intel \CO{} code name Conroe processor produced by
 Intel. Table \ref{core2info} gives the hardware description of the
-Intel \CO{} machine selected.
+Intel \CO{} machine.
 \begin{table}[h]
 \begin{center}
     
@@ -116,8 +113,8 @@
 
 \paragraph {Intel \CITHREE{}}
-The Intel \CITHREE\ is a Nehalem based processor produced by Intel. The
-intent of this processor is to serve as an example low end server
+Intel \CITHREE\ code name Nehalem processor produced by Intel. The
+intent of the selection of this processor is to serve as an example of a low end server
 processor. Table \ref{i3info} gives the hardware description of the
-Intel \CITHREE\ machine selected.
+Intel \CITHREE\ machine.
 
 \begin{table}[h]
     
@@ -141,7 +138,7 @@
 
 \paragraph{Intel \CIFIVE{}}
-The Intel \CIFIVE\ is a \SB\ based processor produced by
+Intel \CIFIVE\ code name \SB\ processor produced by
 Intel. Table \ref{sandybridgeinfo} gives the hardware description of the
-Intel \CITHREE\ machine selected.
+Intel \CITHREE\ machine.
 
 \begin{table}[h]
     
@@ -171,5 +168,5 @@
 relate to the branch predictor and branch target buffer capacity.
 
-The set of PMC events used include the following.
+The set of PMC events used included in this study are as follows.
 \begin{itemize}
 \item Processor Cycles
     
@@ -182,5 +179,5 @@
 
 \subsection{Energy Measurement}
-  To measure energy we use a Fluke i410 current
+  To measure energy we use the Fluke i410 current
 clamp applied on the 12V wires that supply power to the processor
 sockets. The clamp detects the magnetic field created by the flowing
  • docs/PACT2011/reference.bib

    r1018 r1103  
@@ -21,4 +21,10 @@
 title = "{Xerces C++ Parser}",
 howpublished = "{http://xerces.apache.org/xerces-c/}"
+}
+
+@misc{parabix1,
+author = {Robert D. Cameron et al},
+title = {Parabix1},
+howpublished = {http://parabix.costar.sfu.ca/}
 }
 