Changeset 1399


Timestamp:
Aug 30, 2011, 6:54:25 PM (8 years ago)
Author:
ksherdy
Message:

edits

File:
1 edited

  • docs/HPCA2012/04-methodology.tex

    r1393 r1399  
    22\label{section:methodology}
    33
    4 \paragraph{XML Parsers}\label{parsers}
     4\paragraph{XML Parsers:}\label{parsers}
     5We evaluate the Parabix XML parser described above
     6against two widely available open-source parsers: Xerces-C \cite{xerces} and Expat \cite{expat}.
     7Each of the parsers is evaluated on the task of implementing the
     8parsing and well-formedness validation requirements of the full
     9XML 1.0 specification~\cite{TR:XML}.
     10Xerces-C version 3.1.1 (SAX) is a validating XML
     11parser written in C++ and is available as part of the Apache project.
     12Expat version 2.0.1 is a stream-oriented non-validating XML parser library written in C.
     13To ensure a fair comparison, we restricted our analysis of Xerces-C to its WFXML scanner to
     14eliminate the cost of validation beyond well-formedness checking and used the SAX interface to avoid
     15the memory cost of DOM tree construction.
    516
    6 We evaluate the Parabix XML parser described above against two widely
    7 available open-source parsers, Xerces-C++, and Expat.  Each of the
    8 parsers is evaluated on the task of implementing the parsing and
    9 well-formedness checking requirements of the full XML 1.0
    10 specification\cite{TR:XML}.  Xerces-C++ version 3.1.1 (SAX)
    11 \cite{xerces} is a validating open source XML parser written in C++
    12 available as part of the the Apache project.  To ensure a fair
    13 comparison, we use the WFXML scanner of Xerces to eliminate the
    14 overheads of validation and also use the SAX interface to avoid the
    15 overheads costs of DOM tree construction.  Expat version 2.0.1
    16 \cite{expat} is a non-validating XML parser library written in C.
    17 
    18 
    19 \paragraph{XML Workloads}\label{workloads}
     17\paragraph{XML Workloads:}\label{workloads}
    2018XML is used for a variety of purposes ranging from databases to config
    21 files in mobile phones. A key feature of these XML files that affects
    22 the overall parsing performance is the \textit{Markup
    23   density}. \textit{Markup density} is defined as the ratio of the
    24 total markup contained within an XML file to the total XML document
    25 size.  This metric has substantial influence on the performance of
    26 traditional recursive descent XML parser implementations.  We use a
     19files in mobile phones.
     20A key predictor of the overall parsing performance of an XML file is
     21its \textit{markup density} (i.e., the ratio of total markup to total XML document size).
     22This metric has substantial influence on the performance of
     23traditional recursive descent XML parsers.  We use a
    2724mixture of document-oriented and data-oriented XML files in our study
    2825to  analyze workloads with a full spectrum of markup densities.
     
    5552
    5653
    57 \paragraph{Platform Hardware}
     54\paragraph{Platform Hardware:}
    5855SSE extensions have been available on commodity Intel processors for
    5956over a decade since the Pentium III. They have steadily evolved with
     
    6865Sandybridge.
    6966
    70 We propose to investigate each the execution profiles of XML parsers
    71 using the the Performance Monitoring Counter (PMC) hardware event
    72 found in the processor. We have chosen several key hardware
    73 performance events which provide insight into the profile of our
    74 application and indicate if the processor is doing useful
    75 work~\cite{bellosa2001, bertran2010}.  The set of performance counters
    76 included in our study are Branch instructions, Branch mispredictions,
     67We investigated the execution profile of each XML parser
     68using the Performance Monitoring Counters (PMCs) found in the processor.
     69We chose several key hardware events that provide insight into the profile of each
     70application and indicate whether the processor is doing useful
     71work~\cite{bellosa2001, bertran2010}.
     72The events included in our study are: Branch instructions, Branch mispredictions,
    7773Integer instructions, SIMD instructions, and Cache misses. In
    7874addition, we characterize the SIMD operations and study the type and
    7975class of SIMD operations using the Intel Pin binary instrumentation
    80 framework.
    81 
    82 
    83 
    84 
     76framework.
    8577
    8678\begin{table*}[h]
     
    10496
    10597
    106 \paragraph{Energy Measurement}
    107 
     98\paragraph{Energy Measurement:}
    10899A key benefit of the Parabix parser is its more efficient use of the
    109100processor pipeline, which is reflected in the overall energy usage.  We
     
    118109memory controller, and the quick-path interconnects. We obtain samples
    119110throughout the entire execution of the program and then calculate overall
    120 total energy as  $12V*\sigma^{N_{samples}}_{i=1} Sample_i$.
     111total energy as  $12V*\sum^{N_{samples}}_{i=1} Sample_i$.
    121112
    122113
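The markup density defined in the XML Workloads paragraph (total markup divided by total document size) can be approximated with a short script. This is only a rough sketch, not the measurement method used in the paper: it assumes "markup" means every byte inside `<...>` delimiters, and the function name `markup_density` is hypothetical. Comments, CDATA sections, or attribute values containing `>` would be miscounted by this simple pattern.

```python
import re

# Assumption: markup is approximated as every byte from '<' through the next '>'.
_TAG = re.compile(rb"<[^>]*>")


def markup_density(xml_bytes: bytes) -> float:
    """Return the approximate ratio of markup bytes to total bytes."""
    if not xml_bytes:
        return 0.0
    markup = sum(len(m.group(0)) for m in _TAG.finditer(xml_bytes))
    return markup / len(xml_bytes)


if __name__ == "__main__":
    # A data-oriented fragment has a high density, a text-heavy one a low density.
    print(markup_density(b"<a><b/><c/></a>"))
    print(markup_density(b"<p>A long run of character data with little markup.</p>"))
```

A document-oriented file (mostly text) should score close to 0, while a data-oriented file (mostly tags) should score close to 1.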
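The corrected energy formula, $12V*\sum^{N_{samples}}_{i=1} Sample_i$, can be illustrated with a minimal sketch. The function name `total_energy` and the sample values are hypothetical. The formula in the text has no explicit sampling interval, so the `sample_interval_s` parameter is an assumption that defaults to 1 and leaves the result identical to the formula as written.

```python
from typing import Sequence

SUPPLY_VOLTAGE_V = 12.0  # the 12 V factor from the formula in the text


def total_energy(samples: Sequence[float], sample_interval_s: float = 1.0) -> float:
    """Return 12 V * sum(samples), scaled by an optional sampling interval.

    With the default interval of 1 this is exactly the formula in the text.
    The interval parameter is an assumption, not part of the source.
    """
    return SUPPLY_VOLTAGE_V * sum(samples) * sample_interval_s


if __name__ == "__main__":
    readings = [1.5, 2.0, 2.5]  # made-up sample values for illustration
    print(total_energy(readings))
```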