Changeset 2513 for docs/Working


Timestamp:
Oct 19, 2012, 8:27:51 PM (7 years ago)
Author:
ksherdy
Message:

Performance section updates.

File:
1 edited

  • docs/Working/icXML/performance.tex

--- docs/Working/icXML/performance.tex (r2511)
+++ docs/Working/icXML/performance.tex (r2513)
@@ -2,24 +2,22 @@
 
 We evaluate the Xerces C++ 3.1.1, ICXML Xerces C++ XML parser and pipelined
-ICXML Xerces C++ against two benchmark applications. A key predictor of
-the overall parsing performance
-of an XML file is Markup density (i.e., the ratio of markup
-vs. the total XML document size.) This metric has substantial
-influence on the performance of traditional recursive descent
-XML parsers. We use a mixture of document-oriented and
-data-oriented XML files to analyze performance over a spectrum
-of markup densities.
-
-SSE SIMD extensions have been
-available on commodity Intel processors for over a decade
-since the Pentium III. They have steadily evolved with improvements
-improvements in instruction latency, cache interface, register
-resources, and the addition of domain specific instructions.
-Here we investigate XML parser performance
+ICXML Xerces C++ against two benchmark applications. Firstly against the Xerces C++ SAXCount
+sample application and secondly against a real world
+GML to SVG format conversion application implemented against the Xerces C++
+DocumentHandler interface. Herein we investigate XML parser performance
 evaluated using an Intel Core i7 quad-core
 "Sandy Bridge" processor (3.40GHz, 4 physical cores/8 threads,
 32+32 Kb (per core) L1 cache,
 256 Kb (per core) L2 cache,
-8 MB L3 cache).
+8 MB L3 cache) and leverage the SSE2 SIMD instructions
+available on modern Intel commodity processors.
+
+We investigated the execution profiles of each XML parser
+using the performance counters found in the processor.
+We chose several key hardware events that provide insight into the profile of each
+application and indicate if the processor is doing useful work.
+The set of events included in our study are:
+processor cycles, branch instructions, branch mispredictions,
+and cache misses.
 
 \subsection{Xerces C++ SAXCount}
@@ -47,5 +45,4 @@
 \end{table}
 
-
 Table \ref{XMLDocChars} shows the document characteristics of the XML input
 files selected for the Xerces C++ SAXCount benchmark. The jaw.xml
@@ -53,4 +50,12 @@
 required for the UTF-8 encoding of Japanese characters. The remaining data files are data-oriented
 XML documents and consist entirely of single byte encoded ASCII characters.
+
+A key predictor of the overall parsing performance
+of an XML file is Markup density (i.e., the ratio of markup
+vs. the total XML document size.) This metric has substantial
+influence on the performance of traditional recursive descent
+XML parsers. We use a mixture of document-oriented and
+data-oriented XML files to analyze performance over a spectrum
+of markup densities.
 
 Figure \ref{perf_SAX} compares the performance of Xerces, \icXML{} and pipelined \icXML{} in terms of CPU cycles per byte.
@@ -70,5 +75,5 @@
 \subsection{GML2SVG}
 
-The visualization of geographic information is a primary goals of on-demand web-based mapping systems \cite{lu2007advances}.
+The visualization of geographic information is a primary goal of on-demand web-based mapping systems \cite{lu2007advances}.
 Web-based mapping systems commonly encode spatial data with GML for transmission and with SVG for display \cite{lu2007advances}.
 GML is an XML grammar defined by the Open Geospatial Consortium (OGC) to encode geographical features \cite{lake2004geography}.
@@ -80,12 +85,14 @@
 
 In this section we present a performance evaluation of the translation wide spectrum of Geography Markup Language (GML)
-data files to Scalable Vector Graphics (SVG) format for visualization.
+data files to Scalable Vector Graphics (SVG) format for visualization. In the GML to SVG benchmark, GML feature elements
+and GML geometry elements tags are matched. GML coordinate data are then extracted
+and transformed to the SVG path data encodings. Equivalent SVG path elements are generated and output to the destination
+SVG document. GML to SVG data translations are executed on GML source data modelling the city of Vancouver, British Columbia, Canada.
 
 \subsubsection{Workload}
-In the GML to SVG benchmark, GML feature elements and GML geometry elements tags are matched. GML coordinate data are then extracted
-and transformed to the SVG path data encodings. Equivalent SVG path elements are generated and output to the destination
-SVG document. GML to SVG data translations are executed on GML source data modelling the city of Vancouver, British Columbia, Canada.
-This data set consists of 46 distinct GML feature layers ranging in size from approximately 9 KB to 125.2 MB with an average document size of
-18.6 MB. Markup density ranged from approximately 0.0447 to 0.719 with an average markup density of 0.519. In this performance study,
+
+The GML source document set consists of 46 distinct GML feature layers ranging in size from approximately 9 KB to 125.2 MB
+and with an average document size of 18.6 MB. Markup density ranges from approximately 0.0447 to 0.719
+and with an average markup density of 0.519. In this performance study,
 213.4 MB of source GML data generates 91.9 MB of target SVG data.
 
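Reviewer note on markup density: the changeset moves the definition (the ratio of markup to total XML document size) into the SAXCount subsection but does not say which bytes count as markup. The sketch below is one possible reading, not the authors' measurement tool. It assumes that markup is every byte from a `<` through the next `>`, inclusive, and it does not special-case comments, CDATA sections, or entity references. The function name `markup_density` is hypothetical.

```python
def markup_density(xml_bytes: bytes) -> float:
    """Fraction of bytes lying inside '<' ... '>' delimiters (inclusive).

    Assumption: only tag-delimited bytes count as markup. Comments and
    CDATA sections that contain '>' are not handled specially.
    """
    if not xml_bytes:
        return 0.0

    markup = 0
    in_tag = False
    for b in xml_bytes:
        if b == 0x3C:        # '<' opens a markup span
            in_tag = True
        if in_tag:
            markup += 1      # count the delimiters themselves as markup
        if b == 0x3E:        # '>' closes the current markup span
            in_tag = False
    return markup / len(xml_bytes)
```

Under this reading, a document-oriented file with long text runs gives a value near 0, and a data-oriented file made mostly of tags gives a value near 1, which matches the spectrum the section describes.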
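Reviewer note on the GML to SVG step: the revised text says GML coordinate data are extracted and transformed to SVG path data encodings. A rough illustration of that transformation follows. It is not the benchmark's code, which is implemented against the Xerces C++ DocumentHandler interface. It assumes the text content of a `gml:coordinates` element with the default separators (`,` between ordinates, whitespace between tuples), emits absolute `M`/`L` path commands, and applies no coordinate transformation such as a y-axis flip. The function name `gml_coordinates_to_svg_path` is hypothetical.

```python
def gml_coordinates_to_svg_path(coords: str, close: bool = False) -> str:
    """Turn 'x1,y1 x2,y2 ...' into SVG path data 'M x1 y1 L x2 y2 ...'.

    Assumption: default GML separators and 2D use only; any third
    ordinate in a tuple is ignored.
    """
    points = []
    for tup in coords.split():           # tuples are whitespace-separated
        parts = tup.split(",")           # ordinates are comma-separated
        if len(parts) < 2:
            raise ValueError(f"malformed coordinate tuple: {tup!r}")
        x, y = parts[0], parts[1]
        float(x), float(y)               # raises ValueError if not numeric
        points.append((x, y))

    if not points:
        return ""

    commands = [f"M {points[0][0]} {points[0][1]}"]        # move to first point
    commands += [f"L {x} {y}" for x, y in points[1:]]      # line to the rest
    if close:
        commands.append("Z")                               # close polygon rings
    return " ".join(commands)
```

The resulting string would go into the `d` attribute of an SVG `path` element. Keeping the ordinates as strings avoids reformatting the source numbers.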