% Source: docs/Working/icXML/performance.tex @ 2508
% Last change: revision 2508, checked in by ksherdy, 7 years ago (``Updated performance section.'')
\section{Performance}

We evaluate Xerces C++ 3.1.1, the ICXML Xerces C++ XML parser, and the pipelined
ICXML Xerces C++ parser on two benchmark applications. A key predictor of
the overall parsing performance
of an XML file is markup density (i.e., the ratio of markup
to total XML document size). This metric has a substantial
influence on the performance of traditional recursive descent
XML parsers. We use a mixture of document-oriented and
data-oriented XML files to analyze performance over a spectrum
of markup densities.
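
To make this metric concrete, markup density can be written as a simple ratio.
The symbols below are notation introduced here for illustration only, and we
assume that both sizes are measured in bytes:
\[
  d = \frac{B_{\mathrm{markup}}}{B_{\mathrm{total}}},
\]
where $B_{\mathrm{markup}}$ is the number of bytes of markup and
$B_{\mathrm{total}}$ is the total size of the XML document. For example, a
hypothetical 100-byte document containing 25 bytes of markup would have a
markup density of $d = 25/100 = 0.25$.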

SSE SIMD extensions have been
available on commodity Intel processors for over a decade,
since the Pentium III. They have steadily evolved with
improvements in instruction latency, cache interface, and register
resources, and with the addition of domain-specific instructions.
Here we investigate XML parser performance
on an Intel Core i7 quad-core
``Sandy Bridge'' processor (3.40 GHz, 4 physical cores/8 threads,
32+32 KB (per core) L1 cache,
256 KB (per core) L2 cache,
8 MB L3 cache).

\subsection{Xerces C++ SAXCount}

SAXCount is a simple Xerces C++ sample application that uses the event-based SAX API to count the elements and characters of a given XML file.
It parses the file and prints the number of elements it contains.

\subsubsection{Workload}

XX shows the document characteristics of the XML input
files selected for the Xerces C++ SAXCount benchmark. The jawiki.xml
and dewiki.xml XML files represent document-oriented XML
inputs and contain the three-byte and four-byte UTF-8 sequences
required for the UTF-8 encoding of Japanese and German
characters respectively. The remaining data files are data-oriented
XML documents and consist entirely of single-byte
encoded ASCII characters.

\begin{figure}
\includegraphics[width=0.5\textwidth]{plots/perf_SAX.pdf}
\caption{}
\label{perf_SAX}
\end{figure}

\subsection{GML2SVG}

The visualization of geographic information is a primary goal of on-demand web-based mapping systems \cite{lu2007advances}.
Web-based mapping systems commonly encode spatial data with GML for transmission and with SVG for display \cite{lu2007advances}.
GML is an XML grammar defined by the Open Geospatial Consortium (OGC) to encode geographical features \cite{lake2004geography}.
As an XML grammar, GML is platform neutral and is well suited to the exchange of spatial data over the Internet.
GML, however, is not a visualization format. Rather, GML relies on commercially available viewers for data visualization,
with Scalable Vector Graphics (SVG) viewers being one of the most common \cite{lu2007advances}. Large volumes of GML data are
typical in on-demand web-based mapping, and as a consequence, the visualization of GML as SVG requires
high-performance GML to SVG translation.

In this section we present a performance evaluation of the translation of a wide spectrum of Geography Markup Language (GML)
data files to the Scalable Vector Graphics (SVG) format for visualization.

\subsubsection{Benchmark Data Characteristics}
In the GML to SVG benchmark, GML feature element and GML geometry element tags are matched. GML coordinate data are then extracted
and transformed to SVG path data encodings. Equivalent SVG path elements are generated and output to the destination
SVG document. GML to SVG translations are executed on GML source data modelling the city of Vancouver, British Columbia, Canada.
This data set consists of 46 distinct GML feature layers ranging in size from approximately 9 KB to 12 MB.
In this performance study, approximately 213.4 MB of source GML data generates approximately 91.9 MB of destination SVG data.
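
As a rough illustration of the coordinate transformation described above, consider a GML line string.
The element choice and coordinate values here are assumptions made for illustration and are not taken from the Vancouver data set:
\begin{verbatim}
<gml:LineString>
  <gml:coordinates>10,20 30,40 50,60</gml:coordinates>
</gml:LineString>
\end{verbatim}
A translation of the kind described could emit an SVG path element of the form
\begin{verbatim}
<path d="M 10 20 L 30 40 L 50 60"/>
\end{verbatim}
where each GML coordinate tuple becomes one point of the SVG path data. A real translation may also need to
rescale or flip coordinates to fit the SVG coordinate system; that step is omitted from this sketch.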