Changeset 3041


Ignore:
Timestamp:
Apr 19, 2013, 12:39:07 AM (6 years ago)
Author:
ksherdy
Message:

Replaced verbatim with code tags, replaced math escape with appropriate tags, added unicode escapes, added personal blurbs.

Location:
docs/Balisage13/Bal2013came0601
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • docs/Balisage13/Bal2013came0601/Bal2013came0601.html

    r3040 r3041  
    1212<div lang="en" class="article">
    1313<div class="titlepage">
    14 <h2 class="article-title" id="idp270784"></h2>
     14<h2 class="article-title" id="idp26800"></h2>
    1515<div class="author">
    1616<h3 class="author">Nigel Medforth</h3>
    1717<div class="affiliation">
    18 <p class="jobtitle"></p>
    19 <p class="orgname"></p>
    20 </div>
    21 <h5 class="author-email"><code class="email">&lt;<a class="email" href="mailto:"></a>&gt;</code></h5>
     18<p class="jobtitle">Developer</p>
     19<p class="orgname">International Characters Inc.</p>
     20</div>
     21<div class="affiliation">
     22<p class="jobtitle">Graduate Student, School of Computing Science</p>
     23<p class="orgname">Simon Fraser University </p>
     24</div>
     25<h5 class="author-email"><code class="email">&lt;<a class="email" href="mailto:nmedfort@sfu.ca">nmedfort@sfu.ca</a>&gt;</code></h5>
    2226</div>
    2327<div class="author">
     
    4347<p class="orgname">Simon Fraser University</p>
    4448</div>
     49<div class="affiliation">
     50<p class="jobtitle">Chief Technology Officer</p>
     51<p class="orgname">International Characters, Inc.</p>
     52</div>
    4553<h5 class="author-email"><code class="email">&lt;<a class="email" href="mailto:cameron@cs.sfu.ca">cameron@cs.sfu.ca</a>&gt;</code></h5>
    4654</div>
     
    5563<div class="abstract">
    5664<p class="title"><b>Abstract</b></p>
    57 <p id="idp438976">Prior research on the acceleration of XML processing
     65<p id="idp28432">Prior research on the acceleration of XML processing
    5866using SIMD and multi-core parallelism has lead to
    5967a number of interesting research prototypes.  This work
     
    7684<p><b>Table of Contents</b></p>
    7785<dl>
    78 <dt><span class="section"><a href="#idp426544" class="toc">Introduction</a></span></dt>
    79 <dt><span class="section"><a href="#idp427464" class="toc">Background</a></span></dt>
     86<dt><span class="section"><a href="#idp15032" class="toc">Introduction</a></span></dt>
     87<dt><span class="section"><a href="#idp15928" class="toc">Background</a></span></dt>
    8088<dd><dl>
    81 <dt><span class="section"><a href="#idp427800" class="toc">Xerces C++ Structure</a></span></dt>
    82 <dt><span class="section"><a href="#idp432856" class="toc">The Parabix Framework</a></span></dt>
    83 <dt><span class="section"><a href="#idp584560" class="toc">Sequential vs. Parallel Paradigm</a></span></dt>
     89<dt><span class="section"><a href="#idp16248" class="toc">Xerces C++ Structure</a></span></dt>
     90<dt><span class="section"><a href="#idp22392" class="toc">The Parabix Framework</a></span></dt>
     91<dt><span class="section"><a href="#idp178296" class="toc">Sequential vs. Parallel Paradigm</a></span></dt>
    8492</dl></dd>
    85 <dt><span class="section"><a href="#idp587512" class="toc">Architecture</a></span></dt>
     93<dt><span class="section"><a href="#idp181248" class="toc">Architecture</a></span></dt>
    8694<dd><dl>
    87 <dt><span class="section"><a href="#idp587832" class="toc">Overview</a></span></dt>
    88 <dt><span class="section"><a href="#idp609600" class="toc">Character Set Adapters</a></span></dt>
    89 <dt><span class="section"><a href="#idp616344" class="toc">Combined Parallel Filtering</a></span></dt>
    90 <dt><span class="section"><a href="#idp628552" class="toc">Content Stream</a></span></dt>
    91 <dt><span class="section"><a href="#idp634584" class="toc">Namespace Handling</a></span></dt>
    92 <dt><span class="section"><a href="#idp642952" class="toc">Error Handling</a></span></dt>
     95<dt><span class="section"><a href="#idp181568" class="toc">Overview</a></span></dt>
     96<dt><span class="section"><a href="#idp195064" class="toc">Character Set Adapters</a></span></dt>
     97<dt><span class="section"><a href="#idm9304" class="toc">Combined Parallel Filtering</a></span></dt>
     98<dt><span class="section"><a href="#idp221352" class="toc">Content Stream</a></span></dt>
     99<dt><span class="section"><a href="#idp228528" class="toc">Namespace Handling</a></span></dt>
     100<dt><span class="section"><a href="#idp240160" class="toc">Error Handling</a></span></dt>
    93101</dl></dd>
    94 <dt><span class="section"><a href="#idp650360" class="toc">Multithreading with Pipeline Parallelism</a></span></dt>
    95 <dt><span class="section"><a href="#idp657176" class="toc">Performance</a></span></dt>
     102<dt><span class="section"><a href="#idp247200" class="toc">Multithreading with Pipeline Parallelism</a></span></dt>
     103<dt><span class="section"><a href="#idp257920" class="toc">Performance</a></span></dt>
    96104<dd><dl>
    97 <dt><span class="section"><a href="#idp659048" class="toc">Xerces C++ SAXCount</a></span></dt>
    98 <dt><span class="section"><a href="#idp663432" class="toc">GML2SVG</a></span></dt>
     105<dt><span class="section"><a href="#idp259800" class="toc">Xerces C++ SAXCount</a></span></dt>
     106<dt><span class="section"><a href="#idp264184" class="toc">GML2SVG</a></span></dt>
    99107</dl></dd>
    100 <dt><span class="section"><a href="#idp664008" class="toc">Conclusion and Future Work</a></span></dt>
     108<dt><span class="section"><a href="#idp264760" class="toc">Conclusion and Future Work</a></span></dt>
    101109</dl>
    102110</div>
    103 <div class="section" id="idp426544">
     111<div class="section" id="idp15032">
    104112<h2 class="title" style="clear: both">Introduction</h2>
    105 <p id="idp426888"></p>
    106 <p id="idp427016"></p>
    107 <p id="idp427144"></p>
    108 <p id="idp427272"></p>
    109 </div>
    110 <div class="section" id="idp427464">
     113<p id="idp15352"></p>
     114<p id="idp15480"></p>
     115<p id="idp15608"></p>
     116<p id="idp15736"></p>
     117</div>
     118<div class="section" id="idp15928">
    111119<h2 class="title" style="clear: both">Background</h2>
    112 <div class="section" id="idp427800">
     120<div class="section" id="idp16248">
    113121<h3 class="title" style="clear: both">Xerces C++ Structure</h3>
    114 <p id="idp428144">
     122<p id="idp16568">
    115123The Xerces C++ parser
    116124
     
    130138parsing as well as a DOM tree-based parsing interface.
    131139</p>
    132 <p id="idp429744">
     140<p id="idp18112">
    133141
    134142
     
    151159is required, involving all aspects of the parser.
    152160</p>
    153 <p id="idp431920">
    154 
    155 
    156 
    157 
    158 </p>
    159 </div>
    160 <div class="section" id="idp432856">
     161<p id="idp21496">
     162
     163
     164
     165
     166</p>
     167</div>
     168<div class="section" id="idp22392">
    161169<h3 class="title" style="clear: both">The Parabix Framework</h3>
    162 <p id="idp433176">
     170<p id="idp22712">
    163171The Parabix (parallel bit stream) framework is a transformative approach to XML parsing
    164172(and other forms of text processing.) The key idea is to exploit the availability of wide
     
    170178Boolean-logic operations\footnote{∧, \√ and ¬ denote the boolean AND, OR and NOT operators.}
    171179are used to classify the input bits into a set of <span class="ital">character-class bit streams</span>, which identify key
    172 characters (or groups of characters) with a $1$.
     180characters (or groups of characters) with a <code class="code">1</code>.
    173181For example, one of the fundamental characters in XML is a left-angle bracket.
    174 A character is a<code class="code">&lt; if and only if ¬(b<sub>0</sub> √ b<sub>1</sub>) ∧ (b<sub>2</sub> ∧
     182A character is an <code class="code">'&lt;' if and only if ¬(b<sub>0</sub> √ b<sub>1</sub>) ∧ (b<sub>2</sub> ∧
    175183b<sub>3</sub>) ∧ (b<sub>4</sub> ∧ b<sub>5</sub>) ∧ ¬ (b<sub>6</sub> √ b<sub>7</sub>) = 1</code>.
    176 Similarly, a character is numeric 
    177 <code class="code">[0-9] if and only if $¬(b<sub>0</sub> √ b<sub>1</sub>) ∧ (b<sub>2</sub> ∧ b<sub>3</sub>) ∧ ¬(b<sub>4</sub> ∧ (b<sub>5</sub> √ b<sub>6</sub>))</code>.
     184Similarly, a character is numeric,
     185<code class="code">[0-9] if and only if ¬(b<sub>0</sub> √ b<sub>1</sub>) ∧ (b<sub>2</sub> ∧ b<sub>3</sub>) ∧ ¬(b<sub>4</sub> ∧ (b<sub>5</sub> √ b<sub>6</sub>))</code>.
    178186An important observation here is that ranges of characters may
    179187require fewer operations than individual characters and
     
    181189multiple classes can share the classification cost.
    182190</p>
    183 <p id="idp577560">
    184 
    185 </p>
    186 <p id="idp579000">
     191<p id="idp170240">
     192
     193</p>
     194<p id="idp171696">
    187195Consider, for example, the XML source data stream shown in the first line of .
    188196The remaining lines of this figure show several parallel bit streams that are computed in Parabix-style
     
    195203tag openers into start tag marks and end tag marks
    196204depending on the character immediately following the
    197 opener (i.e., ``\verb:/:'') or not.  The remaining three
     205opener (i.e., <code class="code">"/"</code>) or not.  The remaining three
    198206lines show streams that can be computed in subsequent
    199207parsing (using the technique
     
    201209attribute names and attribute values of tags. 
    202210</p>
    203 <p id="idp580288">
     211<p id="idp175032">
    204212Two intuitions may help explain how the Parabix approach can lead
    205213to improved XML parsing performance. The first is that
     
    216224should provide substantial benefit.
    217225</p>
    218 <p id="idp580480">
     226<p id="idp175992">
    219227Previous studies have shown that the Parabix approach improves many aspects of XML processing,
    220228including transcoding \cite{Cameron2008}, character classification and validation,
     
    229237they lacked the functionality required by full XML parsers.
    230238</p>
    231 <p id="idp580672">
     239<p id="idp177744">
    232240Commercial XML processors support transcoding of multiple character sets and can parse and
    233241validate against multiple document vocabularies.
     
    236244</p>
    237245</div>
    238 <div class="section" id="idp584560">
     246<div class="section" id="idp178296">
    239247<h3 class="title" style="clear: both">Sequential vs. Parallel Paradigm</h3>
    240 <p id="idp584880">
     248<p id="idp178616">
    241249Xerces—like all traditional XML parsers—processes XML documents sequentially.
    242250Each character is examined to distinguish between the
     
    246254validation and content processing modes.
    247255</p>
    248 <p id="idp585952">
     256<p id="idp179688">
    249257In other words, Xerces belongs to an equivalent class applications termed FSM applications\footnote{
    250258  Herein FSM applications are considered software systems whose behaviour is defined by the inputs,
     
    253261Unfortunately, textual data tends to be unpredictable and any character could induce a state transition.
    254262</p>
    255 <p id="idp586616">
     263<p id="idp180352">
    256264Parabix-style XML parsers utilize a concept of layered processing.
    257265A block of source text is transformed into a set of lexical bitstreams,
     
    264272</div>
    265273</div>
    266 <div class="section" id="idp587512">
     274<div class="section" id="idp181248">
    267275<h2 class="title" style="clear: both">Architecture</h2>
    268 <div class="section" id="idp587832">
     276<div class="section" id="idp181568">
    269277<h3 class="title" style="clear: both">Overview</h3>
    270 <p id="idp588280">
     278<p id="idp182016">
    271279icXML is more than an optimized version of Xerces. Many components were grouped, restructured and
    272280rearchitected with pipeline parallelism in mind.
     
    293301the user-defined DTD and schema grammar(s) before passing it to the end-user.
    294302</p>
    295 <p id="idp591904">
    296 
    297 </p>
    298 <p id="idp592160">
     303<p id="idp185640">
     304
     305</p>
     306<p id="idp185896">
    299307In icXML functions are grouped into logical components.
    300308As shown in Figure \ref{fig:icxml-arch}, two major categories exist: (1) the Parabix Subsystem and (2) the Markup Processor.
     
    315323From here, two data-independent branches exist: the Symbol Resolver and Content Preparation Unit.
    316324</p>
    317 <p id="idp404616">
     325<p id="idp188856">
    318326A typical XML file contains few unique element and attribute names—but each of them will occur frequently.
    319327icXML stores these as distinct data structures, called symbols, each with their own global identifier (GID).
     
    321329the raw data to produce a sequence of GIDs, called the <span class="ital">symbol stream</span>.
    322330</p>
    323 <p id="idp406288">
     331<p id="idp190496">
    324332The final components of the Parabix Subsystem are the <span class="ital">Content Preparation Unit</span> and <span class="ital">Content Stream Generator</span>.
    325333The former takes the (transposed) basis bitstreams and selectively filters them, according to the
     
    327335filtered streams into the tagged UTF-16 <span class="ital">content stream</span>, discussed in Section \ref{section:arch:contentstream}.
    328336</p>
    329 <p id="idp407840">
     337<p id="idp191984">
    330338Combined, the symbol and content stream form icXML's compressed IR of the XML document.
    331339The <span class="ital">Markup Processor</span>~parses the IR to validate and produce the sequential output for the end user.
     
    339347However, preprocessing associated with each symbol greatly reduces the work of this stage.
    340348</p>
    341 <p id="idp609280">
    342 
    343 </p>
    344 </div>
    345 <div class="section" id="idp609600">
     349<p id="idp194728">
     350
     351</p>
     352</div>
     353<div class="section" id="idp195064">
    346354<h3 class="title" style="clear: both">Character Set Adapters</h3>
    347 <p id="idp610216">
     355<p id="idp195712">
    348356In Xerces, all input is transcoded into UTF-16 to simplify the parsing costs of Xerces itself and
    349357provide the end-consumer with a single encoding format.
     
    354362transcoding imposes at least a cost of buffer copying.
    355363</p>
    356 <p id="idp611592">
     364<p id="idp196496">
    357365In icXML, however,  the concept of Character Set Adapters (CSAs) is used to minimize transcoding costs.
    358366Given a specified input encoding, a CSA is responsible for checking that
     
    362370is performed using the parallel bitstream representation of the source input.
    363371</p>
    364 <p id="idp612312">
     372<p id="idp197216">
    365373An important observation is that many character sets are an
    366374extension to the legacy 7-bit ASCII character set.  This includes the
     
    370378serves to compute lexical item streams for all such ASCII-based character sets.
    371379</p>
    372 <p id="idp612944">
     380<p id="idp197848">
    373381A second observation is that—regardless of which character set is used—quite
    374382often all of the characters in a particular block of input will be within the ASCII range.
     
    379387UTF-16 form are each set to zero in this case.
    380388</p>
    381 <p id="idp614496">
     389<p id="idp199400">
    382390A third observation is that repeated transcoding of the names of XML
    383391elements, attributes and so on can be avoided by using a look-up mechanism.
     
    390398additional cost.
    391399</p>
    392 <p id="idp615280">
     400<p id="idp200184">
    393401The cost of individual character transcoding is avoided whenever a block of input is
    394402confined to the ASCII subset and for all but the first occurrence of any XML element or attribute name.
     
    404412</p>
    405413</div>
    406 <div class="section" id="idp616344">
     414<div class="section" id="idm9304">
    407415<h3 class="title" style="clear: both">Combined Parallel Filtering</h3>
    408 <p id="idp616664">
     416<p id="idm8952">
    409417As just mentioned, UTF-8 to UTF-16 transcoding involves marking
    410418all but the last bytes of multi-byte UTF-8 sequences as
    411419positions for deletion.   For example, the two
    412 Chinese characters \begin{CJK*}{UTF8}{gbsn}䜠奜\end{CJK*}
    413 are represented as two three-byte UTF-8 sequences \verb'E4 BD A0'
    414 and \verb'E5 A5 BD' while the UTF-16 representation must be
    415 compressed down to the two code units \verb'4F60' and \verb'597D'.
     420Chinese characters <code class="code">䜠奜</code>
     421are represented as two three-byte UTF-8 sequences <code class="code">E4 BD A0</code>
     422and <code class="code">E5 A5 BD</code> while the UTF-16 representation must be
     423compressed down to the two code units <code class="code">4F60</code> and <code class="code">597D</code>.
    416424In the bit parallel representation, this corresponds to a reduction
    417425from six bit positions representing UTF-8 code units (bytes)
     
    423431for deletion.   In this case, the portion of the mask
    424432corresponding to these input bytes is the bit sequence
    425 \verb'110110'.  Using this approach, transcoding may then be
     433<code class="code">110110</code>.  Using this approach, transcoding may then be
    426434completed by applying parallel deletion and inverse transposition of the
    427435UTF-16 bitstreams\cite{Cameron2008}.
    428436</p>
    429 <p id="idp619512">
    430 
    431 
    432 
    433 
    434 
    435 
    436 
    437 
    438 
    439 </p>
    440 <p id="idp621992">
     437<p id="idm6424">
     438
     439
     440
     441
     442
     443
     444
     445
     446
     447</p>
     448<p id="idm3968">
    441449Rather than immediately paying the
    442450costs of deletion and transposition just for transcoding,
     
    458466
    459467</p>
    460 <p id="idp622312">
     468<p id="idm1464">
    461469In essence, the deletion masks for transcoding and
    462470for line break normalization each represent a bitwise
     
    465473applied once.
    466474</p>
    467 <p id="idp624608">
     475<p id="idp217752">
    468476A further application of combined filtering
    469477is the processing of XML character and entity
     
    484492that this is not true, it is addressed in post-processing.
    485493</p>
    486 <p id="idp627584">
     494<p id="idp220384">
    487495The final step of combined filtering occurs during
    488496the process of reducing markup data to tag bytes
     
    500508</p>
    501509</div>
    502 <div class="section" id="idp628552">
     510<div class="section" id="idp221352">
    503511<h3 class="title" style="clear: both">Content Stream</h3>
    504 <p id="idp628872">
     512<p id="idp221696">
    505513A relatively-unique concept for icXML is the use of a filtered content stream.
    506514Rather that parsing an XML document in its original format, the input is transformed
     
    514522through the parallel filtering algorithm, described in section \ref{sec:parfilter}.
    515523</p>
    516 <p id="idp630400">
     524<p id="idp223408">
    517525Combined with the symbol stream, the parser traverses the content stream to effectively
    518526reconstructs the input document in its output form.
    519 The initial <span class="ital">0</span> indicates an empty content string. The following \verb|&gt;|
     527The initial <span class="ital">0</span> indicates an empty content string. The following <code class="code">&gt;</code>
    520528indicates that a start tag without any attributes is the first element in this text and
    521529the first unused symbol, <code class="code">document</code>, is the element name.
     
    529537directly jump to the end of every string without scanning for it.
    530538</p>
    531 <p id="idp632232">
    532 Following ``\verb`fee`'' is a \verb`=`, which marks the existence of an attribute.
     539<p id="idp226064">
     540Following <code class="code">'fee'</code> is a <code class="code">=</code>, which marks the existence of an attribute.
    533541Because all of the intra-element was performed in the Parabix Subsystem, this must be a legal attribute.
    534542Since attributes can only occur within start tags and must be accompanied by a textual value,
    535543the next symbol in the symbol stream must be the element name of a start tag,
    536 and the following one must be the name of the attribute and the string that follows the \verb`=` must be its value.
    537 However, the subsequent \verb`=` is not treated as an independent attribute because the parser has yet to
    538 read a \verb`&gt;`, which marks the end of a start tag. Thus only one symbol is taken from the symbol stream and
     544and the following one must be the name of the attribute and the string that follows the <code class="code">=</code> must be its value.
     545However, the subsequent <code class="code">=</code> is not treated as an independent attribute because the parser has yet to
     546read a <code class="code">&gt;</code>, which marks the end of a start tag. Thus only one symbol is taken from the symbol stream and
    539547it (along with the string value) is added to the element.
    540 Eventually the parser reaches a \verb`/`, which marks the existence of an end tag. Every end tag requires an
     548Eventually the parser reaches a <code class="code">/</code>, which marks the existence of an end tag. Every end tag requires an
    541549element name, which means they require a symbol. Inter-element validation whenever an empty tag is detected to
    542550ensure that the appropriate scope-nesting rules have been applied.
    543551</p>
    544552</div>
    545 <div class="section" id="idp634584">
     553<div class="section" id="idp228528">
    546554<h3 class="title" style="clear: both">Namespace Handling</h3>
    547 <p id="idp635192">
     555<p id="idp229176">
    548556In XML, namespaces prevents naming conflicts when multiple vocabularies are used together.
    549557It is especially important when a vocabulary application-dependant meaning, such as when
     
    551559Namespaces are bound to uniform resource identifiers (URIs), which are strings used to identify
    552560specific names or resources.
    553 On line 1 of Figure \ref{fig:namespace1}, the \verb|xmlns| attribute instructs the XML
     561On line 1 of Figure \ref{fig:namespace1}, the <code class="code">xmlns</code> attribute instructs the XML
    554562processor to bind the prefix <code class="code">p</code> to the URI '<code class="code">pub.net</code>' and the default (empty)
    555 prefix to <code class="code">book.org</code>. Thus to the XML processor, the \verb|title| on line 2 and
    556 \verb|price| on line 4 both read as \verb|"book.org":title| and \verb|"book.org":price|
    557 respectively, whereas on line 3 and 5, \verb|p:name| and \verb|price| are seen as
    558 \verb|"pub.net":name| and \verb|"pub.net":price|. Even though the actual element name
    559 \verb|price|, due to namespace scoping rules they are viewed as two uniquely-named items
     563prefix to <code class="code">book.org</code>. Thus to the XML processor, the <code class="code">title</code> on line 2 and
     564<code class="code">price</code> on line 4 both read as <code class="code">"book.org":title</code> and <code class="code">"book.org":price</code>
     565respectively, whereas on line 3 and 5, <code class="code">p:name</code> and <code class="code">price</code> are seen as
     566<code class="code">"pub.net":name</code> and <code class="code">"pub.net":price</code>. Even though the actual element name
     567<code class="code">price</code>, due to namespace scoping rules they are viewed as two uniquely-named items
    560568because the current vocabulary is determined by the namespace(s) that are in-scope.
    561569</p>
    562 <p id="idp637032">
    563 
    564 </p>
    565 <p id="idp637288">
     570<p id="idp233096">
     571
     572</p>
     573<p id="idp233368">
    566574In both Xerces and icXML, every URI has a one-to-one mapping to a URI ID.
    567575These persist for the lifetime of the application through the use of a global URI pool.
     
    573581(2) those that repeatedly modify the namespaces in predictable patterns.
    574582</p>
    575 <p id="idp637480">
     583<p id="idp233560">
    576584For that reason, icXML contains an independent namespace stack and utilizes bit vectors to cheaply perform
    577585
    578586
    579 When a prefix is declared (e.g., \verb|xmlns:p="pub.net"|), a namespace binding is created that maps
     587When a prefix is declared (e.g., <code class="code">xmlns:p="pub.net"</code>), a namespace binding is created that maps
    580588the prefix (which are assigned Prefix IDs in the symbol resolution process) to the URI.
    581589Each unique namespace binding has a unique namespace id (NSID) and every prefix contains a bit vector marking every
    582590NSID that has ever been associated with it within the document. For example, in Table \ref{tbl:namespace1}, the
    583 prefix binding set of \verb|p| and \verb|xmlns| would be \verb|01| and \verb|11| respectively.
     591prefix binding set of <code class="code">p</code> and <code class="code">xmlns</code> would be <code class="code">01</code> and <code class="code">11</code> respectively.
    584592To resolve the in-scope namespace binding for each prefix, a bit vector of the currently visible namespaces is
    585593maintained by the system. By ANDing the prefix bit vector with the currently visible namespaces, the in-scope
     
    587595A namespace binding table, similar to Table \ref{tbl:namespace1}, provides the actual URI ID.
    588596</p>
    589 <p id="idp640280">
    590 
    591 </p>
    592 <p id="idp640536">
    593 
    594 
    595 
    596 
    597 </p>
    598 <p id="idp641968">
     597<p id="idp237336">
     598
     599</p>
     600<p id="idp237608">
     601
     602
     603
     604
     605</p>
     606<p id="idp239176">
    599607To ensure that scoping rules are adhered to,
    600608whenever a start tag is encountered, any modification to the currently visible namespaces is calculated and stored
     
    605613</p>
    606614</div>
    607 <div class="section" id="idp642952">
     615<div class="section" id="idp240160">
    608616<h3 class="title" style="clear: both">Error Handling</h3>
    609 <p id="idp643296">
     617<p id="idp240504">
    610618
    611619Xerces outputs error messages in two ways: through the programmer API and as thrown objects for fatal errors.
     
    616624each with its own system for detecting and producing error messages.
    617625</p>
    618 <p id="idp644416">
     626<p id="idp241640">
    619627Within the Parabix Subsystem, all computations are performed in parallel, a block at a time.
    620628Errors are derived as artifacts of bitstream calculations, with a 1-bit marking the byte-position of an error within a block,
     
    649657column number.
    650658</p>
    651 <p id="idp648656">
     659<p id="idp245496">
    652660The Markup Processor is a state-driven machine. As such, error detection within it is very similar to Xerces.
    653661However, reporting the correct line/column is a much more difficult problem.
     
    664672</div>
    665673</div>
    666 <div class="section" id="idp650360">
     674<div class="section" id="idp247200">
    667675<h2 class="title" style="clear: both">Multithreading with Pipeline Parallelism</h2>
    668 <p id="idp650728">
     676<p id="idp247568">
    669677As discussed in section \ref{background:xerces}, Xerces can be considered a FSM application.
    670 These are ``embarrassingly sequential.''\cite{Asanovic:EECS-2006-183} and notoriously difficult to parallelize.
     678These are "embarrassingly sequential."\cite{Asanovic:EECS-2006-183} and notoriously difficult to parallelize.
    671679However, icXML is designed to organize processing into logical layers.   
    672680In particular, layers within the Parabix Subsystem are designed to operate
     
    676684of modules.
    677685</p>
    678 <p id="idp651520">
     686<p id="idp249104">
    679687The most straightforward division of work in icXML is to separate
    680688the Parabix Subsystem and the Markup Processor into distinct logical layers into two separate stages.
    681689The resultant application, <span class="ital">icXML-p</span>, is a course-grained software-pipeline application.
    682 In this case, the Parabix Subsystem thread $T_1$ reads 16k of XML input $I$ at a time and produces the
    683 content, symbol and URI streams, then stores them in a pre-allocated shared data structure $S$.
    684 The Markup Processor thread $T_2$ consumes $S$, performs well-formedness and grammar-based validation,
     690In this case, the Parabix Subsystem thread <code class="code">T<sub>1</sub></code> reads 16k of XML input <code class="code">I</code> at a time and produces the
     691content, symbol and URI streams, then stores them in a pre-allocated shared data structure <code class="code">S</code>.
     692The Markup Processor thread <code class="code">T<sub>2</sub></code> consumes <code class="code">S</code>, performs well-formedness and grammar-based validation,
    685693and the provides parsed XML data to the application through the Xerces API. 
    686694The shared data structure is implemented using a ring buffer,
     
    688696In the examples of Figure \ref{threads_timeline1} and \ref{threads_timeline2}, the ring buffer has four entries.
    689697A lock-free mechanism is applied to ensure that each entry can only be read or written by one thread at the same time.
    690 In Figure \ref{threads_timeline1} the processing time of $T_1$ is longer than $T_2$;
    691 thus $T_2$ always waits for $T_1$ to write to the shared memory.
    692 Figure \ref{threads_timeline2} illustrates the scenario in which $T_1$ is faster
    693 and must wait for $T_2$ to finish reading the shared data before it can reuse the memory space.
    694 </p>
    695 <p id="idp653392">
    696 
    697 </p>
    698 <p id="idp653664">
     698In Figure \ref{threads_timeline1} the processing time of <code class="code">T<sub>1</sub></code> is longer than <code class="code">T<sub>2</sub></code>;
     699thus <code class="code">T<sub>2</sub></code> always waits for <code class="code">T<sub>1</sub></code> to write to the shared memory.
     700Figure \ref{threads_timeline2} illustrates the scenario in which <code class="code">T<sub>1</sub></code> is faster
     701and must wait for <code class="code">T<sub>2</sub></code> to finish reading the shared data before it can reuse the memory space.
     702</p>
     703<p id="idp254152">
     704
     705</p>
     706<p id="idp254424">
    699707Overall, our design is intended to benefit a range of applications.
    700708Conceptually, we consider two design points.
     
    704712while the cost of XML parsing represents an overhead of 40%.
    705713</p>
    706 <p id="idp653856">
     714<p id="idp254616">
    707715Our design is predicated on a goal of using the Parabix
    708716framework to achieve a 50% to 100% improvement in the parsing engine itself.   
     
    712720In this case, the single-threaded icXML should achieve a 1.5x speedup over Xerces
    713721so that the total application cost reduces to 67% of the original. 
    714 However, in \icXML-p, our ideal scenario gives us two well-balanced threads
     722However, in icXML-p, our ideal scenario gives us two well-balanced threads
    715723each performing about 33% of the original work.   
    716724In this case, Amdahl's law predicts that we could expect up to a 3x speedup at best.
    717725</p>
    718 <p id="idp655592">
     726<p id="idp256344">
    719727At the other extreme of our design range, we consider an application
    720 in which core parsing cost is 40%.   Assuming the 2x speedup of
     728in which core parsing cost is 40%.  Assuming the 2x speedup of
    721729the Parabix Subsystem over the corresponding Xerces core, single-threaded
    722 icXML delivers a 25% speedup.   However, the most significant
     730icXML delivers a 25% speedup.  However, the most significant
    723731aspect of our two-stage multi-threaded design then becomes the
    724732ability to hide the entire latency of parsing within the serial time
    725 required by the application.   In this case, we achieve
     733required by the application.  In this case, we achieve
    726734an overall speedup in processing time by 1.67x.
    727735</p>
    728 <p id="idp656304">
     736<p id="idp257048">
    729737Although the structure of the Parabix Subsystem allows division of the work into
    730738several pipeline stages and has been demonstrated to be effective
     
    738746</p>
    739747</div>
    740 <div class="section" id="idp657176">
     748<div class="section" id="idp257920">
    741749<h2 class="title" style="clear: both">Performance</h2>
    742 <p id="idp657512">
    743 We evaluate \xerces{}, icXML, \icXML-p against two benchmarking applications:
     750<p id="idp258256">
     751We evaluate Xerces-C++ 3.1.1, icXML, icXML-p against two benchmarking applications:
    744752the Xerces C++ SAXCount sample application,
    745753and a real world GML to SVG transformation application.
     
    7507588 MB L3 cache) running the 64-bit version of Ubuntu 12.04 (Linux).
    751759</p>
    752 <p id="idp658168">
     760<p id="idp258920">
    753761We analyzed the execution profiles of each XML parser
    754762using the performance counters found in the processor.
     
    764772to collect per core hardware events.
    765773</p>
    766 <div class="section" id="idp659048">
     774<div class="section" id="idp259800">
    767775<h3 class="title" style="clear: both">Xerces C++ SAXCount</h3>
    768 <p id="idp659392">
     776<p id="idp260144">
    769777Xerces comes with sample applications that demonstrate salient features of the parser.
    770778SAXCount is the simplest such application:
     
    772780and prints out the totals.
    773781</p>
    774 <p id="idp659856">
    775 
    776 </p>
    777 <p id="idp660128">
     782<p id="idp260608">
     783
     784</p>
     785<p id="idp260880">
    778786Table \ref{XMLDocChars} shows the document characteristics of the XML input
    779787files selected for the Xerces C++ SAXCount benchmark. The jaw.xml
     
    782790XML documents and consist entirely of single byte encoded ASCII characters.
    783791</p>
    784 <p id="idp660320">
     792<p id="idp261072">
    785793A key predictor of the overall parsing performance of an XML file is markup density\footnote{
    786794  Markup Density: the ratio of markup bytes used to define the structure of the document vs. its file size.}.
     
    791799of markup densities.
    792800</p>
    793 <p id="idp662072">
     801<p id="idp262824">
    794802Figure \ref{perf_SAX} compares the performance of Xerces, icXML and pipelined icXML in terms of
    795803CPU cycles per byte for the SAXCount application.
    796804The speedup for icXML over Xerces is 1.3x to 1.8x.
    797 With two threads on the multicore machine, \icXML-p can achieve speedup up to 2.7x.
     805With two threads on the multicore machine, icXML-p can achieve speedup up to 2.7x.
    798806Xerces is substantially slowed by dense markup
    799807but icXML is less affected through a reduction in branches and the use of parallel-processing techniques.
    800 \icXML-p performs better as markup-density increases because the work performed by each stage is
     808icXML-p performs better as markup-density increases because the work performed by each stage is
    801809well balanced in this application.
    802810</p>
    803 <p id="idp662840">
    804 
    805 </p>
    806 </div>
    807 <div class="section" id="idp663432">
     811<p id="idp263592">
     812
     813</p>
     814</div>
     815<div class="section" id="idp264184">
    808816<h3 class="title" style="clear: both">GML2SVG</h3>
    809 <p id="idp663752"></p>
    810 </div>
    811 </div>
    812 <div class="section" id="idp664008">
     817<p id="idp264504"></p>
     818</div>
     819</div>
     820<div class="section" id="idp264760">
    813821<h2 class="title" style="clear: both">Conclusion and Future Work</h2>
    814 <p id="idp664360">
     822<p id="idp265112">
    815823This paper is the first case study documenting the significant
    816824performance benefits that may be realized through the integration
     
    828836feasibility of these techniques.
    829837</p>
    830 <p id="idp665368">
     838<p id="idp266936">
    831839The further development of icXML to move beyond 2-stage
    832840pipeline parallelism is ongoing, with realistic prospects for
     
    836844library should offer substantial benefits. 
    837845</p>
    838 <p id="idp665904">
     846<p id="idp267472">
    839847The example of XML parsing may be considered prototypical
    840848of finite-state machines applications which have sometimes
    841 been considered ``embarassingly sequential'' and so
    842 difficult to parallelize that ``nothing works.''  So the
     849been considered "embarassingly sequential" and so
     850difficult to parallelize that "nothing works."  So the
    843851case study presented here should be considered an important
    844852data point in making the case that parallelization can
    845853indeed be helpful across a broad array of application types.
    846854</p>
    847 <p id="idp666504">
     855<p id="idp268584">
    848856To overcome the software engineering challenges in applying
    849857parallel bitstream technology to existing software systems,
     
    856864</p>
    857865</div>
    858 <div class="bibliography" id="idp667464">
     866<div class="bibliography" id="idp269544">
    859867<h2 class="title" style="clear:both">Bibliography</h2>
    860868<p class="bibliomixed" id="XMLChip09">[Leventhal and Lemoine 2009] Leventhal, Michael and
  • docs/Balisage13/Bal2013came0601/Bal2013came0601.xml

    r3040 r3041  
    3030</para>
    3131      </abstract>
    32       <author>
     32                        <author>
    3333         <personname>
    3434            <firstname>Nigel</firstname>
     
    3636         </personname>
    3737         <personblurb>
    38             <para></para>
     38             <para>Nigel Medforth is a M.Sc. student at Simon Fraser University and the lead developer of icXML.
     39             He earned a Bachelor of Technology in Information Technology at Kwantlen Polytechnic University in 2009
     40             and was awarded the Dean’s Medal for Outstanding Achievement.</para>
     41             <para>Nigel is currently researching ways to leverage both the Parabix framework and stream-processing models
     42             to further accelerate XML parsing within icXML.</para>
    3943         </personblurb>
    4044         <affiliation>
    41             <jobtitle></jobtitle>
    42             <orgname></orgname>
     45            <jobtitle>Developer</jobtitle>
     46             <orgname>International Characters Inc.</orgname>
    4347         </affiliation>
    44          <email></email>
     48         <affiliation>
     49            <jobtitle>Graduate Student, School of Computing Science</jobtitle>
     50             <orgname>Simon Fraser University </orgname>
     51         </affiliation>
     52         <email>nmedfort@sfu.ca</email>
    4553      </author>
    4654      <author>
     
    6977               Simon Fraser University in 2005.
    7078                                                </para>
    71             <para> Ken is currently pursuing graduate studies in Computing Science at Simon Fraser
     79            <para> Ken is currently pursuing PhD studies in Computing Science at Simon Fraser
    7280               University with industrial scholarship support from the Natural Sciences and
    7381               Engineering Research Council of Canada, the Mathematics of Information Technology and
     
    8391         <email>ksherdy@sfu.ca</email>
    8492      </author>
    85       <author>
    86          <personname>
    87             <firstname>Rob</firstname>
    88             <surname>Cameron</surname>
    89          </personname>
    90          <personblurb>
    91             <para>Dr. Rob Cameron is Professor and Director of Computing Science at Simon Fraser
    92                University. With a broad spectrum of research interests related to programming
    93                languages, software engineering and sociotechnical design of public computing
    94                infrastructure, he has recently been focusing on high performance text processing
    95                using parallel bit stream technology and its applications to XML. He is also a
    96                patentleft evangelist, advocating university-based technology transfer models
    97                dedicated to free use in open source. </para>
    98          </personblurb>
    99          <affiliation>
    100             <jobtitle>Professor of Computing Science</jobtitle>
    101             <orgname>Simon Fraser University</orgname>
     93                 <author>
     94          <personname>
     95             <firstname>Rob</firstname>
     96             <surname>Cameron</surname>
     97          </personname>
     98          <personblurb>
     99             <para>Dr. Rob Cameron is Professor of Computing Science and Associate Dean of
     100             Applied Sciences at Simon Fraser
     101                University.   His research interests include programming language
     102                and software system technology, with a specific focus on high performance
     103                text processing using SIMD and multicore parallelism.  He is the developer
     104                of the REX XML shallow parser as well as the parallel bit stream (Parabix)
     105                framework for SIMD text processing. 
     106              </para>
     107          </personblurb>
     108          <affiliation>
     109             <jobtitle>Professor of Computing Science</jobtitle>
     110             <orgname>Simon Fraser University</orgname>
     111         </affiliation>
     112          <affiliation>
     113             <jobtitle>Chief Technology Officer</jobtitle>
     114             <orgname>International Characters, Inc.</orgname>
    102115         </affiliation>
    103116         <email>cameron@cs.sfu.ca</email>
    104       </author>
     117       </author>
    105118      <author>
    106119         <personname>
     
    220233Boolean-logic operations\footnote{&#8743;, \&#8744; and &#172; denote the boolean AND, OR and NOT operators.}
    221234are used to classify the input bits into a set of <emphasis role="ital">character-class bit streams</emphasis>, which identify key
    222 characters (or groups of characters) with a $1$.
     235characters (or groups of characters) with a <code>1</code>.
    223236For example, one of the fundamental characters in XML is a left-angle bracket.
    224 A character is a<code>&lt; if and only if &#172;(b<subscript>0</subscript> &#8744; b<subscript>1</subscript>) &#8743; (b<subscript>2</subscript> &#8743;
     237A character is an <code>&apos;&lt;&apos; if and only if &#172;(b<subscript>0</subscript> &#8744; b<subscript>1</subscript>) &#8743; (b<subscript>2</subscript> &#8743;
    225238b<subscript>3</subscript>) &#8743; (b<subscript>4</subscript> &#8743; b<subscript>5</subscript>) &#8743; &#172; (b<subscript>6</subscript> &#8744; b<subscript>7</subscript>) = 1</code>.
    226 Similarly, a character is numeric 
    227 <code>[0-9] if and only if $&#172;(b<subscript>0</subscript> &#8744; b<subscript>1</subscript>) &#8743; (b<subscript>2</subscript> &#8743; b<subscript>3</subscript>) &#8743; &#172;(b<subscript>4</subscript> &#8743; (b<subscript>5</subscript> &#8744; b<subscript>6</subscript>))</code>.
     239Similarly, a character is numeric,
     240<code>[0-9] if and only if &#172;(b<subscript>0</subscript> &#8744; b<subscript>1</subscript>) &#8743; (b<subscript>2</subscript> &#8743; b<subscript>3</subscript>) &#8743; &#172;(b<subscript>4</subscript> &#8743; (b<subscript>5</subscript> &#8744; b<subscript>6</subscript>))</code>.
    228241An important observation here is that ranges of characters may
    229242require fewer operations than individual characters and
     
    271284tag openers into start tag marks and end tag marks
    272285depending on the character immediately following the
    273 opener (i.e., ``\verb:/:'') or not.  The remaining three
     286opener (i.e., <code>&quot;/&quot;</code>) or not.  The remaining three
    274287lines show streams that can be computed in subsequent
    275288parsing (using the technique
     
    503516all but the last bytes of multi-byte UTF-8 sequences as
    504517positions for deletion.   For example, the two
    505 Chinese characters \begin{CJK*}{UTF8}{gbsn}䜠奜\end{CJK*}
    506 are represented as two three-byte UTF-8 sequences \verb'E4 BD A0'
    507 and \verb'E5 A5 BD' while the UTF-16 representation must be
    508 compressed down to the two code units \verb'4F60' and \verb'597D'.
     518Chinese characters <code>&#x4F60;&#x597D;</code>
     519are represented as two three-byte UTF-8 sequences <code>E4 BD A0</code>
     520and <code>E5 A5 BD</code> while the UTF-16 representation must be
     521compressed down to the two code units <code>4F60</code> and <code>597D</code>.
    509522In the bit parallel representation, this corresponds to a reduction
    510523from six bit positions representing UTF-8 code units (bytes)
     
    516529for deletion.   In this case, the portion of the mask
    517530corresponding to these input bytes is the bit sequence
    518 \verb'110110'.  Using this approach, transcoding may then be
     531<code>110110</code>.  Using this approach, transcoding may then be
    519532completed by applying parallel deletion and inverse transposition of the
    520533UTF-16 bitstreams\cite{Cameron2008}.
     
    642655Combined with the symbol stream, the parser traverses the content stream to effectively
    643656reconstructs the input document in its output form.
    644 The initial <emphasis role="ital">0</emphasis> indicates an empty content string. The following \verb|>|
     657The initial <emphasis role="ital">0</emphasis> indicates an empty content string. The following <code>&gt;</code>
    645658indicates that a start tag without any attributes is the first element in this text and
    646659the first unused symbol, <code>document</code>, is the element name.
     
    655668</para>
    656669<para>
    657 Following ``\verb`fee`'' is a \verb`=`, which marks the existence of an attribute.
     670Following <code>&apos;fee&apos;</code> is a <code>=</code>, which marks the existence of an attribute.
    658671Because all of the intra-element was performed in the Parabix Subsystem, this must be a legal attribute.
    659672Since attributes can only occur within start tags and must be accompanied by a textual value,
    660673the next symbol in the symbol stream must be the element name of a start tag,
    661 and the following one must be the name of the attribute and the string that follows the \verb`=` must be its value.
    662 However, the subsequent \verb`=` is not treated as an independent attribute because the parser has yet to
    663 read a \verb`>`, which marks the end of a start tag. Thus only one symbol is taken from the symbol stream and
     674and the following one must be the name of the attribute and the string that follows the <code>=</code> must be its value.
     675However, the subsequent <code>=</code> is not treated as an independent attribute because the parser has yet to
     676read a <code>&gt;</code>, which marks the end of a start tag. Thus only one symbol is taken from the symbol stream and
    664677it (along with the string value) is added to the element.
    665 Eventually the parser reaches a \verb`/`, which marks the existence of an end tag. Every end tag requires an
     678Eventually the parser reaches a <code>/</code>, which marks the existence of an end tag. Every end tag requires an
    666679element name, which means they require a symbol. Inter-element validation whenever an empty tag is detected to
    667680ensure that the appropriate scope-nesting rules have been applied.
     
    677690Namespaces are bound to uniform resource identifiers (URIs), which are strings used to identify
    678691specific names or resources.
    679 On line 1 of Figure \ref{fig:namespace1}, the \verb|xmlns| attribute instructs the XML
     692On line 1 of Figure \ref{fig:namespace1}, the <code>xmlns</code> attribute instructs the XML
    680693processor to bind the prefix <code>p</code> to the URI &apos;<code>pub.net</code>&apos; and the default (empty)
    681 prefix to <code>book.org</code>. Thus to the XML processor, the \verb|title| on line 2 and
    682 \verb|price| on line 4 both read as \verb|"book.org":title| and \verb|"book.org":price|
    683 respectively, whereas on line 3 and 5, \verb|p:name| and \verb|price| are seen as
    684 \verb|"pub.net":name| and \verb|"pub.net":price|. Even though the actual element name
    685 \verb|price|, due to namespace scoping rules they are viewed as two uniquely-named items
     694prefix to <code>book.org</code>. Thus to the XML processor, the <code>title</code> on line 2 and
     695<code>price</code> on line 4 both read as <code>&quot;book.org&quot;:title</code> and <code>&quot;book.org&quot;:price</code>
     696respectively, whereas on line 3 and 5, <code>p:name</code> and <code>price</code> are seen as
     697<code>&quot;pub.net&quot;:name</code> and <code>&quot;pub.net&quot;:price</code>. Even though the actual element name
     698<code>price</code>, due to namespace scoping rules they are viewed as two uniquely-named items
    686699because the current vocabulary is determined by the namespace(s) that are in-scope.
    687700</para>
     
    716729<!-- speculation and scope resolution options with a single XOR operation &#8212; even if many alterations are performed. -->
    717730<!-- performance advantage figure?? average cycles/byte cost? -->
    718 When a prefix is declared (e.g., \verb|xmlns:p="pub.net"|), a namespace binding is created that maps
     731When a prefix is declared (e.g., <code>xmlns:p=&quot;pub.net&quot;</code>), a namespace binding is created that maps
    719732the prefix (which are assigned Prefix IDs in the symbol resolution process) to the URI.
    720733Each unique namespace binding has a unique namespace id (NSID) and every prefix contains a bit vector marking every
    721734NSID that has ever been associated with it within the document. For example, in Table \ref{tbl:namespace1}, the
    722 prefix binding set of \verb|p| and \verb|xmlns| would be \verb|01| and \verb|11| respectively.
     735prefix binding set of <code>p</code> and <code>xmlns</code> would be <code>01</code> and <code>11</code> respectively.
    723736To resolve the in-scope namespace binding for each prefix, a bit vector of the currently visible namespaces is
    724737maintained by the system. By ANDing the prefix bit vector with the currently visible namespaces, the in-scope
     
    821834<para>
    822835As discussed in section \ref{background:xerces}, Xerces can be considered a FSM application.
    823 These are ``embarrassingly sequential.''\cite{Asanovic:EECS-2006-183} and notoriously difficult to parallelize.
     836These are &quot;embarrassingly sequential.&quot;\cite{Asanovic:EECS-2006-183} and notoriously difficult to parallelize.
    824837However, icXML is designed to organize processing into logical layers.   
    825838In particular, layers within the Parabix Subsystem are designed to operate
     
    833846the Parabix Subsystem and the Markup Processor into distinct logical layers into two separate stages.
    834847The resultant application, <emphasis role="ital">icXML-p</emphasis>, is a course-grained software-pipeline application.
    835 In this case, the Parabix Subsystem thread $T_1$ reads 16k of XML input $I$ at a time and produces the
    836 content, symbol and URI streams, then stores them in a pre-allocated shared data structure $S$.
    837 The Markup Processor thread $T_2$ consumes $S$, performs well-formedness and grammar-based validation,
     848In this case, the Parabix Subsystem thread <code>T<subscript>1</subscript></code> reads 16k of XML input <code>I</code> at a time and produces the
     849content, symbol and URI streams, then stores them in a pre-allocated shared data structure <code>S</code>.
     850The Markup Processor thread <code>T<subscript>2</subscript></code> consumes <code>S</code>, performs well-formedness and grammar-based validation,
    838851and the provides parsed XML data to the application through the Xerces API. 
    839852The shared data structure is implemented using a ring buffer,
     
    841854In the examples of Figure \ref{threads_timeline1} and \ref{threads_timeline2}, the ring buffer has four entries.
    842855A lock-free mechanism is applied to ensure that each entry can only be read or written by one thread at the same time.
    843 In Figure \ref{threads_timeline1} the processing time of $T_1$ is longer than $T_2$;
    844 thus $T_2$ always waits for $T_1$ to write to the shared memory.
    845 Figure \ref{threads_timeline2} illustrates the scenario in which $T_1$ is faster
    846 and must wait for $T_2$ to finish reading the shared data before it can reuse the memory space.
     856In Figure \ref{threads_timeline1} the processing time of <code>T<subscript>1</subscript></code> is longer than <code>T<subscript>2</subscript></code>;
     857thus <code>T<subscript>2</subscript></code> always waits for <code>T<subscript>1</subscript></code> to write to the shared memory.
     858Figure \ref{threads_timeline2} illustrates the scenario in which <code>T<subscript>1</subscript></code> is faster
     859and must wait for <code>T<subscript>2</subscript></code> to finish reading the shared data before it can reuse the memory space.
    847860</para>
    848861<para>
     
    879892In this case, the single-threaded icXML should achieve a 1.5x speedup over Xerces
    880893so that the total application cost reduces to 67% of the original. 
    881 However, in \icXML-p, our ideal scenario gives us two well-balanced threads
     894However, in icXML-p, our ideal scenario gives us two well-balanced threads
    882895each performing about 33% of the original work.   
    883896In this case, Amdahl's law predicts that we could expect up to a 3x speedup at best.
     
    885898<para>
    886899At the other extreme of our design range, we consider an application
    887 in which core parsing cost is 40%.   Assuming the 2x speedup of
     900in which core parsing cost is 40%.  Assuming the 2x speedup of
    888901the Parabix Subsystem over the corresponding Xerces core, single-threaded
    889 icXML delivers a 25% speedup.   However, the most significant
     902icXML delivers a 25% speedup.  However, the most significant
    890903aspect of our two-stage multi-threaded design then becomes the
    891904ability to hide the entire latency of parsing within the serial time
    892 required by the application.   In this case, we achieve
     905required by the application.  In this case, we achieve
    893906an overall speedup in processing time by 1.67x.
    894907</para>
     
    909922                        <title>Performance</title>             
    910923<para>
    911 We evaluate \xerces{}, icXML, \icXML-p against two benchmarking applications:
     924We evaluate Xerces-C++ 3.1.1, icXML, icXML-p against two benchmarking applications:
    912925the Xerces C++ SAXCount sample application,
    913926and a real world GML to SVG transformation application.
     
    981994CPU cycles per byte for the SAXCount application.
    982995The speedup for icXML over Xerces is 1.3x to 1.8x.
    983 With two threads on the multicore machine, \icXML-p can achieve speedup up to 2.7x.
     996With two threads on the multicore machine, icXML-p can achieve speedup up to 2.7x.
    984997Xerces is substantially slowed by dense markup
    985998but icXML is less affected through a reduction in branches and the use of parallel-processing techniques.
    986 \icXML-p performs better as markup-density increases because the work performed by each stage is
     999icXML-p performs better as markup-density increases because the work performed by each stage is
    9871000well balanced in this application.
    9881001</para>
     
    10321045The example of XML parsing may be considered prototypical
    10331046of finite-state machines applications which have sometimes
    1034 been considered ``embarassingly sequential'' and so
    1035 difficult to parallelize that ``nothing works.''  So the
     1047been considered &quot;embarassingly sequential&quot; and so
     1048difficult to parallelize that &quot;nothing works.&quot;  So the
    10361049case study presented here should be considered an important
    10371050data point in making the case that parallelization can
Note: See TracChangeset for help on using the changeset viewer.