Changeset 2505 for docs


Timestamp:
Oct 19, 2012, 6:56:35 PM (7 years ago)
Author:
nmedfort
Message:

more edits

Location:
docs/Working/icXML
Files:
6 edited

  • docs/Working/icXML/arch-charactersetadapters.tex

    r2496 r2505  
    22\label{arch:character-set-adapter}
    33
    4 The first major difference between Xerces and \icXML{} is the use of Character Set Adapters (CSAs). In Xerces, all input
    5 is transcoded into UTF-16 to simplify the parsing costs of Xerces itself and to provide the end-consumer with a single
    6 encoding format.
     4In Xerces, all input is transcoded into UTF-16 to reduce the parsing costs of Xerces itself and to
     5provide the end-consumer with a single encoding format.
     6\icXML{} uses Character Set Adapters (CSAs) to transform data from its source encoding into a set of basis and
     7lexical bit streams.
  • docs/Working/icXML/arch-errorhandling.tex

    r2496 r2505  
    4242column number.
    4343
    44 \begin{figure}[ht]
    45 {\bf TODO: An example of a skip mask, error mask, and the raw data and transcoded data for it.
    46 Should a multi-byte character be used and/or some CRLFs to show the difficulties?}
    47 \label{fig:error_mask}
    48 \caption{}
    49 \end{figure}
     44% \begin{figure}[ht]
     45% {\bf TODO: An example of a skip mask, error mask, and the raw data and transcoded data for it.
     46% Should a multi-byte character be used and/or some CRLFs to show the difficulties?}
     47% \label{fig:error_mask}
     48% \caption{}
     49% \end{figure}
    5050
    5151The \MP{} is a state-driven machine. As such, error detection within it is very similar to Xerces.
     
    5353The \MP{} parses the content stream, which is a series of tagged UTF-16 strings.
    5454Each string is normalized in accordance with the XML specification.
    55 All symbol data and unnecessary whitespace is eliminated from the stream.
    56 This means it is impossible to directly assess the current location using only the cursor position within the content stream.
     55All symbol data and unnecessary whitespace are eliminated from the stream;
     56thus it is impossible to derive the current location using only the content stream.
    5757To calculate the location, the \MP{} borrows three additional pieces of information from the \PS{}:
    5858the line-feed, skip mask, and a {\it deletion mask stream}, which is a bit stream denoting the (code-unit) position of every
  • docs/Working/icXML/arch-namespace.tex

    r2494 r2505  
    3232
    3333
    34 In both Xerces and ICXML, every URI has a one-to-one mapping to a URI ID.
     34In both Xerces and \icXML{}, every URI has a one-to-one mapping to a URI ID.
    3535These persist for the lifetime of the application through the use of a global URI pool.
    3636Xerces maintains a stack of namespace scopes that is pushed (popped) every time a start tag (end tag) occurs
     
    4141(2) those that repeatedly modify the namespaces in predictable patterns.
    4242
    43 For that reason, ICXML contains an independent namespace stack and utilizes bit vectors to cheaply perform
     43For that reason, \icXML{} contains an independent namespace stack and utilizes bit vectors to cheaply perform
    4444% speculation and
    4545scope resolution options with a single XOR operation---even if many alterations are performed.
  • docs/Working/icXML/arch-overview.tex

    r2496 r2505  
    1313and validates context-specific character set issues, such as tokenization of qualified-names and
    1414ensures each character is legal w.r.t. the XML specification.
    15 The {\it Scanner} pulls data through the reader and constructs the intermediate (and near-final)
    16 representation of the document; it deals with all issues related to entity expansion, validates
     15The {\it Scanner} pulls data through the reader and constructs the intermediate representation (IR)
     16of the document; it deals with all issues related to entity expansion, validates
    1717the XML well-formedness constraints and any character set encoding issues that cannot
    1818be completely handled by the reader or transcoder (e.g., surrogate characters, validation
     
    2121with handling namespace scoping issues between different XML vocabularies and facilitates
    2222the scanner with the construction and utilization of Schema grammar structures.
    23 The {\it Validator} takes the intermediate representation produced by the Scanner (and
     23The {\it Validator} takes the IR produced by the Scanner (and
    2424potentially annotated by the Namespace Binder) and assesses whether the final output matches
    25 the user-defined DTD and Schema grammar(s) before passing the information to the end-user.
     25the user-defined DTD and Schema grammar(s) before passing it to the end-user.
    2626
    2727\begin{figure}
     
    3939mirrors Xerces's Transcoder duties; however instead of producing UTF-16 it produces a
    4040set of lexical bit streams, similar to those shown in Figure \ref{fig:parabix1}.
    41 These lexical bit streams are later transformed into UTF-16 in the Content Buffer Generator, after additional processing is performed.
     41These lexical bit streams are later transformed into UTF-16 in the Content Stream Generator,
     42after additional processing is performed.
    4243The first precursor to producing UTF-16 is the {\it Parallel Markup Parser} phase.
    4344It takes the lexical streams and produces a set of marker bit streams in which a 1-bit identifies
     
    4849The {\it Line-Column Tracker} uses the lexical information to keep track of the cursor position(s) through the use of an
    4950optimized population count algorithm; this is described in Section \ref{section:arch:errorhandling}.
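
The population-count idea behind the Line-Column Tracker can be illustrated with a minimal sketch. It assumes a line-feed bit stream in which bit $i$ of a 64-bit block is set when code unit $i$ is a line feed (little-endian bit order); the function name is hypothetical and `__builtin_popcountll` is a GCC/Clang builtin, not necessarily the optimized algorithm \icXML{} uses.

```cpp
#include <cstdint>

// Hypothetical helper: number of line feeds strictly before bit position
// `pos` within one 64-bit block of the line-feed stream.
// Assumption: bit i of `lf_block` corresponds to code unit i of the block.
unsigned line_feeds_before(std::uint64_t lf_block, unsigned pos) {
    // Build a mask that keeps bits [0, pos); guard the shift when pos >= 64.
    const std::uint64_t mask = (pos >= 64) ? ~0ULL : ((1ULL << pos) - 1);
    // Population count of the masked block (GCC/Clang builtin).
    return static_cast<unsigned>(__builtin_popcountll(lf_block & mask));
}
```

Adding this count to the number of line feeds seen in earlier blocks would give the line number of the cursor, under the stated bit-order assumption.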
    50 From here, two major data-independent branches remain: the {\bf symbol resolver} and the {\bf content stream generator}.
    51 % The output of both are required by the \MP{}.
    52 Apart from the Parabix framework, another core difference between Xerces and \icXML{} is the use of symbols.
    53 A typical XML document will contain relatively few unique element and attribute names---but each of them will occur frequently throughout the document.
    54 In \icXML{}, names are represented by distinct symbol structures and global identifiers (GIDs).
    55 Using the information produced by the parallel markup parser, the {\it Symbol Resolver} uses a bitscan intrinsic to
    56 iterate through a symbol bit stream (64-bits at a time) to generate a set of GIDs.
    57 % It keys each symbol on its raw data representation, which means it can potentially be run in parallel with the content stream generator.
    58 One of the main advantages of using GIDs is that grammar information can be associated with the symbol itself and help bypass
    59 the lookup cost in the validation process.
    60 The final component of the \PS{} is the {\it Content Stream Generator}. This component has a multitude of
    61 responsibilities, which will be discussed in Section \ref{sec:parfilter}, but its primary function is to produce
    62 near-final UTF-16 content.
     51From here, two data-independent branches exist: the Symbol Resolver and Content Preparation Unit.
    6352
    64 The {\it \MP{}} parses a compressed representation of the XML document, generated by the
    65 symbol resolver and content stream generator, to validate and produce the final (sequential) output for the end user.
    66 The {\it WF checker} performs all remaining inter-element wellformedness validation that would be too costly
     53\icXML{} represents elements and attributes as distinct data structures, called symbols,
     54each with its own global identifier (GID).
     55Using the {\bf symbol marker streams} produced by the Parallel Markup Parser, the {\it Symbol Resolver} scans through
     56the raw data to produce a stream (series) of GIDs, called the {\it symbol stream}.
     57A typical XML file will contain relatively few unique element and attribute names---but each of them will occur
     58frequently. % throughout the document.
     59% Grammar information can be associated with each symbol and can help reduce the look-up cost of the later Validation process.
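
The r2496 wording of this passage describes the Symbol Resolver as iterating through a symbol bit stream 64 bits at a time with a bitscan intrinsic. A minimal sketch of that iteration pattern follows. The function name is hypothetical, the bit order (bit $i$ marks code unit $i$) is an assumption, and `__builtin_ctzll` is a GCC/Clang builtin standing in for whichever intrinsic \icXML{} actually uses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: positions of all 1-bits in one 64-bit block of a
// marker stream, lowest position first.
std::vector<unsigned> marker_positions(std::uint64_t word) {
    std::vector<unsigned> positions;
    while (word != 0) {
        // Bitscan forward: index of the lowest set bit (GCC/Clang builtin).
        positions.push_back(static_cast<unsigned>(__builtin_ctzll(word)));
        // Clear the lowest set bit so the next scan finds the next marker.
        word &= word - 1;
    }
    return positions;
}
```

Each reported position would be the starting point of one symbol lookup; the loop runs once per marker rather than once per bit.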
     60
     61The final components of the \PS{} are the {\it Content Preparation Unit} and {\it Content Stream Generator}.
     62The former takes the (transposed) basis bit streams and selectively filters them, according to the
     63information provided by the Parallel Markup Parser, and the latter transforms the
     64filtered streams into the tagged UTF-16 {\it content stream}.
     65This is discussed in Section \ref{sec:parfilter}.
     66Combined, the symbol stream and content stream form \icXML{}'s compressed IR of the XML document.
     67
     68% This component has a multitude of
     69% responsibilities, which will be discussed in Section \ref{sec:parfilter}, but its primary function is to produce
     70% near-final UTF-16 content.
     71
     72The {\it \MP{}} parses the IR to validate it and produce the sequential output for the end user.
     73The {\it WF checker} performs inter-element wellformedness validation that would be too costly
    6774to perform in bitspace, such as ensuring every start tag has a matching end tag.
    68 The {\it Namespace Processor} replaces Xerces's namespace binding functionality. Unlike Xerces,
    69 this is performed as a discrete phase and simply produces a set of URI identifiers (URI IDs), to
    70 be associated with each occurrence of a symbol.
     75Xerces's namespace binding functionality is replaced by the {\it Namespace Processor}. Unlike Xerces,
     76it is performed as a discrete phase, which produces a set of URI identifiers (URI IDs) that are
     77associated with each symbol occurrence.
    7178This is discussed in Section \ref{section:arch:namespacehandling}.
    72 The final {\it Validation} process is responsible for the same tasks as Xerces's validator, however,
    73 the majority of the grammar look up operations are performed beforehand and stored within the symbols themselves.
     79The final {\it Validation} process is responsible for the same tasks as Xerces's validator; however,
     80the majority of the grammar look-ups are performed beforehand and stored within the symbols themselves.
    7481
    7582\begin{figure}
  • docs/Working/icXML/background-parabix.tex

    r2494 r2505  
    4141\end{tabular}
    4242\end{center}
    43 
    4443\caption{8-bit ASCII Basis Bit Streams}
    4544\label{fig:BitStreamsExample}
  • docs/Working/icXML/multithread.tex

    r2502 r2505  
    1 \section{Leveraging SIMD Parallelism for Multicore: Pipeline Parallelism}
     1%\section{Leveraging SIMD Parallelism for Multicore: Pipeline Parallelism}
     2\section{Leveraging Pipeline Parallelism}
    23% As discussed in section \ref{background:xerces}, Xerces can be considered a complex finite-state machine
    34% Finite-state machine belongs to the hardest application class to parallelize and process efficiently
     
    89As discussed in section \ref{background:xerces}, Xerces can be considered a complex finite-state machine,
    910the hardest type of application to parallelize and process efficiently \cite{Asanovic:EECS-2006-183}.
    10 However, \icXML{} provides logical layers between modules,
    11 which naturally enables pipeline parallel processing.
     11However, \icXML{} provides logical layers between modules, which naturally enables pipeline parallel processing.
    1212
    13 In our pipeline model, each thread is in charge of one module or one group of modules.
    14 A straight forward division is to take advantage of the layer between Parabix Subsystem and Markup Processor.
    15 In this case, the first thread $T_1$ will read 16k of XML input $I$ at a time
    16 and process all the modules in Parabix Subsystem to generates
    17 % content buffer, symbol array, URI array, context ID array and store them to a pre-allocated shared data structure $S$.
    18 content buffer, symbol array, URI array, and store them to a pre-allocated shared data structure $S$.
    19 The second thread $T_2$ consumes the data provided by the first thread and
    20 goes through all the modules in Markup Processor and writes output $O$.
    21 
     13In the pipeline model, each thread is in charge of a group of modules.
     14A straightforward division is to take advantage of the layer between the \PS{} and the \MP{}.
     15In this case, the first thread $T_1$ reads 16k of XML input $I$ at a time and produces the
     16content, symbol and URI streams, then stores them in a pre-allocated shared data structure $S$.
     17The second thread $T_2$ consumes $S$, performs well-formedness and grammar-based validation,
     18and then generates the output $O$.
    2219The shared data structure is implemented using a ring buffer,
    23 where each entry consists of all the arrays shared between the two threads with size of 160k.
     20where every entry contains an independent set of data streams.
    2421In the example of Figures \ref{threads_timeline1} and \ref{threads_timeline2}, the ring buffer has four entries.
    2522A lock-free mechanism is applied to ensure that each entry can only be read or written by one thread at a time.
    26 In Figure \ref{threads_timeline1}, the processing time of the first thread is longer,
    27 thus the second thread always wait for the first thread to finish processing one chunk of input
    28 and write to the shared memory.
    29 Figure \ref{threads_timeline2} illustrates a different situation where the second thread is slower
    30 and the first thread has to wait for the second thread finishing reading the shared data before it can reuse the memory space.
     23In Figure \ref{threads_timeline1} the processing time of $T_1$ is longer than that of $T_2$;
     24thus $T_2$ always waits for $T_1$ to write to the shared memory.
     25Figure \ref{threads_timeline2} illustrates the scenario in which $T_1$ is faster
     26and must wait for $T_2$ to finish reading the shared data before it can reuse the memory space.
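
The two-thread hand-off described above can be sketched as a single-producer, single-consumer ring buffer. This is an illustration under stated assumptions, not the \icXML{} implementation: `Entry`, `SpscRing` and `run_pipeline` are hypothetical names, one integer stands in for an entry's set of data streams, and both threads spin with `std::this_thread::yield()` while waiting.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for one ring-buffer entry; the real entry would
// hold the content, symbol and URI streams.
struct Entry {
    std::uint64_t payload = 0;
};

// Lock-free ring for exactly one producer (T1) and one consumer (T2).
template <std::size_t N>
class SpscRing {
public:
    // Producer side: returns a free slot, or nullptr if all N are in use.
    Entry* begin_write() {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h - tail_.load(std::memory_order_acquire) == N) return nullptr;
        return &slots_[h % N];
    }
    // Publishes the slot so the consumer may read it.
    void end_write() {
        head_.store(head_.load(std::memory_order_relaxed) + 1,
                    std::memory_order_release);
    }
    // Consumer side: returns the oldest filled slot, or nullptr if empty.
    Entry* begin_read() {
        const std::size_t t = tail_.load(std::memory_order_relaxed);
        if (t == head_.load(std::memory_order_acquire)) return nullptr;
        return &slots_[t % N];
    }
    // Releases the slot so the producer may reuse it.
    void end_read() {
        tail_.store(tail_.load(std::memory_order_relaxed) + 1,
                    std::memory_order_release);
    }

private:
    std::array<Entry, N> slots_{};
    std::atomic<std::size_t> head_{0};  // entries written so far (T1 only)
    std::atomic<std::size_t> tail_{0};  // entries consumed so far (T2 only)
};

// Pushes the values 0..chunks-1 through a four-entry ring and returns
// the sum observed by the consumer thread.
std::uint64_t run_pipeline(std::uint32_t chunks) {
    SpscRing<4> ring;
    std::uint64_t sum = 0;
    std::thread t1([&] {  // plays the role of T1: fills entries
        for (std::uint32_t i = 0; i < chunks; ++i) {
            Entry* e;
            while ((e = ring.begin_write()) == nullptr) std::this_thread::yield();
            e->payload = i;
            ring.end_write();
        }
    });
    std::thread t2([&] {  // plays the role of T2: drains entries
        for (std::uint32_t i = 0; i < chunks; ++i) {
            Entry* e;
            while ((e = ring.begin_read()) == nullptr) std::this_thread::yield();
            sum += e->payload;
            ring.end_read();
        }
    });
    t1.join();
    t2.join();
    return sum;
}
```

When the producer is slower, the consumer spins in `begin_read`; when the consumer is slower, the producer spins in `begin_write` until a slot is released. These correspond to the two waiting scenarios in the figures.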
     27
     28
     29% In our pipeline model, each thread is in charge of one module or one group of modules.
     30% A straight forward division is to take advantage of the layer between \PS{} and \MP{}.
     31% In this case, the first thread $T_1$ will read 16k of XML input $I$ at a time
     32% and process all the modules in \PS{} to generates
     33% content buffer, symbol array, URI array, and store them to a pre-allocated shared data structure $S$.
     34% The second thread $T_2$ consumes the data provided by the first thread and
     35% goes through all the modules in Markup Processor and writes output $O$.
     36
     37% The shared data structure is implemented using a ring buffer,
     38% where each entry consists of all the arrays shared between the two threads with size of 160k.
     39% In the example of Figure \ref{threads_timeline1} and \ref{threads_timeline2}, the ring buffer has four entries.
     40% A lock-free mechanism is applied to ensure that each entry can only be read or written by one thread at the same time.
     41% In Figure \ref{threads_timeline1}, the processing time of the first thread is longer,
     42% thus the second thread always wait for the first thread to finish processing one chunk of input
     43% and write to the shared memory.
     44% Figure \ref{threads_timeline2} illustrates a different situation where the second thread is slower
     45% and the first thread has to wait for the second thread finishing reading the shared data before it can reuse the memory space.
    3146
    3247To understand the performance improvement that can be achieved by this pipeline model,