r2471
% Parallel: blocks/segments/buffers through layers
Parabix-style XML parsers utilize a concept of layers: as each block of source text is transformed into a set of lexical bit streams, it undergoes a series of operations that can be grouped together in logical layers, such as transposition, character classification, and the lexical analysis phases. Each layer is pipeline parallel, requiring neither speculation nor pre-parsing\cite{HPCA2012}. In adapting to the requirements of the Xerces sequential parsing API, however, the resultant parallel bit streams, taken individually, may be out of order with respect to the source document. They must hence be amalgamated and iterated through to produce sequential output.
% The end user should not be expected to work with out-of-order data ...
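As a rough illustration of the first two layers and of the final amalgamation step, the following Python sketch transposes a short byte string into eight basis bit streams, derives character-class streams from them with bitwise logic, and then merges the per-class positions back into document order. It is a sketch only: the names \verb|transpose|, \verb|char_class| and \verb|positions| are hypothetical, Python integers stand in for SIMD registers, and none of this is taken from the Parabix or Xerces code.

```python
def transpose(data):
    """Return 8 basis bit streams: bit i of stream k is bit k of byte i."""
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def char_class(streams, ch, length):
    """Bit stream with bit i set where byte i equals ch (pure bitwise logic)."""
    mask = (1 << length) - 1
    code = ord(ch)
    result = mask
    for k in range(8):
        if (code >> k) & 1:
            result &= streams[k]
        else:
            result &= ~streams[k] & mask
    return result


def positions(stream):
    """List the indices of the set bits, lowest first."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


source = b"<a><b/>"
basis = transpose(source)                    # transposition layer
lt = char_class(basis, "<", len(source))     # character classification layer
gt = char_class(basis, ">", len(source))

# Each stream on its own only knows about one class; a sequential API needs
# the events merged back into document order.
events = sorted([(i, "<") for i in positions(lt)] +
                [(i, ">") for i in positions(gt)])
print(events)
```

The final sort is the point of the example: each stream is computed independently, so a consumer expecting sequential events needs an explicit merge of this kind.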
r2483
opener (i.e., \verb:/:'') or not.  The remaining three lines show streams that can be computed in subsequent parsing (using the technique of bitstream addition \cite{cameron-EuroPar2011}), namely streams marking the element names, attribute names and attribute values of tags. {\it Do we need to explain how those can be computed from the input text or do we simply refer them to prior papers?}

Two intuitions may help explain how the Parabix approach can lead

is the scan complete at this position yet?  Rather than computing these individual decision bits, an approach that computes many of them in parallel (e.g., 128 bytes at a time using 128-bit registers) should provide substantial benefit.
Previous studies have shown that the Parabix approach improves many aspects of XML processing,
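The idea of scanning by addition can be sketched as follows. Assuming a marker stream $M$ whose bits sit at the start of runs in a character-class stream $C$, the expression $(M + C) \wedge \neg C$ moves every marker to the first position past its run in a single addition, because the carry ripples through the run. The Python sketch below applies this to element names. The helper names (\verb|class_stream|, \verb|scan_thru|, \verb|positions|) are hypothetical, the name-character class is simplified to alphanumerics, and unbounded Python integers stand in for 128-bit registers.

```python
def class_stream(text, predicate):
    """Bit stream with bit i set where predicate(text[i]) holds."""
    stream = 0
    for i, ch in enumerate(text):
        if predicate(ch):
            stream |= 1 << i
    return stream


def scan_thru(markers, char_class):
    """Advance each marker past the run of char_class bits it sits on.

    Assumes every marker lies on a char_class position; the carry of the
    addition then stops at the first position after the run.
    """
    return (markers + char_class) & ~char_class


def positions(stream):
    """List the indices of the set bits, lowest first."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


text = "<doc><item/></doc>"
lt = class_stream(text, lambda ch: ch == "<")
name_chars = class_stream(text, str.isalnum)   # simplified name class (assumption)

# Start tags only: the position after '<' must be a name character,
# which excludes the '/' of the end tag.
name_starts = (lt << 1) & name_chars
name_ends = scan_thru(name_starts, name_chars)

names = [text[s:e] for s, e in zip(positions(name_starts), positions(name_ends))]
print(names)
```

Both name scans complete in one addition, regardless of how long each name is; a byte-at-a-time loop would instead test each character in turn.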
r2483
% Should we show a val-grind summary of a few files in a linechart form?
Xerces, like all traditional parsers, processes XML documents sequentially, one byte at a time, from the first to the last byte of input data. Each byte passes through several processing layers and is classified and eventually validated within the context of the document state. This introduces implicit dependencies between the various tasks within the application that make it
r2483
of interesting research prototypes using both SIMD and multicore parallelism.   Most works have investigated data parallel solutions on multicore architectures using various strategies to break input documents into segments that can be allocated to different cores.

standards-compliant open-source parser that is widely used in commercial practice.    The challenge of this work is to parallelize the Xerces parser in such a way as to preserve the existing APIs while offering worthwhile end-to-end acceleration of XML processing.

seeking to expose as many critical aspects of XML parsing as possible for parallelization.   Overall, we have employed Parabix-style methods in transcoding, tokenization and tag parsing,  parallel string comparison methods in symbol resolution, bit parallel methods in namespace processing, as well as staged
