source: docs/HPCA2012/final_ieee/10-related.tex @ 3889

Last change on this file since 3889 was 1788, checked in by cameron, 7 years ago

Refs. clean-up

File size: 3.5 KB
1\section{Related Work}
4There has been work in the past which has sought to address the
5overheads of text processing in specific applications (e.g., XML
6parsers) and have adopted specialized hardware and software solutions
7for each application.
8% Event-based SAX (Simple API for XML) parsers avoid the tree
9% construction costs of the more flexible DOM (Document Object Model)
10% parsers \cite{Perkins05}.  Nicola and John specifically identified
11XML parsing as a threat to database performance  \cite{NicolaJohn03}
12outlines a number of potential directions for
13improving performance.  The commercial importance
14of XML parsing has spurred the development of numerous multi-threaded
15and hardware-based approaches: Multithreaded XML techniques include
16preparsing the XML file to locate key partitioning
17points~\cite{ParaDOM2009,LiWangLiuLi2009} and speculative
18p-DFAs~\cite{ZhangPanChiu09}. Hardware methods include custom XML
19chips \cite{Leventhal2009} and FPGA-based implementations
20\cite{DaiNiZhu2010}. Others have explored the design of custom
21hardware for bit parallel operations for text search in network
22processors~\cite{tan-sherwood-isca-2005}. Intel's SSE4.2 instructions targeted
23XML parsers, but these have not seen widespread use because of portability
24concerns and the programming challenges that accompany low level
27Parallel bitstream methods were first introduced in application
28to the problem of UTF-8 to UTF-16 transcoding \cite{Cameron2008}.
29Our first XML parser used SSE2 instructions for bitstream transposition
30and character class formation coupled with processor bit scan
31operations to accelerate the sequential scans in a recursive
32descent parser \cite{CameronHerdyLin2008}.
33The technique has also been employed to accelerate string
34matching operations in protein identification \cite{JMBE:31@99}.
35The work on an inductive doubling instruction set
36~\cite{CameronLin2009} established the basis for our
37portable SIMD run-time environments, while the introduction
38of parallel scanning primitives ~\cite{cameron-EuroPar2011} 
39has provided the basis for our Pablo compiler technology on
40unbounded bitstreams.
42 In this paper, we have developed a generalized
43Parabix architecture and have described the software tool chain that
44programmers can use to build scalable text processing applications on
45commodity multicores. We have explored in the detail the tradeoffs
46between the SIMD implementations across processor generations (i.e.,
47SSE vs AVX) and multiple platfoms (ARM vs Intel). Finally, we have
48also explored the benefits of using pipeline-based multicore
49parallelism as a technique to eliminate imbalances in SIMD
50bitstream-based parallelization and improve overall efficiency.
59% To accelerate XML parsingmost of the recent work has
60% focused on parallelization through the use of multicore parallelism
61% for chip multiprocessors \cite{ZhangPanChiu09, },
64% In this paper, we have introduce parallel bit streams as a general
65% abstraction to parallelize and improve the performance general text
66% processing. We have developed a compiler tool chain and the runtime to
67% enable bit streams to exploit SIMD extensions found on commodity
68% processors.  We are also the first to perform a detailed analysis of
69% SIMD instruction extensions across three generations of Intel
70% processors including the new 256-bit AVX extensions. Finally, we have
71% shown the benefits of using multithreading in conjunction with data
72% parallel phases of the application.
Note: See TracBrowser for help on using the repository browser.