source: docs/Working/re/ppopp-re.tex @ 3463

Last change on this file since 3463 was 3463, checked in by cameron, 6 years ago

Unbounded parallelism paras

File size: 7.4 KB
Line 
1%-----------------------------------------------------------------------------
2%
3%               Template for sigplanconf LaTeX Class
4%
5% Name:         sigplanconf-template.tex
6%
7% Purpose:      A template for sigplanconf.cls, which is a LaTeX 2e class
8%               file for SIGPLAN conference proceedings.
9%
10% Guide:        Refer to "Author's Guide to the ACM SIGPLAN Class,"
11%               sigplanconf-guide.pdf
12%
13% Author:       Paul C. Anagnostopoulos
14%               Windfall Software
15%               978 371-2316
16%               paul@windfall.com
17%
18% Created:      15 February 2005
19%
20%-----------------------------------------------------------------------------
21
22
23\documentclass{sigplanconf}
24
25% The following \documentclass options may be useful:
26
27% preprint      Remove this option only once the paper is in final form.
28% 10pt          To set in 10-point type instead of 9-point.
29% 11pt          To set in 11-point type instead of 9-point.
30% authoryear    To obtain author/year citation style instead of numeric.
31
32\usepackage{amsmath}
33
34
35\begin{document}
36
37\special{papersize=8.5in,11in}
38\setlength{\pdfpageheight}{\paperheight}
39\setlength{\pdfpagewidth}{\paperwidth}
40
41\conferenceinfo{PPoPP 2014}{February 15-19, 2013, Orland, Florida, United States} 
42\copyrightyear{2013} 
43\copyrightdata{978-1-nnnn-nnnn-n/yy/mm} 
44\doi{nnnnnnn.nnnnnnn}
45
46% Uncomment one of the following two, if you are not going for the
47% traditional copyright transfer agreement.
48
49%\exclusivelicense                % ACM gets exclusive license to publish,
50                                  % you retain copyright
51
52%\permissiontopublish             % ACM gets nonexclusive license to publish
53                                  % (paid open-access papers,
54                                  % short abstracts)
55
56\titlebanner{banner above paper title}        % These are ignored unless
57\preprintfooter{short description of paper}   % 'preprint' option specified.
58
59\title{Bitwise Data Parallelism in Regular Expression Matching}
60\subtitle{Subtitle Text, if any}
61
62\authorinfo{Robert D. Cameron \and Kenneth S. Herdy \and Dan Lin \and Meng Lin \and Ben Hull \and Thomas S. Shermer \and Arrvindh Shriraman}
63           {Simon Fraser University}
64           {\{cameron,ksherdy,lindanl,linmengl,bhull,shermer,ashriram\}@cs.sfu.ca}
65
66\maketitle
67
68\begin{abstract}
69\input{abstract}
70\end{abstract}
71\category{Theory of computation}{Formal languages and automata theory}{Regular languages}
72\category{Computer systems organization}{Parallel architectures}{Single instruction, multiple data}
73
74% general terms are not compulsory anymore,
75% you may leave them out
76%\terms
77%term1, term2
78
79\keywords
80regular expression matching, grep, parallel bit stream technology
81
82\section{Introduction}
83
84
85
86
87
88
89
90
91
92Parallelism in regular expressions.
93\begin{enumerate}
94\item stream parallelism
95\item NFA state parallelism
96\item ruleset parallelism
97\item data parallelism
98\end{enumerate}
99
100
101
102The remainder of this paper is organized as follows.
103Section \ref{sec:bitwise} 
104Section \ref{sec:regexp} 
105Section \ref{sec:analysis} 
106Section \ref{sec:SSE2} 
107Section \ref{sec:AVX2} 
108Section \ref{sec:GPU} 
109Section \ref{sec:Concl} concludes the paper with a discussion of areas for future work.
110
111
112\section{Unbounded Bitwise Data Parallelism}\label{sec:bitwise}
113
114Whereas all traditional regular expression matching engines are
115fundamentally based on an element-at-a-time processing model,
116the approach we introduce in this paper is based on quite a
117different conceptual model: processing all elements of an input data
118stream simultaneously.    Depending on the available parallel
119processing resources, an actual implementation may divide an
120input stream into segments and process the segments
121sequentially.   Within each segment, all elements of the input
122stream are processed in parallel.   The approach is inherently
123scalable and can be adapted to a variety of parallel architectures.
124The fundamental primitives required are bitwise logic and
125long-stream addition. 
126
127A key concept in this approach is the derivation of bit streams
128that are parallel to the input data stream, i.e., in one-to-one
129correspondence with the data element positions of the input
130streams.   Typically, the input stream is a byte stream comprising
131the 8-bit character code units of a particular encoding such
132as extended ASCII, ISO-8859-1 or UTF-8.   However, the method may also
133easily be used with wider code units such as the 16-bit code units of
134UTF-16.   In the case of a byte stream, the first step is to transpose
135the byte stream into eight parallel bit streams, such that bit stream
136$i$ comprises the $i$\th bit of each byte.   These streams form
137a set of basis bit streams from which many other parallel bit
138streams can be calculated, such as character class bit
139streams such that each bit $j$ of the stream specifies
140whether character $j$ of the input stream is in the class
141or not.   {\bf perhaps a figure here.}
142
143Using the SIMD capabilities of commodity processors,
144the basis bit streams and character class bit streams may be
145efficiently computed using the Parabix framework that
146has been developed and used extensively for high-performance
147XML parsing \cite{}.  For example, working with the 128-bit
148SIMD registers on commodity Intel and AMD processors,
149the Parabix framework can be employed to process input
150streams in blocks of 128 bytes at a time.   However, the
151general model is not dependent on any particular data block size
152and can be scaled for arbitrary sized blocks depending on
153available resources.   
154
155\section{Bitwise Regular Expression Matching with MatchStar}\label{sec:regexp}
156
157
158\subsection{Bitwise Data Parallelism vs. Bit-Parallel State Transition}
159The bitap algorithms use bitwise operations in an orthogonal way.
160
161\section{Analytical Comparison with DFA and NFA Implementations}\label{sec:analysis}
162
163\begin{enumerate}
164\item Operations
165\item Memory behaviour per input byte: note tables of DFA/NFA.
166
167Bille and Throup \em{Faster regular expression matching}\cite{bille2009faster}
168
169\end{enumerate}
170
171
172
173\section{Commodity SIMD Implementation and Experimental Evaluation}\label{sec:SSE2}
174\subsection{Implementation Notes}
175\subsection{Evaluation Methodology}
176\subsection{Comparison}
177
178
179
180\section{SIMD Scalability}\label{sec:AVX2}
181\subsection{AVX Stream Addition}
182  We use MatchStar for carry propagation.
183
184\section{GPU Implementation}\label{sec:GPU}
185
186\section{Miscellaneous}
187\subsection{Skipping}
188\subsection{Unicode}
189
190\section{Conclusion}\label{sec:Concl}
191\subsection{Contributions}
192\begin{enumerate}
193\item New Algorithm Class for Regular Expression Matching
194\item MatchStar for Character Class Repetition
195\item Long Stream Addition
196\item Implementations showing performance and scalability
197\end{enumerate}
198\subsection{Future Work}
199\begin{enumerate}
200\item Substring capture
201\item Unicode character classes
202\item Nonregular regexp features: zero-width assertions, backreferences
203\item Multicore for ruleset parallelism
204\end{enumerate}
205
206
207
208%\appendix
209%\section{Appendix Title}
210
211%This is the text of the appendix, if you need one.
212
213\acks
214
215This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada and
216MITACS, Inc.
217
218% We recommend abbrvnat bibliography style.
219
220\bibliographystyle{abbrvnat}
221
222% The bibliography should be embedded for final submission.
223 
224\bibliography{reference}
225
226%\begin{thebibliography}{}
227%\softraggedright
228
229%\bibitem[Smith et~al.(2009)Smith, Jones]{smith02}
230%P. Q. Smith, and X. Y. Jones. ...reference text...
231%
232%\end{thebibliography}
233
234
235\end{document}
236
237
Note: See TracBrowser for help on using the repository browser.