source: docs/Working/re/ppopp-re.tex @ 3467

Last change on this file since 3467 was 3467, checked in by cameron, 6 years ago

Updates

File size: 9.8 KB
Line 
1%-----------------------------------------------------------------------------
2%
3%               Template for sigplanconf LaTeX Class
4%
5% Name:         sigplanconf-template.tex
6%
7% Purpose:      A template for sigplanconf.cls, which is a LaTeX 2e class
8%               file for SIGPLAN conference proceedings.
9%
10% Guide:        Refer to "Author's Guide to the ACM SIGPLAN Class,"
11%               sigplanconf-guide.pdf
12%
13% Author:       Paul C. Anagnostopoulos
14%               Windfall Software
15%               978 371-2316
16%               paul@windfall.com
17%
18% Created:      15 February 2005
19%
20%-----------------------------------------------------------------------------
21
22
23\documentclass{sigplanconf}
24
25% The following \documentclass options may be useful:
26
27% preprint      Remove this option only once the paper is in final form.
28% 10pt          To set in 10-point type instead of 9-point.
29% 11pt          To set in 11-point type instead of 9-point.
30% authoryear    To obtain author/year citation style instead of numeric.
31
32\usepackage{amsmath}
33
34
35\begin{document}
36
37\special{papersize=8.5in,11in}
38\setlength{\pdfpageheight}{\paperheight}
39\setlength{\pdfpagewidth}{\paperwidth}
40
41\conferenceinfo{PPoPP 2014}{February 15-19, 2013, Orland, Florida, United States} 
42\copyrightyear{2013} 
43\copyrightdata{978-1-nnnn-nnnn-n/yy/mm} 
44\doi{nnnnnnn.nnnnnnn}
45
46% Uncomment one of the following two, if you are not going for the
47% traditional copyright transfer agreement.
48
49%\exclusivelicense                % ACM gets exclusive license to publish,
50                                  % you retain copyright
51
52%\permissiontopublish             % ACM gets nonexclusive license to publish
53                                  % (paid open-access papers,
54                                  % short abstracts)
55
56\titlebanner{banner above paper title}        % These are ignored unless
57\preprintfooter{short description of paper}   % 'preprint' option specified.
58
59\title{Bitwise Data Parallelism in Regular Expression Matching}
60\subtitle{Subtitle Text, if any}
61
62\authorinfo{Robert D. Cameron \and Kenneth S. Herdy \and Dan Lin \and Meng Lin \and Ben Hull \and Thomas S. Shermer \and Arrvindh Shriraman}
63           {Simon Fraser University}
64           {\{cameron,ksherdy,lindanl,linmengl,bhull,shermer,ashriram\}@cs.sfu.ca}
65
66\maketitle
67
68\begin{abstract}
69\input{abstract}
70\end{abstract}
71\category{Theory of computation}{Formal languages and automata theory}{Regular languages}
72\category{Computer systems organization}{Parallel architectures}{Single instruction, multiple data}
73
74% general terms are not compulsory anymore,
75% you may leave them out
76%\terms
77%term1, term2
78
79\keywords
80regular expression matching, grep, parallel bit stream technology
81
82\section{Introduction}
83
84The use of regular expressions to search texts for occurrences
85of string patterns has a long history and
86remains a pervasive technique throughout computing applications today.
87{\em a brief history}
88
89There has been considerable interest in using parallelization techniques
90to improve the performance of regular expression matching.
91{\em a brief review}  \cite{scarpazza2011top}
92GPUs\cite{lin2013accelerating}
93FPGAs\cite{lee2007high}
94SIMD\cite{salapura2012accelerating}
95Cell BE\cite{scarpazza2011top}
96
97
98
99Whereas the existing approaches to parallelization have been
100based on adapting traditional sequential algorithms to emergent
101parallel architectures, we introduce both a new algorithmic
102approach and its implementation on SIMD and GPU architectures.
103This approach relies on a bitwise data parallel view of text
104streams as well as a surprising use of addition to match
105runs of characters in a single step.  The closest previous
106work is that underlying bit-parallel XML parsing using 128-bit SSE2 SIMD
107technology together with a parallel scanning primitive also
108based on addition \cite{cameron2011parallel}.   
109However, in contrast to the deterministic, longest-match
110scanning associated with the ScanThru primitive of that
111work, we introduce here a new primitive MatchStar
112that can be used in full generality for nondeterministic
113regular expression matching.   We also introduce a long-stream
114addition technique involving a further application of MatchStar
115that enables us to scale the technique to $n$-bit addition
116in $log_{64} n$ steps.   We ultimately apply this technique,
117for example, to perform
118synchronized 4096-bit addition on GPU warps of 64 threads.
119
120There is also a strong keyword match between the bit-parallel
121data streams used in our approach and the bit-parallelism
122used for NFA state transitions in the classical algorithms of
123Wu and Manber \cite{wu1992agrep}, Baez-Yates and Gonnet \cite{baeza1992new}
124and Navarro and Raffinot \cite{navarro1998bit}.
125However those algorithms use bit-parallelism in a fundamentally
126different way: representing all possible current NFA states
127as a bit vector and performing parallel transitions to a new
128set of states using table lookups and bitwise logic.    Whereas
129our approach can match multiple characters per step, bit-parallel
130NFA algorithms proceed through the input one byte at a time.
131Nevertheless, the agrep \cite{wu1992agrep} and
132nrgrep \cite{navarro2000} programs implemented using these techniques remain
133among the strongest competitors in regular expression matching
134performance, so we include them in our comparative evaluation.
135
136
137The remainder of this paper is organized as follows.
138Section \ref{sec:unbounded} presents our basic algorithm using
139a model of unbounded bitwise data parallelism. 
140Section \ref{sec:regexp} 
141Section \ref{sec:analysis} 
142Section \ref{sec:SSE2} 
143Section \ref{sec:AVX2} 
144Section \ref{sec:GPU} 
145Section \ref{sec:Concl} concludes the paper with a discussion of areas for future work.
146
147
148\section{An Unbounded Model for Bitwise Regular Expression Matching}\label{sec:unbounded}
149
150Whereas all traditional regular expression matching engines are
151fundamentally based on an element-at-a-time processing model,
152the approach we introduce in this paper is based on quite a
153different conceptual model: processing all elements of an input data
154stream simultaneously.    Depending on the available parallel
155processing resources, an actual implementation may divide an
156input stream into segments and process the segments
157sequentially.   Within each segment, all elements of the input
158stream are processed in parallel.   The approach is inherently
159scalable and can be adapted to a variety of parallel architectures.
160The fundamental primitives required are bitwise logic and
161long-stream addition. 
162
163A key concept in this approach is the derivation of bit streams
164that are parallel to the input data stream, i.e., in one-to-one
165correspondence with the data element positions of the input
166streams.   Typically, the input stream is a byte stream comprising
167the 8-bit character code units of a particular encoding such
168as extended ASCII, ISO-8859-1 or UTF-8.   However, the method may also
169easily be used with wider code units such as the 16-bit code units of
170UTF-16.   In the case of a byte stream, the first step is to transpose
171the byte stream into eight parallel bit streams, such that bit stream
172$i$ comprises the $i^\text{th}$ bit of each byte.   These streams form
173a set of basis bit streams from which many other parallel bit
174streams can be calculated, such as character class bit
175streams such that each bit $j$ of the stream specifies
176whether character $j$ of the input stream is in the class
177or not.   {\bf perhaps a figure here.}
178
179Using the SIMD capabilities of commodity processors,
180the basis bit streams and character class bit streams may be
181efficiently computed using the Parabix framework that
182has been developed and used extensively for high-performance
183XML parsing \cite{}.  For example, working with the 128-bit
184SIMD registers on commodity Intel and AMD processors,
185the Parabix framework can be employed to process input
186streams in blocks of 128 bytes at a time.   However, the
187general model is not dependent on any particular data block size
188and can be scaled for arbitrary sized blocks depending on
189available resources.   
190
191\section{Bitwise Regular Expression Matching with MatchStar}\label{sec:regexp}
192
193
194\subsection{Bitwise Data Parallelism vs. Bit-Parallel State Transition}
195The bitap algorithms use bitwise operations in an orthogonal way.
196
197\section{Analytical Comparison with DFA and NFA Implementations}\label{sec:analysis}
198
199\begin{enumerate}
200\item Operations
201\item Memory behaviour per input byte: note tables of DFA/NFA.
202
203Bille and Throup \em{Faster regular expression matching}\cite{bille2009faster}
204
205\end{enumerate}
206
207
208
209\section{Commodity SIMD Implementation and Experimental Evaluation}\label{sec:SSE2}
210\subsection{Implementation Notes}
211\subsection{Evaluation Methodology}
212\subsection{Comparison}
213
214
215
216\section{SIMD Scalability}\label{sec:AVX2}
217\subsection{AVX Stream Addition}
218  We use MatchStar for carry propagation.
219
220\section{GPU Implementation}\label{sec:GPU}
221
222\section{Miscellaneous}
223\subsection{Skipping}
224\subsection{Unicode}
225
226\section{Conclusion}\label{sec:Concl}
227\subsection{Contributions}
228\begin{enumerate}
229\item New Algorithm Class for Regular Expression Matching
230\item MatchStar for Character Class Repetition
231\item Long Stream Addition
232\item Implementations showing performance and scalability
233\end{enumerate}
234\subsection{Future Work}
235\begin{enumerate}
236\item Substring capture
237\item Unicode character classes
238\item Nonregular regexp features: zero-width assertions, backreferences
239\item Multicore for ruleset parallelism
240\end{enumerate}
241
242
243
244%\appendix
245%\section{Appendix Title}
246
247%This is the text of the appendix, if you need one.
248
249\acks
250
251This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada and
252MITACS, Inc.
253
254% We recommend abbrvnat bibliography style.
255
256\bibliographystyle{abbrvnat}
257
258% The bibliography should be embedded for final submission.
259 
260\bibliography{reference}
261
262%\begin{thebibliography}{}
263%\softraggedright
264
265%\bibitem[Smith et~al.(2009)Smith, Jones]{smith02}
266%P. Q. Smith, and X. Y. Jones. ...reference text...
267%
268%\end{thebibliography}
269
270
271\end{document}
272
273
Note: See TracBrowser for help on using the repository browser.