Changeset 3881 for docs


Timestamp: Jun 17, 2014, 10:27:39 PM
Author: ksherdy
Message: Edited introduction.

Location: docs/Working/re
Files: 2 edited

Legend: added lines are prefixed with +, removed lines with -, and unmodified context lines with a single space.
  • docs/Working/re/pact051-cameron.tex

r3880 → r3881

 \section{Introduction}

-The use of regular expressions to search texts for occurrences
-of string patterns has a long history and
-remains a pervasive technique throughout computing applications today.
+The use of regular expressions to search texts
+for patterns has a long history and
+remains an important technique.
 % {\em a brief history}
-The origins of regular expression matching date back to automata theory
-developed by Kleene in the 1950s \cite{kleene1951}.
-Thompson \cite{thompson1968} is credited with the first construction to convert regular expressions
+%The origins of regular expression matching date back to automata theory
+%studied by Kleene in the 1950s \cite{kleene1951}.
+Thompson \cite{thompson1968} is credited with the
+first construction to convert regular expressions
 to nondeterministic finite automata (NFA).
-Following Thompson's approach, a regular expression of length $m$ is converted
-to an NFA with $O(m)$ states. It is then possible to search a text of length $n$ using the
-NFA in worst case $O(mn)$ time. Often, a more efficient choice
-is to convert an NFA into a DFA. A DFA has a single active state at any time
+Following Thompson's approach,
+a regular expression of length $m$ is converted
+to an NFA with $O(m)$ states. Using the
+NFA it is possible to search a text
+of length $n$ in $O(mn)$ time.
+Frequently, a more efficient choice
+is to convert an NFA into a deterministic finite automaton (DFA).
+A DFA maintains a single active state
 throughout the matching process and
-hence it is possible to search a text of length $n$ in $O(n)$ time\footnote{It is well
+hence, using a DFA it is possible to
+search a text of length $n$ in $O(n)$ time\footnote{It is well
 known that the conversion of an NFA to an equivalent DFA may result
-in state explosion. That is, the number of resultant DFA states may increase exponentially.}.
-
-A significant proportion of the research in fast regular expression matching can be
+in \textit{state explosion}, i.e., the number of resultant DFA states may increase exponentially.}.
+
+A significant proportion of the research
+in fast regular expression matching can be
 regarded as the ``quest for efficient automata'' \cite{navarro98fastand}.
 In \cite{baeza1992new}, Baeza-Yates and Gonnet
…
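The bounds quoted in the hunk above follow directly from how the two automata are simulated: an NFA may have up to $m$ active states to update per input symbol, giving $O(mn)$ scanning, while a DFA has exactly one active state, giving $O(n)$. The C sketch below is an editorial illustration of that contrast and is not part of the changeset or of the paper's implementation; the structure layouts, the 64-state cap, and the assumption that the transition tables are built by a separate compilation step are simplifications chosen only to keep the example short.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NFA_STATES 64          /* one uint64_t bit per NFA state */

/* Hypothetical NFA produced by a Thompson-style construction:
 * O(m) states for a regular expression of length m. */
typedef struct {
    uint64_t next[MAX_NFA_STATES][256]; /* next[s][c]: states reachable from s on byte c */
    uint64_t start;                     /* bitmask of start states  */
    uint64_t accept;                    /* bitmask of accept states */
    int      m;                         /* number of states in use  */
} NFA;

/* O(mn): each of the n input bytes may require updating up to m active states. */
bool nfa_search(const NFA *nfa, const unsigned char *text, size_t n) {
    uint64_t active = nfa->start;
    for (size_t i = 0; i < n; i++) {
        uint64_t succ = 0;
        for (int s = 0; s < nfa->m; s++)            /* O(m) work per byte */
            if (active & (1ULL << s))
                succ |= nfa->next[s][text[i]];
        active = succ;
        if (active & nfa->accept)
            return true;
    }
    return false;
}

/* Hypothetical DFA obtained from the NFA by subset construction; the number
 * of DFA states may grow exponentially (the state explosion noted above). */
typedef struct {
    const int  (*next)[256];  /* next[s][c]: the single successor state, -1 = dead */
    const bool *accept;       /* accept[s]: is s an accepting state?               */
    int         start;
} DFA;

/* O(n): exactly one table lookup per input byte, independent of the regex size. */
bool dfa_search(const DFA *dfa, const unsigned char *text, size_t n) {
    int s = dfa->start;
    for (size_t i = 0; i < n; i++) {
        s = dfa->next[s][text[i]];
        if (s < 0)
            return false;
        if (dfa->accept[s])
            return true;
    }
    return false;
}

The single table lookup per byte in dfa_search is what yields the $O(n)$ bound; the subset construction that would populate the DFA table from the NFA is also where the state explosion mentioned in the footnote can arise.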
 performs can be reduced by a factor $w$.
 Building on this observation, the Shift-Or algorithm simulates an NFA using
-bitwise operations and achieves $O(\frac{nm}{w})$ worst-case time \cite{navarro2000}.
+bitwise operations and achieves $O(\frac{nm}{w})$ time in the worst case \cite{navarro2000}.
 A disadvantage of the Shift-Or approach
 is an inability to skip input characters.
 Simple string matching algorithms,
-such as the Boyer-Moore family of algorithms \cite{boyer1977fast,horspool1980practical}, skip input characters
+such as the Boyer-Moore family of algorithms \cite{boyer1977fast,horspool1980practical},
+skip input characters
 to achieve sublinear times in the average case.
 % Backward Dawg Matching (BDM) string matching algorithms \cite{crochemore1994text}
…
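The bit-parallel idea behind Shift-Or is compact: a per-character bitmask records the pattern positions where that character may occur, and each text byte then costs one shift and one OR on a machine word, so $w$ pattern positions are updated at once. The fragment below is an editorial sketch of that idea for a single literal pattern, not code from the changeset; the paper's setting is the more general NFA simulation, and the function name and the assumption that the pattern fits in one 64-bit word are illustrative only.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Shift-Or for one literal pattern (hypothetical helper, not from the paper).
 * B[c] has bit i clear iff pattern[i] == c; the state word D has bit i clear
 * when pattern[0..i] matches the text ending at the current position.
 * Returns the index just past the first match, or -1 if there is none. */
long shift_or_search(const char *pattern, const char *text) {
    size_t m = strlen(pattern), n = strlen(text);
    if (m == 0 || m > 64)                     /* sketch: pattern must fit one word */
        return -1;

    uint64_t B[256];
    for (int c = 0; c < 256; c++)
        B[c] = ~0ULL;                         /* default: character occurs nowhere */
    for (size_t i = 0; i < m; i++)
        B[(unsigned char)pattern[i]] &= ~(1ULL << i);

    uint64_t D = ~0ULL;                       /* no partial matches yet */
    for (size_t j = 0; j < n; j++) {
        D = (D << 1) | B[(unsigned char)text[j]];   /* one shift + one OR per byte */
        if ((D & (1ULL << (m - 1))) == 0)     /* bit m-1 clear: full match ends at j */
            return (long)(j + 1);
    }
    return -1;
}

int main(void) {
    printf("%ld\n", shift_or_search("abcab", "xxabcabyy"));  /* prints 7 */
    return 0;
}

Because every byte of the text is inspected, there is no opportunity to skip input characters, which is exactly the disadvantage noted above relative to the Boyer-Moore family.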
 matching tools for simple patterns \cite{navarro2000}.

-There has been considerable recent
-interest in accelerating regular expression matching
-on parallel hardware
-such as multicore processors (CPUs),
+Recently, there has been considerable
+interest in the use of
+parallel hardware such as multicore processors (CPUs),
 general purpose graphics processing units (GPGPUs),
 field-programmable gate arrays (FPGAs),
-and specialized architectures such as
-the Cell Broadband Engine (Cell BE). % FPGA results (synthesis of patterns into logic circuits) vs. memory based approaches (STTs in memory)
-%CPU
+or specialized architectures such as
+the Cell Broadband Engine (Cell BE) to
+accelerate regular expression matching. % FPGA results (synthesis of patterns into logic circuits) vs. memory based approaches (STTs in memory)
+Generally, speedups are achieved by
+using parallel hardware features to improve the throughput of
+multiple instances of a matching problem at a time,
+i.e., by matching against sets of patterns or multiple input streams.
+In contrast, our approach uses parallelism to accelerate the throughput of
+a single problem instance, i.e., a single regular expression
+matched against a single input stream.
+
+%\paragraph{Multicore}
+In related work targeting multicore hardware,
 Scarpazza and Braudaway \cite{scarpazza2008fast} demonstrated that
 text processing algorithms that exhibit irregular memory access patterns
-can be efficiently executed on multicore hardware.
-In related work, Pasetto et al presented a flexible tool that
+can be efficiently executed.
+Pasetto et al \cite{pasetto2010} presented a flexible tool that
 performs small-ruleset regular expression matching at a rate of
-2.88 Gbps per chip on Intel Xeon E5472 hardware \cite{pasetto2010}.
+2.88 Gbps per chip on Intel Xeon E5472 hardware.
 Naghmouchi et al \cite{scarpazza2011top,naghmouchi2010} demonstrated that the Aho-Corasick (AC)
 string matching algorithm \cite{aho1975} is well-suited for parallel
 implementation on multicore CPUs, GPGPUs and the Cell BE.
-On each system, both thread-level parallelism (cores) and data-level parallelism
-(SIMD units) were leveraged for performance.
-Salapura et al \cite{salapura2012accelerating} advocated the use of vector-style processing for regular expressions
+% On each system, both thread-level parallelism (cores) and data-level parallelism
+% (SIMD units) were leveraged for performance.
+Salapura et al \cite{salapura2012accelerating} advocated
+the use of vector-style processing for regular expressions
 in business analytics applications and leveraged the SIMD hardware available
 on multicore processors to achieve a speedup of greater than 1.8 over a
-range of data sizes of interest.
-%Cell
-In \cite{scarpazza2008}, Scarpazza and Russell presented a SIMD tokenizer
+range of data sizes.
+%\paragraph{Cell Broadband Engine}
+On the Cell Broadband Engine, Scarpazza and Russell presented a SIMD tokenizer
 that delivered 1.00--1.78 Gbps on a single
-Cell BE chip and extended this approach for emulation on the Intel Larrabee
+chip \cite{scarpazza2008} and extended this approach for emulation on the Intel Larrabee
 instruction set \cite{scarpazza2009larrabee}.
-On the Cell BE, Scarpazza \cite{scarpazza2009cell} described a pattern matching
+Furthermore, in \cite{scarpazza2009cell} Scarpazza described a pattern matching
 implementation that delivered a throughput of 40
 Gbps for a small dictionary of approximately 100 patterns and a throughput of 3.3-3.4
…
 presented a string matching implementation for automata that achieved
 4 Gbps on the Cell BE.
-% GPU
+%\paragraph{GPGPUs}
 On GPGPUs, Tumeo et al \cite{tumeo2010efficient} presented a chunk-based
 implementation of the AC algorithm for
…
 although system throughput was limited to 15 Gbps
 \cite{lin2013accelerating}.
-Most recently, Mytkowicz et al have developed a method for combining
-SIMD parallellism and data parallelism
+Most recently, Mytkowicz et al \cite{DBLP:conf/asplos/MytkowiczMS14}
+have developed a method for combining
+SIMD parallelism and data parallelism on multicore hardware.
+Of these related works, this approach stands
+out since it also focuses on the acceleration
+of matching against a single input stream.

 Whereas the existing approaches to parallelization have been