Ignore:
Timestamp:
May 10, 2013, 1:10:27 PM (6 years ago)
Author:
ksherdy
Message:

Rough notes.

File:
1 edited

Legend:

Unmodified
Added
Removed
  • docs/Working/re/re-main.tex

    r3123 r3124  
    3838Performance results show a dramatic speed-up over traditional and state-of-the-art alternatives.
    3939
    40 
    4140\end{abstract}
    4241
     
    4443\label{Introduction}
    4544%\input{introduction.tex}
     45
     46Regular expresssion matching is an extensively studied problem with a multitude
     47of algorithms and software tools developed to the demands of particular problem contexts.
    4648
    4749Given a text T$_{1..n}$ of n characters and a pattern P, the pattern matching problem can be
     
    5254The pattern P can be just a simple string,
    5355but it can also be, for example, a regular expression.
    54 In this paper our focus is regular expression matching.
    5556
    5657A regular expression, or pattern, is an expression that specifies a set of strings.
    57 A regular expression is composed of (i) basic strings and (ii)
     58A regular expression is composed of (i) simple strings (ii) the empty or (ii)
    5859union, concatenation and Kleene closure of other regular expressions.
    5960To avoid parentheses it is assumed that the Kleene star has the highest priority,
    6061next concatenation and then alternation, however, most formalisms provides grouping
    6162operators to allow the definition of scope and operator precedence.
    62 
    6363Readers unfamiliar with the concept of regular expression matching are referred
    6464classical texts such as \cite{aho2007}.
    6565
    66 \subsection{Grep}
    67 
    6866Regular expression matching is commonly performed using a variety of
    69 publically available software tools. Namely, the UNIX grep family,
    70 the GNU grep family, agrep, cgrep, sgrep, nrgrep, and Perl regular
     67publically available software tools. The most prominent, UNIX grep,
     68Gnu grep, agrep, cgrep, nrgrep, and Perl regular
    7169expressions \cite{Abou-assaleh04surveyof}.
    7270
    73 % Grep
    74 % Unix grep
    75 % Gnu grep
    76 % agrep
    77 % nrgrep
     71Amongst these Gnu grep, agrep, and nrgrep are widely known and considered as
     72the fastest regular expression matching tools in practice \cite{}.
    7873
    79 Of particular interest are well-studied and performance oriented
    80 Gnu grep, agrep, and nrgrep.
     74Of particular interest to this study are the performance oriented Gnu grep, agrep, and nrgrep.
    8175
    82 As such, we compare the performance of our parallel \bitstream{} techniques against
     76% motivation / previous work
     77Although the finite state machine methods used in the scanning and parsing of
     78text streams is considered to be the hardest of the “13 dwarves” to parallelize
     79[1], parallel bitstream technology shows considerable promise for these types of
     80applications [3, 4]. In this approach, character streams are processed W positions
     81at a time using the W-bit SIMD registers commonly found on commodity processors
     82(e.g., 128-bit XMM registers on Intel/AMD chips). This is achieved by
     83first slicing the byte streams into eight separate basis bitstreams, one for each bit
     84position within the byte. These basis bitstreams are then combined with bitwise
     85logic and shifting operations to compute further parallel bit streams of interest.
     86
     87We further increase the parallelism in our methods by introducing a new parallel
     88scanning primitive which we have coined 'Match Star' that returns all matches in a single
     89operation and eliminates the need for back tracking ... (ELABORATE)
     90
     91--- STATE the content of the next sections. ---
     92
     93
     94We compare the performance of our parallel \bitstream{} techniques against
    8395various grep concentrate on the simpler case of
    8496reporting initial or final occurrence positions.
     97
     98
     99\section{Background}
     100\label{Background}
     101%\input{background.tex}
    85102
    86103% Background
    87104
    88105% History
    89 
    90 Regular expresssion matching is an extensively studied problem with a multitude
    91 of algorithms and software tools developed to the demands of particular problem contexts.
    92106
    93107Historically, the origins of regular expression matching date back to automata theory
     
    114128may have O(2^m) states.
    115129
    116 Overall, the general process is first to build a
    117 NFA from the regular expression, convert the NFA into a
    118 DFA, minimize the number of states in the DFA,
    119 and finally use the DFA to scan the text.
     130In general, the general process is first to build a
     131NFA from the regular expression and simulate the NFA on text input,
     132or alternatively to convert the NFA into a
     133DFA, optionally minimize the number of states in the DFA,
     134and finally simulate the DFA on text input.
    120135
    121 \section{Background}
    122 \label{Background}
    123 %\input{background.tex}
    124136
    125137\section{Methodology}
Note: See TracChangeset for help on using the changeset viewer.