Changeset 3123 for docs


Ignore:
Timestamp:
May 9, 2013, 5:26:33 PM (6 years ago)
Author:
ksherdy
Message:

Check-in rough notes only.

Location:
docs/Working/re
Files:
4 added
4 edited

Legend:

Unmodified
Added
Removed
  • docs/Working/re/re-main.aux

    r3121 r3123  
    11\relax
     2\citation{aho2007}
    23\@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}}
    34\newlabel{Introduction}{{1}{1}}
    4 \@writefile{toc}{\contentsline {section}{\numberline {2}Background}{1}}
    5 \newlabel{Background}{{2}{1}}
    6 \@writefile{toc}{\contentsline {section}{\numberline {3}Methodology}{1}}
    7 \newlabel{Methodology}{{3}{1}}
    8 \@writefile{toc}{\contentsline {section}{\numberline {4}Experimental Results}{1}}
    9 \newlabel{results}{{4}{1}}
    10 \@writefile{toc}{\contentsline {section}{\numberline {5}Conclusion and Future Work}{1}}
    11 \newlabel{conclusion}{{5}{1}}
     5\citation{kleene1951representation}
     6\bibstyle{acm}
     7\bibdata{reference}
     8\bibcite{aho2007}{1}
     9\bibcite{kleene1951representation}{2}
     10\@writefile{toc}{\contentsline {section}{\numberline {2}Background}{2}}
     11\newlabel{Background}{{2}{2}}
     12\@writefile{toc}{\contentsline {section}{\numberline {3}Methodology}{2}}
     13\newlabel{Methodology}{{3}{2}}
     14\@writefile{toc}{\contentsline {section}{\numberline {4}Experimental Results}{2}}
     15\newlabel{results}{{4}{2}}
     16\@writefile{toc}{\contentsline {section}{\numberline {5}Conclusion and Future Work}{2}}
     17\newlabel{conclusion}{{5}{2}}
  • docs/Working/re/re-main.log

    r3121 r3123  
    1 This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009/Debian) (format=pdflatex 2012.10.7)  7 MAY 2013 18:03
     1This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009/Debian) (format=pdflatex 2012.10.7)  9 MAY 2013 16:27
    22entering extended mode
    33 %&-line parsing enabled.
     
    249249\openout1 = `re-main.aux'.
    250250
    251 LaTeX Font Info:    Checking defaults for OML/cmm/m/it on input line 17.
    252 LaTeX Font Info:    ... okay on input line 17.
    253 LaTeX Font Info:    Checking defaults for T1/cmr/m/n on input line 17.
    254 LaTeX Font Info:    ... okay on input line 17.
    255 LaTeX Font Info:    Checking defaults for OT1/cmr/m/n on input line 17.
    256 LaTeX Font Info:    ... okay on input line 17.
    257 LaTeX Font Info:    Checking defaults for OMS/cmsy/m/n on input line 17.
    258 LaTeX Font Info:    ... okay on input line 17.
    259 LaTeX Font Info:    Checking defaults for OMX/cmex/m/n on input line 17.
    260 LaTeX Font Info:    ... okay on input line 17.
    261 LaTeX Font Info:    Checking defaults for U/cmr/m/n on input line 17.
    262 LaTeX Font Info:    ... okay on input line 17.
    263 LaTeX Font Info:    External font `cmex10' loaded for size
    264 (Font)              <12> on input line 20.
    265 LaTeX Font Info:    External font `cmex10' loaded for size
    266 (Font)              <8> on input line 20.
    267 LaTeX Font Info:    External font `cmex10' loaded for size
    268 (Font)              <6> on input line 20.
     251LaTeX Font Info:    Checking defaults for OML/cmm/m/it on input line 20.
     252LaTeX Font Info:    ... okay on input line 20.
     253LaTeX Font Info:    Checking defaults for T1/cmr/m/n on input line 20.
     254LaTeX Font Info:    ... okay on input line 20.
     255LaTeX Font Info:    Checking defaults for OT1/cmr/m/n on input line 20.
     256LaTeX Font Info:    ... okay on input line 20.
     257LaTeX Font Info:    Checking defaults for OMS/cmsy/m/n on input line 20.
     258LaTeX Font Info:    ... okay on input line 20.
     259LaTeX Font Info:    Checking defaults for OMX/cmex/m/n on input line 20.
     260LaTeX Font Info:    ... okay on input line 20.
     261LaTeX Font Info:    Checking defaults for U/cmr/m/n on input line 20.
     262LaTeX Font Info:    ... okay on input line 20.
     263LaTeX Font Info:    External font `cmex10' loaded for size
     264(Font)              <12> on input line 23.
     265LaTeX Font Info:    External font `cmex10' loaded for size
     266(Font)              <8> on input line 23.
     267LaTeX Font Info:    External font `cmex10' loaded for size
     268(Font)              <6> on input line 23.
     269LaTeX Font Info:    External font `cmex10' loaded for size
     270(Font)              <7> on input line 47.
     271LaTeX Font Info:    External font `cmex10' loaded for size
     272(Font)              <5> on input line 47.
    269273 [1
    270274
    271 {/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}] (./re-main.aux) )
     275{/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map}] (./re-main.bbl) [2]
     276(./re-main.aux) )
    272277Here is how much of TeX's memory you used:
    273  442 strings out of 493848
    274  4315 string characters out of 1152823
    275  52050 words of memory out of 3000000
    276  3776 multiletter control sequences out of 15000+50000
    277  7887 words of font info for 28 fonts, out of 3000000 for 9000
     278 455 strings out of 493848
     279 4467 string characters out of 1152823
     280 54050 words of memory out of 3000000
     281 3784 multiletter control sequences out of 15000+50000
     282 8534 words of font info for 30 fonts, out of 3000000 for 9000
    278283 714 hyphenation exceptions out of 8191
    279284 23i,6n,19p,165b,199s stack positions out of 5000i,500n,10000p,200000b,50000s
    280 </usr/sha
    281 re/texmf-texlive/fonts/type1/public/amsfonts/cm/cmbx12.pfb></usr/share/texmf-te
    282 xlive/fonts/type1/public/amsfonts/cm/cmbx9.pfb></usr/share/texmf-texlive/fonts/
    283 type1/public/amsfonts/cm/cmr10.pfb></usr/share/texmf-texlive/fonts/type1/public
    284 /amsfonts/cm/cmr12.pfb></usr/share/texmf-texlive/fonts/type1/public/amsfonts/cm
    285 /cmr17.pfb>
    286 Output written on re-main.pdf (1 page, 57666 bytes).
     285</usr/share/texmf-texlive/fonts/type1/public/amsfonts/cm/cmbx1
     2862.pfb></usr/share/texmf-texlive/fonts/type1/public/amsfonts/cm/cmbx9.pfb></usr/
     287share/texmf-texlive/fonts/type1/public/amsfonts/cm/cmcsc10.pfb></usr/share/texm
     288f-texlive/fonts/type1/public/amsfonts/cm/cmmi7.pfb></usr/share/texmf-texlive/fo
     289nts/type1/public/amsfonts/cm/cmr10.pfb></usr/share/texmf-texlive/fonts/type1/pu
     290blic/amsfonts/cm/cmr12.pfb></usr/share/texmf-texlive/fonts/type1/public/amsfont
     291s/cm/cmr17.pfb></usr/share/texmf-texlive/fonts/type1/public/amsfonts/cm/cmr7.pf
     292b></usr/share/texmf-texlive/fonts/type1/public/amsfonts/cm/cmr9.pfb></usr/share
     293/texmf-texlive/fonts/type1/public/amsfonts/cm/cmti10.pfb>
     294Output written on re-main.pdf (2 pages, 132063 bytes).
    287295PDF statistics:
    288  26 PDF objects out of 1000 (max. 8388607)
     296 49 PDF objects out of 1000 (max. 8388607)
    289297 0 named destinations out of 1000 (max. 500000)
    290298 1 words of extra memory for PDF output out of 10000 (max. 10000000)
  • docs/Working/re/re-main.tex

    r3121 r3123  
    22\usepackage[utf8]{inputenc}
    33
     4\def \Bitstream{Bit Stream}
     5\def \bitstream{bit stream}
     6
    47%opening
    5 \title{Fast Regular Expression Matching using Parallel Bit Streams}
     8\title{Fast Regular Expression Matching using Parallel \Bitstream{}s}
    69\author{
    710{Robert D. Cameron} \\
     
    2225\begin{abstract}
    2326
     27A data parallel regular expression matching method using the concept of bitstream technology
     28is introduced and studied in application to the problem of fast regular expression matching.
     29
     30The method is based on the concept of parallel
     31\bitstream{} technology, in which parallel streams of bits are formed such
     32that each stream comprises bits in one-to-one correspondence with the
     33character code units of a source data stream.
     34
     35On processors supporting W-bit addition operations, the method processes W source characters 
     36in parallel and performs up to W finite state transitions per clock cycle.
     37
     38Performance results show a dramatic speed-up over traditional and state-of-the-art alternatives.
     39
     40
    2441\end{abstract}
    2542
     
    2744\label{Introduction}
    2845%\input{introduction.tex}
     46
     47Given a text T$_{1..n}$ of n characters and a pattern P, the pattern matching problem can be
     48stated as follows. Find all the text positions of T that start an occurrence of P. 
     49Alternatively, one may want all the final positions of occurrences. Some
     50applications require slightly different output such as the line that matches the pattern.
     51
     52The pattern P can be just a simple string,
     53but it can also be, for example, a regular expression.
     54In this paper our focus is regular expression matching.
     55
     56A regular expression, or pattern, is an expression that specifies a set of strings.
     57A regular expression is composed of (i) basic strings and (ii)
     58union, concatenation and Kleene closure of other regular expressions.
     59To avoid parentheses it is assumed that the Kleene star has the highest priority,
     60next concatenation and then alternation, however, most formalisms provides grouping
     61operators to allow the definition of scope and operator precedence.
     62
     63Readers unfamiliar with the concept of regular expression matching are referred
     64classical texts such as \cite{aho2007}.
     65
     66\subsection{Grep}
     67
     68Regular expression matching is commonly performed using a variety of
     69publically available software tools. Namely, the UNIX grep family,
     70the GNU grep family, agrep, cgrep, sgrep, nrgrep, and Perl regular
     71expressions \cite{Abou-assaleh04surveyof}.
     72
     73% Grep
     74% Unix grep
     75% Gnu grep
     76% agrep
     77% nrgrep
     78
     79Of particular interest are well-studied and performance oriented
     80Gnu grep, agrep, and nrgrep.
     81
     82As such, we compare the performance of our parallel \bitstream{} techniques against
     83various grep concentrate on the simpler case of
     84reporting initial or final occurrence positions.
     85
     86% Background
     87
     88% History
     89
     90Regular expresssion matching is an extensively studied problem with a multitude
     91of algorithms and software tools developed to the demands of particular problem contexts.
     92
     93Historically, the origins of regular expression matching date back to automata theory
     94and formal language theory developed by Kleene in the 1950s \cite{kleene1951representation}.
     95
     96In 1959, Dana and Scott demonstrated that
     97NFAs can be simulated using Deterministic Finite Automata (DFA) in which each DFA
     98state corresponds to a set of NFA states.
     99
     100Thompson, in 1968, is credited with the first construction to convert regular expressions
     101to nondeterministic finite automata (NFA) for regular expression matching.
     102
     103Thompson’s publication \cite{thompson1968} marked the beginning of a long line of
     104regular expression implementations that construct automata for pattern matching.
     105
     106The traditional technique [16] to search a regular expression of length m in
     107a text of length n is to first convert the expression into a non-deterministic
     108automaton (NFA) with O(m) nodes. It is possible to search the text using the
     109NFA directly in O(mn) worst case time. The cost comes from the fact that
     110more than one state of the NFA may be active at each step, and therefore all
     111may need to be updated. A more effcient choice is to convert the NFA into a
     112DFA. A DFA has only a single active state and allows to search the text at
     113O(n) worst-case optimal. The problem with this approach is that the DFA
     114may have O(2^m) states.
     115
     116Overall, the general process is first to build a
     117NFA from the regular expression, convert the NFA into a
     118DFA, minimize the number of states in the DFA,
     119and finally use the DFA to scan the text.
    29120
    30121\section{Background}
     
    44135%\input{conclusion.tex}
    45136
     137{
     138  \bibliographystyle{acm}
     139  \bibliography{reference}
     140}
     141
    46142\end{document}
Note: See TracChangeset for help on using the changeset viewer.