# Changeset 3135

Ignore:
Timestamp:
May 13, 2013, 3:20:24 PM (6 years ago)
Message:

Minor edit.

Location:
docs/Working/re
Files:
2 edited

### Legend:

Unmodified
 r3126 when a partially successful search path fails. The remainder of this paper is organized as follows. Section~\ref{Background} presents background material on classic regular expression pattern matching techniques and provides insight into the efficiency of traditional regular expression software tools. Section~\ref{Parallel Bitwise Data Streams} describes out data parallel efficiency of traditional regular expression software tools. Next, Section~\ref{Data Parallel Bitstreams} describes our data parallel regular expression matching techniques. Section~\ref{Compiler Technology} Section~\ref{Methodology} describes the evaluation framework and Section~\ref{Experimental Results} presents a detailed performance analysis of our data parallel bitstream techniques against Gnu grep, agrep, and nr-grep. Section~\ref{Conclusion} concludes the paper. regular expression implementations that construct automata for pattern matching. The traditional technique [16] to search a regular expression of length m in a text of length n is to first convert the expression into a non-deterministic automaton (NFA) with O(m) nodes. It is possible to search the text using the NFA directly in O(mn) worst case time. The cost comes from the fact that more than one state of the NFA may be active at each step, and therefore all may need to be updated. A more effcient choice is to convert the NFA into a DFA. A DFA has only a single active state and allows to search the text at O(n) worst-case optimal. The problem with this approach is that the DFA may have O($2^m$) states. In general, the general process is first to build a NFA from the regular expression and simulate the NFA on text input, or alternatively to convert the NFA into a DFA, optionally minimize the number of states in the DFA, and finally simulate the DFA on text input. Following Thompson's approach, a regular expression of length m is first converted to an NFA with O(m) nodes. It is possible to search a text of length n using the NFA directly in O(mn) worst case time. Alternatively, a more effcient choice is to convert the NFA into a DFA. A DFA has only a single active state and allows to search the text at O(n) worst-case optimal. It is well known that the conversion of the NFA to the DFA may result in the state explosion problem. That is the resultant DFA may have O($2^m$) states. \section{Parallel Bitwise Data Streams}
 r3126 \usepackage[utf8]{inputenc} \def \Bitstream{Bit Stream} \def \bitstream{bit stream} %opening \title{Fast Regular Expression Matching using Parallel \Bitstream{}s} \title{Fast Regular Expression Matching using Parallel Bitstreams} \author{ {Robert D. Cameron} \\ The method is based on the concept of parallel \bitstream{} technology, in which parallel streams of bits are formed such bitstream technology, in which parallel streams of bits are formed such that each stream comprises bits in one-to-one correspondence with the character code units of a source data stream. when a partially successful search path fails. The remainder of this paper is organized as follows. Section~\ref{Background} presents background material on classic regular expression pattern matching techniques and provides insight into the efficiency of traditional regular expression software tools. Section~\ref{Bitwise Parallel Data Streams} describes out data parallel efficiency of traditional regular expression software tools. Next, Section~\ref{Data Parallel Bitstreams} describes our data parallel regular expression matching techniques. Section~\ref{Compiler Technology} Section~\ref{Methodology} describes the evaluation framework and Section~\ref{Experimental Results} presents a detailed performance analysis of our data parallel \bitstream{} techniques against presents a detailed performance analysis of our data parallel bitstream techniques against Gnu grep, agrep, and nr-grep. Section~\ref{conclusion} concludes the paper. Section~\ref{Conclusion} concludes the paper. \section{Background} regular expression implementations that construct automata for pattern matching. The traditional technique [16] to search a regular expression of length m in a text of length n is to first convert the expression into a non-deterministic automaton (NFA) with O(m) nodes. It is possible to search the text using the NFA directly in O(mn) worst case time. The cost comes from the fact that more than one state of the NFA may be active at each step, and therefore all may need to be updated. A more effcient choice is to convert the NFA into a DFA. A DFA has only a single active state and allows to search the text at O(n) worst-case optimal. The problem with this approach is that the DFA may have O(2^m) states. In general, the general process is first to build a NFA from the regular expression and simulate the NFA on text input, or alternatively to convert the NFA into a DFA, optionally minimize the number of states in the DFA, and finally simulate the DFA on text input. Following Thompson's approach, a regular expression of length m is first converted to an NFA with O(m) nodes. It is possible to search a text of length n using the NFA directly in O(mn) worst case time. Alternatively, a more effcient choice is to convert the NFA into a DFA. A DFA has only a single active state and allows to search the text at O(n) worst-case optimal. The problem with this approach is that the DFA may have O($2^m$) states. \section{Parallel Bitwise Data Streams} \label{Parallel Bitwise Data Streams} \section{Compiler Technology} \section{Experimental Results} \label{results} \label{Experimental Results} %\input{results.tex} \section{Conclusion and Future Work} \label{conclusion} \section{Conclusion} \label{Conclusion} %\input{conclusion.tex}