Changeset 3135
 Timestamp:
 May 13, 2013, 3:20:24 PM (6 years ago)
 Location:
 docs/Working/re
 Files:

 2 edited
Legend:
 Unmodified
 Added
 Removed

docs/Working/re/remain.tex
r3126 r3135 88 88 when a partially successful search path fails. 89 89 90 91 90 The remainder of this paper is organized as follows. 92 93 91 Section~\ref{Background} presents background material on classic 94 92 regular expression pattern matching techniques and provides insight into the 95 efficiency of traditional regular expression software tools. 96 97 Section~\ref{Parallel Bitwise Data Streams} describes out data parallel 93 efficiency of traditional regular expression software tools. Next, 94 Section~\ref{Data Parallel Bitstreams} describes our data parallel 98 95 regular expression matching techniques. 99 100 96 Section~\ref{Compiler Technology} 101 102 97 Section~\ref{Methodology} describes the evaluation framework and Section~\ref{Experimental Results} 103 98 presents a detailed performance analysis of our data parallel bitstream techniques against 104 99 Gnu grep, agrep, and nrgrep. 105 106 100 Section~\ref{Conclusion} concludes the paper. 107 101 … … 127 121 regular expression implementations that construct automata for pattern matching. 128 122 129 The traditional technique [16] to search a regular expression of length m in 130 a text of length n is to first convert the expression into a nondeterministic 131 automaton (NFA) with O(m) nodes. It is possible to search the text using the 132 NFA directly in O(mn) worst case time. The cost comes from the fact that 133 more than one state of the NFA may be active at each step, and therefore all 134 may need to be updated. A more effcient choice is to convert the NFA into a 135 DFA. A DFA has only a single active state and allows to search the text at 136 O(n) worstcase optimal. The problem with this approach is that the DFA 137 may have O($2^m$) states. 138 139 In general, the general process is first to build a 140 NFA from the regular expression and simulate the NFA on text input, 141 or alternatively to convert the NFA into a 142 DFA, optionally minimize the number of states in the DFA, 143 and finally simulate the DFA on text input. 123 Following Thompson's approach, a regular expression of length m is first converted 124 to an NFA with O(m) nodes. It is possible to search a text of length n using the 125 NFA directly in O(mn) worst case time. Alternatively, a more effcient choice 126 is to convert the NFA into a DFA. A DFA has only a single active state and 127 allows to search the text at O(n) worstcase optimal. It is well known that the 128 conversion of the NFA to the DFA may result in the state explosion problem. 129 That is the resultant DFA may have O($2^m$) states. 144 130 145 131 \section{Parallel Bitwise Data Streams} 
docs/Working/re/remain.tex.backup
r3126 r3135 2 2 \usepackage[utf8]{inputenc} 3 3 4 \def \Bitstream{Bit Stream}5 \def \bitstream{bit stream}6 7 4 %opening 8 \title{Fast Regular Expression Matching using Parallel \Bitstream{}s}5 \title{Fast Regular Expression Matching using Parallel Bitstreams} 9 6 \author{ 10 7 {Robert D. Cameron} \\ … … 29 26 30 27 The method is based on the concept of parallel 31 \bitstream{}technology, in which parallel streams of bits are formed such28 bitstream technology, in which parallel streams of bits are formed such 32 29 that each stream comprises bits in onetoone correspondence with the 33 30 character code units of a source data stream. … … 91 88 when a partially successful search path fails. 92 89 93 94 90 The remainder of this paper is organized as follows. 95 96 91 Section~\ref{Background} presents background material on classic 97 92 regular expression pattern matching techniques and provides insight into the 98 efficiency of traditional regular expression software tools. 99 100 Section~\ref{Bitwise Parallel Data Streams} describes out data parallel 93 efficiency of traditional regular expression software tools. Next, 94 Section~\ref{Data Parallel Bitstreams} describes our data parallel 101 95 regular expression matching techniques. 102 103 96 Section~\ref{Compiler Technology} 104 105 97 Section~\ref{Methodology} describes the evaluation framework and Section~\ref{Experimental Results} 106 presents a detailed performance analysis of our data parallel \bitstream{}techniques against98 presents a detailed performance analysis of our data parallel bitstream techniques against 107 99 Gnu grep, agrep, and nrgrep. 108 109 Section~\ref{conclusion} concludes the paper. 100 Section~\ref{Conclusion} concludes the paper. 110 101 111 102 \section{Background} … … 130 121 regular expression implementations that construct automata for pattern matching. 131 122 132 The traditional technique [16] to search a regular expression of length m in 133 a text of length n is to first convert the expression into a nondeterministic 134 automaton (NFA) with O(m) nodes. It is possible to search the text using the 135 NFA directly in O(mn) worst case time. The cost comes from the fact that 136 more than one state of the NFA may be active at each step, and therefore all 137 may need to be updated. A more effcient choice is to convert the NFA into a 138 DFA. A DFA has only a single active state and allows to search the text at 139 O(n) worstcase optimal. The problem with this approach is that the DFA 140 may have O(2^m) states. 141 142 In general, the general process is first to build a 143 NFA from the regular expression and simulate the NFA on text input, 144 or alternatively to convert the NFA into a 145 DFA, optionally minimize the number of states in the DFA, 146 and finally simulate the DFA on text input. 147 123 Following Thompson's approach, a regular expression of length m is first converted 124 to an NFA with O(m) nodes. It is possible to search a text of length n using the 125 NFA directly in O(mn) worst case time. Alternatively, a more effcient choice 126 is to convert the NFA into a DFA. A DFA has only a single active state and 127 allows to search the text at O(n) worstcase optimal. The problem with this 128 approach is that the DFA may have O($2^m$) states. 148 129 149 130 150 131 \section{Parallel Bitwise Data Streams} 151 132 \label{Parallel Bitwise Data Streams} 152 153 133 154 134 \section{Compiler Technology} … … 177 157 178 158 \section{Experimental Results} 179 \label{ results}159 \label{Experimental Results} 180 160 %\input{results.tex} 181 161 182 \section{Conclusion and Future Work}183 \label{ conclusion}162 \section{Conclusion} 163 \label{Conclusion} 184 164 %\input{conclusion.tex} 185 165
Note: See TracChangeset
for help on using the changeset viewer.