Timestamp:
Feb 10, 2015, 12:17:27 PM (4 years ago)
Author:
cameron
Message:

Initial bitwise example, table placeholder

File:
1 edited

  • docs/Working/icGrep/evaluation.tex

r4481 r4488
    9     9  at the impact of optimizations and multithreading.
   10    10
   11        \subsection{ICgrep vs. Contemporary Competitors}
         11  \subsection{Simple Property Expressions}
   12    12
   13    13  A key feature of Unicode level 1 support in regular expression engines

   21    21  regular expression formed with a one of the property expressions and a
   22    22  positive lookbehind assertion on the other, while set difference uses
   23        a negative lookbehind assertion.  As all three programs support lookbehind
   24        assertions in this way, we systematically generated set intersection and
   25        difference in this way.
         23  a negative lookbehind assertion.
   26    24
   27    25  We generated a set of regular expressions involving all Unicode values of
   28        the Unicode general
   29        category property ({\tt gc}) and all values of the Unicode script property ({\tt sc}).  We then generated
         26  the Unicode general category property ({\tt gc}) and all values of the Unicode
         27  script property ({\tt sc}).
         28  We then generated
   30    29  expressions involving random pairs of {\tt gc} and {\tt sc}
   31    30  values combined with a random set operator chosen from union, intersection and difference.
   32        All property values are represented at least once.   A small number of
         31  All property values are represented at least once.
         32  A small number of
   33    33  expressions were removed because they involved properties not supported by pcre2grep.
   34    34  In the end 246 test expressions were constructed in this process.

   85    85  \end{figure}
   86    86
         87  \subsection{Complex Expressions}
         88
         89  We also comparative performance of the matching engines on a series
         90  of more complex expressions as shown in Table \ref{table:complexexpr}.
         91
         92  \begin{table}
         93  \begin{center}
         94  \begin{tabular}{|c|r|r|r|}  \hline
         95  Regular & \multicolumn{3}{|c|}{CPU cycles per byte} \\ \cline{2-4}
         96  Expression & icGrep{} & pcre2grep & ugrep \\ \hline
         97  blah    & 1 & 1000 & 100 \\ \hline
         98  \end{tabular}
         99  \caption{Matching Times for Complex Expressions}\label{table:complexexpr}
        100  \end{center}
        101  \end{table}
   87   102
   88   103  \subsection{Optimizations of Bitwise Methods}

  113   128  \verb:(^|[ ])[a-zA-Z]{11,33}([.!? ]|$):, for example.
  114   129
  115        To assess the effectiveness of inserting if-statements, the
        130  To control the insertion of if-statements into dynamically
        131  generated code, the
  116   132  number of non-nullable pattern elements between the if-tests
  117   133  can be set with the {\tt -if-insertion-gap=} option.   The

  142   158  of the Unicode blocks represented in the input document.   For the classes
  143   159  covering the largest numbers of codepoints, we observed slowdowns of up to 5X.
  144
  145   160
  146   161