Changeset 4503 for docs


Ignore:
Timestamp:
Feb 11, 2015, 6:40:31 PM (4 years ago)
Author:
cameron
Message:

Evaluation

Location:
docs/Working/icGrep
Files:
3 edited

Legend:

Unmodified
Added
Removed
  • docs/Working/icGrep/bitgrep.bib

    r4490 r4503  
    8585  title={Parabix: Boosting the efficiency of text processing on commodity processors},
    8686  author={Lin, Dan and Medforth, Nigel and Herdy, Kenneth S and Shriraman, Arrvindh and Cameron, Rob},
    87   booktitle={High Performance Computer Architecture (HPCA), 2012 IEEE 18th International Symposium on},
     87  booktitle={HPCA}, 
    8888  pages={1--12},
    8989  year={2012},
     
    130130 author = {Cameron, Robert D. and Shermer, Thomas C. and Shriraman, Arrvindh and Herdy, Kenneth S. and Lin, Dan and Hull, Benjamin R. and Lin, Meng},
    131131 title = {Bitwise Data Parallelism in Regular Expression Matching},
    132  booktitle = {Proceedings of the 23rd International Conference on Parallel Architectures and Compilation (PACT)},
     132 booktitle = {PACT},
    133133 series = {PACT '14},
    134134 year = {2014},
     
    147147  title={Architectural support for {SWAR} text processing with parallel bit streams: the inductive doubling principle},
    148148  author={Cameron, Robert D and Lin, Dan},
    149   booktitle={ACM Sigplan Notices},
     149  booktitle={ASPLOS},
    150150  volume={44},
    151151  pages={337--348},
     
    292292  author={Asanovic, Krste and Bodik, Ras and Catanzaro, Bryan Christopher and Gebis, Joseph James and Husbands, Parry and Keutzer, Kurt and Patterson, David A and Plishker, William Lester and Shalf, John and Williams, Samuel Webb and others},
    293293  year={2006},
    294   institution={Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley}
     294  number ={UCB/EECS-2006-183},
     295  institution={EECS Department, University of California, Berkeley}
    295296}
    296297
     
    298299  title={Data-parallel finite-state machines},
    299300  author={Mytkowicz, Todd and Musuvathi, Madanlal and Schulte, Wolfram},
    300   booktitle={Proceedings of the 19th international conference on Architectural support for programming languages and operating systems},
     301  booktitle={ASPLOS},
    301302  pages={529--542},
    302303  year={2014},
     
    316317  title={Accelerating business analytics applications},
    317318  author={Salapura, Valentina and Karkhanis, Tejas and Nagpurkar, Priya and Moreira, Jose},
    318   booktitle={High Performance Computer Architecture (HPCA), 2012 IEEE 18th International Symposium on},
     319  booktitle={HPCA},
    319320  pages={1--10},
    320321  year={2012},
     
    324325  title={GPU-based NFA implementation for memory efficient high speed regular expression matching},
    325326  author={Zu, Yuan and Yang, Ming and Xu, Zhonghu and Wang, Lin and Tian, Xin and Peng, Kunyang and Dong, Qunfeng},
    326   booktitle={PPoPP '12 - Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming},
     327  booktitle={PPoPP},
    327328  pages={129--140},
    328329  year={2012},
  • docs/Working/icGrep/evaluation.tex

    r4502 r4503  
    22
    33In this section, we report on the evaluation of ICgrep performance, looking
    4 at three aspects.   First we consider a performance studies in a series
    5 of Unicode regular expression search problems in comparison to the
    6 contemporary competitors, including pcre2grep released in January 2015
    7 and ugrep of the ICU 54.1 software distribution.  Then we move on to
    8 investigate some performance aspects of ICgrep internal methods, looking
    9 at the impact of optimizations and multithreading.
     4at three aspects.   First we examine some performance aspects of
     5ICgrep internal methods, looking at the impact of optimizations discussed previously.
     6Then we move on to a systematic performance study of icGrep{} search
     7performance with named Unicode property searches in comparison to two
     8contemporary competitors, namely, pcre2grep released in January 2015
     9and ugrep of the ICU 54.1 software distribution.  Finally, we look
     10at some more complex expressions and also look at the impact
     11of multithreading icGrep{}.
     12
     13\subsection{Optimizations of Bitwise Methods}
     14
     15In order to support evaluation of bitwise methods, as well as to support
     16the teaching of those methods and ongoing research, \icGrep{} has an array
     17of command-line options.   This makes it relatively straightforward
     18to report on certain performance aspects of ICgrep, while others require
     19special builds. 
     20
     21For example, the command-line switch {\tt -disable-matchstar} can be used
     22to eliminate the use of the MatchStar operation for handling
     23Kleene-* repetition of character classes.   In this case, \icGrep{} substitutes
     24a while loop that iteratively extends match results.   
     25Surprisingly, this
     26does not change performance much in many practical cases.   
     27In each block,
     28the maximum iteration count is the maximum length run encountered; the
     29overall performance is based on the average of these maximums throughout the
     30file.   But when search for XML tags using the regular expression
     31\verb:<[^!?][^>]*>:, a slowdown of more than 2X may be found in files
     32with many long tags. 
     33
     34%The {\tt -disable-log2-bounded-repetition} flag allows these
     35%effectiveness of the special techniques for bounded repetition of
     36%byte classes to be assessed.   A slowdown of 30\% was observed
     37%with the searches using the regular expression
     38%\verb:(^|[ ])[a-zA-Z]{11,33}([.!? ]|$):, for example.
     39
     40To control the insertion of if-statements into dynamically
     41generated code, the
     42number of non-nullable pattern elements between the if-tests
     43can be set with the {\tt -if-insertion-gap=} option.   The
     44default value in \icGrep{} is 3, setting the gap to 100 effectively
     45turns of if-insertion.   Eliminating if-insertion sometimes improves
     46performance by avoiding the extra if tests and branch mispredications.
     47For patterns with long strings, however, there can be a substantial
     48slowdown; searching for a pattern of length 40 slows down by more
     49than 50\% without the if-statement short-circuiting.
     50
     51ICgrep also provides options that allow
     52various internal representations to be printed out.   These
     53can aid in understanding and/or debugging performance issues.
     54For example, the option
     55{\tt -print-REs} show the parsed regular expression as it goes
     56through various transformations.   The internal \Pablo{} code generated
     57may be displayed with {\tt -print-\Pablo{}}.  This can be quite useful in
     58helping understand the match process.   It also possible to print out the
     59generated LLVM IR code ({\tt -dump-generated-IR}), but this may be
     60less useful as it includes many
     61details of low-level carry-handling that obscures the core logic.
     62
     63The precompiled calculations of the various Unicode properties
     64are each placed in if-hierarchies as described previously.   To assess the
     65impact of this strategy, we built a version of icGrep without such
     66if-hierarchies.  In this case, when a Unicode property class is defined,
     67bitwise logic equations are applied for all members of the class independent
     68of the Unicode blocks represented in the input document.   For the classes
     69covering the largest numbers of codepoints, we observed slowdowns of up to 5X.
    1070
    1171\subsection{Simple Property Expressions}
     
    3393expressions were removed because they involved properties not supported by pcre2grep.
    3494In the end 246 test expressions were constructed in this process.
     95
     96\begin{figure}
     97\begin{center}
     98\pgfplotstableread[col sep = comma]{data/icgrep-scatter.csv}\icgrep
     99\pgfplotstableread[col sep = comma]{data/ugrep541-scatter.csv}\ugrep
     100\pgfplotstableread[col sep = comma]{data/pcre2grep-scatter.csv}\pcre
     101
     102\begin{tikzpicture}
     103\begin{axis}[
     104grid=both,
     105x tick label style={ /pgf/number format/1000 sep=},
     106ylabel={CPU Cycles Per Byte},
     107xlabel={Percentage of Matching Lines},
     108minor y tick num={1},
     109xmax=100,
     110height=0.5\textwidth,
     111legend style={at={(1.05,0.5)},
     112anchor=west,legend columns=1,
     113align=left,draw=none,column sep=2ex}
     114]
     115\addplot+[no markers,line width=2pt,color=blue!60,solid] table {\icgrep};
     116\addplot+[no markers,line width=2pt,color=red!60,solid] table {\ugrep};
     117\addplot+[no markers,line width=2pt,color=brown!60,solid] table {\pcre};
     118\legend{icGrep,ugrep541,pcre2grep}
     119\end{axis}
     120
     121
     122\end{tikzpicture}
     123\end{center}
     124\caption{Matching Performance for Simple Property Expressions}\label{fig:property_test}
     125\end{figure}
    35126
    36127We selected a set of Wikimedia XML files in several major languages representing
     
    54145in all cases.
    55146
    56 \begin{figure}
    57 \begin{center}
    58 \pgfplotstableread[col sep = comma]{data/icgrep-scatter.csv}\icgrep
    59 \pgfplotstableread[col sep = comma]{data/ugrep541-scatter.csv}\ugrep
    60 \pgfplotstableread[col sep = comma]{data/pcre2grep-scatter.csv}\pcre
    61 
    62 \begin{tikzpicture}
    63 \begin{axis}[
    64 grid=both,
    65 x tick label style={ /pgf/number format/1000 sep=},
    66 ylabel={CPU Cycles Per Byte},
    67 xlabel={Percentage of Matching Lines},
    68 minor y tick num={1},
    69 xmax=100,
    70 height=0.5\textwidth,
    71 legend style={at={(1.05,0.5)},
    72 anchor=west,legend columns=1,
    73 align=left,draw=none,column sep=2ex}
    74 ]
    75 \addplot+[no markers,line width=2pt,color=blue!60,solid] table {\icgrep};
    76 \addplot+[no markers,line width=2pt,color=red!60,solid] table {\ugrep};
    77 \addplot+[no markers,line width=2pt,color=brown!60,solid] table {\pcre};
    78 \legend{icGrep,ugrep541,pcre2grep}
    79 \end{axis}
    80 
    81 
    82 \end{tikzpicture}
    83 \end{center}
    84 \caption{Matching Performance for Simple Property Expressions}\label{fig:property_test}
    85 \end{figure}
    86147
    87148\subsection{Complex Expressions}
    88149
     150\begin{table}\centering % requires booktabs
     151\small
     152\begin{tabular}{@{}p{2.7cm}p{10.8cm}@{}}
     153\textbf{Name}&\textbf{Regular Expression}\\
     154\toprule
     155Alphanumeric \#1&\verb`^[\p{L}\p{N}]*((\p{L}\p{N})|(\p{N}\p{L}))[\p{L}\p{N}]*$`\\
     156\midrule
     157Alphanumeric \#2&\verb`[\p{L}\p{N}]*((\p{L}\p{N})|(\p{N}\p{L}))[\p{L}\p{N}]*`\\
     158\midrule
     159Arabic&\verb`^[\p{Arabic}\p{Common}]*\p{Arabic}[\p{Arabic}\p{Common}]*$`\\
     160\midrule
     161Currency&\verb`(\p{Sc}\s*(\d*|(\d{1,3}([,.]\d{3})*))([,.]\d{2}?)?)|`\newline\verb`((\d*|(\d{1,3}([,.]\d{3})*))([,.]\d{2}?)?\s*\p{Sc})`\\
     162\midrule
     163Cyrillic&\verb`[\p{Pi}\p{Po}]\p{Cyrillic}{6,}[\p{Pf}\p{Pe}]`\\
     164\midrule
     165Email &\verb`([^\p{Z}<]+@[\p{L}\p{M}\p{N}.-]+\.(\p{L}\p{M}*){2,6})(>|\p{Z}|$)`\\
     166\bottomrule
     167\end{tabular}
     168\caption{Regular Expressions}\label{table:regularexpr}
     169\end{table}
     170
     171
    89172We also comparative performance of the matching engines on a series
    90 of more complex expressions as shown in Table \ref{table:complexexpr}.
     173of more complex expressions as shown in Table \ref{table:regularexpr}.
     174The first two are alphanumeric expressions, differing only in the first
     175one is anchored to match the entire line.  The third
     176searches for lines consisting of text in Arabic script.
     177The fourth expression is a published currency expression taken from
     178Stewart and Uckelman \cite{stewart2013unicode}.
     179An expression matching runs of 6 or more Cyrillic script characters enclosed
     180in initial/opening and final/ending punctuation is fifth in the list.
     181The final expression is an email expression that allows internationalized
     182names.
     183
    91184
    92185% \begin{table}
     
    101194% \end{table}
    102195
    103 \begin{samepage}
    104 \begin{table}\centering % requires booktabs
    105 \small
    106 \begin{tabular}{@{}p{2.7cm}p{10.8cm}@{}}
    107 \textbf{Name}&\textbf{Regular Expression}\\
    108 \toprule
    109 alphanumeric \#1&\verb`^[\p{L}\p{N}]*((\p{L}\p{N})|(\p{N}\p{L}))[\p{L}\p{N}]*$`\\
    110 \midrule
    111 alphanumeric \#2&\verb`[\p{L}\p{N}]*((\p{L}\p{N})|(\p{N}\p{L}))[\p{L}\p{N}]*`\\
    112 \midrule
    113 arabic&\verb`^[\p{Arabic}\p{Common}]*\p{Arabic}[\p{Arabic}\p{Common}]*$`\\
    114 \midrule
    115 currency&\verb`(\p{Sc}\s*(\d*|(\d{1,3}([,.]\d{3})*))([,.]\d{2}?)?)|`\newline\verb`((\d*|(\d{1,3}([,.]\d{3})*))([,.]\d{2}?)?\s*\p{Sc})`\\
    116 \midrule
    117 email &\verb`([^\p{Z}<]+@[\p{L}\p{M}\p{N}.-]+\.(\p{L}\p{M}*){2,6})(>|\p{Z}|$)`\\
    118 \bottomrule
    119 \end{tabular}
    120 \caption{Regular Expressions}\label{table:regularexpr}
    121 \end{table}
    122196
    123197\begin{table}\centering % requires booktabs
     
    142216\end{table}
    143217
    144 \end{samepage}
    145 
    146 
    147 
    148 
    149 
    150 
    151 
    152 
    153 
    154 
    155 
    156 
    157 
    158 \subsection{Optimizations of Bitwise Methods}
    159 
    160 In order to support evaluation of bitwise methods, as well as to support
    161 the teaching of those methods and ongoing research, \icGrep{} has an array
    162 of command-line options.   This makes it relatively straightforward
    163 to report on certain performance aspects of ICgrep, while others require
    164 special builds. 
    165 
    166 For example, the command-line switch {\tt -disable-matchstar} can be used
    167 to eliminate the use of the MatchStar operation for handling
    168 Kleene-* repetition of character classes.   In this case, \icGrep{} substitutes
    169 a while loop that iteratively extends match results.   
    170 Surprisingly, this
    171 does not change performance much in many practical cases.   
    172 In each block,
    173 the maximum iteration count is the maximum length run encountered; the
    174 overall performance is based on the average of these maximums throughout the
    175 file.   But when search for XML tags using the regular expression
    176 \verb:<[^!?][^>]*>:, a slowdown of more than 2X may be found in files
    177 with many long tags. 
    178 
    179 The {\tt -disable-log2-bounded-repetition} flag allows these
    180 effectiveness of the special techniques for bounded repetition of
    181 byte classes to be assessed.   A slowdown of 30\% was observed
    182 with the searches using the regular expression
    183 \verb:(^|[ ])[a-zA-Z]{11,33}([.!? ]|$):, for example.
    184 
    185 To control the insertion of if-statements into dynamically
    186 generated code, the
    187 number of non-nullable pattern elements between the if-tests
    188 can be set with the {\tt -if-insertion-gap=} option.   The
    189 default value in \icGrep{} is 3, setting the gap to 100 effectively
    190 turns of if-insertion.   Eliminating if-insertion sometimes improves
    191 performance by avoiding the extra if tests and branch mispredications.
    192 For patterns with long strings, however, there can be a substantial
    193 slowdown; searching for a pattern of length 40 slows down by more
    194 than 50\% without the if-statement short-circuiting.
    195 
    196 ICgrep also provides options that allow
    197 various internal representations to be printed out.   These
    198 can aid in understanding and/or debugging performance issues.
    199 For example, the option
    200 {\tt -print-REs} show the parsed regular expression as it goes
    201 through various transformations.   The internal \Pablo{} code generated
    202 may be displayed with {\tt -print-\Pablo{}}.  This can be quite useful in
    203 helping understand the match process.   It also possible to print out the
    204 generated LLVM IR code ({\tt -dump-generated-IR}), but this may be
    205 less useful as it includes many
    206 details of low-level carry-handling that obscures the core logic.
    207 
    208 The precompiled calculations of the various Unicode properties
    209 are each placed in if-hierarchies as described previously.   To assess the
    210 impact of this strategy, we built a version of icGrep without such
    211 if-hierarchies.  In this case, when a Unicode property class is defined,
    212 bitwise logic equations are applied for all members of the class independent
    213 of the Unicode blocks represented in the input document.   For the classes
    214 covering the largest numbers of codepoints, we observed slowdowns of up to 5X.
     218The performance results are shown in Table \ref{table:complexexpr}.
     219In each case, we report seconds taken per GB of input averaged over 10
     220runs each on our Wikimedia document collection.
     221The most striking aspect of the results is that both ugrep and pcregrep
     222show dramatic slowdowns with ambiguities in regular expressions.
     223This is most clearly illustrated in the different performance figures
     224for the two Alphanumeric test expressions, but is also evident in the
     225Arabic, Currency and Email expressions.   By way of contrast, icGrep{}
     226maintains consistent fast performance in all test scenarios. 
     227
     228The multithreaded \icGrep{} shows speedup in every case, but balancing
     229of the workload across multiple cores is clearly an area for further work. 
     230Nevertheless, our three thread system shows a speedup of over
     231the single threaded version by up to 40\%.
     232
     233
     234
     235
     236
     237
     238
     239
     240
     241
     242
     243
     244
    215245
    216246
Note: See TracChangeset for help on using the changeset viewer.