Changeset 4564 for docs


Timestamp:
May 15, 2015, 8:48:16 AM
Author:
nmedfort
Message:

Edits

Location:
docs/Working/icGrep
Files:
2 edited

  • docs/Working/icGrep/evaluation.tex

    r4561 r4564  
    205205runs each on our Wikimedia document collection.
    206206
    207 %
    208 
    209 % \begin{table}
    210 % \begin{center}
    211 % \begin{tabular}{|c|r|r|r|}  \hline
    212 % Regular & \multicolumn{3}{|c|}{CPU cycles per byte} \\ \cline{2-4}
    213 % Expression & icGrep{} & pcre2grep & ugrep \\ \hline
    214 % blah  & 1 & 1000 & 100 \\ \hline
    215 % \end{tabular}
    216 % \caption{Matching Times for Complex Expressions}\label{table:complexexpr}
    217 % \end{center}
    218 % \end{table}
    219 
    220 % \begin{table*}[htbp]
    221 % \begin{center}
    222 % \footnotesize
    223 % \begin{tabular}{|l||l|l|}
    224 % \hline
    225 % Processor & i7-2600 (3.4GHz) & i7-4700MQ (2.4GHz) \\ \hline
    226 % L1 Cache & 256KB & 256KB  \\ \hline       
    227 % L2 Cache & 1MB & 1MB  \\ \hline
    228 % L3 Cache & 8MB & 8MB \\ \hline
    229 % Bus & 1333Mhz & 1600Mhz \\ \hline
    230 % Memory & 8GB & 8GB \\ \hline
    231 % \end{tabular}
    232 % \caption{Platform Hardware Specs}
    233 % \label{hwinfo}
    234 % \end{center}
    235 % \vspace{-20pt}
    236 % \end{table*}
    237 
    238207\begin{table}[ht]\centering % requires booktabs
    239208\newcolumntype{T}{c}
    240209\small\vspace{-2em}
    241210\begin{tabular}{@{}p{3cm}r@{~--~}rp{4pt}r@{~--~}rp{4pt}r@{~--~}rp{4pt}r@{~--~}rp{4pt}@{}}
    242 &\multicolumn{6}{c}{\textbf{\icGrep{} (SSE2)}}\\
     211&\multicolumn{6}{c}{\textbf{\icGrep{}}}\\
    243212\cmidrule[1pt](lr){2-7}
    244213\cmidrule[1pt](lr){8-10}
     
    254223\bottomrule
    255224\end{tabular}
    256 \caption{Matching Times for Complex Expressions (Seconds Per GB)}\label{table:complexexpr}
     225\caption{Matching Times for Complex Expressions (s/GB)}\label{table:complexexpr}
    257226\vspace{-2em}
    258227\end{table}
     
    289258which helps reduce register pressure.   The AVX2 results are for \icGrep{}
    290259compiled to use the 256-bit AVX2 instructions, processing blocks of 256 bytes at a time.
    291 
    292 
    293 
    294 % \begin{table}[h]\centering % requires booktabs,siunitx
    295 % \small
    296 % \vspace{-2em}
    297 % \begin{tabular}{@{}p{3cm}l@{~}r@{~~}l@{~}r@{~~}l@{~}r@{~~}l@{~}r@{~~}l@{~}r@{~~}l@{~}r@{~~}@{}}
    298 % &\multicolumn{6}{c}{\textbf{SEQ}}&\multicolumn{6}{c}{\textbf{MT}}\\
    299 % \cmidrule[1pt](lr){2-7}
    300 % \cmidrule[1pt](lr){8-13}
    301 % \textbf{Expression}&\multicolumn{2}{c}{\textbf{SSE2}}&\multicolumn{2}{c}{\textbf{AVX1}}&\multicolumn{2}{c}{\textbf{AVX2}}&\multicolumn{2}{c}{\textbf{SSE2}}&\multicolumn{2}{c}{\textbf{AVX1}}&\multicolumn{2}{c}{\textbf{AVX2}}\\
    302 % \toprule
    303 % Alphanumeric \#1&1.28&(.06)&1.35&(.05)&1.64&(.16)&1.41&(.06)&1.44&(.06)&1.96&(.18)\\
    304 % Alphanumeric \#2&1.27&(.06)&1.32&(.05)&1.77&(.19)&1.39&(.07)&1.39&(.04)&2.18&(.22)\\
    305 % Arabic&1.21&(.07)&1.28&(.08)&1.43&(.16)&1.30&(.05)&1.30&(.05)&1.63&(.13)\\
    306 % Currency&1.01&(.05)&1.03&(.06)&1.06&(.12)&1.05&(.05)&1.06&(.05)&1.21&(.08)\\
    307 % Cyrillic&1.18&(.06)&1.25&(.05)&1.13&(.10)&1.26&(.04)&1.33&(.04)&1.22&(.10)\\
    308 % Email&1.32&(.04)&1.38&(.05)&1.86&(.21)&1.42&(.04)&1.46&(.05)&2.17&(.26)\\
    309 % \midrule
    310 % \textit{Geomean}&1.21&&1.26&&1.45&&1.30&&1.32&&1.68&\\
    311 % \bottomrule
    312 % \end{tabular}
    313 % \caption{Speedups of Complex Expressions for i7-2600 / i7-4700MQ $(\sigma)$}\label{table:relperf}
    314 % \vspace{-2em}
    315 % \end{table}
    316260
    317261\begin{table}[h]\centering % requires booktabs,siunitx
     
    335279\bottomrule
    336280\end{tabular}
    337 \caption{Speedups of Complex Expressions for i7-2600 / i7-4700MQ $(\sigma)$}\label{table:relperf}
     281\caption{Speedup (Base/Actual) of Complex Expressions on i7-4700MQ $(\sigma)$}\label{table:relperf}
    338282\vspace{-2em}
    339283\end{table}
     
    343287but some mixed results due to the limitations of 256-bit addition.   Combining
    344288the AVX2 ISA with multithreading gives an average overall 61\% speedup compared to base.
    345 
    346 % Interestingly, the SSE2 column of Table \ref{table:relperf} shows that by simply using a newer hardware and compiler
    347 % improves performance by $21\%$ and $30\%$ for the sequential and multithreaded versions of \icGrep{}.
    348 % %
    349 % By taking advantage of the improved AVX1 and AVX2 ISA there are further improvements but AVX2 exhibits
    350 % higher variation between datasets.
    351 % %
    352 % This appears to be a consequence of complex Kleene-* repetitions (i.e., those that cannot utilize the MatchStar operation)
    353 % both resulting in increased register pressure and worse branch misprediction because of the characteristics in the datasets
    354 % themselves.
    355 % %
    356 %
    357 
    358289
    359290