Timestamp:
Feb 24, 2014, 2:24:02 AM
Author:
cameron
Message:

Substitute gre2p for grep; cite GPU long-stream add; remove excess figures

File:
1 edited

  • docs/Working/re/avx2.tex

--- docs/Working/re/avx2.tex	(r3637)
+++ docs/Working/re/avx2.tex	(r3642)
@@ -75,5 +75,5 @@
 file {data/avxinstructions3.dat};
 
-\legend{Bitstreams,NRGrep,Grep,Annot}
+\legend{Bitstreams,NRGrep,Gre2p,Annot}
 \end{axis}
 \end{tikzpicture}
@@ -126,5 +126,5 @@
 file {data/avxcycles3.dat};
 
-\legend{Bitstreams,NRGrep,Grep,Annot}
+\legend{Bitstreams,NRGrep,Gre2p,Annot}
 \end{axis}
 \end{tikzpicture}
@@ -136,43 +136,10 @@
 instruction count was reflected in a significant speed-up
 in the bitstreams implementation.  However, the speed-up was
-considerably less than expected.  As shown in \ref{fig:AVXIPC}
-the AVX2 version has lost some of the superscalar efficiency
-of the SSE2 code.   This is a performance debugging issue
-that we have yet to resolve.
-
-
-\begin{figure}
-\begin{center}
-\begin{tikzpicture}
-\begin{axis}[
-xtick=data,
-ylabel=Change in Instructions per Cycle,
-xticklabels={@,Date,Email,URIorEmail,HexBytes},
-tick label style={font=\tiny},
-enlarge x limits=0.15,
-%enlarge y limits={0.15, upper},
-ymin=0,
-legend style={at={(0.5,-0.15)},
-anchor=north,legend columns=-1},
-ybar,
-bar width=7pt,
-]
-\addplot
-file {data/avxipc1.dat};
-\addplot
-file {data/avxipc2.dat};
-\addplot
-file {data/avxipc3.dat};
-
-
-
-\legend{Bitstreams,NRGrep,Grep,Annot}
-\end{axis}
-\end{tikzpicture}
-\end{center}
-\caption{Change in Instructions Per Cycle With AVX2}\label{fig:AVXIPC}
-\end{figure}
-
-Overall, the results on our AVX2 machine were quite good,
+considerably less than expected. 
+The bitstreams code  on AVX2 has suffered from a considerable
+reduction in instructions per cycle compared to the SSE2
+implementation, possibly indicating
+that our grep implementation has become memory-bound.
+Nevertheless, the overall results on our AVX2 machine were quite encouraging,
 demonstrating very good scalability of the bitwise data-parallel approach.
 