Changeset 1423 for docs


Ignore:
Timestamp:
Sep 1, 2011, 12:33:44 AM (8 years ago)
Author:
cameron
Message:

Further improvements

Location:
docs/EuroPar2011/Pres
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • docs/EuroPar2011/Pres/europar-slides.tex

    r1403 r1423  
    2929\end{frame}
    3030
     31\begin{frame}[fragile]
     32\frametitle{The 13th Dwarf: Finite State Machines}
     33
     34\begin{block}<+->{Berkeley Landscape of Parallel Computing Research}
     35\begin{itemize}
     36\item 13 dwarves for abstract classes of computing problems.
     37\item 13th dwarf: finite state machines, parsing.
     38\item Considered hardest dwarf to parallelize:
     39\begin{itemize}
     40\item ``embarassingly sequential'',
     41%\item
     42``Nothing helps!''
     43\end{itemize}
     44\end{itemize}
     45\end{block}
     46
     47\begin{block}<+->{64 Finite State Transitions per CPU Cycle}
     48\begin{itemize}
     49\item Main contribution: addition for parallel scanning.
     50\item Carry propagation between bit positions = FSM transition.
     51\item 64 carry propagations per addition on commodity processors.
     52\item Previously unexploited parallelism.
     53\end{itemize}
     54\end{block}
     55
     56\end{frame}
     57
    3158
    3259
     
    3865\section{Parallel Bit Stream Technology}
    3966
    40 \begin{frame}[fragile]
    41 \frametitle{The 13th Dwarf: Finite State Machines}
    42 
    43 \begin{block}<+->{Berkeley Landscape of Parallel Computing Research}
    44 \begin{itemize}
    45 \item 13 dwarves for abstract classes of computing problems.
    46 \item 13th dwarf: finite state machines, parsing.
    47 \item Considered hardest dwarf to parallelize:
    48 \begin{itemize}
    49 \item ``embarassingly sequential'',
    50 %\item
    51 ``Nothing helps!''
    52 \end{itemize}
    53 \end{itemize}
    54 \end{block}
    55 
    56 \begin{block}<+->{Finite State Transitions at 200 GHz?}
    57 \begin{itemize}
    58 \item Use addition for parallel scanning.
    59 \item Carry propagation between bit positions = FSM transition.
    60 \item Commodity processors @ 3.1 GHz.
    61 \item 64 carry propagations per addition.
    62 \item Effective speed for FSM transitions: $3.1 \times 64 > 200$ GHz.
    63 \end{itemize}
    64 \end{block}
    65 
    66 \end{frame}
    6767
    6868
     
    9191
    9292
    93 \begin{frame}
     93\begin{frame}[fragile]
    9494\frametitle{Goal: High Performance Text Processing}
    9595
    96 Why form parallel bit streams?
    97 
    98 \begin{itemize}
    99 \item Byte-at-a-time text processing is too slow.
     96
     97\begin{block}<+->{Byte-at-a-time text processing is too slow.}
    10098\begin{itemize}
    10199\item Example: XML scan for ``{\tt <}''.
    102100\item Byte-at-time loop computes only 1 info bit per iteration!
    103101\end{itemize}
    104 \item So let's compute those bits in parallel!
     102\end{block}
     103\begin{block}<+->{So let's compute those bits in parallel}
    105104\begin{itemize}
    106105\item Bitwise logic on basis streams $b_i \rightarrow${\tt [<]} stream.
    107106\item Process 128 positions at a time using SSE registers.
    108107\end{itemize}
     108\end{block}
     109\begin{block}<+->{Parabix 1 XML Parser}
     110\begin{itemize}
    109111\item Find next ``{\tt <}'' with {\em bit scan} instruction (e.g., Intel bsf).
    110 \begin{itemize}
    111112\item Advance up to 63 positions at once.
    112113\end{itemize}
    113 %\item Or use parallel scanning with bitstream addition!
    114 
    115 \end{itemize}
     114\end{block}
    116115\end{frame}
    117116
     
    373372\begin{itemize}
    374373\item All XML constructs can be fully parsed using these techniques.
    375 \item Comments, CDATA, processing instructions must be done first.
     374\item Parse comments, CDATA, processing instructions first.
    376375\item Form mask bitstream marking the interiors of these constructs.
    377376\item Remaining {\tt <} and {\tt \&} must be tag and reference starts.
     
    453452
    454453\begin{frame}[fragile]
    455 \frametitle{Performance: CPU Cycles per kB}
     454\frametitle{Execution Time: CPU Cycles per kB}
    456455\begin{figure}
    457456\begin{center}
Note: See TracChangeset for help on using the changeset viewer.