Changeset 252 for docs/ASPLOS09


Timestamp:
Dec 26, 2008, 11:50:05 AM
Author:
cameron
Message:

Adjustments

Location:
docs/ASPLOS09
Files:
2 edited

  • docs/ASPLOS09/asplos094-cameron.tex

    r251 r252
     873  873   transposition.
     874  874
     875      - The existence of high-performance algorithms for transformation of
     876      - character data between byte stream and parallel bit stream form
     877      - in both directions makes it possible to consider applying these
     878      - transformations multiple times during text processing applications.
     879      - Just as the time domain and frequency domain each have their
     880      - use in signal processing applications, the byte stream form and
     881      - parallel bit stream form can then each be used at will in
     882      - character stream applications.
     883      -
     884      -
     885      -
     886      - \section{Parallel Bit Deletion}
     887      -
     888      - \begin{figure*}[tbh]
          875 + \begin{figure*}[t]
     889  876   \begin{center}
     890  877   \begin{tabular}{|c||c|c|c|c|c|c|c|c|}

     904  891   \end{figure*}
     905  892
          893 + The existence of high-performance algorithms for transformation of
          894 + character data between byte stream and parallel bit stream form
          895 + in both directions makes it possible to consider applying these
          896 + transformations multiple times during text processing applications.
          897 + Just as the time domain and frequency domain each have their
          898 + use in signal processing applications, the byte stream form and
          899 + parallel bit stream form can then each be used at will in
          900 + character stream applications.
          901 +
          902 +
          903 +
          904 + \section{Parallel Bit Deletion}
          905 +
          906 +
     906  907   Parallel bit deletion is an important operation that allows string
     907  908   editing operations to be carried out while in parallel bit stream

    1001 1002   \subsection{Parity}
    1002 1003
    1003      - \begin{figure}
         1004 + \begin{figure}[h]
    1004 1005   \begin{center}\small
    1005 1006   \begin{verbatim}

    1016 1017   \end{figure}
    1017 1018
    1018      - \begin{figure}
         1019 + \begin{figure}[h]
    1019 1020   \begin{center}\small
    1020 1021   \begin{verbatim}

    1073 1074
    1074 1075   \subsection{String/Decimal/Integer Conversion}
    1075      - \begin{figure}
         1076 + \begin{figure}[h]
    1076 1077   \begin{center}\small
    1077 1078   \begin{verbatim}

    1085 1086   \end{figure}
    1086 1087
    1087      - \begin{figure}
         1088 + \begin{figure}[h]
    1088 1089   \begin{center}\small
    1089 1090   \begin{verbatim}

    1120 1121   higher one by 10000 and adding.  Overall, 20
    1121 1122   operations are required for this implementation
    1122      - as well as the corresponding SWAR implementation
    1123      - for sets of 32-bit fields.  Preloading of 6 constants
    1124      - into registers for repeated use can reduce the number of
    1125      - operations to 14 at the cost of significant register
         1123 + as well as the corresponding RefA implementation
         1124 + for sets of 32-bit fields.  Under the RefB model, preloading of
         1125 + 6 constants into registers for repeated use can reduce the
         1126 + number of operations to 14 at the cost of register
    1126 1127   pressure.
    1127 1128

    1131 1132   half-operand modifiers, with only one operand
    1132 1133   of each of the addition and multiplication operations
    1133      - modified at each level.  Overall, this implementation
    1134      - requires 9 operations, or 6 operations with 3
    1135      - preloaded constants.  This represents more than a 2X
         1134 + modified at each level.  Overall, the IDISA-A implementation
         1135 + requires 9 operations, while the IDISA-B model requires
         1136 + 6 operations with 3 preloaded registers.
         1137 + In either case, this represents more than a 2X
    1136 1138   reduction in instruction count as well as a 2X reduction
    1137 1139   in register pressure.
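
As a rough illustration of the pairwise combining that the String/Decimal/Integer Conversion text in this changeset describes (multiply the higher-order field by a power of ten and add the lower-order one, ending with a multiplication by 10000), the following C sketch converts eight decimal digits in three SWAR-style steps on a single 64-bit word. It is a scalar analogue written under stated assumptions, not the paper's SIMD or IDISA code: the function name `parse8_digits`, the byte packing order, and the masks are all hypothetical, and its operation count is not meant to match the 20, 14, 9, or 6 operation figures quoted in the text.

```c
#include <stdint.h>

/* Hypothetical helper (not from the paper): convert exactly 8 ASCII
   decimal digits to an integer. Assumes every s[i] is in '0'..'9';
   no validation is performed. */
uint32_t parse8_digits(const char *s)
{
    uint64_t v = 0;

    /* Pack digit values into 8-bit fields, with the first (most
       significant) digit in the lowest byte. Explicit shifts keep
       this independent of host byte order. */
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)(unsigned char)(s[i] - '0') << (8 * i);

    /* Step 1: 8-bit fields -> 16-bit fields, each a 2-digit value.
       hi holds the higher-order digit of each pair, lo the lower. */
    uint64_t hi = v & 0x00FF00FF00FF00FFULL;
    uint64_t lo = (v >> 8) & 0x00FF00FF00FF00FFULL;
    v = hi * 10 + lo;            /* each field <= 99, no carry out */

    /* Step 2: 16-bit fields -> 32-bit fields, each a 4-digit value. */
    hi = v & 0x0000FFFF0000FFFFULL;
    lo = (v >> 16) & 0x0000FFFF0000FFFFULL;
    v = hi * 100 + lo;           /* each field <= 9999 */

    /* Step 3: combine the two 4-digit values, multiplying the
       higher-order one by 10000 and adding the lower-order one. */
    hi = v & 0xFFFFFFFFULL;
    lo = v >> 32;
    return (uint32_t)(hi * 10000 + lo);
}
```

For the input "00001234", the three steps would produce the 2-digit fields 00, 00, 12, 34, then the 4-digit fields 0000 and 1234, and finally the value 1234. Each multiplication acts on all fields of the word at once, which is the property the SWAR and IDISA variants exploit at wider register sizes.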