# Changeset 231

Timestamp:
Dec 10, 2008, 9:19:54 PM (11 years ago)
Message:

Modified section on horizontal operations

File:
1 edited

left instruction.  Right justification by shifting an $n$ bit field
$i$ positions to the right is equivalent to a left rotate of $n-i$
positions.  These rotation amounts are computed by the operation
\verb#rj=simd<8>::sub(simd<8>::const(8), cts_4)#
as shown in row 5, except that don't care fields (which won't be subsequently used)

The left shift amounts are calculated by \verb#lj=simd<8>::srli<4>(cts_4)#
as shown in row 6, and are combined with the right shift amounts
by the selection operation \verb#rot_8=simd_if(simd<16>::const(0xFF00), rj, lj)#
as shown in row 7.  Using these computed values, the inductive step
is completed by application of the operation \verb#rslt_16=simd<8>::rotl(rslt_8, rot_8)#
as shown in row 8.

{\em vertical} operations which combine corresponding fields of different
registers.  Horizontal operations
can be found that combine two (e.g., \verb:haddpd: on SSE3),
four (e.g., \verb:si_orx: on SPU), eight (e.g., \verb:psadbw: on SSE)
or sixteen values (e.g., \verb:vcmpequb: on Altivec).  Some
horizontal operations have a vertical component as well.
For example, \verb:psadbw: first forms the absolute value of
the difference of eight corresponding byte fields before
performing horizontal add of the eight values, while
\verb:vsum4ubs: on Altivec performs horizontal add of sets of
four unsigned 8-bit fields within one register and then
combines the result horizontally with

operations in general.
In contrast to this {\em ad hoc} support on commodity processors,
IDISA offers a completely systematic treatment of horizontal operations
without any special features beyond the inductive doubling features
already described.
By making use of \verb:: half-operand modifier combinations, the inductive
doubling architecture offers systematic support for horizontal operations
on pairs of adjacent fields.
In the simplest case, any vertical operation \verb#simd::F# on $n$-bit fields
gives rise to an immediate horizontal operation \verb#simd::F(r, r)#
for combining adjacent pairs of $n/2$ bit fields.
For example, \verb#simd<16>::add# adds values in adjacent 8 bit fields
to produce 16 bit results, while \verb#simd<32>::min# can produce the
minimum value of adjacent 16-bit fields.
In general, \verb#simd::F# denotes the horizontal binary combination of
adjacent fields for any operator $F$ and field width $n$.
Thus any binary horizontal operation can be implemented in a single
IDISA instruction making use of the \verb:: operand modifier combination.

Horizontal combinations of larger numbers of fields make use of the
inductive doubling property.
Horizontal combinations of four adjacent fields can also be
realized in a general way through two steps of inductive doubling.
For example, consider the or-across operation \verb:si_orx:
of the SPU, which performs a logical or operation
on four 32-bit fields.  This four field combination involves two steps
in the inductive doubling approach and
can easily be implemented with the following two operations.
%\begin{singlespace}
\begin{verbatim}
\end{verbatim}
%\end{singlespace}

This example is also interesting in showing a potential value for
supporting bitwise logical operations at different field widths, i.e.,
specifically for use with half-operand modifiers.
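The two-step or-across combination described above can be illustrated with a small software model. The following Python sketch is not IDISA code and is not taken from the paper: the helper names `fields`, `horizontal` and `or_across_32x4` are hypothetical, registers are modelled as plain integers, and the half-operand behaviour is an assumption, namely that within each $n$-bit field the high $n/2$ bits are combined with the low $n/2$ bits and the result is truncated to $n$ bits. Where the result ends up within the register is a property of this model only.

```python
import operator

def fields(r, n, width=128):
    """Split a width-bit register value into n-bit fields, most significant first."""
    return [(r >> s) & ((1 << n) - 1) for s in range(width - n, -1, -n)]

def horizontal(op, n, r, width=128):
    """Model a horizontal operation on adjacent n/2-bit fields of r.

    Assumed semantics: in each n-bit field, apply op to the high half
    and the low half, keeping the low n bits of the result.
    """
    half = n // 2
    low_mask = (1 << half) - 1
    out = 0
    for f in fields(r, n, width):
        hi, lo = f >> half, f & low_mask
        out = (out << n) | (op(hi, lo) & ((1 << n) - 1))
    return out

def or_across_32x4(x):
    """OR together four 32-bit fields of a 128-bit value in two doubling steps."""
    t = horizontal(operator.or_, 64, x)    # step 1: combine adjacent 32-bit fields
    t = horizontal(operator.or_, 128, t)   # step 2: combine adjacent 64-bit fields
    return t

# Pairwise horizontal add of 8-bit fields into 16-bit results (32-bit register).
pair_sums = horizontal(operator.add, 16, 0x01020304, width=32)

# Or-across of four 32-bit fields, each contributing one distinct bit.
x = (0x1 << 96) | (0x10 << 64) | (0x100 << 32) | 0x1000
combined = or_across_32x4(x)
```

In this model each doubling step halves the number of live fields, so four fields need two steps; the same `horizontal` helper works for any binary operator passed as `op`.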
Similarly, combining any eight fields simply requires three inductive
doubling steps using the desired operator at successive power-of-two
field widths, while combining sixteen fields requires four such operations.
In this way, the inductive doubling architecture provides systematic
support for horizontal operations well beyond the existing facilities
of commodity architectures, although lacking some of the special features
found in some cases.
In general, systematic support for horizontal
combinations of sets of $2^h$ adjacent fields may
be realized through $h$ inductive doubling steps
in a similar fashion.
Thus, IDISA essentially offers systematic support
for horizontal operations entirely through the
use of \verb:: half-operand modifier
combinations.

Systematic support for general horizontal operations
under IDISA also creates an opportunity for a design tradeoff:
offsetting the circuit complexity of half-operand
modifiers with potential elimination of dedicated
logic for some {\em ad hoc} horizontal SIMD operations.
Even if legacy support for these operations is required,
it may be possible to provide that support through
software or firmware rather than a full hardware
implementation.  Evaluation of these possibilities
in the context of particular architectures is a potential
area for further work.

\section{Implementation}