# Changeset 244

Timestamp:
Dec 23, 2008, 8:01:15 AM (10 years ago)
Message:

Operand fetch unit logic

Location:
docs/ASPLOS09
Files:
2 edited

Only one of these values may be 1; both are 0 if no modifier is specified.
In addition, the SIDU supplies decoded field width information to both the SOFU and to the SIMD Instruction Execute Unit (SIEU).
The SIDU also supplies decoded field width signals $w_k$ for each field width $2^k$ to both the SOFU and to the SIMD Instruction Execute Unit (SIEU).  Only one of the field width signals has the value 1.
The SIDU also supplies decoded SIMD opcode information to the SIEU and a decoded $A$-bit register address for the destination register to
$n=2^k$ for $0 \leq k \leq K$.  This involves additional circuitry for field widths not supported in existing processors.  For inductive doubling algorithms in support of parallel bit streams, the principal need is for additional circuitry to support 2-bit and 4-bit field widths.  This circuitry is generally less complicated than that for larger fields.  Support for circuitry at these widths has other applications as well.  For example, DNA sequences are frequently represented using packed sequences of 2-bit codes for the four possible nucleotides\cite{}, while the need for accurate financial calculation has led to a resurgence of the 4-bit packed BCD format for decimal floating point \cite{}.
in existing processors.  In our evaluation model, IDISA-A adds support for 2-bit, 4-bit and 128-bit field widths in comparison with the RefA architecture, while IDISA-B similarly extends RefB.
When execution of the SWAR instruction is
\subsection{Operand Fetch Unit Logic}
Discussion of gate-level implementation.
The SOFU is responsible for implementing the half-operand modification logic for each of up to two input operands fetched from SRF.
For each operand, this logic is implemented using the decoded half-operand modifier signals $h$ and $l$, the decoded field width signals $w_k$ and the 128-bit operand value $r$ fetched from SRF to produce a modified 128-bit operand value $s$ following the requirements of equations (4), (5) and (6) above.  Those equations must be applied for each possible modifier and each field width to determine the possible values $s[i]$ for each bit position $i$.
For example, consider bit position 41, whose binary 7-bit address is $0101001$. Considering the address bits left to right, each 1 bit, of weight $n/2$, corresponds to a field width $n$ for which this bit lies in the low $n/2$ bits of its field (here the bits of weight 32, 8 and 1, giving widths 64, 16 and 2), while each 0 bit corresponds to a field width for which this bit lies in the high $n/2$ bits. In response to the half-operand modifier signal $h$, this bit may receive a value from the corresponding position in the high half of its field of width 2, 16 or 64, that is, the position whose corresponding address bit is 0, namely $r[40]$, $r[33]$ or $r[9]$.  Otherwise, this bit receives the value $r[41]$, in the case of no half-operand modifier, or of a low half-operand modifier in conjunction with a field width signal $w_2$, $w_{16}$ or $w_{64}$ (in this example the subscript denotes the field width itself). The overall logic for determining this bit value is thus given as follows.
\begin{eqnarray*}
s[41] & = & h \wedge (w_2 \wedge r[40] \vee w_{16} \wedge r[33] \vee w_{64} \wedge r[9]) \\
& & \vee \neg h \wedge (\neg l \vee w_2 \vee w_{16} \vee w_{64}) \wedge r[41]
\end{eqnarray*}
Similar logic is determined for each of the 128 bit positions. For each of the 7 field widths, 64 bit positions lie in the low $n/2$ bits of their fields, resulting in $7 \times 64 = 448$ 2-input and gates for the $w_k \wedge r[i]$ terms. Position 0 has no such term, and the 7 positions whose address contains a single 1 bit have exactly one term each, so they need no or gate. For the remaining 120 bit positions, or gates are needed to combine the other 441 terms; since a position with $t$ terms needs $t-1$ gates, $441 - 120 = 321$ 2-input or gates are required.  Another 127 2-input and gates combine these values with the $h$ signal, one for each position other than 0. In the case of a low half-operand modifier, the or gates combining $w_k$ signals can share circuitry.
For each bit position $i=2^k+j$ with $0 \leq j < 2^k$, one additional or gate is required beyond that for position $j$. Thus 127 2-input or gates are required.  Another 256 2-input and gates are required for combination with the $\neg h$ and $r[i]$ terms.  The terms for the low and high half-operand modifiers are then combined with an additional 127 2-input or gates.  Thus, the circuit complexity for the combinational logic implementation of half-operand modifiers within the SOFU is $448 + 321 + 127 + 127 + 256 + 127 = 1406$ 2-input gates per operand, or 2812 gates in total.
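
The per-bit logic and the gate tally above can be checked with a short Python sketch. It is illustrative only: the function names are hypothetical, and the treatment of bits in the high half of a field under either modifier (cleared to 0) is an assumption read off the shape of the $s[41]$ equation, since equations (4), (5) and (6) are not reproduced here.

```python
# Sketch only: names and structure are illustrative assumptions, not the paper's design.
WIDTHS = (2, 4, 8, 16, 32, 64, 128)  # field widths n; positions use 7-bit addresses
BITS = 128


def low_half_widths(i):
    """Field widths n for which position i lies in the low n/2 bits of its field,
    i.e. the address bit of weight n/2 is 1 (the convention used for bit 41)."""
    return [n for n in WIDTHS if i & (n // 2)]


def modified_bit(i, r, h, l, n):
    """Behavioural view of s[i] for one active field width n (at most one of h, l set)."""
    in_low_half = bool(i & (n // 2))
    if h:  # assumed: high half moves into the low half, high half is cleared
        return r[i - n // 2] if in_low_half else 0
    if l:  # assumed: low half is kept, high half is cleared
        return r[i] if in_low_half else 0
    return r[i]  # no modifier: operand bit passes through


def gate_level_bit(i, r, h, l, w):
    """The same bit in the sum-of-products shape of the s[41] equation.
    w maps each field width to its one-hot signal (0 or 1)."""
    widths = low_half_widths(i)
    high_terms = any(w[n] and r[i - n // 2] for n in widths)
    keep = (not l) or any(w[n] for n in widths)
    return int((h and high_terms) or ((not h) and keep and r[i]))


def gate_counts():
    """Count 2-input gates per operand, one entry per step of the tally in the text."""
    terms = [len(low_half_widths(i)) for i in range(BITS)]
    return {
        "and_w_r": sum(terms),                      # and gates for the w, r terms
        "or_w_r": sum(t - 1 for t in terms if t),   # or gates merging those terms
        "and_h": sum(1 for t in terms if t),        # and gates gating by h
        "or_low": BITS - 1,                         # shared or chain for the low-modifier term
        "and_low": 2 * BITS,                        # and gates for not-h and r[i]
        "or_final": BITS - 1,                       # or gates merging the two branches
    }
```

For instance, `low_half_widths(41)` returns `[2, 16, 64]`, the widths named in the example for bit 41, and `gate_counts()` reports 448, 321 and 127 for its first three entries.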