Changes between Version 1 and Version 2 of ParabixTransform
 Timestamp: Apr 20, 2014, 2:01:00 PM

ParabixTransform
--- v1
+++ v2
@@ -17,24 +17,24 @@
 on the availability of a family of horizontal packing operations. The operations
 required each have the IDISA pattern {{{hsimd<{2,4,8}>::pack{h,l}(e1, e2)}}},
-in which {{{e1}}} and {{{e2}}} are 2^k bit input registers,
+in which {{{e1}}} and {{{e2}}} are 2^k^bit input registers,
 the field width of the fields processed is either 2, 4, or 8, and
-the packing operation selects the bits comprising either the high ({{{h}}}) or low ({{{l}}}) half of each field. The result in each case is a single 2^k bit value
+the packing operation selects the bits comprising either the high ({{{h}}}) or low ({{{l}}}) half of each field. The result in each case is a single 2^k^bit value
 comprising the packed bits that are selected. For example,
 {{{hsimd<2>::packh(e1, e2)}}} selects the high bit of each 2bit field
 in the concatenation of {{{e1}}} and {{{e2}}}, returning the packed set of
-2^k bits as a single 2^kbit value.
+2^k^ bits as a single 2^k^bit value.
 The following example illustrates this operation working with 16bit registers.
-{{{e1}}}{{{AaBbCcDd}}}{{{EeFfGgHh}}}
-{{{e2}}}{{{JjKkLlMm}}}{{{NnPpQqRr}}}
+{{{e1}}}{{{AaBbCcDd}}}{{{EeFfGgHh}}}
+{{{e2}}}{{{JjKkLlMm}}}{{{NnPpQqRr}}}
 {{{hsimd<2>::packh(e1, e2)}}}{{{ABCDEFGH}}}{{{JKLMNPQR}}}
 Similarly, {{{hsimd<8>::packl(e1, e2)}}}
 selects the low 4bits of each 8bit field in the concatenation of {{{e1}}} and {{{e2}}},
-again returning the result as a single 2^k bit value, as illustrated by the following example.
-{{{e1}}}{{{AaBbCcDd}}}{{{EeFfGgHh}}}
-{{{e2}}}{{{JjKkLlMm}}}{{{NnPpQqRr}}}
-{{{hsimd<8>::packh(e1, e2)}}}{{{CdDdGgHh}} {{{LlMmQqRr}}}
+again returning the result as a single 2^k^bit value, as illustrated by the following example.
+{{{e1}}}{{{AaBbCcDd}}}{{{EeFfGgHh}}}
+{{{e2}}}{{{JjKkLlMm}}}{{{NnPpQqRr}}}
+{{{hsimd<8>::packh(e1, e2)}}}{{{CdDdGgHh}}}{{{LlMmQqRr}}}
 
 Using these operations it is possible to perform transposition in a
-straightforward fashion. Given a 2^k byte sequence held consecutively
+straightforward fashion. Given a 2^k^byte sequence held consecutively
 in 8 registers {{{s0}}}, {{{s1}}}, … {{{s7}}}, the following
 3step transformation process performs transposition to parallel bit streams.
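
The packing operations described in the revised text can be checked with a small symbolic model. The sketch below is not IDISA code: the function name `pack` and the string representation of registers are assumptions made for illustration. It assumes each register is written as one character per bit with the most significant bit on the left, and that {{{e1}}} precedes {{{e2}}} in the concatenation, which matches the 16bit examples on the page.

```python
def pack(field_width, half, e1, e2):
    """Symbolic model of hsimd<field_width>::pack{h,l}(e1, e2).

    e1 and e2 are strings with one character per bit, leftmost = most
    significant (an assumption that matches the AaBb... examples).
    The result has the same length as each input register.
    """
    if half not in ("h", "l"):
        raise ValueError("half must be 'h' or 'l'")
    if len(e1) != len(e2) or len(e1) % field_width != 0:
        raise ValueError("registers must have equal length, a multiple of field_width")

    keep = field_width // 2          # each field contributes half of its bits
    bits = e1 + e2                   # concatenation of the two registers
    fields = [bits[i:i + field_width] for i in range(0, len(bits), field_width)]

    if half == "h":
        return "".join(f[:keep] for f in fields)   # high half of every field
    return "".join(f[keep:] for f in fields)       # low half of every field


e1 = "AaBbCcDdEeFfGgHh"
e2 = "JjKkLlMmNnPpQqRr"

packh2 = pack(2, "h", e1, e2)   # high bit of each 2-bit field
packl8 = pack(8, "l", e1, e2)   # low 4 bits of each 8-bit field

print(packh2)  # ABCDEFGHJKLMNPQR
print(packl8)  # CcDdGgHhLlMmQqRr
```

Under these assumptions the first result agrees with the {{{hsimd<2>::packh}}} row. The second result, {{{CcDdGgHh}}} {{{LlMmQqRr}}}, is what the prose description of {{{hsimd<8>::packl}}} implies. The corresponding table row in both versions is labelled {{{packh}}} and begins {{{CdDdGgHh}}}, which may be a typo in the page, though that is an inference from the prose and not something the revision itself states.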