Changes between Version 1 and Version 2 of ParabixTransform


Timestamp:
Apr 20, 2014, 2:01:00 PM
Author:
cameron
Comment:

--

  • ParabixTransform

    v1 v2  
    17 17  on the availability of a family of horizontal packing operations.  The operations
    18 18  required each have the IDISA pattern {{{hsimd<{2,4,8}>::pack{h,l}(e1, e2)}}},
    19     in which {{{e1}}} and {{{e2}}} are 2^k-bit input registers,
       19  in which {{{e1}}} and {{{e2}}} are 2^k^-bit input registers,
    20 20  the field width of the fields processed is either 2, 4, or 8, and
    21     the packing operation selects the bits comprising either the high ({{{h}}}) or low ({{{l}}}) half of each field.   The result in each case is a single 2^k-bit value
       21  the packing operation selects the bits comprising either the high ({{{h}}}) or low ({{{l}}}) half of each field.   The result in each case is a single 2^k^-bit value
    22 22  comprising the packed bits that are selected.    For example,
    23 23  {{{hsimd<2>::packh(e1, e2)}}} selects the high bit of each 2-bit field
    24 24  in the concatenation of {{{e1}}} and {{{e2}}}, returning the packed set of
    25     2^k bits as a single 2^k-bit value.
       25  2^k^ bits as a single 2^k^-bit value.
    26 26  The following example illustrates this operation working with 16-bit registers.
    27     ||{{{e1}}}||{{{AaBbCcDd}}}||{{{EeFfGgHh}}}|||
    28     ||{{{e2}}}||{{{JjKkLlMm}}}||{{{NnPpQqRr}}}|||
       27  ||{{{e1}}}||{{{AaBbCcDd}}}||{{{EeFfGgHh}}}||
       28  ||{{{e2}}}||{{{JjKkLlMm}}}||{{{NnPpQqRr}}}||
    29 29  ||{{{hsimd<2>::packh(e1, e2)}}}||{{{ABCDEFGH}}}||{{{JKLMNPQR}}}||
    30 30  Similarly, {{{hsimd<8>::packl(e1, e2)}}}
    31 31  selects the low 4-bits of each 8-bit field in the concatenation of {{{e1}}} and {{{e2}}},
    32     again returning the result as a single 2^k-bit value, as illustrated by the following example.
    33     ||{{{e1}}}||{{{AaBbCcDd}}}||{{{EeFfGgHh}}}|||
    34     ||{{{e2}}}||{{{JjKkLlMm}}}||{{{NnPpQqRr}}}|||
    35     ||{{{hsimd<8>::packh(e1, e2)}}}||{{{CdDdGgHh}}||{{{LlMmQqRr}}}||
       32  again returning the result as a single 2^k^-bit value, as illustrated by the following example.
       33  ||{{{e1}}}||{{{AaBbCcDd}}}||{{{EeFfGgHh}}}||
       34  ||{{{e2}}}||{{{JjKkLlMm}}}||{{{NnPpQqRr}}}||
       35  ||{{{hsimd<8>::packh(e1, e2)}}}||{{{CdDdGgHh}}}||{{{LlMmQqRr}}}||
    36 36
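The packing semantics described in the text above can be checked with a small symbolic model. This is only a sketch and not the IDISA implementation: the function name `pack`, its `fw` and `high` parameters, and the string representation (one character per bit, most significant bit first, as in the tables) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical symbolic model of hsimd<fw>::pack{h,l}(e1, e2).
// Each register is a string holding one character per bit, written
// most significant bit first, matching the layout used in the tables.
//   fw   : field width (2, 4, or 8 in the text)
//   high : true models packh, false models packl
std::string pack(std::size_t fw, bool high,
                 const std::string& e1, const std::string& e2) {
    assert(fw >= 2 && fw % 2 == 0);
    assert(e1.size() == e2.size() && e1.size() % fw == 0);

    const std::string cat = e1 + e2;   // concatenation of e1 and e2
    const std::size_t half = fw / 2;   // number of bits kept per field
    std::string out;

    for (std::size_t i = 0; i < cat.size(); i += fw) {
        // Keep the high half (first characters) or the low half
        // (last characters) of the field that starts at position i.
        out += cat.substr(high ? i : i + half, half);
    }
    return out;                        // same length as one input register
}
```

With `e1 = "AaBbCcDdEeFfGgHh"` and `e2 = "JjKkLlMmNnPpQqRr"`, `pack(2, true, e1, e2)` gives `ABCDEFGHJKLMNPQR`, which agrees with the first table. `pack(8, false, e1, e2)` gives `CcDdGgHhLlMmQqRr`. The second table labels its result row `packh` and begins the result with `CdDd`, whereas the prose definition of `packl` yields `CcDd` under this model, so that row may contain typos carried through both versions.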
    37 37  Using these operations it is possible to perform transposition in a
    38     straightforward fashion.   Given a 2^k-byte sequence held consecutively
       38  straightforward fashion.   Given a 2^k^-byte sequence held consecutively
    39 39  in 8 registers {{{s0}}}, {{{s1}}}, … {{{s7}}}, the following
    40 40  3-step transformation process performs transposition to parallel bit streams.