Version 6 (modified by cameron, 5 years ago) (diff)


Towards a Library of Shufflevector Patterns

The LLVM IR contains a very powerful vector operation called shufflevector. Unless the target architecture supports a similarly powerful operation directly, good code generation for shufflevector instructions with arbitrary shuffle patterns is very difficult.

However, there are many shufflevector patterns that map to simpler operations, both at the LLVM IR level and at the target architecture level. This document introduces a library of patterns as well as a systematic approach to code generation for shufflevector operations.

Pattern Recognition

With a potentially large library of shuffle vector patterns and corresponding implementations, how do we systematically search for patterns without incurring high pattern matching costs for brute-force sequential search? One answer is to abstract the patterns and to define quick, parallel methods of pattern recognition.

Shuffle Vector Canonical Form

In order to facilitate recognition of shuffle vector patterns, we first perform some transformations to put shuffle vectors in canonical form.

An Initial Example

For example, consider the following three shufflevector patterns are all equivalent.

shufflevector <4 x i32> zeroinitializer, <4 x i32> %x, <4 x i32> <i32 4, i32 2, i32 5, i32 1>

shufflevector <4 x i32> %x, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 6, i32 1, i32 5>

shufflevector <4 x i32> %x, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 4, i32 1, i32 4>

This example illustrates two points about equivalent forms. First, it is always possible to produce equivalent results by exchanging the order of the two input vectors and modifying the shuffle selection mask, accordingly. Second, when one of the input vectors is a constant vector (the vector of all zeroes zeroinitializer, in this case), then we can any selection mask index in this vector is equivalent to any other.

This example also illustrates two rules for the conversion to canonical form in this case. As described below, if zeroinitializer occurs as an input vector, then it should normally occur as the second input vector. Secondly, if the selection mask values for a constant vector always refer to the lowest element. In the example, the selection mask values 5 and 6 are replaced by 4.

Canonical Selection from Constant Splats

In general, in discussing canonical forms, we refer to the three consecutive arguments of shufflevector as vector 1 or v1, vector 2 or v2 and mask.

A constant splat is a constant vector having the same value in every field. When used as a vector, zeroinitializer is a constant splat of the value 0.

Whenever one of the input vectors is a constant splat vector, then the canonical selection mask values for selecting a value from this vector must alwayws specify selection from the first field in the vector. That is, if the size of the input vectors is n, and vector 2 is a constant splat, then the index n is the canonical index for selection from this vector. Conversion to constant indexes in this case is illustrated in the second step of the example above. If vector 1 is a constant splat, then the canonical index for selection from this vector is 0.

Canonical Order of Vectors

Consider a shufflevector invocation of the following form.

shufflevector <n x <ty>> v1, <n x <ty>> v2, <m x i32> mask

Given such a shufflevector is always possible to reverse the order of the vectors with a suitable transformation of the mask.

shufflevector <n x <ty>> v2, <n x <ty>> v1, <m x i32> mask'

The mask transformation replaces each mask element mask[i] with the value which is either mask[i] + n, if mask[i] < n, or mask[i] - n, if mask[i] >= n. This may be expressed by the following equation.

mask'[i] = (mask[i] + n) mod (2n)

The following rules define canonical order for the vector arguments.

  1. If one of the vectors is undef, then vector 2 must be undef in canonical order.
  2. If one of the vectors is zeroinitializer (or the equivalent constant), and the other vector is not undef, then vector 2 must be zeroinitializer in canonical order.
  3. If one of the vectors is a constant vector and the other vector is neither constant nor undef, then vector 2 must be a constant vector in canonical order.
  4. At least one of the selector values must be < n.

If all of the selector values in a mask are >= n, then the vector arguments must be exchanged to be place the shuffle vector instance in canonical order.

Standard Pattern Examples

In the following examples, the patterns are defined assuming that the shufflevector arguments have been first placed in canonical form as described above.

The Rotate Pattern

The double vector rotate pattern is recognized as follows. Form the first-order difference vector of the shuffle index vector.

The Shift Left Pattern

The Shift Right Logical Pattern

The Shift Right Arithmetic Pattern

The Zero-Extend Pattern

The Merge Pattern

The Pack Pattern

The Parallel Extract Pattern

The Parallel Deposit Pattern