Version 13 (modified by cameron, 5 years ago)


The ShuffleVector Project

This project investigates code generation for the LLVM shufflevector operation, particularly in the case that the shuffle mask is a compile-time constant.

For example, the shuffle mask of a shufflevector may simply describe a byte swap.

%v3 = shufflevector <8 x i8> %v1, <8 x i8> undef,
                    <8 x i32> <i32 1, i32 0, i32 3, i32 2, i32 5, i32 4, i32 7, i32 6>  ; yields <8 x i8>

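This mask swaps the two bytes within each 16-bit field, so the shuffle can be transformed to a vector byte swap:

%t0 = bitcast <8 x i8> %v1 to <4 x i16>
%t1 = call <4 x i16> @llvm.bswap.v4i16(<4 x i16> %t0)
%v3 = bitcast <4 x i16> %t1 to <8 x i8>

This may allow efficient implementation on an architecture that supports byte swap but not shuffle. A 64-bit swap (bitcast to i64 followed by @llvm.bswap.i64) corresponds instead to the fully reversed mask <i32 7, i32 6, i32 5, i32 4, i32 3, i32 2, i32 1, i32 0>.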
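A mask check of this kind can be sketched without any LLVM dependency. The helper below is hypothetical (the name isGroupedByteSwap and its parameters are illustrative, not taken from the project); it assumes the mask has already been extracted into a plain vector of indices and tests whether the mask reverses the elements inside every group of a given size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the project): returns true when `mask`
// reverses the order of the elements inside every consecutive group of
// `groupSize` elements. On an <N x i8> vector this is the pattern
// produced by a byte swap of groupSize-byte fields.
bool isGroupedByteSwap(const std::vector<int>& mask, std::size_t groupSize) {
    if (groupSize < 2 || mask.empty() || mask.size() % groupSize != 0)
        return false;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        std::size_t offset = i % groupSize;              // position inside the group
        std::size_t base = i - offset;                   // first index of the group
        std::size_t expected = base + (groupSize - 1 - offset);
        if (mask[i] < 0 || static_cast<std::size_t>(mask[i]) != expected)
            return false;
    }
    return true;
}
```

With a group size of 2, the mask <1, 0, 3, 2, 5, 4, 7, 6> is accepted; with a group size of 8, only the fully reversed mask <7, 6, 5, 4, 3, 2, 1, 0> is accepted.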

Generalizing this pattern, we may have arbitrary rotations expressed using shuffle masks. For example, consider the shufflevector of 4-bit fields:

%v3 = shufflevector <8 x i4> %v1, <8 x i4> undef,
                    <8 x i32> <i32 1, i32 2, i32 3, i32 0, i32 5, i32 6, i32 7, i32 4>  ; yields <8 x i4>

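Shuffles on 4-bit fields are generally not supported by SIMD instruction sets, but this one rotates the four fields within each 16-bit group, so it can be implemented with 16-bit vector shift operations (shown here for a little-endian layout, where field 0 occupies the low-order bits).

%t0 = bitcast <8 x i4> %v1 to <2 x i16>
%t1 = shl <2 x i16> %t0, <i16 12, i16 12>
%t2 = lshr <2 x i16> %t0, <i16 4, i16 4>
%t3 = xor <2 x i16> %t1, %t2
%v3 = bitcast <2 x i16> %t3 to <8 x i4>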
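The correspondence between a rotation mask and the shift amounts can be checked with a small model. The sketch below is an illustration under stated assumptions, not project code: the names groupRotation, shuffleNibbles16 and rotateByShifts16 are hypothetical, 4-bit fields are packed into a 16-bit integer with field 0 in the low-order bits (little-endian layout), and a return value of -1 stands for "no rotation pattern found".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical analysis helper: if every group of `groupSize` fields is
// rotated by the same amount k (result field i takes source field
// base + (i % groupSize + k) % groupSize), return k; otherwise return -1.
int groupRotation(const std::vector<int>& mask, std::size_t groupSize) {
    if (groupSize == 0 || mask.empty() || mask.size() % groupSize != 0)
        return -1;
    if (mask[0] < 0 || static_cast<std::size_t>(mask[0]) >= groupSize)
        return -1;
    std::size_t k = static_cast<std::size_t>(mask[0]);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        std::size_t offset = i % groupSize;
        std::size_t expected = (i - offset) + (offset + k) % groupSize;
        if (mask[i] < 0 || static_cast<std::size_t>(mask[i]) != expected)
            return -1;
    }
    return static_cast<int>(k);
}

// Reference model: shuffle the four 4-bit fields of x according to a
// 4-entry mask whose indices are all in the range 0..3.
std::uint16_t shuffleNibbles16(std::uint16_t x, const std::vector<int>& mask) {
    std::uint16_t result = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        std::uint16_t field = (x >> (4 * mask[i])) & 0xF;
        result = static_cast<std::uint16_t>(result | (field << (4 * i)));
    }
    return result;
}

// The same rotation expressed as two shifts: lshr by 4*k and shl by
// 16 - 4*k, combined with OR (XOR gives the same result because the two
// shifted values have no set bits in common).
std::uint16_t rotateByShifts16(std::uint16_t x, unsigned k) {
    unsigned right = 4 * (k % 4);
    if (right == 0)
        return x;
    unsigned left = 16 - right;
    return static_cast<std::uint16_t>((x >> right) | (x << left));
}
```

For the mask <1, 2, 3, 0, 5, 6, 7, 4> with groups of four fields, groupRotation reports k = 1, which gives the shift amounts 4 (lshr) and 12 (shl). For the input 0x4321, both the field-by-field model and the shift version produce 0x1432.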

Can these examples be turned into general rules that systematically capture these special cases?

Project Components

  1. Analysis. Analysis components examine shufflevector instances and classify the type of shufflevector operation.

     The isByteSwap64 function is a simple analysis tool. The Shuffle Pattern Library is an attempt to create a systematic infrastructure for pattern analysis.

  2. ShuffleVector IR Transformations. IR transformations replace shufflevector instances with a sequence of other LLVM IR operations. The replaceShuffleVectorWithByteSwap64 function of sfl_to_bswap.cpp is an example.
  3. ShuffleVector Code Generation. Some patterns recognized by the analysis phase may represent operations that cannot conveniently be expressed in other LLVM IR code, but may nevertheless correspond to specific machine instructions on a particular target architecture.

     Examples may be found in the BitShuffle subproject, which generates code for the Haswell new instructions pext and pdep.

  4. Pass Management. Pass managers implement ShuffleVector analysis and transformation according to the Pass interface. ShufflevectorToByteSwapPass of sfl_to_bswap.cpp is an example.
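As background for the pext and pdep instructions mentioned under ShuffleVector Code Generation: pext gathers the source bits selected by a mask into the low-order bits of the result, and pdep scatters the low-order source bits to the positions selected by the mask. The bit-by-bit model below is a hedged reference sketch that could be used to check generated code; the function names pextReference and pdepReference are hypothetical, and this is not code from the BitShuffle subproject.

```cpp
#include <cassert>
#include <cstdint>

// Software model of pext: collect the bits of `src` at the positions
// where `mask` has a 1 bit, packing them into the low-order result bits.
std::uint64_t pextReference(std::uint64_t src, std::uint64_t mask) {
    std::uint64_t result = 0;
    unsigned out = 0;                       // next free low-order result bit
    for (unsigned bit = 0; bit < 64; ++bit) {
        if ((mask >> bit) & 1) {
            result |= ((src >> bit) & 1) << out;
            ++out;
        }
    }
    return result;
}

// Software model of pdep: take successive low-order bits of `src` and
// place them at the positions where `mask` has a 1 bit.
std::uint64_t pdepReference(std::uint64_t src, std::uint64_t mask) {
    std::uint64_t result = 0;
    unsigned in = 0;                        // next low-order source bit to use
    for (unsigned bit = 0; bit < 64; ++bit) {
        if ((mask >> bit) & 1) {
            result |= ((src >> in) & 1) << bit;
            ++in;
        }
    }
    return result;
}
```

For example, pextReference(0xB2, 0xF0) extracts the high nibble and yields 0xB, while pdepReference(0xB, 0xF0) places it back and yields 0xB0.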

Lai ShengZhang's GitHub