Version 8 (modified by cameron, 8 years ago)

--

# IDISA Toolkit Project

## Introduction to IDISA

Although a great many SIMD instruction set architectures, such as Altivec, VIS, SSE and AVX, are now in widespread use, there is no widely accepted low-level programming model for cross-platform SIMD programming.

An early attempt to define portable SIMD instructions was that of Fisher and Dietz, who coined the term SWAR (SIMD Within a Register).

Randall J. Fisher and Henry G. Dietz, "Compiling for SIMD Within a Register," Lecture Notes in Computer Science 1656: Languages and Compilers for Parallel Computing, Springer, Berlin, 1999, pp. 290-305.

IDISA (Inductive Doubling Instruction Set Architecture) is our uniform programming model for high-performance SIMD (Single Instruction Multiple Data) programming on multiple computing platforms. While all modern processor families support SIMD instruction sets, the specific instructions available vary considerably from platform to platform. Furthermore, the instruction set designs for each platform tend to involve relatively ad hoc combinations of operations, field widths and vertical or horizontal SIMD processing models. In contrast, the IDISA architecture provides a simple, general model with uniform treatment of SIMD operations at all power-of-2 field widths, as well as support for fully general vertical and horizontal SIMD programming.

### IDISA Vertical SIMD Syntax

The following IDISA notation, in C++ template syntax, presents a fully general structure for vertical SIMD operations for any given basic binary operation on power-of-2 field widths. Let n = 2^k be the field width in bits. Let f be a basic binary operation defined on n-bit quantities producing an n-bit result. Let N = 2^K be the SIMD vector size in bits. Then v = simd<n>::f(a, b) denotes the general pattern for a vertical SIMD operation yielding an output SIMD vector v, given two input SIMD vectors a and b. For each field v_i of v, the value computed is f(a_i, b_i). For example, given 128-bit SIMD vectors, simd<8>::add(a, b) represents the simultaneous addition of sixteen 8-bit fields.
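As an illustration of the vertical pattern, the following is a minimal scalar sketch, not the toolkit's own implementation. It assumes a 64-bit word (`vec64`, a hypothetical stand-in for a real N-bit SIMD register) and models only `add`; the helper `high_bit_mask` is likewise an illustrative name. Field-wise addition is emulated by adding the low n-1 bits of every field and then patching in the high bits with XOR, so that no carry crosses a field boundary.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: a 64-bit word stands in for the N-bit SIMD vector (N = 64).
typedef uint64_t vec64;

// Mask with the high bit of every n-bit field set (n a power of 2, 1 <= n <= 64).
template <unsigned n>
inline vec64 high_bit_mask() {
    vec64 m = 0;
    for (unsigned i = n - 1; i < 64; i += n) m |= vec64(1) << i;
    return m;
}

// Hypothetical scalar model of simd<n>::f for f = add.
template <unsigned n>
struct simd {
    // v_i = (a_i + b_i) mod 2^n for every n-bit field i.
    static vec64 add(vec64 a, vec64 b) {
        const vec64 H = high_bit_mask<n>();
        vec64 low = (a & ~H) + (b & ~H);   // carries stop at each field's high bit
        return low ^ ((a ^ b) & H);        // fold the high bits in without carry-out
    }
};

// Usage: the same inputs give different results at different field widths.
inline void vertical_add_example() {
    vec64 a = 0x00FF00FF00FF00FFULL, b = 0x0101010101010101ULL;
    assert(simd<8>::add(a, b) == 0x0100010001000100ULL);   // 0xFF + 0x01 wraps to 0x00
    assert(simd<16>::add(a, b) == 0x0200020002000200ULL);  // 0x00FF + 0x0101 = 0x0200
}
```

On a real platform the body of `add` would be a single native instruction wherever the field width is supported directly; the masking form above is only the fallback idea.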

See the list of IDISA Vertical operations for the individual operations and their semantics.

### IDISA Horizontal SIMD Syntax

A slight variant of this notation provides a fully general structure for horizontal SIMD operations combining pairs of adjacent fields. Given a binary operation f on n-bit fields and two N-bit vectors a and b, let c be the 2N-bit concatenation of a and b. Then v = hsimd<n>::f(a, b) denotes the application of f in the horizontal combination of all pairs of adjacent fields of c such that v_i = f(c_{2i}, c_{2i+1}).
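The horizontal pattern can be sketched with a scalar reference model in the same spirit. This is an illustrative sketch under stated assumptions, not the toolkit's definition: `vec64` is a hypothetical 64-bit stand-in for the SIMD vector, `c_field` and `apply` are illustrative helper names, and the text does not say which operand supplies the low-order fields of c, so the code assumes b holds fields 0 to N/n - 1 and a holds the rest.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: a 64-bit word stands in for the N-bit SIMD vector (N = 64).
typedef uint64_t vec64;

// Hypothetical scalar model of hsimd<n>::f, n a power of 2 with 1 <= n <= 64.
template <unsigned n>
struct hsimd {
    static const unsigned fields = 64 / n;  // n-bit fields per 64-bit vector

    static vec64 mask() { return ~vec64(0) >> (64 - n); }

    // Field j of c, the 2N-bit concatenation of a and b.
    // ASSUMPTION: b supplies the low-order half of c, a the high-order half.
    static vec64 c_field(vec64 a, vec64 b, unsigned j) {
        vec64 src = j < fields ? b : a;
        return (src >> ((j % fields) * n)) & mask();
    }

    // v_i = f(c_{2i}, c_{2i+1}), truncated to n bits.
    template <typename F>
    static vec64 apply(F f, vec64 a, vec64 b) {
        vec64 v = 0;
        for (unsigned i = 0; i < fields; ++i) {
            vec64 r = f(c_field(a, b, 2 * i), c_field(a, b, 2 * i + 1)) & mask();
            v |= r << (i * n);
        }
        return v;
    }

    static vec64 add(vec64 a, vec64 b) {
        return apply([](vec64 x, vec64 y) { return x + y; }, a, b);
    }
};

// Usage: bytes 1..16 spread over b (low) and a (high); adjacent pairs are summed.
inline void horizontal_add_example() {
    vec64 a = 0x100F0E0D0C0B0A09ULL, b = 0x0807060504030201ULL;
    assert(hsimd<8>::add(a, b) == 0x1F1B17130F0B0703ULL);
}
```

A field-by-field loop like this is far too slow for production use, but it gives a direct reading of the definition against which faster implementations can be compared.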

See the list of IDISA Horizontal operations for the individual operations and their semantics.

## Project

The IDISA toolkit project aims to support the use of IDISA as a standard programming model for portable SIMD programming. The project has the following components.

### IDISA Generator Kit

The IDISA generator kit is used to generate IDISA implementations for given source language/compiler/architecture combinations. For example, we could generate an IDISA implementation consisting of a C library using GCC vector conventions for the PowerPC Altivec instruction set, or a C++ library using MSVC conventions for the Intel SSE2 instruction set. However, it should also have the flexibility to support non-SIMD implementations, such as a Python library using Python conventions for operations on unbounded bitstreams.

The generator kit should include optimization technology to ensure that the best possible IDISA implementation is realized for any given platform.

Note that there are many potential tricks. For example, consider the implementation of simd_hl<2>::add(a), which adds the high and low bits of each 2-bit field, on a platform where addition is natively supported only for larger field widths. A direct implementation requires one shift, two mask operations and one add operation.

But one of the masks can be eliminated by taking advantage of the properties of 2-bit subtraction.

simd_hl<2>::add(a) = simd<16>::sub(a, simd<16>::srli(a, 1) & simd<2>::constant(1))

Another case occurs with the simd<16>::pack
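The mask-elimination identity for simd_hl<2>::add can be checked with a small scalar sketch. The function names below are hypothetical, and plain 64-bit integer operations stand in for the wider-field SIMD shift, subtract and add; this works because neither form ever produces a carry or borrow that crosses a 2-bit field boundary. For a field with high bit h and low bit l, the field value is 2h + l, so subtracting h leaves h + l.

```cpp
#include <cassert>
#include <cstdint>

// Direct form: one shift, two masks, one add.
inline uint64_t add_hl2_direct(uint64_t a) {
    const uint64_t low_bits = 0x5555555555555555ULL;  // 01 in every 2-bit field
    return ((a >> 1) & low_bits) + (a & low_bits);
}

// Optimized form: one shift, one mask, one subtract.
// Each field holds 2h + l; subtracting h yields h + l with no borrow.
inline uint64_t add_hl2_by_sub(uint64_t a) {
    const uint64_t low_bits = 0x5555555555555555ULL;
    return a - ((a >> 1) & low_bits);
}

// Usage: fields 11, 10, 01, 00 become 10, 01, 01, 00.
inline void add_hl2_example() {
    assert(add_hl2_direct(0xE4) == 0x94);
    assert(add_hl2_by_sub(0xE4) == 0x94);
}
```

This is the same first step used in the classic bit-counting (population count) reduction.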

### IDISA Test Generator

The test generator complements the generator kit by producing a comprehensive test suite for correctness testing of IDISA implementations.

### IDISA Compile-Time Specialization Kit

The compile-time specialization kit is used to provide optimized implementations of IDISA operations under known static properties of operand values. For example, if it is known that the high bit of each 4-bit field in registers a and b is zero, then a simd<4>::add(a,b) operation with no direct implementation on a particular platform can be realized by a wider operation that does have a direct implementation, such as simd<16>::add(a,b) on most platforms.
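The 4-bit example can be sketched as follows. This is an illustration only: both function names are hypothetical, and a plain 64-bit integer add stands in for the wider native SIMD add. When every field is at most 7, each field sum is at most 14, so no carry leaves a 4-bit field and the masking of the general form is unnecessary.

```cpp
#include <cassert>
#include <cstdint>

// General 4-bit field add: masks keep carries from crossing field boundaries.
inline uint64_t add4_general(uint64_t a, uint64_t b) {
    const uint64_t H = 0x8888888888888888ULL;  // high bit of every 4-bit field
    return ((a & ~H) + (b & ~H)) ^ ((a ^ b) & H);
}

// Specialized form, valid ONLY when the high bit of every 4-bit field of
// a and b is known to be zero: a wider add gives the same result.
inline uint64_t add4_high_bits_clear(uint64_t a, uint64_t b) {
    return a + b;
}

// Usage: the two forms agree whenever the precondition holds.
inline void add4_example() {
    uint64_t a = 0x7531753175317531ULL, b = 0x7777777777777777ULL;
    assert(add4_general(a, b) == 0xECA8ECA8ECA8ECA8ULL);
    assert(add4_high_bits_clear(a, b) == add4_general(a, b));
}
```

If the precondition is violated the forms diverge: for a = 0xF and b = 0x1 the general form wraps to 0x0 within the field, while the plain add carries into the next field and yields 0x10.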

Another case is implementing the IDISA nonsaturating packl using the saturating pack found with SSE, for example. In this case, the default definition requires masking:

template<> inline SIMD_type simd<16>::packl(SIMD_type r1, SIMD_type r2) {