source: icGREP/icgrep-devel/icgrep/kernels/pdep_kernel.h @ 6034

Last change on this file since 6034 was 5870, checked in by nmedfort, 20 months ago

Modified PDEP kernel

File size: 2.1 KB
/*
 *  Copyright (c) 2017 International Characters.
 *  This software is licensed to the public under the Open Software License 3.0.
 */
#ifndef PDEP_KERNEL_H
#define PDEP_KERNEL_H

#include "kernel.h"
#include <llvm/IR/Value.h>
#include <memory>
#include <string>

/*

Conceptually, given an unbounded input stream set of k streams and a marker stream, this kernel uses the
Parallel Bits Deposit (PDEP) instruction to copy the input items from the i-th input stream to the i-th
output stream at the positions indicated by the marker bits. All other output items are set to zero. E.g.,

 SOURCE >  abcdefgh i0000000 00000000 00000000
 MARKER >  ...1.1.1 .....11. ..1...1. ...1.1..
 OUTPUT >  ...a.b.c .....de. ..f...g. ...h.i..

The complicating factor of this kernel is that it assumes the input streams are *swizzled*. I.e., it
"divides" each block of the marker stream into k elements, M_1 ... M_k, and applies the PDEP operation
using M_i to each of the k elements in the i-th input (swizzled) stream.

            CONCEPTUAL VIEW OF INPUT STREAM SET                    ACTUAL LAYOUT OF INPUT STREAM SET

 STREAM 0  abcde...  fg......  hijklm..  nopqrst.     SWIZZLE 0  abcde...  uvwxy...  OPQRS...  89abc...
 STREAM 1  uvwxy...  zA......  BCDEFG..  HIJKLMN.     SWIZZLE 1  fg......  zA......  TU......  de......
 STREAM 2  OPQRS...  TU......  VWXYZ0..  1234567.     SWIZZLE 2  hijklm..  BCDEFG..  VWXYZ0..  fghijk..
 STREAM 3  89abc...  de......  fghijk..  lmnopqr.     SWIZZLE 3  nopqrst.  HIJKLMN.  1234567.  lmnopqr.


NOTE: this kernel does *NOT* unswizzle the output. This will eventually be the responsibility of the
pipeline to ensure it is done when needed.

*/

namespace kernel {

class PDEPkernel final : public MultiBlockKernel {
public:
    PDEPkernel(const std::unique_ptr<kernel::KernelBuilder> & b, const unsigned swizzleFactor = 4, std::string name = "PDEP");
    bool isCachable() const override { return true; }
    bool hasSignature() const override { return false; }
private:
    void generateMultiBlockLogic(const std::unique_ptr<KernelBuilder> & b, llvm::Value * const numOfStrides) final;
private:
    const unsigned mSwizzleFactor;
};

}

#endif
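
The header comment describes two ideas: the PDEP deposit itself and the swizzled layout it operates on. The following is a minimal standalone sketch of both, not the kernel's actual implementation (which emits LLVM IR through KernelBuilder). The names pdep64, swizzle, depositSwizzled, Block and K are hypothetical and do not appear in icgrep. The sketch assumes 64-bit fields and a swizzle factor of 4, and it ignores how unconsumed source bits carry over from one field to the next. Note that PDEP counts bit positions from the least significant bit, whereas the diagrams in the comment read left to right.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Portable software model of PDEP: the low-order bits of `source` are
// deposited, in order, at the positions of the set bits of `mask`.
// All other result bits are zero.
inline std::uint64_t pdep64(std::uint64_t source, std::uint64_t mask) {
    std::uint64_t result = 0;
    for (std::uint64_t bit = 1; mask != 0; bit <<= 1) {
        const std::uint64_t lowest = mask & (~mask + 1); // isolate lowest set mask bit
        if (source & bit) {
            result |= lowest;
        }
        mask &= mask - 1; // clear the mask bit just handled
    }
    return result;
}

constexpr std::size_t K = 4;                  // assumed swizzle factor
using Block = std::array<std::uint64_t, K>;   // one block split into K fields

// Swizzling is a K x K transpose of fields: field j of stream i becomes
// field i of swizzle j, as in the "ACTUAL LAYOUT" diagram.
inline std::array<Block, K> swizzle(const std::array<Block, K> & streams) {
    std::array<Block, K> out{};
    for (std::size_t i = 0; i < K; ++i) {
        for (std::size_t j = 0; j < K; ++j) {
            out[j][i] = streams[i][j];
        }
    }
    return out;
}

// Apply marker field M_j to every field of swizzle j. The result is still
// swizzled, matching the NOTE that the kernel does not unswizzle its output.
inline std::array<Block, K> depositSwizzled(const std::array<Block, K> & swizzles,
                                            const Block & marker) {
    std::array<Block, K> out{};
    for (std::size_t j = 0; j < K; ++j) {
        for (std::size_t i = 0; i < K; ++i) {
            out[j][i] = pdep64(swizzles[j][i], marker[j]);
        }
    }
    return out;
}
```

Because each swizzle holds the same field position from all k streams, one marker field can be reused across a whole swizzle, which is why the kernel prefers this layout over the conceptual one. On x86 hardware with BMI2, the loop in pdep64 corresponds to a single _pdep_u64 instruction.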