Changes between Version 2 and Version 3 of ParallelDeletion


Ignore:
Timestamp:
Mar 15, 2016, 3:19:04 PM (3 years ago)
Author:
cameron
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ParallelDeletion

    v2 v3  
    2424
    2525 1. Apply a single deletion mask to a single bit stream.
    26  1. Apply a single deletion mask to each individual bit stream in a set of N bit streams.
    27  1. Apply a deletion mask to a byte stream.
     26 1. Apply a single deletion mask to each individual bit stream in a set of N parallel bit streams.
     27 1. Apply a deletion mask to a stream of larger units, for example a stream of bytes.
     28
     29In the event that N bit streams are processed, it may be possible to reduce the overall
     30cost by sharing preprocessing overhead.   
    2831
    2932== Parallel Deletion Methods ==
     
    4447the new block to become the new pending block. 
    4548
     49=== Intra-Field Deletion ===
     50
     51When the ultimate goal is to produce streams of bytes, doublebytes, or other units,
     52then it is often beneficial to limit the bit stream deletion work to smaller field sizes.
     53For example, suppose that a stream of doublebytes (16-bit units) is to be produced after
     54application of parallel deletion to a the corresponding 16 parallel bit streams.
     55Working with 128-bit registers, each register will ultimately hold at most 8 doublebytes.
     56Then we may proceed as follows:
     57 1.  Perform parallel deletion within 8-bit fields, so that nondeleted bits are leftmost
     58in each such field.
     59 2.  Transpose to double byte form, producing a set of 16 register values in which the
     60nondeleted double bytes are leftmost in each register.
     613.   Sequentially write the register values to the output stream, advancing the output
     62stream pointer by the count of nondeleted elements each time.
     63
     64
    4665=== Intra-Block Parallel Deletion ===
    4766
    4867There are several different methods that can be used to implement parallel deletion within
    49 SIMD registers.
     68SIMD registers.  Depending on the problem variation, it is often beneficial to combine
     69methods.
    5070
    51  1.  Parallel prefix deletion
    52  1.  SIMD deletion by left-result induction
    53  1.  SIMD deletion by central result induction
     71 1.  Parallel prefix deletion.
     72 1.  SIMD deletion by left-result induction.
     73 1.  SIMD deletion by central result induction.
    5474 1.  Deletion using PEXT bit manipulation instructions.
     75 1.  Deletion by permutation vector computation.
    5576
    5677