Changes between Version 5 and Version 6 of ParallelDeletion


Ignore:
Timestamp:
Mar 15, 2016, 3:56:32 PM (4 years ago)
Author:
cameron
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ParallelDeletion

    v5 v6  
    7373 1.  Sequentially write the register values to the output stream, advancing the output stream pointer by the count of nondeleted bytes for each register in turn.
    7474
     75== Summary of Deletion Methods ==
     76
     77=== Parallel Prefix Deletion ===
     78
     79This method uses only bitwise logic and shifting to perform deletion. 
     80For deletion within 2^k bit fields, k steps are required.   Each step i
     81involves application of a mask to select of bits to move 2^i positions. 
     824 operations are required per step.
     83However, the calculation of masks requires k^2 preprocessing steps.
     84However, when applied to deletion of data from several parallel bit streams, the
     85preprocessing cost is shared across the streams.
     86
     87=== Deletion by Left Result Induction ===
     88
     89In this method, deletion steps ensure that results are leftmost within
     90fields of size 2^i at step i.   The process involves SIMD multiplication
     91operations at each step.
     92
     93=== Deletion by Central Result Induction ===
     94
     95In this method, deletion steps produce results that span or touch the
     96centre position of each 2^(i+1) bit field at step i.   The process
     97involves SIMD rotate operations at each step.
     98
     99=== Deletion by PEXT ===
     100
     101Intel introduced the PEXT instruction with its Haswell processors. 
     102This operation uses a mask to select nondeleted bits of a 64-bit value
     103and compress them together leftmost in a 64-bit result.  Thus it performs
     104parallel deletion in a single step, with a mask that is the inverse of the
     105deletion mask.
     106
     107=== Deletion by Permutation Vector Computation ===
     108
     109This method is suitable for performing deletion at byte or large granularities,
     110given a processor supporting shuffle or permutation operations with variable selectors.
    75111
    76112
    77113
     114