Changeset 4550

May 13, 2015, 10:19:26 AM (3 years ago)

Mention short vector SIMD methods earlier, ref principled speculation.

4 edited


  • docs/Working/icGrep/bitgrep.bib

    r4503 r4550  
    304304  organization={ACM}
     307  title={Challenging the embarrassingly sequential: parallelizing finite state machine-based computations through principled speculation},
     308  author={Zhao, Zhijia and Wu, Bo and Shen, Xipeng},
     309  booktitle={ASPLOS},
     310  pages={543--558},
     311  year={2014},
     312  organization={ACM}
    307316  title={Top-performance tokenization and small-ruleset regular expression matching},
  • docs/Working/icGrep/icgrep.tex

    r4506 r4550  
    30 \title{Bitwise Data Parallelism with LLVM: The ICgrep Case Study}
     30\title{Bitwise Data Parallelism with LLVM: The icGrep Case Study}
    36 Bitwise data parallelism has recently been shown to have considerable promise
     36Bitwise data parallelism using short vector (SIMD) instructions has recently been shown to have considerable promise
    3737as the basis for a new, fundamentally parallel, style of regular expression
    38 processing.  This paper examines the application of this
     39This paper examines the application of this
    3940approach to the development a full-featured Unicode-capable open-source grep
    4041implementation.  Constructed using a layered architecture
    4445of dynamic compilation and bitwise data parallelism.   
    4546In performance comparisons with several contemporary alternatives,
    46 10$\times$ or better speedups are often observed.
     4710$\times$ or better speedups are often observed. 
  • docs/Working/icGrep/introduction.tex

    r4504 r4550  
    1919built on these techniques to accelerate business analytics applications
    2020using SSE instructions on commodity processors.   
    21 Zu et al~\cite{zu2012gpu} use GPU technology to implement NFA-based regular
     21Zu et al.~\cite{zu2012gpu} use GPU technology to implement NFA-based regular
    2222expression matching with parallelism devoted both to processing
    2323a compressed active state array as well as to handling matching of
    2929finite state machines (FSMs) are considered the hardest to parallelize (embarrassingly sequential) \cite{asanovic2006landscape}.
    3030However, some success has been reported recently along two independent lines of
    31 research.   Mytkowicz et al \cite{mytkowicz2014data} use SIMD shuffle operations
    32 to implement composable DFA transitions using dynamic convergence to
    33 reduce the number of states in play at any one time and range coalescing
    34 to compact the transition tables.   Unfortunately, the method seems unlikely
    35 to apply well to Unicode regular expression matching problems, which routinely
    36 require thousands of DFA states for named Unicode properties.
    37 Building on the Parabix framework, Cameron et al.~\cite{cameron2014bitwise} introduce
    38 regular expression matching using a new bitwise
    39 data parallel approach.
     31research.   The first focuses on data-parallel division of an input stream into separately
     32processed segments on multiple cores.   In this case, the challenge is
     33to identify the appropriate starting state or states that need to be
     34considered for all but the first segment.   Mytkowicz et al. \cite{mytkowicz2014data}
     35use dynamic convergence and range coalescing to dramatically reduce the number
     36of states in play at any point, while Zhao et al. \cite{zhao2014challenging} address the problem using
     37principled speculation.   Introducing a second line of research,
     38Cameron et al.~\cite{cameron2014bitwise} investigate new regular expression
     39algorithms based on the concept of bitwise data parallelism, together with
     40implementations that exploit short vector SIMD instructions to accelerate
     41regular expression matching even within a single core.
    4143In this paper, we report on the use of the implementation of a full
Note: See TracChangeset for help on using the changeset viewer.