# Changeset 4550

Ignore:
Timestamp:
May 13, 2015, 10:19:26 AM (4 years ago)
Message:

Mention short vector SIMD methods earlier, ref principled speculation.

Location:
docs/Working/icGrep
Files:
4 edited

Unmodified
Added
Removed
• ## docs/Working/icGrep/bitgrep.bib

 r4503 organization={ACM} } @inproceedings{zhao2014challenging, title={Challenging the embarrassingly sequential: parallelizing finite state machine-based computations through principled speculation}, author={Zhao, Zhijia and Wu, Bo and Shen, Xipeng}, booktitle={ASPLOS}, pages={543--558}, year={2014}, organization={ACM} } @article{scarpazza2011top, title={Top-performance tokenization and small-ruleset regular expression matching},
• ## docs/Working/icGrep/icgrep.tex

 r4506 \begin{document} \title{Bitwise Data Parallelism with LLVM: The ICgrep Case Study} \title{Bitwise Data Parallelism with LLVM: The icGrep Case Study} \author{Anonymous} \begin{abstract} Bitwise data parallelism has recently been shown to have considerable promise Bitwise data parallelism using short vector (SIMD) instructions has recently been shown to have considerable promise as the basis for a new, fundamentally parallel, style of regular expression processing.  This paper examines the application of this processing. This paper examines the application of this approach to the development a full-featured Unicode-capable open-source grep implementation.  Constructed using a layered architecture of dynamic compilation and bitwise data parallelism. In performance comparisons with several contemporary alternatives, 10$\times$ or better speedups are often observed. 10$\times$ or better speedups are often observed. \end{abstract}
• ## docs/Working/icGrep/introduction.tex

 r4504 built on these techniques to accelerate business analytics applications using SSE instructions on commodity processors. Zu et al~\cite{zu2012gpu} use GPU technology to implement NFA-based regular Zu et al.~\cite{zu2012gpu} use GPU technology to implement NFA-based regular expression matching with parallelism devoted both to processing a compressed active state array as well as to handling matching of finite state machines (FSMs) are considered the hardest to parallelize (embarrassingly sequential) \cite{asanovic2006landscape}. However, some success has been reported recently along two independent lines of research.   Mytkowicz et al \cite{mytkowicz2014data} use SIMD shuffle operations to implement composable DFA transitions using dynamic convergence to reduce the number of states in play at any one time and range coalescing to compact the transition tables.   Unfortunately, the method seems unlikely to apply well to Unicode regular expression matching problems, which routinely require thousands of DFA states for named Unicode properties. Building on the Parabix framework, Cameron et al.~\cite{cameron2014bitwise} introduce regular expression matching using a new bitwise data parallel approach. research.   The first focuses on data-parallel division of an input stream into separately processed segments on multiple cores.   In this case, the challenge is to identify the appropriate starting state or states that need to be considered for all but the first segment.   Mytkowicz et al. \cite{mytkowicz2014data} use dynamic convergence and range coalescing to dramatically reduce the number of states in play at any point, while Zhao et al. \cite{zhao2014challenging} address the problem using principled speculation.   Introducing a second line of research, Cameron et al.~\cite{cameron2014bitwise} investigate new regular expression algorithms based on the concept of bitwise data parallelism, together with implementations that exploit short vector SIMD instructions to accelerate regular expression matching even within a single core. In this paper, we report on the use of the implementation of a full
Note: See TracChangeset for help on using the changeset viewer.