Changeset 4452 for docs


Ignore:
Timestamp:
Feb 3, 2015, 2:47:09 PM (4 years ago)
Author:
cameron
Message:

Intro

Location:
docs/Working/icGrep
Files:
4 edited

Legend:

Unmodified
Added
Removed
  • docs/Working/icGrep/background.tex

    r4450 r4452  
    5757to support high-performance streaming text processing based on a bit-parallel
    5858transform represenation of text.  In this representation, a text $T$ is represented as a set of parallel
    59 bit streams $B_i$, such that bit $j$ of stream $B_i$ is the $i^{\text{th}}$ bit of
     59bit streams $B_i$, such that bit $j$ of stream $B_i$ is the $i^{\mbox{th}}$ bit of
    6060character code unit $j$ of $T$.  The Parabix methods have been used to
    6161accelerate Unicode transcoding \cite{cameron2008case}, protein search \cite{green2009modeling},
    6262XML parsing \cite{cameron2011parallel}, and, most recently, regular expression search \cite{cameron2014bitwise}.
    63 
    6463
    6564
  • docs/Working/icGrep/bitgrep.bib

    r4448 r4452  
    295295}
    296296
    297 
     297@inproceedings{mytkowicz2014data,
     298  title={Data-parallel finite-state machines},
     299  author={Mytkowicz, Todd and Musuvathi, Madanlal and Schulte, Wolfram},
     300  booktitle={Proceedings of the 19th international conference on Architectural support for programming languages and operating systems},
     301  pages={529--542},
     302  year={2014},
     303  organization={ACM}
     304}
     305@article{scarpazza2011top,
     306  title={Top-performance tokenization and small-ruleset regular expression matching},
     307  author={Scarpazza, Daniele Paolo},
     308  journal={International Journal of Parallel Programming},
     309  volume={39},
     310  number={1},
     311  pages={3--32},
     312  year={2011},
     313  publisher={Springer}
     314}
     315@inproceedings{salapura2012accelerating,
     316  title={Accelerating business analytics applications},
     317  author={Salapura, Valentina and Karkhanis, Tejas and Nagpurkar, Priya and Moreira, Jose},
     318  booktitle={High Performance Computer Architecture (HPCA), 2012 IEEE 18th International Symposium on},
     319  pages={1--10},
     320  year={2012},
     321  organization={IEEE}
     322}
     323@inproceedings{zu2012gpu,
     324  title={GPU-based NFA implementation for memory efficient high speed regular expression matching},
     325  author={Zu, Yuan and Yang, Ming and Xu, Zhonghu and Wang, Lin and Tian, Xin and Peng, Kunyang and Dong, Qunfeng},
     326  booktitle={ACM SIGPLAN Notices},
     327  volume={47},
     328  number={8},
     329  pages={129--140},
     330  year={2012},
     331  organization={ACM}
     332}
     333
  • docs/Working/icGrep/introduction.tex

    r4448 r4452  
    11\section{Introduction}
     2
    23The venerable Unix {\tt grep} program is an everyday tool widely used
    34to search for lines in text files matching a given regular expression
     
    910parallelization have generally concentrated on the use of SIMD, multicore
    1011or GPU technologies to accelerate multiple instances of independent
    11 matching problems.   Brief review ...
     12matching problems. 
     13Scarpazza \cite{scarpazza2011top} used SIMD and multicore parallelism
     14to accelerate small ruleset tokenization applications on the Cell
     15Broadband Engine while Valaspura \cite{salapura2012accelerating}
     16built on these techniques to accelerate business analytics applications
     17using SSE instructions on commodity processors.   
     18Zu et al \cite{zu2012gpu} use GPU technology to implement NFA-based regular
     19expression matching with parallelism devoted both to processing
     20a compressed active state array as well as to handling matching of
     21multiple packet instances.
    1222These works have not generally tackled Unicode matching problems.
    1323
    1424Using parallel methods to accelerate matching of a single pattern on a
    15 single input stream is more difficult.  Indeed, the finite state machines (FSMs)
    16 underlying are considered the hardest to parallelize (embarassingly sequential)
    17 of the 13 dwarves identified in the Berkeley landscape of parallel computing research \cite{asanovic2006landscape}.
     25single input stream is more difficult.  Indeed, of the 13 dwarves identified in the Berkeley overview of parallel computing research,
     26finite state machines (FSMs) are considered the hardest to parallelize (embarassingly sequential) \cite{asanovic2006landscape}.
    1827However, some success has been reported recently along two independent lines of
    19 research.   Mytkowicz...   Cameron/Parabix....
    20 These works do not address Unicode issues.
     28research.   Mytkowicz et al \cite{mytkowicz2014data} use SIMD shuffle operations
     29to implement composable DFA transitions using dynamic convergence to
     30reduce the number of states in play at any one time and range coalescing
     31to compact the transition tables.   Unfortunately, the method seems unlikely
     32to apply well to Unicode regular expression matching problems, which routinely
     33require thousands of DFA states for named Unicode properties.
     34Building on the Parabix framework, Cameron at al \cite{cameron2014bitwise} introduce
     35regular expression matching using the bitwise
     36data parallel approach together with the MatchStar primitive
     37for efficient implementation of
     38Kleene-* character-class repetitions.
    2139
    2240In this paper, we report on the use of the implementation of a full
     
    2947to classical grep implementations, ICgrep offers dramatic performance
    3048acceleration in ASCII-based and Unicode matching performance alike.
    31 
    32 
    3349
    3450The remainder of this paper is organized as follows.   Section \ref{sec:background}
Note: See TracChangeset for help on using the changeset viewer.