Changeset 3466


Timestamp: Sep 12, 2013, 12:09:26 PM
Author:    cameron
Message:   Updates
Location:  docs/Working/re
Files:     2 edited

Legend: unchanged lines show both r3464 and r3466 line numbers; '-' marks lines removed and '+' marks lines added in r3466.
  • docs/Working/re/ppopp-re.tex

    r3464 → r3466

     96  96   approach and its implementation on SIMD and GPU architectures.
     97  97   This approach relies on a bitwise data parallel view of text
     98      - streams as well as a surprising use of addition to implement
     99      - matching operations.   The closest previous work is that
    100      - underlying bit-parallel XML parsing using 128-bit SSE2 SIMD
         98  + streams as well as a surprising use of addition to match
         99  + runs of characters in a single step.  The closest previous
        100  + work is that underlying bit-parallel XML parsing using 128-bit SSE2 SIMD
    101 101   technology together with a parallel scanning primitive also
    102 102   based on addition \cite{cameron2011parallel}.
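
The matching-by-addition idea referred to in the hunk above can be illustrated with a small sketch: a marker stream M (candidate match positions) is combined with a character-class stream C, and one addition advances every marker through an entire run of class characters. This is a minimal, standalone illustration rather than the paper's implementation; the 64-bit word width, the LSB-first position numbering, and the helper name are assumptions of the sketch.

    #include <stdint.h>
    #include <stdio.h>

    /* MatchStar-style run matching: one addition advances every marker in M
       through the maximal run of class bits in C starting at its position,
       leaving markers at every position reachable by matching zero or more
       class characters.  Bit i of each stream corresponds to text position i
       (LSB first) -- an assumed, simplified convention for this demo. */
    static uint64_t match_star(uint64_t M, uint64_t C) {
        return (((M & C) + C) ^ C) | M;
    }

    int main(void) {
        const char *text = "a1234b9x";
        uint64_t C = 0;                        /* class stream for [0-9] */
        uint64_t M = (uint64_t)1 << 1;         /* marker just after 'a'  */
        for (int i = 0; text[i]; i++)
            if (text[i] >= '0' && text[i] <= '9')
                C |= (uint64_t)1 << i;
        /* Markers reach positions 1..5 (through "1234") but not past 'b'. */
        printf("MatchStar = %#llx\n", (unsigned long long)match_star(M, C));
        return 0;
    }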
     
    108 108   addition technique involving a further application of MatchStar
    109 109   that enables us to scale the technique to $n$-bit addition
    110      - in $log_{64} n$ steps.
        110  + in $\log_{64} n$ steps.   We ultimately apply this technique,
        111  + for example, to perform
        112  + synchronized 4096-bit addition on GPU warps of 64 threads.
    111 113   
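
The hunk above describes scaling the addition technique to long streams, for example 4096-bit addition across a 64-thread GPU warp. A rough sketch of that carry-resolution idea follows, under several assumptions: the 4096-bit value is held as 64 little-endian 64-bit blocks, the per-block work is written as an ordinary loop rather than one thread per block, and the function names are invented for this illustration. Each block's provisional sum yields a carry bit and a "bubble" bit (an all-ones block that would propagate an incoming carry); packing those bits into single words lets one MatchStar-style step resolve all inter-block carries at once.

    #include <stdint.h>

    #define BLOCKS 64   /* 64 blocks x 64 bits = 4096 bits */

    static uint64_t match_star(uint64_t m, uint64_t b) {
        return (((m & b) + b) ^ b) | m;
    }

    /* Hypothetical sketch: add two 4096-bit numbers (block 0 least
       significant) without rippling a carry block by block.  On a GPU,
       each of the 64 threads in a warp would own one block; here the
       per-block work is a plain loop, so only the carry logic is shown. */
    void add_4096(const uint64_t x[BLOCKS], const uint64_t y[BLOCKS],
                  uint64_t sum[BLOCKS]) {
        uint64_t carries = 0, bubbles = 0;
        for (int j = 0; j < BLOCKS; j++) {
            uint64_t r = x[j] + y[j];
            if (r < x[j])                  /* carry out of block j       */
                carries |= (uint64_t)1 << j;
            if (r == ~(uint64_t)0)         /* all-ones block: propagates */
                bubbles |= (uint64_t)1 << j;
            sum[j] = r;
        }
        /* One MatchStar step spreads each incoming carry through any run
           of all-ones blocks, giving the per-block increments.  The carry
           out of the top block is dropped (addition modulo 2^4096). */
        uint64_t incr = match_star(carries << 1, bubbles);
        for (int j = 0; j < BLOCKS; j++)
            if (incr & ((uint64_t)1 << j))
                sum[j] += 1;
    }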
    112 114   There is also a strong keyword match between the bit-parallel
    113      - data streams used in our approach and the bit-parallelism of
    114      - regular expression using the bitap and Wu-Manber NFA
    115      - (nondeterministic finite automata) algorithms \cite{}.
        115  + data streams used in our approach and the bit-parallelism
        116  + used for NFA state transitions in the classical algorithms of
        117  + Wu and Manber \cite{wu1992agrep}, Baeza-Yates and Gonnet \cite{baeza1992new}
        118  + and Navarro and Raffinot \cite{navarro1998bit}.
    116 119   However those algorithms use bit-parallelism in a fundamentally
    117      - different way: to represent all possible current NFA states
    118      - as a bit vector and to perform byte-at-a-time transitions
    119      - using the bitwise operations.   Nevertheless, the agrep and
    120      - nrgrep programs implemented using these techniques remain
    121      - among the strongest competitors in overall performance
    122      - to our implementations.
    123      - 
        120  + different way: representing all possible current NFA states
        121  + as a bit vector and performing parallel transitions to a new
        122  + set of states using table lookups and bitwise logic.    Whereas
        123  + our approach can match multiple characters per step, bit-parallel
        124  + NFA algorithms proceed through the input one byte at a time.
        125  + Nevertheless, the agrep \cite{wu1992agrep} and
        126  + nrgrep \cite{navarro2000} programs implemented using these techniques remain
        127  + among the strongest competitors in regular expression matching
        128  + performance, so we include them in our comparative evaluation.
    124 129   
    125 130   
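For contrast with the approach described in the hunk above, the byte-at-a-time bit-parallel NFA style of Wu and Manber and of Baeza-Yates and Gonnet can be sketched as the classic Shift-And (bitap) loop below: the set of active NFA states is a single bit vector, and each input byte costs one table lookup plus a shift and two bitwise operations. This is a generic textbook sketch for exact matching of a short pattern, not code taken from agrep or nrgrep, and the 64-character pattern limit is an artifact of the illustration.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Shift-And (bitap) sketch: bit i of D is set when the first i+1
       pattern characters match the text ending at the current byte. */
    void shift_and_search(const char *pattern, const char *text) {
        size_t m = strlen(pattern);
        if (m == 0 || m > 64)            /* illustration only: <= 64 chars */
            return;
        uint64_t B[256] = {0};           /* B[c]: pattern positions holding c */
        for (size_t i = 0; i < m; i++)
            B[(unsigned char)pattern[i]] |= (uint64_t)1 << i;

        uint64_t D = 0;                  /* no NFA states active initially */
        for (size_t j = 0; text[j]; j++) {
            /* Advance every active state by one character; bit 0 restarts. */
            D = ((D << 1) | 1) & B[(unsigned char)text[j]];
            if (D & ((uint64_t)1 << (m - 1)))
                printf("match ending at position %zu\n", j);
        }
    }

    int main(void) {
        shift_and_search("abra", "abracadabra");   /* matches end at 3 and 10 */
        return 0;
    }
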
    126 131   The remainder of this paper is organized as follows.
    127      - Section \ref{sec:bitwise}
        132  + Section \ref{sec:unbounded} presents our basic algorithm using
        133  + a model of unbounded bitwise data parallelism.
    128 134   Section \ref{sec:regexp}
    129 135   Section \ref{sec:analysis}
     
    134 140   
    135 141   
    136      - \section{Unbounded Bitwise Data Parallelism}\label{sec:bitwise}
        142  + \section{An Unbounded Model for Bitwise Regular Expression Matching}\label{sec:unbounded}
    137 143   
    138 144   Whereas all traditional regular expression matching engines are
     
    158 164   UTF-16.   In the case of a byte stream, the first step is to transpose
    159 165   the byte stream into eight parallel bit streams, such that bit stream
    160      - $i$ comprises the $i$\th bit of each byte.   These streams form
        166  + $i$ comprises the $i^\text{th}$ bit of each byte.   These streams form
    161 167   a set of basis bit streams from which many other parallel bit
    162 168   streams can be calculated, such as character class bit
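
The transposition step in the hunk above can be sketched in scalar form: a block of 64 input bytes becomes eight 64-bit basis bit streams, where bit j of basis stream i is bit i of byte j, and character-class streams are then just bitwise combinations of the basis streams. The block size, the LSB-first numbering of the basis streams, and the scalar double loop (real implementations use SIMD pack or shuffle operations) are all assumptions of this illustration.

    #include <stdint.h>
    #include <string.h>

    #define BLOCK 64   /* one 64-byte block yields 64-bit basis streams */

    /* Transpose a block of bytes into eight basis bit streams:
       bit j of basis[i] = bit i of bytes[j]. */
    void transpose_block(const uint8_t bytes[BLOCK], uint64_t basis[8]) {
        memset(basis, 0, 8 * sizeof(uint64_t));
        for (int j = 0; j < BLOCK; j++)
            for (int i = 0; i < 8; i++)
                if (bytes[j] & (1u << i))
                    basis[i] |= (uint64_t)1 << j;
    }

    /* Example character-class stream computed purely from basis streams:
       one bit per position whose byte is an ASCII digit (0x30..0x39). */
    uint64_t digit_class(const uint64_t b[8]) {
        return ~b[7] & ~b[6] & b[5] & b[4] & (~b[3] | (~b[2] & ~b[1]));
    }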
  • docs/Working/re/reference.bib

    r3464 → r3466

     86  86   
     87  87   @inproceedings{cameron2008high,
     88      -   title={High performance XML parsing using parallel bit stream technology},
         88  +   title={High performance {XML} parsing using parallel bit stream technology},
     89  89     author={Cameron, Robert D and Herdy, Kenneth S and Lin, Dan},
     90  90     booktitle={Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds},
     
     95  95   
     96  96   @inproceedings{cameron2009parallel,
     97      -   title={Parallel bit stream technology as a foundation for XML parsing performance},
         97  +   title={Parallel bit stream technology as a foundation for {XML} parsing performance},
     98  98     author={Cameron, Rob and Herdy, Ken and Amiri, Ehsan},
     99  99     booktitle={International Symposium on Processing XML Efficiently: Overcoming Limits on Space, Time, or Bandwidth},
     
    103 103   
    104 104   @incollection{cameron2011parallel,
    105      -   title={Parallel scanning with bitstream addition: An xml case study},
        105  +   title={Parallel scanning with bitstream addition: An {XML} case study},
    106 106     author={Cameron, Robert D and Amiri, Ehsan and Herdy, Kenneth S and Lin, Dan and Shermer, Thomas C and Popowich, Fred P},
    107 107     booktitle={Euro-Par 2011 Parallel Processing},
     
    157 157     publisher={ACM}
    158 158   }
        159  + 
        160  + @article{baeza1992new,
        161  +   title={A new approach to text searching},
        162  +   author={Baeza-Yates, Ricardo and Gonnet, Gaston H},
        163  +   journal={Communications of the ACM},
        164  +   volume={35},
        165  +   number={10},
        166  +   pages={74--82},
        167  +   year={1992},
        168  +   publisher={ACM}
        169  + }