Changeset 3652 for docs/Working

Feb 24, 2014, 4:00:36 PM

Minor clean-up

2 edited


  • docs/Working/re/re-main.tex

    r3649 r3652
    85    85    text processing algorithms that exhibit irregular memory access patterns
    86    86    can be efficiently executed on multicore hardware.
    87          In related work, Pasetto et al. presented a flexible tool that
          87    In related work, Pasetto et al presented a flexible tool that
    88    88    performs small-ruleset regular expression matching at a rate of
    89    89    2.88 Gbps per chip on Intel Xeon E5472 hardware \cite{pasetto2010}.
    90          Naghmouchi et al. \cite{scarpazza2011top,naghmouchi2010} demonstrated that the Aho-Corasick (AC)
          90    Naghmouchi et al \cite{scarpazza2011top,naghmouchi2010} demonstrated that the Aho-Corasick (AC)
    91    91    string matching algorithm \cite{aho1975} is well suited for parallel
    92    92    implementation on multicore CPUs, GPGPUs and the Cell BE.
    93    93    On each hardware, both thread-level parallelism (cores) and data-level parallelism
    94    94    (SIMD units) were leveraged for performance.
    95          Salapura et. al. \cite{salapura2012accelerating} advocated the use of vector-style processing for regular expressions
          95    Salapura et al \cite{salapura2012accelerating} advocated the use of vector-style processing for regular expressions
    96    96    in business analytics applications and leveraged the SIMD hardware available
    97    97    on multi-core processors to acheive a speedup of greater than 1.8 over a
    109   109   4 Gbps on the Cell BE.
    110   110   % GPU
    111         In more recent work, Tumeo et al. \cite{tumeo2010efficient} presented a chunk-based
          111   In more recent work, Tumeo et al \cite{tumeo2010efficient} presented a chunk-based
    112   112   implementation of the AC algorithm for
    113   113   accelerating string matching on GPGPUs. Lin et al., proposed
    114   114   the Parallel Failureless Aho-Corasick (PFAC)
    115   115   algorithm to accelerate pattern matching on GPGPU hardware and
    116         achieved 143 Gbps throughput, 14.74 times faster
    117         than the AC algorithm performed on a four core
    118         multi-core processor using OpenMP \cite{lin2013accelerating}.
          116   achieved 143 Gbps raw data throughput,
          117   although system throughput was limited to 15 Gbps \cite{lin2013accelerating}.
    120   119   Whereas the existing approaches to parallelization have been