the 64 work groups.  Each work group carries out the regular
expression matching operations 4096 bytes at a time using SIMT
processing.  We were able to adapt our long-stream addition
model to the GPU as shown in Figure \ref{fig:GPUadd}.  The GPU
does not directly support the mask and spread operations,
but we are able to simulate them using shared memory.
Each thread maintains
its own carry and bubble values in shared memory and performs
synchronized updates with the other threads using a six-step
parallel-prefix style process.  Others have implemented
long-stream addition on the GPU using similar techniques.
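
The six-step update can be sketched as a standard parallel-prefix
(Kogge-Stone style) carry resolution.  This is an illustrative
formulation under stated assumptions, not necessarily the exact kernel:
we assume $N = 64$ threads per work group, where thread $i$ holds a
carry bit $c_i$ (its field addition overflowed) and a bubble bit $b_i$
(its field sum is all ones), and we ignore any carry entering the block.
The superscripted symbols $c_i^{(k)}$ and $b_i^{(k)}$ are notation
introduced here.  Starting from $c_i^{(0)} = c_i$ and $b_i^{(0)} = b_i$,
step $k = 1, \ldots, \log_2 N$ computes
\[
c_i^{(k)} = c_i^{(k-1)} \lor \bigl( b_i^{(k-1)} \land c_{i-2^{k-1}}^{(k-1)} \bigr),
\]
\[
b_i^{(k)} = b_i^{(k-1)} \land b_{i-2^{k-1}}^{(k-1)},
\]
where terms with a negative index are taken to be $0$.  After
$\log_2 64 = 6$ steps, $c_i^{(6)}$ is the carry out of field $i$, so
field $i$ is incremented by $c_{i-1}^{(6)}$.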
and also accessed by the GPU for further processing. Therefore,
the expensive data transfer time needed by traditional
discrete GPUs is hidden, and we compare only the kernel execution
time with our SSE2 and AVX implementations, as shown in Figure
\ref{fig:SSE-AVX-GPU}. The GPU version gives a 30\% to 55\% performance
improvement over the SSE version and a 10\% to 40\% performance
further processing rather than jump to the next block with a
simple IF test. Therefore, the performance for different
regular expressions depends on the number of
long-stream addition operations and the total number of matches
in a given input.
ylabel=Running Time (ms per megabyte),
xticklabels={@,Date,Email,URIorEmail,xquote},
tick label style={font=\tiny},
enlarge x limits=0.15,
