Changeset 4479 for docs/Working/icGrep


Timestamp: Feb 8, 2015, 11:02:09 AM
Author: cameron
Message: Polishing the section on the dynamic grep engine.
Location: docs/Working/icGrep
Files: 2 edited

  • docs/Working/icGrep/architecture.tex

--- docs/Working/icGrep/architecture.tex (r4476)
+++ docs/Working/icGrep/architecture.tex (r4479)
@@ -73,20 +73,21 @@
 
 
-As shown in Figure~\ref{fig:execution}, \icGrep{} takes the input data and transposes it into 8 parallel bit streams through the Transposition module.
-The required streams, e.g. the line break stream, can then be generated using the 8 basis bits streams.
+Figure~\ref{fig:execution} shows the structure of the \icGrep{} matching engine.
+The input data is transposed into 8 parallel bit streams through the Transposition module.
+Using the 8 basis bits streams, the Required Streams Generator computes the
+line break streams, UTF-8 validation streams and the Initial and NonFinal streams
+needed to support ScanThru and MatchStar with UTF-8 data.
 The Dynamic Matcher, dynamically compiled via LLVM, retrieves the 8 basis bits and the required streams from their memory addresses and starts the matching process.
-The Named Property Library that includes all the predefined Unicode categories is installed into the Dynamic Matcher and can be called during the matching process.
-The Dynamic Matcher returns one bitstream that marks all the matching positions.
-Finally, a Match Scanner scans through the returned bitstream and calculates the total counts or writes the context of each match position.
+During the matching process, any references to named Unicode properties generate calls to the appropriate routine in the Named Property Library.
+The Dynamic Matcher returns one bitstream that marks all the positions that fully match the compiled regular expression.
+Finally, a Match Scanner scans through the returned bitstream to select the matching lines and generate the normal grep output.
 
 We can also apply a pipeline parallelism strategy to further speed up the process of \icGrep{}.
-Transposition and the Required Streams Generator can be performed in a separate thread and start even before the dynamic compilation starts.
+Transposition and the Required Streams Generator can be performed in a separate thread which can start even before the dynamic compilation starts.
 The output of Transposition and the Required Streams Generator, that is the 8 basis bits streams and the required streams,
-needs to be stored in a shared memory space so that the Dynamic Matcher can read from it.
-To more efficiently use memory, we allocate only a limited amount of space for the shared data.
-After each chunk of the shared space is filled with bitstream data,
-the thread starts writing to the first chunk if it has been released by Dynamic Matcher.
-Otherwise, it will wait for Dynamic Matcher until it finishes processing that chunk.
-Therefore, the performance is depended on the slowest thread.
+are stored in a shared memory buffer for subsequent processing by the Dynamic Matcher once compilation is complete.
+A single thread performs both compilation and matching using the computed basis and required streams.
+To avoid L2 cache contention, we allocate only a limited amount of space for the shared data in a circular buffer.
+The performance is dependent on the slowest thread.
 In the case that the cost of transposition and required stream generation is more than the matching process,
 we can further divide up the work and assign two threads for Transposition and Required Streams Generator.
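
Reviewer note: the revised paragraph names Transposition, ScanThru and MatchStar without showing what they compute. The following is a minimal Python sketch, not the icGrep implementation (which is generated through LLVM and operates on SIMD registers). It uses unbounded Python integers as bit streams, with bit i of a stream standing for input position i. All function names are hypothetical, and the choice of basis[0] as the least significant bit of each byte is an assumption made for illustration. MatchStar and ScanThru are written in their usual bitwise data parallel form.

```python
def transpose(data):
    """Return 8 basis bit streams: bit i of basis[k] is bit k of data[i].

    Assumption for this sketch: k = 0 is the least significant bit of a byte.
    """
    basis = [0] * 8
    for pos, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                basis[k] |= 1 << pos
    return basis


def char_class(basis, byte_value, length):
    """Stream marking every position whose byte equals byte_value."""
    mask = (1 << length) - 1
    result = mask
    for k in range(8):
        if (byte_value >> k) & 1:
            result &= basis[k]
        else:
            result &= ~basis[k] & mask
    return result


def advance(markers, length):
    """Move every marker one position forward (one extra slot for end of input)."""
    return (markers << 1) & ((1 << (length + 1)) - 1)


def match_star(markers, cc):
    """Extend each marker through zero or more positions in class cc."""
    return (((markers & cc) + cc) ^ cc) | markers


def scan_thru(markers, cc):
    """Move each marker to the first position after its run of cc positions."""
    return (markers + cc) & ~cc


def bit_positions(stream):
    """List the positions of the set bits in a stream, lowest first."""
    positions, i = [], 0
    while stream:
        if stream & 1:
            positions.append(i)
        stream >>= 1
        i += 1
    return positions


def match_a_cstar_d(data):
    """Positions just past each match of the regular expression ac*d."""
    n = len(data)
    basis = transpose(data)
    a = char_class(basis, ord("a"), n)
    c = char_class(basis, ord("c"), n)
    d = char_class(basis, ord("d"), n)
    m = advance(a, n)           # cursor after 'a'
    m = match_star(m, c)        # cursor after 'a' followed by any number of 'c'
    m = advance(m & d, n)       # cursor after the final 'd'
    return bit_positions(m)
```

For the input b"acccd ad ab", match_a_cstar_d returns [5, 8], the positions immediately after "acccd" and "ad". The whole c* step is a single addition, which is why the required streams (for example Initial and NonFinal for UTF-8 data) are computed once up front and reused by the matcher.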
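
Reviewer note: the pipeline paragraph describes a producer thread that fills a bounded circular buffer while a single consumer thread compiles and then matches. A small sketch of that arrangement, assuming one producer and one consumer, is given here. The class ChunkRing, the function run_pipeline, the chunk size and the slot count are all hypothetical, and the line break stream stands in for the full set of basis and required streams.

```python
import threading


class ChunkRing:
    """Fixed-size circular buffer for one producer and one consumer."""

    def __init__(self, slots):
        self.buf = [None] * slots
        self.free = threading.Semaphore(slots)   # slots the producer may fill
        self.filled = threading.Semaphore(0)     # slots the consumer may read
        self.head = 0   # next slot to read, touched only by the consumer
        self.tail = 0   # next slot to write, touched only by the producer

    def put(self, chunk):
        self.free.acquire()          # blocks while every slot is still in use
        self.buf[self.tail] = chunk
        self.tail = (self.tail + 1) % len(self.buf)
        self.filled.release()

    def get(self):
        self.filled.acquire()        # blocks until the producer has written
        chunk = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.free.release()          # hand the slot back to the producer
        return chunk


def run_pipeline(data, chunk_size=4, slots=2):
    """Return the positions of all line breaks in data, found via the ring."""
    ring = ChunkRing(slots)

    def producer():
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            # Stand-in for Transposition and the Required Streams Generator.
            line_breaks = 0
            for i, byte in enumerate(chunk):
                if byte == 0x0A:
                    line_breaks |= 1 << i
            ring.put((start, line_breaks))
        ring.put(None)               # sentinel: no more chunks

    worker = threading.Thread(target=producer)
    worker.start()

    # In icGrep this thread would first compile the matcher; here it only scans.
    found = []
    while True:
        item = ring.get()
        if item is None:
            break
        start, stream = item
        offset = 0
        while stream:
            if stream & 1:
                found.append(start + offset)
            stream >>= 1
            offset += 1
    worker.join()
    return found
```

With only two slots the producer stalls as soon as it is two chunks ahead, so overall throughput is set by whichever thread is slower, as the revised text states. Keeping the slot count small is what bounds the shared working set.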