As shown in Figure \ref{fig:execution}, icGrep takes the input data and transposes it into 8 parallel bit streams through the S2P module.
The required streams, such as the line break stream, can then be generated from the 8 basis bit streams.
The JIT function retrieves the 8 basis bit streams and the required streams from their memory addresses and starts the matching process.
The Named Property Library, which includes all the predefined Unicode categories, is installed into the JIT function and can be called during the matching process.
The JIT function returns one bitstream that marks all the matching positions.
A match scanner then scans through this bitstream and either calculates the total match count or writes out the context of each match position.
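
The data flow above can be illustrated with a minimal sketch. It is not the icGrep implementation, which operates on SIMD registers. It models each bit stream as an arbitrary-precision integer whose bit $i$ corresponds to input position $i$. The function names \texttt{s2p}, \texttt{line\_break\_stream} and \texttt{scan\_matches} are hypothetical, and treating only LF (0x0A) as a line break is a simplifying assumption.

\begin{verbatim}
from typing import List


def s2p(data: bytes) -> List[int]:
    """Transpose bytes into 8 basis bit streams (illustrative only).

    Stream k is an int whose bit i holds bit k of data[i],
    where k = 0 is the least significant bit of each byte.
    """
    basis = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                basis[k] |= 1 << i
    return basis


def line_break_stream(basis: List[int], length: int) -> int:
    """Mark every position that holds LF (0x0A); an assumed definition."""
    mask = (1 << length) - 1
    stream = mask
    for k in range(8):
        if (0x0A >> k) & 1:
            stream &= basis[k]            # bit k of LF is 1
        else:
            stream &= ~basis[k] & mask    # bit k of LF is 0
    return stream


def scan_matches(matches: int) -> List[int]:
    """Return the positions of all set bits, lowest position first."""
    positions = []
    while matches:
        low = matches & -matches          # isolate the lowest set bit
        positions.append(low.bit_length() - 1)
        matches ^= low                    # clear that bit
    return positions


# The line break stream stands in here for the JIT function's output.
data = b"ab\ncd\n"
marks = line_break_stream(s2p(data), len(data))
print(scan_matches(marks))                # [2, 5]
\end{verbatim}

In this sketch the number of matches is the length of the returned list, and the positions themselves are what a scanner would use to locate the context of each match.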

We can also apply a pipeline parallelism strategy to further speed up icGrep.
S2P and the Required Streams Generator can be processed in a separate thread and can start even before dynamic compilation begins.
The output of S2P and the Required Streams Generator, that is, the 8 basis bit streams and the required streams,
needs to be stored in a shared memory space so that the JIT function can read from it.
To use memory efficiently, we allocate only a limited amount of space for the shared data.
Once every chunk of the shared space has been filled with bitstream data,
the thread starts writing to the first chunk again, provided that chunk has been released by the JIT function.
Otherwise, it waits until the JIT function finishes processing that chunk.
Therefore, the overall performance depends on the slowest thread.
In the case that the cost of transposition and required stream generation exceeds that of the matching process,
we can further divide up the work and assign two threads to S2P and the Required Streams Generator.
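
The chunked shared space described above behaves like a bounded ring buffer between a producer and a consumer. The following is a minimal two-thread sketch under stated assumptions: \texttt{produce} stands in for S2P plus required stream generation, \texttt{consume} stands in for the JIT function, and \texttt{NUM\_CHUNKS} and \texttt{run\_pipeline} are hypothetical names. It is not the icGrep implementation.

\begin{verbatim}
import threading

NUM_CHUNKS = 4  # hypothetical size of the shared space, in chunks


def run_pipeline(blocks, produce, consume):
    """Process blocks with one producer and one consumer thread."""
    ring = [None] * NUM_CHUNKS
    free = threading.Semaphore(NUM_CHUNKS)  # chunks the producer may fill
    filled = threading.Semaphore(0)         # chunks ready for the consumer
    results = []

    def producer():
        for i, block in enumerate(blocks):
            free.acquire()                  # wait until the chunk is released
            ring[i % NUM_CHUNKS] = produce(block)
            filled.release()                # hand the chunk to the consumer

    def consumer():
        for i in range(len(blocks)):
            filled.acquire()                # wait until the chunk is filled
            results.append(consume(ring[i % NUM_CHUNKS]))
            free.release()                  # release the chunk for reuse

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
\end{verbatim}

The two semaphores make the producer block when all chunks are in use and make the consumer block when no chunk is ready, so the throughput is bounded by the slower of the two stages.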
