Changeset 1301 for docs


Ignore:
Timestamp:
Aug 8, 2011, 3:38:11 PM (8 years ago)
Author:
ksherdy
Message:

Minor edits.

Location:
docs/PACT2011
Files:
2 edited

Legend:

Unmodified
Added
Removed
  • docs/PACT2011/03-research.tex

    r1300 r1301  
    110110
    111111In general, the set of bit positions in a bit stream may be considered to be the current parsing
    112 positions of multiple parses taking place in parallel throughout the source data stream. Although it is not explicitly shown in these prior examples,
    113 error bit streams can be used to identify any well-formedness errors found during the parsing process. Error positions are gathered and
    114 processed in as a final post processing step. A further aspect of the parallel cursor method with bit stream addition is that the conditional branch statements used to identify
    115 syntax error at each each parsing position are eliminated. Hence, Parabix2 offers additional parallelism over Parabix1 in the form of multiple cursor
    116 parsing and further reduces branch misprediction penalties.
     112positions of multiple parses taking place in parallel throughout the source data stream. Although it is not explicitly shown in these prior examples, error bit streams can be used to identify any well-formedness errors found during the parsing process. Error positions are gathered and
     113processed in as a final post processing step. A further aspect of the parallel cursor method with bit stream addition is that the conditional branch statements used to identify syntax error at each each parsing position are eliminated. Hence, Parabix2 offers additional parallelism over Parabix1 in the form of multiple cursor parsing and further reduces branch misprediction penalties.
     114
     115\subsection{Parallel Bit Stream Compilation}
     116
     117While the description of parallel bit stream parsing in the previous section works conceptually on
     118unbounded bit streams, in practice, a corresponding C implementation to process input streams into blocks
     119of size equal to the SIMD register width of the target processor is required. In our work, we leverage the unbounded
     120integer type of the Python programming language. Using a restricted subset of Python, we prototype and validate the
     121functionality of applications, such as XML validation and UTF-8 to UTF-16 transcoding. We then compile this Python code
     122into equivalent block-at-a-time C code. The key question becomes how to transfer information from one block to the next whenever
     123token scans cross block boundaries.
     124
     125The answer lies in carry bit propagation. Since the parallel $scanto$ operation relies solely on bit-wise addition and logical operations,
     126block-to-block information transfer is captured in entirety by the carry bit associated with each underlying addition operation. Logical operations
     127do not require information flow across block boundaries. Properly determining, initializing and inserting carry bits into a block-by-block
     128implementation is tedious and error prone. Thus we have developed compiler technology to automatically transform parallel bit stream
     129Python code to block-at-a-time C implementations. Details are beyond the scope of this paper, but are described in the on-line
     130source code repository at parabix.costar.sfu.ca.
    117131
    118132
     133 
Note: See TracChangeset for help on using the changeset viewer.