Changeset 3433 for proto/RE


Timestamp: Aug 13, 2013, 6:48:04 AM
Author: cameron
Message: fix
Location: proto/RE
Files: 4 edited

  • proto/RE/doc/README

    r3022 r3433  
    @@ -1,15 +1,38 @@
    -Author: Ben Hull
    -Created on: April 13, 2013
    +Research Issue:  Efficient Skipping
     
    ------ ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
    -RE Instructions
    ------ ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
     
    -1. Modify the input regular expression (RE/input/test/test1.re).
    -   RE/input/grammar/re.scatter has the current grammar.
    +Line-Based Skipping
     
    -2. Make in RE.
    +if the fixed strings do not exist in the line: skip the line
     
    -3. Make in RE/output.
     
    -4. Make in RE/output/src.
    +need a way to propagate bits to line begin/end
    +
    +
    +
    +Suppose  simd<fw>::any  gives masks which are <fw> in length all zeroes or all ones
    +        simd<fw>::none
    +
    +Let p be a bit stream such that p is required to be 1 at some position on a line.
    +
    +
    +NotFoundInLine = scanthru(LineStart, [P|LineEnd]) & LineEnd
    +
    +
    +
    +
    +If a segment has a line end, but no occurrence of p, then we can
    +rule out any match up to the line end.
    +
    +
    +
    +
    +Compute
    +
    +m1 = simd_and(span_stream, marker)
    +m2 = simd_and(simd<2>::all(span_stream), simd<2>::any(m1))
    +
    +
    +
    +
    +
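The NotFoundInLine formula in the README notes can be tried out on small inputs. The following is a minimal Python sketch, not code from this changeset. It assumes that `scanthru(LineStart, [P|LineEnd])` is meant to advance each line-start marker to the first position on its line that is either in P or a line end. It models bit streams as Python integers with bit 0 as the first character. The helper names (`bits_where`, `scan_to`, `not_found_in_line`, `positions`) are hypothetical, and the required condition P is simplified to "a given character occurs".

```python
def bits_where(text, pred):
    """Bit i is set when pred(text[i]) holds (bit 0 = first character)."""
    return sum(1 << i for i, ch in enumerate(text) if pred(ch))


def scan_to(markers, stop, n):
    """Advance each marker to the first position at or after it that is in stop.

    Assumes at most one marker per run of non-stop positions, which holds
    here because markers are line starts and stop contains every line end.
    """
    mask = (1 << n) - 1
    run = ~stop & mask                      # positions a marker may pass through
    return ((markers + run) & ~run) & mask  # the addition's carry does the scan


def not_found_in_line(text, required_char):
    """Mark the line end of every line that does not contain required_char."""
    n = len(text)
    line_end = bits_where(text, lambda ch: ch == "\n")
    line_start = ((line_end << 1) | 1) & ((1 << n) - 1)
    p = bits_where(text, lambda ch: ch == required_char)
    return scan_to(line_start, p | line_end, n) & line_end


def positions(stream):
    """List the set bit positions of a stream, lowest first."""
    return [i for i in range(stream.bit_length()) if (stream >> i) & 1]


print(positions(not_found_in_line("abc\nxpz\nq\n", "p")))  # [3, 9]
```

In the example, the lines "abc" and "q" contain no "p", so their line ends (positions 3 and 9) are marked, while the line "xpz" is not.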
  • proto/RE/doc/UTF8_class.py

    r3250 r3433  
    @@ -9,6 +9,6 @@
     #
     # Definitions for debugging/prototyping
    -def ByteClassCompiler(lbyte): return "%x" % lbyte
    -def ByteRangeCompiler(lbyte, hbyte): return "[%x-%x]" % (lbyte, hbyte)
    +def ByteClassCompiler(lbyte): return "\\x%x" % lbyte
    +def ByteRangeCompiler(lbyte, hbyte): return "[\\x%x-\\x%x]" % (lbyte, hbyte)
     def make_or(e1, e2): return "(%s | %s)" % (e1, e2)
     def make_and(e1, e2): return "(%s & %s)" % (e1, e2)
    @@ -39,15 +39,9 @@
        else: return 0x10FFFF
     
    -def UTF8_range_compiler(lo, hi):
    -   hlen = UTF8_length(hi)
    -   # If different length code unit sequences are involved, make
    -   # a union of equilength subranges.
    -   if hlen > UTF8_length(lo):
    -     m = max_codepoint_of_length(hlen - 1)
    -     return make_or(UTF8_range_compiler(lo, m), UTF8_range_compiler(m+1, hi))
    -   #
    -   return matched_sequence_compiler(lo, hi, 1, hlen)
     
     def matched_sequence_compiler(lo, hi, n, hlen):
    +   """ Helper function to generate the code necessary to match bytes
    +       n through hlen (1-based indexing) of the range of UTF-8 sequences
    +       for codepoints lo through hi. """
        hbyte = UTF8_byte(hi, n)
        lbyte = UTF8_byte(lo, n)
    @@ -79,3 +73,13 @@
        return make_shift_forward(ByteRangeCompiler(lbyte, hbyte), hlen - n)
     
    +def UTF8_range_compiler(lo, hi):
    +   hlen = UTF8_length(hi)
    +   # If different length code unit sequences are involved, make
    +   # a union of equilength subranges.
    +   if hlen > UTF8_length(lo):
    +     m = max_codepoint_of_length(hlen - 1)
    +     return make_or(UTF8_range_compiler(lo, m), UTF8_range_compiler(m+1, hi))
    +   #
    +   return matched_sequence_compiler(lo, hi, 1, hlen)
     
    +
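The "union of equilength subranges" step in UTF8_range_compiler can be looked at in isolation. The sketch below is an illustration, not code from UTF8_class.py: it mirrors the recursion but returns the codepoint subranges instead of compiled expressions. The bodies of `utf8_length` and `max_codepoint_of_length` are assumed stand-ins based on the standard UTF-8 length boundaries, since the file's own UTF8_length and max_codepoint_of_length are not fully visible in this changeset. `split_by_utf8_length` is a hypothetical name.

```python
def utf8_length(cp):
    """Number of bytes in the UTF-8 encoding of codepoint cp (assumed stand-in)."""
    if cp <= 0x7F:
        return 1
    if cp <= 0x7FF:
        return 2
    if cp <= 0xFFFF:
        return 3
    return 4


def max_codepoint_of_length(n):
    """Largest codepoint whose UTF-8 encoding has n bytes (assumed stand-in)."""
    if n == 1:
        return 0x7F
    if n == 2:
        return 0x7FF
    if n == 3:
        return 0xFFFF
    return 0x10FFFF


def split_by_utf8_length(lo, hi):
    """Split [lo, hi] into subranges whose endpoints encode to equal lengths."""
    hlen = utf8_length(hi)
    if hlen > utf8_length(lo):
        m = max_codepoint_of_length(hlen - 1)
        return split_by_utf8_length(lo, m) + split_by_utf8_length(m + 1, hi)
    return [(lo, hi)]


for lo, hi in split_by_utf8_length(0x41, 0x20AC):
    print("U+%04X..U+%04X (%d bytes)" % (lo, hi, utf8_length(hi)))
```

For the range U+0041 to U+20AC this yields three subranges, one each for 1-byte, 2-byte, and 3-byte sequences. Each of these would then be handed to matched_sequence_compiler in the real code.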
  • proto/RE/doc/re_proto.hs

    r3427 r3433  
    @@ -64,5 +64,5 @@
     removeNullablePrefix (Join (r:rs))
        | minMatchLen(r) == 0  = removeNullablePrefix(Join rs)
    -   | otherwise            = Join (r:rs)
    +   | otherwise            = Join ((removeNullablePrefix r):rs)
     removeNullablePrefix (Alt rs) = Alt (map removeNullablePrefix rs)
     removeNullablePrefix (Opt r) = Join []
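The effect of the changed `otherwise` branch shows up when the first element of a Join is itself a Join that starts with a nullable item. The following Python model is a sketch under assumptions, not a translation of re_proto.hs: it covers only the Join, Alt, and Opt cases visible in this hunk, adds a hypothetical `Lit` node for literals, and uses an assumed `min_match_len` in place of the file's minMatchLen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    text: str          # hypothetical literal node, not in the hunk


@dataclass(frozen=True)
class Opt:
    item: object


@dataclass(frozen=True)
class Alt:
    items: tuple


@dataclass(frozen=True)
class Join:
    items: tuple


def min_match_len(r):
    """Assumed stand-in for minMatchLen: shortest possible match length."""
    if isinstance(r, Lit):
        return len(r.text)
    if isinstance(r, Opt):
        return 0
    if isinstance(r, Alt):
        return min((min_match_len(x) for x in r.items), default=0)
    return sum(min_match_len(x) for x in r.items)  # Join


def remove_nullable_prefix(r):
    """Mirror of the r3433 cases: recurse into a non-nullable head as well."""
    if isinstance(r, Join):
        if not r.items:
            return r
        head, rest = r.items[0], r.items[1:]
        if min_match_len(head) == 0:
            return remove_nullable_prefix(Join(rest))
        return Join((remove_nullable_prefix(head),) + rest)
    if isinstance(r, Alt):
        return Alt(tuple(remove_nullable_prefix(x) for x in r.items))
    if isinstance(r, Opt):
        return Join(())
    return r


nested = Join((Join((Opt(Lit("a")), Lit("b"))), Lit("c")))
print(remove_nullable_prefix(nested))
```

With the r3427 behaviour the nested example would be returned unchanged, because the outer head has a minimum match length of 1. With the r3433 behaviour the optional "a" inside the inner Join is stripped.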
  • proto/RE/output/grep_template.cpp

    r3410 r3433  
    @@ -39,4 +39,6 @@
     #define SEGMENT_SIZE (BLOCK_SIZE * SEGMENT_BLOCKS)
     
    +#define BUFFER_SEGMENTS 32
    +#define BUFFER_SIZE (BUFFER_SEGMENTS * SEGMENT_SIZE)
     @global
     