Changeset 4342


Timestamp: Dec 15, 2014, 10:28:18 PM
Author: cameron
Message: Unicode line breaks; $ matches at final byte position

File: 1 edited

  • icGREP/icgrep-devel/icgrep/re/re_compiler.cpp

    r4340 r4342
    @@ -68,5 +68,5 @@
     
     //#define USE_IF_FOR_NONFINAL 1
    -#define UNICODE_LINE_BREAK false
    +#define UNICODE_LINE_BREAK true
     
     
    @@ -186,7 +186,6 @@
         else if (isa<End>(re)) {
             if (UNICODE_LINE_BREAK) {
    -            // We would have to advance to the end of the Unicode LB category,
    -            // but that violates our marker assumption (a third marker type: atNextFinal???)
    -            throw std::runtime_error("Unsupported: $ with Unicode line break");
    +            PabloAST * nextPos = nextUnicodePosition(marker, pb);
    +            return makeFinalPositionMarker("end", pb.createAnd(nextPos, mUnicodeLineBreak), pb);
             }
             PabloAST * nextPos = postPositionVar(marker, pb);  // For LF match
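
The new Unicode branch advances the match marker to the next Unicode position and keeps it only where that position is a Unicode line break, so `$` ends up marked at the final byte of the line-break character. The following is a minimal scalar sketch of that idea, not the Pablo bitstream code from re_compiler.cpp. The function names (`nextFinalByte`, `unicodeLineBreakFinal`, `endOfLineMarkers`) are hypothetical. It assumes that the line-break stream marks the final UTF-8 byte of LF, VT, FF, CR, NEL (U+0085), LS (U+2028) and PS (U+2029), and that incoming markers sit on the final byte of the last matched character. CRLF pairing is ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True for UTF-8 continuation bytes (10xxxxxx).
static bool isContinuation(unsigned char b) { return (b & 0xC0) == 0x80; }

// Hypothetical stand-in for nextUnicodePosition: for every marked byte
// (assumed to be the final byte of a character), mark the final byte of
// the character that follows it. marker.size() must equal s.size().
std::vector<bool> nextFinalByte(const std::string & s, const std::vector<bool> & marker) {
    std::vector<bool> out(s.size(), false);
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!marker[i]) continue;
        std::size_t j = i + 1;            // first byte of the next character
        if (j >= s.size()) continue;      // no following character
        // Move forward while the byte after j still belongs to this character.
        while (j + 1 < s.size() && isContinuation(s[j + 1])) ++j;
        out[j] = true;
    }
    return out;
}

// Assumed shape of the line-break stream: one bit per byte, set on the
// final byte of LF, VT, FF, CR, NEL, LS and PS.
std::vector<bool> unicodeLineBreakFinal(const std::string & s) {
    std::vector<bool> out(s.size(), false);
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char b = s[i];
        const unsigned char p1 = i >= 1 ? s[i - 1] : 0;
        const unsigned char p2 = i >= 2 ? s[i - 2] : 0;
        if (b >= 0x0A && b <= 0x0D) {
            out[i] = true;                                   // LF, VT, FF, CR
        } else if (b == 0x85 && p1 == 0xC2) {
            out[i] = true;                                   // NEL: C2 85
        } else if ((b == 0xA8 || b == 0xA9) && p1 == 0x80 && p2 == 0xE2) {
            out[i] = true;                                   // LS / PS: E2 80 A8 / A9
        }
    }
    return out;
}

// Scalar analogue of createAnd(nextPos, mUnicodeLineBreak).
std::vector<bool> endOfLineMarkers(const std::string & s, const std::vector<bool> & marker) {
    const std::vector<bool> next = nextFinalByte(s, marker);
    const std::vector<bool> lb = unicodeLineBreakFinal(s);
    std::vector<bool> out(s.size(), false);
    for (std::size_t i = 0; i < s.size(); ++i) out[i] = next[i] && lb[i];
    return out;
}
```

In this sketch, a marker on the byte just before a three-byte LS character yields an end marker on the third byte of LS, which is the "final byte position" behaviour the commit message describes.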