Changes between Version 6 and Version 7 of CharacterClassCompiler


Timestamp:
Mar 8, 2016, 3:47:28 PM (3 years ago)
Author:
cameron

  • CharacterClassCompiler

    v6 v7  
    115115==== Construction of Character Class Objects ====
    116116
    117  1.  Character classes may be built up using the operations defined in [source:icGREP/icgrep-devel/re/re_cc.h], as follows.
     117 1.  Character classes may be built up using the operations defined in [source:icGREP/icgrep-devel/icgrep/re/re_cc.h], as follows.
    118118     1.  Basic character classes may be constructed from a single codepoint or a codepoint range, using {{{re::makeCC(cp)}}} or {{{re::makeCC(lo_cp, hi_cp)}}}, respectively.
    119119     1.  Character classes may be further constructed as the union of two character classes using {{{re::makeCC(cc1, cc2)}}}.
    120120     1.  Subtraction and intersection of character classes are also supported.
    121  1.  Character classes may also be constructed by parsing a character class expression as a regular expression object, using the {{{re::parse}}} routine of [source:icGREP/icgrep-devel/re/re_parser.h]
     121 1.  Character classes may also be constructed by parsing a character class expression as a regular expression object, using the {{{re::parse}}} routine of [source:icGREP/icgrep/icgrep-devel/re/re_parser.h]
    122122 1.  In the Parabix+LLVM framework, the current character class objects are based on Unicode codepoints in the range 0 to 0x10FFFF.
    123123     1.  Character class definition based on other alphabets can also potentially be supported - future work.
     124
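The construction operations listed above can be illustrated with a small self-contained sketch. It is not the icGREP implementation: `SketchCC`, `intersectCC`, `subtractCC`, `complementCC`, and `contains` are hypothetical names introduced here, and only the three `makeCC` overload shapes mirror the calls named in the text. The sketch assumes a character class can be modeled as a sorted list of disjoint, inclusive codepoint ranges within 0 to 0x10FFFF.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a character class (NOT icGREP's re::CC):
// a sorted list of disjoint, non-adjacent, inclusive codepoint ranges.
using codepoint_t = uint32_t;
using Interval = std::pair<codepoint_t, codepoint_t>;
using SketchCC = std::vector<Interval>;

constexpr codepoint_t kMaxCodepoint = 0x10FFFF;

// Sort the ranges and merge any that overlap or touch.
SketchCC normalize(SketchCC cc) {
    std::sort(cc.begin(), cc.end());
    SketchCC out;
    for (const Interval & iv : cc) {
        if (!out.empty() && iv.first <= out.back().second + 1) {
            out.back().second = std::max(out.back().second, iv.second);
        } else {
            out.push_back(iv);
        }
    }
    return out;
}

// Class holding the codepoint range lo..hi (inclusive).
SketchCC makeCC(codepoint_t lo, codepoint_t hi) {
    assert(lo <= hi && hi <= kMaxCodepoint);
    SketchCC cc;
    cc.emplace_back(lo, hi);
    return cc;
}

// Class holding a single codepoint.
SketchCC makeCC(codepoint_t cp) { return makeCC(cp, cp); }

// Union of two classes.
SketchCC makeCC(const SketchCC & a, const SketchCC & b) {
    SketchCC all(a);
    all.insert(all.end(), b.begin(), b.end());
    return normalize(all);
}

// Intersection of two normalized classes by a linear merge walk.
SketchCC intersectCC(const SketchCC & a, const SketchCC & b) {
    SketchCC out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        codepoint_t lo = std::max(a[i].first, b[j].first);
        codepoint_t hi = std::min(a[i].second, b[j].second);
        if (lo <= hi) out.emplace_back(lo, hi);
        if (a[i].second < b[j].second) ++i; else ++j;
    }
    return out;
}

// Every codepoint in 0..0x10FFFF that is not in the normalized class a.
SketchCC complementCC(const SketchCC & a) {
    SketchCC out;
    codepoint_t next = 0;  // lowest codepoint not yet accounted for
    for (const Interval & iv : a) {
        if (iv.first > next) out.emplace_back(next, iv.first - 1);
        if (iv.second == kMaxCodepoint) return out;
        next = iv.second + 1;
    }
    out.emplace_back(next, kMaxCodepoint);
    return out;
}

// Subtraction expressed as intersection with the complement.
SketchCC subtractCC(const SketchCC & a, const SketchCC & b) {
    return intersectCC(a, complementCC(b));
}

// Membership test, used here only to check results.
bool contains(const SketchCC & cc, codepoint_t cp) {
    for (const Interval & iv : cc) {
        if (iv.first <= cp && cp <= iv.second) return true;
    }
    return false;
}
```

With these definitions, `makeCC(makeCC(0x61, 0x7A), makeCC(0x30, 0x39))` yields a class with two ranges, and subtracting `makeCC(0x61)` from the lowercase range leaves `0x62` to `0x7A`. Keeping the ranges sorted and merged is a design choice of this sketch that lets union, intersection, and subtraction run in a single pass over the inputs.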
     125==== Compilation of Character Code Unit Classes ====
     126 1.  When a character class is confined to a single code unit (byte, presently), the {{{compileCC}}} operations of [source:icGREP/icgrep/icgrep-devel/cc/cc_compiler.h] can be used to generate Pablo code, in a manner analogous to the Python character class compiler.
     127
     128
     129==== Compilation of Full UTF-8 Character Classes ====
     130 1.  Character class objects include code unit values from the full space of Unicode codepoints.
     131 1.  Compiling full Unicode definitions can be performed by the [source:icGREP/icgrep-devel/icgrep/UCD/ucd_compiler.hpp UCD compiler].
     132
     133
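To make the distinction between single code unit classes and full UTF-8 classes concrete, the following sketch shows standard UTF-8 encoding of a codepoint into bytes. It does not use or describe the UCD compiler or `compileCC`; `encodeUTF8` and `fitsInOneCodeUnit` are hypothetical helper names, and surrogate codepoints are not rejected, as a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode one codepoint (0..0x10FFFF) as its UTF-8 code unit sequence.
// Simplifying assumption: surrogates (0xD800..0xDFFF) are not rejected.
std::vector<uint8_t> encodeUTF8(uint32_t cp) {
    assert(cp <= 0x10FFFF);
    std::vector<uint8_t> out;
    if (cp < 0x80) {
        // One code unit: 0xxxxxxx
        out.push_back(static_cast<uint8_t>(cp));
    } else if (cp < 0x800) {
        // Two code units: 110xxxxx 10xxxxxx
        out.push_back(static_cast<uint8_t>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<uint8_t>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        // Three code units: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<uint8_t>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | (cp & 0x3F)));
    } else {
        // Four code units: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<uint8_t>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<uint8_t>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<uint8_t>(0x80 | (cp & 0x3F)));
    }
    return out;
}

// True when every codepoint in lo..hi encodes as a single UTF-8 byte.
bool fitsInOneCodeUnit(uint32_t lo, uint32_t hi) {
    return lo <= hi && hi < 0x80;
}
```

Under UTF-8, a range such as 0x61 to 0x7A stays within one byte per codepoint, while a class containing 0xE9 or 0x20AC must match two-byte and three-byte sequences, which is why classes spanning the full codepoint space need more than a single code unit test.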