Changes between Initial Version and Version 1 of IcGrepUnicodeLevel2

Nov 1, 2015, 11:16:56 AM



= Unicode Level 2 Support in icGrep =

Building on the full [wiki:IcGrepUnicodeLevel1 Unicode Level 1] support of icGrep 1.0,
we're actively working towards full support of the Unicode Level 2 requirements of
Unicode Technical Standard #18 in icGrep 2.0.

The active development version of icGrep can be found in [source:icGREP/icgrep-devel].
== RL2.2 Extended Grapheme Clusters ==

From r4852, icGrep supports the following extended grapheme cluster features:

 * `\X` to match a single extended grapheme cluster.
 * `\b{g}` syntax as a zero-width assertion for extended grapheme cluster boundaries.
 * `\B{g}` syntax as a zero-width assertion for the internal codepoint boundaries within extended grapheme clusters.
 * `(?g)` syntax to enable grapheme cluster mode, in which regular expression elements must always match full grapheme clusters.
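To make the distinction between `\X`, `\b{g}` and `\B{g}` concrete, here is a minimal Python sketch. It is not icGrep's implementation and does not implement the full extended grapheme cluster rules: it assumes, as a deliberate simplification, that a cluster is one base codepoint followed by any codepoints whose General_Category starts with `M` (marks). The function names `approx_clusters` and `boundary_offsets` are hypothetical and chosen only for illustration.

```python
import unicodedata


def approx_clusters(text):
    """Split text into approximate clusters: a base codepoint plus trailing marks.

    Assumption (simplification): only General_Category M* codepoints extend a
    cluster. Hangul syllables, ZWJ sequences, regional indicators and CR LF
    are not handled, so this is only a rough stand-in for what `\\X` matches.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch      # a mark extends the current cluster
        else:
            clusters.append(ch)     # anything else starts a new cluster
    return clusters


def boundary_offsets(text):
    """Return (cluster_boundaries, internal_boundaries) as codepoint offsets.

    cluster_boundaries approximates the positions where `\\b{g}` would hold;
    internal_boundaries approximates the positions where `\\B{g}` would hold.
    """
    cluster_bounds = [0]
    for cluster in approx_clusters(text):
        cluster_bounds.append(cluster_bounds[-1] + len(cluster))
    internal = [i for i in range(len(text) + 1) if i not in cluster_bounds]
    return cluster_bounds, internal


# "e" followed by U+0301 COMBINING ACUTE ACCENT, then "a"
sample = "e\u0301a"
print(approx_clusters(sample))
print(boundary_offsets(sample))
```

For the sample string, the sketch yields two clusters from three codepoints. The only internal position is offset 1, between `e` and the combining accent, which is where a `\B{g}`-style assertion would succeed under this simplified model.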
== RL2.3 Default Word Boundaries ==

== RL2.4 Default Case Conversion ==

== RL2.5 Name Properties ==

From r4852, icGrep supports `\N{}` syntax for codepoint names.
== RL2.6 Wildcards in Property Values ==

From r4852, icGrep supports arbitrary regular expressions within `\N{}` syntax for codepoint names.
For example, `\N{\bSMIL(E|ING)\b}` denotes the set of all Unicode codepoints having one of the words
SMILE or SMILING in their names (Emoji search!).
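The set denoted by such an expression can be approximated outside icGrep with a short Python sketch. It assumes search semantics (the regular expression may match anywhere within the name) and uses the character names from Python's bundled `unicodedata` database, which may correspond to a different Unicode version than the one icGrep uses. The function name `codepoints_with_name_matching` is hypothetical; this illustrates the idea and is not how icGrep evaluates `\N{}`.

```python
import re
import sys
import unicodedata


def codepoints_with_name_matching(pattern):
    """Return all codepoints whose Unicode name contains a match for pattern."""
    name_re = re.compile(pattern)
    matches = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed codepoints (surrogates, unassigned, controls) yield "".
        name = unicodedata.name(chr(cp), "")
        if name and name_re.search(name):
            matches.append(cp)
    return matches


smileys = codepoints_with_name_matching(r"\bSMIL(E|ING)\b")
for cp in smileys[:5]:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
```

The result includes codepoints such as U+2323 SMILE and U+263A WHITE SMILING FACE. The exact size of the set depends on the Unicode version of the name table in use.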
== RL2.7 Full Properties ==

The [source:parabix/icGREP/icgrep-devel development version] of icGrep now
provides substantially more coverage of Unicode properties than icGrep 1.0.
See our separate page on [wiki:ICgrepPropertySupport Property Support in icgrep].