Version 1 (modified by cameron, 4 years ago)

# Unicode Level 1 Support in icGrep

## RL1.1 Hex Notation

icGrep implements RL1.1 using backslash escape sequences beginning with `\x`, `\u` and `\U`.
An arbitrary Unicode codepoint may be represented by 1 to 6 hexadecimal digits enclosed in
braces following either the `\x` or the `\u` escape. Alternatively, a codepoint may be
represented using exactly 8 hexadecimal digits following the `\U` escape, without braces.
Thus, the forms `\x{1D11E}`, `\u{1D11E}` and `\U0001D11E` all represent U+1D11E (musical symbol G clef).

For compatibility with legacy implementations, icGrep also accepts short forms without braces:
1 or 2 hex digits following `\x`, or exactly 4 hex digits following `\u`.
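
The hex forms described above can be summarized as a small decoder. This is a minimal Python sketch, not icGrep's actual parser: the names `HEX_ESCAPE` and `decode_hex_escape` are hypothetical, and rejecting values above U+10FFFF is an assumption, since the text does not say how icGrep treats them.

```python
import re

# Hypothetical pattern: one alternative per hex form described in the text.
HEX_ESCAPE = re.compile(
    r"""\\(?:
        [xu]\{(?P<braced>[0-9A-Fa-f]{1,6})\}   # braced form after x or u: 1 to 6 digits
      | U(?P<long>[0-9A-Fa-f]{8})              # capital U: exactly 8 digits, no braces
      | x(?P<short_x>[0-9A-Fa-f]{1,2})         # legacy short x form: 1 or 2 digits
      | u(?P<short_u>[0-9A-Fa-f]{4})           # legacy short u form: exactly 4 digits
    )""",
    re.VERBOSE,
)


def decode_hex_escape(text: str) -> int:
    """Return the codepoint denoted by a single hex escape sequence."""
    match = HEX_ESCAPE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognized hex escape: {text!r}")
    # Exactly one named group matched; it holds the hex digits.
    digits = next(g for g in match.groups() if g is not None)
    codepoint = int(digits, 16)
    # Assumption: values beyond the Unicode range are rejected.
    if codepoint > 0x10FFFF:
        raise ValueError(f"codepoint out of range: {text!r}")
    return codepoint


if __name__ == "__main__":
    for form in (r"\x{1D11E}", r"\u{1D11E}", r"\U0001D11E", r"\x41", r"\u00E9"):
        print(form, "->", f"U+{decode_hex_escape(form):04X}")
```

Using `fullmatch` means each input must be exactly one escape; a real regular expression parser would instead consume the escape from the middle of a longer pattern.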

Also for compatibility, icGrep accepts octal notation. An arbitrary codepoint may be
represented by 1 to 8 octal digits enclosed in braces following the `\o` escape.
The short form consisting of 0 to 3 octal digits following `\0` (without braces) is also recognized.
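
The two octal forms can be sketched the same way. Again this is only an illustration in Python, with hypothetical names (`OCTAL_ESCAPE`, `decode_octal_escape`); reading a bare `\0` with no further digits as U+0000 is an assumption drawn from the "0 to 3 digits" wording.

```python
import re

# Hypothetical pattern covering the braced and short octal forms.
OCTAL_ESCAPE = re.compile(
    r"""\\(?:
        o\{(?P<braced>[0-7]{1,8})\}   # braced form: 1 to 8 octal digits
      | 0(?P<short>[0-7]{0,3})        # short form: 0 to 3 octal digits after the leading 0
    )""",
    re.VERBOSE,
)


def decode_octal_escape(text: str) -> int:
    """Return the codepoint denoted by a single octal escape sequence."""
    match = OCTAL_ESCAPE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognized octal escape: {text!r}")
    digits = match.group("braced")
    if digits is None:
        digits = match.group("short")
    # Assumption: an empty digit string (a bare backslash-zero) denotes U+0000.
    return int(digits, 8) if digits else 0


if __name__ == "__main__":
    for form in (r"\o{350436}", r"\o{351}", r"\0101", r"\0"):
        print(form, "->", f"U+{decode_octal_escape(form):04X}")
```

Note that octal 350436 equals hexadecimal 1D11E, so `\o{350436}` names the same G clef codepoint used in the hex examples.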