
Version 1 (modified by cameron, 4 years ago)

--

Unicode Level 1 Support in icGrep

RL1.1 Hex Notation

icGrep implements RL1.1 using backslash escape sequences beginning with \x, \u and \U. An arbitrary Unicode codepoint may be represented by 1 to 6 hexadecimal digits enclosed in braces following either the \x or the \u escape. Alternatively, a codepoint may be represented by exactly 8 hexadecimal digits following the \U escape, without braces. Thus the forms \x{1D11E}, \u{1D11E} and \U0001D11E all represent U+1D11E (musical symbol G clef).

For compatibility with legacy implementations, icGrep also accepts short forms without braces, consisting of 1 or 2 hex digits following \x or exactly 4 hex digits following \u.
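
The hex forms described so far can be summarized in a small Python sketch. This is only an illustration of the notation, not icGrep's own parser: the name decode_hex_escape is hypothetical, and rejecting values above U+10FFFF is an assumption, since the text does not say how out-of-range values are handled.

```python
import re

# Hypothetical helper for illustration; not part of icGrep.
_HEX_ESCAPE = re.compile(
    r"\\x\{([0-9A-Fa-f]{1,6})\}"   # \x{...} with 1 to 6 hex digits
    r"|\\u\{([0-9A-Fa-f]{1,6})\}"  # \u{...} with 1 to 6 hex digits
    r"|\\U([0-9A-Fa-f]{8})"        # \U followed by exactly 8 hex digits
    r"|\\u([0-9A-Fa-f]{4})"        # legacy \u followed by exactly 4 hex digits
    r"|\\x([0-9A-Fa-f]{1,2})"      # legacy \x followed by 1 or 2 hex digits
)

def decode_hex_escape(text):
    """Return the codepoint denoted by one hex escape, or None if no form matches."""
    m = _HEX_ESCAPE.fullmatch(text)
    if m is None:
        return None
    # Exactly one alternative matched, so exactly one group holds the digits.
    digits = next(g for g in m.groups() if g is not None)
    codepoint = int(digits, 16)
    # Assumption: values beyond the Unicode range are treated as invalid.
    return codepoint if codepoint <= 0x10FFFF else None
```

Under these assumptions, decode_hex_escape(r"\x{1D11E}"), decode_hex_escape(r"\u{1D11E}") and decode_hex_escape(r"\U0001D11E") all yield the same codepoint, and the legacy forms r"\x41" and r"\u0041" both yield U+0041.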

Also for compatibility, icGrep accepts octal notation. An arbitrary codepoint may be represented by 1 to 8 octal digits enclosed in braces following the \o escape. The short form consisting of 0 to 3 octal digits following \0 (without braces) is also recognized.
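
The octal forms can be sketched in Python in the same way. Again this is an illustration of the notation rather than icGrep's implementation: decode_octal_escape is a hypothetical name, and two behaviours are assumptions, namely that a bare \0 denotes codepoint 0 and that values above U+10FFFF are rejected.

```python
import re

# Hypothetical helper for illustration; not part of icGrep.
_OCT_ESCAPE = re.compile(
    r"\\o\{([0-7]{1,8})\}"  # \o{...} with 1 to 8 octal digits
    r"|\\0([0-7]{0,3})"     # \0 followed by 0 to 3 octal digits
)

def decode_octal_escape(text):
    """Return the codepoint denoted by one octal escape, or None if no form matches."""
    m = _OCT_ESCAPE.fullmatch(text)
    if m is None:
        return None
    braced, short = m.groups()
    digits = braced if braced is not None else short
    # Assumption: a bare \0 with no further digits denotes codepoint 0.
    codepoint = int(digits, 8) if digits else 0
    # Assumption: values beyond the Unicode range are treated as invalid.
    return codepoint if codepoint <= 0x10FFFF else None
```

With this sketch, decode_octal_escape(r"\o{350436}") gives the same codepoint as \x{1D11E}, since octal 350436 equals hexadecimal 1D11E.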