Changeset 1201 for proto/charsetcompiler


Timestamp: Jul 2, 2011, 2:33:00 PM (8 years ago)
Author: cameron
Message: Xerces prototype files
File: 1 edited

  • proto/charsetcompiler/inputs/UTF8_XML

    r914  r1201
          1      #  Definitions for UTF-8 plus XML 0xFFFE/0xFFFF identification.
          2      #
    1     3      u8_unibyte = [\x00-\x7F]
    2     4      u8_prefix = [\xC0-\xFF]
          5      u8_suffix = [\x80-\xBF]
          6      # 2-byte sequences
    3     7      u8_prefix2 = [\xC0-\xDF]
    4            u8_prefix3 = [\xE0-\xEF]
    5            u8_prefix4 = [\xF0-\xFF]
    6            u8_suffix = [\x80-\xBF]
    7     8      # For 2 byte-sequence validation
    8     9      u8_badprefix2 = [\xC0-\xC1]
          10     # 3-byte sequences
          11     u8_prefix3or4 = [\xE0-\xFF]
          12     u8_prefix3 = [\xE0-\xEF]
    9     13     # For 3 byte-sequence validation
    10    14     xE0 = [\xE0]
    …     …
    16    20     xBF = [\xBF]
    17    21     xBE = [\xBE]
          22     # 4-byte sequences
          23     u8_prefix4 = [\xF0-\xFF]
    18    24     # 4 byte sequence validation
    19    25     u8_badprefix4 = [\xF5-\xFF]
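
As a rough illustration of what the r1201 definitions describe, the following Python sketch looks up which of the named byte classes a given byte falls into. The ranges are copied from the r1201 side of the diff; the definitions elided from the view are left out rather than guessed. The names `CLASSES` and `classes_of` are hypothetical and are not part of charsetcompiler, and nothing here reflects how the compiler itself consumes this file.

```python
# Byte ranges copied from the r1201 side of the diff.
# The definitions hidden by the changeset viewer (r1201 lines 15-19)
# are deliberately omitted, not reconstructed.
CLASSES = {
    "u8_unibyte": (0x00, 0x7F),
    "u8_prefix": (0xC0, 0xFF),
    "u8_suffix": (0x80, 0xBF),
    "u8_prefix2": (0xC0, 0xDF),
    "u8_badprefix2": (0xC0, 0xC1),
    "u8_prefix3or4": (0xE0, 0xFF),
    "u8_prefix3": (0xE0, 0xEF),
    "xE0": (0xE0, 0xE0),
    "xBF": (0xBF, 0xBF),
    "xBE": (0xBE, 0xBE),
    "u8_prefix4": (0xF0, 0xFF),
    "u8_badprefix4": (0xF5, 0xFF),
}


def classes_of(byte: int) -> list[str]:
    """Return the names of every class whose range contains `byte`."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    return [name for name, (lo, hi) in CLASSES.items() if lo <= byte <= hi]


if __name__ == "__main__":
    # Classify each byte of a two-byte UTF-8 sequence.
    for b in "é".encode("utf-8"):
        print(f"0x{b:02X}: {', '.join(classes_of(b))}")
```

A byte can belong to several classes at once, for example 0xC0 is a prefix, a 2-byte prefix, and a bad 2-byte prefix, which is why the lookup returns a list rather than a single name.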