Changeset 4223


Timestamp: Oct 8, 2014, 8:43:55 AM
Author: cameron
Message: More functions for utf8 lib
Location: proto/charsetcompiler
Files: 2 edited

  • proto/charsetcompiler/unicode_category_compiler.py

    r4222  r4223
    337    337            return [group1] + subpartitions
    338    338
    339           #
    340           def is_low_codepoint_after_byte(codepoint, byte):
    341               for i in range(byte, utf8_length(codepoint)):
    342                   if utf8_byte(codepoint, i+1) != 0x80: return False
    343               return True
    344
    345           def is_high_codepoint_after_byte(codepoint, byte):
    346               for i in range(byte, utf8_length(codepoint)):
    347                   if utf8_byte(codepoint, i+1) != 0xBF: return False
    348               return True
    349
    350    339    # Ensure the sequence of preceding bytes is defined, up to, but
    351    340    # not including the given byte_no
  • proto/charsetcompiler/utf8_lib.py

    r4222  r4223
    64     64
    65     65
           66     #
           67     def is_low_codepoint_after_byte(codepoint, byte):
           68         for i in range(byte, utf8_length(codepoint)):
           69             if utf8_byte(codepoint, i+1) != 0x80: return False
           70         return True
    66     71
           72     def is_high_codepoint_after_byte(codepoint, byte):
           73         for i in range(byte, utf8_length(codepoint)):
           74             if utf8_byte(codepoint, i+1) != 0xBF: return False
           75         return True
           76
           77
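
As read from the diff, the two helpers moved into utf8_lib.py test whether every UTF-8 byte after position `byte` (1-based) is the lowest continuation byte (0x80) or the highest (0xBF). Put differently, they test whether the codepoint is the smallest or largest one sharing its first `byte` bytes. The sketch below makes that runnable in isolation. The bodies of `utf8_length` and `utf8_byte` are not part of this changeset, so the versions here are assumed stand-ins based on the standard UTF-8 encoding rules and may differ from the real ones in utf8_lib.py.

```python
# ASSUMPTION: stand-in for utf8_lib.utf8_length, which this changeset does not show.
def utf8_length(codepoint):
    """Number of bytes in the UTF-8 encoding of codepoint."""
    if codepoint <= 0x7F:
        return 1
    if codepoint <= 0x7FF:
        return 2
    if codepoint <= 0xFFFF:
        return 3
    return 4

# ASSUMPTION: stand-in for utf8_lib.utf8_byte, taking a 1-based byte position n.
def utf8_byte(codepoint, n):
    """Return byte n (1-based) of the UTF-8 encoding of codepoint."""
    length = utf8_length(codepoint)
    if n == 1:
        if length == 1:
            return codepoint
        # Lead-byte prefixes for 2-, 3- and 4-byte sequences.
        prefix = {2: 0xC0, 3: 0xE0, 4: 0xF0}[length]
        return prefix | (codepoint >> (6 * (length - 1)))
    # Continuation bytes carry 6 payload bits each.
    return 0x80 | ((codepoint >> (6 * (length - n))) & 0x3F)

# The two functions below are copied from the changeset.
def is_low_codepoint_after_byte(codepoint, byte):
    for i in range(byte, utf8_length(codepoint)):
        if utf8_byte(codepoint, i+1) != 0x80: return False
    return True

def is_high_codepoint_after_byte(codepoint, byte):
    for i in range(byte, utf8_length(codepoint)):
        if utf8_byte(codepoint, i+1) != 0xBF: return False
    return True

# U+1000 encodes as E1 80 80: all bytes after byte 1 are 0x80.
print(is_low_codepoint_after_byte(0x1000, 1))     # True
# U+0800 encodes as E0 A0 80: byte 2 is 0xA0, so only the tail after byte 2 is low.
print(is_low_codepoint_after_byte(0x800, 1))      # False
print(is_low_codepoint_after_byte(0x800, 2))      # True
# U+10FFFF encodes as F4 8F BF BF: byte 2 is 0x8F, so only the tail after byte 2 is high.
print(is_high_codepoint_after_byte(0x10FFFF, 1))  # False
print(is_high_codepoint_after_byte(0x10FFFF, 2))  # True
```

When `byte` equals the encoded length, the loop body never runs and both functions return True, since there are no trailing bytes left to check.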