Changes between Version 5 and Version 6 of CharSetArch


Timestamp:
Jul 22, 2008, 5:12:17 AM (11 years ago)
Author:
cameron
Comment:

--

  • CharSetArch

    v5 v6  
    1 1 = Character Set Architecture =
    2 2
    3 Parabix has a character set architecture that is designed to provide high-performance native parsing for a wide variety of character sets.
    4 The architecture uses both the C++ template mechanism and an object hierarchy for character sets within families. 
     3 The XML specification allows individual XML documents to be encoded in any of
     4 a wide variety of character sets. The Parabix character set architecture
     5 is designed to provide high-performance native parsing for any reasonable
     6 character set, while also easing the burden on application developers
     7 through efficient transcoding to UTF-8 or UTF-16, as needed.
     8
     9 The architecture uses both the C++ template mechanism and an object hierarchy for character sets within families.
     10
     11 == Document Character Set, Working Character Set ==
     12
     13 Conceptually, Parabix may be considered to be a family of parsing engines, one for
     14 each possible pair of values (DCS, WCS), where DCS is the document character set
     15 in which the XML document is encoded and WCS is the working character set for
     16 strings that are delivered to the application (UTF-8 or UTF-16). By supplying
     17 specific (DCS, WCS) pairs at compile time through the template mechanism,
     18 individual members of the family can be instantiated. However, the space
     19 of potential DCS values is itself organized as a family that is encoded
     20 as a C++ object hierarchy. This allows partial specification of the character-set
     21 family at compile time, together with run-time determination of the processing
     22 required for particular members of the family.
     23
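The (DCS, WCS) split described above can be illustrated with a minimal C++ sketch. Every name in it (`Engine`, `SingleByteFamily`, `Latin1`, `UsAscii`, `Utf8`, `Utf16`, `append`) is hypothetical and chosen only for illustration; none is taken from the Parabix sources, and the sketch handles only code points in the Basic Multilingual Plane. It assumes the DCS family and the WCS are template arguments fixed at compile time, while the family member is an object selected at run time.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// All names here are hypothetical; they are not Parabix identifiers.

// Working character sets (WCS): fixed at compile time via a template argument.
struct Utf8  { using code_unit = char; };
struct Utf16 { using code_unit = char16_t; };

// A DCS family as an object hierarchy: single-byte, ASCII-compatible sets.
class SingleByteFamily {
public:
    virtual ~SingleByteFamily() = default;
    // Member-specific step, resolved at run time: document byte -> code point.
    virtual std::uint32_t to_code_point(unsigned char byte) const = 0;
};

class Latin1 : public SingleByteFamily {
public:
    // ISO-8859-1 bytes coincide with the first 256 Unicode code points.
    std::uint32_t to_code_point(unsigned char byte) const override { return byte; }
};

class UsAscii : public SingleByteFamily {
public:
    // Bytes outside the ASCII range are replaced with U+FFFD.
    std::uint32_t to_code_point(unsigned char byte) const override {
        return byte < 0x80 ? byte : 0xFFFD;
    }
};

// Append one BMP code point as UTF-8 (one to three bytes).
inline void append(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Append one BMP code point as a single UTF-16 code unit.
inline void append(std::u16string& out, std::uint32_t cp) {
    out.push_back(static_cast<char16_t>(cp));
}

// One engine per (DCS family, WCS) pair, instantiated at compile time.
template <class DcsFamily, class Wcs>
class Engine {
public:
    using string_type = std::basic_string<typename Wcs::code_unit>;

    explicit Engine(std::unique_ptr<DcsFamily> member) : member_(std::move(member)) {}

    // Transcode a document from the run-time-selected DCS member to the WCS.
    string_type transcode(const std::string& document) const {
        string_type out;
        for (unsigned char byte : document) {
            append(out, member_->to_code_point(byte));
        }
        return out;
    }

private:
    std::unique_ptr<DcsFamily> member_;
};
```

Under these assumptions, `Engine<SingleByteFamily, Utf8>` constructed with `std::make_unique<Latin1>()` would deliver UTF-8 strings from an ISO-8859-1 document, while the same template with `Utf16` and a `UsAscii` member would deliver UTF-16. The compiler specializes the output path per WCS, and only the per-byte mapping goes through a virtual call.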
    5 24
    6 25 == Pseudo-ASCII ==