
Changes between Version 4 and Version 5 of StringAPI


Timestamp:
2013-12-31T12:48:28Z (5 years ago)
Author:
Martin Decky
Comment:

clarify wide strings

  • StringAPI

    v4 v5  
    1414HelenOS uses the Universal Character Set or UCS (as defined by ISO/IEC 10646) for representing characters throughout the system. A single ''character'' is represented as `wchar_t` (32-bit). Normally all ''strings'' are represented in UTF-8 and null-terminated. A string is usually declared as `char *`. The API also has limited support for strings that are not null-terminated (or sub-strings).
    1515
    16 There is also limited support for ''wide strings''. These are encoded in UTF-32 and null-terminated. Wide strings can represent exactly the same characters as normal strings. However, with UTF-8 each character is encoded as one or more bytes. With UTF-32, which is used for the wide strings, each character is encoded as exactly four bytes.
     16There is also limited support for ''wide strings'', usually declared as `wchar_t *`. These are encoded in UTF-32 and null-terminated. Wide strings can represent exactly the same characters as normal strings. However, with UTF-8 each character is encoded as one or more bytes. With UTF-32, which is used for the wide strings, each character is encoded as exactly four bytes. Both wide characters and wide strings are encoded using natural byte order (little-endian on little-endian platforms, big-endian on big-endian platforms). Wide strings should never start with the "byte order mark" (BOM) character; the byte order is implicit.
    1717
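The size difference between the two encodings can be illustrated with a minimal sketch in plain C. It is not part of the HelenOS String API: the function names are hypothetical, `uint32_t` stands in for the 32-bit `wchar_t` described above, and the input is assumed to contain only valid code points.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a HelenOS function): number of bytes
 * needed to encode one valid code point in UTF-8.
 */
static size_t utf8_encoded_size(uint32_t cp)
{
	if (cp < 0x80)
		return 1;
	if (cp < 0x800)
		return 2;
	if (cp < 0x10000)
		return 3;
	return 4;
}

/*
 * Hypothetical helper: number of characters in a null-terminated
 * UTF-32 string. Every character occupies exactly one 32-bit unit.
 */
static size_t utf32_length(const uint32_t *ws)
{
	size_t len = 0;

	while (ws[len] != 0)
		len++;

	return len;
}

/*
 * Hypothetical helper: total UTF-8 size in bytes (without the
 * terminator) of the same text stored as a UTF-32 string.
 */
static size_t utf32_as_utf8_size(const uint32_t *ws)
{
	size_t bytes = 0;

	for (size_t i = 0; ws[i] != 0; i++)
		bytes += utf8_encoded_size(ws[i]);

	return bytes;
}

/*
 * Hypothetical helper: true if the wide string starts with the
 * byte order mark (U+FEFF), which the text above says should
 * never happen because the byte order is implicit.
 */
static int utf32_starts_with_bom(const uint32_t *ws)
{
	return ws[0] == 0xFEFF;
}
```

For the four characters U+0061, U+00E9, U+20AC and U+1F600 stored as a wide string, `utf32_length()` returns 4 (16 bytes of payload at four bytes each), while `utf32_as_utf8_size()` returns 10, because the characters need 1, 2, 3 and 4 bytes respectively in UTF-8.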
    1818== Character and String Literals ==