Changeset 1d2f85e in mainline for boot/generic/src/str.c


Timestamp:
2019-02-05T18:26:54Z
Author:
Jiří Zárevúcky <zarevucky.jiri@…>
Parents:
08e103d4
Message:

Change documentation of <str.h> functions to use unambiguous terms

File:
1 edited

  • boot/generic/src/str.c

--- boot/generic/src/str.c (r08e103d4)
+++ boot/generic/src/str.c (r1d2f85e)
@@ -37,5 +37,5 @@
  * Strings and characters use the Universal Character Set (UCS). The standard
  * strings, called just strings are encoded in UTF-8. Wide strings (encoded
- * in UTF-32) are supported to a limited degree. A single character is
+ * in UTF-32) are supported to a limited degree. A single code point is
  * represented as wchar_t.@n
  *
@@ -46,5 +46,5 @@
  *  byte                  8 bits stored in uint8_t (unsigned 8 bit integer)
  *
- *  character             UTF-32 encoded Unicode character, stored in wchar_t
+ *  character             UTF-32 encoded Unicode code point, stored in wchar_t
  *                        (signed 32 bit integer), code points 0 .. 1114111
  *                        are valid
@@ -62,5 +62,5 @@
  *                        the NULL-terminator), size_t
  *
- *  [wide] string length  number of CHARACTERS in a [wide] string (excluding
+ *  [wide] string length  number of CODE POINTS in a [wide] string (excluding
  *                        the NULL-terminator), size_t
  *
@@ -76,5 +76,5 @@
  *                            NULL-terminator)
  *
- *  length  l        size_t   number of CHARACTERS in a string (excluding the
+ *  length  l        size_t   number of CODE POINTS in a string (excluding the
  *                            null terminator)
  *
@@ -85,5 +85,5 @@
  * Function naming prefixes:@n
  *
- *  chr_    operate on characters
+ *  chr_    operate on code points
  *  ascii_  operate on ASCII characters
  *  str_    operate on strings
@@ -98,5 +98,5 @@
  *  pointer (char *, wchar_t *)
  *  byte offset (size_t)
- *  character index (size_t)
+ *  code point index (size_t)
  *
  */
@@ -128,9 +128,9 @@
 #define CONT_BITS  6
 
-/** Decode a single character from a string.
- *
- * Decode a single character from a string of size @a size. Decoding starts
+/** Decode a single code point from an UTF-8 encoded string.
+ *
+ * Decode a single code point from a string of size @a size. Decoding starts
  * at @a offset and this offset is moved to the beginning of the next
- * character. In case of decoding error, offset generally advances at least
+ * code point. In case of decoding error, offset generally advances at least
  * by one. However, offset is never moved beyond size.
  *
@@ -139,5 +139,5 @@
  * @param size   Size of the string (in bytes).
  *
- * @return Value of decoded character, U_SPECIAL on decoding error or
+ * @return Value of decoded code point, U_SPECIAL on decoding error or
  *         NULL if attempt to decode beyond @a size.
  *
@@ -198,17 +198,17 @@
 }
 
-/** Encode a single character to string representation.
- *
- * Encode a single character to string representation (i.e. UTF-8) and store
+/** Encode a single code point to a UTF-8 string representation.
+ *
+ * Encode a single code point to a UTF-8 string representation and store
  * it into a buffer at @a offset. Encoding starts at @a offset and this offset
- * is moved to the position where the next character can be written to.
- *
- * @param ch     Input character.
+ * is moved to the position where the next code point can be written to.
+ *
+ * @param ch     Input code point.
  * @param str    Output buffer.
  * @param offset Byte offset where to start writing.
  * @param size   Size of the output buffer (in bytes).
  *
- * @return EOK if the character was encoded successfully, EOVERFLOW if there
- *         was not enough space in the output buffer or EINVAL if the character
+ * @return EOK if the code point was encoded successfully, EOVERFLOW if there
+ *         was not enough space in the output buffer or EINVAL if the code point
  *         code was invalid.
  */
@@ -289,15 +289,15 @@
 }
 
-/** Get size of string with length limit.
+/** Get size of string with code point count limit.
  *
  * Get the number of bytes which are used by up to @a max_len first
- * characters in the string @a str. If @a max_len is greater than
- * the length of @a str, the entire string is measured (excluding the
- * NULL-terminator).
+ * code points in the string @a str. If @a max_len is greater than
+ * the number of code points in @a str, the entire string is measured
+ * (excluding the NULL-terminator).
  *
  * @param str     String to consider.
- * @param max_len Maximum number of characters to measure.
- *
- * @return Number of bytes used by the characters.
+ * @param max_len Maximum number of code points to measure.
+ *
+ * @return Number of bytes used by the code points.
  *
  */
@@ -317,9 +317,9 @@
 }
 
-/** Get number of characters in a string.
- *
- * @param str NULL-terminated string.
- *
- * @return Number of characters in string.
+/** Get number of unicode code points in a UTF-8 encoded string.
+ *
+ * @param str NULL-terminated UTF-8 string.
+ *
+ * @return Number of code points in the string.
  *
  */
@@ -335,7 +335,7 @@
 }
 
-/** Check whether character is plain ASCII.
- *
- * @return True if character is plain ASCII.
+/** Check whether code point is plain ASCII.
+ *
+ * @return True if code point is plain ASCII.
  *
  */
@@ -348,7 +348,7 @@
 }
 
-/** Check whether character is valid
- *
- * @return True if character is a valid Unicode code point.
+/** Check whether code point is valid
+ *
+ * @return True if code point is a valid Unicode code point.
  *
  */
@@ -365,10 +365,10 @@
  * Do a char-by-char comparison of two NULL-terminated strings.
  * The strings are considered equal iff their length is equal
- * and both strings consist of the same sequence of characters.
- *
- * A string S1 is less than another string S2 if it has a character with
- * lower value at the first character position where the strings differ.
+ * and both strings consist of the same sequence of code points.
+ *
+ * A string S1 is less than another string S2 if it has a code point with
+ * lower value at the first code point position where the strings differ.
  * If the strings differ in length, the shorter one is treated as if
- * padded by characters with a value of zero.
+ * padded by code points with a value of zero.
  *
  * @param s1 First string to compare.
@@ -409,5 +409,5 @@
  * No more than @a size bytes are written. If the size of the output buffer
  * is at least one byte, the output string will always be well-formed, i.e.
- * null-terminated and containing only complete characters.
+ * null-terminated and containing only complete code points.
  *
  * @param dest  Destination buffer.
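
The size/length distinction that the reworded comments rely on can be illustrated with a small standalone sketch. It is not the code from str.c: utf8_size() and utf8_code_points() are hypothetical helpers, and the input is assumed to be well-formed, NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of str.c): size of a string in BYTES,
 * excluding the terminator.
 */
static size_t utf8_size(const char *s)
{
	assert(s != NULL);
	return strlen(s);
}

/*
 * Hypothetical helper (not part of str.c): number of CODE POINTS in a
 * string, excluding the terminator. Assumes well-formed UTF-8, where
 * every code point contributes exactly one byte that is not a
 * continuation byte (bit pattern 10xxxxxx).
 */
static size_t utf8_code_points(const char *s)
{
	assert(s != NULL);

	size_t count = 0;
	for (; *s != '\0'; s++) {
		if (((unsigned char) *s & 0xC0) != 0x80)
			count++;
	}
	return count;
}
```

For the string "a\xC3\xA9\xE2\x82\xAC" (U+0061, U+00E9, U+20AC), utf8_size() returns 6 while utf8_code_points() returns 3. Neither number says how many user-perceived characters are displayed, which is presumably why the comments now say "code point" instead of "character".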