Changeset 1d2f85e in mainline for kernel/generic/src/lib/str.c


Timestamp:
2019-02-05T18:26:54Z
Author:
Jiří Zárevúcky <zarevucky.jiri@…>
Parents:
08e103d4
Message:

Change documentation of <str.h> functions to use unambiguous terms

File:
1 edited

  • kernel/generic/src/lib/str.c

    r08e103d4 r1d2f85e  
    4141 * Strings and characters use the Universal Character Set (UCS). The standard
    4242 * strings, called just strings are encoded in UTF-8. Wide strings (encoded
    43  * in UTF-32) are supported to a limited degree. A single character is
     43 * in UTF-32) are supported to a limited degree. A single code point is
    4444 * represented as wchar_t.@n
    4545 *
     
    5050 *  byte                  8 bits stored in uint8_t (unsigned 8 bit integer)
    5151 *
    52  *  character             UTF-32 encoded Unicode character, stored in wchar_t
     52 *  character             UTF-32 encoded Unicode code point, stored in wchar_t
    5353 *                        (signed 32 bit integer), code points 0 .. 1114111
    5454 *                        are valid
     
    6666 *                        the NULL-terminator), size_t
    6767 *
    68  *  [wide] string length  number of CHARACTERS in a [wide] string (excluding
     68 *  [wide] string length  number of CODE POINTS in a [wide] string (excluding
    6969 *                        the NULL-terminator), size_t
    7070 *
     
    8080 *                            NULL-terminator)
    8181 *
    82  *  length  l        size_t   number of CHARACTERS in a string (excluding the
     82 *  length  l        size_t   number of CODE POINTS in a string (excluding the
    8383 *                            null terminator)
    8484 *
     
    8989 * Function naming prefixes:@n
    9090 *
    91  *  chr_    operate on characters
     91 *  chr_    operate on code points
    9292 *  ascii_  operate on ASCII characters
    9393 *  str_    operate on strings
     
    102102 *  pointer (char *, wchar_t *)
    103103 *  byte offset (size_t)
    104  *  character index (size_t)
     104 *  code point index (size_t)
    105105 *
    106106 */
     
    137137#define CONT_BITS  6
    138138
    139 /** Decode a single character from a string.
    140  *
    141  * Decode a single character from a string of size @a size. Decoding starts
     139/** Decode a single code point from an UTF-8 encoded string.
     140 *
     141 * Decode a single code point from a string of size @a size. Decoding starts
    142142 * at @a offset and this offset is moved to the beginning of the next
    143  * character. In case of decoding error, offset generally advances at least
     143 * code point. In case of decoding error, offset generally advances at least
    144144 * by one. However, offset is never moved beyond size.
    145145 *
     
    148148 * @param size   Size of the string (in bytes).
    149149 *
    150  * @return Value of decoded character, U_SPECIAL on decoding error or
     150 * @return Value of decoded code point, U_SPECIAL on decoding error or
    151151 *         NULL if attempt to decode beyond @a size.
    152152 *
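
The decoding contract documented in this hunk (start at a byte offset, advance it to the next code point, never move it past size) can be illustrated with a minimal sketch. This is not the code from str.c: the name sketch_decode and the placeholder SKETCH_U_SPECIAL (with an assumed value of '?') are hypothetical, and the sketch does not reject overlong encodings or surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for U_SPECIAL; the real value is an assumption. */
#define SKETCH_U_SPECIAL ((uint32_t) '?')

/* Decode one code point from UTF-8 bytes str[*offset .. size). */
uint32_t sketch_decode(const char *str, size_t *offset, size_t size)
{
	/* Attempt to decode beyond size: return 0, offset stays put. */
	if (*offset >= size)
		return 0;

	uint8_t b0 = (uint8_t) str[(*offset)++];
	unsigned cont;  /* number of continuation bytes still expected */
	uint32_t cp;

	if ((b0 & 0x80) == 0x00) {
		cp = b0;
		cont = 0;
	} else if ((b0 & 0xE0) == 0xC0) {
		cp = b0 & 0x1F;
		cont = 1;
	} else if ((b0 & 0xF0) == 0xE0) {
		cp = b0 & 0x0F;
		cont = 2;
	} else if ((b0 & 0xF8) == 0xF0) {
		cp = b0 & 0x07;
		cont = 3;
	} else {
		/* Stray continuation byte or invalid lead byte. */
		return SKETCH_U_SPECIAL;
	}

	while (cont > 0) {
		if (*offset >= size)
			return SKETCH_U_SPECIAL;  /* truncated sequence */

		uint8_t b = (uint8_t) str[*offset];
		if ((b & 0xC0) != 0x80)
			return SKETCH_U_SPECIAL;  /* leave b for the next call */

		(*offset)++;
		cp = (cp << 6) | (b & 0x3F);
		cont--;
	}

	return cp;
}
```

On error the offset has already moved past the lead byte, so a caller looping over a buffer always makes progress.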
     
    207207}
    208208
    209 /** Encode a single character to string representation.
    210  *
    211  * Encode a single character to string representation (i.e. UTF-8) and store
     209/** Encode a single code point to a UTF-8 string representation.
     210 *
     211 * Encode a single code point to a UTF-8 string representation and store
    212212 * it into a buffer at @a offset. Encoding starts at @a offset and this offset
    213  * is moved to the position where the next character can be written to.
    214  *
    215  * @param ch     Input character.
     213 * is moved to the position where the next code point can be written to.
     214 *
     215 * @param ch     Input code point.
    216216 * @param str    Output buffer.
    217217 * @param offset Byte offset where to start writing.
    218218 * @param size   Size of the output buffer (in bytes).
    219219 *
    220  * @return EOK if the character was encoded successfully, EOVERFLOW if there
    221  *         was not enough space in the output buffer or EINVAL if the character
     220 * @return EOK if the code point was encoded successfully, EOVERFLOW if there
     221 *         was not enough space in the output buffer or EINVAL if the code point
    222222 *         code was invalid.
    223223 */
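
The encoding side of the same contract can be sketched as follows. The function name sketch_encode and the SKETCH_* error constants are hypothetical stand-ins for the EOK, EOVERFLOW and EINVAL results named in the comment; the valid range 0 .. 1114111 (0x10FFFF) comes from the file header. The sketch does not reject surrogate code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes standing in for EOK, EOVERFLOW, EINVAL. */
enum { SKETCH_EOK = 0, SKETCH_EOVERFLOW = 1, SKETCH_EINVAL = 2 };

/* Encode cp as UTF-8 into str at *offset; size is the buffer size in bytes. */
int sketch_encode(uint32_t cp, char *str, size_t *offset, size_t size)
{
	if (cp > 0x10FFFF)
		return SKETCH_EINVAL;

	/* Number of bytes the encoded form needs. */
	size_t len;
	if (cp < 0x80)
		len = 1;
	else if (cp < 0x800)
		len = 2;
	else if (cp < 0x10000)
		len = 3;
	else
		len = 4;

	if (*offset >= size || size - *offset < len)
		return SKETCH_EOVERFLOW;

	unsigned char *out = (unsigned char *) str + *offset;

	switch (len) {
	case 1:
		out[0] = (unsigned char) cp;
		break;
	case 2:
		out[0] = (unsigned char) (0xC0 | (cp >> 6));
		out[1] = (unsigned char) (0x80 | (cp & 0x3F));
		break;
	case 3:
		out[0] = (unsigned char) (0xE0 | (cp >> 12));
		out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char) (0x80 | (cp & 0x3F));
		break;
	default:
		out[0] = (unsigned char) (0xF0 | (cp >> 18));
		out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char) (0x80 | (cp & 0x3F));
		break;
	}

	/* Move the offset to where the next code point can be written. */
	*offset += len;
	return SKETCH_EOK;
}
```

The offset is only updated on success, so a failed call leaves the buffer position unchanged.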
     
    313313}
    314314
    315 /** Get size of string with length limit.
     315/** Get size of string with code point count limit.
    316316 *
    317317 * Get the number of bytes which are used by up to @a max_len first
    318  * characters in the string @a str. If @a max_len is greater than
    319  * the length of @a str, the entire string is measured (excluding the
    320  * NULL-terminator).
     318 * code points in the string @a str. If @a max_len is greater than
     319 * the number of code points in @a str, the entire string is measured
     320 * (excluding the NULL-terminator).
    321321 *
    322322 * @param str     String to consider.
    323  * @param max_len Maximum number of characters to measure.
    324  *
    325  * @return Number of bytes used by the characters.
     323 * @param max_len Maximum number of code points to measure.
     324 *
     325 * @return Number of bytes used by the code points.
    326326 *
    327327 */
     
    344344 *
    345345 * Get the number of bytes which are used by up to @a max_len first
    346  * wide characters in the wide string @a str. If @a max_len is greater than
     346 * code points in the wide string @a str. If @a max_len is greater than
    347347 * the length of @a str, the entire wide string is measured (excluding the
    348348 * NULL-terminator).
    349349 *
    350350 * @param str     Wide string to consider.
    351  * @param max_len Maximum number of wide characters to measure.
    352  *
    353  * @return Number of bytes used by the wide characters.
     351 * @param max_len Maximum number of code points to measure.
     352 *
     353 * @return Number of bytes used by the code points.
    354354 *
    355355 */
     
    359359}
    360360
    361 /** Get number of characters in a string.
    362  *
    363  * @param str NULL-terminated string.
    364  *
    365  * @return Number of characters in string.
     361/** Get number of unicode code points in a UTF-8 encoded string.
     362 *
     363 * @param str NULL-terminated UTF-8 string.
     364 *
     365 * @return Number of code points in the string.
    366366 *
    367367 */
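
The difference between size (bytes) and length (code points) that these comments pin down can be shown with a small sketch. The names sketch_length and sketch_lsize are hypothetical, and the sketch assumes the input is already well-formed UTF-8, so it simply counts bytes that are not continuation bytes instead of decoding.

```c
#include <stddef.h>

/* Number of code points in a NUL-terminated, well-formed UTF-8 string. */
size_t sketch_length(const char *str)
{
	size_t n = 0;

	for (; *str != '\0'; str++) {
		/* Every byte that is not 10xxxxxx starts a new code point. */
		if (((unsigned char) *str & 0xC0) != 0x80)
			n++;
	}

	return n;
}

/* Number of bytes used by up to max_len first code points of str. */
size_t sketch_lsize(const char *str, size_t max_len)
{
	size_t bytes = 0;
	size_t seen = 0;

	while (str[bytes] != '\0') {
		if (((unsigned char) str[bytes] & 0xC0) != 0x80) {
			/* A new code point starts here; stop if the limit is hit. */
			if (seen == max_len)
				break;
			seen++;
		}
		bytes++;
	}

	return bytes;
}
```

For a string such as "h\xC3\xA9llo" the byte size is 6 while the code point count is 5, which is the distinction the renamed terms are meant to make explicit.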
     
    377377}
    378378
    379 /** Get number of characters in a wide string.
     379/** Get number of code points in a wide string.
    380380 *
    381381 * @param str NULL-terminated wide string.
    382382 *
    383  * @return Number of characters in @a str.
     383 * @return Number of code points in @a str.
    384384 *
    385385 */
     
    394394}
    395395
    396 /** Get number of characters in a string with size limit.
     396/** Get number of code points in a string with size limit.
    397397 *
    398398 * @param str  NULL-terminated string.
    399399 * @param size Maximum number of bytes to consider.
    400400 *
    401  * @return Number of characters in string.
     401 * @return Number of code points in string.
    402402 *
    403403 */
     
    413413}
    414414
    415 /** Get number of characters in a string with size limit.
     415/** Get number of code points in a string with size limit.
    416416 *
    417417 * @param str  NULL-terminated string.
    418418 * @param size Maximum number of bytes to consider.
    419419 *
    420  * @return Number of characters in string.
     420 * @return Number of code points in string.
    421421 *
    422422 */
     
    435435}
    436436
    437 /** Check whether character is plain ASCII.
    438  *
    439  * @return True if character is plain ASCII.
     437/** Check whether code point is plain ASCII.
     438 *
     439 * @return True if code point is plain ASCII.
    440440 *
    441441 */
     
    448448}
    449449
    450 /** Check whether character is valid
    451  *
    452  * @return True if character is a valid Unicode code point.
     450/** Check whether code point is valid
     451 *
     452 * @return True if code point is a valid Unicode code point.
    453453 *
    454454 */
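
As a minimal sketch of the two checks documented above, assuming a signed 32-bit code point type as described in the file header: the names sketch_ascii_check and sketch_chr_check are hypothetical, and the upper bound 1114111 is the one stated in the header comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the code point is plain ASCII (0 .. 127). */
bool sketch_ascii_check(int32_t cp)
{
	return cp >= 0 && cp <= 127;
}

/* True if the value lies in the Unicode code point range 0 .. 1114111. */
bool sketch_chr_check(int32_t cp)
{
	return cp >= 0 && cp <= 1114111;
}
```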
     
    465465 * Do a char-by-char comparison of two NULL-terminated strings.
    466466 * The strings are considered equal iff their length is equal
    467  * and both strings consist of the same sequence of characters.
    468  *
    469  * A string S1 is less than another string S2 if it has a character with
    470  * lower value at the first character position where the strings differ.
     467 * and both strings consist of the same sequence of code points.
     468 *
     469 * A string S1 is less than another string S2 if it has a code point with
     470 * lower value at the first code point position where the strings differ.
    471471 * If the strings differ in length, the shorter one is treated as if
    472  * padded by characters with a value of zero.
     472 * padded by code points with a value of zero.
    473473 *
    474474 * @param s1 First string to compare.
     
    509509 * The strings are considered equal iff
    510510 * min(str_code_points(s1), max_len) == min(str_code_points(s2), max_len)
    511  * and both strings consist of the same sequence of characters,
    512  * up to max_len characters.
    513  *
    514  * A string S1 is less than another string S2 if it has a character with
    515  * lower value at the first character position where the strings differ.
     511 * and both strings consist of the same sequence of code points,
     512 * up to max_len code points.
     513 *
     514 * A string S1 is less than another string S2 if it has a code point with
     515 * lower value at the first code point position where the strings differ.
    516516 * If the strings differ in length, the shorter one is treated as if
    517  * padded by characters with a value of zero. Only the first max_len
    518  * characters are considered.
     517 * padded by code points with a value of zero. Only the first max_len
     518 * code points are considered.
    519519 *
    520520 * @param s1      First string to compare.
    521521 * @param s2      Second string to compare.
    522  * @param max_len Maximum number of characters to consider.
     522 * @param max_len Maximum number of code points to consider.
    523523 *
    524524 * @return 0 if the strings are equal, -1 if the first is less than the second,
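
The ordering rule described here (compare code point by code point, treat the shorter string as zero-padded, look at no more than max_len code points) can be sketched on already-decoded input. The function sketch_lcmp is hypothetical and takes zero-terminated arrays of code points rather than UTF-8 strings, which keeps the comparison logic separate from decoding.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compare two zero-terminated code point sequences, considering at most
 * max_len code points. Returns 0 if equal, -1 if s1 < s2, 1 if s1 > s2.
 */
int sketch_lcmp(const uint32_t *s1, const uint32_t *s2, size_t max_len)
{
	for (size_t i = 0; i < max_len; i++) {
		uint32_t c1 = s1[i];
		uint32_t c2 = s2[i];

		/*
		 * Reading the terminator of the shorter sequence as an ordinary
		 * zero value gives the "padded by zero" behaviour for free.
		 */
		if (c1 < c2)
			return -1;
		if (c1 > c2)
			return 1;

		/* Both sequences ended at the same position. */
		if (c1 == 0)
			return 0;
	}

	return 0;
}
```

Because the loop returns as soon as either terminator is seen, it never reads past the end of the shorter sequence.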
     
    564564 * No more than @a size bytes are written. If the size of the output buffer
    565565 * is at least one byte, the output string will always be well-formed, i.e.
    566  * null-terminated and containing only complete characters.
     566 * null-terminated and containing only complete code points.
    567567 *
    568568 * @param dest  Destination buffer.
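
The guarantee stated above (never more than size bytes written, output always null-terminated and never ending in a partial code point) can be sketched as a truncating copy. The name sketch_cpy is hypothetical, and the sketch assumes a well-formed UTF-8 source and copies bytes directly, which may differ from what str.c actually does.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dest (size bytes), truncating on a code point boundary. */
void sketch_cpy(char *dest, size_t size, const char *src)
{
	if (size == 0)
		return;

	/* Take as many bytes as fit, leaving room for the terminator. */
	size_t n = 0;
	while (src[n] != '\0' && n < size - 1)
		n++;

	/*
	 * If the cut would fall inside a multi-byte sequence, src[n] is a
	 * continuation byte; back up until the cut sits on a lead byte.
	 */
	while (n > 0 && src[n] != '\0' &&
	    ((unsigned char) src[n] & 0xC0) == 0x80)
		n--;

	memcpy(dest, src, n);
	dest[n] = '\0';
}
```

With a four-byte buffer and a source of two two-byte code points, only the first code point is copied, since the second would not fit in full.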
     
    594594 * @a dest. No more than @a size bytes are written. The output string will
    595595 * always be well-formed, i.e. null-terminated and containing only complete
    596  * characters.
     596 * code points.
    597597 *
    598598 * No more than @a n bytes are read from the input string, so it does not
     
    652652}
    653653
    654 /** Find first occurence of character in string.
     654/** Find first occurence of code point in string.
    655655 *
    656656 * @param str String to search.
    657  * @param ch  Character to look for.
    658  *
    659  * @return Pointer to character in @a str or NULL if not found.
     657 * @param ch  code point to look for.
     658 *
     659 * @return Pointer to code point in @a str or NULL if not found.
    660660 */
    661661char *str_chr(const char *str, wchar_t ch)
     
    674674}
    675675
    676 /** Insert a wide character into a wide string.
    677  *
    678  * Insert a wide character into a wide string at position
    679  * @a pos. The characters after the position are shifted.
     676/** Insert a code point into a wide string.
     677 *
     678 * Insert a code point into a wide string at position
     679 * @a pos. The code points after the position are shifted.
    680680 *
    681681 * @param str     String to insert to.
    682  * @param ch      Character to insert to.
    683  * @param pos     Character index where to insert.
    684  * @param max_pos Characters in the buffer.
     682 * @param ch      Code point to insert.
     683 * @param pos     Code point index where to insert.
     684 * @param max_pos Number of code points that fit in the buffer.
    685685 *
    686686 * @return True if the insertion was sucessful, false if the position
     
    704704}
    705705
    706 /** Remove a wide character from a wide string.
    707  *
    708  * Remove a wide character from a wide string at position
    709  * @a pos. The characters after the position are shifted.
     706/** Remove a code point from a wide string.
     707 *
     708 * Remove a code point from a wide string at position
     709 * @a pos. The code points after the position are shifted.
    710710 *
    711711 * @param str String to remove from.
    712  * @param pos Character index to remove.
     712 * @param pos Code point index to remove.
    713713 *
    714714 * @return True if the removal was sucessful, false if the position
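
Insertion and removal by code point index in a wide string can be sketched as below. The names sketch_winsert and sketch_wremove are hypothetical, and the sketch assumes the buffer has room for max_pos code points plus the terminator; the exact bounds check in str.c may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of code points before the zero terminator. */
static size_t sketch_wlen(const uint32_t *str)
{
	size_t len = 0;
	while (str[len] != 0)
		len++;
	return len;
}

/* Insert ch at index pos, shifting the tail (and terminator) right. */
bool sketch_winsert(uint32_t *str, uint32_t ch, size_t pos, size_t max_pos)
{
	size_t len = sketch_wlen(str);

	if (pos > len || len + 1 > max_pos)
		return false;

	for (size_t i = len + 1; i > pos; i--)
		str[i] = str[i - 1];

	str[pos] = ch;
	return true;
}

/* Remove the code point at index pos, shifting the tail left. */
bool sketch_wremove(uint32_t *str, size_t pos)
{
	size_t len = sketch_wlen(str);

	if (pos >= len)
		return false;

	/* The last iteration copies the terminator into place. */
	for (size_t i = pos; i < len; i++)
		str[i] = str[i + 1];

	return true;
}
```

Since each element of a wide string holds exactly one code point, the index is a plain array index and no decoding is needed.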
     
    732732/** Duplicate string.
    733733 *
    734  * Allocate a new string and copy characters from the source
    735  * string into it. The duplicate string is allocated via sleeping
    736  * malloc(), thus this function can sleep in no memory conditions.
    737  *
    738  * The allocation cannot fail and the return value is always
    739  * a valid pointer. The duplicate string is always a well-formed
     734 * Allocate a new string and copy the contents of the source string into it.
     735 * The duplicate string is allocated as if by malloc().
     736 *
     737 * If successful, the duplicate string is always a well-formed
    740738 * null-terminated UTF-8 string, but it can differ from the source
    741739 * string on the byte level.
     
    743741 * @param src Source string.
    744742 *
    745  * @return Duplicate string.
     743 * @return Duplicate string, or NULL if allocation failed.
    746744 *
    747745 */
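
With the documentation now saying that the duplicate is allocated as if by malloc() and that NULL is returned on failure, callers have to check the result. A minimal sketch of that contract, using the hypothetical name sketch_dup and a plain byte copy (the function in str.c may produce a result that differs from the source on the byte level):

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate src; returns NULL if the allocation fails. */
char *sketch_dup(const char *src)
{
	size_t size = strlen(src) + 1;  /* include the terminator */

	char *dest = malloc(size);
	if (dest == NULL)
		return NULL;

	memcpy(dest, src, size);
	return dest;
}
```

A caller would test the returned pointer against NULL before use and release it with free() when done.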
     
    760758 *
    761759 * Allocate a new string and copy up to @max_size bytes from the source
    762  * string into it. The duplicate string is allocated via sleeping
    763  * malloc(), thus this function can sleep in no memory conditions.
     760 * string into it. The duplicate string is allocated as if by malloc().
    764761 * No more than @max_size + 1 bytes is allocated, but if the size
    765762 * occupied by the source string is smaller than @max_size + 1,
    766763 * less is allocated.
    767764 *
    768  * The allocation cannot fail and the return value is always
    769  * a valid pointer. The duplicate string is always a well-formed
     765 * If successful, the duplicate string is always a well-formed
    770766 * null-terminated UTF-8 string, but it can differ from the source
    771767 * string on the byte level.