Changeset 1d2f85e in mainline


Timestamp:
2019-02-05T18:26:54Z
Author:
Jiří Zárevúcky <zarevucky.jiri@…>
Parents:
08e103d4
Message:

Change documentation of <str.h> functions to use unambiguous terms

Files:
5 edited

  • boot/generic/src/str.c

    r08e103d4 r1d2f85e  
    3737 * Strings and characters use the Universal Character Set (UCS). The standard
    3838 * strings, called just strings are encoded in UTF-8. Wide strings (encoded
    39  * in UTF-32) are supported to a limited degree. A single character is
     39 * in UTF-32) are supported to a limited degree. A single code point is
    4040 * represented as wchar_t.@n
    4141 *
     
    4646 *  byte                  8 bits stored in uint8_t (unsigned 8 bit integer)
    4747 *
    48  *  character             UTF-32 encoded Unicode character, stored in wchar_t
     48 *  character             UTF-32 encoded Unicode code point, stored in wchar_t
    4949 *                        (signed 32 bit integer), code points 0 .. 1114111
    5050 *                        are valid
     
    6262 *                        the NULL-terminator), size_t
    6363 *
    64  *  [wide] string length  number of CHARACTERS in a [wide] string (excluding
     64 *  [wide] string length  number of CODE POINTS in a [wide] string (excluding
    6565 *                        the NULL-terminator), size_t
    6666 *
     
    7676 *                            NULL-terminator)
    7777 *
    78  *  length  l        size_t   number of CHARACTERS in a string (excluding the
     78 *  length  l        size_t   number of CODE POINTS in a string (excluding the
    7979 *                            null terminator)
    8080 *
     
    8585 * Function naming prefixes:@n
    8686 *
    87  *  chr_    operate on characters
     87 *  chr_    operate on code points
    8888 *  ascii_  operate on ASCII characters
    8989 *  str_    operate on strings
     
    9898 *  pointer (char *, wchar_t *)
    9999 *  byte offset (size_t)
    100  *  character index (size_t)
     100 *  code point index (size_t)
    101101 *
    102102 */
     
    128128#define CONT_BITS  6
    129129
    130 /** Decode a single character from a string.
    131  *
    132  * Decode a single character from a string of size @a size. Decoding starts
     130/** Decode a single code point from an UTF-8 encoded string.
     131 *
     132 * Decode a single code point from a string of size @a size. Decoding starts
    133133 * at @a offset and this offset is moved to the beginning of the next
    134  * character. In case of decoding error, offset generally advances at least
     134 * code point. In case of decoding error, offset generally advances at least
    135135 * by one. However, offset is never moved beyond size.
    136136 *
     
    139139 * @param size   Size of the string (in bytes).
    140140 *
    141  * @return Value of decoded character, U_SPECIAL on decoding error or
     141 * @return Value of decoded code point, U_SPECIAL on decoding error or
    142142 *         NULL if attempt to decode beyond @a size.
    143143 *
     
    198198}
    199199
    200 /** Encode a single character to string representation.
    201  *
    202  * Encode a single character to string representation (i.e. UTF-8) and store
     200/** Encode a single code point to a UTF-8 string representation.
     201 *
     202 * Encode a single code point to a UTF-8 string representation and store
    203203 * it into a buffer at @a offset. Encoding starts at @a offset and this offset
    204  * is moved to the position where the next character can be written to.
    205  *
    206  * @param ch     Input character.
     204 * is moved to the position where the next code point can be written to.
     205 *
     206 * @param ch     Input code point.
    207207 * @param str    Output buffer.
    208208 * @param offset Byte offset where to start writing.
    209209 * @param size   Size of the output buffer (in bytes).
    210210 *
    211  * @return EOK if the character was encoded successfully, EOVERFLOW if there
    212  *         was not enough space in the output buffer or EINVAL if the character
     211 * @return EOK if the code point was encoded successfully, EOVERFLOW if there
     212 *         was not enough space in the output buffer or EINVAL if the code point
    213213 *         code was invalid.
    214214 */
     
    289289}
    290290
    291 /** Get size of string with length limit.
     291/** Get size of string with code point count limit.
    292292 *
    293293 * Get the number of bytes which are used by up to @a max_len first
    294  * characters in the string @a str. If @a max_len is greater than
    295  * the length of @a str, the entire string is measured (excluding the
    296  * NULL-terminator).
     294 * code points in the string @a str. If @a max_len is greater than
     295 * the number of code points in @a str, the entire string is measured
     296 * (excluding the NULL-terminator).
    297297 *
    298298 * @param str     String to consider.
    299  * @param max_len Maximum number of characters to measure.
    300  *
    301  * @return Number of bytes used by the characters.
     299 * @param max_len Maximum number of code points to measure.
     300 *
     301 * @return Number of bytes used by the code points.
    302302 *
    303303 */
     
    317317}
    318318
    319 /** Get number of characters in a string.
    320  *
    321  * @param str NULL-terminated string.
    322  *
    323  * @return Number of characters in string.
     319/** Get number of unicode code points in a UTF-8 encoded string.
     320 *
     321 * @param str NULL-terminated UTF-8 string.
     322 *
     323 * @return Number of code points in the string.
    324324 *
    325325 */
     
    335335}
    336336
    337 /** Check whether character is plain ASCII.
    338  *
    339  * @return True if character is plain ASCII.
     337/** Check whether code point is plain ASCII.
     338 *
     339 * @return True if code point is plain ASCII.
    340340 *
    341341 */
     
    348348}
    349349
    350 /** Check whether character is valid
    351  *
    352  * @return True if character is a valid Unicode code point.
     350/** Check whether code point is valid
     351 *
     352 * @return True if code point is a valid Unicode code point.
    353353 *
    354354 */
     
    365365 * Do a char-by-char comparison of two NULL-terminated strings.
    366366 * The strings are considered equal iff their length is equal
    367  * and both strings consist of the same sequence of characters.
    368  *
    369  * A string S1 is less than another string S2 if it has a character with
    370  * lower value at the first character position where the strings differ.
     367 * and both strings consist of the same sequence of code points.
     368 *
     369 * A string S1 is less than another string S2 if it has a code point with
     370 * lower value at the first code point position where the strings differ.
    371371 * If the strings differ in length, the shorter one is treated as if
    372  * padded by characters with a value of zero.
     372 * padded by code points with a value of zero.
    373373 *
    374374 * @param s1 First string to compare.
     
    409409 * No more than @a size bytes are written. If the size of the output buffer
    410410 * is at least one byte, the output string will always be well-formed, i.e.
    411  * null-terminated and containing only complete characters.
     411 * null-terminated and containing only complete code points.
    412412 *
    413413 * @param dest  Destination buffer.
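
Note on the terminology used in this file: the comments above distinguish the size of a string (bytes) from its length (code points). The following is a minimal sketch of that distinction in plain C. The helper `count_code_points` is hypothetical and is not part of <str.h>; it assumes well-formed, NUL-terminated UTF-8 input and simply counts every byte that is not a continuation byte.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a <str.h> function): count the code points in a
 * well-formed, NUL-terminated UTF-8 string. Every code point starts with
 * exactly one byte that is NOT of the form 10xxxxxx, so counting those
 * lead bytes gives the code point count.
 */
static size_t count_code_points(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++) {
		if (((unsigned char) *s & 0xC0) != 0x80)
			n++;
	}

	return n;
}

/*
 * Hypothetical helper: size in bytes, excluding the NUL terminator.
 * For a NUL-terminated string this is just strlen().
 */
static size_t size_in_bytes(const char *s)
{
	return strlen(s);
}
```

For the byte sequence "Ji\xC5\x99\xC3\xAD" (the name "Jiří"), the size is 6 bytes and the length is 4 code points. For "e\xCC\x81" (the letter e followed by U+0301, a combining acute accent), the size is 3 bytes and the length is 2 code points, although a reader sees one character. That second case is the kind of ambiguity that the term "character" invites and "code point" avoids.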
  • kernel/generic/include/str.h

    r08e103d4 r1d2f85e  
    6666#define STR_NO_LIMIT  ((size_t) -1)
    6767
    68 /** Maximum size of a string containing @c length characters */
     68/** Maximum size of a string containing @c length code points */
    6969#define STR_BOUNDS(length)  ((length) << 2)
    7070
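
Note on STR_BOUNDS: the shift by 2 multiplies the code point count by 4, which matches the fact that UTF-8 needs at most 4 bytes for any code point in the range 0 .. 1114111 (0x10FFFF). The sketch below illustrates this with a hypothetical helper, `utf8_size`, that is not part of the HelenOS sources; it ignores the surrogate range for brevity. The macro is copied from the hunk above.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from kernel/generic/include/str.h as shown in the hunk above. */
#define STR_BOUNDS(length)  ((length) << 2)

/*
 * Hypothetical helper: number of bytes UTF-8 uses to encode one code point,
 * or 0 if the value lies outside 0 .. 0x10FFFF.
 */
static size_t utf8_size(uint32_t cp)
{
	if (cp < 0x80)
		return 1;   /* 0xxxxxxx */
	if (cp < 0x800)
		return 2;   /* 110xxxxx 10xxxxxx */
	if (cp < 0x10000)
		return 3;   /* 1110xxxx 10xxxxxx 10xxxxxx */
	if (cp <= 0x10FFFF)
		return 4;   /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */

	return 0;
}
```

Because no code point needs more than 4 bytes, STR_BOUNDS(n) is an upper bound on the size of any string of n code points. The documentation in this changeset defines length as excluding the terminator, so a caller sizing a buffer would presumably add one byte for it.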
  • kernel/generic/src/lib/str.c

    r08e103d4 r1d2f85e  
    4141 * Strings and characters use the Universal Character Set (UCS). The standard
    4242 * strings, called just strings are encoded in UTF-8. Wide strings (encoded
    43  * in UTF-32) are supported to a limited degree. A single character is
     43 * in UTF-32) are supported to a limited degree. A single code point is
    4444 * represented as wchar_t.@n
    4545 *
     
    5050 *  byte                  8 bits stored in uint8_t (unsigned 8 bit integer)
    5151 *
    52  *  character             UTF-32 encoded Unicode character, stored in wchar_t
     52 *  character             UTF-32 encoded Unicode code point, stored in wchar_t
    5353 *                        (signed 32 bit integer), code points 0 .. 1114111
    5454 *                        are valid
     
    6666 *                        the NULL-terminator), size_t
    6767 *
    68  *  [wide] string length  number of CHARACTERS in a [wide] string (excluding
     68 *  [wide] string length  number of CODE POINTS in a [wide] string (excluding
    6969 *                        the NULL-terminator), size_t
    7070 *
     
    8080 *                            NULL-terminator)
    8181 *
    82  *  length  l        size_t   number of CHARACTERS in a string (excluding the
     82 *  length  l        size_t   number of CODE POINTS in a string (excluding the
    8383 *                            null terminator)
    8484 *
     
    8989 * Function naming prefixes:@n
    9090 *
    91  *  chr_    operate on characters
     91 *  chr_    operate on code points
    9292 *  ascii_  operate on ASCII characters
    9393 *  str_    operate on strings
     
    102102 *  pointer (char *, wchar_t *)
    103103 *  byte offset (size_t)
    104  *  character index (size_t)
     104 *  code point index (size_t)
    105105 *
    106106 */
     
    137137#define CONT_BITS  6
    138138
    139 /** Decode a single character from a string.
    140  *
    141  * Decode a single character from a string of size @a size. Decoding starts
     139/** Decode a single code point from an UTF-8 encoded string.
     140 *
     141 * Decode a single code point from a string of size @a size. Decoding starts
    142142 * at @a offset and this offset is moved to the beginning of the next
    143  * character. In case of decoding error, offset generally advances at least
     143 * code point. In case of decoding error, offset generally advances at least
    144144 * by one. However, offset is never moved beyond size.
    145145 *
     
    148148 * @param size   Size of the string (in bytes).
    149149 *
    150  * @return Value of decoded character, U_SPECIAL on decoding error or
     150 * @return Value of decoded code point, U_SPECIAL on decoding error or
    151151 *         NULL if attempt to decode beyond @a size.
    152152 *
     
    207207}
    208208
    209 /** Encode a single character to string representation.
    210  *
    211  * Encode a single character to string representation (i.e. UTF-8) and store
     209/** Encode a single code point to a UTF-8 string representation.
     210 *
     211 * Encode a single code point to a UTF-8 string representation and store
    212212 * it into a buffer at @a offset. Encoding starts at @a offset and this offset
    213  * is moved to the position where the next character can be written to.
    214  *
    215  * @param ch     Input character.
     213 * is moved to the position where the next code point can be written to.
     214 *
     215 * @param ch     Input code point.
    216216 * @param str    Output buffer.
    217217 * @param offset Byte offset where to start writing.
    218218 * @param size   Size of the output buffer (in bytes).
    219219 *
    220  * @return EOK if the character was encoded successfully, EOVERFLOW if there
    221  *         was not enough space in the output buffer or EINVAL if the character
     220 * @return EOK if the code point was encoded successfully, EOVERFLOW if there
     221 *         was not enough space in the output buffer or EINVAL if the code point
    222222 *         code was invalid.
    223223 */
     
    313313}
    314314
    315 /** Get size of string with length limit.
     315/** Get size of string with code point count limit.
    316316 *
    317317 * Get the number of bytes which are used by up to @a max_len first
    318  * characters in the string @a str. If @a max_len is greater than
    319  * the length of @a str, the entire string is measured (excluding the
    320  * NULL-terminator).
     318 * code points in the string @a str. If @a max_len is greater than
     319 * the number of code points in @a str, the entire string is measured
     320 * (excluding the NULL-terminator).
    321321 *
    322322 * @param str     String to consider.
    323  * @param max_len Maximum number of characters to measure.
    324  *
    325  * @return Number of bytes used by the characters.
     323 * @param max_len Maximum number of code points to measure.
     324 *
     325 * @return Number of bytes used by the code points.
    326326 *
    327327 */
     
    344344 *
    345345 * Get the number of bytes which are used by up to @a max_len first
    346  * wide characters in the wide string @a str. If @a max_len is greater than
     346 * code points in the wide string @a str. If @a max_len is greater than
    347347 * the length of @a str, the entire wide string is measured (excluding the
    348348 * NULL-terminator).
    349349 *
    350350 * @param str     Wide string to consider.
    351  * @param max_len Maximum number of wide characters to measure.
    352  *
    353  * @return Number of bytes used by the wide characters.
     351 * @param max_len Maximum number of code points to measure.
     352 *
     353 * @return Number of bytes used by the code points.
    354354 *
    355355 */
     
    359359}
    360360
    361 /** Get number of characters in a string.
    362  *
    363  * @param str NULL-terminated string.
    364  *
    365  * @return Number of characters in string.
     361/** Get number of unicode code points in a UTF-8 encoded string.
     362 *
     363 * @param str NULL-terminated UTF-8 string.
     364 *
     365 * @return Number of code points in the string.
    366366 *
    367367 */
     
    377377}
    378378
    379 /** Get number of characters in a wide string.
     379/** Get number of code points in a wide string.
    380380 *
    381381 * @param str NULL-terminated wide string.
    382382 *
    383  * @return Number of characters in @a str.
     383 * @return Number of code points in @a str.
    384384 *
    385385 */
     
    394394}
    395395
    396 /** Get number of characters in a string with size limit.
     396/** Get number of code points in a string with size limit.
    397397 *
    398398 * @param str  NULL-terminated string.
    399399 * @param size Maximum number of bytes to consider.
    400400 *
    401  * @return Number of characters in string.
     401 * @return Number of code points in string.
    402402 *
    403403 */
     
    413413}
    414414
    415 /** Get number of characters in a string with size limit.
     415/** Get number of code points in a string with size limit.
    416416 *
    417417 * @param str  NULL-terminated string.
    418418 * @param size Maximum number of bytes to consider.
    419419 *
    420  * @return Number of characters in string.
     420 * @return Number of code points in string.
    421421 *
    422422 */
     
    435435}
    436436
    437 /** Check whether character is plain ASCII.
    438  *
    439  * @return True if character is plain ASCII.
     437/** Check whether code point is plain ASCII.
     438 *
     439 * @return True if code point is plain ASCII.
    440440 *
    441441 */
     
    448448}
    449449
    450 /** Check whether character is valid
    451  *
    452  * @return True if character is a valid Unicode code point.
     450/** Check whether code point is valid
     451 *
     452 * @return True if code point is a valid Unicode code point.
    453453 *
    454454 */
     
    465465 * Do a char-by-char comparison of two NULL-terminated strings.
    466466 * The strings are considered equal iff their length is equal
    467  * and both strings consist of the same sequence of characters.
    468  *
    469  * A string S1 is less than another string S2 if it has a character with
    470  * lower value at the first character position where the strings differ.
     467 * and both strings consist of the same sequence of code points.
     468 *
     469 * A string S1 is less than another string S2 if it has a code point with
     470 * lower value at the first code point position where the strings differ.
    471471 * If the strings differ in length, the shorter one is treated as if
    472  * padded by characters with a value of zero.
     472 * padded by code points with a value of zero.
    473473 *
    474474 * @param s1 First string to compare.
     
    509509 * The strings are considered equal iff
    510510 * min(str_code_points(s1), max_len) == min(str_code_points(s2), max_len)
    511  * and both strings consist of the same sequence of characters,
    512  * up to max_len characters.
    513  *
    514  * A string S1 is less than another string S2 if it has a character with
    515  * lower value at the first character position where the strings differ.
     511 * and both strings consist of the same sequence of code points,
     512 * up to max_len code points.
     513 *
     514 * A string S1 is less than another string S2 if it has a code point with
     515 * lower value at the first code point position where the strings differ.
    516516 * If the strings differ in length, the shorter one is treated as if
    517  * padded by characters with a value of zero. Only the first max_len
    518  * characters are considered.
     517 * padded by code points with a value of zero. Only the first max_len
     518 * code points are considered.
    519519 *
    520520 * @param s1      First string to compare.
    521521 * @param s2      Second string to compare.
    522  * @param max_len Maximum number of characters to consider.
     522 * @param max_len Maximum number of code points to consider.
    523523 *
    524524 * @return 0 if the strings are equal, -1 if the first is less than the second,
     
    564564 * No more than @a size bytes are written. If the size of the output buffer
    565565 * is at least one byte, the output string will always be well-formed, i.e.
    566  * null-terminated and containing only complete characters.
     566 * null-terminated and containing only complete code points.
    567567 *
    568568 * @param dest  Destination buffer.
     
    594594 * @a dest. No more than @a size bytes are written. The output string will
    595595 * always be well-formed, i.e. null-terminated and containing only complete
    596  * characters.
     596 * code points.
    597597 *
    598598 * No more than @a n bytes are read from the input string, so it does not
     
    652652}
    653653
    654 /** Find first occurence of character in string.
     654/** Find first occurence of code point in string.
    655655 *
    656656 * @param str String to search.
    657  * @param ch  Character to look for.
    658  *
    659  * @return Pointer to character in @a str or NULL if not found.
     657 * @param ch  code point to look for.
     658 *
     659 * @return Pointer to code point in @a str or NULL if not found.
    660660 */
    661661char *str_chr(const char *str, wchar_t ch)
     
    674674}
    675675
    676 /** Insert a wide character into a wide string.
    677  *
    678  * Insert a wide character into a wide string at position
    679  * @a pos. The characters after the position are shifted.
     676/** Insert a code point into a wide string.
     677 *
     678 * Insert a code point into a wide string at position
     679 * @a pos. The code points after the position are shifted.
    680680 *
    681681 * @param str     String to insert to.
    682  * @param ch      Character to insert to.
    683  * @param pos     Character index where to insert.
    684  * @param max_pos Characters in the buffer.
     682 * @param ch      Code point to insert.
     683 * @param pos     Code point index where to insert.
     684 * @param max_pos Number of code points that fit in the buffer.
    685685 *
    686686 * @return True if the insertion was sucessful, false if the position
     
    704704}
    705705
    706 /** Remove a wide character from a wide string.
    707  *
    708  * Remove a wide character from a wide string at position
    709  * @a pos. The characters after the position are shifted.
     706/** Remove a code point from a wide string.
     707 *
     708 * Remove a code point from a wide string at position
     709 * @a pos. The code points after the position are shifted.
    710710 *
    711711 * @param str String to remove from.
    712  * @param pos Character index to remove.
     712 * @param pos Code point index to remove.
    713713 *
    714714 * @return True if the removal was sucessful, false if the position
     
    732732/** Duplicate string.
    733733 *
    734  * Allocate a new string and copy characters from the source
    735  * string into it. The duplicate string is allocated via sleeping
    736  * malloc(), thus this function can sleep in no memory conditions.
    737  *
    738  * The allocation cannot fail and the return value is always
    739  * a valid pointer. The duplicate string is always a well-formed
     734 * Allocate a new string and copy the contents of the source string into it.
     735 * The duplicate string is allocated as if by malloc().
     736 *
     737 * If successful, the duplicate string is always a well-formed
    740738 * null-terminated UTF-8 string, but it can differ from the source
    741739 * string on the byte level.
     
    743741 * @param src Source string.
    744742 *
    745  * @return Duplicate string.
     743 * @return Duplicate string, or NULL if allocation failed.
    746744 *
    747745 */
     
    760758 *
    761759 * Allocate a new string and copy up to @max_size bytes from the source
    762  * string into it. The duplicate string is allocated via sleeping
    763  * malloc(), thus this function can sleep in no memory conditions.
     760 * string into it. The duplicate string is allocated as if by malloc().
    764761 * No more than @max_size + 1 bytes is allocated, but if the size
    765762 * occupied by the source string is smaller than @max_size + 1,
    766763 * less is allocated.
    767764 *
    768  * The allocation cannot fail and the return value is always
    769  * a valid pointer. The duplicate string is always a well-formed
     765 * If successful, the duplicate string is always a well-formed
    770766 * null-terminated UTF-8 string, but it can differ from the source
    771767 * string on the byte level.
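
Note on the changed allocation contract: the previous comments promised that the allocation could not fail, while the new ones document a NULL return on allocation failure, so callers are expected to check the result. The following is a minimal sketch of that calling pattern in standard C. Both functions are hypothetical stand-ins, not the kernel implementation: `dup_or_null` is a plain byte copy using malloc(), and unlike the documented function it never alters the bytes of the source string.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a str_dup()-like function: returns a newly
 * allocated copy of src, or NULL if the allocation failed.
 */
static char *dup_or_null(const char *src)
{
	size_t size = strlen(src) + 1;   /* bytes, including the terminator */
	char *dest = malloc(size);

	if (dest == NULL)
		return NULL;

	memcpy(dest, src, size);
	return dest;
}

/*
 * Hypothetical caller: stores a copy of name in *out and returns 0,
 * or returns -1 and leaves *out untouched if the copy could not be made.
 */
static int set_name(char **out, const char *name)
{
	char *copy = dup_or_null(name);

	if (copy == NULL)
		return -1;   /* propagate the failure instead of assuming success */

	*out = copy;
	return 0;
}
```

The caller owns the returned buffer and releases it with free() in this sketch; the kernel's own allocator and release function may differ.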
  • uspace/lib/c/generic/str.c

    r08e103d4 r1d2f85e  
    4141 * Strings and characters use the Universal Character Set (UCS). The standard
    4242 * strings, called just strings are encoded in UTF-8. Wide strings (encoded
    43  * in UTF-32) are supported to a limited degree. A single character is
     43 * in UTF-32) are supported to a limited degree. A single code point is
    4444 * represented as wchar_t.@n
    4545 *
     
    5050 *  byte                  8 bits stored in uint8_t (unsigned 8 bit integer)
    5151 *
    52  *  character             UTF-32 encoded Unicode character, stored in wchar_t
     52 *  character             UTF-32 encoded Unicode code point, stored in wchar_t
    5353 *                        (signed 32 bit integer), code points 0 .. 1114111
    5454 *                        are valid
     
    6666 *                        the NULL-terminator), size_t
    6767 *
    68  *  [wide] string length  number of CHARACTERS in a [wide] string (excluding
     68 *  [wide] string length  number of CODE POINTS in a [wide] string (excluding
    6969 *                        the NULL-terminator), size_t
    7070 *
     
    8080 *                            NULL-terminator)
    8181 *
    82  *  length  l        size_t   number of CHARACTERS in a string (excluding the
     82 *  length  l        size_t   number of CODE POINTS in a string (excluding the
    8383 *                            null terminator)
    8484 *
     
    8989 * Function naming prefixes:@n
    9090 *
    91  *  chr_    operate on characters
     91 *  chr_    operate on code points
    9292 *  ascii_  operate on ASCII characters
    9393 *  str_    operate on strings
     
    102102 *  pointer (char *, wchar_t *)
    103103 *  byte offset (size_t)
    104  *  character index (size_t)
     104 *  code point index (size_t)
    105105 *
    106106 */
     
    138138#define CONT_BITS  6
    139139
    140 /** Decode a single character from a string.
    141  *
    142  * Decode a single character from a string of size @a size. Decoding starts
     140/** Decode a single code point from an UTF-8 encoded string.
     141 *
     142 * Decode a single code point from a string of size @a size. Decoding starts
    143143 * at @a offset and this offset is moved to the beginning of the next
    144  * character. In case of decoding error, offset generally advances at least
     144 * code point. In case of decoding error, offset generally advances at least
    145145 * by one. However, offset is never moved beyond size.
    146146 *
     
    149149 * @param size   Size of the string (in bytes).
    150150 *
    151  * @return Value of decoded character, U_SPECIAL on decoding error or
     151 * @return Value of decoded code point, U_SPECIAL on decoding error or
    152152 *         NULL if attempt to decode beyond @a size.
    153153 *
     
    208208}
    209209
    210 /** Decode a single character from a string to the left.
    211  *
    212  * Decode a single character from a string of size @a size. Decoding starts
     210/** Decode a single code point from an UTF-8 encoded string to the left.
     211 *
     212 * Decode a single code point from a string of size @a size. Decoding starts
    213213 * at @a offset and this offset is moved to the beginning of the previous
    214  * character. In case of decoding error, offset generally decreases at least
     214 * code point. In case of decoding error, offset generally decreases at least
    215215 * by one. However, offset is never moved before 0.
    216216 *
     
    219219 * @param size   Size of the string (in bytes).
    220220 *
    221  * @return Value of decoded character, U_SPECIAL on decoding error or
     221 * @return Value of decoded code point, U_SPECIAL on decoding error or
    222222 *         NULL if attempt to decode beyond @a start of str.
    223223 *
     
    251251}
    252252
    253 /** Encode a single character to string representation.
    254  *
    255  * Encode a single character to string representation (i.e. UTF-8) and store
     253/** Encode a single code point to a UTF-8 string representation.
     254 *
     255 * Encode a single code point to a UTF-8 string representation and store
    256256 * it into a buffer at @a offset. Encoding starts at @a offset and this offset
    257  * is moved to the position where the next character can be written to.
    258  *
    259  * @param ch     Input character.
     257 * is moved to the position where the next code point can be written to.
     258 *
     259 * @param ch     Input code point.
    260260 * @param str    Output buffer.
    261261 * @param offset Byte offset where to start writing.
    262262 * @param size   Size of the output buffer (in bytes).
    263263 *
    264  * @return EOK if the character was encoded successfully, EOVERFLOW if there
    265  *         was not enough space in the output buffer or EINVAL if the character
     264 * @return EOK if the code point was encoded successfully, EOVERFLOW if there
     265 *         was not enough space in the output buffer or EINVAL if the code point
    266266 *         code was invalid.
    267267 */
     
    357357}
    358358
    359 /** Get size of string with length limit.
     359/** Get size of string with code point count limit.
    360360 *
    361361 * Get the number of bytes which are used by up to @a max_len first
    362  * characters in the string @a str. If @a max_len is greater than
    363  * the length of @a str, the entire string is measured (excluding the
    364  * NULL-terminator).
     362 * code points in the string @a str. If @a max_len is greater than
     363 * the number of code points in @a str, the entire string is measured
     364 * (excluding the NULL-terminator).
    365365 *
    366366 * @param str     String to consider.
    367  * @param max_len Maximum number of characters to measure.
    368  *
    369  * @return Number of bytes used by the characters.
     367 * @param max_len Maximum number of code points to measure.
     368 *
     369 * @return Number of bytes used by the code points.
    370370 *
    371371 */
     
    425425 *
    426426 * Get the number of bytes which are used by up to @a max_len first
    427  * wide characters in the wide string @a str. If @a max_len is greater than
     427 * code points in the wide string @a str. If @a max_len is greater than
    428428 * the length of @a str, the entire wide string is measured (excluding the
    429429 * NULL-terminator).
    430430 *
    431431 * @param str     Wide string to consider.
    432  * @param max_len Maximum number of wide characters to measure.
    433  *
    434  * @return Number of bytes used by the wide characters.
     432 * @param max_len Maximum number of code points to measure.
     433 *
     434 * @return Number of bytes used by the code points.
    435435 *
    436436 */
     
    440440}
    441441
    442 /** Get number of characters in a string.
    443  *
    444  * @param str NULL-terminated string.
    445  *
    446  * @return Number of characters in string.
     442/** Get number of unicode code points in a UTF-8 encoded string.
     443 *
     444 * @param str NULL-terminated UTF-8 string.
     445 *
     446 * @return Number of code points in the string.
    447447 *
    448448 */
     
    458458}
    459459
    460 /** Get number of characters in a wide string.
     460/** Get number of code points in a wide string.
    461461 *
    462462 * @param str NULL-terminated wide string.
    463463 *
    464  * @return Number of characters in @a str.
     464 * @return Number of code points in @a str.
    465465 *
    466466 */
     
    475475}
    476476
    477 /** Get number of characters in a string with size limit.
     477/** Get number of code points in a string with size limit.
    478478 *
    479479 * @param str  NULL-terminated string.
    480480 * @param size Maximum number of bytes to consider.
    481481 *
    482  * @return Number of characters in string.
     482 * @return Number of code points in string.
    483483 *
    484484 */
     
    494494}
    495495
    496 /** Get number of characters in a string with size limit.
     496/** Get number of code points in a string with size limit.
    497497 *
    498498 * @param str  NULL-terminated string.
    499499 * @param size Maximum number of bytes to consider.
    500500 *
    501  * @return Number of characters in string.
     501 * @return Number of code points in string.
    502502 *
    503503 */
     
    516516}
    517517
    518 /** Get character display width on a character cell display.
    519  *
    520  * @param ch    Character
    521  * @return      Width of character in cells.
     518/** Get display width of a code point on a character cell display.
     519 *
     520 * @param ch    Code point
     521 * @return      Display width in cells.
    522522 */
    523523size_t chr_width(wchar_t ch)
     
    543543}
    544544
    545 /** Check whether character is plain ASCII.
    546  *
    547  * @return True if character is plain ASCII.
     545/** Check whether code point is plain ASCII.
     546 *
     547 * @return True if code point is plain ASCII.
    548548 *
    549549 */
     
    556556}
    557557
    558 /** Check whether character is valid
    559  *
    560  * @return True if character is a valid Unicode code point.
     558/** Check whether code point is valid
     559 *
     560 * @return True if code point is a valid Unicode code point.
    561561 *
    562562 */
     
    573573 * Do a char-by-char comparison of two NULL-terminated strings.
    574574 * The strings are considered equal iff their length is equal
    575  * and both strings consist of the same sequence of characters.
    576  *
    577  * A string S1 is less than another string S2 if it has a character with
    578  * lower value at the first character position where the strings differ.
     575 * and both strings consist of the same sequence of code points.
     576 *
     577 * A string S1 is less than another string S2 if it has a code point with
     578 * lower value at the first code point position where the strings differ.
    579579 * If the strings differ in length, the shorter one is treated as if
    580  * padded by characters with a value of zero.
     580 * padded by code points with a value of zero.
    581581 *
    582582 * @param s1 First string to compare.
     
    617617 * The strings are considered equal iff
    618618 * min(str_code_points(s1), max_len) == min(str_code_points(s2), max_len)
    619  * and both strings consist of the same sequence of characters,
    620  * up to max_len characters.
    621  *
    622  * A string S1 is less than another string S2 if it has a character with
    623  * lower value at the first character position where the strings differ.
     619 * and both strings consist of the same sequence of code points,
     620 * up to max_len code points.
     621 *
     622 * A string S1 is less than another string S2 if it has a code point with
     623 * lower value at the first code point position where the strings differ.
    624624 * If the strings differ in length, the shorter one is treated as if
    625  * padded by characters with a value of zero. Only the first max_len
    626  * characters are considered.
     625 * padded by code points with a value of zero. Only the first max_len
     626 * code points are considered.
    627627 *
    628628 * @param s1      First string to compare.
    629629 * @param s2      Second string to compare.
    630  * @param max_len Maximum number of characters to consider.
     630 * @param max_len Maximum number of code points to consider.
    631631 *
    632632 * @return 0 if the strings are equal, -1 if the first is less than the second,
     
    671671 * Do a char-by-char comparison of two NULL-terminated strings.
    672672 * The strings are considered equal iff their length is equal
    673  * and both strings consist of the same sequence of characters
     673 * and both strings consist of the same sequence of code points
    674674 * when converted to lower case.
    675675 *
    676  * A string S1 is less than another string S2 if it has a character with
    677  * lower value at the first character position where the strings differ.
     676 * A string S1 is less than another string S2 if it has a code point with
     677 * lower value at the first code point position where the strings differ.
    678678 * If the strings differ in length, the shorter one is treated as if
    679  * padded by characters with a value of zero.
     679 * padded by code points with a value of zero.
    680680 *
    681681 * @param s1 First string to compare.
     
    717717 * The strings are considered equal iff
    718718 * min(str_code_points(s1), max_len) == min(str_code_points(s2), max_len)
    719  * and both strings consist of the same sequence of characters,
    720  * up to max_len characters.
    721  *
    722  * A string S1 is less than another string S2 if it has a character with
    723  * lower value at the first character position where the strings differ.
     719 * and both strings consist of the same sequence of code points,
     720 * up to max_len code points.
     721 *
     722 * A string S1 is less than another string S2 if it has a code point with
     723 * lower value at the first code point position where the strings differ.
    724724 * If the strings differ in length, the shorter one is treated as if
    725  * padded by characters with a value of zero. Only the first max_len
    726  * characters are considered.
     725 * padded by code points with a value of zero. Only the first max_len
     726 * code points are considered.
    727727 *
    728728 * @param s1      First string to compare.
    729729 * @param s2      Second string to compare.
    730  * @param max_len Maximum number of characters to consider.
     730 * @param max_len Maximum number of code points to consider.
    731731 *
    732732 * @return 0 if the strings are equal, -1 if the first is less than the second,
     
    808808 * No more than @a size bytes are written. If the size of the output buffer
    809809 * is at least one byte, the output string will always be well-formed, i.e.
    810  * null-terminated and containing only complete characters.
     810 * null-terminated and containing only complete code points.
    811811 *
    812812 * @param dest  Destination buffer.
     
    838838 * @a dest. No more than @a size bytes are written. The output string will
    839839 * always be well-formed, i.e. null-terminated and containing only complete
    840  * characters.
     840 * code points.
    841841 *
    842842 * No more than @a n bytes are read from the input string, so it does not
     
    871871 * Size of the destination buffer is @a dest. If the size of the output buffer
    872872 * is at least one byte, the output string will always be well-formed, i.e.
    873  * null-terminated and containing only complete characters.
     873 * null-terminated and containing only complete code points.
    874874 *
    875875 * @param dest   Destination buffer.
     
    10301030 *
    10311031 * @param dest  Destination buffer.
    1032  * @param dlen  Number of utf16 characters that fit in the destination buffer.
     1032 * @param dlen  Number of utf16 code points that fit in the destination buffer.
    10331033 * @param src   Source string.
    10341034 *
     
    11881188}
    11891189
    1190 /** Find first occurence of character in string.
     1190/** Find first occurence of code point in string.
    11911191 *
    11921192 * @param str String to search.
    1193  * @param ch  Character to look for.
    1194  *
    1195  * @return Pointer to character in @a str or NULL if not found.
     1193 * @param ch  code point to look for.
     1194 *
     1195 * @return Pointer to code point in @a str or NULL if not found.
    11961196 */
    11971197char *str_chr(const char *str, wchar_t ch)
     
    12151215 * @param n   Needle (substring to look for)
    12161216 *
    1217  * @return Pointer to character in @a hs or @c NULL if not found.
     1217 * @return Pointer to substring in @a hs or @c NULL if not found.
    12181218 */
    12191219char *str_str(const char *hs, const char *n)
     
    12321232}
    12331233
    1234 /** Removes specified trailing characters from a string.
     1234/** Removes specified trailing code points from a string.
    12351235 *
    12361236 * @param str String to remove from.
    1237  * @param ch  Character to remove.
     1237 * @param ch  Code point to remove.
    12381238 */
    12391239void str_rtrim(char *str, wchar_t ch)
     
    12601260}
    12611261
    1262 /** Removes specified leading characters from a string.
     1262/** Removes specified leading code points from a string.
    12631263 *
    12641264 * @param str String to remove from.
    1265  * @param ch  Character to remove.
     1265 * @param ch  code point to remove.
    12661266 */
    12671267void str_ltrim(char *str, wchar_t ch)
     
    12861286}
    12871287
    1288 /** Find last occurence of character in string.
     1288/** Find last occurence of code point in string.
    12891289 *
    12901290 * @param str String to search.
    1291  * @param ch  Character to look for.
    1292  *
    1293  * @return Pointer to character in @a str or NULL if not found.
     1291 * @param ch  code point to look for.
     1292 *
     1293 * @return Pointer to code point in @a str or NULL if not found.
    12941294 */
    12951295char *str_rchr(const char *str, wchar_t ch)
     
    13091309}
    13101310
    1311 /** Insert a wide character into a wide string.
    1312  *
    1313  * Insert a wide character into a wide string at position
    1314  * @a pos. The characters after the position are shifted.
     1311/** Insert a code point into a wide string.
     1312 *
     1313 * Insert a code point into a wide string at position
     1314 * @a pos. The code points after the position are shifted.
    13151315 *
    13161316 * @param str     String to insert to.
    1317  * @param ch      Character to insert to.
    1318  * @param pos     Character index where to insert.
    1319  * @param max_pos Characters in the buffer.
     1317 * @param ch      Code point to insert.
     1318 * @param pos     Code point index where to insert.
     1319 * @param max_pos Number of code points that fit in the buffer.
    13201320 *
    13211321 * @return True if the insertion was sucessful, false if the position
     
    13391339}
    13401340
    1341 /** Remove a wide character from a wide string.
    1342  *
    1343  * Remove a wide character from a wide string at position
    1344  * @a pos. The characters after the position are shifted.
     1341/** Remove a code point from a wide string.
     1342 *
     1343 * Remove a code point from a wide string at position
     1344 * @a pos. The code points after the position are shifted.
    13451345 *
    13461346 * @param str String to remove from.
    1347  * @param pos Character index to remove.
     1347 * @param pos Code point index to remove.
    13481348 *
    13491349 * @return True if the removal was sucessful, false if the position
     
    13671367/** Duplicate string.
    13681368 *
    1369  * Allocate a new string and copy characters from the source
    1370  * string into it. The duplicate string is allocated via sleeping
    1371  * malloc(), thus this function can sleep in no memory conditions.
    1372  *
    1373  * The allocation cannot fail and the return value is always
    1374  * a valid pointer. The duplicate string is always a well-formed
     1369 * Allocate a new string and copy the contents of the source string into it.
     1370 * The duplicate string is allocated as if by malloc().
     1371 *
     1372 * If successful, the duplicate string is always a well-formed
    13751373 * null-terminated UTF-8 string, but it can differ from the source
    13761374 * string on the byte level.
     
    13781376 * @param src Source string.
    13791377 *
    1380  * @return Duplicate string.
     1378 * @return Duplicate string, or NULL if allocation failed.
    13811379 *
    13821380 */
     
    13951393 *
    13961394 * Allocate a new string and copy up to @max_size bytes from the source
    1397  * string into it. The duplicate string is allocated via sleeping
    1398  * malloc(), thus this function can sleep in no memory conditions.
     1395 * string into it. The duplicate string is allocated as if by malloc().
    13991396 * No more than @max_size + 1 bytes is allocated, but if the size
    14001397 * occupied by the source string is smaller than @max_size + 1,
    14011398 * less is allocated.
    14021399 *
    1403  * The allocation cannot fail and the return value is always
    1404  * a valid pointer. The duplicate string is always a well-formed
     1400 * If successful, the duplicate string is always a well-formed
    14051401 * null-terminated UTF-8 string, but it can differ from the source
    14061402 * string on the byte level.
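
The reworded comments above separate the number of code points in a string from its size in bytes. As a minimal sketch of that distinction (the helper name `utf8_code_points` is hypothetical and this is not the HelenOS implementation), code points in a well-formed UTF-8 string can be counted by skipping continuation bytes:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration only. Counts code points in a
 * null-terminated string, assuming the input is well-formed UTF-8.
 * Every byte that is not a continuation byte (bit pattern 10xxxxxx)
 * starts a new code point.
 */
size_t utf8_code_points(const char *str)
{
	assert(str != NULL);

	size_t count = 0;
	for (const unsigned char *p = (const unsigned char *) str; *p != 0; p++) {
		if ((*p & 0xC0) != 0x80)
			count++;
	}

	return count;
}
```

For a string such as "ř" (bytes 0xC5 0x99), the size is 2 bytes while this sketch counts 1 code point. Unlike a real decoder, it does not detect malformed input.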
  • uspace/lib/c/include/str.h

    r08e103d4 r1d2f85e  
    5555#define STR_NO_LIMIT  ((size_t) -1)
    5656
    57 /** Maximum size of a string containing @c length characters */
     57/** Maximum size of a string containing @c length code points */
    5858#define STR_BOUNDS(length)  ((length) << 2)
    5959
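
STR_BOUNDS multiplies by four because a code point takes at most 4 bytes in UTF-8. The sketch below reuses the macro from the hunk above to size a buffer. The helper `str_buffer_size` is hypothetical, and the extra byte assumes that the size bound excludes the null terminator, as the glossary in str.c defines string size:

```c
#include <assert.h>
#include <stddef.h>

/* Same definition as in the hunk above: at most 4 bytes per code point. */
#define STR_BOUNDS(length)  ((length) << 2)

/*
 * Hypothetical helper for illustration only. Returns the number of bytes
 * needed to store any string of up to `length` code points, including
 * the null terminator.
 */
size_t str_buffer_size(size_t length)
{
	/* Make sure the shift and the added terminator byte cannot overflow. */
	assert(length <= (((size_t) -1) >> 2));

	return STR_BOUNDS(length) + 1;
}
```

A caller holding up to 3 code points would therefore reserve 13 bytes under these assumptions.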