Changeset 1d2f85e in mainline
- Timestamp: 2019-02-05T18:26:54Z
- Parents: 08e103d4
- Files: 5 edited
boot/generic/src/str.c
--- r08e103d4
+++ r1d2f85e
@@ -37,5 +37,5 @@
  * Strings and characters use the Universal Character Set (UCS). The standard
  * strings, called just strings are encoded in UTF-8. Wide strings (encoded
- * in UTF-32) are supported to a limited degree. A single character is
+ * in UTF-32) are supported to a limited degree. A single code point is
  * represented as wchar_t.@n
  *
@@ -46,5 +46,5 @@
  *  byte                  8 bits stored in uint8_t (unsigned 8 bit integer)
  *
- *  character             UTF-32 encoded Unicode character, stored in wchar_t
+ *  character             UTF-32 encoded Unicode code point, stored in wchar_t
  *                        (signed 32 bit integer), code points 0 .. 1114111
  *                        are valid
@@ -62,5 +62,5 @@
  *                        the NULL-terminator), size_t
  *
- *  [wide] string length  number of CHARACTERS in a [wide] string (excluding
+ *  [wide] string length  number of CODE POINTS in a [wide] string (excluding
  *                        the NULL-terminator), size_t
  *
@@ -76,5 +76,5 @@
  *                            NULL-terminator)
  *
- *  length  l        size_t   number of CHARACTERS in a string (excluding the
+ *  length  l        size_t   number of CODE POINTS in a string (excluding the
  *                            null terminator)
  *
@@ -85,5 +85,5 @@
  * Function naming prefixes:@n
  *
- *  chr_    operate on characters
+ *  chr_    operate on code points
  *  ascii_  operate on ASCII characters
  *  str_    operate on strings
@@ -98,5 +98,5 @@
  *  pointer (char *, wchar_t *)
  *  byte offset (size_t)
- *  character index (size_t)
+ *  code point index (size_t)
  *
  */
@@ -128,9 +128,9 @@
 #define CONT_BITS  6
 
-/** Decode a single character from a string.
- *
- * Decode a single character from a string of size @a size. Decoding starts
+/** Decode a single code point from an UTF-8 encoded string.
+ *
+ * Decode a single code point from a string of size @a size. Decoding starts
  * at @a offset and this offset is moved to the beginning of the next
- * character. In case of decoding error, offset generally advances at least
+ * code point. In case of decoding error, offset generally advances at least
  * by one. However, offset is never moved beyond size.
  *
@@ -139,5 +139,5 @@
  * @param size Size of the string (in bytes).
  *
- * @return Value of decoded character, U_SPECIAL on decoding error or
+ * @return Value of decoded code point, U_SPECIAL on decoding error or
  * NULL if attempt to decode beyond @a size.
  *
@@ -198,17 +198,17 @@
 }
 
-/** Encode a single character to string representation.
- *
- * Encode a single character to string representation (i.e. UTF-8) and store
+/** Encode a single code point to a UTF-8 string representation.
+ *
+ * Encode a single code point to a UTF-8 string representation and store
  * it into a buffer at @a offset. Encoding starts at @a offset and this offset
- * is moved to the position where the next character can be written to.
- *
- * @param ch Input character.
+ * is moved to the position where the next code point can be written to.
+ *
+ * @param ch Input code point.
  * @param str Output buffer.
  * @param offset Byte offset where to start writing.
  * @param size Size of the output buffer (in bytes).
  *
- * @return EOK if the character was encoded successfully, EOVERFLOW if there
- * was not enough space in the output buffer or EINVAL if the character
+ * @return EOK if the code point was encoded successfully, EOVERFLOW if there
+ * was not enough space in the output buffer or EINVAL if the code point
  * code was invalid.
  */
@@ -289,15 +289,15 @@
 }
 
-/** Get size of string with length limit.
+/** Get size of string with code point count limit.
  *
  * Get the number of bytes which are used by up to @a max_len first
- * characters in the string @a str. If @a max_len is greater than
- * the length of @a str, the entire string is measured (excluding the
- * NULL-terminator).
+ * code points in the string @a str. If @a max_len is greater than
+ * the number of code points in @a str, the entire string is measured
+ * (excluding the NULL-terminator).
  *
  * @param str String to consider.
- * @param max_len Maximum number of characters to measure.
- *
- * @return Number of bytes used by the characters.
+ * @param max_len Maximum number of code points to measure.
+ *
+ * @return Number of bytes used by the code points.
  *
  */
@@ -317,9 +317,9 @@
 }
 
-/** Get number of characters in a string.
- *
- * @param str NULL-terminated string.
- *
- * @return Number of characters in string.
+/** Get number of unicode code points in a UTF-8 encoded string.
+ *
+ * @param str NULL-terminated UTF-8 string.
+ *
+ * @return Number of code points in the string.
  *
  */
@@ -335,7 +335,7 @@
 }
 
-/** Check whether character is plain ASCII.
- *
- * @return True if character is plain ASCII.
+/** Check whether code point is plain ASCII.
+ *
+ * @return True if code point is plain ASCII.
  *
  */
@@ -348,7 +348,7 @@
 }
 
-/** Check whether character is valid
- *
- * @return True if character is a valid Unicode code point.
+/** Check whether code point is valid
+ *
+ * @return True if code point is a valid Unicode code point.
  *
  */
@@ -365,10 +365,10 @@
  * Do a char-by-char comparison of two NULL-terminated strings.
  * The strings are considered equal iff their length is equal
- * and both strings consist of the same sequence of characters.
- *
- * A string S1 is less than another string S2 if it has a character with
- * lower value at the first character position where the strings differ.
+ * and both strings consist of the same sequence of code points.
+ *
+ * A string S1 is less than another string S2 if it has a code point with
+ * lower value at the first code point position where the strings differ.
  * If the strings differ in length, the shorter one is treated as if
- * padded by characters with a value of zero.
+ * padded by code points with a value of zero.
  *
  * @param s1 First string to compare.
@@ -409,5 +409,5 @@
  * No more than @a size bytes are written. If the size of the output buffer
  * is at least one byte, the output string will always be well-formed, i.e.
- * null-terminated and containing only complete characters.
+ * null-terminated and containing only complete code points.
  *
  * @param dest Destination buffer.
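The decoding comment in this file describes a cursor-style interface: decode one code point at a byte offset, advance the offset, and never move it past the buffer size. The following is a minimal sketch of that contract, not the HelenOS implementation. The names `sketch_decode` and `SKETCH_INVALID` are hypothetical, the error value merely stands in for `U_SPECIAL` (whose actual value is not shown in this changeset), and overlong forms and surrogates are deliberately not rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for U_SPECIAL; the real value is an assumption. */
#define SKETCH_INVALID 0xFFFDu

/*
 * Decode one code point from buf[*offset .. size) and advance *offset.
 * Returns 0 when *offset is already at or beyond size.
 * Simplification: overlong encodings and surrogates are not rejected.
 */
uint32_t sketch_decode(const char *buf, size_t *offset, size_t size)
{
	if (*offset >= size)
		return 0;

	uint8_t b0 = (uint8_t) buf[(*offset)++];
	unsigned cont;  /* number of continuation bytes expected */
	uint32_t cp;

	if ((b0 & 0x80) == 0x00) {
		cp = b0;         /* 0xxxxxxx: plain ASCII */
		cont = 0;
	} else if ((b0 & 0xE0) == 0xC0) {
		cp = b0 & 0x1F;  /* 110xxxxx */
		cont = 1;
	} else if ((b0 & 0xF0) == 0xE0) {
		cp = b0 & 0x0F;  /* 1110xxxx */
		cont = 2;
	} else if ((b0 & 0xF8) == 0xF0) {
		cp = b0 & 0x07;  /* 11110xxx */
		cont = 3;
	} else {
		/* Stray continuation byte or invalid lead byte. */
		return SKETCH_INVALID;
	}

	/* Truncated sequence: the offset has advanced by one, not past size. */
	if (size - *offset < cont)
		return SKETCH_INVALID;

	while (cont-- > 0) {
		uint8_t b = (uint8_t) buf[*offset];
		if ((b & 0xC0) != 0x80)
			return SKETCH_INVALID;
		(*offset)++;
		cp = (cp << 6) | (b & 0x3F);
	}

	return cp;
}
```

Calling the function in a loop until it returns 0 walks a buffer one code point at a time, which is the pattern the length- and size-measuring functions in this file are documented in terms of.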
kernel/generic/include/str.h
--- r08e103d4
+++ r1d2f85e
@@ -66,5 +66,5 @@
 #define STR_NO_LIMIT  ((size_t) -1)
 
-/** Maximum size of a string containing @c length characters */
+/** Maximum size of a string containing @c length code points */
 #define STR_BOUNDS(length)  ((length) << 2)
 
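`STR_BOUNDS(length)` shifts left by two because a code point in the range 0 .. 1114111 needs at most four bytes in UTF-8. The sketch below illustrates that bound with a small encoder. `sketch_utf8_size`, `sketch_encode` and the `SKETCH_*` status codes are hypothetical names for illustration (the real function is documented as returning EOK, EOVERFLOW or EINVAL), and surrogates are not rejected here.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from the diff above. */
#define STR_BOUNDS(length)  ((length) << 2)

/* Hypothetical status codes standing in for EOK / EINVAL / EOVERFLOW. */
enum { SKETCH_OK = 0, SKETCH_EINVAL = -1, SKETCH_EOVERFLOW = -2 };

/* Bytes needed to encode cp in UTF-8, or 0 if cp is above 0x10FFFF. */
size_t sketch_utf8_size(uint32_t cp)
{
	if (cp < 0x80)
		return 1;
	if (cp < 0x800)
		return 2;
	if (cp < 0x10000)
		return 3;
	if (cp <= 0x10FFFF)
		return 4;
	return 0;
}

/* Encode cp at buf[*offset]; on success *offset moves past the written bytes. */
int sketch_encode(uint32_t cp, char *buf, size_t *offset, size_t size)
{
	/* Lead-byte prefixes indexed by sequence length. */
	static const unsigned char lead[5] = { 0x00, 0x00, 0xC0, 0xE0, 0xF0 };

	size_t n = sketch_utf8_size(cp);
	if (n == 0)
		return SKETCH_EINVAL;
	if (*offset > size || size - *offset < n)
		return SKETCH_EOVERFLOW;

	unsigned char *out = (unsigned char *) buf + *offset;

	/* Fill continuation bytes from the end, six payload bits each. */
	for (size_t i = n - 1; i > 0; i--) {
		out[i] = (unsigned char) (0x80 | (cp & 0x3F));
		cp >>= 6;
	}
	out[0] = (unsigned char) (lead[n] | cp);

	*offset += n;
	return SKETCH_OK;
}
```

A buffer declared as `char buf[STR_BOUNDS(3)]` is therefore large enough for any three code points, although it leaves no room for a terminator.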
kernel/generic/src/lib/str.c
--- r08e103d4
+++ r1d2f85e
@@ -41,5 +41,5 @@
  * Strings and characters use the Universal Character Set (UCS). The standard
  * strings, called just strings are encoded in UTF-8. Wide strings (encoded
- * in UTF-32) are supported to a limited degree. A single character is
+ * in UTF-32) are supported to a limited degree. A single code point is
  * represented as wchar_t.@n
  *
@@ -50,5 +50,5 @@
  *  byte                  8 bits stored in uint8_t (unsigned 8 bit integer)
  *
- *  character             UTF-32 encoded Unicode character, stored in wchar_t
+ *  character             UTF-32 encoded Unicode code point, stored in wchar_t
  *                        (signed 32 bit integer), code points 0 .. 1114111
  *                        are valid
@@ -66,5 +66,5 @@
  *                        the NULL-terminator), size_t
  *
- *  [wide] string length  number of CHARACTERS in a [wide] string (excluding
+ *  [wide] string length  number of CODE POINTS in a [wide] string (excluding
  *                        the NULL-terminator), size_t
  *
@@ -80,5 +80,5 @@
  *                            NULL-terminator)
  *
- *  length  l        size_t   number of CHARACTERS in a string (excluding the
+ *  length  l        size_t   number of CODE POINTS in a string (excluding the
  *                            null terminator)
  *
@@ -89,5 +89,5 @@
  * Function naming prefixes:@n
  *
- *  chr_    operate on characters
+ *  chr_    operate on code points
  *  ascii_  operate on ASCII characters
  *  str_    operate on strings
@@ -102,5 +102,5 @@
  *  pointer (char *, wchar_t *)
  *  byte offset (size_t)
- *  character index (size_t)
+ *  code point index (size_t)
  *
  */
@@ -137,9 +137,9 @@
 #define CONT_BITS  6
 
-/** Decode a single character from a string.
- *
- * Decode a single character from a string of size @a size. Decoding starts
+/** Decode a single code point from an UTF-8 encoded string.
+ *
+ * Decode a single code point from a string of size @a size. Decoding starts
  * at @a offset and this offset is moved to the beginning of the next
- * character. In case of decoding error, offset generally advances at least
+ * code point. In case of decoding error, offset generally advances at least
  * by one. However, offset is never moved beyond size.
  *
@@ -148,5 +148,5 @@
  * @param size Size of the string (in bytes).
  *
- * @return Value of decoded character, U_SPECIAL on decoding error or
+ * @return Value of decoded code point, U_SPECIAL on decoding error or
  * NULL if attempt to decode beyond @a size.
  *
@@ -207,17 +207,17 @@
 }
 
-/** Encode a single character to string representation.
- *
- * Encode a single character to string representation (i.e. UTF-8) and store
+/** Encode a single code point to a UTF-8 string representation.
+ *
+ * Encode a single code point to a UTF-8 string representation and store
  * it into a buffer at @a offset. Encoding starts at @a offset and this offset
- * is moved to the position where the next character can be written to.
- *
- * @param ch Input character.
+ * is moved to the position where the next code point can be written to.
+ *
+ * @param ch Input code point.
  * @param str Output buffer.
  * @param offset Byte offset where to start writing.
  * @param size Size of the output buffer (in bytes).
  *
- * @return EOK if the character was encoded successfully, EOVERFLOW if there
- * was not enough space in the output buffer or EINVAL if the character
+ * @return EOK if the code point was encoded successfully, EOVERFLOW if there
+ * was not enough space in the output buffer or EINVAL if the code point
  * code was invalid.
  */
@@ -313,15 +313,15 @@
 }
 
-/** Get size of string with length limit.
+/** Get size of string with code point count limit.
  *
  * Get the number of bytes which are used by up to @a max_len first
- * characters in the string @a str. If @a max_len is greater than
- * the length of @a str, the entire string is measured (excluding the
- * NULL-terminator).
+ * code points in the string @a str. If @a max_len is greater than
+ * the number of code points in @a str, the entire string is measured
+ * (excluding the NULL-terminator).
  *
  * @param str String to consider.
- * @param max_len Maximum number of characters to measure.
- *
- * @return Number of bytes used by the characters.
+ * @param max_len Maximum number of code points to measure.
+ *
+ * @return Number of bytes used by the code points.
  *
  */
@@ -344,12 +344,12 @@
  *
  * Get the number of bytes which are used by up to @a max_len first
- * wide characters in the wide string @a str. If @a max_len is greater than
+ * code points in the wide string @a str. If @a max_len is greater than
  * the length of @a str, the entire wide string is measured (excluding the
  * NULL-terminator).
  *
  * @param str Wide string to consider.
- * @param max_len Maximum number of wide characters to measure.
- *
- * @return Number of bytes used by the wide characters.
+ * @param max_len Maximum number of code points to measure.
+ *
+ * @return Number of bytes used by the code points.
  *
  */
@@ -359,9 +359,9 @@
 }
 
-/** Get number of characters in a string.
- *
- * @param str NULL-terminated string.
- *
- * @return Number of characters in string.
+/** Get number of unicode code points in a UTF-8 encoded string.
+ *
+ * @param str NULL-terminated UTF-8 string.
+ *
+ * @return Number of code points in the string.
  *
  */
@@ -377,9 +377,9 @@
 }
 
-/** Get number of characters in a wide string.
+/** Get number of code points in a wide string.
  *
  * @param str NULL-terminated wide string.
  *
- * @return Number of characters in @a str.
+ * @return Number of code points in @a str.
  *
  */
@@ -394,10 +394,10 @@
 }
 
-/** Get number of characters in a string with size limit.
+/** Get number of code points in a string with size limit.
  *
  * @param str NULL-terminated string.
  * @param size Maximum number of bytes to consider.
  *
- * @return Number of characters in string.
+ * @return Number of code points in string.
  *
  */
@@ -413,10 +413,10 @@
 }
 
-/** Get number of characters in a string with size limit.
+/** Get number of code points in a string with size limit.
  *
  * @param str NULL-terminated string.
  * @param size Maximum number of bytes to consider.
  *
- * @return Number of characters in string.
+ * @return Number of code points in string.
  *
  */
@@ -435,7 +435,7 @@
 }
 
-/** Check whether character is plain ASCII.
- *
- * @return True if character is plain ASCII.
+/** Check whether code point is plain ASCII.
+ *
+ * @return True if code point is plain ASCII.
  *
  */
@@ -448,7 +448,7 @@
 }
 
-/** Check whether character is valid
- *
- * @return True if character is a valid Unicode code point.
+/** Check whether code point is valid
+ *
+ * @return True if code point is a valid Unicode code point.
  *
  */
@@ -465,10 +465,10 @@
  * Do a char-by-char comparison of two NULL-terminated strings.
  * The strings are considered equal iff their length is equal
- * and both strings consist of the same sequence of characters.
- *
- * A string S1 is less than another string S2 if it has a character with
- * lower value at the first character position where the strings differ.
+ * and both strings consist of the same sequence of code points.
+ *
+ * A string S1 is less than another string S2 if it has a code point with
+ * lower value at the first code point position where the strings differ.
  * If the strings differ in length, the shorter one is treated as if
- * padded by characters with a value of zero.
+ * padded by code points with a value of zero.
  *
  * @param s1 First string to compare.
@@ -509,16 +509,16 @@
  * The strings are considered equal iff
  * min(str_code_points(s1), max_len) == min(str_code_points(s2), max_len)
- * and both strings consist of the same sequence of characters,
- * up to max_len characters.
- *
- * A string S1 is less than another string S2 if it has a character with
- * lower value at the first character position where the strings differ.
+ * and both strings consist of the same sequence of code points,
+ * up to max_len code points.
+ *
+ * A string S1 is less than another string S2 if it has a code point with
+ * lower value at the first code point position where the strings differ.
  * If the strings differ in length, the shorter one is treated as if
- * padded by characters with a value of zero. Only the first max_len
- * characters are considered.
+ * padded by code points with a value of zero. Only the first max_len
+ * code points are considered.
  *
  * @param s1 First string to compare.
  * @param s2 Second string to compare.
- * @param max_len Maximum number of characters to consider.
+ * @param max_len Maximum number of code points to consider.
  *
  * @return 0 if the strings are equal, -1 if the first is less than the second,
@@ -564,5 +564,5 @@
  * No more than @a size bytes are written. If the size of the output buffer
  * is at least one byte, the output string will always be well-formed, i.e.
- * null-terminated and containing only complete characters.
+ * null-terminated and containing only complete code points.
  *
  * @param dest Destination buffer.
@@ -594,5 +594,5 @@
  * @a dest. No more than @a size bytes are written. The output string will
  * always be well-formed, i.e. null-terminated and containing only complete
- * characters.
+ * code points.
  *
  * No more than @a n bytes are read from the input string, so it does not
@@ -652,10 +652,10 @@
 }
 
-/** Find first occurence of character in string.
+/** Find first occurence of code point in string.
  *
  * @param str String to search.
- * @param ch Character to look for.
- *
- * @return Pointer to character in @a str or NULL if not found.
+ * @param ch code point to look for.
+ *
+ * @return Pointer to code point in @a str or NULL if not found.
  */
 char *str_chr(const char *str, wchar_t ch)
@@ -674,13 +674,13 @@
 }
 
-/** Insert a wide character into a wide string.
- *
- * Insert a wide character into a wide string at position
- * @a pos. The characters after the position are shifted.
+/** Insert a code point into a wide string.
+ *
+ * Insert a code point into a wide string at position
+ * @a pos. The code points after the position are shifted.
  *
  * @param str String to insert to.
- * @param ch Character to insert to.
- * @param pos Character index where to insert.
- * @param max_pos Characters in the buffer.
+ * @param ch Code point to insert.
+ * @param pos Code point index where to insert.
+ * @param max_pos Number of code points that fit in the buffer.
  *
  * @return True if the insertion was sucessful, false if the position
@@ -704,11 +704,11 @@
 }
 
-/** Remove a wide character from a wide string.
- *
- * Remove a wide character from a wide string at position
- * @a pos. The characters after the position are shifted.
+/** Remove a code point from a wide string.
+ *
+ * Remove a code point from a wide string at position
+ * @a pos. The code points after the position are shifted.
  *
  * @param str String to remove from.
- * @param pos Character index to remove.
+ * @param pos Code point index to remove.
  *
  * @return True if the removal was sucessful, false if the position
@@ -732,10 +732,8 @@
 /** Duplicate string.
  *
- * Allocate a new string and copy characters from the source
- * string into it. The duplicate string is allocated via sleeping
- * malloc(), thus this function can sleep in no memory conditions.
- *
- * The allocation cannot fail and the return value is always
- * a valid pointer. The duplicate string is always a well-formed
+ * Allocate a new string and copy the contents of the source string into it.
+ * The duplicate string is allocated as if by malloc().
+ *
+ * If successful, the duplicate string is always a well-formed
  * null-terminated UTF-8 string, but it can differ from the source
  * string on the byte level.
@@ -743,5 +741,5 @@
  * @param src Source string.
  *
- * @return Duplicate string.
+ * @return Duplicate string, or NULL if allocation failed.
  *
  */
@@ -760,12 +758,10 @@
  *
  * Allocate a new string and copy up to @max_size bytes from the source
- * string into it. The duplicate string is allocated via sleeping
- * malloc(), thus this function can sleep in no memory conditions.
+ * string into it. The duplicate string is allocated as if by malloc().
  * No more than @max_size + 1 bytes is allocated, but if the size
  * occupied by the source string is smaller than @max_size + 1,
  * less is allocated.
  *
- * The allocation cannot fail and the return value is always
- * a valid pointer. The duplicate string is always a well-formed
+ * If successful, the duplicate string is always a well-formed
  * null-terminated UTF-8 string, but it can differ from the source
  * string on the byte level.
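The size/length distinction documented in this file (bytes versus code points) can be shown on a short string. This is a sketch under the assumption of well-formed UTF-8 input. It counts lead bytes rather than fully decoding, which may differ from how the real functions work, and `sketch_code_points` and `sketch_lsize` are hypothetical names.

```c
#include <stddef.h>

/* True for bytes of the form 10xxxxxx, which continue a multi-byte sequence. */
static int sketch_is_continuation(unsigned char b)
{
	return (b & 0xC0) == 0x80;
}

/* Number of code points in a NUL-terminated, well-formed UTF-8 string. */
size_t sketch_code_points(const char *str)
{
	size_t count = 0;

	for (size_t i = 0; str[i] != '\0'; i++) {
		if (!sketch_is_continuation((unsigned char) str[i]))
			count++;
	}

	return count;
}

/*
 * Number of bytes used by the first max_len code points of str.
 * If str has fewer code points, the whole string is measured
 * (excluding the terminator).
 */
size_t sketch_lsize(const char *str, size_t max_len)
{
	size_t count = 0;
	size_t i = 0;

	while (str[i] != '\0') {
		if (!sketch_is_continuation((unsigned char) str[i])) {
			/* A new code point starts here; stop if the limit is reached. */
			if (count == max_len)
				break;
			count++;
		}
		i++;
	}

	return i;
}
```

For the bytes `65 CC 81 78` ("e", U+0301 COMBINING ACUTE ACCENT, "x"), `sketch_code_points` returns 3 even though a reader sees two characters. That gap between code points and user-perceived characters is one reason the more precise term helps.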
uspace/lib/c/generic/str.c
r08e103d4 r1d2f85e 41 41 * Strings and characters use the Universal Character Set (UCS). The standard 42 42 * strings, called just strings are encoded in UTF-8. Wide strings (encoded 43 * in UTF-32) are supported to a limited degree. A single c haracteris43 * in UTF-32) are supported to a limited degree. A single code point is 44 44 * represented as wchar_t.@n 45 45 * … … 50 50 * byte 8 bits stored in uint8_t (unsigned 8 bit integer) 51 51 * 52 * character UTF-32 encoded Unicode c haracter, stored in wchar_t52 * character UTF-32 encoded Unicode code point, stored in wchar_t 53 53 * (signed 32 bit integer), code points 0 .. 1114111 54 54 * are valid … … 66 66 * the NULL-terminator), size_t 67 67 * 68 * [wide] string length number of C HARACTERS in a [wide] string (excluding68 * [wide] string length number of CODE POINTS in a [wide] string (excluding 69 69 * the NULL-terminator), size_t 70 70 * … … 80 80 * NULL-terminator) 81 81 * 82 * length l size_t number of C HARACTERS in a string (excluding the82 * length l size_t number of CODE POINTS in a string (excluding the 83 83 * null terminator) 84 84 * … … 89 89 * Function naming prefixes:@n 90 90 * 91 * chr_ operate on c haracters91 * chr_ operate on code points 92 92 * ascii_ operate on ASCII characters 93 93 * str_ operate on strings … … 102 102 * pointer (char *, wchar_t *) 103 103 * byte offset (size_t) 104 * c haracterindex (size_t)104 * code point index (size_t) 105 105 * 106 106 */ … … 138 138 #define CONT_BITS 6 139 139 140 /** Decode a single c haracter from astring.141 * 142 * Decode a single c haracterfrom a string of size @a size. Decoding starts140 /** Decode a single code point from an UTF-8 encoded string. 141 * 142 * Decode a single code point from a string of size @a size. Decoding starts 143 143 * at @a offset and this offset is moved to the beginning of the next 144 * c haracter. In case of decoding error, offset generally advances at least144 * code point. 
In case of decoding error, offset generally advances at least 145 145 * by one. However, offset is never moved beyond size. 146 146 * … … 149 149 * @param size Size of the string (in bytes). 150 150 * 151 * @return Value of decoded c haracter, U_SPECIAL on decoding error or151 * @return Value of decoded code point, U_SPECIAL on decoding error or 152 152 * NULL if attempt to decode beyond @a size. 153 153 * … … 208 208 } 209 209 210 /** Decode a single c haracter from astring to the left.211 * 212 * Decode a single c haracterfrom a string of size @a size. Decoding starts210 /** Decode a single code point from an UTF-8 encoded string to the left. 211 * 212 * Decode a single code point from a string of size @a size. Decoding starts 213 213 * at @a offset and this offset is moved to the beginning of the previous 214 * c haracter. In case of decoding error, offset generally decreases at least214 * code point. In case of decoding error, offset generally decreases at least 215 215 * by one. However, offset is never moved before 0. 216 216 * … … 219 219 * @param size Size of the string (in bytes). 220 220 * 221 * @return Value of decoded c haracter, U_SPECIAL on decoding error or221 * @return Value of decoded code point, U_SPECIAL on decoding error or 222 222 * NULL if attempt to decode beyond @a start of str. 223 223 * … … 251 251 } 252 252 253 /** Encode a single c haracter tostring representation.254 * 255 * Encode a single c haracter to string representation (i.e. UTF-8)and store253 /** Encode a single code point to a UTF-8 string representation. 254 * 255 * Encode a single code point to a UTF-8 string representation and store 256 256 * it into a buffer at @a offset. Encoding starts at @a offset and this offset 257 * is moved to the position where the next c haractercan be written to.258 * 259 * @param ch Input c haracter.257 * is moved to the position where the next code point can be written to. 258 * 259 * @param ch Input code point. 
260 260 * @param str Output buffer. 261 261 * @param offset Byte offset where to start writing. 262 262 * @param size Size of the output buffer (in bytes). 263 263 * 264 * @return EOK if the c haracterwas encoded successfully, EOVERFLOW if there265 * was not enough space in the output buffer or EINVAL if the c haracter264 * @return EOK if the code point was encoded successfully, EOVERFLOW if there 265 * was not enough space in the output buffer or EINVAL if the code point 266 266 * code was invalid. 267 267 */ … … 357 357 } 358 358 359 /** Get size of string with lengthlimit.359 /** Get size of string with code point count limit. 360 360 * 361 361 * Get the number of bytes which are used by up to @a max_len first 362 * c haracters in the string @a str. If @a max_len is greater than363 * the length of @a str, the entire string is measured (excluding the364 * NULL-terminator).362 * code points in the string @a str. If @a max_len is greater than 363 * the number of code points in @a str, the entire string is measured 364 * (excluding the NULL-terminator). 365 365 * 366 366 * @param str String to consider. 367 * @param max_len Maximum number of c haracters to measure.368 * 369 * @return Number of bytes used by the c haracters.367 * @param max_len Maximum number of code points to measure. 368 * 369 * @return Number of bytes used by the code points. 370 370 * 371 371 */ … … 425 425 * 426 426 * Get the number of bytes which are used by up to @a max_len first 427 * wide characters in the wide string @a str. If @a max_len is greater than427 * code points in the wide string @a str. If @a max_len is greater than 428 428 * the length of @a str, the entire wide string is measured (excluding the 429 429 * NULL-terminator). 430 430 * 431 431 * @param str Wide string to consider. 432 * @param max_len Maximum number of wide characters to measure.433 * 434 * @return Number of bytes used by the wide characters.432 * @param max_len Maximum number of code points to measure. 
433 * 434 * @return Number of bytes used by the code points. 435 435 * 436 436 */ … … 440 440 } 441 441 442 /** Get number of characters in astring.443 * 444 * @param str NULL-terminated string.445 * 446 * @return Number of c haracters instring.442 /** Get number of unicode code points in a UTF-8 encoded string. 443 * 444 * @param str NULL-terminated UTF-8 string. 445 * 446 * @return Number of code points in the string. 447 447 * 448 448 */ … … 458 458 } 459 459 460 /** Get number of c haracters in a wide string.460 /** Get number of code points in a wide string. 461 461 * 462 462 * @param str NULL-terminated wide string. 463 463 * 464 * @return Number of c haracters in @a str.464 * @return Number of code points in @a str. 465 465 * 466 466 */ … … 475 475 } 476 476 477 /** Get number of c haracters in a string with size limit.477 /** Get number of code points in a string with size limit. 478 478 * 479 479 * @param str NULL-terminated string. 480 480 * @param size Maximum number of bytes to consider. 481 481 * 482 * @return Number of c haracters in string.482 * @return Number of code points in string. 483 483 * 484 484 */ … … 494 494 } 495 495 496 /** Get number of c haracters in a string with size limit.496 /** Get number of code points in a string with size limit. 497 497 * 498 498 * @param str NULL-terminated string. 499 499 * @param size Maximum number of bytes to consider. 500 500 * 501 * @return Number of c haracters in string.501 * @return Number of code points in string. 502 502 * 503 503 */ … … 516 516 } 517 517 518 /** Get character display widthon a character cell display.519 * 520 * @param ch C haracter521 * @return Width of characterin cells.518 /** Get display width of a code point on a character cell display. 519 * 520 * @param ch Code point 521 * @return Display width in cells. 
522 522 */ 523 523 size_t chr_width(wchar_t ch) … … 543 543 } 544 544 545 /** Check whether c haracteris plain ASCII.546 * 547 * @return True if c haracteris plain ASCII.545 /** Check whether code point is plain ASCII. 546 * 547 * @return True if code point is plain ASCII. 548 548 * 549 549 */ … … 556 556 } 557 557 558 /** Check whether c haracteris valid559 * 560 * @return True if c haracteris a valid Unicode code point.558 /** Check whether code point is valid 559 * 560 * @return True if code point is a valid Unicode code point. 561 561 * 562 562 */ … … 573 573 * Do a char-by-char comparison of two NULL-terminated strings. 574 574 * The strings are considered equal iff their length is equal 575 * and both strings consist of the same sequence of c haracters.576 * 577 * A string S1 is less than another string S2 if it has a c haracterwith578 * lower value at the first c haracterposition where the strings differ.575 * and both strings consist of the same sequence of code points. 576 * 577 * A string S1 is less than another string S2 if it has a code point with 578 * lower value at the first code point position where the strings differ. 579 579 * If the strings differ in length, the shorter one is treated as if 580 * padded by c haracters with a value of zero.580 * padded by code points with a value of zero. 581 581 * 582 582 * @param s1 First string to compare. … … 617 617 * The strings are considered equal iff 618 618 * min(str_code_points(s1), max_len) == min(str_code_points(s2), max_len) 619 * and both strings consist of the same sequence of c haracters,620 * up to max_len c haracters.621 * 622 * A string S1 is less than another string S2 if it has a c haracterwith623 * lower value at the first c haracterposition where the strings differ.619 * and both strings consist of the same sequence of code points, 620 * up to max_len code points. 
621 * 622 * A string S1 is less than another string S2 if it has a code point with 623 * lower value at the first code point position where the strings differ. 624 624 * If the strings differ in length, the shorter one is treated as if 625 * padded by c haracters with a value of zero. Only the first max_len626 * c haracters are considered.625 * padded by code points with a value of zero. Only the first max_len 626 * code points are considered. 627 627 * 628 628 * @param s1 First string to compare. 629 629 * @param s2 Second string to compare. 630 * @param max_len Maximum number of c haracters to consider.630 * @param max_len Maximum number of code points to consider. 631 631 * 632 632 * @return 0 if the strings are equal, -1 if the first is less than the second, … … 671 671 * Do a char-by-char comparison of two NULL-terminated strings. 672 672 * The strings are considered equal iff their length is equal 673 * and both strings consist of the same sequence of c haracters673 * and both strings consist of the same sequence of code points 674 674 * when converted to lower case. 675 675 * 676 * A string S1 is less than another string S2 if it has a c haracterwith677 * lower value at the first c haracterposition where the strings differ.676 * A string S1 is less than another string S2 if it has a code point with 677 * lower value at the first code point position where the strings differ. 678 678 * If the strings differ in length, the shorter one is treated as if 679 * padded by c haracters with a value of zero.679 * padded by code points with a value of zero. 680 680 * 681 681 * @param s1 First string to compare. 
… …
  * The strings are considered equal iff
  * min(str_code_points(s1), max_len) == min(str_code_points(s2), max_len)
- * and both strings consist of the same sequence of characters,
- * up to max_len characters.
- *
- * A string S1 is less than another string S2 if it has a character with
- * lower value at the first character position where the strings differ.
+ * and both strings consist of the same sequence of code points,
+ * up to max_len code points.
+ *
+ * A string S1 is less than another string S2 if it has a code point with
+ * lower value at the first code point position where the strings differ.
  * If the strings differ in length, the shorter one is treated as if
- * padded by characters with a value of zero. Only the first max_len
- * characters are considered.
+ * padded by code points with a value of zero. Only the first max_len
+ * code points are considered.
  *
  * @param s1 First string to compare.
  * @param s2 Second string to compare.
- * @param max_len Maximum number of characters to consider.
+ * @param max_len Maximum number of code points to consider.
  *
  * @return 0 if the strings are equal, -1 if the first is less than the second,
… …
  * No more than @a size bytes are written. If the size of the output buffer
  * is at least one byte, the output string will always be well-formed, i.e.
- * null-terminated and containing only complete characters.
+ * null-terminated and containing only complete code points.
  *
  * @param dest Destination buffer.
… …
  * @a dest. No more than @a size bytes are written. The output string will
  * always be well-formed, i.e. null-terminated and containing only complete
- * characters.
+ * code points.
  *
  * No more than @a n bytes are read from the input string, so it does not
… …
  * Size of the destination buffer is @a dest. If the size of the output buffer
  * is at least one byte, the output string will always be well-formed, i.e.
- * null-terminated and containing only complete characters.
+ * null-terminated and containing only complete code points.
  *
  * @param dest Destination buffer.
… …
  *
  * @param dest Destination buffer.
- * @param dlen Number of utf16 characters that fit in the destination buffer.
+ * @param dlen Number of utf16 code points that fit in the destination buffer.
  * @param src Source string.
  *
… …
 }
 
-/** Find first occurence of character in string.
+/** Find first occurence of code point in string.
  *
  * @param str String to search.
- * @param ch Character to look for.
- *
- * @return Pointer to character in @a str or NULL if not found.
+ * @param ch code point to look for.
+ *
+ * @return Pointer to code point in @a str or NULL if not found.
  */
 char *str_chr(const char *str, wchar_t ch)
… …
  * @param n Needle (substring to look for)
  *
- * @return Pointer to character in @a hs or @c NULL if not found.
+ * @return Pointer to substring in @a hs or @c NULL if not found.
  */
 char *str_str(const char *hs, const char *n)
… …
 }
 
-/** Removes specified trailing characters from a string.
+/** Removes specified trailing code points from a string.
  *
  * @param str String to remove from.
- * @param ch Character to remove.
+ * @param ch Code point to remove.
  */
 void str_rtrim(char *str, wchar_t ch)
… …
 }
 
-/** Removes specified leading characters from a string.
+/** Removes specified leading code points from a string.
  *
  * @param str String to remove from.
- * @param ch Character to remove.
+ * @param ch code point to remove.
  */
 void str_ltrim(char *str, wchar_t ch)
… …
 }
 
-/** Find last occurence of character in string.
+/** Find last occurence of code point in string.
  *
  * @param str String to search.
- * @param ch Character to look for.
- *
- * @return Pointer to character in @a str or NULL if not found.
+ * @param ch code point to look for.
+ *
+ * @return Pointer to code point in @a str or NULL if not found.
  */
 char *str_rchr(const char *str, wchar_t ch)
… …
 }
 
-/** Insert a wide character into a wide string.
- *
- * Insert a wide character into a wide string at position
- * @a pos. The characters after the position are shifted.
+/** Insert a code point into a wide string.
+ *
+ * Insert a code point into a wide string at position
+ * @a pos. The code points after the position are shifted.
  *
  * @param str String to insert to.
- * @param ch Character to insert to.
- * @param pos Character index where to insert.
- * @param max_pos Characters in the buffer.
+ * @param ch Code point to insert.
+ * @param pos Code point index where to insert.
+ * @param max_pos Number of code points that fit in the buffer.
  *
  * @return True if the insertion was sucessful, false if the position
… …
 }
 
-/** Remove a wide character from a wide string.
- *
- * Remove a wide character from a wide string at position
- * @a pos. The characters after the position are shifted.
+/** Remove a code point from a wide string.
+ *
+ * Remove a code point from a wide string at position
+ * @a pos. The code points after the position are shifted.
  *
  * @param str String to remove from.
- * @param pos Character index to remove.
+ * @param pos Code point index to remove.
  *
  * @return True if the removal was sucessful, false if the position
… …
 /** Duplicate string.
  *
- * Allocate a new string and copy characters from the source
- * string into it. The duplicate string is allocated via sleeping
- * malloc(), thus this function can sleep in no memory conditions.
- *
- * The allocation cannot fail and the return value is always
- * a valid pointer. The duplicate string is always a well-formed
+ * Allocate a new string and copy the contents of the source string into it.
+ * The duplicate string is allocated as if by malloc().
+ *
+ * If successful, the duplicate string is always a well-formed
  * null-terminated UTF-8 string, but it can differ from the source
  * string on the byte level.
… …
  * @param src Source string.
  *
- * @return Duplicate string.
+ * @return Duplicate string, or NULL if allocation failed.
  *
  */
… …
  *
  * Allocate a new string and copy up to @max_size bytes from the source
- * string into it. The duplicate string is allocated via sleeping
- * malloc(), thus this function can sleep in no memory conditions.
+ * string into it. The duplicate string is allocated as if by malloc().
  * No more than @max_size + 1 bytes is allocated, but if the size
  * occupied by the source string is smaller than @max_size + 1,
  * less is allocated.
  *
- * The allocation cannot fail and the return value is always
- * a valid pointer. The duplicate string is always a well-formed
+ * If successful, the duplicate string is always a well-formed
  * null-terminated UTF-8 string, but it can differ from the source
  * string on the byte level.

-
uspace/lib/c/include/str.h
r08e103d4 r1d2f85e
 #define STR_NO_LIMIT ((size_t) -1)
 
-/** Maximum size of a string containing @c length characters */
+/** Maximum size of a string containing @c length code points */
 #define STR_BOUNDS(length) ((length) << 2)
 
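The shift by 2 in `STR_BOUNDS` multiplies the length by 4, which matches the fact that UTF-8 needs at most 4 bytes for any code point in the valid range 0 .. 1114111. The sketch below checks that bound on decoded input. The macro is copied from the hunk above; `utf8_encoded_size` and `utf8_total_size` are hypothetical helpers written only for this illustration and are not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same definition as in the hunk above: 4 bytes per code point. */
#define STR_BOUNDS(length) ((length) << 2)

/*
 * Hypothetical helper: number of bytes UTF-8 needs for one code point,
 * or 0 if the value is above the valid range. Surrogates are not
 * rejected in this sketch.
 */
static size_t utf8_encoded_size(uint32_t cp)
{
	if (cp < 0x80)
		return 1;
	if (cp < 0x800)
		return 2;
	if (cp < 0x10000)
		return 3;
	if (cp <= 0x10FFFF)
		return 4;
	return 0;
}

/*
 * Hypothetical helper: total UTF-8 size in bytes of an array of
 * code points, excluding any terminator.
 */
static size_t utf8_total_size(const uint32_t *cps, size_t length)
{
	size_t size = 0;

	for (size_t i = 0; i < length; i++)
		size += utf8_encoded_size(cps[i]);

	/* The encoded size can never exceed the STR_BOUNDS estimate. */
	assert(size <= STR_BOUNDS(length));
	return size;
}
```

For example, the four code points U+0041, U+00E9, U+20AC and U+1F600 encode to 1 + 2 + 3 + 4 = 10 bytes, while `STR_BOUNDS(4)` reserves 16. The macro is a worst-case bound and does not appear to account for the null terminator.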