Changeset 1d2f85e in mainline for boot/generic/src/str.c
- Timestamp: 2019-02-05T18:26:54Z (6 years ago)
- Parents: 08e103d4
- Files: 1 edited
boot/generic/src/str.c
--- boot/generic/src/str.c (r08e103d4)
+++ boot/generic/src/str.c (r1d2f85e)
@@ -37,5 +37,5 @@
  * Strings and characters use the Universal Character Set (UCS). The standard
  * strings, called just strings are encoded in UTF-8. Wide strings (encoded
- * in UTF-32) are supported to a limited degree. A single character is
+ * in UTF-32) are supported to a limited degree. A single code point is
  * represented as wchar_t.@n
  *
@@ -46,5 +46,5 @@
  * byte 8 bits stored in uint8_t (unsigned 8 bit integer)
  *
- * character UTF-32 encoded Unicode character, stored in wchar_t
+ * character UTF-32 encoded Unicode code point, stored in wchar_t
  * (signed 32 bit integer), code points 0 .. 1114111
  * are valid
@@ -62,5 +62,5 @@
  * the NULL-terminator), size_t
  *
- * [wide] string length number of CHARACTERS in a [wide] string (excluding
+ * [wide] string length number of CODE POINTS in a [wide] string (excluding
  * the NULL-terminator), size_t
  *
@@ -76,5 +76,5 @@
  * NULL-terminator)
  *
- * length l size_t number of CHARACTERS in a string (excluding the
+ * length l size_t number of CODE POINTS in a string (excluding the
  * null terminator)
  *
@@ -85,5 +85,5 @@
  * Function naming prefixes:@n
  *
- * chr_ operate on characters
+ * chr_ operate on code points
  * ascii_ operate on ASCII characters
  * str_ operate on strings
@@ -98,5 +98,5 @@
  * pointer (char *, wchar_t *)
  * byte offset (size_t)
- * character index (size_t)
+ * code point index (size_t)
  *
  */
@@ -128,9 +128,9 @@
 #define CONT_BITS 6

-/** Decode a single character from a string.
- *
- * Decode a single character from a string of size @a size. Decoding starts
+/** Decode a single code point from an UTF-8 encoded string.
+ *
+ * Decode a single code point from a string of size @a size. Decoding starts
  * at @a offset and this offset is moved to the beginning of the next
- * character. In case of decoding error, offset generally advances at least
+ * code point. In case of decoding error, offset generally advances at least
  * by one. However, offset is never moved beyond size.
  *
@@ -139,5 +139,5 @@
  * @param size Size of the string (in bytes).
  *
- * @return Value of decoded character, U_SPECIAL on decoding error or
+ * @return Value of decoded code point, U_SPECIAL on decoding error or
  * NULL if attempt to decode beyond @a size.
  *
@@ -198,17 +198,17 @@
 }

-/** Encode a single character to string representation.
- *
- * Encode a single character to string representation (i.e. UTF-8) and store
+/** Encode a single code point to a UTF-8 string representation.
+ *
+ * Encode a single code point to a UTF-8 string representation and store
  * it into a buffer at @a offset. Encoding starts at @a offset and this offset
- * is moved to the position where the next character can be written to.
- *
- * @param ch Input character.
+ * is moved to the position where the next code point can be written to.
+ *
+ * @param ch Input code point.
  * @param str Output buffer.
  * @param offset Byte offset where to start writing.
  * @param size Size of the output buffer (in bytes).
  *
- * @return EOK if the character was encoded successfully, EOVERFLOW if there
- * was not enough space in the output buffer or EINVAL if the character
+ * @return EOK if the code point was encoded successfully, EOVERFLOW if there
+ * was not enough space in the output buffer or EINVAL if the code point
  * code was invalid.
  */
@@ -289,15 +289,15 @@
 }

-/** Get size of string with length limit.
+/** Get size of string with code point count limit.
  *
  * Get the number of bytes which are used by up to @a max_len first
- * characters in the string @a str. If @a max_len is greater than
- * the length of @a str, the entire string is measured (excluding the
- * NULL-terminator).
+ * code points in the string @a str. If @a max_len is greater than
+ * the number of code points in @a str, the entire string is measured
+ * (excluding the NULL-terminator).
  *
  * @param str String to consider.
- * @param max_len Maximum number of characters to measure.
- *
- * @return Number of bytes used by the characters.
+ * @param max_len Maximum number of code points to measure.
+ *
+ * @return Number of bytes used by the code points.
  *
  */
@@ -317,9 +317,9 @@
 }

-/** Get number of characters in a string.
- *
- * @param str NULL-terminated string.
- *
- * @return Number of characters in string.
+/** Get number of unicode code points in a UTF-8 encoded string.
+ *
+ * @param str NULL-terminated UTF-8 string.
+ *
+ * @return Number of code points in the string.
  *
  */
@@ -335,7 +335,7 @@
 }

-/** Check whether character is plain ASCII.
- *
- * @return True if character is plain ASCII.
+/** Check whether code point is plain ASCII.
+ *
+ * @return True if code point is plain ASCII.
  *
  */
@@ -348,7 +348,7 @@
 }

-/** Check whether character is valid
- *
- * @return True if character is a valid Unicode code point.
+/** Check whether code point is valid
+ *
+ * @return True if code point is a valid Unicode code point.
  *
  */
@@ -365,10 +365,10 @@
  * Do a char-by-char comparison of two NULL-terminated strings.
  * The strings are considered equal iff their length is equal
- * and both strings consist of the same sequence of characters.
- *
- * A string S1 is less than another string S2 if it has a character with
- * lower value at the first character position where the strings differ.
+ * and both strings consist of the same sequence of code points.
+ *
+ * A string S1 is less than another string S2 if it has a code point with
+ * lower value at the first code point position where the strings differ.
  * If the strings differ in length, the shorter one is treated as if
- * padded by characters with a value of zero.
+ * padded by code points with a value of zero.
  *
  * @param s1 First string to compare.
@@ -409,5 +409,5 @@
  * No more than @a size bytes are written. If the size of the output buffer
  * is at least one byte, the output string will always be well-formed, i.e.
- * null-terminated and containing only complete code points.
+ * null-terminated and containing only complete code points.
  *
  * @param dest Destination buffer.
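The comments touched by this changeset separate three measures of a string: its size in bytes, its length in code points, and the byte offset used while decoding. The sketch below illustrates that distinction; it is not the code in str.c. The names `sketch_decode`, `sketch_length`, `sketch_lsize` and the constant `SKETCH_SPECIAL` are hypothetical, and the value chosen for `SKETCH_SPECIAL` is an assumption, since the changeset does not show the definition of U_SPECIAL. The sketch also skips checks for overlong encodings and surrogates that a production decoder would likely need.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for U_SPECIAL; the real value is not shown above. */
#define SKETCH_SPECIAL 0xFFFDu

/*
 * Decode one code point from a UTF-8 buffer of `size` bytes, starting at
 * *offset. *offset is advanced past the bytes consumed and never beyond
 * size. Returns 0 when *offset is already at or beyond size.
 */
static uint32_t sketch_decode(const char *str, size_t *offset, size_t size)
{
    if (*offset >= size)
        return 0;

    uint8_t b0 = (uint8_t) str[(*offset)++];
    unsigned cont;  /* number of continuation bytes expected */
    uint32_t ch;

    if ((b0 & 0x80) == 0x00) {
        return b0;  /* 0xxxxxxx: plain ASCII */
    } else if ((b0 & 0xE0) == 0xC0) {
        cont = 1;
        ch = b0 & 0x1F;
    } else if ((b0 & 0xF0) == 0xE0) {
        cont = 2;
        ch = b0 & 0x0F;
    } else if ((b0 & 0xF8) == 0xF0) {
        cont = 3;
        ch = b0 & 0x07;
    } else {
        return SKETCH_SPECIAL;  /* stray continuation or invalid lead byte */
    }

    while (cont-- > 0) {
        if (*offset >= size)
            return SKETCH_SPECIAL;  /* sequence truncated by the buffer end */
        uint8_t b = (uint8_t) str[*offset];
        if ((b & 0xC0) != 0x80)
            return SKETCH_SPECIAL;  /* leave this byte for the next call */
        (*offset)++;
        ch = (ch << 6) | (b & 0x3F);  /* 6 payload bits per continuation byte */
    }
    return ch;
}

/* Number of code points before the NUL terminator or the end of the buffer. */
static size_t sketch_length(const char *str, size_t size)
{
    size_t offset = 0;
    size_t len = 0;

    while (sketch_decode(str, &offset, size) != 0)
        len++;
    return len;
}

/* Number of bytes used by up to max_len leading code points. */
static size_t sketch_lsize(const char *str, size_t size, size_t max_len)
{
    size_t offset = 0;
    size_t len = 0;

    while (len < max_len) {
        size_t prev = offset;
        if (sketch_decode(str, &offset, size) == 0) {
            offset = prev;  /* do not count the terminator */
            break;
        }
        len++;
    }
    return offset;
}
```

For the 6-byte string "h\xc3\xa9llo" (the second code point takes two bytes), `sketch_length` reports 5 code points, while `sketch_lsize` with a limit of 2 reports 3 bytes. This is the gap between "length" and "size" that the reworded comments make explicit.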