mirror of
https://git.tartarus.org/simon/putty.git
synced 2025-07-01 03:22:48 -05:00
decode_utf8: add an enumeration of failure reasons.
Now you can optionally get back an enum value indicating whether the character was successfully decoded, or whether U+FFFD was substituted due to some kind of problem, and if the latter, what problem.

For a start, this allows distinguishing 'real' U+FFFD (encoded legitimately in the input) from one invented by the decoder. Also, it allows the recipient of the decode to treat failures differently, either by passing on a useful error report to the user (as utf8_unknown_char now does) or by doing something special.

In particular, there are two distinct error codes for a truncated UTF-8 encoding, depending on whether it was truncated by the end of the input or by encountering a non-continuation byte. The former code means that the string is not legal UTF-8 _as it is_, but doesn't rule out it being a (bytewise) prefix of a legal UTF-8 string - so if a client is receiving UTF-8 data a byte at a time, they can treat that error code specially and not make it a fatal error.
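To make the idea concrete, here is a minimal sketch of a decoder that reports a failure reason through an optional out-parameter. It is not PuTTY's implementation: the enum `DecodeStatus`, its `DEC_*` values, and the function `decode_one` are hypothetical names invented for illustration, and the set of failure reasons is a plausible guess, not the commit's actual list. The sketch assumes the caller guarantees at least one byte of input is available.

```c
#include <stddef.h>

/* Hypothetical failure codes; the names are illustrative, not PuTTY's. */
typedef enum {
    DEC_OK,                /* character decoded successfully */
    DEC_ILLEGAL_BYTE,      /* byte that cannot start a sequence */
    DEC_TRUNCATED_BY_EOF,  /* input ended mid-sequence: may be a legal prefix */
    DEC_TRUNCATED_BY_BYTE, /* non-continuation byte arrived mid-sequence */
    DEC_OVERLONG,          /* value encoded in more bytes than necessary */
    DEC_OUT_OF_RANGE       /* surrogate, or above U+10FFFF */
} DecodeStatus;

/*
 * Decode one character from buf[*pos .. len) and advance *pos.
 * Returns the code point, or U+FFFD on failure. If status is non-NULL,
 * it receives the reason; passing NULL means "I don't care why".
 * Assumes *pos < len on entry.
 */
static unsigned long decode_one(const unsigned char *buf, size_t len,
                                size_t *pos, DecodeStatus *status)
{
    DecodeStatus st = DEC_OK;
    unsigned long c = buf[(*pos)++];
    unsigned long min = 0;  /* smallest value legal for this length */
    int more = 0;           /* continuation bytes still expected */

    if (c < 0x80) {
        /* ASCII: nothing more to do */
    } else if (c < 0xC0) {
        st = DEC_ILLEGAL_BYTE;            /* stray continuation byte */
    } else if (c < 0xE0) {
        c &= 0x1F; more = 1; min = 0x80;
    } else if (c < 0xF0) {
        c &= 0x0F; more = 2; min = 0x800;
    } else if (c < 0xF8) {
        c &= 0x07; more = 3; min = 0x10000;
    } else {
        st = DEC_ILLEGAL_BYTE;            /* 0xF8..0xFF never appear */
    }

    while (st == DEC_OK && more-- > 0) {
        if (*pos == len)
            st = DEC_TRUNCATED_BY_EOF;
        else if ((buf[*pos] & 0xC0) != 0x80)
            st = DEC_TRUNCATED_BY_BYTE;   /* leave that byte unconsumed */
        else
            c = (c << 6) | (buf[(*pos)++] & 0x3F);
    }

    if (st == DEC_OK && c < min)
        st = DEC_OVERLONG;
    if (st == DEC_OK && (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)))
        st = DEC_OUT_OF_RANGE;

    if (status)
        *status = st;
    return st == DEC_OK ? c : 0xFFFD;
}
```

With this shape, a caller reading a byte stream could treat `DEC_TRUNCATED_BY_EOF` as "wait for more input" and every other non-`DEC_OK` value as a real error, while a caller that only wants the substituted character passes `NULL` for the last argument.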
@@ -1365,7 +1365,7 @@ int mb_to_wc(int codepage, int flags, const char *mbstr, int mblen,
 
         while (get_avail(src)) {
             wchar_t wcbuf[2];
-            size_t nwc = decode_utf8_to_wchar(src, wcbuf);
+            size_t nwc = decode_utf8_to_wchar(src, wcbuf, NULL);
 
             for (size_t i = 0; i < nwc; i++) {
                 if (remaining > 0) {