putty-source

mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-09 17:38:00 +00:00

Author	SHA1	Message	Date
Simon Tatham	4f756d2a4d	Rework Unicode conversion APIs to use a BinarySink. The previous mb_to_wc and wc_to_mb had horrible and also buggy APIs. This commit introduces a fresh pair of functions to replace them, which generate output by writing to a BinarySink. So it's now up to the caller to decide whether it wants the output written to a fixed-size buffer with overflow checking (via buffer_sink), or dynamically allocated, or even written directly to some other output channel. Nothing uses the new functions yet. I plan to migrate things over in upcoming commits. What was wrong with the old APIs: they had that awkward undocumented Windows-specific 'flags' parameter that I described in the previous commit and took out of the dup_X_to_Y wrappers. But much worse, the semantics for buffer overflow were not just undocumented but actually inconsistent. dup_wc_to_mb() in utils assumed that the underlying wc_to_mb would fill the buffer nearly full and return the size of data it wrote. In fact, this was untrue in the case where wc_to_mb called WideCharToMultiByte: that returns straight-up failure, setting the Windows error code to ERROR_INSUFFICIENT_BUFFER. It _does_ partially fill the output buffer, but doesn't tell you how much it wrote! What's wrong with the new API: it's a bit awkward to write a sequence of wchar_t in native byte order to a byte-oriented BinarySink, so people using put_mb_to_wc directly have to do some annoying pointer casting. But I think that's less horrible than the previous APIs. Another change: in the new API for wc_to_mb, defchr can be "", but not NULL.	2024-09-26 11:30:07 +01:00
Simon Tatham	edce3fb9da	Add platform-independent Unicode setup function. Similarly to the one I just added for FontSpec: in a cross-platform main source file, you don't really want to mess about with per-platform ifdefs just to initialise a 'struct unicode_data' from a Conf. But until now, you had to, because init_ucs had a different prototype on Windows and Unix. I plan to use this in future test programs. But an immediate positive effect is that it removes the only platform-dependent call from fuzzterm.c. So now that could be built on Windows too, given only an appropriate cmake stanza. (Not that I have much idea if it's useful to fuzz the terminal separately on multiple platforms, but it's nice to know that it's possible if anyone does need to.)	2023-02-18 14:10:27 +00:00
Simon Tatham	5a28658a6d	Remove uni_tbl from struct unicode_data. Instead of maintaining a single sparse table mapping Unicode to the currently selected code page, we now maintain a collection of such tables mapping Unicode to any code page we've so far found a need to work with, and we add code pages to that list as necessary, and never throw them away (since there are a limited number of them). This means that the wc_to_mb family of functions are effectively stateless: they no longer depend on a 'struct unicode_data' corresponding to the current terminal settings. So I've removed that parameter from all of them. This fills in the missing piece of yesterday's commit `a216d86106`: now wc_to_mb too should be able to handle internally-implemented character sets, by hastily making their reverse mapping table if it doesn't already have it. (That was only a _latent_ bug, because the only use of wc_to_mb in the cross-platform or Windows code _did_ want to convert to the currently selected code page, so the old strategy worked in that case. But there was no protection against an unworkable use of it being added later.)	2022-06-01 09:28:25 +01:00
Simon Tatham	8a907510dd	decode_codepage(): add missing const in prototype.	2022-06-01 08:29:29 +01:00
Simon Tatham	cf41bc0c62	Unix mb_to_wc: add missing bounds checks. Checking various implementations of these functions against each other, I noticed by eyeball review that some of the special cases in mb_to_wc() never check the buffer limit at all. Yikes! Fortunately, I think there's no vulnerability, because these special cases are ones that write out at most one wide char per multibyte char, and at all the call sites (including dup_mb_to_wc) we allocate that much even for the first attempt. The only exception to that is the call in key_event() in unix/window.c, which uses a fixed-size output buffer, but its input will always be the data generated by an X keystroke event. So that one can only overrun the buffer if an X key event manages to translate into more than 32 wide characters of text - and even if that does come up in some exotic edge case, it will at least not be happening under _enemy_ control.	2022-03-12 18:51:21 +00:00
Simon Tatham	f39c51f9a7	Rename most of the platform source files. This gets rid of all those annoying 'win', 'ux' and 'gtk' prefixes which made filenames annoying to type and to tab-complete. Also, as with my other recent renaming sprees, I've taken the opportunity to expand and clarify some of the names so that they're not such cryptic abbreviations.	2021-04-26 18:00:01 +01:00
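
Illustration for commit 4f756d2a4d: a minimal sketch of the "write conversion output to a sink" idea, not PuTTY's actual code. Every name here (`ByteSink`, `FixedSink`, `fixed_sink_init`, `put_utf8`) is hypothetical and stands in for the real BinarySink, buffer_sink and put_wc_to_mb, whose signatures are not shown in this log. The sketch assumes the input is a sequence of Unicode scalar values and the output encoding is UTF-8. It shows how a converter that only ever calls the sink's write function lets the caller choose the overflow policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte sink: the converter only ever calls write(). */
typedef struct ByteSink ByteSink;
struct ByteSink {
    void (*write)(ByteSink *sink, const void *data, size_t len);
};

/* One possible sink: a fixed-size buffer that records overflow
 * instead of silently truncating or failing without a count. */
typedef struct FixedSink {
    ByteSink base;      /* first member, so a ByteSink* can be cast back */
    char *buf;
    size_t size, used;
    bool overflowed;
} FixedSink;

static void fixed_write(ByteSink *sink, const void *data, size_t len)
{
    FixedSink *fs = (FixedSink *)sink;
    assert(fs->used <= fs->size);
    /* After the first overflow, discard everything, so 'used' always
     * counts whole writes that fitted. */
    if (fs->overflowed || len > fs->size - fs->used) {
        fs->overflowed = true;
        return;
    }
    memcpy(fs->buf + fs->used, data, len);
    fs->used += len;
}

static void fixed_sink_init(FixedSink *fs, char *buf, size_t size)
{
    fs->base.write = fixed_write;
    fs->buf = buf;
    fs->size = size;
    fs->used = 0;
    fs->overflowed = false;
}

/* Encode n code points as UTF-8, one write() per character.
 * Assumes valid scalar values (no surrogate or range checks). */
static void put_utf8(ByteSink *sink, const unsigned long *cps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned long c = cps[i];
        unsigned char out[4];
        size_t len;
        if (c < 0x80) {
            out[0] = (unsigned char)c;
            len = 1;
        } else if (c < 0x800) {
            out[0] = (unsigned char)(0xC0 | (c >> 6));
            out[1] = (unsigned char)(0x80 | (c & 0x3F));
            len = 2;
        } else if (c < 0x10000) {
            out[0] = (unsigned char)(0xE0 | (c >> 12));
            out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[2] = (unsigned char)(0x80 | (c & 0x3F));
            len = 3;
        } else {
            out[0] = (unsigned char)(0xF0 | (c >> 18));
            out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
            out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[3] = (unsigned char)(0x80 | (c & 0x3F));
            len = 4;
        }
        sink->write(sink, out, len);
    }
}
```

A caller would initialise a `FixedSink` over its own buffer, pass `&fs.base` to `put_utf8`, and then inspect `fs.used` and `fs.overflowed`. Unlike the old behaviour described in the commit message, the caller always learns both whether the output fitted and how many bytes were written. A dynamically allocating sink could be swapped in without touching the converter.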

6 Commits
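
Illustration for commit 5a28658a6d: a rough sketch of keeping one reverse (Unicode to byte) table per code page, built on first use and never freed, so that the conversion function needs no per-terminal state. This is not PuTTY's implementation. The two toy code pages, the dense 64K table (the commit message describes sparse tables), and the names `forward_lookup`, `get_reverse`, `wc_to_byte` and `tables_built` are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NCODEPAGES 2
#define REV_SIZE 0x10000u   /* toy limit: Basic Multilingual Plane only */

/* Hypothetical forward mappings (byte -> Unicode) for two toy code pages:
 * page 0 is the identity on 0..255; page 1 is the same except that
 * byte 0x80 maps to U+20AC (euro sign). */
static unsigned forward_lookup(int cp, unsigned char byte)
{
    if (cp == 1 && byte == 0x80)
        return 0x20AC;
    return byte;
}

/* One reverse table per code page, NULL until first needed. */
static unsigned char *reverse_tables[NCODEPAGES];
static int tables_built = 0;   /* counts how many tables were constructed */

static const unsigned char *get_reverse(int cp)
{
    assert(cp >= 0 && cp < NCODEPAGES);
    if (!reverse_tables[cp]) {
        unsigned char *t = calloc(REV_SIZE, 1);
        if (!t)
            return NULL;
        /* Byte 0 is skipped so that a zero entry can mean "unmapped". */
        for (int b = 1; b < 256; b++)
            t[forward_lookup(cp, (unsigned char)b)] = (unsigned char)b;
        reverse_tables[cp] = t;    /* kept for the life of the process */
        tables_built++;
    }
    return reverse_tables[cp];
}

/* Convert one wide character to a byte in code page cp, or -1 if the
 * character has no representation there. Takes no terminal state. */
static int wc_to_byte(int cp, unsigned wc)
{
    const unsigned char *t = get_reverse(cp);
    if (!t || wc >= REV_SIZE)
        return -1;
    if (wc == 0)
        return 0;
    return t[wc] ? t[wc] : -1;
}
```

Because the table for a code page is created the first time any caller asks for it, converting to a code page other than the currently selected one works without any setup, which is the property the commit message describes.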