putty-source

mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-09 17:38:00 +00:00

Author	SHA1	Message	Date
Simon Tatham	4cb429e3f4	Update to Unicode 15. Now I have a script I can easily re-run, there's no reason not to do just that! This updates all of the new generated header files for the UCD.zip that comes with Unicode 15.0.0. I've re-run my bidi test suite against 15.0.0's file of test cases, and confirmed they all pass.	2022-11-11 08:44:07 +00:00
Simon Tatham	4bb37233a5	Commit read_ucd.py's output and switch over to it. This removes the superseded tables in source files, and also all the code snippets in comments that generated them.	2022-11-11 08:44:07 +00:00
Simon Tatham	9a84a89c32	Add a batch of missing 'static's.	2022-09-03 12:02:48 +01:00
Simon Tatham	01d8561446	do_bidi: initialise paragraphOverride correctly. I'd forgotten to initialise it at all, which meant it was set to zero by the initial memset of the whole BidiContext on creation. But in our enumeration of bidi character types, zero corresponds to L (the most common left-to-right alphabetic character class), and as a value for paragraphOverride, that is not neutral. As a result, a command such as this (assuming UTF-8) echo -e '\xD7\x90\xD7\x91' would produce Hebrew aleph and beth in the correct display order (aleph on the right), but aligned to the left margin of the terminal instead of the right margin, because the overall direction of the line was taken to be forcibly overridden to "left-to-right" instead of being inferred dynamically from the line contents. do_bidi() is a tiny wrapper on the inner function that does all the real work. And the inner function has been subjected to the whole Unicode 14 bidi conformance test. So naturally, the "trivial" but untested function just outside it is where the embarrassing bug was.	2022-05-24 17:43:48 +01:00
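A minimal C sketch of the pitfall described in the commit above, assuming a cut-down type enumeration in which 0 means L. All `Demo`/`demo_` names are hypothetical and are not PuTTY's real identifiers; the choice of a neutral class as the "no override" value is also an assumption made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, trimmed-down enumeration: as in the commit message,
 * the value 0 is the strong left-to-right class L. */
typedef enum { DEMO_L = 0, DEMO_R, DEMO_AL, DEMO_ON } DemoBidiType;

typedef struct {
    DemoBidiType paragraphOverride;
} DemoContext;

/* Creation zeroes the whole structure, so paragraphOverride silently
 * starts out as DEMO_L, i.e. "force left-to-right". */
void demo_context_init(DemoContext *ctx)
{
    memset(ctx, 0, sizeof(*ctx));
}

/* In the spirit of the fix: explicitly reset the override to a neutral
 * value (an assumption here: DEMO_ON) before each run. */
void demo_prepare_run(DemoContext *ctx)
{
    ctx->paragraphOverride = DEMO_ON;
}

/* Returns 1 if the paragraph direction is forced, 0 if it would be
 * inferred from the line contents. */
int demo_direction_is_forced(const DemoContext *ctx)
{
    return ctx->paragraphOverride == DEMO_L ||
           ctx->paragraphOverride == DEMO_R;
}

/* Usage: a freshly zeroed context forces a direction until it is reset. */
void demo_override_self_check(void)
{
    DemoContext ctx;
    demo_context_init(&ctx);
    assert(demo_direction_is_forced(&ctx) == 1);
    demo_prepare_run(&ctx);
    assert(demo_direction_is_forced(&ctx) == 0);
}
```

The point of the sketch is only that a zero-filled enum field is not automatically "unset" when 0 is a meaningful member of the enumeration.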
Simon Tatham	1f6fa876e3	do_bidi: remove a pointless assert. When the textlen parameter became a size_t, it became unsigned, so it stopped being useful to assert() its non-negativity. Spotted by Coverity. Harmless, but ordinary compilers have been known to emit annoying warnings about that kind of thing too, so it's worth fixing just to avoid noise.	2022-01-29 18:24:31 +00:00
Simon Tatham	e744071a03	Remove some unused variables. clang warned about these in the recent bidi work.	2021-10-16 12:03:39 +01:00
Simon Tatham	54930cf784	bidi.c: correct comments. I accidentally deleted the original author's name in my rewrite, which was unnecessarily unfriendly given that some of their code is still here. Also I made a thinko in my explanation of the U+00AD problem.	2021-10-10 22:55:41 +01:00
Simon Tatham	93ba74579a	Test rig for the new bidi algorithm. This standalone CLI program runs the UCD bidi tests in the form provided in Unicode 14.0.0. You can run it by just saying bidi_test --class BidiTest.txt --char BidiCharacterTest.txt assuming those two UCD files are in the current directory.	2021-10-10 15:00:30 +01:00
Simon Tatham	b8be01adca	Complete rewrite of the bidi algorithm. A user reported that PuTTY's existing bidi algorithm will generate misordered text in cases like this (assuming UTF-8): echo -e '12 A \xD7\x90\xD7\x91 B' The hex codes in the middle are the Hebrew letters aleph and beth. Appearing in the middle of a line whose primary direction is left-to-right, those two letters should appear in the opposite order, but not cause the rest of the line to move around. That is, you expect the displayed text in this situation to be 12 A <beth><aleph> B But in fact, the digits '12' were erroneously reversed, so you would actually see '21 A <beth><aleph> B'. I tried to debug the existing bidi algorithm, but it was very hard, because the Unicode bidi spec has been extensively changed since Arabeyes contributed that code, and I couldn't even reliably work out which version of the spec the code was intended to implement. I found some problems, notably that the resolution phase was running once on the whole line instead of separately on runs of characters at the same level, and also that the 'sor' and 'eor' values were being wrongly computed. But I had no way to test any fix to ensure it hadn't introduced another bug somewhere else. Unicode provides a set of conformance tests in the UCD. That was just what I wanted - but they're too up-to-date to run against the old algorithm and expect to pass! So, paradoxically, it seemed to me that the _easiest_ way to fix this bidi bug would be to bring absolutely everything up to date. But the revised bidi algorithm is significantly more complicated, so I also didn't think it would be sensible to try to gradually evolve the existing code into it. Instead, I've done a complete rewrite of my own. The new code implements the full UAX#9 rev 44 algorithm, including in particular support for the new 'directional isolate' control characters, and also special handling for matched pairs of brackets in the text (see rule N0 in the spec). I've managed to get it to pass the entire UCD conformance test suite, so I'm reasonably confident it's right, or at the very least a lot closer to right than the old algorithm was. So the upshot is: the test case shown at the top of this file now passes, but also, other detailed bidi handling might have changed, certainly some cases involving brackets, but perhaps also other things that were either bugs in the old algorithm or updates to the standard.	2021-10-10 15:00:30 +01:00
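To make the test input in the commit above concrete, here is a small C sketch that decodes the two-byte UTF-8 sequences `\xD7\x90` and `\xD7\x91` into code points. The function name is hypothetical and only the two-byte form is handled; it is not taken from PuTTY's source.

```c
#include <assert.h>

/* Decode one two-byte UTF-8 sequence (110xxxxx 10xxxxxx).
 * Returns the code point, or 0 if the bytes do not have that shape.
 * Illustration only: other sequence lengths are not handled. */
unsigned long demo_decode_utf8_2(const unsigned char *p)
{
    if ((p[0] & 0xE0) != 0xC0 || (p[1] & 0xC0) != 0x80)
        return 0;
    return ((unsigned long)(p[0] & 0x1F) << 6) | (unsigned long)(p[1] & 0x3F);
}

/* Usage: the bytes from the echo command are U+05D0 and U+05D1. */
void demo_utf8_self_check(void)
{
    const unsigned char aleph[] = { 0xD7, 0x90 };
    const unsigned char beth[]  = { 0xD7, 0x91 };
    assert(demo_decode_utf8_2(aleph) == 0x05D0);
    assert(demo_decode_utf8_2(beth)  == 0x05D1);
}
```

U+05D0 and U+05D1 are the Hebrew letters aleph and beth (bet), both strongly right-to-left, which is why they are expected to swap places on display while the surrounding Latin text and digits stay put.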
Simon Tatham	caa16deb1c	bidi.c: update the API. The input length field is now a size_t rather than an int, on general principles. The return value is now void (we weren't using the previous return value at all). And we now require the client to have previously allocated a BidiContext, which will allow allocated storage to be reused between runs, saving a lot of churn on malloc. (However, the current BidiContext doesn't contain anything interesting. I could have moved the existing mallocs into it, but there's no point, since I'm about to rewrite the whole thing anyway.)	2021-10-10 14:55:16 +01:00
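The API shape described in the commit above (a `size_t` length, a `void` return, and a caller-allocated context whose storage can be reused between runs) might look roughly like the following C sketch. Every `Demo`/`demo_` name, the structure's fields, and the placeholder body are assumptions for illustration; this is not the contents of PuTTY's `BidiContext` or its real function signatures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context: owns scratch storage that survives between runs. */
typedef struct {
    unsigned *scratch;     /* working buffer kept across calls */
    size_t scratch_size;   /* number of elements currently allocated */
    unsigned alloc_count;  /* demo-only counter of (re)allocations */
} DemoBidiContext;

DemoBidiContext *demo_bidi_new_context(void)
{
    DemoBidiContext *ctx = malloc(sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->scratch = NULL;
    ctx->scratch_size = 0;
    ctx->alloc_count = 0;
    return ctx;
}

void demo_bidi_free_context(DemoBidiContext *ctx)
{
    if (!ctx)
        return;
    free(ctx->scratch);
    free(ctx);
}

/* Placeholder per-line pass: size_t length, void return. The buffer only
 * grows when a line is longer than any seen so far, so repeated calls on
 * similar-length lines cause no further allocation. */
void demo_do_bidi(DemoBidiContext *ctx, unsigned *line, size_t count)
{
    if (count > ctx->scratch_size) {
        unsigned *p = realloc(ctx->scratch, count * sizeof(*p));
        if (!p)
            return; /* sketch only: a real program would report the failure */
        ctx->scratch = p;
        ctx->scratch_size = count;
        ctx->alloc_count++;
    }
    for (size_t i = 0; i < count; i++)
        ctx->scratch[i] = line[i]; /* stand-in for the real reordering work */
}

/* Usage: three runs, but only two allocations. */
void demo_context_self_check(void)
{
    unsigned line[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    DemoBidiContext *ctx = demo_bidi_new_context();
    assert(ctx != NULL);
    demo_do_bidi(ctx, line, 4);
    demo_do_bidi(ctx, line, 2);
    demo_do_bidi(ctx, line, 8);
    assert(ctx->alloc_count == 2);
    demo_bidi_free_context(ctx);
}
```

Keeping the buffer in the context rather than allocating inside the per-line function is what lets a terminal that reorders every displayed line avoid a malloc/free pair per call.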
Simon Tatham	d7548d0449	Move bidi gettype main() into its own file. That's what I've usually been doing with any main()s I find under ifdef; there's no reason this should be an exception. If we're keeping it in the code at all, we should ensure it carries on compiling. I've also created a new header file bidi.h, containing pieces of the bidi definitions shared between bidi.c and the new source file.	2021-10-10 14:53:25 +01:00
Simon Tatham	0377c689f2	Start a 'terminal' source subdirectory. This contains terminal.c, bidi.c (formerly minibidi.c), and terminal.h. I'm about to make a couple more bidi-related source files, so it seems worth starting by making a place to put them that won't be cluttering up the top level.	2021-10-10 14:37:10 +01:00

12 Commits