This is the payoff from the last few commits of refactoring. It fixes
the following race-condition bug in terminal application redraw:
* server sends a window-resizing escape sequence
* terminal requests a window resize from the front end
* server sends further escape sequences to perform a redraw of some
full-screen application, which assume that the window resize has
occurred and the window is already its new size
* terminal processes all those sequences in the context of the old
window size, while the front end is still thinking
* window resize completes in the front end and term_size() tells the
terminal it now has its new size, but it's too late, the screen
redraw has made a total mess.
(Perhaps the server might even send its window resize + followup
redraw all in one SSH packet, so that it's all queued in term->inbuf
in one go.)
As far as I can see, handling of this case has been broken more or
less forever in the GTK frontend (where window resizes are inherently
asynchronous due to the way X11 works, and we've never done anything
to compensate for that). On Windows, where window size is changed via
SetWindowPos which is synchronous, it used to work, but broke in
commit d74308e90e (i.e. between 0.74 and 0.75), which made all
the ancillary window updates run on the same delayed-action timer as
ordinary text display.
So, it's time to fix it, and I think now I should be able to fix it in
GTK as well as on Windows.
Now, as soon as we've set the term->win_resize_pending flag (in
response to a resize escape sequence), the next return to the top of
the main loop in term_out will terminate output processing early,
leaving any further terminal data still in the term->inbuf bufchain.
Once we get a term_size() callback from the front end telling us our
new size, we reset term->win_resize_pending, which unblocks output
processing again, and we also queue a toplevel callback to have
another try at term_out() so that it will be unblocked promptly.
To implement this I've changed term->win_resize_pending from a bool
into a three-state enumeration, so that we can tell the difference
between 'pending' in the sense of not yet having sent our resize
request to the frontend, and in the sense of waiting for the frontend
to reply. That way, a window resize from the GUI user at least won't
be mistaken for the response to our resize request if it arrives in
the former state. (It can still be mistaken for one in the latter
case, but if the user is resizing the window at the same time as the
server-side application is doing critically size-dependent redrawing,
I don't think there can be any reasonable expectation of nothing going
wrong.)
As mentioned in the previous commit, some failure modes under X11 (in
particular the window manager process getting wedged in some way) can
result in no response being received to a ConfigureWindow request. In
that situation, it seems to me that we really _shouldn't_ sit there
waiting forever - perhaps it's technically the WM's fault and not
ours, but what kind of X window are you most likely to want to use to
do emergency WM repair? A terminal window, of course, so it would be
exceptionally unhelpful to make any terminal window stop working
completely in this situation! Hence, there's a fallback timeout in
terminal.c, so that if we don't receive a response in _too_ long,
we'll assume one is not forthcoming, and resume processing terminal
data at the old window size. The fallback timeout is set to 5 seconds,
following existing practice in libXt (DEFAULT_WM_TIMEOUT).
There's no actual need to copy the data from term->inbuf into a local
variable inside term_out(). We can simply store a pointer and length,
and use the data _in situ_ - as long as we remember how much of it
we've used, and bufchain_consume() it when the routine exits.
Getting rid of that awkward and size-limited local array should
marginally improve performance. But also, it opens up the possibility
to suddenly suspend handling of terminal data and leave everything not
yet processed in the bufchain, because now we never remove anything
from the bufchain until _after_ it's been processed, so there's no
need to awkwardly push the unused segment of localbuf[] back on to the
front of the bufchain if we need to do that.
NFC, but as usual, I have a plan to use the new capability in a
followup commit.
This tiny refactoring makes a convenient function for setting all the
'pending' flags and triggering a callback for the next window update.
This saves a bit of code, but that's not really the main point (or
else I'd have done the same to all the other similar things like
window moves). The point is that in a future commit I'm going to want
to do an extra thing on every server-controlled window resize, and
this refactoring gives me a single place to put that extra action.
This tiny refactoring replaces three identical checks at call sites,
not all as well commented as each other, with a check in just one
place with the best of the three comments.
Previously, when we scrolled the terminal, the newly exposed line at
the bottom would be immediately allocated a trust status corresponding
to the current state of the terminal. So if you're in trusted mode and
you print a newline, then the line scrolled on at the bottom
immediately gets a trust sigil, whether you subsequently print
anything on it or not.
Up until now, that hasn't mattered, because we always _do_ print
something on it. But if you don't - if you send \r\n\r\n to
deliberately leave a blank line - then it turns out that's not what we
want after all, because if the screen _doesn't_ scroll, the
passed-over line remains completely blank, whereas if it does scroll
the blank line gets a trust sigil, which is inconsistent.
Now, terminal lines newly exposed by a scroll have untrusted status,
just the same as terminal lines that were present in the initial blank
screen. They only become trusted if you actually print at least one
character on them (whereupon check_trust_status will re-clear them
just in case). And this is now independent of whether the terminal has
scrolled or not.
This is the same as the previous FUNKY_XTERM mode if you don't press
any modifier keys, but now Shift or Ctrl or Alt with function keys
adds an extra bitmap parameter. The bitmaps are the same as the ones
used by the new SHARROW_BITMAP arrow key mode.
As well as affecting the bitmap field in the escape sequence, it was
_also_ having its otherwise standard effect of prefixing Esc to the
whole sequence. It shouldn't do both.
This commit introduces a new config option for how to handle shifted
arrow keys.
In the default mode (SHARROW_APPLICATION), we do what we've always
done: Ctrl flips the arrow keys between sending their most usual
escape sequences (ESC [ A ... ESC [ D) and sending the 'application
cursor keys' sequences (ESC O A ... ESC O D). Whichever of those modes
is currently configured, Ctrl+arrow sends the other one.
In the new mode (SHARROW_BITMAP), application cursor key mode is
unaffected by any shift keys, but the default sequences acquire two
numeric arguments. The first argument is 1 (reflecting the fact that a
shifted arrow key still notionally moves just 1 character cell); the
second is the bitmap (1 for Shift) + (2 for Alt) + (4 for Ctrl),
offset by 1. (Except that if _none_ of those modifiers is pressed,
both numeric arguments are simply omitted.)
The new bitmap mode is what current xterm generates, and also what
Windows ConPTY seems to expect. If you start an ordinary Command
Prompt and launch into WSL, those are the sequences it will generate
for shifted arrow keys; conversely, if you run a Command Prompt within
a ConPTY, then these sequences for Ctrl+arrow will have the effect you
expect in cmd.exe command-line editing (going backward or forward a
word). For that reason, I enable this mode unconditionally when
launching Windows pterm.
While fixing the previous commit I noticed that window titles don't
actually _work_ properly if you change the terminal character set,
because the text accumulated in the OSC string buffer is sent to the
TermWin as raw bytes, with no indication of what character set it
should interpret them as. You might get lucky if you happened to
choose the right charset (in particular, UTF-8 is a common default),
but if you change the charset half way through a run, then there's
certainly no way the frontend will know to interpret two window titles
sent before and after the change in two different charsets.
So, now win_set_title() and win_set_icon_title() both include a
codepage parameter along with the byte string, and it's up to them to
translate the provided window title from that encoding to whatever the
local window system expects to receive.
On Windows, that's wide-string Unicode, so we can just use the
existing dup_mb_to_wc utility function. But in GTK, it's UTF-8, so I
had to write an extra utility function to encode a wide string as
UTF-8.
When the terminal is in UTF-8 mode, we accumulate UTF-8 text normally
in the OSC string buffer - but the byte 0x9C is interpreted as the C1
control character String Terminator, which terminates the OSC
sequence. That's not really what you want in UTF-8 mode, because 0x9C
is also a perfectly normal UTF-8 continuation character. For example,
you'd expect this to set the window title to "FÜNF":
echo -ne '\033]0;FÜNF\007'
but in fact, by the sheer chance that Ü is encoded with an 0x9C byte,
you get a window title consisting of "F" followed by an illegal-
encoding marker, and the OSC sequence is terminated abruptly so that
the trailing 'NF' is printed normally to the terminal and then the BEL
generates a beep.
Now, in UTF-8 mode, we only support the C1 control for ST if it
appears in the form of the proper UTF-8 encoding of U+009C. So that
example now 'works', at least in the sense that the terminal considers
the OSC sequence to terminate where the sender expected it to
terminate.
Another case where we interpret 0x9C inappropriately as ST is if the
terminal is in a single-byte character set in which that character is
a printing one. In CP437, for example, you can't set a window title
containing a pound sign, because its encoding is 0x9C.
This commit by itself doesn't make those window titles _work_, in the
sense of coming out looking right. They just mean that the OSC
sequence is not terminated at the wrong place. The actual title
rendering will be fixed in the next commit.
I accidentally deleted the original author's name in my rewrite, which
was unnecessarily unfriendly given that some of their code is still
here. Also I made a thinko in my explanation of the U+00AD problem.
This standalone CLI program runs the UCD bidi tests in the form
provided in Unicode 14.0.0. You can run it by just saying
bidi_test --class BidiTest.txt --char BidiCharacterTest.txt
assuming those two UCD files are in the current directory.
A user reported that PuTTY's existing bidi algorithm will generate
misordered text in cases like this (assuming UTF-8):
echo -e '12 A \xD7\x90\xD7\x91 B'
The hex codes in the middle are the Hebrew letters aleph and beth.
Appearing in the middle of a line whose primary direction is
left-to-right, those two letters should appear in the opposite order,
but not cause the rest of the line to move around. That is, you expect
the displayed text in this situation to be
12 A <beth><aleph> B
But in fact, the digits '12' were erroneously reversed, so you would
actually see '21 A <beth><aleph> B'.
I tried to debug the existing bidi algorithm, but it was very hard,
because the Unicode bidi spec has been extensively changed since
Arabeyes contributed that code, and I couldn't even reliably work out
which version of the spec the code was intended to implement. I found
some problems, notably that the resolution phase was running once on
the whole line instead of separately on runs of characters at the same
level, and also that the 'sor' and 'eor' values were being wrongly
computed. But I had no way to test any fix to ensure it hadn't
introduced another bug somewhere else.
Unicode provides a set of conformance tests in the UCD. That was just
what I wanted - but they're too up-to-date to run against the old
algorithm and expect to pass!
So, paradoxically, it seemed to me that the _easiest_ way to fix this
bidi bug would be to bring absolutely everything up to date. But the
revised bidi algorithm is significantly more complicated, so I also
didn't think it would be sensible to try to gradually evolve the
existing code into it. Instead, I've done a complete rewrite of my
own.
The new code implements the full UAX#9 rev 44 algorithm, including in
particular support for the new 'directional isolate' control
characters, and also special handling for matched pairs of brackets in
the text (see rule N0 in the spec). I've managed to get it to pass the
entire UCD conformance test suite, so I'm reasonably confident it's
right, or at the very least a lot closer to right than the old
algorithm was.
So the upshot is: the test case shown at the top of this file now
passes, but also, other detailed bidi handling might have changed,
certainly some cases involving brackets, but perhaps also other things
that were either bugs in the old algorithm or updates to the standard.
The input length field is now a size_t rather than an int, on general
principles. The return value is now void (we weren't using the
previous return value at all). And we now require the client to have
previously allocated a BidiContext, which will allow allocated storage
to be reused between runs, saving a lot of churn on malloc.
(However, the current BidiContext doesn't contain anything
interesting. I could have moved the existing mallocs into it, but
there's no point, since I'm about to rewrite the whole thing anyway.)
This makes it easier to create the matching array of type names in
bidi_gettype.c, and eliminates the need for an assertion to check the
array matched the enum. And I'm about to need to add more types, so
let's start by making that trivially easy.
That's what I've usually been doing with any main()s I find under
ifdef; there's no reason this should be an exception. If we're keeping
it in the code at all, we should ensure it carries on compiling.
I've also created a new header file bidi.h, containing pieces of the
bidi definitions shared between bidi.c and the new source file.
This contains terminal.c, bidi.c (formerly minibidi.c), and
terminal.h. I'm about to make a couple more bidi-related source files,
so it seems worth starting by making a place to put them that won't be
cluttering up the top level.