1
0
mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-09 17:38:00 +00:00
Commit Graph

7396 Commits

Author SHA1 Message Date
Simon Tatham
069f7c8b21 Fix behaviour of backspace in a 1-column terminal.
This is the first bug found as a direct result of writing that
terminal test program - I added some tests for things I expected to
work already, and some of them didn't, proving immediately that it was
a good idea!

If the terminal is one column wide, and you've printed a
character (hence, set the wrapnext flag), what should backspace do?
Surely it should behave like any other backspace with wrapnext set,
i.e. clear the wrapnext flag, returning the cursor's _logical_
position to the location of the most recently printed character. But
in fact it was anti-wrapping to the previous line, because I'd got the
cases in the wrong order in the if-else chain that forms the backspace
handler. So the handler for 'we're in column 0, wrapping time' was
coming before 'wrapnext is set, just clear it'.

Now wrapnext is checked _first_, before checking anything at all. Any
time we can just clear that, we should.
2023-03-05 10:26:42 +00:00
Simon Tatham
9ba742ad9f Make backspace take account of LATTR_WRAPPED2.
Suppose an application tries to print a double-width character
starting in the rightmost column of the screen, so that we apply our
emergency fix of wrapping to the next line immediately and printing
the character in the first two columns. Suppose they then backspace
twice, taking the cursor to the RHS and then the LHS of that
character. What should happen if they backspace a third time?

Our previous behaviour was to completely ignore the unusual situation,
and do the same thing we'd do in any other backspace from column 0:
anti-wrap the cursor to the last column of the previous line, leaving
it in the empty character cell that was skipped when the DW char
couldn't be printed in it.

But I think this isn't the best response, because it breaks the
invariant that printing N columns' worth of graphic characters and
then backspacing N times should leave the cursor on the first of those
characters. If I print "a가" (for example) and then backspace three
times, I want the cursor on the a, _even_ if weird line wrapping
behaviour happened somewhere in that sequence.

(Rationale: this helps naïve terminal applications which don't even
know what the terminal width is, and aren't tracking their absolute x
position. In particular, the simplistic line-based input systems that
appear in OS kernels and our own lineedit.c will want to emit a fixed
number of backspace-space-backspace sequences to delete characters
previously entered on to the line by the user. They still need to
check the wcwidth of the characters they're emitting, so that they can
BSB twice for a DW character or 0 times for a combining one, but it
would be *hugely* more awkward for them to ask the terminal where the
cursor is so that they can take account of difficult line wraps!)

We already have the ability to _recognise_ this situation: on a line
that was wrapped in this unusual way, we set the LATTR_WRAPPED2 line
attribute flag, to prevent the empty rightmost column from injecting
an unwanted space into copy-pastes from the terminal. Now we also use
the same flag to cause the backspace control character to do something
interesting.

This was the fix that inspired me to start writing test_terminal,
because I knew it was touching a delicate area. However, in the course
of writing this fix and its tests, I encountered two (!) further bugs,
which I'll fix in followup commits!
2023-03-05 10:18:50 +00:00
Simon Tatham
21a31c19b7 Add some tests of line wrapping.
As promised in the previous commit, I'm adding tests of the area I'm
about to mess with.
2023-03-05 10:18:50 +00:00
Simon Tatham
57536cb7a3 Initial work on a terminal test program.
This has all the basic necessities to become a test of the terminal's
behaviour, in terms of how its data structures evolve as output is
sent to it, and perhaps also (by filling in the stub TermWin more
usefully) testing what it draws during updates and what it sends in
response to query sequences.

For the moment, all I've done is to set up the framework, and add one
demo test of printing some ordinary text and observing that it appears
in the data structures and the cursor has moved.

I expect that writing a full test of terminal.c will be a very big
job. But perhaps I or someone else will find time to prod it gradually
in the background of other work. In particular, when I'm _modifying_
any part of the terminal code, it would be good to add some tests for
the part I'm changing, before making the change, and check they still
work afterwards.
2023-03-05 10:18:50 +00:00
Simon Tatham
c890449d76 Expose lineptr and unlineptr outside terminal.c.
This will allow test programs to look more easily at the terminal's
data structures.
2023-03-05 10:18:50 +00:00
Simon Tatham
1b8fb1d436 terminal: remove the 'screen' parameter from lineptr().
It wasn't used for anything except in an assert statement, which was
triggered by the use of the scrlineptr() macro wrapper. Now moved that
check into scrlineptr() itself, via a helper function that passes the
line number of the scrlineptr() call site.

(Yes, this is introducing another modalfatalbox in terminal.c, much
like the dreaded line==NULL one that caused us so many headaches in
past decades. But the check in question was being done _already_ by
the assert in lineptr(), so this change shouldn't make it go off in
any _more_ circumstances - and now, if it does, it will at least give
us slightly more useful information about where the problem is!)
2023-03-05 10:18:50 +00:00
Simon Tatham
f9943e2ffd term_get_userpass_input: support the prompts->utf8 flag.
This continues the programme of UTF-8 support in authentication, begun
in commit f4519b6533 which arranged for console userpass prompts
to function in UTF-8 when the prompts_t asked them to. Since the new
line editing setup works properly when it _is_ in UTF-8 mode, I can
now also arrange that it puts the terminal into UTF-8 mode in the
right circumstances.

I've extended the applicability of the '-legacy-charset-handling' flag
introduced by the commit mentioned above, so that now it's not
specific to the console front end. Now you can give it to GUI PuTTY as
well, which restores the previous (wrong) behaviour of accepting
username and password prompt input in the main session's configured
character set. So if this change breaks someone's workflow, they
should be able to have it back.
2023-03-04 14:06:04 +00:00
Simon Tatham
1a7e4ec8d4 New centralised version of local line editing.
This takes over from both the implementation in ldisc.c and the one in
term_get_userpass_input, which were imperfectly duplicating each
other's functionality. The new version should be more consistent
between the two already, and also, it means further improvements can
now be made in just one place.

In the course of this, I've restructured the inside of ldisc.c by
moving the input_queue bufchain to the other side of the translation
code in ldisc_send. Previously, ldisc_send received a string, an
optional 'dedicated key' indication (bodgily signalled by a negative
length) and an 'interactive' flag, translated that somehow into a
combination of raw backend output and specials, and saved the latter
in input_queue. Now it saves the original (string, dedicated flag,
interactive flag) data in input_queue, and doesn't do the translation
until the data is pulled back _out_ of the queue. That's because the
new line editing system expects to receive something much closer to
the original data format.

The term_get_userpass_input system is also substantially restructured.
Instead of ldisc.c handing each individual keystroke to terminal.c so
that it can do line editing on it, terminal.c now just gives the Ldisc
a pointer to its instance of the new TermLineEditor object - and then
ldisc.c can put keystrokes straight into that, in the same way it
would put them into its own TermLineEditor, without having to go via
terminal.c at all. So the term_get_userpass_input edifice is only
called back when the line editor actually delivers the answer to a
username or password prompt.

(I considered not _even_ having a separate TermLineEditor for password
prompts, and just letting ldisc.c use its own. But the problem is that
some of the behaviour differences between the two line editors are
deliberate, for example the use of ^D to signal 'abort this prompt',
and the use of Escape as an alternative line-clearing command. So
TermLineEditor has a flags word that allows ldisc and terminal to set
it up differently. Also this lets me give the two TermLineEditors a
different vtable of callback functions, which is a convenient way for
terminal.c to get notified when a prompt has been answered.)

The new line editor still passes all the tests I wrote for the old
one. But it already has a couple of important improvements, both in
the area of UTF-8 handling:

Firstly, when we display a UTF-8 character on the terminal, we check
with the terminal how many character cells it occupied, and then if
the user deletes it again from the editing buffer, we can emit the
right number of backspace-space-backspace sequences. (The old ldisc
line editor incorrectly assumed all Unicode characters had terminal
with 1, partly because its buffer was byte- rather than character-
oriented and so it was more than enough work just finding where the
character _start_ was.)

Secondly, terminal.c's userpass line editor would never emit a byte in
the 80-BF range to the terminal at all, which meant that nontrivial
UTF-8 characters always came out as U+FFFD blobs!
2023-03-04 13:55:50 +00:00
Simon Tatham
7a48837471 Add a test rig for ldisc's local line editing.
I'm about to rewrite it completely, so the first thing I need to do is
to write tests for as much of the functionality as possible, so that I
can check the new implementation behaves in the same ways.
2023-03-04 13:05:20 +00:00
Simon Tatham
fd43ff6e27 Move SessionSpecial definitions into their own header.
This will allow me to re-include it elsewhere, to make an array of the
specials' names for diagnostic purposes.
2023-03-04 13:05:20 +00:00
Simon Tatham
b5645f79dd Document our long-standing workarounds policy.
For years I've been following the principle that before I'll add
auto-detection of an SSH server bug, I want the server maintainer to
have fixed the bug, so that the list of affected version numbers
triggering the workaround is complete, and to provide an incentive for
implementations to gradually converge on rightness.

*Finally*, I've got round to documenting that policy in public, for
the Feedback page!
2023-02-28 18:58:14 +00:00
Simon Tatham
23c408d49d Move the logeventf wrappers into their own source file.
Separating them from logging.c allows them to be shared between the
real logging.c and the new stub no-logging.c.
2023-02-18 14:11:31 +00:00
Simon Tatham
334d4f315e Add some extra stub modules.
Also for use in test programs: stub modules that provide non-
functional versions of logging, printing and storage.
2023-02-18 14:11:31 +00:00
Simon Tatham
edce3fb9da Add platform-independent Unicode setup function.
Similarly to the one I just added for FontSpec: in a cross-platform
main source file, you don't really want to mess about with
per-platform ifdefs just to initialise a 'struct unicode_data' from a
Conf. But until now, you had to, because init_ucs had a different
prototype on Windows and Unix.

I plan to use this in future test programs. But an immediate positive
effect is that it removes the only platform-dependent call from
fuzzterm.c. So now that could be built on Windows too, given only an
appropriate cmake stanza. (Not that I have much idea if it's useful to
fuzz the terminal separately on multiple platforms, but it's nice to
know that it's possible if anyone does need to.)
2023-02-18 14:10:27 +00:00
Simon Tatham
4341ba6d5c Add platform-independent fontspec_new_default() function.
Constructing a FontSpec in platform-independent code is awkward,
because you can't call fontspec_new() outside the platform subdirs
(since its prototype varies per platform). But sometimes you just need
_some_ valid FontSpec, e.g. to put in a Conf that will be used in some
place where you don't actually care about font settings, such as a
purely CLI program.

Both Unix and Windows _have_ an idiom for this, but they're different,
because their FontSpec constructors have different prototypes. The
existing CLI tools have always had per-platform main source files, so
they just use the locally appropriate method of constructing a boring
don't-care FontSpec.

But if you want a _platform-independent_ main source file, such as you
might find in a test program, then that's rather awkward. Better to
have a platform-independent API for making a default FontSpec.
2023-02-18 14:10:21 +00:00
Simon Tatham
9e01de7c2b decode_utf8: add an enumeration of failure reasons.
Now you can optionally get back an enum value indicating whether the
character was successfully decoded, or whether U+FFFD was substituted
due to some kind of problem, and if the latter, what problem.

For a start, this allows distinguishing 'real' U+FFFD (encoded
legitimately in the input) from one invented by the decoder. Also, it
allows the recipient of the decode to treat failures differently,
either by passing on a useful error report to the user (as
utf8_unknown_char now does) or by doing something special.

In particular, there are two distinct error codes for a truncated
UTF-8 encoding, depending on whether it was truncated by the end of
the input or by encountering a non-continuation byte. The former code
means that the string is not legal UTF-8 _as it is_, but doesn't rule
out it being a (bytewise) prefix of a legal UTF-8 string - so if a
client is receiving UTF-8 data a byte at a time, they can treat that
error code specially and not make it a fatal error.
2023-02-17 17:16:54 +00:00
Simon Tatham
9d308b39da Reinstate putty.chm in Windows binary zipfiles.
A user just reported that it hasn't been there since 0.76. This turns
out to be because I put the wrong pathname on the 'zip' commands in
Buildscr (miscounted the number of ../ segments).

I would have noticed immediately, if Info-Zip had failed with an error
when it found I'd given it a nonexistent filename to add to the zip
file. But in fact it just prints a warning and proceeds to add all the
other files I specified. It looks as if it will only return a nonzero
exit status if _all_ the filenames you specified were nonexistent.

Therefore, I've rewritten the zip-creation commands so that they run
zip once per file. That way if any file is unreadable we _will_ get a
build error.

(Also, while I'm here, I took the opportunity to get rid of that ugly
ls|grep.)
2023-02-04 15:36:55 +00:00
Simon Tatham
658ec0457f Move Windows definition of CP_UTF8 into windows subdir.
I've only just noticed that the definition of CP_UTF8 as 65001 (the
Windows code page number for UTF-8) is in the main putty.h, under an
ifdef that checks whether the per-platform header file had already
defined it to something else. That's a silly way to do things! Better
that the Windows-specific definition goes _in_ the Windows platform
header, and putty.h contains no fallback. That way, anyone writing a
third separate platform directory will get an error reminding them
that they have to provide the right definition for their platform,
instead of finding out later via a runtime failure.
2023-01-28 15:01:31 +00:00
Jacob Nevins
343f64c2ca 'private key' -> 'SSH private key' in new FAQ. 2023-01-22 10:13:37 +00:00
Simon Tatham
e289265e37 Fix build failure on systems without fstatat.
cmake's configure-time #defines (at least the way I use them) are
defined to 0 or 1, rather than sometimes not defined at all, so you
have to test them with plain #if rather than #ifdef or #if defined.

I _thought_ I'd caught all of those in previous fixes, but apparently
there were a couple still lurking. Oops.
2023-01-18 18:06:45 +00:00
Jacob Nevins
89014315ed It's a new year. 2023-01-07 14:03:12 +00:00
Jacob Nevins
943e54db4a FAQ about private key configuration control move.
This is genuinely a FAQ -- we've been asked about it 3-4 times now.
2023-01-07 14:02:14 +00:00
Simon Tatham
37f67bc937 Another minor docs typo. 2022-12-30 20:08:46 +00:00
Simon Tatham
752f5028f0 Fix typo in 'plink -share' documentation. 2022-12-30 11:09:31 +00:00
Simon Tatham
add3f89005 Formatting: normalise to { on same line.
There were remarkably few of these, but I spotted one while preparing
the previous commit, and then found a handful more.
2022-12-28 15:37:57 +00:00
Simon Tatham
d509a2dc1e Formatting: normalise to put a space after condition keywords.
'if (thing)' is the local style here, not 'if(thing)'. Similarly with
'for' and 'while'.
2022-12-28 15:32:24 +00:00
Simon Tatham
6fcc7ed728 Formatting: fix a few mis-spaced assignments.
I spotted one of those in the raw backend the other day, and now I've
got round to finding a bunch more and fixing them.
2022-12-28 15:28:36 +00:00
Simon Tatham
9f2e1e6e03 Prevent sending double-EOF in raw backend.
You can't call sk_write_eof() twice on the same socket, because the
second one will fail an assertion. Instead, you're supposed to know
you've already sent EOF, and not try to send it again.

The call to sk_write_eof() in raw_special (triggered by pressing ^D in
GUI PuTTY, for example) sets the flag raw->sent_socket_eof in an
attempt to prevent this. But it doesn't _check_ that flag, so a second
press of ^D can reach that assertion failure.
2022-12-21 15:19:15 +00:00
Simon Tatham
5ade8c0047 ldisc: fix unwanted double-action of ^U.
In ldisc's line editing mode, pressing ^U is supposed to erase the
current unsent line rather than inserting a literal ^U into the
buffer. In fact, when using a non-Telnet backend, it erases the
line *and* inserts ^U into the buffer!

This happens because it shares a case handler with three other
disruptive control characters (^C, ^\, ^Z), which all also clear the
line-editing buffer before doing their various actions. But in
non-Telnet mode, their actions become literal insertion of themselves,
so the combined effect is to erase the line and them self-insert.

I'm not 100% convinced that was what I actually meant to do with those
characters. But it _certainly_ wasn't what I meant to do with ^U, so
that one at least I should fix right now!
2022-12-21 15:16:21 +00:00
Simon Tatham
95b926865a GTK: fix crash changing font size when terminal maximised.
When I maximised a terminal window today and then used Ctrl-< to
reduce its font size (expecting that the window size would stay the
same but more characters would be squeezed in), pterm failed the
assertion in term_request_resize_completed() that checks
term->win_resize_pending == WIN_RESIZE_AWAIT_REPLY.

This happened because in this situation request_resize_internal() was
called from within window.c rather than from within the terminal code
itself. So the terminal didn't know a resize is pending at all, and
was surprised to be told that one had finished.

request_resize_internal() already has a flag parameter to tell it
whether a given resize came from the terminal or not. On the main code
path, that flag is used to decide whether to notify the terminal. But
on the early exit path when the window is maximised, we weren't
checking the flag. An easy fix.
2022-12-04 11:53:06 +00:00
Simon Tatham
c14f0e02cc Stop selectable GTK message boxes clobbering PRIMARY.
I noticed today that when GTK PuTTY puts up a message box such as a
host key dialog, which calls our create_message_box function with
selectable=true (so that the host key fingerprint can be conveniently
copy-pasted), a side effect is to take the X11 PRIMARY selection away
from whoever previously had it, even though the message box isn't
actually selecting anything right now.

I don't fully understand what's going on, but it apparently has
something to do with 'select on focus' behaviour, in which tabbing
into a selectable text control automatically selects its entire
contents. That makes sense for edit boxes, but not really for this
kind of thing.

Unfortunately, GTK apparently has no per-widget configuration to turn
that off. (The closest I found is not even per _application_: it lives
in GtkSettings, whose documentation says that it's general across all
GTK apps run by a user!)

So instead I work around it by moving the gtk_label_set_selectable
call to after the focus of the new window has already been sorted out.
Ugly, but it seems to work.
2022-11-27 13:18:39 +00:00
Simon Tatham
f4519b6533 Add UTF-8 support to the new Windows ConsoleIO system.
This allows you to set a flag in conio_setup() which causes the
returned ConsoleIO object to interpret all its output as UTF-8, by
translating it to UTF-16 and using WriteConsoleW to write it in
Unicode. Similarly, input is read using ReadConsoleW and decoded from
UTF-16 to UTF-8.

This flag is set to false in most places, to avoid making sudden
breaking changes. But when we're about to present a prompts_t to the
user, it's set from the new 'utf8' flag in that prompt, which in turn
is set by the userauth layer in any case where the prompts are going
to the server.

The idea is that this should be the start of a fix for the long-
standing character-set handling bug that strings transmitted during
SSH userauth (usernames, passwords, k-i prompts and responses) are all
supposed to be in UTF-8, but we've always encoded them in whatever our
input system happens to be using, and not done any tidying up on them.
We get occasional complaints about this from users whose passwords
contain characters that are encoded differently between UTF-8 and
their local encoding, but I've never got round to fixing it because
it's a large piece of engineering.

Indeed, this isn't nearly the end of it. The next step is to add UTF-8
support to all the _other_ ways of presenting a prompts_t, as best we
can.

Like the previous change to console handling, it seems very likely
that this will break someone's workflow. So there's a fallback
command-line option '-legacy-charset-handling' to revert to PuTTY's
previous behaviour.
2022-11-26 10:49:03 +00:00
Simon Tatham
80aed96286 New system for reading prompts from the console.
Until now, the command-line PuTTY tools (PSCP, PSFTP and Plink) have
presented all the kinds of interactive prompt (password/passphrase,
host key, the assorted weak-crypto warnings, 'append to log file?') on
standard error, and read the responses from standard input.

This is unfortunate because if you're redirecting their standard
input (especially likely with Plink) then the prompt responses will
consume some of the intended session data. It would be better to
present the prompts _on the console_, even if that's not where stdin
or stderr point.

On Unix, we've been doing this for ages, by opening /dev/tty directly.
On Windows, we didn't, because I didn't know how. But I've recently
found out: you can open the magic file names CONIN$ and CONOUT$, which
will point at your actual console, if one is available.

So now, if it's possible, the command-line tools will do that. But if
the attempt to open CONIN$ and CONOUT$ fails, they'll fall back to the
old behaviour (in particular, if no console is available at all).

In order to make this happen consistently across all the prompt types,
I've introduced a new object called ConsoleIO, which holds whatever
file handles are necessary, knows whether to close them
afterwards (yes if they were obtained by opening CONFOO$, no if
they're the standard I/O handles), and presents a BinarySink API to
write to them and a custom API to read a line of text.

This seems likely to break _someone's_ workflow. So I've added an
option '-legacy-stdio-prompts' to restore the old behaviour.
2022-11-26 10:48:59 +00:00
Simon Tatham
f91c3127ad split_into_argv: add special case for program name.
In the Windows API, there are two places you can get a command line in
the form of a single unsplit string. One is via the command-line
parameter to WinMain(); the other is by calling GetCommandLine(). But
the two have different semantics: the WinMain command line string is
only the part after the program name, whereas GetCommandLine() returns
the full command line _including_ the program name.

PuTTY has never yet had to parse the full output of GetCommandLine,
but I have plans that will involve it beginning to do so. So I need to
make sure the utility function split_into_argv() can handle it.

This is not trivial because the quoting convention is different for
the program name than for everything else. In the program's normal
arguments, parsed by the C library startup code, the convention is
that backslashes are special when they appear before a double quote,
because that's how you write a literal double quote. But in the
program name, backslashes are _never_ special, because that's how
CreateProcess parses the program name at the start of the command
line, and the C library must follow suit in order to correctly
identify where the program name ends and the arguments begin.

In particular, consider a command line such as this:

    "C:\Program Files\Foo\"foo.exe "hello \"world\""

The \" in the middle of the program name must be treated as a literal
backslash, followed by a non-literal double quote which matches the
one at the start of the string and causes the space in 'Program Files'
to be treated as part of the pathname. But the same \" when it appears
in the subsequent argument is treated as an escaped double quote, and
turns into a literal " in the argument string.

This commit adds support for this special initial-word handling in
split_into_argv(), via an extra boolean argument indicating whether to
turn that mode on. However, all existing call sites set the flag to
false, because the new mode isn't needed _yet_. So there should be no
functional change.
2022-11-26 10:32:36 +00:00
Simon Tatham
dbd0bde415 New utility function burnwcs().
Just like burnstr(), it memsets a NUL-terminated string to all zeroes
before freeing it. The only difference is that it does it to a string
of wchar_t.
2022-11-26 10:32:36 +00:00
Simon Tatham
1625fd8fcb Handle the -batch option centrally in cmdline.c.
This removes one case from several of the individual tools'
command-line parsers, and moves it into a central place where it will
automatically be supported by any tool containing console.c.

In order to make that not cause a link failure, there's now a
stubs/no-console.c which GUI clients of cmdline.c must include.
2022-11-26 10:31:18 +00:00
Simon Tatham
819efc3c21 Support horizontal scroll events in mouse tracking.
Horizontal scroll events aren't generated by the traditional mouse
wheel, but they can be generated by trackpad gestures, though this
isn't always configured on.

The cross-platform and Windows parts of this patch is due to
Christopher Plewright; I added the GTK support.
2022-11-26 10:29:27 +00:00
Simon Tatham
5f2eff2fea Build option to disable scrollback compression.
This was requested by a downstream of the code, who wanted to change
the time/space tradeoff in the terminal. I currently have no plans to
change this setting for upstream PuTTY, although there is a cmake
option for it just to make testing it easy.

To avoid sprinkling ifdefs over the whole terminal code, the strategy
is to keep the separate type 'compressed_scrollback_line', and turn it
into a typedef for a 'termline *'. So compressline() becomes almost
trivial, and decompressline() even more so.

Memory management is the fiddly part. To make this work sensibly on
both sides, I've broken up each of compressline() and decompressline()
into two versions, one of which takes ownership of (and logically
speaking frees) its input, and the other doesn't. So at call sites
where a function was followed by a free, it's now calling the
'and_free' version of the function, and where the input object was
reused afterwards, it's calling the 'no_free' version. This means that
in different branches of the #if, I can make one function call the
other or vice versa, and no call site is stuck with having to do
things in a more roundabout way than necessary.

The freeing of the _return_ value from decompressline() is handled for
us, because termlines already have a 'temporary' flag which is set
when they're returned from the decompressor, and anyone receiving a
termline from lineptr() calls unlineptr() when they're finished with
it, which will _conditionally_ free it, depending on that 'temporary'
flag. So in the new mode, 'temporary' is never set at all, and all
those unlineptr() calls do nothing.

However, we also still need to free compressed lines properly when
they're actually being thrown away (scrolled off the top of the
scrollback, or cleaned up in term_free), and for that, I've made a new
special-purpose free_compressed_line() function.
2022-11-20 15:04:00 +00:00
Simon Tatham
fec6719a2b Fix duplicate call to term_resize_request_completed().
A KDE user observed that if you 'dock' a GTK PuTTY window to the side
of the screen (by dragging it to the RH edge, causing it to
half-maximise over the right-hand half of the display, similarly to
Windows), and then send a terminal resize sequence, then PuTTY fails
the assertion in term_resize_request_completed() which expects that an
unacknowledged resize request was currently in flight.

When drawing_area_setup() calls term_resize_request_completed() in
response to the inst->term_resize_notification_required flag, it
resets the inst->win_resize_pending flag, but doesn't reset
inst->term_resize_notification_required. As a result, the _next_ call
to drawing_area_setup will find that flag still set, and make a
duplicate call to term_resize_request_completed, after the terminal no
longer believes it's waiting for a response to a resize request. And
in this 'docked to the right-hand side of the display' state, KDE
apparently triggers two calls to drawing_area_setup() in quick
succession, making this bug manifest.

I could fix this by clearing inst->term_resize_notification_required.
But inspecting all the other call sites, it seems clear to me that my
original intention was for inst->term_resize_notification_required to
be a flag that's only meaningful if inst->win_resize_pending is set.
So I think a better fix is to conditionalise the check in
drawing_area_setup so that we don't even check
inst->term_resize_notification_required if !inst->win_resize_pending.
2022-11-14 22:21:49 +00:00
Ben Jackson
3cfbd3df0f Support xterm any-event mouse tracking
From https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-Any-event-tracking:

    Any-event mode is the same as button-event mode, except that all motion
    events are reported, even if no mouse button is down.  It is enabled by
    specifying 1003 to DECSET.

Normally the front ends only report mouse events when buttons are
pressed, so we introduce a MA_MOVE event with MBT_NOTHING set to
indicate such a mouse movement.
2022-11-11 17:26:09 +00:00
Simon Tatham
854d78eef3 Fix build failure on Visual Studio.
Unlike clang, VS didn't like me using the value of one 'static const'
integer variable to compute the value of another, and complained
'initializer is not a constant'. Replaced all those variables with an
enum, which should also more reliably ensure that even an
unsophisticated compiler doesn't actually reserve data-section space
for them.
2022-11-11 12:42:19 +00:00
Simon Tatham
d3e186e81b Function to check a UTF-8 string for unknown characters.
So we can reject things we don't know how to NFC yet.
2022-11-11 08:49:05 +00:00
Simon Tatham
b35d23f699 Implement Unicode normalisation.
A new module in 'utils' computes NFC and NFD, via a new set of data
tables generated by read_ucd.py.

The new module comes with a new test program, which can read the
NormalizationTest.txt that appears in the Unicode Character Database.
All the tests pass, as of Unicode 15.
2022-11-11 08:48:18 +00:00
Simon Tatham
4cb429e3f4 Update to Unicode 15.
Now I have a script I can easily re-run, there's no reason not to do
just that! This updates all of the new generated header files for the
UCD.zip that comes with Unicode 15.0.0.

I've re-run my bidi test suite against 15.0.0's file of test cases,
and confirmed they all pass.
2022-11-11 08:44:07 +00:00
Simon Tatham
4bb37233a5 Commit read_ucd.py's output and switch over to it.
This removes the superseded tables in source files, and also all the
code snippets in comments that generated them.
2022-11-11 08:44:07 +00:00
Simon Tatham
430af47a38 Polish the output of read_ucd.py.
The initial outputs were all deliberately inconsistent with each
other, so that each one exactly matched the existing table I was
trying to replace.

Now I've done that check, I can clean them up. Normalised spacing and
case to be consistent; removed pointless indentation (these are now
include files, so they don't have to be indented to the same level as
the array declaration surrounding each one's #include); added a header
comment in each autogenerated file, saying that it's autogenerated,
what it's for, and who it's used by.

The currently supported version number of Unicode is also exposed in a
header file, so that I can put it in diagnostics.
2022-11-11 08:44:01 +00:00
Simon Tatham
b72c9aba28 New script to generate Unicode data tables.
This will replace the various pieces of Perl scattered throughout the
code base in comments above long boring data tables. The idea is that
those long boring tables will move into header files in the new
'unicode' directory, and will be #included from the source files that
use the tables.

One benefit is that I won't have to page tediously past the tables to
get to the actual code I want to edit. But more importantly, it should
now become easy to update to a new version of Unicode, by re-running
just one script and committing the changed versions of all the headers
in the 'unicode' subdir.

This version of the script regenerates six Unicode-derived tables in
the existing source code in a byte-for-byte identical form. In the
next commits I'll clean it up, commit the output, and delete the
tables from their previous locations.

(One table I _haven't_ incorporated into this system is the Arabic
shaping table in bidi.c, because my attempt to regenerate it came out
not matching the original at all. That _might_ be because the table is
based on an old Unicode standard and desperately needs updating, but
it might also be because I misunderstood how it works. So I'll leave
sorting that out for another time.)
2022-11-09 19:21:02 +00:00
Simon Tatham
69e217d23a Make decode_utf8() read from a BinarySource.
This enables it to handle data that isn't presented as a
NUL-terminated string.

In particular, the NUL byte can appear _within_ the string and be
correctly translated to the NUL wide character. So I've been able to
remove the awkwardness in the test rig of having to include the
terminating NUL in every test to ensure NUL has been tested, and
instead, insert a single explicit test for it.

Similarly to the previous commit, the simplification at the (one) call
site gives me a strong feeling of 'this is what the API should have
been all along'!
2022-11-09 19:21:02 +00:00
Simon Tatham
d89f2bfc55 Fix typo in decode_utf8 tests.
The test in question was supposed to contain the spurious UTF-8
encoding that 0xD800 would have if it were not a surrogate. But the
final continuation character 0x80 was instead 0x00.

The test passed anyway, because ED A0 was regarded as a truncated
sequence, instead of ED A0 80 being regarded as an illegal encoding of
a surrogate, and both return the same output!
2022-11-09 19:21:02 +00:00
Simon Tatham
834b58e39b Make encode_utf8() output to a BinarySink.
Previously it output to an ordinary char buffer, and returned the
number of bytes it had written. But three out of the four call sites
immediately chucked the resulting bytes into a BinarySink anyway. The
fourth, in windows/unicode.c, really is writing into successive
locations of a fixed-size buffer - but we can make that into a
BinarySink too, using the buffer_sink added in the previous commit.

So now encode_utf8() is renamed put_utf8_char, and the call sites all
look simpler than they started out.
2022-11-09 19:02:32 +00:00