I thought I'd found all of these before, but perhaps a few managed to
slip in since I last looked. The character argument to the <ctype.h>
functions must have the value of an unsigned char or EOF; passing an
ordinary char (unless you know char is unsigned on every platform the
code will ever go near) risks mistaking '\xFF' for EOF, and causing
outright undefined behaviour on byte values in the range 80-FE. Never
do it.
This has all the basic necessities to become a test of the terminal's
behaviour, in terms of how its data structures evolve as output is
sent to it, and perhaps also (by filling in the stub TermWin more
usefully) testing what it draws during updates and what it sends in
response to query sequences.
For the moment, all I've done is to set up the framework, and add one
demo test of printing some ordinary text and observing that it appears
in the data structures and the cursor has moved.
I expect that writing a full test of terminal.c will be a very big
job. But perhaps I or someone else will find time to prod it gradually
in the background of other work. In particular, when I'm _modifying_
any part of the terminal code, it would be good to add some tests for
the part I'm changing, before making the change, and check they still
work afterwards.
This continues the programme of UTF-8 support in authentication, begun
in commit f4519b6533 which arranged for console userpass prompts
to function in UTF-8 when the prompts_t asked them to. Since the new
line editing setup works properly when it _is_ in UTF-8 mode, I can
now also arrange that it puts the terminal into UTF-8 mode in the
right circumstances.
I've extended the applicability of the '-legacy-charset-handling' flag
introduced by the commit mentioned above, so that now it's not
specific to the console front end. Now you can give it to GUI PuTTY as
well, which restores the previous (wrong) behaviour of accepting
username and password prompt input in the main session's configured
character set. So if this change breaks someone's workflow, they
should be able to have it back.
I'm about to rewrite it completely, so the first thing I need to do is
to write tests for as much of the functionality as possible, so that I
can check the new implementation behaves in the same ways.
Similarly to the one I just added for FontSpec: in a cross-platform
main source file, you don't really want to mess about with
per-platform ifdefs just to initialise a 'struct unicode_data' from a
Conf. But until now, you had to, because init_ucs had a different
prototype on Windows and Unix.
I plan to use this in future test programs. But an immediate positive
effect is that it removes the only platform-dependent call from
fuzzterm.c. So now that could be built on Windows too, given only an
appropriate cmake stanza. (Not that I have much idea if it's useful to
fuzz the terminal separately on multiple platforms, but it's nice to
know that it's possible if anyone does need to.)
Constructing a FontSpec in platform-independent code is awkward,
because you can't call fontspec_new() outside the platform subdirs
(since its prototype varies per platform). But sometimes you just need
_some_ valid FontSpec, e.g. to put in a Conf that will be used in some
place where you don't actually care about font settings, such as a
purely CLI program.
Both Unix and Windows _have_ an idiom for this, but they're different,
because their FontSpec constructors have different prototypes. The
existing CLI tools have always had per-platform main source files, so
they just use the locally appropriate method of constructing a boring
don't-care FontSpec.
But if you want a _platform-independent_ main source file, such as you
might find in a test program, then that's rather awkward. Better to
have a platform-independent API for making a default FontSpec.
Now you can optionally get back an enum value indicating whether the
character was successfully decoded, or whether U+FFFD was substituted
due to some kind of problem, and if the latter, what problem.
For a start, this allows distinguishing 'real' U+FFFD (encoded
legitimately in the input) from one invented by the decoder. Also, it
allows the recipient of the decode to treat failures differently,
either by passing on a useful error report to the user (as
utf8_unknown_char now does) or by doing something special.
In particular, there are two distinct error codes for a truncated
UTF-8 encoding, depending on whether it was truncated by the end of
the input or by encountering a non-continuation byte. The former code
means that the string is not legal UTF-8 _as it is_, but doesn't rule
out it being a (bytewise) prefix of a legal UTF-8 string - so if a
client is receiving UTF-8 data a byte at a time, they can treat that
error code specially and not make it a fatal error.
I've only just noticed that the definition of CP_UTF8 as 65001 (the
Windows code page number for UTF-8) is in the main putty.h, under an
ifdef that checks whether the per-platform header file had already
defined it to something else. That's a silly way to do things! Better
that the Windows-specific definition goes _in_ the Windows platform
header, and putty.h contains no fallback. That way, anyone writing a
third separate platform directory will get an error reminding them
that they have to provide the right definition for their platform,
instead of finding out later via a runtime failure.
This allows you to set a flag in conio_setup() which causes the
returned ConsoleIO object to interpret all its output as UTF-8, by
translating it to UTF-16 and using WriteConsoleW to write it in
Unicode. Similarly, input is read using ReadConsoleW and decoded from
UTF-16 to UTF-8.
This flag is set to false in most places, to avoid making sudden
breaking changes. But when we're about to present a prompts_t to the
user, it's set from the new 'utf8' flag in that prompt, which in turn
is set by the userauth layer in any case where the prompts are going
to the server.
The idea is that this should be the start of a fix for the long-
standing character-set handling bug that strings transmitted during
SSH userauth (usernames, passwords, k-i prompts and responses) are all
supposed to be in UTF-8, but we've always encoded them in whatever our
input system happens to be using, and not done any tidying up on them.
We get occasional complaints about this from users whose passwords
contain characters that are encoded differently between UTF-8 and
their local encoding, but I've never got round to fixing it because
it's a large piece of engineering.
Indeed, this isn't nearly the end of it. The next step is to add UTF-8
support to all the _other_ ways of presenting a prompts_t, as best we
can.
Like the previous change to console handling, it seems very likely
that this will break someone's workflow. So there's a fallback
command-line option '-legacy-charset-handling' to revert to PuTTY's
previous behaviour.
Until now, the command-line PuTTY tools (PSCP, PSFTP and Plink) have
presented all the kinds of interactive prompt (password/passphrase,
host key, the assorted weak-crypto warnings, 'append to log file?') on
standard error, and read the responses from standard input.
This is unfortunate because if you're redirecting their standard
input (especially likely with Plink) then the prompt responses will
consume some of the intended session data. It would be better to
present the prompts _on the console_, even if that's not where stdin
or stderr point.
On Unix, we've been doing this for ages, by opening /dev/tty directly.
On Windows, we didn't, because I didn't know how. But I've recently
found out: you can open the magic file names CONIN$ and CONOUT$, which
will point at your actual console, if one is available.
So now, if it's possible, the command-line tools will do that. But if
the attempt to open CONIN$ and CONOUT$ fails, they'll fall back to the
old behaviour (in particular, if no console is available at all).
In order to make this happen consistently across all the prompt types,
I've introduced a new object called ConsoleIO, which holds whatever
file handles are necessary, knows whether to close them
afterwards (yes if they were obtained by opening CONFOO$, no if
they're the standard I/O handles), and presents a BinarySink API to
write to them and a custom API to read a line of text.
This seems likely to break _someone's_ workflow. So I've added an
option '-legacy-stdio-prompts' to restore the old behaviour.
In the Windows API, there are two places you can get a command line in
the form of a single unsplit string. One is via the command-line
parameter to WinMain(); the other is by calling GetCommandLine(). But
the two have different semantics: the WinMain command line string is
only the part after the program name, whereas GetCommandLine() returns
the full command line _including_ the program name.
PuTTY has never yet had to parse the full output of GetCommandLine,
but I have plans that will involve it beginning to do so. So I need to
make sure the utility function split_into_argv() can handle it.
This is not trivial because the quoting convention is different for
the program name than for everything else. In the program's normal
arguments, parsed by the C library startup code, the convention is
that backslashes are special when they appear before a double quote,
because that's how you write a literal double quote. But in the
program name, backslashes are _never_ special, because that's how
CreateProcess parses the program name at the start of the command
line, and the C library must follow suit in order to correctly
identify where the program name ends and the arguments begin.
In particular, consider a command line such as this:
"C:\Program Files\Foo\"foo.exe "hello \"world\""
The \" in the middle of the program name must be treated as a literal
backslash, followed by a non-literal double quote which matches the
one at the start of the string and causes the space in 'Program Files'
to be treated as part of the pathname. But the same \" when it appears
in the subsequent argument is treated as an escaped double quote, and
turns into a literal " in the argument string.
This commit adds support for this special initial-word handling in
split_into_argv(), via an extra boolean argument indicating whether to
turn that mode on. However, all existing call sites set the flag to
false, because the new mode isn't needed _yet_. So there should be no
functional change.
This removes one case from several of the individual tools'
command-line parsers, and moves it into a central place where it will
automatically be supported by any tool containing console.c.
In order to make that not cause a link failure, there's now a
stubs/no-console.c which GUI clients of cmdline.c must include.
Horizontal scroll events aren't generated by the traditional mouse
wheel, but they can be generated by trackpad gestures, though this
isn't always configured on.
The cross-platform and Windows parts of this patch is due to
Christopher Plewright; I added the GTK support.
From https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-Any-event-tracking:
Any-event mode is the same as button-event mode, except that all motion
events are reported, even if no mouse button is down. It is enabled by
specifying 1003 to DECSET.
Normally the front ends only report mouse events when buttons are
pressed, so we introduce a MA_MOVE event with MBT_NOTHING set to
indicate such a mouse movement.
This enables it to handle data that isn't presented as a
NUL-terminated string.
In particular, the NUL byte can appear _within_ the string and be
correctly translated to the NUL wide character. So I've been able to
remove the awkwardness in the test rig of having to include the
terminating NUL in every test to ensure NUL has been tested, and
instead, insert a single explicit test for it.
Similarly to the previous commit, the simplification at the (one) call
site gives me a strong feeling of 'this is what the API should have
been all along'!
Previously it output to an ordinary char buffer, and returned the
number of bytes it had written. But three out of the four call sites
immediately chucked the resulting bytes into a BinarySink anyway. The
fourth, in windows/unicode.c, really is writing into successive
locations of a fixed-size buffer - but we can make that into a
BinarySink too, using the buffer_sink added in the previous commit.
So now encode_utf8() is renamed put_utf8_char, and the call sites all
look simpler than they started out.
A server attempt to resize the window (for instance via DECCOLM) when
"When window is resized" was set to "Forbid resizing completely" would
cause all terminal output to be suspended, due to the resize attempt
never being acknowledged.
(There are other code paths like this, which I've fixed for
completeness, but I don't think they have any effect: the terminal
filters out resize attempts to the current size before this point, and
even if a server can get such a request through the SUPDUP protocol, the
test for that is wrong and will never fire -- this needs fixing
separately.)
The "Cancel" button's keyboard shortcut was accidentally removed by
f1c8298000, having only just reinstated it in a77040afa1.
(Also, fix a couple of blatantly fibbing "accelerators used" comments.)
It turns out this isn't actually necessary after all to make the
installer behave in the expected way in the default case (giving a UAC
prompt and installing systemwide). And I'm told it has undesirable
consequences in more complicated cases, which I'm not expert enough in
MSI to fully understand.
Of the three calls to queue_toplevel_callback in window.c, one of them
was still passing NULL as its context parameter, rather than the new
'wgs'. As a result, a segfault could occur during some session closures.
I mentioned recently (in commit 9e7d4c53d8) message that I'm no
longer fond of the variable name 'ret', because it's used in two quite
different contexts: it's the return value from a subroutine you just
called (e.g. 'int ret = read(fd, buf, len);' and then check for error
or EOF), or it's the value you're preparing to return from the
_containing_ routine (maybe by assigning it a default value and then
conditionally modifying it, or by starting at NULL and reallocating,
or setting it just before using the 'goto out' cleanup idiom). In the
past I've occasionally made mistakes by forgetting which meaning the
variable had, or accidentally conflating both uses.
If all else fails, I now prefer 'retd' (short for 'returned') in the
former situation, and 'toret' (obviously, the value 'to return') in
the latter case. But even better is to pick a name that actually says
something more specific about what the thing actually is.
One particular bad habit throughout this codebase is to have a set of
functions that deal with some object type (say 'Foo'), all *but one*
of which take a 'Foo *foo' parameter, but the foo_new() function
starts with 'Foo *ret = snew(Foo)'. If all the rest of them think the
canonical name for the ambient Foo is 'foo', so should foo_new()!
So here's a no-brainer start on cutting down on the uses of 'ret': I
looked for all the cases where it was being assigned the result of an
allocation, and renamed the variable to be a description of the thing
being allocated. In the case of a new() function belonging to a
family, I picked the same name as the rest of the functions in its own
family, for consistency. In other cases I picked something sensible.
One case where it _does_ make sense not to use your usual name for the
variable type is when you're cloning an existing object. In that case,
_neither_ of the Foo objects involved should be called 'foo', because
it's ambiguous! They should be named so you can see which is which. In
the two cases I found here, I've called them 'orig' and 'copy'.
As in the previous refactoring, many thanks to clang-rename for the
help.
In the course of recent refactorings I noticed a couple of cases where
we were doing old-fashioned preallocation of a char array with some
conservative maximum size, then writing into it via *p++ or similar
and hoping we got the calculation right.
Now we have strbuf and dupcat, so we shouldn't ever have to do that.
Fixed as many cases as I could find by searching for allocations of
the form 'snewn(foo, char)'.
Particularly worth a mention was the Windows GSSAPI setup code, which
was directly using the Win32 Registry API, and looks much more legible
using the windows/utils/registry.c wrappers. (But that was why I had
to enhance them in the previous commit so as to be able to open
registry keys read-only: without that, the open operation would
actually fail on this key, which is not user-writable.)
Also unix/askpass.c, which was doing a careful reallocation of its
buffer to avoid secrets being left behind in the vacated memory -
which is now just a matter of ensuring we called strbuf_new_nm().
These handy wrappers on the verbose underlying Win32 registry API have
to lose some expressiveness, and one thing they lost was the ability
to open a registry key without asking for both read and write access.
This meant they couldn't be used for accessing keys not owned by the
calling user.
So far, I've only used them for accessing PuTTY's own saved data,
which means that hasn't been a problem. But I want to use them
elsewhere in an upcoming commit, so I need to fix that.
The obvious thing would be to change the meaning of the existing
'create' boolean flag so that if it's false, we also don't request
write access. The rationale would be that you're either reading or
writing, and if you're writing you want both RW access and to create
keys that don't already exist. But in fact that's not true: you do
want to set create==false and have write access in the case where
you're _deleting_ things from the key (or the whole key). So we really
do need three ways to call the wrapper function.
Rather than add another boolean field to every call site or mess about
with an 'access type' enum, I've taken an in-between route: the
underlying open_regkey_fn *function* takes a 'create' and a 'write'
flag, but at call sites, it's wrapped with a macro anyway (to append
NULL to the variadic argument list), so I've just made three macros
whose names request different access. That makes call sites marginally
_less_ verbose, while still
The conditionalisation of that call on 'protocol == PROT_SSH' has been
around since the beginning of our git history. But in those days,
random_save_seed() was unconditional _internally_ - it would always
create and write to the seed file regardless of whether the random
pool had even been initialised, let alone used.
Now random_save_seed() has its own internal condition which prevents
it doing anything if the random subsystem was never started up in the
first place. So it's better to call it unconditionally from
cleanup_exit, and then it'll be able to do its thing whenever needed,
without having to second-guess based on the top-level protocol.
(In fact, that's what all the other implementations of cleanup_exit()
have done all along. On Unix, and in Windows console apps, we do call
random_save_seed() unconditionally, and expect it to uncomplainingly
do nothing if there's nothing to do.)
(cherry picked from commit 260aad5fca)
In cases where we refuse a resize request, either because it's too
large or because the window is not currently resizable due to being
maximised, we were failing to communicate that back to the Terminal so
that it could stop waiting for the resize and resume processing input.
Those have been there since around 2001. They're in a piece of code
that calls get_fullscreen_rect to find the overall screen size, and
then prevents attempts to resize the window larger than that. The
static variables were arranging that we don't have to call
get_fullscreen_rect more than once.
But, firstly, computers are faster 20 years on; secondly, remote
window-resize requests are intentionally rate-limited (as of commit
d74308e90e), so this shouldn't be the limiting factor anyway; and
thirdly, multi-monitor support has appeared since then, which means
that if the window has been dragged from one monitor to another then
get_fullscreen_rect might _legitimately_ return a different bounding
rectangle when called a second time.
So we should just do the full check every time unconditionally.
(cherry picked from commit 4b3a8cbf61)
This is a piece of refactoring that's been overdue forever. In the
Unix front end, all the variables specific to a particular SSH session
live in a big 'inst' structure, and calls to any of the trait APIs
like Seat or TermWin can recover the right 'inst' from their input
context pointer. But the Windows frontend was written before that kind
of thing became habitual to me, and all its variables have been just
lying around at the top level of window.c, and most of those context
pointers were just ignored.
Now I've swept them all up and put them where they ought to be, in a
big context structure. This was made immeasurably easier by the very
nifty 'clang-rename' tool, which can rename a single variable in a C
source file, and find just the right set of identifiers to change even
if other variables in the file have the same name, even in sub-scopes.
So I started by clang-renaming all the top-level variables in question
to have names beginning with a prefix that didn't previously appear
anywhere in the code; checked that still worked; and then moved all
the declarations into the struct type and did a purely textual
search-and-replace of the prefix with 'wgs->'.
One potential benefit of this change is that it allows more than one
instance of the WinGuiSeat structure to exist in the same process. I
don't have any immediate plans to actually do that, but it's nice to
know it wouldn't be ruled out if we ever did need it.
But that's not the main reason I did it. The main reason is that I
recently looked at the output of a Windows build using clang -Wall,
and was horrified by the number of casual shadowings of
generically-named global variables like 'term' and 'conf' with local
variables of the same name. That seemed like a recipe for confusion,
and I decided the best way to disambiguate them all was to do this
refactoring that I'd wanted anyway for years.
A few uses of the global variables occurred in functions that didn't
have convenient access to the right WinGuiSeat via a callback
parameter of some kind. Those had to be treated specially. Most were
cleaned up in advance by the previous few commits; the remaining fixes
in this commit itself were in functions like modalfatalbox(),
nonfatal() and cleanup_exit(). The error reporting functions want the
terminal HWND to use as a MessageBox parameter; they also have the
side effect of un-hiding the mouse pointer in the terminal window, in
case it's hidden; and cleanup_exit wanted to free some resources
dangling off the WinGuiSeat.
For most of those cases, I've made a linked list of all currently live
WinGuiSeat structures, so that they can loop over _all_ live instances
(whether there's one as usual, none, or maybe more than one in
future). The parent window for non-connection-specific error messages
is found by simply picking one arbitrarily off the linked list (if
any); the cleanups are done by iterating over the _whole_ list.
The mouse-pointer unhiding is dealt with by simply allowing
show_mouseptr to take a _null_ WinGuiSeat pointer. The only thing it
needs the context for at all is to check whether pointer-hiding is
enabled in the session's Conf; so if we're _un_-hiding the pointer we
don't need to check, and can unhide it unconditionally.
The remaining global variables in window.c are the new linked list of
all WinGuiSeat structures; 'wm_mousewheel' and 'trust_icon' which
really should be global across all WinGuiSeats even if we were to have
more than one; and there's a static variable 'cursor_visible' in
show_mouseptr() which is likewise legitimately Seat-independent (since
it records the last value we passed to the Win32 API ShowCursor
function, and _that's_ global across the whole process state).
All of this should cause no functional change whatsoever.
Those have been there since around 2001. They're in a piece of code
that calls get_fullscreen_rect to find the overall screen size, and
then prevents attempts to resize the window larger than that. The
static variables were arranging that we don't have to call
get_fullscreen_rect more than once.
But, firstly, computers are faster 20 years on; secondly, remote
window-resize requests are intentionally rate-limited (as of commit
d74308e90e), so this shouldn't be the limiting factor anyway; and
thirdly, multi-monitor support has appeared since then, which means
that if the window has been dragged from one monitor to another then
get_fullscreen_rect might _legitimately_ return a different bounding
rectangle when called a second time.
So we should just do the full check every time unconditionally.
This is like the seat-independent nonfatal(), but specifies a Seat,
which allows the GUI dialog box to have the right terminal window as
its parent (if there are multiple ones).
Changed over all the nonfatal() calls in the code base that could be
localised to a Seat, which means all the ones that come up if
something goes horribly wrong in host key storage. To make that
possible, I've added a 'seat' parameter to store_host_key(); it turns
out that all its call sites had one available already.
Previously, the timing.c subsystem worked in Windows PuTTY by means of
scheduling WM_TIMER messages to be sent to the terminal window. Now it
uses a separate hidden window instead, and all the machinery for
handling that window lives on its own in windows/utils/gui-timing.c.
Most immediately, this removes a use of wgs.term_hwnd that will become
awkward when I move that structure in a following commit. But also, it
will make it easier to add the same timing subsystem to unrelated GUI
programs, such as Windows Pageant: if we ever decide to implement
automatic deletion or re-encryption of unused keys after a timeout,
this will help make that easier.
That clipboard-writing function is called just once, from the Event
Log dialog procedure, for when the user deliberately copies to the
clipboard. That call always passes must_deselect = true, which means
the conditional WM_IGNORE_CLIP messages are not sent. So it's simpler
to remove that parameter completely, and the conditional calls which
are never used.
Also, the clipboard data copied from the Event Log dialog is being put
in the clipboard associated with the main PuTTY terminal window. But
anything else we copy from a dialog box using Windows's built-in
copy-paste mechanisms would surely be associated with the _dialog_,
not its parent window. So we should do the same thing here. Therefore,
I've added a HWND parameter to write_aclip() and used that in place of
wgs.term_hwnd, so that we can pass in the HWND of the dialog itself.
The conditionalisation of that call on 'protocol == PROT_SSH' has been
around since the beginning of our git history. But in those days,
random_save_seed() was unconditional _internally_ - it would always
create and write to the seed file regardless of whether the random
pool had even been initialised, let alone used.
Now random_save_seed() has its own internal condition which prevents
it doing anything if the random subsystem was never started up in the
first place. So it's better to call it unconditionally from
cleanup_exit, and then it'll be able to do its thing whenever needed,
without having to second-guess based on the top-level protocol.
(In fact, that's what all the other implementations of cleanup_exit()
have done all along. On Unix, and in Windows console apps, we do call
random_save_seed() unconditionally, and expect it to uncomplainingly
do nothing if there's nothing to do.)
It complained in console_confirm_ssh_host_key that if the caller
passed in a SeatDialogText containing no SDT_PROMPT record, then
'prompt' would be uninitialised.
The answer is "don't do that, then", but fair enough that Coverity
didn't know that. Added an assertion, which should keep it happy, and
also cause better error handling if we ever _do_ make that mistake.
Coverity complained in keylist_update_callback that in one if
statement I was allowing for the possibility that alg == NULL, and in
the next, I was assuming it would always be non-null.
Right now I'm not actually convinced that _either_ check is necessary
- it would make sense in an agent _client_, where you might be talking
to an agent that knows key algorithms you don't, but this is the GUI
built into Pageant itself, so any key it can store internally ought to
have a known algorithm name.
Still, this fix is certainly _correct_ even if not optimal, and it'll
do for now.
Coverity points out that if we refer to cp_list[codepage - 65536], we
ought to have ensured that codepage - 65536 was _less_ than
lenof(cp_list), not just less or equal.
We had carefully calculated, for each control in an aligned group, how
much _that control_ should move downwards by. But then, because I
carelessly referred to the wrong variable name, we actually moved the
wrong one - not the control we'd just calculated the offset for, but
always the _last_ one in the group, which was the one the top-level
alignment code was processing at the point we began this loop.
As a result, the dropdown list in the front-page protocol selector was
hilariously misaligned. Now it's back where it should be.
My experimental build with clang-cl at -Wall did show up a few things
that are safe enough to fix right now. One was this list of missing
includes, which was causing a lot of -Wmissing-prototype warnings, and
is a real risk because it means the declarations in headers weren't
being type-checked against the actual function definitions.
Happily, no actual mismatches.