In commit d07d7d66f6 I rewrote the code that constructs RTF paste
data so that it uses a strbuf, in place of the previous ad-hoc code
that counted up the lengths of pieces of RTF in advance in order to
realloc the buffer.
But apparently I left in an entire loop whose job was to count up one
of those lengths, failing to notice that it's now completely pointless
because its output value is never needed!
Happily a clang upgrade has just improved the 'variable set but not
used' warning to the point where it can spot that. I expect previously
the variable still counted as 'used' because each increment of it used
the previous value.
When the standalone version of a binary, with its help file included
as a resource, extracts that resource to write it to a disk, it could
have accidentally skipped a byte in the middle if the WriteFile call
in this loop had not managed to write the whole file in one go.
This reverts commit 0615767224
("Windows installer: remove explicit InstallScope setting"), albeit
with different comments.
The original change worked around a Windows security vulnerability
(CVE-2023-21800), but also resulted in a rather broken installer.
While doing that parametrisation I noticed three strlen calls that
could obviously be replaced with one - and then I also noticed that
there were missing parens in an expression that should have
been (n+1)/2, making it n + 1/2, i.e. just n, due to integer
arithmetic.
Happily that bug meant we were _over_-allocating rather than under,
but even so, how embarrassing. Fixed.
Created in the simplest way, by parametrising the existing code using
macros.
Nothing actually uses this yet. I hope to gradually switch
command-line parsing from 'ANSI' to Unicode strings, but this isn't
the only preparation needed, so it might yet be a while.
This will contain test code and test subprograms that don't belong in
the top-level test directory due to not being cross-platform.
Initial contents are test_screenshot.c, which was already its own
source file in the windows subdir, and test_split_into_argv.c, which
I've sawn off the bottom of windows/utils/split_into_argv.c and moved
into its own source file.
I checked exhaustively today and found that the only characters (even
in Unicode) that Windows's default argv splitter will recognise as
word separators are the space and tab characters. So I think it's a
mistake to use <ctype.h> functions to identify word separators; we
should use that fixed character pair, and then we know we're getting
the right ones only.
I'm not conveniently set up to actually run it during my main build,
since that happens entirely on Linux and cross-builds the Windows
binaries. But it should at least be possible to build and run it by
hand.
I thought I'd found all of these before, but perhaps a few managed to
slip in since I last looked. The character argument to the <ctype.h>
functions must have the value of an unsigned char or EOF; passing an
ordinary char (unless you know char is unsigned on every platform the
code will ever go near) risks mistaking '\xFF' for EOF, and causing
outright undefined behaviour on byte values in the range 80-FE. Never
do it.
This has all the basic necessities to become a test of the terminal's
behaviour, in terms of how its data structures evolve as output is
sent to it, and perhaps also (by filling in the stub TermWin more
usefully) testing what it draws during updates and what it sends in
response to query sequences.
For the moment, all I've done is to set up the framework, and add one
demo test of printing some ordinary text and observing that it appears
in the data structures and the cursor has moved.
I expect that writing a full test of terminal.c will be a very big
job. But perhaps I or someone else will find time to prod it gradually
in the background of other work. In particular, when I'm _modifying_
any part of the terminal code, it would be good to add some tests for
the part I'm changing, before making the change, and check they still
work afterwards.
This continues the programme of UTF-8 support in authentication, begun
in commit f4519b6533 which arranged for console userpass prompts
to function in UTF-8 when the prompts_t asked them to. Since the new
line editing setup works properly when it _is_ in UTF-8 mode, I can
now also arrange that it puts the terminal into UTF-8 mode in the
right circumstances.
I've extended the applicability of the '-legacy-charset-handling' flag
introduced by the commit mentioned above, so that now it's not
specific to the console front end. Now you can give it to GUI PuTTY as
well, which restores the previous (wrong) behaviour of accepting
username and password prompt input in the main session's configured
character set. So if this change breaks someone's workflow, they
should be able to have it back.
I'm about to rewrite it completely, so the first thing I need to do is
to write tests for as much of the functionality as possible, so that I
can check the new implementation behaves in the same ways.
Similarly to the one I just added for FontSpec: in a cross-platform
main source file, you don't really want to mess about with
per-platform ifdefs just to initialise a 'struct unicode_data' from a
Conf. But until now, you had to, because init_ucs had a different
prototype on Windows and Unix.
I plan to use this in future test programs. But an immediate positive
effect is that it removes the only platform-dependent call from
fuzzterm.c. So now that could be built on Windows too, given only an
appropriate cmake stanza. (Not that I have much idea if it's useful to
fuzz the terminal separately on multiple platforms, but it's nice to
know that it's possible if anyone does need to.)
Constructing a FontSpec in platform-independent code is awkward,
because you can't call fontspec_new() outside the platform subdirs
(since its prototype varies per platform). But sometimes you just need
_some_ valid FontSpec, e.g. to put in a Conf that will be used in some
place where you don't actually care about font settings, such as a
purely CLI program.
Both Unix and Windows _have_ an idiom for this, but they're different,
because their FontSpec constructors have different prototypes. The
existing CLI tools have always had per-platform main source files, so
they just use the locally appropriate method of constructing a boring
don't-care FontSpec.
But if you want a _platform-independent_ main source file, such as you
might find in a test program, then that's rather awkward. Better to
have a platform-independent API for making a default FontSpec.
Now you can optionally get back an enum value indicating whether the
character was successfully decoded, or whether U+FFFD was substituted
due to some kind of problem, and if the latter, what problem.
For a start, this allows distinguishing 'real' U+FFFD (encoded
legitimately in the input) from one invented by the decoder. Also, it
allows the recipient of the decode to treat failures differently,
either by passing on a useful error report to the user (as
utf8_unknown_char now does) or by doing something special.
In particular, there are two distinct error codes for a truncated
UTF-8 encoding, depending on whether it was truncated by the end of
the input or by encountering a non-continuation byte. The former code
means that the string is not legal UTF-8 _as it is_, but doesn't rule
out it being a (bytewise) prefix of a legal UTF-8 string - so if a
client is receiving UTF-8 data a byte at a time, they can treat that
error code specially and not make it a fatal error.
I've only just noticed that the definition of CP_UTF8 as 65001 (the
Windows code page number for UTF-8) is in the main putty.h, under an
ifdef that checks whether the per-platform header file had already
defined it to something else. That's a silly way to do things! Better
that the Windows-specific definition goes _in_ the Windows platform
header, and putty.h contains no fallback. That way, anyone writing a
third separate platform directory will get an error reminding them
that they have to provide the right definition for their platform,
instead of finding out later via a runtime failure.
This allows you to set a flag in conio_setup() which causes the
returned ConsoleIO object to interpret all its output as UTF-8, by
translating it to UTF-16 and using WriteConsoleW to write it in
Unicode. Similarly, input is read using ReadConsoleW and decoded from
UTF-16 to UTF-8.
This flag is set to false in most places, to avoid making sudden
breaking changes. But when we're about to present a prompts_t to the
user, it's set from the new 'utf8' flag in that prompt, which in turn
is set by the userauth layer in any case where the prompts are going
to the server.
The idea is that this should be the start of a fix for the long-
standing character-set handling bug that strings transmitted during
SSH userauth (usernames, passwords, k-i prompts and responses) are all
supposed to be in UTF-8, but we've always encoded them in whatever our
input system happens to be using, and not done any tidying up on them.
We get occasional complaints about this from users whose passwords
contain characters that are encoded differently between UTF-8 and
their local encoding, but I've never got round to fixing it because
it's a large piece of engineering.
Indeed, this isn't nearly the end of it. The next step is to add UTF-8
support to all the _other_ ways of presenting a prompts_t, as best we
can.
Like the previous change to console handling, it seems very likely
that this will break someone's workflow. So there's a fallback
command-line option '-legacy-charset-handling' to revert to PuTTY's
previous behaviour.
Until now, the command-line PuTTY tools (PSCP, PSFTP and Plink) have
presented all the kinds of interactive prompt (password/passphrase,
host key, the assorted weak-crypto warnings, 'append to log file?') on
standard error, and read the responses from standard input.
This is unfortunate because if you're redirecting their standard
input (especially likely with Plink) then the prompt responses will
consume some of the intended session data. It would be better to
present the prompts _on the console_, even if that's not where stdin
or stderr point.
On Unix, we've been doing this for ages, by opening /dev/tty directly.
On Windows, we didn't, because I didn't know how. But I've recently
found out: you can open the magic file names CONIN$ and CONOUT$, which
will point at your actual console, if one is available.
So now, if it's possible, the command-line tools will do that. But if
the attempt to open CONIN$ and CONOUT$ fails, they'll fall back to the
old behaviour (in particular, if no console is available at all).
In order to make this happen consistently across all the prompt types,
I've introduced a new object called ConsoleIO, which holds whatever
file handles are necessary, knows whether to close them
afterwards (yes if they were obtained by opening CONFOO$, no if
they're the standard I/O handles), and presents a BinarySink API to
write to them and a custom API to read a line of text.
This seems likely to break _someone's_ workflow. So I've added an
option '-legacy-stdio-prompts' to restore the old behaviour.
In the Windows API, there are two places you can get a command line in
the form of a single unsplit string. One is via the command-line
parameter to WinMain(); the other is by calling GetCommandLine(). But
the two have different semantics: the WinMain command line string is
only the part after the program name, whereas GetCommandLine() returns
the full command line _including_ the program name.
PuTTY has never yet had to parse the full output of GetCommandLine,
but I have plans that will involve it beginning to do so. So I need to
make sure the utility function split_into_argv() can handle it.
This is not trivial because the quoting convention is different for
the program name than for everything else. In the program's normal
arguments, parsed by the C library startup code, the convention is
that backslashes are special when they appear before a double quote,
because that's how you write a literal double quote. But in the
program name, backslashes are _never_ special, because that's how
CreateProcess parses the program name at the start of the command
line, and the C library must follow suit in order to correctly
identify where the program name ends and the arguments begin.
In particular, consider a command line such as this:
"C:\Program Files\Foo\"foo.exe "hello \"world\""
The \" in the middle of the program name must be treated as a literal
backslash, followed by a non-literal double quote which matches the
one at the start of the string and causes the space in 'Program Files'
to be treated as part of the pathname. But the same \" when it appears
in the subsequent argument is treated as an escaped double quote, and
turns into a literal " in the argument string.
This commit adds support for this special initial-word handling in
split_into_argv(), via an extra boolean argument indicating whether to
turn that mode on. However, all existing call sites set the flag to
false, because the new mode isn't needed _yet_. So there should be no
functional change.
This removes one case from several of the individual tools'
command-line parsers, and moves it into a central place where it will
automatically be supported by any tool containing console.c.
In order to make that not cause a link failure, there's now a
stubs/no-console.c which GUI clients of cmdline.c must include.
Horizontal scroll events aren't generated by the traditional mouse
wheel, but they can be generated by trackpad gestures, though this
isn't always configured on.
The cross-platform and Windows parts of this patch is due to
Christopher Plewright; I added the GTK support.
From https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-Any-event-tracking:
Any-event mode is the same as button-event mode, except that all motion
events are reported, even if no mouse button is down. It is enabled by
specifying 1003 to DECSET.
Normally the front ends only report mouse events when buttons are
pressed, so we introduce a MA_MOVE event with MBT_NOTHING set to
indicate such a mouse movement.
This enables it to handle data that isn't presented as a
NUL-terminated string.
In particular, the NUL byte can appear _within_ the string and be
correctly translated to the NUL wide character. So I've been able to
remove the awkwardness in the test rig of having to include the
terminating NUL in every test to ensure NUL has been tested, and
instead, insert a single explicit test for it.
Similarly to the previous commit, the simplification at the (one) call
site gives me a strong feeling of 'this is what the API should have
been all along'!
Previously it output to an ordinary char buffer, and returned the
number of bytes it had written. But three out of the four call sites
immediately chucked the resulting bytes into a BinarySink anyway. The
fourth, in windows/unicode.c, really is writing into successive
locations of a fixed-size buffer - but we can make that into a
BinarySink too, using the buffer_sink added in the previous commit.
So now encode_utf8() is renamed put_utf8_char, and the call sites all
look simpler than they started out.
A server attempt to resize the window (for instance via DECCOLM) when
"When window is resized" was set to "Forbid resizing completely" would
cause all terminal output to be suspended, due to the resize attempt
never being acknowledged.
(There are other code paths like this, which I've fixed for
completeness, but I don't think they have any effect: the terminal
filters out resize attempts to the current size before this point, and
even if a server can get such a request through the SUPDUP protocol, the
test for that is wrong and will never fire -- this needs fixing
separately.)
The "Cancel" button's keyboard shortcut was accidentally removed by
f1c8298000, having only just reinstated it in a77040afa1.
(Also, fix a couple of blatantly fibbing "accelerators used" comments.)
It turns out this isn't actually necessary after all to make the
installer behave in the expected way in the default case (giving a UAC
prompt and installing systemwide). And I'm told it has undesirable
consequences in more complicated cases, which I'm not expert enough in
MSI to fully understand.
Of the three calls to queue_toplevel_callback in window.c, one of them
was still passing NULL as its context parameter, rather than the new
'wgs'. As a result, a segfault could occur during some session closures.
I mentioned recently (in commit 9e7d4c53d8) message that I'm no
longer fond of the variable name 'ret', because it's used in two quite
different contexts: it's the return value from a subroutine you just
called (e.g. 'int ret = read(fd, buf, len);' and then check for error
or EOF), or it's the value you're preparing to return from the
_containing_ routine (maybe by assigning it a default value and then
conditionally modifying it, or by starting at NULL and reallocating,
or setting it just before using the 'goto out' cleanup idiom). In the
past I've occasionally made mistakes by forgetting which meaning the
variable had, or accidentally conflating both uses.
If all else fails, I now prefer 'retd' (short for 'returned') in the
former situation, and 'toret' (obviously, the value 'to return') in
the latter case. But even better is to pick a name that actually says
something more specific about what the thing actually is.
One particular bad habit throughout this codebase is to have a set of
functions that deal with some object type (say 'Foo'), all *but one*
of which take a 'Foo *foo' parameter, but the foo_new() function
starts with 'Foo *ret = snew(Foo)'. If all the rest of them think the
canonical name for the ambient Foo is 'foo', so should foo_new()!
So here's a no-brainer start on cutting down on the uses of 'ret': I
looked for all the cases where it was being assigned the result of an
allocation, and renamed the variable to be a description of the thing
being allocated. In the case of a new() function belonging to a
family, I picked the same name as the rest of the functions in its own
family, for consistency. In other cases I picked something sensible.
One case where it _does_ make sense not to use your usual name for the
variable type is when you're cloning an existing object. In that case,
_neither_ of the Foo objects involved should be called 'foo', because
it's ambiguous! They should be named so you can see which is which. In
the two cases I found here, I've called them 'orig' and 'copy'.
As in the previous refactoring, many thanks to clang-rename for the
help.
In the course of recent refactorings I noticed a couple of cases where
we were doing old-fashioned preallocation of a char array with some
conservative maximum size, then writing into it via *p++ or similar
and hoping we got the calculation right.
Now we have strbuf and dupcat, so we shouldn't ever have to do that.
Fixed as many cases as I could find by searching for allocations of
the form 'snewn(foo, char)'.
Particularly worth a mention was the Windows GSSAPI setup code, which
was directly using the Win32 Registry API, and looks much more legible
using the windows/utils/registry.c wrappers. (But that was why I had
to enhance them in the previous commit so as to be able to open
registry keys read-only: without that, the open operation would
actually fail on this key, which is not user-writable.)
Also unix/askpass.c, which was doing a careful reallocation of its
buffer to avoid secrets being left behind in the vacated memory -
which is now just a matter of ensuring we called strbuf_new_nm().
These handy wrappers on the verbose underlying Win32 registry API have
to lose some expressiveness, and one thing they lost was the ability
to open a registry key without asking for both read and write access.
This meant they couldn't be used for accessing keys not owned by the
calling user.
So far, I've only used them for accessing PuTTY's own saved data,
which means that hasn't been a problem. But I want to use them
elsewhere in an upcoming commit, so I need to fix that.
The obvious thing would be to change the meaning of the existing
'create' boolean flag so that if it's false, we also don't request
write access. The rationale would be that you're either reading or
writing, and if you're writing you want both RW access and to create
keys that don't already exist. But in fact that's not true: you do
want to set create==false and have write access in the case where
you're _deleting_ things from the key (or the whole key). So we really
do need three ways to call the wrapper function.
Rather than add another boolean field to every call site or mess about
with an 'access type' enum, I've taken an in-between route: the
underlying open_regkey_fn *function* takes a 'create' and a 'write'
flag, but at call sites, it's wrapped with a macro anyway (to append
NULL to the variadic argument list), so I've just made three macros
whose names request different access. That makes call sites marginally
_less_ verbose, while still
The conditionalisation of that call on 'protocol == PROT_SSH' has been
around since the beginning of our git history. But in those days,
random_save_seed() was unconditional _internally_ - it would always
create and write to the seed file regardless of whether the random
pool had even been initialised, let alone used.
Now random_save_seed() has its own internal condition which prevents
it doing anything if the random subsystem was never started up in the
first place. So it's better to call it unconditionally from
cleanup_exit, and then it'll be able to do its thing whenever needed,
without having to second-guess based on the top-level protocol.
(In fact, that's what all the other implementations of cleanup_exit()
have done all along. On Unix, and in Windows console apps, we do call
random_save_seed() unconditionally, and expect it to uncomplainingly
do nothing if there's nothing to do.)
(cherry picked from commit 260aad5fca)
In cases where we refuse a resize request, either because it's too
large or because the window is not currently resizable due to being
maximised, we were failing to communicate that back to the Terminal so
that it could stop waiting for the resize and resume processing input.
Those have been there since around 2001. They're in a piece of code
that calls get_fullscreen_rect to find the overall screen size, and
then prevents attempts to resize the window larger than that. The
static variables were arranging that we don't have to call
get_fullscreen_rect more than once.
But, firstly, computers are faster 20 years on; secondly, remote
window-resize requests are intentionally rate-limited (as of commit
d74308e90e), so this shouldn't be the limiting factor anyway; and
thirdly, multi-monitor support has appeared since then, which means
that if the window has been dragged from one monitor to another then
get_fullscreen_rect might _legitimately_ return a different bounding
rectangle when called a second time.
So we should just do the full check every time unconditionally.
(cherry picked from commit 4b3a8cbf61)
This is a piece of refactoring that's been overdue forever. In the
Unix front end, all the variables specific to a particular SSH session
live in a big 'inst' structure, and calls to any of the trait APIs
like Seat or TermWin can recover the right 'inst' from their input
context pointer. But the Windows frontend was written before that kind
of thing became habitual to me, and all its variables have been just
lying around at the top level of window.c, and most of those context
pointers were just ignored.
Now I've swept them all up and put them where they ought to be, in a
big context structure. This was made immeasurably easier by the very
nifty 'clang-rename' tool, which can rename a single variable in a C
source file, and find just the right set of identifiers to change even
if other variables in the file have the same name, even in sub-scopes.
So I started by clang-renaming all the top-level variables in question
to have names beginning with a prefix that didn't previously appear
anywhere in the code; checked that still worked; and then moved all
the declarations into the struct type and did a purely textual
search-and-replace of the prefix with 'wgs->'.
One potential benefit of this change is that it allows more than one
instance of the WinGuiSeat structure to exist in the same process. I
don't have any immediate plans to actually do that, but it's nice to
know it wouldn't be ruled out if we ever did need it.
But that's not the main reason I did it. The main reason is that I
recently looked at the output of a Windows build using clang -Wall,
and was horrified by the number of casual shadowings of
generically-named global variables like 'term' and 'conf' with local
variables of the same name. That seemed like a recipe for confusion,
and I decided the best way to disambiguate them all was to do this
refactoring that I'd wanted anyway for years.
A few uses of the global variables occurred in functions that didn't
have convenient access to the right WinGuiSeat via a callback
parameter of some kind. Those had to be treated specially. Most were
cleaned up in advance by the previous few commits; the remaining fixes
in this commit itself were in functions like modalfatalbox(),
nonfatal() and cleanup_exit(). The error reporting functions want the
terminal HWND to use as a MessageBox parameter; they also have the
side effect of un-hiding the mouse pointer in the terminal window, in
case it's hidden; and cleanup_exit wanted to free some resources
dangling off the WinGuiSeat.
For most of those cases, I've made a linked list of all currently live
WinGuiSeat structures, so that they can loop over _all_ live instances
(whether there's one as usual, none, or maybe more than one in
future). The parent window for non-connection-specific error messages
is found by simply picking one arbitrarily off the linked list (if
any); the cleanups are done by iterating over the _whole_ list.
The mouse-pointer unhiding is dealt with by simply allowing
show_mouseptr to take a _null_ WinGuiSeat pointer. The only thing it
needs the context for at all is to check whether pointer-hiding is
enabled in the session's Conf; so if we're _un_-hiding the pointer we
don't need to check, and can unhide it unconditionally.
The remaining global variables in window.c are the new linked list of
all WinGuiSeat structures; 'wm_mousewheel' and 'trust_icon' which
really should be global across all WinGuiSeats even if we were to have
more than one; and there's a static variable 'cursor_visible' in
show_mouseptr() which is likewise legitimately Seat-independent (since
it records the last value we passed to the Win32 API ShowCursor
function, and _that's_ global across the whole process state).
All of this should cause no functional change whatsoever.
Those have been there since around 2001. They're in a piece of code
that calls get_fullscreen_rect to find the overall screen size, and
then prevents attempts to resize the window larger than that. The
static variables were arranging that we don't have to call
get_fullscreen_rect more than once.
But, firstly, computers are faster 20 years on; secondly, remote
window-resize requests are intentionally rate-limited (as of commit
d74308e90e), so this shouldn't be the limiting factor anyway; and
thirdly, multi-monitor support has appeared since then, which means
that if the window has been dragged from one monitor to another then
get_fullscreen_rect might _legitimately_ return a different bounding
rectangle when called a second time.
So we should just do the full check every time unconditionally.
This is like the seat-independent nonfatal(), but specifies a Seat,
which allows the GUI dialog box to have the right terminal window as
its parent (if there are multiple ones).
Changed over all the nonfatal() calls in the code base that could be
localised to a Seat, which means all the ones that come up if
something goes horribly wrong in host key storage. To make that
possible, I've added a 'seat' parameter to store_host_key(); it turns
out that all its call sites had one available already.
Previously, the timing.c subsystem worked in Windows PuTTY by means of
scheduling WM_TIMER messages to be sent to the terminal window. Now it
uses a separate hidden window instead, and all the machinery for
handling that window lives on its own in windows/utils/gui-timing.c.
Most immediately, this removes a use of wgs.term_hwnd that will become
awkward when I move that structure in a following commit. But also, it
will make it easier to add the same timing subsystem to unrelated GUI
programs, such as Windows Pageant: if we ever decide to implement
automatic deletion or re-encryption of unused keys after a timeout,
this will help make that easier.
That clipboard-writing function is called just once, from the Event
Log dialog procedure, for when the user deliberately copies to the
clipboard. That call always passes must_deselect = true, which means
the conditional WM_IGNORE_CLIP messages are not sent. So it's simpler
to remove that parameter completely, and the conditional calls which
are never used.
Also, the clipboard data copied from the Event Log dialog is being put
in the clipboard associated with the main PuTTY terminal window. But
anything else we copy from a dialog box using Windows's built-in
copy-paste mechanisms would surely be associated with the _dialog_,
not its parent window. So we should do the same thing here. Therefore,
I've added a HWND parameter to write_aclip() and used that in place of
wgs.term_hwnd, so that we can pass in the HWND of the dialog itself.
The conditionalisation of that call on 'protocol == PROT_SSH' has been
around since the beginning of our git history. But in those days,
random_save_seed() was unconditional _internally_ - it would always
create and write to the seed file regardless of whether the random
pool had even been initialised, let alone used.
Now random_save_seed() has its own internal condition which prevents
it doing anything if the random subsystem was never started up in the
first place. So it's better to call it unconditionally from
cleanup_exit, and then it'll be able to do its thing whenever needed,
without having to second-guess based on the top-level protocol.
(In fact, that's what all the other implementations of cleanup_exit()
have done all along. On Unix, and in Windows console apps, we do call
random_save_seed() unconditionally, and expect it to uncomplainingly
do nothing if there's nothing to do.)
It complained in console_confirm_ssh_host_key that if the caller
passed in a SeatDialogText containing no SDT_PROMPT record, then
'prompt' would be uninitialised.
The answer is "don't do that, then", but fair enough that Coverity
didn't know that. Added an assertion, which should keep it happy, and
also cause better error handling if we ever _do_ make that mistake.
Coverity complained in keylist_update_callback that in one if
statement I was allowing for the possibility that alg == NULL, and in
the next, I was assuming it would always be non-null.
Right now I'm not actually convinced that _either_ check is necessary
- it would make sense in an agent _client_, where you might be talking
to an agent that knows key algorithms you don't, but this is the GUI
built into Pageant itself, so any key it can store internally ought to
have a known algorithm name.
Still, this fix is certainly _correct_ even if not optimal, and it'll
do for now.
Coverity points out that if we refer to cp_list[codepage - 65536], we
ought to have ensured that codepage - 65536 was _less_ than
lenof(cp_list), not just less or equal.
We had carefully calculated, for each control in an aligned group, how
much _that control_ should move downwards by. But then, because I
carelessly referred to the wrong variable name, we actually moved the
wrong one - not the control we'd just calculated the offset for, but
always the _last_ one in the group, which was the one the top-level
alignment code was processing at the point we began this loop.
As a result, the dropdown list in the front-page protocol selector was
hilariously misaligned. Now it's back where it should be.
My experimental build with clang-cl at -Wall did show up a few things
that are safe enough to fix right now. One was this list of missing
includes, which was causing a lot of -Wmissing-prototype warnings, and
is a real risk because it means the declarations in headers weren't
being type-checked against the actual function definitions.
Happily, no actual mismatches.
Use of thread-local storage on Windows (introduced recently in
69e8d471d1) could cause a -Wattributes warning in mingw-w64 builds,
since that toolchain doesn't understand __declspec(thread).
Define a portability macro THREADLOCAL, which should resolve to
something appropriate for at least:
- MSVC, which understands the Microsoft syntax __declspec(thread);
- GCC (e.g., mingw-w64) which understands the GNU syntax __thread
(GCC only implements __declspec() to the extent of understanding the
arguments 'dllexport' and 'dllimport');
- Clang, which supports both syntaxes.
(It's possible there's some more obscure Windows toolchain which will
now hit the #error as a result of this change.)
I haven't attempted to try to detect and use the C11 syntax
'thread_local'. And this is all still Windows-only, since that's all we
need for now and it avoids opening the can of worms that is TLS on other
platforms.
(I considered delegating all this to cmake, but as well as being fiddly,
it seems even the latest versions of cmake don't know about thread-local
storage for C, as opposed to C++ (cxx_thread_local).)
In recent months I've had two requests from different people to build
support into PuTTY for automatically handling complicated third-party
auth protocols layered on top of keyboard-interactive - the kind of
thing where you're asked to enter some auth response, and you have to
refer to some external source like a web server to find out what the
right response _is_, which is a pain to do by hand, so you'd prefer it
to be automated in the SSH client.
That seems like a reasonable thing for an end user to want, but I
didn't think it was a good idea to build support for specific
protocols of that kind directly into PuTTY, where there would no doubt
be an ever-lengthening list, and maintenance needed on all of them.
So instead, in collaboration with one of my correspondents, I've
designed and implemented a protocol to be spoken between PuTTY and a
plugin running as a subprocess. The plugin can opt to handle the
keyboard-interactive authentication loop on behalf of the user, in
which case PuTTY passes on all the INFO_REQUEST packets to it, and
lets it make up responses. It can also ask questions of the user if
necessary.
The protocol spec is provided in a documentation appendix. The entire
configuration for the end user consists of providing a full command
line to use as the subprocess.
In the contrib directory I've provided an example plugin written in
Python. It gives a set of fixed responses suitable for getting through
Uppity's made-up k-i system, because that was a reasonable thing I
already had lying around to test against. But it also provides example
code that someone else could pick up and insert their own live
response-provider into the middle of, assuming they were happy with it
being in Python.
We already have the ability to start a subprocess and hook it up to a
Socket, for running local proxy commands. Now the same facility is
available as an auxiliary feature, so that a backend can start another
subcommand for a different purpose, and make a separate Socket to
communicate with it.
Just like the local proxy system, this facility captures the
subprocess's stderr, and passes it back to the caller via plug_log. To
make that not look silly, I had to add a system where the "proxy:"
prefix on the usual plug_log messages is reconfigurable, and when you
call platform_start_subprocess(), you get to pass the prefix you want
to use in this case.
I made a specific subdirectory 'stubs' to keep all the link-time stub
modules in, like notiming.c. And I put _one_ run-time stub in it,
namely nullplug.c. But the rest of the runtime stubs went into utils.
I think it's better to keep all the stubs together, so I've moved all
the null*.c in utils into stubs (with the exception of nullstrcmp.c,
which means the 'null' in a different sense). Also, fiddled with the
naming to be a bit more consistent, and stated in the new CMakeLists
the naming policy that distinguishes no-*.c from null-*.c.
We've occasionally had reports of SSH servers disconnecting as soon as
they receive PuTTY's KEXINIT. I think all such reports have involved
the kind of simple ROM-based SSH server software you find in small
embedded devices.
I've never been able to prove it, but I've always suspected that one
possible cause of this is simply that PuTTY's KEXINIT is _too long_,
either in number of algorithms listed or in total length (especially
given all the ones that end in @very.long.domain.name suffixes).
If I'm right about either of those being the cause, then it's just
become even more likely to happen, because of all the extra
Diffie-Hellman groups and GSSAPI algorithms we just threw into our
already-long list in the previous few commits.
A workaround I've had in mind for ages is to wait for the server's
KEXINIT, and then filter our own down to just the algorithms the
server also mentioned. Then our KEXINIT is no longer than that of the
server, and hence, presumably fits in whatever buffer it has. So I've
implemented that workaround, in anticipation of it being needed in the
near future.
(Well ... it's not _quite_ true that our KEXINIT is at most the same
length as the server. In fact I had to leave in one KEXINIT item that
won't match anything in the server's list, namely "ext-info-c" which
gates access to SHA-2 based RSA. So if we turn out to support
absolutely everything on all the server's lists, then our KEXINIT
would be a few bytes longer than the server's, even with this
workaround. But that would only cause trouble if the server's outgoing
KEXINIT was skating very close to whatever buffer size it has for the
incoming one, and I'm guessing that's not very likely.)
((Another possible cause of this kind of disconnection would be a
server that simply objects to seeing any KEXINIT string it doesn't
know how to speak. But _surely_ no such server would have survived
initial testing against any full-featured client at all!))
I only recently found out that OpenSSH defined their own protocol IDs
for AES-GCM, defined to work the same as the standard ones except that
they fixed the semantics for how you select the linked cipher+MAC pair
during key exchange.
(RFC 5647 defines protocol ids for AES-GCM in both the cipher and MAC
namespaces, and requires that you MUST select both or neither - but
this contradicts the selection policy set out in the base SSH RFCs,
and there's no discussion of how you resolve a conflict between them!
OpenSSH's answer is to do it the same way ChaCha20-Poly1305 works,
because that will ensure the two suites don't fight.)
People do occasionally ask us for this linked cipher/MAC pair, and now
I know it's actually feasible, I've implemented it, including a pair
of vector implementations for x86 and Arm using their respective
architecture extensions for multiplying polynomials over GF(2).
Unlike ChaCha20-Poly1305, I've kept the cipher and MAC implementations
in separate objects, with an arm's-length link between them that the
MAC uses when it needs to encrypt single cipher blocks to use as the
inputs to the MAC algorithm. That enables the cipher and the MAC to be
independently selected from their hardware-accelerated versions, just
in case someone runs on a system that has polynomial multiplication
instructions but not AES acceleration, or vice versa.
There's a fourth implementation of the GCM MAC, which is a pure
software implementation of the same algorithm used in the vectorised
versions. It's too slow to use live, but I've kept it in the code for
future testing needs, and because it's a convenient place to dump my
design comments.
The vectorised implementations are fairly crude as far as optimisation
goes. I'm sure serious x86 _or_ Arm optimisation engineers would look
at them and laugh. But GCM is a fast MAC compared to HMAC-SHA-256
(indeed compared to HMAC-anything-at-all), so it should at least be
good enough to use. And we've got a working version with some tests
now, so if someone else wants to improve them, they can.
We're now setting the help context centrally in ssh/common.c - but I
forgot to remove the _old_ assignment statements, which overwrite
whatever that asks for. Oops.
OpenSSH, when called on to give the fingerprint of a certified public
key, will in many circumstances generate the hash of the public blob
of the _underlying_ key, rather than the hash of the full certificate.
I think the hash of the certificate is also potentially useful (if
nothing else, it provides a way to tell apart multiple certificates on
the same key). But I can also see that it's useful to be able to
recognise a key as the same one 'really' (since all certificates on
the same key share a private key, so they're unavoidably related).
So I've dealt with this by introducing an extra pair of fingerprint
types, giving the cross product of {MD5, SHA-256} x {base key only,
full certificate}. You can manually select which one you want to see
in some circumstances (notably PuTTYgen), and in others (such as
diagnostics) both fingerprints will be emitted side by side via the
new functions ssh2_double_fingerprint[_blob].
The default, following OpenSSH, is to just fingerprint the base key.
This patch fixes a few other whitespace and formatting issues which
were pointed out by the bulk-reindent or which I spotted in passing,
some involving manual editing to break lines more nicely.
I think the weirdest hunk in here is the one in windows/window.c
TranslateKey() where _half_ of an assignment statement inside an 'if'
was on the same line as the trailing paren of the if condition. No
idea at all how that one managed to happen!
I think a lot of these were inserted by a prior run through GNU indent
many years ago. I noticed in a more recent experiment that that tool
doesn't always correctly distinguish which instances of 'id * id' are
pointer variable declarations and which are multiplications, so it
spaces some of the former as if they were the latter.
If the function name (or expression) in a function call or declaration
is itself so long that even the first argument doesn't fit after it on
the same line, or if that would leave so little space that it would be
silly to try to wrap all the run-on lines into a tall thin column,
then I used to do this
ludicrously_long_function_name
(arg1, arg2, arg3);
and now prefer this
ludicrously_long_function_name(
arg1, arg2, arg3);
I picked up the habit from Python, where the latter idiom is required
by Python's syntactic significance of newlines (you can write the
former if you use a backslash-continuation, but pretty much everyone
seems to agree that that's much uglier). But I've found it works well
in C as well: it makes it more obvious that the previous line is
incomplete, it gives you a tiny bit more space to wrap the following
lines into (the old idiom indents the _third_ line one space beyond
the second), and I generally turn out to agree with the knock-on
indentation decisions made by at least Emacs if you do it in the
middle of a complex expression. Plus, of course, using the _same_
idiom between C and Python means less state-switching.
So, while I'm making annoying indentation changes in general, this
seems like a good time to dig out all the cases of the old idiom in
this code, and switch them over to the new.
My aim has always been to have those back-dented by 2 spaces (half an
indent level) compared to the statements around them, so that in
particular switch statements have distinct alignment for the
statement, the cases and the interior code without consuming two whole
indent levels.
This patch sweeps up all the violations of that principle found by my
bulk-reindentation exercise.
My bulk indentation check also turned up a lot of cases where a run-on
function call or if statement didn't have its later lines aligned
correctly relative to the open paren.
I think this is quite easy to do by getting things out of
sync (editing the first line of the function call and forgetting to
update the rest, perhaps even because you never _saw_ the rest during
a search-replace). But a few didn't quite fit into that pattern, in
particular an outright misleading case in unix/askpass.c where the
second line of a call was aligned neatly below the _wrong_ one of the
open parens on the opening line.
Restored as many alignments as I could easily find.
In several pieces of development recently I've run across the
occasional code block in the middle of a function which suddenly
switched to 2-space indent from this code base's usual 4. I decided I
was tired of it, so I ran the whole code base through a re-indenter,
which made a huge mess, and then manually sifted out the changes that
actually made sense from that pass.
Indeed, this caught quite a few large sections with 2-space indent
level, a couple with 8, and a handful of even weirder things like 3
spaces or 12. This commit fixes them all.
The fixed tab stops that we used to use in the old LBS_HASSTRINGS list
box, and that I carefully replicated in the new owner-drawn version,
are no more! Now, every time we refresh the key list, we actually
_measure_ the maximum size of string that needs to fit into each
column, and size the columns based on that.
Now I don't have to worry any more about whether the set of algorithm
names might one day overflow the fixed column width, or whether a
particularly unlucky choice of key with lots of wide letters like M
and W in its base64-encoded SHA256 hash might do the same.
Also, the previous column sizes were pessimistic (for reason of
exactly that worry), so this change generally moves things over
towards the left of the list box - which means there's now room for
longer key comments, and more chance of the suffixes '(encrypted)' or
'(re-encryptable)' being visible on the right.
The test in the Pageant list box code for whether we should display
the bit count of a key was done by checking specifically for ssh_rsa
or ssh_dsa, which of course meant that it didn't catch the certified
versions of those keys.
Now there's yet another footling ssh_keyalg method that asks the
question 'is it worth displaying the bit count?', to which RSA and DSA
answer yes, and the opensshcert family delegates to its base key type,
so that RSA and DSA certified keys also answer yes.
(This isn't the same as ssh_key_public_bits(alg, blob) >= 0. All
supported public key algorithms _can_ display a bit count if called
on. But only in RSA and DSA is it configurable, and therefore worth
bothering to print in the list box.)
Also in this commit, I've fixed a bug in the certificate
implementation of public_bits, which was passing a wrongly formatted
public blob to the underlying key. (Done by factoring out the code
from opensshcert_new_shared which constructed the _correct_ public
blob, and reusing it in public_bits to do the same job.)
If you load a certified key into Windows Pageant, the official SSH id
for the key type is so long that it overflows its space in the list
box and overlaps the key fingerprint hash.
This commit introduces yet another footling little ssh_keyalg method
which returns a shorter human-readable description of the key type,
and uses that in the Windows Pageant list box only.
(Not in the Unix Pageant list, though, because being output to stdout,
that seems like something people are more likely to want to
machine-read, which firstly means we shouldn't change it lightly, and
secondly, if we did change it we'd want to avoid having a variable
number of spaces in the replacement key type text.)
The main key list control in the Pageant window was previously an
ordinary LBS_HASSTRINGS list box, with tab characters aligning the
various parts of the key information into different columns. This was
fragile because any mistake in the font metrics could have overflowed
a tab stop and forced the text to move on to the next one.
Now I've switched the list box into LBS_OWNERDRAWFIXED mode, which
means that in place of a string for each list item I store a struct of
my choice, and I have to draw the list-box entries myself by
responding to WM_DRAWITEM. So now I'm drawing each component of the
key information as a separate call to ExtTextOut (plus one
TabbedTextOut to put the '(encrypted)' suffix on the end), which means
that the tab stops are now guaranteed to appear where I tell them to.
No functional change, for the moment: this is pure refactoring. As
closely as I can tell, the appearance of the list box is
pixel-for-pixel what it was before this commit. But it opens the door
for two further improvements (neither one done in this commit): I can
dynamically choose the tab stop locations based on querying the text
metrics of the strings that will actually need to fit in the columns,
and also, whatever reorganisation I need to do to make certificates
fit sensibly in this list box can now be done without worrying about
breaking anything terribly fragile.
The main list box in the Pageant key list window was identified by a
numeric control id, even though pageant-rc.h has a nice meaningful
macro name for it (and pageant.c uses that).
When PuTTYgen is holding a certified key, I don't think there's any
sensible use for pasting around the full public key in authorized_keys
format, because the whole point is that what you put in
authorized_keys is 'please trust this CA' rather than the specific
key. So instead I've reused the space in the dialog box to indicate
that it's a certificate, and provide a 'more info' sub-dialog.
I'm about to want setupbigedit1 and setupbigedit2 to know the control
ids themselves, and also add more controls to the enum, and it keeps
the diffs more legible if I move the entire enum around unchanged
_first_ and then start making small changes in the middle of it.
Most of the Windows-specific dialog control construction functions
were passing their string parameters as 'char *' even though they were
string literals. Apparently none of our previous giant constification
patches spotted that.
A user reports that if the Print Spooler service is disabled via
services.msc, then PuTTY can report 'Out of memory!' when you try to
open the Terminal config pane, which is the one containing the combo
box enumerating the available printers.
Apparently this is because the call to EnumPrinters failed with the
error code other than the expected ERROR_INSUFFICIENT_BUFFER, and in
the process, left garbage in the pcbNeeded output parameter. That
wouldn't be too surprising if it had simply _not written_ to that
parameter and therefore it was never initialised at all in the calling
function printer_add_enum. But in fact, printer_add_enum *does*
precautionarily initialise needed=0 before the initial call to
EnumPrinters. So EnumPrinters must have actively written one of its
*own* uninitialised variables into it!
Anyway, the obvious fix is to distinguish ERROR_INSUFFICIENT_BUFFER
from any other kind of EnumPrinters failure (in fact turning off Print
Spooler seems to lead to RPC_S_SERVER_UNAVAILABLE), and not attempt to
proceed in the case of other failures.
In the 'xterm 216+' function key mode, a function key pressed with a
combination of Shift, Ctrl and Alt has its usual sequence like
ESC[n~ (for some integer n) turned into ESC[n;m~ where m-1 is a 3-bit
bitmap of currently pressed modifier keys.
This mode now also applies to the keys on the small keypad above the
arrow keys (Ins, Home, PgUp etc). If xterm 216+ mode is selected,
those keys are modified in the same way as the function keys.
As with the function keys, this doesn't guarantee that PuTTY will
_receive_ any particular shifted key of this kind, and not repurpose
it. Just as Alt+F4 still closes the window (at least on Windows)
rather than sending a modified F4 sequence, Shift+Ins will still
perform a paste action rather than sending a modified Ins sequence,
Shift-PgUp will still scroll the scrollback, etc. But the keys not
already used by PuTTY for other purposes should now have their
modern-xterm behaviour in modern-xterm mode.
Thanks to H.Merijn Brand for developing and testing a version of this
patch.
The text of the host key warnings was replicated in three places: the
Windows rc file, the GTK dialog setup function, and the console.c
shared between both platforms' CLI tools. Now it lives in just one
place, namely ssh/common.c where the rest of the centralised host-key
checking is done, so it'll be easier to adjust the wording in future.
This comes with some extra automation. Paragraph wrapping is no longer
done by hand in any version of these prompts. (Previously we let GTK
do the wrapping on GTK, but on Windows the resource file contained a
bunch of pre-wrapped LTEXT lines, and console.c had pre-wrapped
terminal messages.) And the dialog heights in Windows are determined
automatically based on the amount of stuff in the window.
The main idea of all this is that it'll be easier to set up more
elaborate kinds of host key prompt that deal with certificates (if,
e.g., a server sends us a certified host key which we don't trust the
CA for). But there are side benefits of this refactoring too: each
tool now reliably inserts its own appname in the prompts, and also, on
Windows the entire prompt text is copy-pastable.
Details of implementation: there's a new type SeatDialogText which
holds a set of (type, string) pairs describing the contents of a
prompt. Type codes distinguish ordinary text paragraphs, paragraphs to
be displayed prominently (like key fingerprints), the extra-bold scary
title at the top of the 'host key changed' version of the dialog, and
the various information that lives in the subsidiary 'more info' box.
ssh/common.c constructs this, and passes it to the Seat to present the
actual prompt.
In order to deal with the different UI for answering the prompt, I've
added an extra Seat method 'prompt_descriptions' which returns some
snippets of text to interpolate into the messages. ssh/common.c calls
that while it's still constructing the text, and incorporates the
resulting snippets into the SeatDialogText.
For the moment, this refactoring only affects the host key prompts.
The warnings about outmoded crypto are still done the old-fashioned
way; they probably ought to be similarly refactored to use this new
SeatDialogText system, but it's not immediately critical for the
purpose I have right now.
They were previously using an ad-hoc system for getting hold of their
context structure, which had to be different between the WM_INITDIALOG
handler and everything else. That's exactly what ShinyDialogBox is
good for preventing, so let's use it, to save effort.
Partly, this just seems more sensible, since it may well vary per
platform beyond the ability of intorptr to specify. But more
immediately it means the definition of the HELPCTX macro doesn't have
to use the P() function from dialog.h, which isn't defined in any
circumstances outside the config subsystem. And I'm about to want to
put a help context well outside that subsystem.
This replaces the previous placeholder scheme of having a list of
hostname wildcards with implicit logical-OR semantics (if any wildcard
matched then the certificate would be trusted to sign for that host).
That scheme didn't allow for exceptions within a domain ('everything
in example.com except extra-high-security-machine.example.com'), and
also had no way to specify port numbers.
In the new system, you can still write a hostname wildcard by itself
in the simple case, but now those are just atomic subexpressions in a
boolean-logic domain-specific language I've made up. So if you want
multiple wildcards, you can separate them with || in a single longer
expression, and also you can use && and ! to impose exceptions on top
of that.
Full details of the expression language are in the comment at the top
of utils/cert-expr.c. It'll need documenting properly before release,
of course.
For the sake of backwards compatibility for early adopters who've
already set up configuration in the old system, I've put in some code
that will read the old MatchHosts configuration and automatically
translate it into the equivalent boolean expression (by simply
stringing together the list of wildcards with || between them).
This manipulates the selection inside an edit box, to select a
specific range of characters in the contained text. The idea is that
you can use it as a means of error highlighting, if the user has
entered something invalid in that edit box and you want to draw their
attention to the specific location of the part you're unhappy with.
ctrl_radiobuttons has a variadic argument list that it expects to be
terminated by a null pointer where a const char * should be. So the
terminating NULL at each call site ought to be cast to const char *,
for the usual reason (NULL might expand literally to something not the
same size as a pointer and get mis-marshalled in the variadic argument
list). It wasn't being.
Fixed in the same way as dupcat: I've turned ctrl_radiobuttons into a
macro wrapper on the underlying function, which adds the NULL
automatically with its correct cast. This also saves verbiage at every
call site, because now I can leave out all the NULLs there.
Instead of maintaining a single sparse table mapping Unicode to the
currently selected code page, we now maintain a collection of such
tables mapping Unicode to any code page we've so far found a need to
work with, and we add code pages to that list as necessary, and never
throw them away (since there are a limited number of them).
This means that the wc_to_mb family of functions are effectively
stateless: they no longer depend on a 'struct unicode_data'
corresponding to the current terminal settings. So I've removed that
parameter from all of them.
This fills in the missing piece of yesterday's commit a216d86106:
now wc_to_mb too should be able to handle internally-implemented
character sets, by hastily making their reverse mapping table if it
doesn't already have it.
(That was only a _latent_ bug, because the only use of wc_to_mb in the
cross-platform or Windows code _did_ want to convert to the currently
selected code page, so the old strategy worked in that case. But there
was no protection against an unworkable use of it being added later.)
A user points out that the new charset-aware window title setting
doesn't work if the configured character set is one of the entries in
cp_list[] based on a hard-coded Unicode translation table, such as the
ISO 8859 family.
That's because the Windows mb_to_wc() function assumes that the code
page it's given will always be OK to pass to the Windows API function
MultiByteToWideChar, forgetting that for those internally implemented
single-byte character sets are not.
This commit adds a manual implementation of SBCS -> Unicode based on
those tables, which restores the ability to set a window title
specified in ISO 8859.
However, it's not a full fix to windows/unicode.c in general, because
wc_to_mb has a similar blind spot: it's only prepared to convert
Unicode to an internally implemented SBCS if that SBCS happens to be
the one currently set in ucsdata->line_codepage, because that's when
we've already prepared the reverse lookup table. Probably we ought to
sort that out, and arrange that it can make the reverse lookup table
if suddenly called on to do a different conversion. But that needs
more refactoring, so I haven't done it in this commit.
Fixes a cosmetic issue where the new ConPTY error added in 4ae8b742ab
had an ugly "Unable to open connection to".
(Arguably this ought to test a backend property rather than
cmdline_tooltype.)
This makes pterm.exe support the same (very small) subset of the
standard option collection that Unix pterm does. Namely, -load (which
won't do anything useful with a hostname to connect to, but is still
useful if you have a saved session containing configuration like
colours or default size or what have you), and also -sessionlog.
To make this work, I've had to move the 'tooltype' definition out of
window.c into {putty,pterm}.c, so that it can be defined differently
in the two.
Turns out that standard C 'size_t' and the Win32 API's 'SIZE_T' aren't
the same integer type in all cases! They have the same _size_, but in
32-bit, one of them is a typedef for 'unsigned int' and the other for
'unsigned long', leading to annoying pointer-type mismatch warnings.