A user reports, _just_ in time to make the 0.79 release, that changes
in the Windows port of OpenSSH from 8.9.x have made it unhappy with
the use of \ as a path separator in the 'IdentityAgent' config
directive. Switch to /, which is also accepted by earlier versions, so
it should work everywhere.
The pathname of Pageant's named pipe includes the name of the user
running it. And Windows usernames are allowed to have spaces in! So
the pipe pathname may also have a space, in which case Windows OpenSSH
will interpret the spacey pathname as an invalid first half followed
by a trailing garbage word.
A user reports that quoting the filename makes this work. Since double
quotes are an illegal Windows filename character, I think it should
therefore do no harm to quote it unconditionally, which is the easiest
fix.
When you send a "publickey" USERAUTH_REQUEST containing a certified
RSA key, and you want to use a SHA-2 based RSA algorithm, modern
OpenSSH expects you to send the algorithm string as
rsa-sha2-NNN-cert-v01@openssh.com. But 7.7 and earlier didn't
recognise those names, and expected the algorithm string in the
userauth request packet to be ssh-rsa-cert-v01@... and would then
follow it with an rsa-sha2-NNN signature.
OpenSSH itself has a bug workaround for its own older versions. Follow
suit.
In commit d07d7d66f6 I rewrote the code that constructs RTF paste
data so that it uses a strbuf, in place of the previous ad-hoc code
that counted up the lengths of pieces of RTF in advance in order to
realloc the buffer.
But apparently I left in an entire loop whose job was to count up one
of those lengths, failing to notice that it's now completely pointless
because its output value is never needed!
Happily a clang upgrade has just improved the 'variable set but not
used' warning to the point where it can spot that. I expect previously
the variable still counted as 'used' because each increment of it used
the previous value.
(cherry picked from commit 6a27ae772c)
When the standalone version of a binary, with its help file included
as a resource, extracts that resource to write it to a disk, it could
have accidentally skipped a byte in the middle if the WriteFile call
in this loop had not managed to write the whole file in one go.
(cherry picked from commit 775d969ca8)
This reverts commit 0615767224
("Windows installer: remove explicit InstallScope setting"), albeit
with different comments.
The original change worked around a Windows security vulnerability
(CVE-2023-21800), but also resulted in a rather broken installer.
(cherry picked from commit cedeb75d59)
I checked exhaustively today and found that the only characters (even
in Unicode) that Windows's default argv splitter will recognise as
word separators are the space and tab characters. So I think it's a
mistake to use <ctype.h> functions to identify word separators; we
should use that fixed character pair, and then we know we're getting
the right ones only.
(cherry picked from commit 9adfa79767)
I thought I'd found all of these before, but perhaps a few managed to
slip in since I last looked. The character argument to the <ctype.h>
functions must have the value of an unsigned char or EOF; passing an
ordinary char (unless you know char is unsigned on every platform the
code will ever go near) risks mistaking '\xFF' for EOF, and causing
outright undefined behaviour on byte values in the range 80-FE. Never
do it.
(cherry picked from commit a76109c586)
Horizontal scroll events aren't generated by the traditional mouse
wheel, but they can be generated by trackpad gestures, though this
isn't always configured on.
The cross-platform and Windows parts of this patch is due to
Christopher Plewright; I added the GTK support.
(cherry picked from commit 819efc3c21)
From https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-Any-event-tracking:
Any-event mode is the same as button-event mode, except that all motion
events are reported, even if no mouse button is down. It is enabled by
specifying 1003 to DECSET.
Normally the front ends only report mouse events when buttons are
pressed, so we introduce a MA_MOVE event with MBT_NOTHING set to
indicate such a mouse movement.
(cherry picked from commit 3cfbd3df0f)
A server attempt to resize the window (for instance via DECCOLM) when
"When window is resized" was set to "Forbid resizing completely" would
cause all terminal output to be suspended, due to the resize attempt
never being acknowledged.
(There are other code paths like this, which I've fixed for
completeness, but I don't think they have any effect: the terminal
filters out resize attempts to the current size before this point, and
even if a server can get such a request through the SUPDUP protocol, the
test for that is wrong and will never fire -- this needs fixing
separately.)
(cherry picked from commit ebceb8bc94)
The "Cancel" button's keyboard shortcut was accidentally removed by
f1c8298000, having only just reinstated it in a77040afa1.
(Also, fix a couple of blatantly fibbing "accelerators used" comments.)
It turns out this isn't actually necessary after all to make the
installer behave in the expected way in the default case (giving a UAC
prompt and installing systemwide). And I'm told it has undesirable
consequences in more complicated cases, which I'm not expert enough in
MSI to fully understand.
The conditionalisation of that call on 'protocol == PROT_SSH' has been
around since the beginning of our git history. But in those days,
random_save_seed() was unconditional _internally_ - it would always
create and write to the seed file regardless of whether the random
pool had even been initialised, let alone used.
Now random_save_seed() has its own internal condition which prevents
it doing anything if the random subsystem was never started up in the
first place. So it's better to call it unconditionally from
cleanup_exit, and then it'll be able to do its thing whenever needed,
without having to second-guess based on the top-level protocol.
(In fact, that's what all the other implementations of cleanup_exit()
have done all along. On Unix, and in Windows console apps, we do call
random_save_seed() unconditionally, and expect it to uncomplainingly
do nothing if there's nothing to do.)
(cherry picked from commit 260aad5fca)
In cases where we refuse a resize request, either because it's too
large or because the window is not currently resizable due to being
maximised, we were failing to communicate that back to the Terminal so
that it could stop waiting for the resize and resume processing input.
Those have been there since around 2001. They're in a piece of code
that calls get_fullscreen_rect to find the overall screen size, and
then prevents attempts to resize the window larger than that. The
static variables were arranging that we don't have to call
get_fullscreen_rect more than once.
But, firstly, computers are faster 20 years on; secondly, remote
window-resize requests are intentionally rate-limited (as of commit
d74308e90e), so this shouldn't be the limiting factor anyway; and
thirdly, multi-monitor support has appeared since then, which means
that if the window has been dragged from one monitor to another then
get_fullscreen_rect might _legitimately_ return a different bounding
rectangle when called a second time.
So we should just do the full check every time unconditionally.
(cherry picked from commit 4b3a8cbf61)
It complained in console_confirm_ssh_host_key that if the caller
passed in a SeatDialogText containing no SDT_PROMPT record, then
'prompt' would be uninitialised.
The answer is "don't do that, then", but fair enough that Coverity
didn't know that. Added an assertion, which should keep it happy, and
also cause better error handling if we ever _do_ make that mistake.
Coverity complained in keylist_update_callback that in one if
statement I was allowing for the possibility that alg == NULL, and in
the next, I was assuming it would always be non-null.
Right now I'm not actually convinced that _either_ check is necessary
- it would make sense in an agent _client_, where you might be talking
to an agent that knows key algorithms you don't, but this is the GUI
built into Pageant itself, so any key it can store internally ought to
have a known algorithm name.
Still, this fix is certainly _correct_ even if not optimal, and it'll
do for now.
Coverity points out that if we refer to cp_list[codepage - 65536], we
ought to have ensured that codepage - 65536 was _less_ than
lenof(cp_list), not just less or equal.
We had carefully calculated, for each control in an aligned group, how
much _that control_ should move downwards by. But then, because I
carelessly referred to the wrong variable name, we actually moved the
wrong one - not the control we'd just calculated the offset for, but
always the _last_ one in the group, which was the one the top-level
alignment code was processing at the point we began this loop.
As a result, the dropdown list in the front-page protocol selector was
hilariously misaligned. Now it's back where it should be.
My experimental build with clang-cl at -Wall did show up a few things
that are safe enough to fix right now. One was this list of missing
includes, which was causing a lot of -Wmissing-prototype warnings, and
is a real risk because it means the declarations in headers weren't
being type-checked against the actual function definitions.
Happily, no actual mismatches.
Use of thread-local storage on Windows (introduced recently in
69e8d471d1) could cause a -Wattributes warning in mingw-w64 builds,
since that toolchain doesn't understand __declspec(thread).
Define a portability macro THREADLOCAL, which should resolve to
something appropriate for at least:
- MSVC, which understands the Microsoft syntax __declspec(thread);
- GCC (e.g., mingw-w64) which understands the GNU syntax __thread
(GCC only implements __declspec() to the extent of understanding the
arguments 'dllexport' and 'dllimport');
- Clang, which supports both syntaxes.
(It's possible there's some more obscure Windows toolchain which will
now hit the #error as a result of this change.)
I haven't attempted to try to detect and use the C11 syntax
'thread_local'. And this is all still Windows-only, since that's all we
need for now and it avoids opening the can of worms that is TLS on other
platforms.
(I considered delegating all this to cmake, but as well as being fiddly,
it seems even the latest versions of cmake don't know about thread-local
storage for C, as opposed to C++ (cxx_thread_local).)
In recent months I've had two requests from different people to build
support into PuTTY for automatically handling complicated third-party
auth protocols layered on top of keyboard-interactive - the kind of
thing where you're asked to enter some auth response, and you have to
refer to some external source like a web server to find out what the
right response _is_, which is a pain to do by hand, so you'd prefer it
to be automated in the SSH client.
That seems like a reasonable thing for an end user to want, but I
didn't think it was a good idea to build support for specific
protocols of that kind directly into PuTTY, where there would no doubt
be an ever-lengthening list, and maintenance needed on all of them.
So instead, in collaboration with one of my correspondents, I've
designed and implemented a protocol to be spoken between PuTTY and a
plugin running as a subprocess. The plugin can opt to handle the
keyboard-interactive authentication loop on behalf of the user, in
which case PuTTY passes on all the INFO_REQUEST packets to it, and
lets it make up responses. It can also ask questions of the user if
necessary.
The protocol spec is provided in a documentation appendix. The entire
configuration for the end user consists of providing a full command
line to use as the subprocess.
In the contrib directory I've provided an example plugin written in
Python. It gives a set of fixed responses suitable for getting through
Uppity's made-up k-i system, because that was a reasonable thing I
already had lying around to test against. But it also provides example
code that someone else could pick up and insert their own live
response-provider into the middle of, assuming they were happy with it
being in Python.
We already have the ability to start a subprocess and hook it up to a
Socket, for running local proxy commands. Now the same facility is
available as an auxiliary feature, so that a backend can start another
subcommand for a different purpose, and make a separate Socket to
communicate with it.
Just like the local proxy system, this facility captures the
subprocess's stderr, and passes it back to the caller via plug_log. To
make that not look silly, I had to add a system where the "proxy:"
prefix on the usual plug_log messages is reconfigurable, and when you
call platform_start_subprocess(), you get to pass the prefix you want
to use in this case.
I made a specific subdirectory 'stubs' to keep all the link-time stub
modules in, like notiming.c. And I put _one_ run-time stub in it,
namely nullplug.c. But the rest of the runtime stubs went into utils.
I think it's better to keep all the stubs together, so I've moved all
the null*.c in utils into stubs (with the exception of nullstrcmp.c,
which means the 'null' in a different sense). Also, fiddled with the
naming to be a bit more consistent, and stated in the new CMakeLists
the naming policy that distinguishes no-*.c from null-*.c.
We've occasionally had reports of SSH servers disconnecting as soon as
they receive PuTTY's KEXINIT. I think all such reports have involved
the kind of simple ROM-based SSH server software you find in small
embedded devices.
I've never been able to prove it, but I've always suspected that one
possible cause of this is simply that PuTTY's KEXINIT is _too long_,
either in number of algorithms listed or in total length (especially
given all the ones that end in @very.long.domain.name suffixes).
If I'm right about either of those being the cause, then it's just
become even more likely to happen, because of all the extra
Diffie-Hellman groups and GSSAPI algorithms we just threw into our
already-long list in the previous few commits.
A workaround I've had in mind for ages is to wait for the server's
KEXINIT, and then filter our own down to just the algorithms the
server also mentioned. Then our KEXINIT is no longer than that of the
server, and hence, presumably fits in whatever buffer it has. So I've
implemented that workaround, in anticipation of it being needed in the
near future.
(Well ... it's not _quite_ true that our KEXINIT is at most the same
length as the server. In fact I had to leave in one KEXINIT item that
won't match anything in the server's list, namely "ext-info-c" which
gates access to SHA-2 based RSA. So if we turn out to support
absolutely everything on all the server's lists, then our KEXINIT
would be a few bytes longer than the server's, even with this
workaround. But that would only cause trouble if the server's outgoing
KEXINIT was skating very close to whatever buffer size it has for the
incoming one, and I'm guessing that's not very likely.)
((Another possible cause of this kind of disconnection would be a
server that simply objects to seeing any KEXINIT string it doesn't
know how to speak. But _surely_ no such server would have survived
initial testing against any full-featured client at all!))
I only recently found out that OpenSSH defined their own protocol IDs
for AES-GCM, defined to work the same as the standard ones except that
they fixed the semantics for how you select the linked cipher+MAC pair
during key exchange.
(RFC 5647 defines protocol ids for AES-GCM in both the cipher and MAC
namespaces, and requires that you MUST select both or neither - but
this contradicts the selection policy set out in the base SSH RFCs,
and there's no discussion of how you resolve a conflict between them!
OpenSSH's answer is to do it the same way ChaCha20-Poly1305 works,
because that will ensure the two suites don't fight.)
People do occasionally ask us for this linked cipher/MAC pair, and now
I know it's actually feasible, I've implemented it, including a pair
of vector implementations for x86 and Arm using their respective
architecture extensions for multiplying polynomials over GF(2).
Unlike ChaCha20-Poly1305, I've kept the cipher and MAC implementations
in separate objects, with an arm's-length link between them that the
MAC uses when it needs to encrypt single cipher blocks to use as the
inputs to the MAC algorithm. That enables the cipher and the MAC to be
independently selected from their hardware-accelerated versions, just
in case someone runs on a system that has polynomial multiplication
instructions but not AES acceleration, or vice versa.
There's a fourth implementation of the GCM MAC, which is a pure
software implementation of the same algorithm used in the vectorised
versions. It's too slow to use live, but I've kept it in the code for
future testing needs, and because it's a convenient place to dump my
design comments.
The vectorised implementations are fairly crude as far as optimisation
goes. I'm sure serious x86 _or_ Arm optimisation engineers would look
at them and laugh. But GCM is a fast MAC compared to HMAC-SHA-256
(indeed compared to HMAC-anything-at-all), so it should at least be
good enough to use. And we've got a working version with some tests
now, so if someone else wants to improve them, they can.
We're now setting the help context centrally in ssh/common.c - but I
forgot to remove the _old_ assignment statements, which overwrite
whatever that asks for. Oops.
OpenSSH, when called on to give the fingerprint of a certified public
key, will in many circumstances generate the hash of the public blob
of the _underlying_ key, rather than the hash of the full certificate.
I think the hash of the certificate is also potentially useful (if
nothing else, it provides a way to tell apart multiple certificates on
the same key). But I can also see that it's useful to be able to
recognise a key as the same one 'really' (since all certificates on
the same key share a private key, so they're unavoidably related).
So I've dealt with this by introducing an extra pair of fingerprint
types, giving the cross product of {MD5, SHA-256} x {base key only,
full certificate}. You can manually select which one you want to see
in some circumstances (notably PuTTYgen), and in others (such as
diagnostics) both fingerprints will be emitted side by side via the
new functions ssh2_double_fingerprint[_blob].
The default, following OpenSSH, is to just fingerprint the base key.
This patch fixes a few other whitespace and formatting issues which
were pointed out by the bulk-reindent or which I spotted in passing,
some involving manual editing to break lines more nicely.
I think the weirdest hunk in here is the one in windows/window.c
TranslateKey() where _half_ of an assignment statement inside an 'if'
was on the same line as the trailing paren of the if condition. No
idea at all how that one managed to happen!
I think a lot of these were inserted by a prior run through GNU indent
many years ago. I noticed in a more recent experiment that that tool
doesn't always correctly distinguish which instances of 'id * id' are
pointer variable declarations and which are multiplications, so it
spaces some of the former as if they were the latter.
If the function name (or expression) in a function call or declaration
is itself so long that even the first argument doesn't fit after it on
the same line, or if that would leave so little space that it would be
silly to try to wrap all the run-on lines into a tall thin column,
then I used to do this
ludicrously_long_function_name
(arg1, arg2, arg3);
and now prefer this
ludicrously_long_function_name(
arg1, arg2, arg3);
I picked up the habit from Python, where the latter idiom is required
by Python's syntactic significance of newlines (you can write the
former if you use a backslash-continuation, but pretty much everyone
seems to agree that that's much uglier). But I've found it works well
in C as well: it makes it more obvious that the previous line is
incomplete, it gives you a tiny bit more space to wrap the following
lines into (the old idiom indents the _third_ line one space beyond
the second), and I generally turn out to agree with the knock-on
indentation decisions made by at least Emacs if you do it in the
middle of a complex expression. Plus, of course, using the _same_
idiom between C and Python means less state-switching.
So, while I'm making annoying indentation changes in general, this
seems like a good time to dig out all the cases of the old idiom in
this code, and switch them over to the new.
My aim has always been to have those back-dented by 2 spaces (half an
indent level) compared to the statements around them, so that in
particular switch statements have distinct alignment for the
statement, the cases and the interior code without consuming two whole
indent levels.
This patch sweeps up all the violations of that principle found by my
bulk-reindentation exercise.
My bulk indentation check also turned up a lot of cases where a run-on
function call or if statement didn't have its later lines aligned
correctly relative to the open paren.
I think this is quite easy to do by getting things out of
sync (editing the first line of the function call and forgetting to
update the rest, perhaps even because you never _saw_ the rest during
a search-replace). But a few didn't quite fit into that pattern, in
particular an outright misleading case in unix/askpass.c where the
second line of a call was aligned neatly below the _wrong_ one of the
open parens on the opening line.
Restored as many alignments as I could easily find.
In several pieces of development recently I've run across the
occasional code block in the middle of a function which suddenly
switched to 2-space indent from this code base's usual 4. I decided I
was tired of it, so I ran the whole code base through a re-indenter,
which made a huge mess, and then manually sifted out the changes that
actually made sense from that pass.
Indeed, this caught quite a few large sections with 2-space indent
level, a couple with 8, and a handful of even weirder things like 3
spaces or 12. This commit fixes them all.
The fixed tab stops that we used to use in the old LBS_HASSTRINGS list
box, and that I carefully replicated in the new owner-drawn version,
are no more! Now, every time we refresh the key list, we actually
_measure_ the maximum size of string that needs to fit into each
column, and size the columns based on that.
Now I don't have to worry any more about whether the set of algorithm
names might one day overflow the fixed column width, or whether a
particularly unlucky choice of key with lots of wide letters like M
and W in its base64-encoded SHA256 hash might do the same.
Also, the previous column sizes were pessimistic (for reason of
exactly that worry), so this change generally moves things over
towards the left of the list box - which means there's now room for
longer key comments, and more chance of the suffixes '(encrypted)' or
'(re-encryptable)' being visible on the right.
The test in the Pageant list box code for whether we should display
the bit count of a key was done by checking specifically for ssh_rsa
or ssh_dsa, which of course meant that it didn't catch the certified
versions of those keys.
Now there's yet another footling ssh_keyalg method that asks the
question 'is it worth displaying the bit count?', to which RSA and DSA
answer yes, and the opensshcert family delegates to its base key type,
so that RSA and DSA certified keys also answer yes.
(This isn't the same as ssh_key_public_bits(alg, blob) >= 0. All
supported public key algorithms _can_ display a bit count if called
on. But only in RSA and DSA is it configurable, and therefore worth
bothering to print in the list box.)
Also in this commit, I've fixed a bug in the certificate
implementation of public_bits, which was passing a wrongly formatted
public blob to the underlying key. (Done by factoring out the code
from opensshcert_new_shared which constructed the _correct_ public
blob, and reusing it in public_bits to do the same job.)
If you load a certified key into Windows Pageant, the official SSH id
for the key type is so long that it overflows its space in the list
box and overlaps the key fingerprint hash.
This commit introduces yet another footling little ssh_keyalg method
which returns a shorter human-readable description of the key type,
and uses that in the Windows Pageant list box only.
(Not in the Unix Pageant list, though, because being output to stdout,
that seems like something people are more likely to want to
machine-read, which firstly means we shouldn't change it lightly, and
secondly, if we did change it we'd want to avoid having a variable
number of spaces in the replacement key type text.)
The main key list control in the Pageant window was previously an
ordinary LBS_HASSTRINGS list box, with tab characters aligning the
various parts of the key information into different columns. This was
fragile because any mistake in the font metrics could have overflowed
a tab stop and forced the text to move on to the next one.
Now I've switched the list box into LBS_OWNERDRAWFIXED mode, which
means that in place of a string for each list item I store a struct of
my choice, and I have to draw the list-box entries myself by
responding to WM_DRAWITEM. So now I'm drawing each component of the
key information as a separate call to ExtTextOut (plus one
TabbedTextOut to put the '(encrypted)' suffix on the end), which means
that the tab stops are now guaranteed to appear where I tell them to.
No functional change, for the moment: this is pure refactoring. As
closely as I can tell, the appearance of the list box is
pixel-for-pixel what it was before this commit. But it opens the door
for two further improvements (neither one done in this commit): I can
dynamically choose the tab stop locations based on querying the text
metrics of the strings that will actually need to fit in the columns,
and also, whatever reorganisation I need to do to make certificates
fit sensibly in this list box can now be done without worrying about
breaking anything terribly fragile.
The main list box in the Pageant key list window was identified by a
numeric control id, even though pageant-rc.h has a nice meaningful
macro name for it (and pageant.c uses that).
When PuTTYgen is holding a certified key, I don't think there's any
sensible use for pasting around the full public key in authorized_keys
format, because the whole point is that what you put in
authorized_keys is 'please trust this CA' rather than the specific
key. So instead I've reused the space in the dialog box to indicate
that it's a certificate, and provide a 'more info' sub-dialog.
I'm about to want setupbigedit1 and setupbigedit2 to know the control
ids themselves, and also add more controls to the enum, and it keeps
the diffs more legible if I move the entire enum around unchanged
_first_ and then start making small changes in the middle of it.
Most of the Windows-specific dialog control construction functions
were passing their string parameters as 'char *' even though they were
string literals. Apparently none of our previous giant constification
patches spotted that.