In a rekey, we expect to see the same host key again, which we enforce
by comparing its cache string, which we happened to have handy. But
certified host keys don't have cache strings, so this no longer works
reliably - the 'assert(s->keystr)' fails.
(This is what I get for making a zillion short-lived test connections
and not leaving any of them running for more than 2 minutes!)
Instead, we now keep the official public blob of the host key from the
first key exchange, and compare that to the public blob of the one in
the rekey.
prepare_session() is the function that takes a Conf in the form it was
loaded from the configuration, and normalises it into the form that
backends can make sense of easily. In particular, if CONF_host
contains something like "user@hostname", this is the place where the
"user@" is stripped off the hostname field and moved into
CONF_username where the backend will expect to find it.
Therefore, you're _always_ supposed to call prepare_session() before
launching a backend from a Conf you (potentially) got from a saved
session. But the SSH proxy code forgot to.
As a result, if you had a saved session with "user@hostname" in the
hostname field, it would work fine to launch that session directly in
PuTTY, but trying to use the same saved session as an SSH proxy would
fail mysteriously after trying to pass "user@hostname" to
getaddrinfo() and getting nothing useful back.
(On Windows, the error message is especially opaque: an invalid string
of that kind generates an error code that we weren't even tranlsating.
And even if we _do_ translate it, it wouldn't be very meaningful,
because the error in question is WSANO_RECOVERY, which FormatMessage
renders as "A non-recoverable error occurred during a database
lookup." I guess the error in question was that Windows couldn't even
figure out how to translate that string into the format of a DNS
request message!)
In the initial version of SSH proxying that only opened direct-tcpip
channels, this wasn't important. But as of commit 6f7c52dcce, we
now support invoking a command or subsystem on the proxy SSH server,
and those _can_ generate stderr data which we must now separate from
stdout.
Happily, we have a perfectly sensible thing to _do_ with it: the same
thing we'd do with stderr coming from a local proxy subprocess, to
wit, pass it to log_proxy_stderr so that it can appear in the terminal
window (if configured to) and the Event Log.
I just tried to trace through the Windows version's control flow in
response to a confusing bug report, and found that the control flow
itself was so confusing I couldn't make sense of it. Why are we
choosing between getaddrinfo and gethostbyname via #ifndef NO_IPV6,
then re-converging control flow and diverging a second time to report
the error?
So I rewrote the whole thing to have completely separate sections of
code dealing with the three resolution strategies, each with its own
dedicated error reporting system. And then I checked the Unix version
and found it was about as confusing, so I rewrote that too in the same
style. Now the two are mostly the same, except for details: Unix has
an override at the top for a Unix socket pathname, Windows has to cope
with getaddrinfo maybe not being found at run time (so the other cases
aren't in the #else clause), and Windows uses the same error reporting
for both lookup functions whereas Unix has to use the appropriate
gai_strerror or hstrerror.
To begin with, Windows's own API documentation doesn't recommend using
it (for thread-safety reasons), and promises that the error codes
returned from getaddrinfo are aliases for the normal Windows error
code enumeration. So it's safe, and quite likely preferable, to just
use ordinary win_strerror instead.
But more embarrassingly, my attempt to acquire and use gai_strerror
from one or other Winsock DLL didn't even *work*! Because of course
it's a function that handles strings, which means it comes in two
variants, gai_strerrorA and gai_strerrorW, so in order to look it up
using GetProcAddress, I should have specified which I wanted. And I
didn't, so the lookup always failed.
This should improve error reporting in cases of interesting kinds of
DNS failure.
We're assigning string literals into it all over the place, so it
should have been const char * all along. No thanks to any of the
compilers that didn't point that out!
It's a thoroughly pedantic point, but I just spotted that the
comparison of wire data against theoretically platform-dependent char
escapes is a violation of \k{udp-portability}.
In commit 7d44e35bb3 I introduced a bug: we were providing an
array of MAXKEXLIST ints to ssh2_scan_kexinits() to write a list of
server-supplied host keys into, and when MAXKEXLIST stopped being a
thing, I mindlessly replaced it with an array dynamically allocated to
the number of host key types we'd offered the server.
But we return a list of host key types the _server_ offered _us_ (and
that we can speak at all), which isn't necessarily the same thing. In
particular, if you deliberately ask to cache a new host key type from
the specials menu, we send a KEXINIT offering just _one_ host key
type, namely the one you've asked for. But that loop still writes down
all the key types it gets back from the server, which is (almost
certainly) more than one. So the array overflows.
In that situation we don't really need the returned array of key types
at all, but it's easier to just make it work than to add conditionals.
Replaced it with a dynamically grown array in the usual sort of way.
As well as eliminating the null-pointer dereference, I also now
realise that the format-translation code depended on leaving the final
translated string in 'otherstr' in order to pass the host key check
afterwards (if they match).
I've also now realised that this only applies to *SSH-1* RSA keys, so
it's even more obsolete than I thought before. Perhaps I should just
remove this code instead of spending all this effort on fixing it. But
I've done the fix now, so I'll commit it, and then maybe we can remove
it afterwards (and have a working version of it available to resurrect
if ever needed!).
A user points out that in commit 6143a50ed2, when I converted all
use of the registry to functions that return a newly allocated buffer
instead of allocating a buffer themselves beforehand, I overlooked
that one use of the old idiom was reusing the preallocated buffer as
work space.
I _hope_ nobody still needs this code - the 'old-style' host key cache
format it handles was replaced in 2000. If anyone has a PuTTY host key
cache entry that's survived 22 years without either having to be
reinitialised on a new system or changed when the machine's host key
was upgraded, they're doing better than I am!
But if it's still here, it should still work, obviously. Replaced the
reused buffer with a strbuf, which is more robust anyway.
Initial live testing pointed out that the ssh_keyalg corresponding to
the certified version of rsa-sha2-512 was expecting to see the SSH id
string "rsa-sha2-512-cert-v01@openssh.com" at the start of the public
key blob, whereas in fact, the _key_ type identifier is still
"ssh-rsa-...", just as the key type for base rsa-sha2-512 is base
ssh-rsa.
Fixed inside openssh-certs.c, by adding a couple more strings to the
'extra' structure.
As of df3a21d97b, this panel has two "Browse..." buttons, and is the
first to do so. On Windows, the shortcut for this file chooser button
was hardcoded to 'w', causing an assertion failure whenever switching to
the Auth panel.
I've just removed the shortcut -- there are others near enough that it's
easy to reach the "Browse..." button with Tab.
This uses the test-CA code to construct a series of certificates with
various properties so as to check all the error cases of certificate
validation. It also tests the various different key types, and all the
RSA signature flags on both the certified key and the certifying one.
This is mostly intended to be invoked from cryptsuite, so that I can
make test certificates with various features to check the validation
function. But it also has a command-line interface, which currently
contains just enough features that I was able to generate a
certificate and actually make sure OpenSSH accepted it (proving that I
got the format right in this script).
You _could_ expand this script into a full production CA, with a
couple more command-line options, if you didn't mind the slightly
awkward requirement that in command-line mode it insists on doing its
signing via an SSH agent. But for the moment it's only intended for
test purposes.
Now we offer the OpenSSH certificate key types in our KEXINIT host key
algorithm list, so that if the server has a certificate, they can send
it to us.
There's a new storage.h abstraction for representing a list of trusted
host CAs, and which ones are trusted to certify hosts for what
domains. This is stored outside the normal saved session data, because
the whole point of host certificates is to avoid per-host faffing.
Configuring this set of trusted CAs is done via a new GUI dialog box,
separate from the main PuTTY config box (because it modifies a single
set of settings across all saved sessions), which you can launch by
clicking a button in the 'Host keys' pane. The GUI is pretty crude for
the moment, and very much at a 'just about usable' stage right now. It
will want some polishing.
If we have no CA configured that matches the hostname, we don't offer
to receive certified host keys in the first place. So for existing
users who haven't set any of this up yet, nothing will immediately
change.
Currently, if we do offer to receive certified host keys and the
server presents one signed by a CA we don't trust, PuTTY will bomb out
unconditionally with an error, instead of offering a confirmation box.
That's an unfinished part which I plan to fix before this goes into a
release.
This is triggered by a new config option, or alternatively a -cert
command-line option. You provide a certificate file (i.e. a public key
containing one of the cert key formats), and then, whenever you
authenticate with a private key that matches the public key inside
that certificate, the certificate will be sent to the server in place
of whatever public key it would have used before.
I expect this to be more convenient for some users than the approach
of baking the certificate into a modified version of the PPK file -
especially users who want to use different certificates on the same
key, either in sequence (if a CA continually reissues certificates
with short lifetimes) or in parallel (if different hosts trust
different CAs).
In particular, this substitution is applied consistently, even when
doing authentication via an agent. So if your bare private key is held
in Pageant, you can _still_ specify a detached certificate, and PuTTY
will spot that the key it's picked from Pageant matches that
certificate, and do the same substitution.
The detached certificate also overrides an existing certificate, if
there was one on the public key already.
This allows you to actually use an OpenSSH user certificate for
authentication, by combining the PPK you already had with the
certificate from the CA to produce a new PPK whose public half
contains the certificate.
I don't intend that this should be the only way to do it. It's
cumbersome to have to use the key passphrase in order to re-encrypt
the modified PPK. But the flip side is that once you've done it you
have everything you need in one convenient file, and also, the
certificate itself is covered by the PPK's tamperproofing (in case you
can think of any attacks in which the admin of your file server swaps
out just the certificate for a different one on the same key). So this
is *a* useful way to do it, even if not the only useful way.
The new options to add and remove certificates are supported by both
Windows GUI PuTTYgen and cmdgen. cmdgen can also operate on pure
public keys, so you can say 'puttygen --remove-certificate
foo-cert.pub' and get back the underlying foo.pub; Windows PuTTYgen
doesn't support that mode, but only because it doesn't in general have
any support for the loaded key not being a full public+private pair.
As far as I can tell, OpenSSH never stores private key files
containing a certified key. I plan to provide that as an option in
PuTTY (though not the only option); so if we get a certified key as
input to the export functions, we need to strip it back to the base
key to save it into either OpenSSH format.
Certificate keys don't work the same as normal keys, so the rest of
the code is going to have to pay attention to whether a key is a
certificate, and if so, treat it differently and do cert-specific
stuff to it. So here's a collection of methods for that purpose.
With one exception, these methods of ssh_key are not expected to be
implemented at all in non-certificate key types: they should only ever
be called once you already know you're dealing with a certificate. So
most of the new method pointers can be left out of the ssh_keyalg
initialisers.
The exception is the base_key method, which retrieves the base key of
a certificate - the underlying one with the certificate stripped off.
It's convenient for non-certificate keys to implement this too, and
just return a pointer to themselves. So I've added an implementation
in nullkey.c doing that. (The returned pointer doesn't transfer
ownership; you have to use the new ssh_key_clone() if you want to keep
the base key after freeing the certificate key.)
The methods _only_ implemented in certificates:
Query methods to return the public key of the CA (for looking up in a
list of trusted ones), and to return the key id string (which exists
to be written into log files).
Obviously, we need a check_cert() method which will verify the CA's
actual signature, not to mention checking all the other details like
the principal and the validity period.
And there's another fiddly method for dealing with the RSA upgrade
system, called 'related_alg'. This is quite like alternate_ssh_id, in
that its job is to upgrade one key algorithm to a related one with
more modern RSA signing flags (or any other similar thing that might
later reuse the same mechanism). But where alternate_ssh_id took the
actual signing flags as an argument, this takes a pointer to the
upgraded base algorithm. So it answers the question "What is to this
key algorithm as you are to its base?" - if you call it on
opensshcert_ssh_rsa and give it ssh_rsa_sha512, it'll give you back
opensshcert_ssh_rsa_sha512.
(It's awkward to have to have another of these fiddly methods, and in
the longer term I'd like to try to clean up their proliferation a bit.
But I even more dislike the alternative of just going through
all_keyalgs looking for a cert algorithm with, say, ssh_rsa_sha512 as
the base: that approach would work fine now but it would be a lurking
time bomb for when all the -cert-v02@ methods appear one day. This
way, each certificate type can upgrade itself to the appropriately
related version. And at least related_alg is only needed if you _are_
a certificate key type - it's not adding yet another piece of
null-method boilerplate to the rest.)
This commit is groundwork for full certificate support, but doesn't
complete the job by itself. It introduces the new key types, and adds
a test in cryptsuite ensuring they work as expected, but nothing else.
If you manually construct a PPK file for one of the new key types, so
that it has a certificate in the public key field, then this commit
enables PuTTY to present that key to a server for user authentication,
either directly or via Pageant storing and using it. But I haven't yet
provided any mechanism for making such a PPK, so by itself, this isn't
much use.
Also, these new key types are not yet included in the KEXINIT host
keys list, because if they were, they'd just be treated as normal host
keys, in that you'd be asked to manually confirm the SSH fingerprint
of the certificate. I'll enable them for host keys once I add the
missing pieces.
This is a simple tweak to the existing in-process SSH jump host
support, where instead of opening a direct-tcpip channel to the
destination host, we open a session channel and run a process in it to
make the connection to the destination.
So, where the existing jump host support replaced a local proxy
command along the lines of "plink %proxyhost -nc %host %port", this
one replaces "plink %proxyhost run-some-command".
Also added a corresponding option to use a subsystem to make the
connection. (Someone could configure an SSH server to support specific
subsystem names for particular destinations, or a general schema of
subsystem names that include the destination address in some standard
format.)
To avoid overflowing the already-full Proxy config panel with an extra
subtype selector, I've put these in as additional top-level proxy
types, so that instead of just PROXY_SSH we now have three
PROXY_SSH_foo.
This makes room to add more entries without the Proxy panel
overflowing. It also means we can put in a bit more explanation in
some of the more cryptic one-word names!
The low-level functions to handle a single atom of base64 at a time
have been in 'utils' / misc.h for ages, but the higher-level family of
base64_encode functions that handle a whole data block were hidden
away in sshpubk.c, and there was no higher-level decode function at
all.
Now moved both into 'utils' modules and declared them in misc.h rather
than ssh.h. Also, improved the APIs: they all take ptrlen in place of
separate data and length arguments, their naming is more consistent
and more explicit (the previous base64_encode which didn't name its
destination is now base64_encode_fp), and the encode functions now
accept cpl == 0 as a special case meaning that the output base64 data
is wanted in the form of an unbroken single-line string with no
trailing \n.
I'm about to want to create a second entirely different dialog box
whose contents are described using the same dialog.h API as the main
config box. So I'm starting by moving as much handler code as possible
out of GenericMainDlgProc and its callers, and into a set of reusable
subroutines.
In particular, this gets rid of the disgusting static variables that
stored all the config-box state. Now they're stored in a more sensible
struct, which lives in the new context-pointer field provided by the
reworked ShinyDialogBox.
It's self-contained enough not to really need to live in dialog.c as a
set of static functions. Also, moving it means we can isolate the
implementation details - which also makes it easy to change them.
One such change is that I've added the ability to bake a context
pointer into the dialog - unused so far, but it will be shortly.
(Also, while I'm here, renamed the functions so they sound more as if
they're adding features than working around bugs - not to mention not
imputing mental illness to the usual versions.)
This makes a second independent copy of an existing ssh_key, for
situations where one piece of code is going to want to keep it after
its current owner frees it.
In order to have it work on an arbitrary ssh_key, whether public-only
or a full public+private key pair, I've had to add an ssh_key query
method to ask whether a private key is known. I'm surprised I haven't
found a need for that before! But I suppose in most situations in an
SSH client you statically know which kind of key you're dealing with.
In this commit, I provide further functions which generate the
existing set of data types:
- key_components_add_text_pl() adds a text component, but takes a
ptrlen rather than a const char *, in case that was what you
happened to have already.
- key_components_add_uint() ends up adding an mp_int to the
structure, but takes it as input in the form of an ordinary C
integer, for the convenience of call sites which will want to do
that a lot and don't enjoy repeating the mp_int construction
boilerplate
- key_components_add_copy() takes a pointer to one of the
key_component sub-structs in an existing key_components, and copies
it into the output key_components under a new name, handling
whatever type it turns out to have.
This stores its data in the same format as the existing KCT_TEXT, but
it displays differently in puttygen --dump, expecting that the data
will be full of horrible control characters, invalid UTF-8, etc.
The displayed data is of the form b64("..."), so you get a hint about
what the encoding is, and can still paste into Python by defining the
identifier 'b64' to be base64.b64decode or equivalent.
Having recently pulled it out into its own file, I think it could also
do with a bit of tidying. In this rework:
- the substructure for a single component now has a globally visible
struct tag, so you can make a variable pointing at it, saving
verbiage in every piece of code looping over a key_components
- the 'is_mp_int' flag has been replaced with a type enum, so that
more types can be added without further upheaval
- the printing loop in cmdgen.c for puttygen --dump has factored out
the initial 'name=' prefix on each line so that it isn't repeated
per component type
- the storage format for text components is now a strbuf rather than
a plain char *, which I think is generally more useful.
Previously, the fact that "ssh-rsa" sometimes comes with two subtypes
"rsa-sha2-256" and "rsa-sha2-512" was known to three different parts
of the code - two in userauth and one in transport. Now the knowledge
of what those ids are, which one goes with which signing flags, and
which key types have subtypes at all, is centralised into a method of
the key algorithm, and all those locations just query it.
This will enable the introduction of further key algorithms that have
a parallel upgrade system.
It's a class method rather than an object method, so it doesn't allow
keys with the same algorithm to make different choices about what
flags they support. But that's not what I wanted it for: the real
purpose is to allow one key algorithm to delegate supported_flags to
another, by having its method implementation call the one from the
delegate class.
(If only C's compile/link model permitted me to initialise a field of
one global const struct variable to be a copy of that of another, I
wouldn't need the runtime overhead of this method! But object file
formats don't let you even specify that.)
Most key algorithms support no flags at all, so they all want to use
the same implementation of this method. So I've started a file of
stubs utils/nullkey.c to contain the common stub version.
All the fiddly business where you have to check that a thing exists,
make sure of its type, find its size, allocate some memory, and then
read it again properly (or, alternatively, loop round dealing with
ERROR_MORE_DATA) just doesn't belong at every call site. It's crying
out to be moved out into some separate utility functions that present
a more ergonomic API, so that the code that decides _which_ Registry
entries to read and what to do with them can concentrate on that.
So I've written a fresh set of registry API wrappers in windows/utils,
and simplified windows/storage.c as a result. The jump-list handling
code in particular is almost legible now!
The function will accept a public key file or a PPK, but if it fails
to parse as any of those, the error message says "not a PuTTY SSH-2
private key", which is particularly incongruous in situations where
you're specifically _not_ after the private half of the key.
Now says "not a public key or a PuTTY SSH-2 private key".
Every time I've had to do this before, I've always done the three-line
dance of initialising a BinarySource and calling get_string on it.
It's long past time I wrapped that up into a convenient subroutine.
If you already have a string (of potentially-binary data) in the form
of a ptrlen reference to somewhere else, and you want to keep a copy
somewhere, it's useful to copy it into a strbuf. But it takes a couple
of lines of faff to do that, and it's nicer to wrap that up into a
tiny helper function.
This commit adds that helper function strbuf_dup, and its non-movable
sibling strbuf_dup_nm for secret data. Also, gone through the existing
code and found a bunch of cases where this makes things less verbose.
These days, the base64 module has 'b64decode', which can tolerate a
str or a bytes as input. Switched to using that, and also, imported it
under a nice short name 'b64'.
In the process, removed the obsolete equivocation between
base64.decodebytes and base64.decodestring. That was there to cope
with Python 2 - but the assert statement right next to it has been
enforcing P3 since commit 2ec2b796ed two years ago!
When a subsidiary part of the SSH system wants to abort the whole
connection, it's supposed to call ssh_sw_abort_deferred, on pain of
free-order confusion. Elsewhere in mainchan.c I was getting this
right, but I missed a couple.
We call setlocale() at the start of the function to get the current
LC_CTYPE locale, then set it to what we need during the function, and
then call setlocale() at the end to put it back again. But the middle
call is allowed to invalidate the pointer returned from the first, so
we have to save it in our own allocated storage until the end of the
function.
This bit me during development just now, and I was surprised that it
hadn't come up before! But I suppose this is one of those things
that's only _allowed_ to fail, and need not in all circumstances -
perhaps it depends on what your LC_CTYPE was set to before.
The list of kex methods recently ran out of space due to the addition
of NTRU (at least, if you have GSSAPI enabled). It's time to stop
having an arbitrary limit on those arrays and switch to doing it
properly.