This consists of DJB's 'Streamlined NTRU Prime' quantum-resistant
cryptosystem, currently in round 3 of the NIST post-quantum
cryptography standardisation process; it's run in parallel with
ordinary Curve25519, and generates a shared secret combining the
output of both systems.
(Hence, even if you don't trust this newfangled NTRU Prime thing at
all, it's at least no _less_ secure than the kex you were using
already.)
As the OpenSSH developers point out, key exchange is the most urgent
thing to make quantum-resistant, even before working quantum computers
big enough to break crypto become available, because a break of the
kex algorithm can be applied retroactively to recordings of your past
sessions. By contrast, authentication is a real-time protocol, and can
only be broken by a quantum computer if there's one available to
attack you _already_.
I've implemented both sides of the mechanism, so that PuTTY and Uppity
both support it. In my initial testing, the two sides can both
interoperate with the appropriate half of OpenSSH, and also (of
course, but it would be embarrassing to mess it up) with each other.
This is already slightly nice because it lets me separate the
Weierstrass and Montgomery code more completely, without having to
have a vtable tucked into dh->extra. But more to the point, it will
allow completely different kex methods to fit into the same framework
later.
To that end, I've moved more of the descriptive message generation
into the vtable, and also provided the constructor with a flag that
will let it do different things in client and server.
Also, following on from a previous commit, I've arranged that the new
API returns arbitrary binary data for the exchange hash, rather than
an mp_int. An upcoming implementation of this interface will want to
return an encoded string instead of an encoded mp_int.
Correcting a source file name in the docs just now reminded me that
I've seen a lot of outdated source file names elsewhere in the code,
due to all the reorganisation since we moved to cmake. Here's a giant
pass of trying to make them all accurate again.
All the seat functions that present the user with an interactive
prompt of some kind - both the main seat_get_userpass_input and the
various confirmation dialogs for things like host keys - were using a
simple int return value, with the general semantics of 0 = "fail",
1 = "proceed" (and in the case of seat_get_userpass_input, answers to
the prompts were provided), and -1 = "request in progress, wait for a
callback".
In this commit I change all those functions' return types to a new
struct called SeatPromptResult, whose primary field is an enum
replacing those simple integer values.
The main purpose is that the enum has not three but _four_ values: the
"fail" result has been split into 'user abort' and 'software abort'.
The distinction is that a user abort occurs as a result of an
interactive UI action, such as the user clicking 'cancel' in a dialog
box or hitting ^D or ^C at a terminal password prompt - and therefore,
there's no need to display an error message telling the user that the
interactive operation has failed, because the user already knows,
because they _did_ it. 'Software abort' is from any other cause, where
PuTTY is the first to know there was a problem, and has to tell the
user.
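In outline, the new type looks something like this (names here are
illustrative rather than the exact identifiers in putty.h):

    typedef enum SeatPromptResultKind {
        /* Answers to the prompts were provided; proceed. */
        SPRK_OK,
        /* Request still in progress; wait for a callback. */
        SPRK_INCOMPLETE,
        /* The user cancelled interactively, so no error message is
         * needed - the user already knows. */
        SPRK_USER_ABORT,
        /* PuTTY was first to know about the failure, so it must
         * provide an error message to show to the user. */
        SPRK_SW_ABORT,
    } SeatPromptResultKind;

    typedef struct SeatPromptResult {
        SeatPromptResultKind kind;
        /* plus error-message machinery for SPRK_SW_ABORT, discussed
         * in the implementation note below */
    } SeatPromptResult;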
We already had this 'user abort' vs 'software abort' distinction in
other parts of the code - the SSH backend has separate termination
functions which protocol layers can call. But we assumed that any
failure from an interactive prompt request fell into the 'user abort'
category, which is not true. A couple of examples: if you configure a
host key fingerprint in your saved session via the SSH > Host keys
pane, and the server presents a host key that doesn't match it, then
verify_ssh_host_key would report that the user had aborted the
connection, and feel no need to tell the user what had gone wrong!
Similarly, if a password provided on the command line was not
accepted, then (after I fixed the semantics of that in the previous
commit) the same wrong handling would occur.
So now, those Seat prompt functions too can communicate whether the
user or the software originated a connection abort. And in the latter
case, we also provide an error message to present to the user. Result:
in those two example cases (and others), error messages should no
longer go missing.
Implementation note: to avoid the hassle of having the error message
in a SeatPromptResult being a dynamically allocated string (and hence,
every recipient of one must always check whether it's non-NULL and
free it on every exit path, plus being careful about copying the
struct around), I've instead arranged that the structure contains a
function pointer and a couple of parameters, so that the string form
of the message can be constructed on demand. That way, the only users
who need to free it are the ones who actually _asked_ for it in the
first place, which is a much smaller set.
(This is one of the rare occasions that I regret not having C++'s
extra features available in this code base - a unique_ptr or
shared_ptr to a string would have been just the thing here, and the
compiler would have done all the hard work for me of remembering where
to insert the frees!)
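Here's a cut-down, self-contained sketch of the idea, with names
invented for illustration - the real version formats into one of this
code base's string sinks, but the shape is the same:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct PromptResult PromptResult;
    struct PromptResult {
        int kind;
        /* Instead of a ready-made dynamically allocated string, store
         * a formatting function plus a couple of plain parameters: */
        void (*errfn)(const PromptResult *pr, char *buf, size_t len);
        const char *err_lit;   /* parameter: e.g. a string literal */
        int err_code;          /* parameter: e.g. an error number */
    };

    static void errfn_literal(const PromptResult *pr, char *buf,
                              size_t len)
    {
        snprintf(buf, len, "%s", pr->err_lit);
    }

    static PromptResult prompt_sw_abort(const char *msg)
    {
        PromptResult pr = { .kind = 2 /* software abort */,
                            .errfn = errfn_literal, .err_lit = msg };
        return pr;
    }

    /* Only the callers who actually ask for the text pay for an
     * allocation, and only they have to remember to free it. */
    static char *prompt_result_message(const PromptResult *pr)
    {
        char buf[256], *copy;
        pr->errfn(pr, buf, sizeof(buf));
        copy = malloc(strlen(buf) + 1);
        if (copy) strcpy(copy, buf);
        return copy;
    }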
I happened to notice in passing that this function doesn't have any
tests (although it will have been at least somewhat tested by the
cmdgen interop test system).
This involved writing a wrapper that passes the passphrase and salt as
ptrlens, and I decided it made more sense to make the same change to
the original function too and adjust the call sites appropriately.
I derived a test case by getting OpenSSH itself to make an encrypted
key file, and then using the inputs and output from the password hash
operation that decrypted it again.
I recently encountered a paper [1] which catalogues all kinds of
things that can go wrong when one party in a discrete-log system
invents a prime and the other party chooses an exponent. In
particular, some choices of prime make it reasonable to use a short
exponent to save time, but others make that strategy very bad.
That paper is about the ElGamal encryption scheme used in OpenPGP,
which is basically integer Diffie-Hellman with one side's key being
persistent: a shared-secret integer is derived exactly as in DH, and
then it's used to communicate a message integer by simply multiplying
the shared secret by the message, mod p.
I don't _know_ that any problem of this kind arises in the SSH usage
of Diffie-Hellman: the standard integer DH groups in SSH are safe
primes, and as far as I know, the usual generation of prime moduli for
DH group exchange also picks safe primes. So the short exponents PuTTY
has been using _should_ be OK.
However, the range of imaginative other possibilities shown in that
paper makes me nervous, even so! So I think I'm going to retire the
short exponent strategy, on general principles of overcaution.
This slows down 4096-bit integer DH by about a factor of 3-4 (which
would be worse if it weren't for the modpow speedup in the previous
commit). I think that's OK, because, firstly, computers are a lot
faster these days than when I originally chose to use short exponents,
and secondly, more and more implementations are now switching to
elliptic-curve DH, which is unaffected by this change (and with which
we've always been using maximum-length exponents).
[1] On the (in)security of ElGamal in OpenPGP. Luca De Feo, Bertram
Poettering, Alessandro Sorniotti. https://eprint.iacr.org/2021/923
This slightly simplifies the lookup function get_dh_group(), but
mostly, the point is to make it more similar to the other lookup
functions, because I'm planning to have those autogenerated.
All this Interactor business has been gradually working towards being
able to inform the user _which_ network connection is currently
presenting them with a password prompt (or whatever), in situations
where more than one of them might be, such as an SSH connection being
used as a proxy for another SSH connection when neither one has
one-touch login configured.
At some point, we have to arrange that any attempt to do a user
interaction during connection setup - be it a password prompt, a host
key confirmation dialog, or just displaying an SSH login banner -
makes it clear which host it's come from. That's going to mean calling
some kind of announcement function before doing any of those things.
But there are several of those functions in the Seat API, and calls to
them are scattered far and wide across the SSH backend. (And not even
just there - the Rlogin backend also uses seat_get_userpass_input).
How can we possibly make sure we don't forget a vital call site on
some obscure little-tested code path, and leave the user confused in
just that one case which nobody might notice for years?
Today I thought of a trick to solve that problem. We can use the C
type system to enforce it for us!
The plan is: we invent a new struct type which contains nothing but a
'Seat *'. Then, for every Seat method which does a thing that ought to
be clearly identified as relating to a particular Interactor, we
adjust the API for that function to take the new struct type where it
previously took a plain 'Seat *'. Or rather - doing less violence to
the existing code - we only need to adjust the API of the dispatch
functions inline in putty.h.
How does that help? Because the way you _get_ one of these
struct-wrapped Seat pointers is by calling interactor_announce() on
your Interactor, which will in turn call interactor_get_seat(), and
wrap the returned pointer into one of these structs.
The effect is that whenever the SSH (or Rlogin) code wants to call one
of those particular Seat methods, it _has_ to call
interactor_announce() just beforehand, which (once I finish all of
this) will make sure the user is aware of who is presenting the prompt
or banner or whatever. And you can't forget to call it, because if you
don't call it, then you just don't have a struct of the right type to
give to the Seat method you wanted to call!
(Of course, there's nothing stopping code from _deliberately_ taking a
Seat * it already has and wrapping it into the new struct. In fact
SshProxy has to do that, in order to forward these requests up the
chain of Seats. But the point is that you can't do it _by accident_,
just by forgetting to make a vital function call - when you do that,
you _know_ you're doing it on purpose.)
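Concretely, the trick has roughly this shape (a sketch with plausible
names, not a copy of the real declarations):

    /* The wrapper struct: contains nothing but a Seat pointer. */
    typedef struct InteractionReadySeat {
        Seat *seat;
    } InteractionReadySeat;

    /* The only intended way to obtain one: */
    InteractionReadySeat interactor_announce(Interactor *itr)
    {
        InteractionReadySeat iseat;
        iseat.seat = interactor_get_seat(itr);
        /* ... and later, this is where we'll tell the user which
         * connection is about to interact with them ... */
        return iseat;
    }

    /* Dispatch functions for the interaction-related Seat methods
     * now take the wrapper instead of a bare Seat *: */
    static inline SeatPromptResult seat_get_userpass_input(
        InteractionReadySeat iseat, prompts_t *p)
    {
        return iseat.seat->vt->get_userpass_input(iseat.seat, p);
    }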
No functional change: the new interactor_announce() function exists,
and the type-system trick ensures it's called in all the right places,
but it doesn't actually _do_ anything yet.
Previously, checking the host key against the persistent cache managed
by the storage.h API was done as part of the seat_verify_ssh_host_key
method, i.e. separately by each Seat.
Now that check is done by verify_ssh_host_key(), which is a new
function in ssh/common.c that centralises all the parts of host key
checking that don't need an interactive prompt. It subsumes the
previous verify_ssh_manual_host_key() that checked against the Conf,
and it does the check against the storage API that each Seat was
previously doing separately. If it can't confirm or definitively
reject the host key by itself, _then_ it calls out to the Seat, once
an interactive prompt is definitely needed.
The main point of doing this is so that when SshProxy forwards a Seat
call from the proxy SSH connection to the primary Seat, it won't print
an announcement of which connection is involved unless it's actually
going to do something interactive. (Not that we're printing those
announcements _yet_ anyway, but this is a piece of groundwork that
works towards doing so.)
But while I'm at it, I've also taken the opportunity to clean things
up a bit by renaming functions sensibly. Previously we had three very
similarly named functions verify_ssh_manual_host_key(), SeatVtable's
'verify_ssh_host_key' method, and verify_host_key() in storage.h. Now
the Seat method is called 'confirm' rather than 'verify' (since its
job is now always to print an interactive prompt, so it looks more
like the other confirm_foo methods), and the storage.h function is
called check_stored_host_key(), which goes better with store_host_key
and avoids having too many functions with similar names. And the
'manual' function is subsumed into the new centralised code, so
there's now just *one* host key function with 'verify' in the name.
Several functions are reindented in this commit. Best viewed with
whitespace changes ignored.
Now that the SSH backend's user_input bufchain is no longer needed for
handling userpass input, it doesn't have to be awkwardly shared
between all the packet protocol layers any more. So we can turn the
want_user_input and got_user_input methods of PacketProtocolLayer into
methods of ConnectionLayer, and then only the two connection layers
have to bother implementing them, or store a pointer to the bufchain
they read from.
I've introduced a function ldisc_notify_sendok(), which backends
should call on their ldisc (if they have one) when anything changes
that might cause backend_sendok() to start returning true.
At the moment, the function does nothing. But in future, I'm going to
make ldisc start buffering typed-ahead input data not yet sent to the
backend, and then the effect of this function will be to trigger
flushing all that data into the backend.
Backends only have to call this function if sendok was previously
false: backends requiring no network connection stage (like pty and
serial) can safely return true from sendok, and in that case, they
don't also have to immediately call this function.
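So the expected calling pattern in a backend is roughly this (backend
type invented for illustration):

    typedef struct ExampleBackend {
        Ldisc *ldisc;    /* may be NULL */
        bool sendok;
    } ExampleBackend;

    /* At the point where the backend's connection setup completes
     * and backend_sendok() would start returning true: */
    static void example_connection_ready(ExampleBackend *be)
    {
        bool was_ok = be->sendok;
        be->sendok = true;
        if (!was_ok && be->ldisc)
            ldisc_notify_sendok(be->ldisc);
    }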
This is used to notify the Seat that some data has been cleared from
the backend's outgoing data buffer. In other words, it notifies the
Seat that it might be worth calling backend_sendbuffer() again.
We've never needed this before, because until now, Seats have always
been the 'main program' part of the application, meaning they were
also in control of the event loop. So they've been able to call
backend_sendbuffer() proactively, every time they go round the event
loop, instead of having to wait for a callback.
But now, the SSH proxy is the first example of a Seat without
privileged access to the event loop, so it has no way to find out that
the backend's sendbuffer has got smaller. And without that, it can't
pass that notification on to plug_sent, to unblock in turn whatever
the proxied connection might have been waiting to send.
In fact, before this commit, sshproxy.c never called plug_sent at all.
As a result, large data uploads over an SSH jump host would hang
forever as soon as the outgoing buffer filled up for the first time:
the main backend (to which sshproxy.c was acting as a Socket) would
carefully stop filling up the buffer, and then never receive the call
to plug_sent that would cause it to start again.
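In outline, the fix looks something like this (names approximate):

    /* sshproxy.c's implementation of the new Seat method: when the
     * proxy SSH backend reports that its outgoing buffer has shrunk,
     * pass the notification straight on to the Plug, exactly as a
     * real Socket's backlog mechanism would. */
    static void sshproxy_sent(Seat *seat, size_t new_sendbuffer)
    {
        SshProxy *sp = container_of(seat, SshProxy, seat);
        plug_sent(sp->plug, new_sendbuffer);
    }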
The new callback is ignored everywhere except in sshproxy.c. It might
be a good idea to remove backend_sendbuffer() entirely and convert all
previous uses of it into non-empty implementations of this callback,
so that we've only got one system; but for the moment, I haven't done
that.
This code base has always been a bit confused about which spelling it
likes to use to refer to that signature algorithm. The SSH protocol id
is "ssh-dss". But everyone I know refers to it as the Digital
Signature _Algorithm_, not the Digital Signature _Standard_.
When I moved everything down into the crypto subdir, I took the
opportunity to rename sshdss.c to dsa.c. Now I'm doing the rest of the
job: all internal identifiers and code comments refer to DSA, and the
spelling "dss" only survives in externally visible identifiers that
have to remain constant.
(Such identifiers include the SSH protocol id, and also the string id
used to identify the key type in PuTTY's own host key cache. We can't
change the latter without causing everyone a backwards-compatibility
headache, and if we _did_ ever decide to do that, we'd surely want to
do a much more thorough job of making the cache format more sensible!)
This clears up another large pile of clutter at the top level, and in
the process, allows me to rename source files to things that don't
all have that annoying 'ssh' prefix at the front.
This applies to all of AES, SHA-1, SHA-256 and SHA-512. All those
source files previously contained multiple implementations of the
algorithm, enabled or disabled by ifdefs detecting whether they would
work on a given compiler. And in order to get advanced machine
instructions like AES-NI or NEON crypto into the output file when the
compile flags hadn't enabled them, we had to do nasty stuff with
compiler-specific pragmas or attributes.
Now we can do the detection at cmake time, and enable advanced
instructions in the more sensible way, by compile-time flags. So I've
broken up each of these modules into lots of sub-pieces: a file called
(e.g.) 'foo-common.c' containing common definitions across all
implementations (such as round constants), one called 'foo-select.c'
containing the top-level vtable(s), and a separate file for each
implementation exporting just the vtable(s) for that implementation.
One advantage of this is that it depends a lot less on compiler-
specific bodgery. My particular least favourite part of the previous
setup was the part where I had to _manually_ define some Arm ACLE
feature macros before including <arm_neon.h>, so that it would define
the intrinsics I wanted. Now that I'm enabling interesting
architecture features in the normal way, on the compiler command
line, there's no need for that kind of trick: the right feature
macros are already defined and <arm_neon.h> does the right thing.
Another change in this reorganisation is that I've stopped assuming
there's just one hardware implementation per platform. Previously, the
accelerated vtables were called things like sha256_hw, and varied
between FOO-NI and NEON depending on platform; and the selection code
would simply ask 'is hw available? if so, use hw, else sw'. Now, each
HW acceleration strategy names its vtable its own way, and the
selection vtable has a whole list of possibilities to iterate over
looking for a supported one. So if someone feels like writing a second
accelerated implementation of something for a given platform - for
example, I've heard you can use plain NEON to speed up AES somewhat
even without the crypto extension - then it will now have somewhere to
drop in alongside the existing ones.
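So each foo-select.c now has roughly this shape (names illustrative):

    /* List the implementations in preference order: most specialised
     * first, pure software last. */
    static const ssh_hashalg *const sha256_impls[] = {
        &ssh_sha256_ni,      /* x86 SHA-NI instructions */
        &ssh_sha256_neon,    /* Arm NEON crypto extension */
        &ssh_sha256_sw,      /* portable software; always works */
    };

    static ssh_hash *sha256_select(const ssh_hashalg *alg)
    {
        for (size_t i = 0; i < lenof(sha256_impls); i++) {
            const ssh_hashalg *impl = sha256_impls[i];
            if (check_availability(impl))   /* run-time CPU check */
                return ssh_hash_new(impl);
        }
        return NULL;  /* unreachable: sw impl is always available */
    }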
This is a module that I'd noticed in the past was too monolithic.
There's a big pile of stub functions in uxpgnt.c that only have to be
there because the implementation of true X11 _forwarding_ (i.e.
actually managing a channel within an SSH connection), which Pageant
doesn't need, was in the same module as more general X11-related
utility functions which Pageant does need.
So I've broken up this awkward monolith. Now x11fwd.c contains only
the code that really does all go together for dealing with SSH X
forwarding: the management of an X forwarding channel (including the
vtables to make it behave as Channel at the SSH end and a Plug at the
end that connects to the local X server), and the management of
authorisation for those channels, including maintaining a tree234 of
possible auth values and verifying the one we received.
Most of the functions removed from this file have moved into the utils
subdir, and also into the utils library (i.e. further down the link
order), because they were basically just string and data processing.
One exception is x11_setup_display, which parses a display string and
returns a struct telling you everything about how to connect to it.
That talks to the networking code (it does name lookups and makes a
SockAddr), so it has to live in the network library rather than utils,
and therefore it's not in the utils subdirectory either.
The other exception is x11_get_screen_number, which it turned out
nothing called at all! Apparently the job it used to do is now done as
part of x11_setup_display. So I've just removed it completely.
Finally! Now that all the previous commits have put the
infrastructure in place to fall back to the old fingerprint if you
need to, we can switch to the new format without a total
compatibility break.
verify_ssh_manual_host_key() now takes an array of all key
fingerprints instead of just the default type, which means that an
expected key fingerprint stored in the session configuration can now
be matched against any of them.
ssh2_all_fingerprints() and friends will return a small 'char **'
array, containing all the fingerprints of a key that we know how to
generate, indexed by the FingerprintType enum. The result requires
complex freeing, so there's an ssh2_free_all_fingerprints as well.
For SSH-1 RSA keys, we refuse to generate any fingerprint except the
old SSH-1 MD5 version, because there's no other fingerprint type I
know of that anyone else uses. So I've got a function that returns the
same 'char **' for an SSH-1 key, but it only fills in the MD5 slot,
and leaves the rest NULL.
As a result, I also need a dynamic function that takes a fingerprint
list and returns the id of the most preferred fingerprint type in it
_that actually exists_.
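Usage looks roughly like this (enum values are the obvious two, and
the picker function's name is approximate):

    char **fps = ssh2_all_fingerprints(key);
    for (unsigned i = 0; i < SSH_N_FPTYPES; i++)
        if (fps[i])
            printf("%s\n", fps[i]);   /* e.g. both MD5 and SHA-256 */

    /* 'Most preferred fingerprint type that actually exists': */
    FingerprintType best =
        ssh2_pick_fingerprint(fps, SSH_FPTYPE_SHA256);

    ssh2_free_all_fingerprints(fps);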
NFC: this API is introduced, but not yet used.
There's a new enumeration of fingerprint types, and you tell
ssh2_fingerprint() or ssh2_fingerprint_blob() which of them to use.
So far, this is only implemented behind the scenes, and exposed for
testcrypt to test. All the call sites of ssh2_fingerprint pass a fixed
default fptype, which is still set to the old MD5. That will change
shortly.
This commit adds to ppk_save_sb the capability in principle to choose
which format version to write, by adding a fmt_version field in the
save parameters structure. As yet it's not connected up to any user
interface in PuTTYgen, but I think I'll need to, because currently
there's no way at all to convert PPK v3 back to v2, and surely people
will need to interoperate with older installations of PuTTY, or with
other PPK-consuming software.
Thanks to Pavel and his CI for pointing out what I'd forgotten: the
automated test of cmdgen.c expects that round-tripping a PPK file to
some other format and back will regenerate the identical file. Of
course, with a randomised salt in the new-look password hash, that
isn't true any more in normal usage.
Fixed by adding an option in the existing parameters structure to
provide a salt override. That shouldn't be used anywhere except
cgtest, but in cgtest, it restores the determinism we need.
Another potential (but not guaranteed) source of difference is the
automatic time-scaling of the Argon2 parameter choice. So I've turned
that off too, while I'm at it.
This removes both uses of SHA-1 in the file format: it was used as
the MAC protecting the key file against tampering, and also used in
the key derivation step that converted the user's passphrase to
cipher and MAC keys.
The MAC is simply upgraded from HMAC-SHA-1 to HMAC-SHA-256; it is
otherwise unchanged in how it's applied (in particular, to what data).
The key derivation is totally reworked, to be based on Argon2, which
I've just added to the code base. This should make stolen encrypted
key files more resistant to brute-force attack.
Argon2 has assorted configurable parameters for memory and CPU usage;
the new key format includes all those parameters. So there's no reason
we can't have them under user control, if a user wants to be
particularly vigorous or particularly lightweight with their own key
files. They could even switch to one of the other flavours of Argon2,
if they thought side channels were an especially large or small risk
in their particular environment. In this commit I haven't added any UI
for controlling that kind of thing, but the PPK loading function is
all set up to cope, so that can all be added in a future commit
without having to change the file format.
While I'm at it, I've also switched the CBC encryption to using a
random IV (or rather, one derived from the passphrase along with the
cipher and MAC keys). That's more like normal SSH-2 practice.
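So decrypting a v3 key file derives everything from one Argon2 call,
along these lines (the function signature is simplified, and the
lengths are what you'd expect for AES-256-CBC plus HMAC-SHA-256, not
a normative statement of the format):

    /* Schematic: one KDF call produces enough output bytes for all
     * three quantities at once. */
    unsigned char kdf_out[32 + 16 + 32];
    argon2(flavour, mem_kib, passes, parallelism,
           passphrase, salt, kdf_out, sizeof(kdf_out));

    const unsigned char *cipher_key = kdf_out;       /* 32 bytes */
    const unsigned char *cbc_iv     = kdf_out + 32;  /* 16 bytes */
    const unsigned char *mac_key    = kdf_out + 48;  /* 32 bytes */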
This is going to be used in the new version of the PPK file format.
It was the winner of the Password Hashing Competition, which I think
makes it a reasonable choice.
Argon2 comes in three flavours: one with no data dependency in its
memory addressing, one with _deliberate_ data dependency (intended to
serialise computation, to hinder parallel brute-forcing), and a hybrid
form that starts off data-independent and then switches over to the
dependent version once the sensitive input data has been adequately
mixed around. I test all three in the test suite; the side-channel
tester can only expect Argon2i to pass; and, following the spec's
recommendation, I'll be using Argon2id for the actual key file
encryption.
No functional change: currently, the IV passed in is always zero
(except in the test suite). But this prepares to change that in a
future revision of the key file format.
The NEON support for SHA-512 acceleration looks very like SHA-256,
with a pair of chained instructions to generate a 128-bit vector
register full of message schedule, and another pair to update the hash
state based on those. But since SHA-512 is twice as big in all
dimensions, those four instructions between them only account for two
rounds of it, in place of four rounds of SHA-256.
Also, it's a tighter squeeze to fit all the data needed by those
instructions into their limited number of register operands. The NEON
SHA-256 implementation was able to keep its hash state and message
schedule stored as 128-bit vectors and then pass combinations of those
vectors directly to the instructions that did the work; for SHA-512,
in several places you have to make one of the input operands to the
main instruction by combining two halves of different vectors from
your existing state. But that operation is a quick single EXT
instruction, so no trouble.
The only other problem I've found is that clang - in particular the
version on M1 macOS, but as far as I can tell, even on current trunk -
doesn't seem to implement the NEON intrinsics for the SHA-512
extension. So I had to bodge my own versions with inline assembler in
order to get my implementation to compile under clang. Hopefully at
some point in the future the gap might be filled and I can relegate
that to a backwards-compatibility hack!
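Each bodged intrinsic is a shim of roughly this form - shown here for
one of the message-schedule instructions, with the exact operand
syntax to be treated as illustrative:

    #include <arm_neon.h>

    /* Substitute for the missing vsha512su0q_u64 intrinsic: emit the
     * SHA512SU0 instruction directly via inline assembler. */
    static inline uint64x2_t my_vsha512su0q_u64(uint64x2_t x,
                                                uint64x2_t y)
    {
        __asm__("sha512su0 %0.2D, %1.2D" : "+w" (x) : "w" (y));
        return x;
    }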
This commit adds the same kind of switching mechanism for SHA-512 that
we already had for SHA-256, SHA-1 and AES, and as with all of those,
plumbs it through to testcrypt so that you can explicitly ask for the
hardware or software version of SHA-512. So the test suite can run the
standard test vectors against both implementations in turn.
On M1 macOS, I'm testing at run time for the presence of SHA-512 by
checking a sysctl setting. You can perform the same test on the
command line by running "sysctl hw.optional.armv8_2_sha512".
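In code, that run-time check is essentially this (a minimal
standalone version):

    #include <stdbool.h>
    #include <stddef.h>
    #include <sys/sysctl.h>

    static bool sha512_hw_available(void)
    {
        int value = 0;
        size_t size = sizeof(value);
        if (sysctlbyname("hw.optional.armv8_2_sha512",
                         &value, &size, NULL, 0) == 0)
            return value != 0;
        return false;   /* sysctl missing => assume no support */
    }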
As far as I can tell, on Windows there is not yet any flag to test for
this CPU feature, so for the moment, the new accelerated SHA-512 is
turned off unconditionally on Windows.
This builds on the previous refactoring by reworking the SHA-512
vtables and block layer to look more like the SHA-256 version, in
which the block and padding structure is a subroutine of the top-level
vtable methods instead of an owning layer around them.
This also organises the code in a way that makes it easy to drop in
hardware-accelerated versions alongside it: the block layer and the
big arrays of constants are now nicely separate from the inner
block-transform part.
Previously, the instant at which we send to the server a request to
enable agent forwarding (the "auth-agent-req@openssh.com" channel
request, or SSH1_CMSG_AGENT_REQUEST_FORWARDING) was also the instant
at which we set a flag indicating that we're prepared to accept
attempts from the server to open a channel to talk to the forwarded
agent. If the server attempts that when we haven't sent a forwarding
request, we treat it with suspicion, and reject it.
But it turns out that at least one SSH server does this, for what
seems to be a _somewhat_ sensible purpose, and OpenSSH accepts it. So,
on the basis that the @openssh.com domain suffix makes them the
arbiters of this part of the spec, I'm following their practice. I've
removed the 'agent_fwd_enabled' flag from both connection layer
implementations, together with the ConnectionLayer method that sets
it; now agent-forwarding CHANNEL_OPENs are gated only on the questions
of whether agent forwarding was permitted in the configuration and
whether an agent actually exists to talk to, and not also whether we
had previously sent a message to the server announcing it.
(The change to this condition is also applied in the SSH-1 agent
forwarding code, mostly for the sake of keeping things parallel where
possible. I think it doesn't actually make a difference in SSH-1,
because in SSH-1, it's not _possible_ for the server to try to open an
agent channel before the main channel is set up, due to the entirely
separate setup phase of the protocol.)
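In condition form, the gate on an incoming agent channel is now
roughly this (a sketch; the flag variable is the one just deleted):

    /* Accept an incoming "auth-agent@openssh.com" CHANNEL_OPEN iff: */
    bool allow_agent_channel =
        conf_get_bool(conf, CONF_agentfwd)  /* permitted in config? */
        && agent_exists();                  /* is there an agent? */
    /* ...and no longer: && s->agent_fwd_enabled */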
The use case is a proxy host which makes a secondary SSH connection
to a real destination host. A user has recently run into one of
these, which announces a version banner of "SSH-2.0-FudoSSH" and
relies on agent forwarding to authenticate the secondary connection.
You connect
to the proxy host and authenticate with a username string of the form
"realusername#real.destination.host", and then, at the start of the
connection protocol, the server immediately opens a channel back to
your SSH agent which it uses to authenticate to the destination host.
And it delays answering any CHANNEL_OPEN requests from the client
until that's all done. For example (seen from the client's POV,
although the server's CHANNEL_OPEN may well have been _sent_ up front
rather than in response to the client's):
  client: SSH2_MSG_CHANNEL_OPEN "session"
  server: SSH2_MSG_CHANNEL_OPEN "auth-agent@openssh.com"
  client: SSH2_MSG_CHANNEL_OPEN_CONFIRMATION to the auth-agent request
    <- data is exchanged on the agent channel; the proxy host uses
       the resulting signature to log in to the destination host ->
  server: SSH2_MSG_CHANNEL_OPEN_CONFIRMATION to the session request
With PuTTY, this wasn't working, because at the point when the server
sends the auth-agent CHANNEL_OPEN, we had not yet had any opportunity
to send auth-agent-req (because that has to wait until we've had a
CHANNEL_OPEN_CONFIRMATION). So we were rejecting the server's
CHANNEL_OPEN, which broke this workflow:
  client: SSH2_MSG_CHANNEL_OPEN "session"
  server: SSH2_MSG_CHANNEL_OPEN "auth-agent@openssh.com"
  client: SSH2_MSG_CHANNEL_OPEN_FAILURE to the auth-agent request
          (hey, I haven't told you you can do that yet!)
  server: SSH2_MSG_CHANNEL_OPEN_FAILURE to the session request
          (in that case, no shell session for you!)
The information was already centralised in find_pubkey_alg, but that
had a query-based API that couldn't enumerate the key types. Now I
expose an underlying array so that it's possible to iterate over them.
Also, I'd forgotten to add the two new rsa-sha2-* algorithms to
find_pubkey_alg. That's also done as part of this commit.
We now add the appropriate advertisement to our KEXINIT that indicates
a willingness to receive EXT_INFO. Code in the BPP enforces that it
must appear in one of the permitted locations in the protocol (in
particular, this ensures a pre-key-exchange MITM can't get away with
inserting it into the initial cleartext segment of the protocol). And
when we receive it, we look through it for extension names we know
about.
No functional change (except for the advertisement in KEXINIT): we
don't yet actually do anything in response to any extension reported
in EXT_INFO.
This is the cleanest part of the RFC 8332 support: I simply add two
more RSA-based SSH-2 key algorithm vtables, both almost identical to
the existing one, with different ssh_id strings and signature flags.
Adding those to the HOSTKEY_ALGORITHMS list macro is enough to ensure
that we advertise support for the new identifiers in our client
KEXINIT, select the appropriate algorithm if the server announces one
or both of them too, and use the right version of the signature
validation.
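Schematically, the two new vtables look like this (field and flag
names approximate):

    /* Same method implementations as the existing "ssh-rsa" vtable;
     * only the id string and the flag selecting the signature hash
     * differ. RFC 8332 names them "rsa-sha2-256"/"rsa-sha2-512". */
    const ssh_keyalg ssh_rsa_sha256 = {
        /* ... methods shared with the ssh-rsa vtable ... */
        .ssh_id = "rsa-sha2-256",
        .signflags = SSH_AGENT_RSA_SHA2_256,  /* hash with SHA-256 */
    };

    const ssh_keyalg ssh_rsa_sha512 = {
        /* ... methods shared with the ssh-rsa vtable ... */
        .ssh_id = "rsa-sha2-512",
        .signflags = SSH_AGENT_RSA_SHA2_512,  /* hash with SHA-512 */
    };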
This is a sweeping change applied across the whole code base by a spot
of Emacs Lisp. Now, everywhere I declare a vtable filled with function
pointers (and the occasional const data member), all the members of
the vtable structure are initialised by name using the '.fieldname =
value' syntax introduced in C99.
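That is, vtable definitions now take this general form (contents
invented for illustration):

    static const BackendVtable example_backend = {
        .init = example_init,
        .free = example_free,
        .sendok = example_sendok,
        .id = "example",
        /* every field not named here is implicitly NULL / zero */
    };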
We were already using this syntax for a handful of things in the new
key-generation progress report system, so it's not new to the code
base as a whole.
The advantage is that now, when a vtable only declares a subset of the
available fields, I can initialise the rest to NULL or zero just by
leaving them out. This is most dramatic in a couple of the outlying
vtables in things like psocks (which has a ConnectionLayerVtable
containing only one non-NULL method), but less dramatically, it means
that the new 'flags' field in BackendVtable can be completely left out
of every backend definition except for the SUPDUP one which defines it
to a nonzero value. Similarly, the test_for_upstream method only used
by SSH doesn't have to be mentioned in the rest of the backends;
network Plugs for listening sockets don't have to explicitly null out
'receive' and 'sent', and vice versa for 'accepting', and so on.
While I'm at it, I've normalised the declarations so they don't use
the unnecessarily verbose 'struct' keyword. Also a handful of them
weren't const; now they are.
This is standardised by RFC 8709 at SHOULD level, and for us it's not
too difficult (because we use general-purpose elliptic-curve code). So
let's be up to date for a change, and add it.
This implementation uses all the formats defined in the RFC. But we
also have to choose a wire format for the public+private key blob sent
to an agent, and since the OpenSSH agent protocol is the de facto
standard but not (yet?) handled by the IETF, OpenSSH themselves get to
say what the format for a key should or shouldn't be. So if they
don't support a particular key type, what do you do?
I checked with them, and they agreed that there's an obviously right
format for Ed448 keys, which is to do them exactly like Ed25519 except
that you have a 57-byte string everywhere Ed25519 had a 32-byte
string. So I've done that.
This works more or less like the similar refactoring for Montgomery
curves in 7fa0749fcb: where we previously hardwired the clearing of 3
low bits of a private exponent, we now turn that 3 into a curve-
specific constant, so that Ed448 will be able to set it to a different
value.
With all the preparation now in place, this is more or less trivial.
We add a new curve setup function in sshecc.c, and an ssh_kex linking
to it; we add the curve parameters to the reference / test code
eccref.py, and use them to generate the list of low-order input values
that should be rejected by the sanity check on the kex output; we add
the standard test vectors from RFC 7748 in cryptsuite.py, and the
low-order values we just generated.
The old API was one of those horrible things I used to do when I was
young and foolish, in which you have just one function, and indicate
which of lots of things it's doing by passing in flags. It was crying
out to be replaced with a vtable.
While I'm at it, I've reworked the code on the Windows side that
decides what to do with the progress bar, so that it's based on
actually justifiable estimates of probability rather than magic
integer constants.
Since computers are generally faster now than they were at the start
of this project, I've also decided there's no longer any point in
making the fixed final part of RSA key generation bother to report
progress at all. So the progress bars are now only for the variable
part, i.e. the actual prime generations.
(This is a reapplication of commit a7bdefb39, without the Miller-Rabin
refactoring accidentally folded into it. Also this time I've added -lm
to the link options, which for some reason _didn't_ cause me a link
failure last time round. No idea why not.)
This reverts commit a7bdefb394.
I had accidentally mashed it together with another commit. I did
actually want to push both of them, but I'd rather push them
separately! So I'm backing out the combined blob, and I'll re-push
them with their proper comments and explanations.
The old API was one of those horrible things I used to do when I was
young and foolish, in which you have just one function, and indicate
which of lots of things it's doing by passing in flags. It was crying
out to be replaced with a vtable.
While I'm at it, I've reworked the code on the Windows side that
decides what to do with the progress bar, so that it's based on
actually justifiable estimates of probability rather than magic
integer constants.
Since computers are generally faster now than they were at the start
of this project, I've also decided there's no longer any point in
making the fixed final part of RSA key generation bother to report
progress at all. So the progress bars are now only for the variable
part, i.e. the actual prime generations.
The more features and options I add to PrimeCandidateSource, the more
cumbersome it will be to replicate each one in a command-line option
to the ultimate primegen() function. So I'm moving to an API in which
the client of primegen() constructs a PrimeCandidateSource themself,
and passes it in to primegen().
Also, changed the API for pcs_new() so that you don't have to pass
'firstbits' unless you really want to. The net effect is that even
though we've added flexibility, we've also simplified the call sites
of primegen() in the simple case: if you want a 1234-bit prime, you
just need to pass pcs_new(1234) as the argument to primegen, and
you're done.
The new declaration of primegen() lives in ssh_keygen.h, along with
all the types it depends on. So I've had to #include that header in a
few new files.
It's now a subroutine specific to RSA key generation, because the
reworked PrimeCandidateSource system can handle the requirements of
DSA generation automatically.
The difference is that in DSA, one of the primes you generate is used
as a factor in the generation of the other, so you can just pass q as
a factor to pcs_require_residue_1, and it can get the range right by
itself. But in RSA, neither prime is generated with the other one in
mind; they're conceptually generated separately and independently,
apart from that single joint restriction on their product.
(I _could_ have added a feature to PrimeCandidateSource to specify a
range for the prime more specifically than a few initial bits. But I
didn't want to, because it would have been more complicated than doing
it this way, and also slightly less good: if you invent one prime
first and then constrain the range of the other one once you know the
first, then you're not getting an even probability distribution of the
possible _pairs_ of primes - you're privileging one over the other and
skewing the distribution.)
Also spelled '-O text', this takes a public or private key as input,
and produces on standard output a dump of all the actual numbers
involved in the key: the exponent and modulus for RSA, the p, q, g, y
parameters for DSA, the affine x and y coordinates of the public
elliptic curve point for ECC keys, and all the extra bits and pieces
in the private keys too.
Partly I expect this to be useful to me for debugging: I've had to
paste key files a few too many times through base64 decoders and hex
dump tools, then manually decode SSH marshalling and paste the result
into the Python REPL to get an integer object. Now I should be able to
get _straight_ to text I can paste into Python.
But also, it's a way that other applications can use the key
generator: if you need to generate, say, an RSA key in some format I
don't support (I've recently heard of an XML-based one, for example),
then you can run 'puttygen -t rsa --dump' and have it print the
elements of a freshly generated keypair on standard output, and then
all you have to do is understand the output format.
This is the same protocol that PuTTY's connection sharing has been
using for years, to communicate between the downstream and upstream
PuTTYs. I'm now promoting it to be a first-class member of the
protocols list: if you have a server for it, you can select it in the
GUI or on the command line, and write out a saved session that
specifies it.
This would be completely insecure if you used it as an ordinary
network protocol, of course. Not only is it non-cryptographic and wide
open to eavesdropping and hijacking, but it's not even _authenticated_
- it begins after the userauth phase of SSH. So there isn't even the
mild security theatre of entering an easy-to-eavesdrop password, as
there is with, say, Telnet.
However, that's not what I want to use it for. My aim is to use it for
various specialist and niche purposes, all of which involve speaking
it over an 8-bit-clean data channel that is already set up, secured
and authenticated by other methods. There are lots of examples of such
channels:
- a userv(1) invocation
- the console of a UML kernel
- the stdio channels into other kinds of container, such as Docker
- the 'adb shell' channel (although it seems quite hard to run a
  custom binary at the far end of that)
- a pair of pipes between PuTTY and a Cygwin helper process
- and so on.
So this protocol is intended as a convenient way to get a client at
one end of any of those to run a shell session at the other end.
Unlike other approaches, it will give you all the SSH-flavoured
amenities
you're already used to, like forwarding your SSH agent into the
container, or forwarding selected network ports in or out of it, or
letting it open a window on your X server, or doing SCP/SFTP style
file transfer.
Of course another way to get all those amenities would be to run an
ordinary SSH server over the same channel - but this approach avoids
having to manage a phony password or authentication key, or taking up
your CPU time with pointless crypto.
Reading draft-miller-ssh-agent-04 more carefully, I see that I missed
a few things from the extension-message spec. Firstly, there's an
extension request "query" which is supposed to list all the extensions
you support. Secondly, if you recognise an extension-request name but
are then unable to fulfill the request for some other reason, you're
supposed to return a new kind of failure message that's distinct from
SSH_AGENT_FAILURE, because for extensions, the latter is reserved for
"I don't even know what this extension name means at all".
I've fixed both of those bugs in Pageant by making a centralised map
of known extension names to an enumeration of internal ids, and an
array containing the name for each id. So we can reliably answer the
"query" extension by iterating over that array, and also use the same
array to recognise known extensions up front and give them centralised
processing (in particular, resetting the failure-message type) before
switching on the particular extension index.
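A cut-down sketch of that arrangement (the second extension name
shown is a placeholder, not one of the real ones):

    typedef enum PageantExtension {
        EXT_QUERY,
        EXT_EXAMPLE,       /* stands in for the real extensions */
        PAGEANT_N_EXTENSIONS,
    } PageantExtension;

    static const char *const extension_names[PAGEANT_N_EXTENSIONS] = {
        [EXT_QUERY]   = "query",
        [EXT_EXAMPLE] = "example@putty.projects.tartarus.org",
    };

    /* Central lookup: recognised extensions switch the failure reply
     * to SSH_AGENT_EXTENSION_FAILURE before per-extension handling;
     * unknown ones get the plain SSH_AGENT_FAILURE. */
    static int lookup_extension(ptrlen name)
    {
        for (int i = 0; i < PAGEANT_N_EXTENSIONS; i++)
            if (ptrlen_eq_string(name, extension_names[i]))
                return i;
        return -1;
    }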
This will allow it to be used more conveniently for things other than
key files.
For the moment, the implementation still lives in sshpubk.c. Moving it
out into utils.c or misc.c would be nicer, but it has awkward
dependencies on marshal.c and the per-platform f_open function.
Perhaps another time.
The queue-node structure shared between PktIn and PktOut now has a
'formal_size' field, which is initialised appropriately by the various
packet constructors. And the PacketQueue structure has a 'total_size'
field which tracks the sum of the formal sizes of all the packets on
the queue, and is automatically updated by the push, pop and
concatenate functions.
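The push-side bookkeeping is essentially this (list linkage
simplified to the usual sentinel ring):

    void pq_push(PacketQueueBase *pqb, PacketQueueNode *node)
    {
        /* Link the packet on to the end of the queue's list... */
        node->next = &pqb->end;
        node->prev = pqb->end.prev;
        node->prev->next = node;
        node->next->prev = node;
        /* ...and keep the queue's running total in step. */
        pqb->total_size += node->formal_size;
    }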
No functional change, and nothing uses the new fields yet: this is
infrastructure that will be used in the next commit.
This adds an extension request to the agent protocol (named in our
private namespace, naturally) which allows you to upload a key file in
the form of a string containing an entire .ppk file. If the key is
encrypted, then Pageant stores it in such a way that it will show up
in the key list, and on the first attempt to sign something with it,
prompt for a passphrase (if it can), decrypt the key, and then answer
the request.
There are a lot of rough edges still to deal with, but this is good
enough to have successfully answered one request, so it's a start.