The new explicit vtables for the hardware and software implementations
are now exposed by name in the testcrypt protocol, and cryptsuite.py
runs all the AES tests separately on both.
(When hardware AES is compiled out, ssh2_cipher_new("aes128_hw") and
similar calls will return None, and cryptsuite.py will respond by
skipping those tests.)
sshaes.c is more or less completely changed by this commit.
Firstly, I've changed the top-level structure. In the old structure,
there were three levels of indirection controlling what an encryption
function would actually do: first the ssh2_cipher vtable, then a
subsidiary set of function pointers within that to select the software
or hardware implementation, and then inside the main encryption
function, a switch on the key length to jump into the right place in
the unrolled loop of cipher rounds.
That was all a bit untidy. So now _all_ of that is done by means of
just one selection system, namely the ssh2_cipher vtable. The software
and hardware implementations of a given SSH cipher each have their own
separate vtable, e.g. ssh2_aes256_sdctr_sw and ssh2_aes256_sdctr_hw;
this allows them to have their own completely different state
structures too, and not have to try to coexist awkwardly in the same
universal AESContext with workaround code to align things correctly.
The old implementation-agnostic vtables like ssh2_aes256_sdctr still
exist, but now they're mostly empty, containing only the constructor
function, which will decide whether AES-NI is currently available and
then choose one of the other _real_ vtables to instantiate.
As well as the cleaner data representation, this also means the
vtables can have different description strings, which means the Event
Log will indicate which AES implementation is actually in use; it
means the SW and HW vtables are available for testcrypt to use
(although actually using them is left for the next commit); and in
principle it would also make it easy to support a user override for
the automatic SW/HW selection (in case anyone turns out to want one).
The AES-NI implementation has been reorganised to fit into the new
framework. One thing I've done is to de-optimise the key expansion:
instead of having a separate blazingly fast loop-unrolled key setup
function for each key length, there's now just one, which uses AES
intrinsics for the actual transformations of individual key words, but
wraps them in a common loop structure for all the key lengths which
has a clear correspondence to the cipher spec. (Sorry to throw away
your work there, Pavel, but this isn't an application where key setup
really _needs_ to be hugely fast, and I decided I prefer a version I
can understand and debug.)
The software AES implementation is also completely replaced with one
that uses a bit-sliced representation, i.e. the cipher state is split
across eight integers in such a way that each logical byte of the
state occupies a single bit in each of those integers. The S-box
lookup is done by a long string of AND and XOR operations on the eight
bits (removing the potential cache side channel from a lookup table),
and this representation allows 64 S-box lookups to be done in parallel
simply by extending those AND/XOR operations to be bitwise ones on a
whole word. So now we can perform four AES encryptions or decryptions
in parallel, at least when the cipher mode permits it (which SDCTR and
CBC decryption both do).
The result is slower than the old implementation, but (a) not by as
much as you might think - those parallel S-boxes are surprisingly
competitive with 64 separate table lookups; (b) the compensation is
that now it should run in constant time with no data-dependent control
flow or memory addressing; and (c) in any case the really fast
hardware implementation will supersede it for most users.
The old names like ssh_aes128 and ssh_aes128_ctr reflect the SSH
protocol IDs, which is all very well, but I think a more important
principle is that it should be easy for me to remember which cipher
mode each one refers to. So I've renamed them so that they all end in
_cbc and _sdctr.
(I've left alone the string identifiers used by testcrypt, for the
moment. Perhaps I'll go back and change those later.)
All access to AES throughout the code is now done via the ssh2_cipher
vtable interface. All code that previously made direct calls to the
underlying functions (for encrypting and decrypting private key files)
now does it by instantiating an ssh2_cipher.
This removes constraints on the AES module's internal structure, and
allows me to reorganise it as much as I like.
This is the commit that f3295e0fb _should_ have been. Yesterday I just
added some typedefs so that I didn't have to wear out my fingers
typing 'struct' in new code, but what I ought to have done is to move
all the typedefs into defs.h with the rest, and then go through
cleaning up the legacy 'struct's all through the existing code.
But I was mostly trying to concentrate on getting the test suite
finished, so I just did the minimum. Now it's time to come back and do
it better.
Previously, lots of individual ssh2_cipheralg structures were declared
static, and only available to the rest of the code via a smaller
number of 'ssh2_ciphers' objects that wrapped them into lists. But I'm
going to want to access individual ciphers directly in the testing
system I'm currently working on, so I'm giving all those objects
external linkage and declaring them in ssh.h.
Also, I've made up an entirely new one, namely exposing MD5 as an
instance of the general ssh_hashalg abstraction, which it has no need
to be for the purposes of actually using it in SSH. But, again, this
will let me treat it the same as all the other hashes in the test
system.
No functional change, for the moment.
I'm getting tired of typing 'struct Foo' everywhere when I could just
type 'Foo', so here's a bunch of extra typedefs that allow me to leave
off the 'struct' in various places.
ssh_rsakex_encrypt took an input (pointer, length) pair, which I've
replaced with a ptrlen; it also took an _output_ (pointer, length)
pair, and then re-computed the right length internally and enforced by
assertion that the one passed in matched it. Now it just returns a
strbuf of whatever length it computed, which saves the caller having
to compute the length at all.
Also, both ssh_rsakex_encrypt and ssh_rsakex_decrypt took their
arguments in a weird order; I think it looks more sensible to put the
RSA key first rather than last, so now they both have the common order
(key, hash, input data).
The abstract method ssh_key_sign(), and the concrete functions
ssh_rsakex_newkey() and rsa_ssh1_public_blob_len(), now each take a
ptrlen argument in place of a separate pointer and length pair.
Partly that's because I'm generally preferring ptrlens these days and
it keeps argument lists short and tidy-looking, but mostly it's
because it will make those functions easier to wrap in my upcoming
test system.
This makes the API more flexible, so that it's not restricted to
taking a key of precisely the length specified in the ssh2_macalg
structure. Instead, ssh2bpp looks up that length to construct the
MAC's key.
Some MACs (e.g. Poly1305) will only _work_ with a single key length.
But this way, I can run standard test vectors against MACs that can
take a variable length (e.g. everything in the HMAC family).
I'm about to want to use it for purposes other than KEX, so it's now
just called MAX_HASH_LEN and is supposed to be an upper bound on any
hash function we implement at all. Of course this makes no difference
to its value, because the largest hash we have is SHA-512 which
already fit inside that limit.
The macro wrapper for the MAC setkey function expanded to completely
the wrong vtable method due to a cut and paste error. And I never
noticed, because what _should_ have been its two call sites in
ssh2bpp.c were directly calling the _right_ vtable method instead.
The old 'Bignum' data type is gone completely, and so is sshbn.c. In
its place is a new thing called 'mp_int', handled by an entirely new
library module mpint.c, with API differences both large and small.
The main aim of this change is that the new library should be free of
timing- and cache-related side channels. I've written the code so that
it _should_ - assuming I haven't made any mistakes - do all of its
work without either control flow or memory addressing depending on the
data words of the input numbers. (Though, being an _arbitrary_
precision library, it does have to at least depend on the sizes of the
numbers - but there's a 'formal' size that can vary separately from
the actual magnitude of the represented integer, so if you want to
keep it secret that your number is actually small, it should work fine
to have a very long mp_int and just happen to store 23 in it.) So I've
done all my conditionalisation by means of computing both answers and
doing bit-masking to swap the right one into place, and all loops over
the words of an mp_int go up to the formal size rather than the actual
size.
I haven't actually tested the constant-time property in any rigorous
way yet (I'm still considering the best way to do it). But this code
is surely at the very least a big improvement on the old version, even
if I later find a few more things to fix.
I've also completely rewritten the low-level elliptic curve arithmetic
from sshecc.c; the new ecc.c is closer to being an adjunct of mpint.c
than it is to the SSH end of the code. The new elliptic curve code
keeps all coordinates in Montgomery-multiplication transformed form to
speed up all the multiplications mod the same prime, and only converts
them back when you ask for the affine coordinates. Also, I adopted
extended coordinates for the Edwards curve implementation.
sshecc.c has also had a near-total rewrite in the course of switching
it over to the new system. While I was there, I've separated ECDSA and
EdDSA more completely - they now have separate vtables, instead of a
single vtable in which nearly every function had a big if statement in
it - and also made the externally exposed types for an ECDSA key and
an ECDH context different.
A minor new feature: since the new arithmetic code includes a modular
square root function, we can now support the compressed point
representation for the NIST curves. We seem to have been getting along
fine without that so far, but it seemed a shame not to put it in,
since it was suddenly easy.
In sshrsa.c, one major change is that I've removed the RSA blinding
step in rsa_privkey_op, in which we randomise the ciphertext before
doing the decryption. The purpose of that was to avoid timing leaks
giving away the plaintext - but the new arithmetic code should take
that in its stride in the course of also being careful enough to avoid
leaking the _private key_, which RSA blinding had no way to do
anything about in any case.
Apart from those specific points, most of the rest of the changes are
more or less mechanical, just changing type names and translating code
into the new API.
These were both using the old-fashioned strategy of 'count up the
length first, then go back over the same data trying not to do
anything different', which these days I'm trying to replace with
strbufs.
Also, while I was in ssh.h, removed the prototype of rsasanitise()
which doesn't even exist any more.
Several pieces of old code were disposing of pieces of an RSAKey by
manually freeing them one at a time. We have a centralised
freersakey(), so we should use that instead wherever possible.
Where it wasn't possible to switch over to that, it was because we
were only freeing the private fields of the key - so I've fixed that
by cutting freersakey() down the middle and exposing the private-only
half as freersapriv().
It's just silly to have _two_ systems for traversing a string of
comma-separated protocol ids. I think the new get_commasep_word
technique for looping over the elements of a string is simpler and
more general than the old membership-testing approach, and also it's
necessary for the modern KEX untangling system (which has to be able
to loop over one string, even if it used a membership test to check
things in the other). So this commit rewrites the two remaining uses
of in_commasep_string to use get_commasep_word instead, and deletes
the former.
In commit 884a7df94 I claimed that all my trait-like vtable systems
now had the generic object type being a struct rather than a bare
vtable pointer (e.g. instead of 'Socket' being a typedef for a pointer
to a const Socket_vtable, it's a typedef for a struct _containing_ a
vtable pointer).
In fact, I missed a few. This commit converts ssh_key, ssh2_cipher and
ssh1_cipher into the same form as the rest.
Now the RSA signing function supports the two flags defined in
draft-miller-ssh-agent-02, and uses them to generate RSA signatures
based on SHA-256 and SHA-512, which look exactly like the ordinary
kind of RSA SHA-1 signature except that the decoded signature integer
has a different hash at the bottom and an ASN.1 identifying prefix to
match, and also the signature-type string prefixing the integer
changes from "ssh-rsa" to "rsa-sha2-256" or "rsa-sha2-512" as
appropriate.
We don't _accept_ signatures of these new types - that would need an
entirely different protocol extension - and we don't generate them
under any circumstances other than Pageant receiving a sign request
with one of those flags set.
Now each public-key algorithm gets to indicate what flags it supports,
and the ones it specifies support for may turn up in a call to its
sign() method.
We still don't actually support any flags yet, though.
The event log messages generated during DH key exchange now include both the
modulus size and hash algorithm used as well as whether the DH parameters
are from one of the standardized groups or were supplied by the server
during Group Exchange.
This is another cleanup I felt a need for while I was doing
boolification. If you define a function or variable in one .c file and
declare it extern in another, then nothing will check you haven't got
the types of the two declarations mismatched - so when you're
_changing_ the type, it's a pain to make sure you've caught all the
copies of it.
It's better to put all those extern declarations in header files, so
that the declaration in the header is also in scope for the
definition. Then the compiler will complain if they don't match, which
is what I want.
My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a,b,c were int and TRUE was just a
the 'true' defined by stdbool.h, that idiom provokes a warning from
gcc: 'suggest parentheses around assignment used as truth value'!
This commit includes <stdbool.h> from defs.h and deletes my
traditional definitions of TRUE and FALSE, but other than that, it's a
100% mechanical search-and-replace transforming all uses of TRUE and
FALSE into the C99-standardised lowercase spellings.
No actual types are changed in this commit; that will come next. This
is just getting the noise out of the way, so that subsequent commits
can have a higher proportion of signal.
The annoying int64.h is completely retired, since C99 guarantees a
64-bit integer type that you can actually treat like an ordinary
integer. Also, I've replaced the local typedefs uint32 and word32
(scattered through different parts of the crypto code) with the
standard uint32_t.
This server is NOT SECURE! If anyone is reading this commit message,
DO NOT DEPLOY IT IN A HOSTILE-FACING ENVIRONMENT! Its purpose is to
speak the server end of everything PuTTY speaks on the client side, so
that I can test that I haven't broken PuTTY when I reorganise its
code, even things like RSA key exchange or chained auth methods which
it's hard to find a server that speaks at all.
(For this reason, it's declared with [UT] in the Recipe file, so that
it falls into the same category as programs like testbn, which won't
be installed by 'make install'.)
Working title is 'Uppity', partly for 'Universal PuTTY Protocol
Interaction Test Yoke', but mostly because it looks quite like the
word 'PuTTY' with part of it reversed. (Apparently 'test yoke' is a
very rarely used term meaning something not altogether unlike 'test
harness', which is a bit of a stretch, but it'll do.)
It doesn't actually _support_ everything I want yet. At the moment,
it's a proof of concept only. But it has most of the machinery
present, and the parts it's missing - such as chained auth methods -
should be easy enough to add because I've built in the required
flexibility, in the form of an AuthPolicy object which can request
them if it wants to. However, the current AuthPolicy object is
entirely trivial, and will let in any user with the password "weasel".
(Another way in which this is not a production-ready server is that it
also has no interaction with the OS's authentication system. In
particular, it will not only let in any user with the same password,
but it won't even change uid - it will open shells and forwardings
under whatever user id you started it up as.)
Currently, the program can only speak the SSH protocol on its standard
I/O channels (using the new FdSocket facility), so if you want it to
listen on a network port, you'll have to run it from some kind of
separate listening program similar to inetd. For my own tests, I'm not
even doing that: I'm just having PuTTY spawn it as a local proxy
process, which also conveniently eliminates the risk of anyone hostile
connecting to it.
The bulk of the actual code reorganisation is already done by previous
commits, so this change is _mostly_ just dropping in a new set of
server-specific source files alongside the client-specific ones I
created recently. The remaining changes in the shared SSH code are
numerous, but all minor:
- a few extra parameters to BPP and PPL constructors (e.g. 'are you
in server mode?'), and pass both sets of SSH-1 protocol flags from
the login to the connection layer
- in server mode, unconditionally send our version string _before_
waiting for the remote one
- a new hook in the SSH-1 BPP to handle enabling compression in
server mode, where the message exchange works the other way round
- new code in the SSH-2 BPP to do _deferred_ compression the other
way round (the non-deferred version is still nicely symmetric)
- in the SSH-2 transport layer, some adjustments to do key derivation
either way round (swapping round the identifying letters in the
various hash preimages, and making sure to list the KEXINITs in the
right order)
- also in the SSH-2 transport layer, an if statement that controls
whether we send SERVICE_REQUEST and wait for SERVICE_ACCEPT, or
vice versa
- new ConnectionLayer methods for opening outgoing channels for X and
agent forwardings
- new functions in portfwd.c to establish listening sockets suitable
for remote-to-local port forwarding (i.e. not under the direction
of a Conf the way it's done on the client side).
I've written the decryption side of the PKCS#1 encryption used in
SSH-1, and also the RSAES-OAEP system used by SSH-2 RSA kex. Also, the
RSA kex structures now each come with an 'extra' pointer giving the
minimum key length.
ssh2connection.c now knows how to unmarshal the message formats for
all the channel requests we'll need to handle when we're the server
and a client sends them. Each one is translated into a call to a new
method in the Channel vtable, which is implemented by a trivial
'always fail' routine in every channel type we know about so far.
This will be used for the server side of X forwarding. It wraps up the
mechanics of listening on the right TCP port and (if possible) the
associated AF_UNIX socket, and also creates an appropriate X authority
file containing authorisation data provided by its caller.
Like the new platform_create_agent_socket, this function spawns a
watchdog subprocess to clean up the mess afterwards, in the hope of at
least _most_ of the time not leaving old sockets and authority files
lying around /tmp,
The code in Pageant that sets up the Unix socket and its containing
directory now lives in a separate file, uxagentsock.c, where it will
also be callable from the upcoming new SSH server when it wants to
create a similar socket for agent forwarding.
While I'm at it, I've also added a feature to create a watchdog
subprocess that will try to clean up the socket and directory once
Pageant itself terminates, in the hope of leaving less cruft lying
around /tmp.
Previously, it returned a human-readable string suitable for log
files, which tried to say something useful about the remote end of a
socket. Now it returns a whole SocketPeerInfo structure, of which that
human-friendly log string is just one field, but also some of the same
information - remote IP address and port, in particular - is provided
in machine-readable form where it's available.
The function takes the two KEXINIT packets in their string form,
together with a list of mappings from names to known algorithm
implementations, and returns the selected one of each kind, along with
all the other necessary auxiliary stuff.
I've introduced a new POD struct type 'ssh_ttymodes' which stores an
encoding of everything you can specify in the "pty-req" packet or the
SSH-1 equivalent. This allows me to split up
write_ttymodes_to_packet_from_conf() into two separate functions, one
to parse all the ttymode data out of a Conf (and a Seat for fallback)
and return one of those structures, and the other to write it into an
SSH packet.
While I'm at it, I've moved the special case of terminal speeds into
the same mechanism, simplifying the call sites in both versions of the
SSH protocol.
The new master definition of all terminal modes lives in a header
file, with an ifdef around each item, so that later on I'll be able to
include it in a context that only enumerates the modes supported by
the particular target Unix platform.
This gets another big pile of logic out of ssh2connection and puts it
somewhere more central. Now the only thing left in ssh2connection is
the formatting and parsing of the various channel requests; the logic
deciding which ones to issue and what to do about them is devolved to
the Channel implementation, as it properly should be.
This is a new vtable-based abstraction which is passed to a backend in
place of Frontend, and it implements only the subset of the Frontend
functions needed by a backend. (Many other Frontend functions still
exist, notably the wide range of things called by terminal.c providing
platform-independent operations on the GUI terminal window.)
The purpose of making it a vtable is that this opens up the
possibility of creating a backend as an internal implementation detail
of some other activity, by providing just that one backend with a
custom Seat that implements the methods differently.
For example, this refactoring should make it feasible to directly
implement an SSH proxy type, aka the 'jump host' feature supported by
OpenSSH, aka 'open a secondary SSH session in MAINCHAN_DIRECT_TCP
mode, and then expose the main channel of that as the Socket for the
primary connection'. (Which of course you can already do by spawning
'plink -nc' as a separate proxy process, but this would permit it in
the _same_ process without anything getting confused.)
I've centralised a full set of stub methods in misc.c for the new
abstraction, which allows me to get rid of several annoying stubs in
the previous code. Also, while I'm here, I've moved a lot of
duplicated modalfatalbox() type functions from application main
program files into wincons.c / uxcons.c, which I think saves
duplication overall. (A minor visible effect is that the prefixes on
those console-based fatal error messages will now be more consistent
between applications.)
LogContext is now the owner of the logevent() function that back ends
and so forth are constantly calling. Previously, logevent was owned by
the Frontend, which would store the message into its list for the GUI
Event Log dialog (or print it to standard error, or whatever) and then
pass it _back_ to LogContext to write to the currently open log file.
Now it's the other way round: LogContext gets the message from the
back end first, writes it to its log file if it feels so inclined, and
communicates it back to the front end.
This means that lots of parts of the back end system no longer need to
have a pointer to a full-on Frontend; the only thing they needed it
for was logging, so now they just have a LogContext (which many of
them had to have anyway, e.g. for logging SSH packets or session
traffic).
LogContext itself also doesn't get a full Frontend pointer any more:
it now talks back to the front end via a little vtable of its own
called LogPolicy, which contains the method that passes Event Log
entries through, the old askappend() function that decides whether to
truncate a pre-existing log file, and an emergency function for
printing an especially prominent message if the log file can't be
created. One minor nice effect of this is that console and GUI apps
can implement that last function subtly differently, so that Unix
console apps can write it with a plain \n instead of the \r\n
(harmless but inelegant) that the old centralised implementation
generated.
One other consequence of this is that the LogContext has to be
provided to backend_init() so that it's available to backends from the
instant of creation, rather than being provided via a separate API
call a couple of function calls later, because backends have typically
started doing things that need logging (like making network
connections) before the call to backend_provide_logctx. Fortunately,
there's no case in the whole code base where we don't already have
logctx by the time we make a backend (so I don't actually remember why
I ever delayed providing one). So that shortens the backend API by one
function, which is always nice.
While I'm tidying up, I've also moved the printf-style logeventf() and
the handy logevent_and_free() into logging.c, instead of having copies
of them scattered around other places. This has also let me remove
some stub functions from a couple of outlying applications like
Pageant. Finally, I've removed the pointless "_tag" at the end of
LogContext's official struct name.
The sshverstring quasi-frontend is passed a Frontend pointer at setup
time, so that it can generate Event Log entries containing the local
and remote version strings and the results of remote bug detection.
I'm promoting that field of sshverstring to a field of the public BPP
structure, so now all BPPs have the right to talk directly to the
frontend if they want to. This means I can move all the log messages
of the form 'Initialised so-and-so cipher/MAC/compression' down into
the BPPs themselves, where they can live exactly alongside the actual
initialisation of those primitives.
It also means BPPs will be able to log interesting things they detect
at any point in the packet stream, which is about to come in useful
for another purpose.
I haven't needed these until now, but I'm about to need to inspect the
entire contents of a packet queue before deciding whether to process
the first item on it.
I've changed the single 'vtable method' in packet queues from get(),
which returned the head of the queue and optionally popped it, to
after() which does the same bug returns the item after a specified
tree node. So if you pass the special end node to after(), then it
behaves like get(), but now you can also use it to retrieve the
successor of a packet.
(Orthogonality says that you can also _pop_ the successor of a packet
by calling after() with prev != pq.end and pop == TRUE. I don't have a
use for that one yet.)
All the main backend structures - Ssh, Telnet, Pty, Serial etc - now
describe structure types themselves rather than pointers to them. The
same goes for the codebase-wide trait types Socket and Plug, and the
supporting types SockAddr and Pinger.
All those things that were typedefed as pointers are older types; the
newer ones have the explicit * at the point of use, because that's
what I now seem to be preferring. But whichever one of those is
better, inconsistently using a mixture of the two styles is worse, so
let's make everything consistent.
A few types are still implicitly pointers, such as Bignum and some of
the GSSAPI types; generally this is either because they have to be
void *, or because they're typedefed differently on different
platforms and aren't always pointers at all. Can't be helped. But I've
got rid of the main ones, at least.
The check_termination function in ssh2connection is supposed to be
called whenever it's possible that we've run out of (a) channels, and
(b) sharing downstreams. I've been calling it on every channel close,
but apparently completely forgot to add a callback from sshshare.c
that also arranges to call it when we run out of downstreams.
In commit 8cb68390e I managed to copy the packet contexts inaccurately
from the old implementation of ssh2_pkt_type, and listed the ECDH KEX
packets against SSH2_PKTCTX_DHGEX instead of SSH2_PKTCTX_ECDHKEX,
which led to them appearing as "unknown" in packet log files.
I've tried to separate out as many individually coherent changes from
this work as I could into their own commits, but here's where I run
out and have to commit the rest of this major refactoring as a
big-bang change.
Most of ssh.c is now no longer in ssh.c: all five of the main
coroutines that handle layers of the SSH-1 and SSH-2 protocols now
each have their own source file to live in, and a lot of the
supporting functions have moved into the appropriate one of those too.
The new abstraction is a vtable called 'PacketProtocolLayer', which
has an input and output packet queue. Each layer's main coroutine is
invoked from the method ssh_ppl_process_queue(), which is usually
(though not exclusively) triggered automatically when things are
pushed on the input queue. In SSH-2, the base layer is the transport
protocol, and it contains a pair of subsidiary queues by which it
passes some of its packets to the higher SSH-2 layers - first userauth
and then connection, which are peers at the same level, with the
former abdicating in favour of the latter at the appropriate moment.
SSH-1 is simpler: the whole login phase of the protocol (crypto setup
and authentication) is all in one module, and since SSH-1 has no
repeat key exchange, that setup layer abdicates in favour of the
connection phase when it's done.
ssh.c itself is now about a tenth of its old size (which all by itself
is cause for celebration!). Its main job is to set up all the layers,
hook them up to each other and to the BPP, and to funnel data back and
forth between that collection of modules and external things such as
the network and the terminal. Once it's set up a collection of packet
protocol layers, it communicates with them partly by calling methods
of the base layer (and if that's ssh2transport then it will delegate
some functionality to the corresponding methods of its higher layer),
and partly by talking directly to the connection layer no matter where
it is in the stack by means of the separate ConnectionLayer vtable
which I introduced in commit 8001dd4cb, and to which I've now added
quite a few extra methods replacing services that used to be internal
function calls within ssh.c.
(One effect of this is that the SSH-1 and SSH-2 channel storage is now
no longer shared - there are distinct struct types ssh1_channel and
ssh2_channel. That means a bit more code duplication, but on the plus
side, a lot fewer confusing conditionals in the middle of half-shared
functions, and less risk of a piece of SSH-1 escaping into SSH-2 or
vice versa, which I remember has happened at least once in the past.)
The bulk of this commit introduces the five new source files, their
common header sshppl.h and some shared supporting routines in
sshcommon.c, and rewrites nearly all of ssh.c itself. But it also
includes a couple of other changes that I couldn't separate easily
enough:
Firstly, there's a new handling for socket EOF, in which ssh.c sets an
'input_eof' flag in the BPP, and that responds by checking a flag that
tells it whether to report the EOF as an error or not. (This is the
main reason for those new BPP_READ / BPP_WAITFOR macros - they can
check the EOF flag every time the coroutine is resumed.)
Secondly, the error reporting itself is changed around again. I'd
expected to put some data fields in the public PacketProtocolLayer
structure that it could set to report errors in the same way as the
BPPs have been doing, but in the end, I decided propagating all those
data fields around was a pain and that even the BPPs shouldn't have
been doing it that way. So I've reverted to a system where everything
calls back to functions in ssh.c itself to report any connection-
ending condition. But there's a new family of those functions,
categorising the possible such conditions by semantics, and each one
has a different set of detailed effects (e.g. how rudely to close the
network connection, what exit status should be passed back to the
whole application, whether to send a disconnect message and/or display
a GUI error box).
I don't expect this to be immediately perfect: of course, the code has
been through a big upheaval, new bugs are expected, and I haven't been
able to do a full job of testing (e.g. I haven't tested every auth or
kex method). But I've checked that it _basically_ works - both SSH
protocols, all the different kinds of forwarding channel, more than
one auth method, Windows and Linux, connection sharing - and I think
it's now at the point where the easiest way to find further bugs is to
let it out into the wild and see what users can spot.
This means that someone putting things on a packet queue doesn't need
to separately hold a pointer to someone who needs notifying about it,
or remember to call the notification function every time they push
things on the queue. It's all taken care of automatically, without
having to put extra stuff at the call sites.
The precise semantics are that the callback will be scheduled whenever
_new_ packets appear on the queue, but not when packets are removed.
(Because the expectation is that the callback is notifying whoever is
consuming the queue.)
This paves the way for me to reorganise ssh.c in a way that will mean
I don't have a ConnectionLayer available yet at the time I have to
create the connshare. The constructor function now takes a mere
Frontend, for generating setup-time Event Log messages, and there's a
separate ssh_connshare_provide_connlayer() function I can call later
once I have the ConnectionLayer to provide.
NFC for the moment: the new provide_connlayer function is called
immediately after ssh_connection_sharing_init.
This is essentially trivial, because the only thing it needed from the
Ssh structure was the Conf. So the version in sshcommon.c just takes
an actual Conf as an argument, and now it doesn't need access to the
big structure definition any more.
This is a new idea I've had to make memory-management of PktIn even
easier. The idea is that a PktIn is essentially _always_ an element of
some linked-list queue: if it's not one of the queues by which packets
move through ssh.c, then it's a special 'free queue' which holds
packets that are unowned and due to be freed.
pq_pop() on a PktInQueue automatically relinks the packet to the free
queue, and also triggers an idempotent callback which will empty the
queue and really free all the packets on it. Hence, you can pop a
packet off a real queue, parse it, handle it, and then just assume
it'll get tidied up at some point - the only constraint being that you
have to finish with it before returning to the application's main loop.
The exception is that it's OK to pq_push() the packet back on to some
other PktInQueue, because a side effect of that will be to _remove_ it
from the free queue again. (And if _all_ the incoming packets get that
treatment, then when the free-queue handler eventually runs, it may
find it has nothing to do - which is harmless.)
Now I've got a list macro defining all the packet types we recognise,
I can use it to write a test for 'is this a recognised code?', and use
that in turn to centralise detection of completely unrecognised codes
into the binary packet protocol, where any such messages will be
handled entirely internally and never even be seen by the next level
up. This lets me remove another big pile of boilerplate in ssh.c.
This allows me to share just one definition of the packet types
between the enum declarations in ssh.h and the string translation
functions in sshcommon.c. No functional change.
The style of list macro is slightly unusual; instead of the
traditional 'X-macro' in which LIST(X) expands to invocations of the
form X(list element), this is an 'X-y macro', where LIST(X,y) expands
to invocations of the form X(y, list element). That style makes it
possible to wrap the list macro up in another macro and pass a
parameter through from the wrapper to the per-element macro. I'm not
using that facility just yet, but I will in the next commit.