I carefully put a flag in the new Ssh structure so that I could tell
the difference between ssh->base_layer being NULL because it hasn't
been set up yet, and being NULL because it's been and gone and the
session is terminated. And did I check that flag in all the error
routines? I did not. Result: an early socket error, while we're still
in the verstring BPP, doesn't get reported as an error message and
doesn't cause the socket to be cleaned up.
I reworked the code for this at the last moment while preparing the
Big Refactoring, having decided my previous design was overcomplicated
and introducing an argument parameter (commit f4fbaa1bd) would be
simpler.
I carefully checked after the rework that specials manufactured by the
code itself (e.g. SS_PING) came through OK, but apparently the one
thing I _didn't_ test after the rework was that the specials list was
actually returned correctly from ssh_get_specials to be incorporated
into the GUI.
In fact one stray if statement - both redundant even if it had been
right, and also tested the wrong pointer - managed to arrange that
when ssh->specials is NULL, it could never be overwritten by anything
non-NULL. And of course it starts off initialised to NULL. Oops.
I've tried to separate out as many individually coherent changes from
this work as I could into their own commits, but here's where I run
out and have to commit the rest of this major refactoring as a
big-bang change.
Most of ssh.c is now no longer in ssh.c: all five of the main
coroutines that handle layers of the SSH-1 and SSH-2 protocols now
each have their own source file to live in, and a lot of the
supporting functions have moved into the appropriate one of those too.
The new abstraction is a vtable called 'PacketProtocolLayer', which
has an input and output packet queue. Each layer's main coroutine is
invoked from the method ssh_ppl_process_queue(), which is usually
(though not exclusively) triggered automatically when things are
pushed on the input queue. In SSH-2, the base layer is the transport
protocol, and it contains a pair of subsidiary queues by which it
passes some of its packets to the higher SSH-2 layers - first userauth
and then connection, which are peers at the same level, with the
former abdicating in favour of the latter at the appropriate moment.
SSH-1 is simpler: the whole login phase of the protocol (crypto setup
and authentication) is all in one module, and since SSH-1 has no
repeat key exchange, that setup layer abdicates in favour of the
connection phase when it's done.
ssh.c itself is now about a tenth of its old size (which all by itself
is cause for celebration!). Its main job is to set up all the layers,
hook them up to each other and to the BPP, and to funnel data back and
forth between that collection of modules and external things such as
the network and the terminal. Once it's set up a collection of packet
protocol layers, it communicates with them partly by calling methods
of the base layer (and if that's ssh2transport then it will delegate
some functionality to the corresponding methods of its higher layer),
and partly by talking directly to the connection layer no matter where
it is in the stack by means of the separate ConnectionLayer vtable
which I introduced in commit 8001dd4cb, and to which I've now added
quite a few extra methods replacing services that used to be internal
function calls within ssh.c.
(One effect of this is that the SSH-1 and SSH-2 channel storage is now
no longer shared - there are distinct struct types ssh1_channel and
ssh2_channel. That means a bit more code duplication, but on the plus
side, a lot fewer confusing conditionals in the middle of half-shared
functions, and less risk of a piece of SSH-1 escaping into SSH-2 or
vice versa, which I remember has happened at least once in the past.)
The bulk of this commit introduces the five new source files, their
common header sshppl.h and some shared supporting routines in
sshcommon.c, and rewrites nearly all of ssh.c itself. But it also
includes a couple of other changes that I couldn't separate easily
enough:
Firstly, there's a new handling for socket EOF, in which ssh.c sets an
'input_eof' flag in the BPP, and that responds by checking a flag that
tells it whether to report the EOF as an error or not. (This is the
main reason for those new BPP_READ / BPP_WAITFOR macros - they can
check the EOF flag every time the coroutine is resumed.)
Secondly, the error reporting itself is changed around again. I'd
expected to put some data fields in the public PacketProtocolLayer
structure that it could set to report errors in the same way as the
BPPs have been doing, but in the end, I decided propagating all those
data fields around was a pain and that even the BPPs shouldn't have
been doing it that way. So I've reverted to a system where everything
calls back to functions in ssh.c itself to report any connection-
ending condition. But there's a new family of those functions,
categorising the possible such conditions by semantics, and each one
has a different set of detailed effects (e.g. how rudely to close the
network connection, what exit status should be passed back to the
whole application, whether to send a disconnect message and/or display
a GUI error box).
I don't expect this to be immediately perfect: of course, the code has
been through a big upheaval, new bugs are expected, and I haven't been
able to do a full job of testing (e.g. I haven't tested every auth or
kex method). But I've checked that it _basically_ works - both SSH
protocols, all the different kinds of forwarding channel, more than
one auth method, Windows and Linux, connection sharing - and I think
it's now at the point where the easiest way to find further bugs is to
let it out into the wild and see what users can spot.
Having redesigned it a few days ago in commit 562cdd4df, I'm changing
it again, this time to fix a potential race condition on the _output_
side: the last change was intended to cope with a server sending an
asynchronous message like IGNORE immediately after enabling
compression, and this one fixes the case in which _we_ happen to
decide to send an IGNORE while a compression request is still pending.
I couldn't fix this until after the BPP was reorganised to have an
explicit output queue of packets, but now it does, I can simply defer
processing that queue on to the output raw-data bufchain if we're
waiting for a compression request to be answered. Once it is answered,
the BPP can release any pending packets.
This is a convenient place for it because it abstracts away the
difference in disconnect packet formats between SSH-1 and -2, so when
I start restructuring, I'll be able to call it even from places that
don't know which version of SSH they're running.
Now, instead of writing each packet straight on to the raw output
bufchain by calling the BPP's format_packet function, the higher
protocol layers will put the packets on to a queue, which will
automatically trigger a callback (using the new mechanism for
embedding a callback in any packet queue) to make the BPP format its
queue on to the raw-output bufchain. That in turn triggers a second
callback which moves the data to the socket.
This means in particular that the CBC ignore-message workaround can be
moved into the new BPP routine to process the output queue, which is a
good place for it because then it can easily arrange to only put an
ignore message at the start of any sequence of packets that are being
formatted as a single output blob.
This paves the way for me to reorganise ssh.c in a way that will mean
I don't have a ConnectionLayer available yet at the time I have to
create the connshare. The constructor function now takes a mere
Frontend, for generating setup-time Event Log messages, and there's a
separate ssh_connshare_provide_connlayer() function I can call later
once I have the ConnectionLayer to provide.
NFC for the moment: the new provide_connlayer function is called
immediately after ssh_connection_sharing_init.
This is essentially trivial, because the only thing it needed from the
Ssh structure was the Conf. So the version in sshcommon.c just takes
an actual Conf as an argument, and now it doesn't need access to the
big structure definition any more.
This is a new idea I've had to make memory-management of PktIn even
easier. The idea is that a PktIn is essentially _always_ an element of
some linked-list queue: if it's not one of the queues by which packets
move through ssh.c, then it's a special 'free queue' which holds
packets that are unowned and due to be freed.
pq_pop() on a PktInQueue automatically relinks the packet to the free
queue, and also triggers an idempotent callback which will empty the
queue and really free all the packets on it. Hence, you can pop a
packet off a real queue, parse it, handle it, and then just assume
it'll get tidied up at some point - the only constraint being that you
have to finish with it before returning to the application's main loop.
The exception is that it's OK to pq_push() the packet back on to some
other PktInQueue, because a side effect of that will be to _remove_ it
from the free queue again. (And if _all_ the incoming packets get that
treatment, then when the free-queue handler eventually runs, it may
find it has nothing to do - which is harmless.)
Now I've got a list macro defining all the packet types we recognise,
I can use it to write a test for 'is this a recognised code?', and use
that in turn to centralise detection of completely unrecognised codes
into the binary packet protocol, where any such messages will be
handled entirely internally and never even be seen by the next level
up. This lets me remove another big pile of boilerplate in ssh.c.
In order to list cross-certifiable host keys in the GUI specials menu,
the SSH backend has been inventing new values on the end of the
Telnet_Special enumeration, starting from the value TS_LOCALSTART.
This is inelegant, and also makes it awkward to break up special
handlers (e.g. to dispatch different specials to different SSH
layers), since if all you know about a special is that it's somewhere
in the TS_LOCALSTART+n space, you can't tell what _general kind_ of
thing it is. Also, if I ever need another open-ended set of specials
in future, I'll have to remember which TS_LOCALSTART+n codes are in
which set.
So here's a revamp that causes every special to take an extra integer
argument. For all previously numbered specials, this argument is
passed as zero and ignored, but there's a new main special code for
SSH host key cross-certification, in which the integer argument is an
index into the backend's list of available keys. TS_LOCALSTART is now
a thing of the past: if I need any other open-ended sets of specials
in future, I can add a new top-level code with a nicely separated
space of arguments.
While I'm at it, I've removed the legacy misnomer 'Telnet_Special'
from the code completely; the enum is now SessionSpecialCode, the
struct containing full details of a menu entry is SessionSpecial, and
the enum values now start SS_ rather than TS_.
I've just noticed that we call ssh1_bpp_start_compression even if the
server responded to our compression request with SSH1_SMSG_FAILURE!
Also, while I'm here, there's a potential race condition if the server
were to send an unrelated message (such as SSH1_MSG_IGNORE)
immediately after the SSH1_SMSG_SUCCESS that indicates compression
being enabled - the BPP would try to decode the compressed IGNORE
message before the SUCCESS got to the higher layer that would tell the
BPP it should have enabled compression. Fixed that by changing the
method by which we tell the BPP what's going on.
Originally, it controlled whether ssh.c should send terminal messages
(such as login and password prompts) to terminal.c or to stderr. But
we've had the from_backend() abstraction for ages now, which even has
an existing flag to indicate that the data is stderr rather than
stdout data; applications which set FLAG_STDERR are precisely those
that link against uxcons or wincons, so from_backend will do the
expected thing anyway with data sent to it with that flag set. So
there's no reason ssh.c can't just unconditionally pass everything
through that, and remove the special case.
FLAG_STDERR was also used by winproxy and uxproxy to decide whether to
capture standard error from a local proxy command, or whether to let
the proxy command send its diagnostics directly to the usual standard
error. On reflection, I think it's better to unconditionally capture
the proxy's stderr, for three reasons. Firstly, it means proxy
diagnostics are prefixed with 'proxy:' so that you can tell them apart
from any other stderr spew (which used to be particularly confusing if
both the main application and the proxy command were instances of
Plink); secondly, proxy diagnostics are now reliably copied to packet
log files along with all the other Event Log entries, even by
command-line tools; and thirdly, this means the option to suppress
proxy command diagnostics after the main session starts will actually
_work_ in the command-line tools, which it previously couldn't.
A more minor structure change is that copying of Event Log messages to
stderr in verbose mode is now done by wincons/uxcons, instead of
centrally in logging.c (since logging.c can now no longer check
FLAG_STDERR to decide whether to do it). The total amount of code to
do this is considerably smaller than the defensive-sounding comment in
logevent.c explaining why I did it the other way instead :-)
When PuTTY is configured to display stderr diagnostics from the proxy
command but only until the main session starts, this flag is how the
SSH backend indicates the point at which the session starts. It was
previously set during version-string parsing, and I forgot to find a
new home for it when I moved the version string parsing out into the
new verstring BPP module in commit af8e526a7.
Now reinstated, at the point where that BPP gets back to us and tells
us what protocol version it's chosen.
I've removed the encrypted_len fields from PktIn and PktOut, which
were used to communicate from the BPP to ssh.c how much each packet
contributed to the amount of data encrypted with a given set of cipher
keys. It seems more sensible to have the BPP itself keep that counter
- especially since only one of the three BPPs even needs to count it
at all. So now there's a new DataTransferStats structure which the BPP
updates, and ssh.c only needs to check it for overflow and reset the
limits.
That function _did_ depend on ssh.c's internal facilities, namely the
layout of 'struct ssh_channel'. In place of that, it now takes an
extra integer argument telling it where to find the channel id in
whatever data structure you give it a tree of - so now I can split up
the SSH-1 and SSH-2 channel handling without losing the services of
that nice channel-number allocator.
While I'm at it, I've brought it all into a single function: the
parsing of data from Conf, the list of modes, and even the old
callback system for writing to the destination buffer is now a simple
if statement that formats mode parameters as byte or uint32 depending
on SSH version. Also, the terminal speeds and the end byte are part of
the same setup, so it's all together in one place instead of scattered
all over ssh.c.
It doesn't really have to be in ssh.c sharing that file's internal
data structures; it's as much an independent object implementation as
any of the less trivial Channel instances. So it's another thing we
can get out of that too-large source file.
It's really just a concatenator for a pair of linked lists, but
unhelpfully restricted in which of the lists it replaces with the
output. Better to have a three-argument function that puts the output
wherever you like, whether it overlaps either or neither one of the
inputs.
We used it to suppress reading from the network at every point during
protocol setup where PuTTY was waiting for a user response to a dialog
box (e.g. a host key warning). The purpose of this was to avoid
dropping an important packet while the coroutine was listening to one
of its other input parameters (as it were). But now everything is
queue-based, packets will stay queued until we're ready to look at
them anyway; so it's better _not_ to freeze the connection, so that
messages we _can_ handle in between (e.g. SSH_MSG_DEBUG or
SSH_MSG_IGNORE) can still be processed.
That dispenses with all uses of ssh_set_frozen except for its use by
ssh_throttle_conn to exert back-pressure on the server in SSH1 which
doesn't have per-channel windows. So I've moved that last use _into_
ssh_throttle_conn, and now the function is completely gone.
These are essentially data-structure maintenance, and it seems silly
to have them be part of the same file that manages the topmost
structure of the SSH connection.
Some upcoming restructuring I've got planned will need to pass output
packets back and forth on queues, as well as input ones. So here's a
change that arranges that we can have a PktInQueue and a PktOutQueue,
sharing most of their implementation via a PacketQueueBase structure
which links together the PacketQueueNode fields in the two packet
structures.
There's a tricksy bit of macro manoeuvring to get all of this
type-checked, so that I can't accidentally link a PktOut on to a
PktInQueue or vice versa. It works by having the main queue functions
wrapped by macros; when receiving a packet structure on input, they
type-check it against the queue structure and then automatically look
up its qnode field to pass to the underlying PacketQueueBase function;
on output, they translate a returned PacketQueueNode back to its
containing packet type by calling a 'get' function pointer.
Now there's a centralised routine in misc.c to do the sanitisation,
which copies data on to an outgoing bufchain. This allows me to remove
from_backend_untrusted() completely from the frontend API, simplifying
code in several places.
Two use cases for untrusted-terminal-data sanitisation were in the
terminal.c prompts handler, and in the collection of SSH-2 userauth
banners. Both of those were writing output to a bufchain anyway, so
it was very convenient to just replace a bufchain_add with
sanitise_term_data and then not have to worry about it again.
There was also a simplistic sanitiser in uxcons.c, which I've now
replaced with a call to the good one - and in wincons.c there was a
FIXME saying I ought to get round to that, which now I have!
Getting it out of the overgrown ssh.c is worthwhile in itself! But
there are other benefits of this reorganisation too.
One is that I get to remove ssh->current_incoming_data_fn, because now
_all_ incoming network data is handled by whatever the current BPP is.
So now we only indirect through the BPP, not through some other
preliminary function pointer _and_ the BPP.
Another is that all _outgoing_ network data is now handled centrally,
including our outgoing version string - which means that a hex dump of
that string now shows up in the raw-data log file, from which it was
previously conspicuous by its absence.
This replaces the previous log(n)^2 algorithm for channel-number
allocation, which binary-searched the space of tree indices using a
log-time call to index234() at each step, with a single log-time pass
down the tree which only has to check the returned channel number
against the returned tree index at each step.
I'm under no illusions that this was a critical performance issue, but
it's been offending my sense of algorithmic elegance for a while.
This is a vtable that wraps up all the functionality required from the
SSH connection layer by associated modules like port forwarding and
connection sharing. This extra layer of indirection adds nothing
useful right now, but when I later separate the SSH-1 and SSH-2
connection layer implementations, it will be convenient for each one
to be able to implement this vtable in terms of its own internal data
structures.
To simplify this vtable, I've moved a lot of the logging duties
relating to connection sharing out of ssh.c into sshshare.c: now it
handles nearly all the logging itself relating to setting up
connection sharing in the first place and downstreams connecting and
disconnecting. The only exception is the 'Reusing a shared connection'
announcement in the console window, which is now done in ssh.c by
detecting downstream status immediately after setup.
The tree234 storing currently active port forwardings - both local and
remote - now lives in portfwd.c, as does the complicated function that
updates it based on a Conf listing the new set of desired forwardings.
Local port forwardings are passed to ssh.c via the same route as
before - once the listening port receives a connection and portfwd.c
knows where it should be directed to (in particular, after the SOCKS
exchange, if any), it calls ssh_send_port_open.
Remote forwardings are now initiated by calling ssh_rportfwd_alloc,
which adds an entry to the rportfwds tree (which _is_ still in ssh.c,
and still confusingly sorted by a different criterion depending on SSH
protocol version) and sends out the appropriate protocol request.
ssh_rportfwd_remove cancels one again, sending a protocol request too.
Those functions look enough like ssh_{alloc,remove}_sharing_rportfwd
that I've merged those into the new pair as well - now allocating an
rportfwd allows you to specify either a destination host/port or a
sharing context, and returns a handy pointer you can use to cancel the
forwarding later.
Clients outside ssh.c - all implementations of Channel - will now not
see the ssh_channel data type itself, but only a subobject of the
interface type SshChannel. All the sshfwd_* functions have become
methods in that interface type's vtable (though, wrapped in the usual
kind of macros, the call sites look identical).
This paves the way for me to split up the SSH-1 and SSH-2 connection
layers and have each one lay out its channel bookkeeping structure as
it sees fit; as long as they each provide an implementation of the
sshfwd_ method family, the types behind that need not look different.
A minor good effect of this is that the sshfwd_ methods are no longer
global symbols, so they don't have to be stubbed in Unix Pageant to
get it to compile.
This was mildly fiddly because there's a single vtable structure that
implements two distinct interface types, one for compression and one
for decompression - and I have actually confused them before now
(commit d4304f1b7), so I think it's important to make them actually be
separate types!
The new version of ssh_hash has the same nice property as ssh2_mac,
that I can make the generic interface object type function directly as
a BinarySink so that clients don't have to call h->sink() and worry
about the separate sink object they get back from that.
This piece of tidying-up has come out particularly well in terms of
saving tedious repetition and boilerplate. I've managed to remove
three pointless methods from every MAC implementation by means of
writing them once centrally in terms of the implementation-specific
methods; another method (hmacmd5_sink) vanished because I was able to
make the interface type 'ssh2_mac' be directly usable as a BinarySink
by way of a new delegation system; and because all the method
implementations can now find their own vtable, I was even able to
merge a lot of keying and output functions that had previously
differed only in length parameters by having them look up the lengths
in whatever vtable they were passed.
This is more or less the same job as the SSH-1 case, only more
extensive, because we have a wider range of ciphers.
I'm a bit disappointed about the AES case, in particular, because I
feel as if it ought to have been possible to arrange to combine this
layer of vtable dispatch with the subsidiary one that selects between
hardware and software implementations of the underlying cipher. I may
come back later and have another try at that, in fact.
The interchangeable system of SSH-1 ciphers previously followed the
same pattern as the backends and the public-key algorithms, in that
all the clients would maintain two separate pointers, one to the
vtable and the other to the individual instance / context. Now I've
merged them, just as I did with those other two, so that you only cart
around a single pointer, which has a vtable pointer inside it and a
type distinguishing it from an instance of any of the other
interchangeable sets of algorithms.
This was a particularly confusing piece of type-danger, because three
different types were passed outside sshshare.c as 'void *' and only
human vigilance prevented one coming back as the wrong one. Now they
all keep their opaque structure tags when they move through other
parts of the code.
There's now an interface called 'Channel', which handles the local
side of an SSH connection-layer channel, in terms of knowing where to
send incoming channel data to, whether to close the channel, etc.
Channel and the previous 'struct ssh_channel' mutually refer. The
latter contains all the SSH-specific parts, and as much of the common
logic as possible: in particular, Channel doesn't have to know
anything about SSH packet formats, or which SSH protocol version is in
use, or deal with all the fiddly stuff about window sizes - with the
exception that x11fwd.c's implementation of it does have to be able to
ask for a small fixed initial window size for the bodgy system that
distinguishes upstream from downstream X forwardings.
I've taken the opportunity to move the code implementing the detailed
behaviour of agent forwarding out of ssh.c, now that all of it is on
the far side of a uniform interface. (This also means that if I later
implement agent forwarding directly to a Unix socket as an
alternative, it'll be a matter of changing just the one call to
agentf_new() that makes the Channel to plug into a forwarding.)
This is another major source of unexplained 'void *' parameters
throughout the code.
In particular, the currently unused testback.c actually gave the wrong
pointer type to its internal store of the frontend handle - it cast
the input void * to a Terminal *, from which it got implicitly cast
back again when calling from_backend, and nobody noticed. Now it uses
the right type internally as well as externally.
Nearly every part of the code that ever handles a full backend
structure has historically done it using a pair of pointer variables,
one pointing at a constant struct full of function pointers, and the
other pointing to a 'void *' state object that's passed to each of
those.
While I'm modernising the rest of the code, this seems like a good
time to turn that into the same more or less type-safe and less
cumbersome system as I'm using for other parts of the code, such as
Socket, Plug, BinaryPacketProtocol and so forth: the Backend structure
contains a vtable pointer, and a system of macro wrappers handles
dispatching through that vtable.
Same principle again - the more of these structures have globally
visible tags (even if the structure contents are still opaque in most
places), the fewer of them I can mistake for each other.
That's one fewer anonymous 'void *' which might be accidentally
confused with some other pointer type if I misremember the order of
function arguments.
While I'm here, I've made its pointer-nature explicit - that is,
'Ldisc' is now a typedef for the structure type itself rather than a
pointer to it. A stylistic change only, but it feels more natural to
me these days for a thing you're going to eventually pass to a 'free'
function.
Now when we construct a packet containing sensitive data, we just set
a field saying '... and make it take up at least this much space, to
disguise its true size', and nothing in the rest of the system worries
about that flag until ssh2bpp.c acts on it.
Also, I've changed the strategy for doing the padding. Previously, we
were following the real packet with an SSH_MSG_IGNORE to make up the
size. But that was only a partial defence: it works OK against passive
traffic analysis, but an attacker proxying the TCP stream and
dribbling it out one byte at a time could still have found out the
size of the real packet by noting when the dribbled data provoked a
response. Now I put the SSH_MSG_IGNORE _first_, which should defeat
that attack.
But that in turn doesn't work when we're doing compression, because we
can't predict the compressed sizes accurately enough to make that
strategy sensible. Fortunately, compression provides an alternative
strategy anyway: if we've got zlib turned on when we send one of these
sensitive packets, then we can pad out the compressed zlib data as
much as we like by adding empty RFC1951 blocks (effectively chaining
ZLIB_PARTIAL_FLUSHes). So both strategies should now be dribble-proof.
The return value wasn't used to indicate failure; it only indicated
whether any compression was being done at all or whether the
compression method was ssh_comp_none, and we can tell the latter just
as well by the fact that its init function returns a null context
pointer.
Yesterday's reinstatement of ssh_free_pktout revealed - via valgrind
spotting the use-after-free - that the code that prefixed sensible
packets with IV-muddling SSH_MSG_IGNOREs was actually sending a second
copy of the sensible packet in place of the IGNORE, due to a typo.
On the post-userauth rekey, when we're specifically rekeying in order
to turn on delayed compression, we shouldn't write the Event Log
"Server supports delayed compression; will try this later" message
that we did in the original key exchange. At this point, it _is_
later, and we're about to turn on compression right now!
When the whole SSH connection is throttled and then unthrottled, we
need to requeue the callback that transfers data to the Socket from
the new outgoing_data queue introduced in commit 9e3522a97.
The user-visible effect of this missing call was that outgoing SFTP
transactions would lock up, because in SFTP mode we enable the
"simple@putty.projects.tartarus.org" mode and essentially turn off the
per-channel window management, so throttling of the whole connection
becomes the main source of back-pressure.
Now we have the new marshalling system, I think it's outlived its
usefulness, because the new system allows us to directly express
various things (e.g. uint16 and non-zero-terminated strings) that were
actually _more_ awkward to do via the variadic interface. So here's a
rewrite that removes send_packet(), and replaces all its call sites
with something that matches our SSH-2 packet construction idioms.
This diff actually _reduces_ the number of lines of code in ssh.c.
Since the variadic system was trying to save code by centralising
things, that seems like the best possible evidence that it wasn't
pulling its weight!
sshbpp.h now defines a classoid that encapsulates both directions of
an SSH binary packet protocol - that is, a system for reading a
bufchain of incoming data and turning it into a stream of PktIn, and
another system for taking a PktOut and turning it into data on an
outgoing bufchain.
The state structure in each of those files contains everything that
used to be in the 'rdpkt2_state' structure and its friends, and also
quite a lot of bits and pieces like cipher and MAC states that used to
live in the main Ssh structure.
One minor effect of this layer separation is that I've had to extend
the packet dispatch table by one, because the BPP layer can no longer
directly trigger sending of SSH_MSG_UNIMPLEMENTED for a message too
short to have a type byte. Instead, I extend the PktIn type field to
use an out-of-range value to encode that, and the easiest way to make
that trigger an UNIMPLEMENTED message is to have the dispatch table
contain an entry for it.
(That's a system that may come in useful again - I was also wondering
about inventing a fake type code to indicate network EOF, so that that
could be propagated through the layers and be handled by whichever one
currently knew best how to respond.)
I've also moved the packet-censoring code into its own pair of files,
partly because I was going to want to do that anyway sooner or later,
and mostly because it's called from the BPP code, and the SSH-2
version in particular has to be called from both the main SSH-2 BPP
and the bare unencrypted protocol used for connection sharing. While I
was at it, I took the opportunity to merge the outgoing and incoming
censor functions, so that the parts that were common between them
(e.g. CHANNEL_DATA messages look the same in both directions) didn't
need to be repeated.
This mirrors the use of one for incoming wire data: now when we send
raw data (be it the initial greeting, or the output of binary packet
construction), we put it on ssh->outgoing_data, and schedule a
callback to transfer that into the socket.
Partly this is in preparation for delegating the task of appending to
that bufchain to a separate self-contained module that won't have
direct access to the connection's Socket. But also, it has the very
nice feature that I get to throw away the ssh_pkt_defer system
completely! That was there so that we could construct more than one
packet in rapid succession, concatenate them into a single blob, and
pass that blob to the socket in one go so that the TCP headers
couldn't contain any trace of where the boundary between them was. But
now we don't need a separate function to do that: calling the ordinary
packet-send routine twice in the same function before returning to the
main event loop will have that effect _anyway_.
ssh.c has been an unmanageably huge monolith of a source file for too
long, and it's finally time I started breaking it up into smaller
pieces. The first step is to move some declarations - basic types like
packets and packet queues, standard constants, enums, and the
coroutine system - into headers where other files can see them.
It is to put_data what memset is to memcpy. Several places
in the code wanted it already, but not _quite_ enough for me to
have written it with the rest of the BinarySink infrastructure
originally.
This saves a malloc and free every time we add or remove a packet from
a packet queue - it can now be done by pure pointer-shuffling instead
of allocating a separate list node structure.
The body pointer was used after encryption to mark the start of the
fully wire-ready packet by ssh2_pkt_construct, and before encryption
by the log_outgoing_packet functions. Now the former returns a nice
modern ptrlen (it never really needed to store the pointer _in_ the
packet structure anyway), and the latter uses an integer 'prefix'
field, which isn't very different in concept but saves effort on
reallocs.
These were only used in the rdpkt coroutines, during construction of
the incoming packet; once it's complete, they're never touched again.
So really they should have been fields in the rdpkt coroutines' state
- and now they are.
The new memory allocation strategy for incoming packets is to defer
creation of the returned pktin structure until we know how big its
data buffer will really need to be, and then use snew_plus to make the
PktIn and the payload block in the same allocation.
When we have to read and keep some amount of the packet before
allocating the returned structure, we do it by having a persistent
buffer in the rdpkt state, which is retained for the whole connection
and only freed once in ssh_free.
They were duplicating values stored in the BinarySource substructure.
Mostly they're not referred to directly any more (instead, we call
get_foo to access the BinarySource); and when they are, we can switch
to reading the same values back out of the BinarySource anyway.
This is the first stage of massively tidying up this very confused
data structure. In this commit, I replace the unified 'struct Packet'
with two structures PktIn and PktOut, each of which contains only the
fields of struct Packet that are actually used for packets going in
that direction - most notably, PktIn doesn't implement BinarySink, and
PktOut doesn't implement BinarySource.
All uses of the old structure were statically determinable to be one
or the other, so I've done that determination and changed all the
types of variables and function signatures.
Unlike PktIn, PktOut is not reference-counted, so there's a new
ssh_pktout_free function.
The most immediately pleasing thing about this change is that it lets
me finally get rid of the tedious comment explaining how the 'length'
field in struct Packet meant something different depending on
direction. Now it's two fields of the same name in two different
structures, I can comment the same thing much less verbosely!
(I've also got rid of the comment claiming the type field was only
used for incoming packets. That wasn't even true! It might have been
once, because you can write an outgoing packet's type byte straight
into its data buffer, but in fact in the current code pktout->type is
nonetheless used in various other places, e.g. log_outgoing_packet.)
In this commit I've only removed the fields from each structure that
were _already_ unused. There are still quite a few we can get rid of
by _making_ them unused.
After Pavel Kryukov pointed out that I have to put _something_ in the
'ssh_key' structure, I thought of an actually useful thing to put
there: why not make it store a pointer to the ssh_keyalg structure?
Then ssh_key becomes a classoid - or perhaps 'traitoid' is a closer
analogy - in the same style as Socket and Plug. And just like Socket
and Plug, I've also arranged a system of wrapper macros that avoid the
need to mention the 'object' whose method you're invoking twice at
each call site.
The new vtable pointer directly replaces an existing field of struct
ec_key (which was usable by several different ssh_keyalgs, so it
already had to store a pointer to the currently active one), and also
replaces the 'alg' field of the ssh2_userkey structure that wraps up a
cryptographic key with its comment field.
I've also taken the opportunity to clean things up a bit in general:
most of the methods now have new and clearer names (e.g. you'd never
know that 'newkey' made a public-only key while 'createkey' made a
public+private key pair unless you went and looked it up, but now
they're called 'new_pub' and 'new_priv' you might be in with a
chance), and I've completely removed the openssh_private_npieces field
after realising that it was duplicating information that is actually
_more_ conveniently obtained by calling the new_priv_openssh method
(formerly openssh_createkey) and throwing away the result.
This parameter returned a substring of the input, which was used for
two purposes. Firstly, it was used to hash the host and server keys
during the initial SSH-1 key setup phase; secondly, it was used to
check the keys in Pageant against the public key blob of a key
specified on the command line.
Unfortunately, those two purposes didn't agree! The first one needs
just the bare key modulus bytes (without even the SSH-1 mpint length
header); the second needs the entire key blob. So, actually, it seems
to have never worked in SSH-1 to say 'putty -i keyfile' and have PuTTY
find that key in Pageant and not have to ask for the passphrase to
decrypt the version on disk.
Fixed by removing that parameter completely, which simplifies all the
_other_ call sites, and replacing it by custom code in those two
places that each does the actually right thing.
It's an SSH-1 specific function, so it should have a name reflecting
that, and it didn't. Also it had one of those outdated APIs involving
passing it a client-allocated buffer and size. Now it has a sensible
name, and internally it constructs the output string using a strbuf
and returns it dynamically allocated.
Another piece of half-finished machinery that I can't have tested
properly when I set up connection sharing: I had the function
ssh_alloc_sharing_rportfwd which is how sshshare.c asks ssh.c to start
sending it channel-open requests for a given remote forwarded port,
but I had no companion function that removes one of those requests
again when a downstream remote port forwarding goes away (either by
mid-session cancel-tcpip-forward or by the whole downstream
disconnecting).
As a result, the _second_ attempt to set up the same remote port
forwarding, after a sharing downstream had done so once and then
stopped, would quietly fail.
This must have been a bug introduced during the SSH-2 connection
sharing rework. Apparently nobody's ever re-tested SSH-1 X forwarding
since then - until I did so yesterday in the course of testing my
enormous refactor of the packet unmarshalling code.
I must have introduced this bug yesterday when I rewrote the packet
censoring functions using BinarySource. The base pointer passed to
log_packet was pointing at the right place, but the accompanying
length was the gross rather than net one, as it were - it counted the
extra header data we're about to insert at the _start_ of the packet,
so log_packet() was trying to print that many extra bytes at the _end_
and overrunning its buffer.
If the server sends an SSH_MSG_DISCONNECT, then we call
connection_fatal(). But if the server closes the network connection,
then we call connection_fatal(). In situations where the former
happens, the latter happens too.
Currently, calling connection_fatal twice is especially bad on GTK
because all dialogs are now non-modal and an assertion fails in the
GTK front end when two fatal message boxes try to exist at the same
time (the register_dialog system finds that slot is already occupied).
But regardless of that, we'd rather not even _try_ to print two fatal
boxes, because even if the front end doesn't fail an assertion,
there's no guarantee that the _more useful_ one of the messages will
end up being displayed. So a better fix is to have ssh.c make a
sensible decision about which message is the helpful one - in this
case, the actual error message out of the SSH_MSG_DISCONNECT, rather
than the predictable fact of the connection having been slammed shut
immediately afterwards - and only pass that one to the front end in
the first place.
Quite a few of the function pointers in the ssh_keyalg vtable now take
ptrlen arguments in place of separate pointer and length pairs.
Meanwhile, the various key types' implementations of those functions
now work by initialising a BinarySource with the input ptrlen and
using the new decode functions to walk along it.
One exception is the openssh_createkey method which reads a private
key in the wire format used by OpenSSH's SSH-2 agent protocol, which
has to consume a prefix of a larger data stream, and tell the caller
how much of that data was the private key. That function now takes an
actual BinarySource, and passes that directly to the decode functions,
so that on return the caller finds that the BinarySource's read
pointer has been advanced exactly past the private key.
This let me throw away _several_ reimplementations of mpint-reading
functions, one in each of sshrsa, sshdss.c and sshecc.c. Worse still,
they didn't all have exactly the SSH-2 semantics, because the thing in
sshrsa.c whose name suggested it was an mpint-reading function
actually tolerated the wrong number of leading zero bytes, which it
had to be able to do to cope with the "ssh-rsa" signature format which
contains a thing that isn't quite an SSH-2 mpint. Now that deviation
is clearly commented!
This is the function that breaks apart a signature blob (generated
locally or received from an SSH agent) and adds leading zero bytes in
front of the signature integer, if we think we're talking to a server
that will incorrectly insist on that. The breaking-apart process is
just another instance of SSH-style data unmarshalling, so it should be
done by the new centralised routines.
The 'savedpos' field in 'struct Packet', which was already unused on
the output side after I threw away ssh_pkt_addstring_start, is now
unused on the input side too because a BinarySource implementation has
taken over. So it's now completely gone.
I had a one-off 'Strange packet received' error yesterday for an
SSH_MSG_GLOBAL_REQUEST at connection startup. I wasn't able to
reproduce, but examining the packet logs from the successful retry
suggested that the global request in question was an OpenSSH 'here are
my hostkeys' message.
My belief is that in the failing connection that request packet must
have arrived exceptionally promptly, and been added to pq_full before
the preceding SSH_MSG_USERAUTH_SUCCESS had been processed. So the code
that loops along pq_full feeding everything to the dispatch table
would have moved the USERAUTH_SUCCESS on to pq_ssh2_userauth and
scheduled a callback to handle it, and then moved on to the
GLOBAL_REQUEST which would have gone straight to the dispatch table
handler for 'help, we weren't expecting this message'. The userauth
callback would later on have installed a more sensible dispatch table
handler for global requests, but by then, it was too late.
Solution: make a special case during pq_full processing, so that when
we see USERAUTH_SUCCESS we _immediately_ - before continuing to
traverse pq_full - install the initial dispatch table entries for the
ssh-connection protocol. That way, even if connection messages are
already in the queue, we won't get confused by them.
Now I've got FROMFIELD, I can rework it so that structures providing
an implementation of the Socket or Plug trait no longer have to have
the vtable pointer as the very first thing in the structure. In
particular, this means that the ProxySocket structure can now directly
implement _both_ the Socket and Plug traits, which is always
_logically_ how it's worked, but previously it had to be implemented
via two separate structs linked to each other.
During last week's work, I made a mistake in which I got the arguments
backwards in one of the key-blob-generating functions - mistakenly
swapped the 'void *' key instance with the 'BinarySink *' output
destination - and I didn't spot the mistake until run time, because in
C you can implicitly convert both to and from void * and so there was
no compile-time failure of type checking.
Now that I've introduced the FROMFIELD macro that downcasts a pointer
to one field of a structure to retrieve a pointer to the whole
structure, I think I might start using that more widely to indicate
this kind of polymorphic subtyping. So now all the public-key
functions in the struct ssh_signkey vtable handle their data instance
in the form of a pointer to a subfield of a new zero-sized structure
type 'ssh_key', which outside the key implementations indicates 'this
is some kind of key instance but it could be of any type'; they
downcast that pointer internally using FROMFIELD in place of the
previous ordinary C cast, and return one by returning &foo->sshk for
whatever foo they've just made up.
The sshk member is not at the beginning of the structure, which means
all those FROMFIELDs and &key->sshk are actually adding and
subtracting an offset. Of course I could have put the member at the
start anyway, but I had the idea that it's actually a feature _not_ to
have the two types start at the same address, because it means you
should notice earlier rather than later if you absentmindedly cast
from one to the other directly rather than by the approved method (in
particular, if you accidentally assign one through a void * and back
without even _noticing_ you perpetrated a cast). In particular, this
enforces that you can't sfree() the thing even once without realising
you should instead of called the right freekey function. (I found
several bugs by this method during initial testing, so I think it's
already proved its worth!)
While I'm here, I've also renamed the vtable structure ssh_signkey to
ssh_keyalg, because it was a confusing name anyway - it describes the
_algorithm_ for handling all keys of that type, not a specific key. So
ssh_keyalg is the collection of code, and ssh_key is one instance of
the data it handles.
This is a cleanup I started to notice a need for during the BinarySink
work. It removes a lot of faffing about casting things to char * or
unsigned char * so that some API will accept them, even though lots of
such APIs really take a plain 'block of raw binary data' argument and
don't care what C thinks the signedness of that data might be - they
may well reinterpret it back and forth internally.
So I've tried to arrange for all the function call APIs that ought to
have a void * (or const void *) to have one, and those that need to do
pointer arithmetic on the parameter internally can cast it back at the
top of the function. That saves endless ad-hoc casts at the call
sites.
This removes a lot of pointless duplications of those constants.
Of course, _ideally_, I should upgrade to C99 bool throughout the code
base, replacing TRUE and FALSE with true and false and tagging
variables explicitly as bool when that's what they semantically are.
But that's a much bigger piece of work, and shouldn't block this
trivial cleanup!
There was a time, back when the USA was more vigorously against
cryptography, when we toyed with the idea of having a version of PuTTY
that outsourced its cryptographic primitives to the Microsoft optional
encryption API, which would effectively create a tool that acted like
PuTTY proper on a system with that API installed, but automatically
degraded to being PuTTYtel on a system without, and meanwhile (so went
the theory) it could be moved freely across national borders with
crypto restrictions, because it didn't _contain_ any of the actual
crypto.
I don't recall that we ever got it working at all. And certainly the
vestiges of it here and there in the current code are completely
unworkable - they refer to an 'mscrypto.c' that doesn't even exist,
and the ifdefs in the definitions of structures like RSAKey and
MD5Context are not matched by any corresponding ifdefs in the code. So
I ought to have got round to removing it long ago, in order to avoid
misleading anyone.
At the point where the do_ssh2_connection coroutine rewrites lots of
dispatch table entries to point to things other than itself, there's a
possibility that it's too late, because some packets of those types
have already arrived and been put into pq_ssh2_connection. So you'd
get weird errors like 'Strange packet received: type 94', in which the
only thing that's strange is why packet 94 might conceivably be
unexpected, since it's SSH_MSG_CHANNEL_DATA!
(I observed this when 'plink host -nc otherhost:port' became a sharing
downstream, but I have no reason to think that's the only circumstance
where this can possibly occur, or even a reliable one. It just
happened to have the right timing.)
I'm not completely sure this is the _best_ solution, but it seems to
work: at the point when I rewrite the dispatch table, I also return
the whole of the connection packet queue to the main queue, so that
it'll be run through the dispatch table a second time which will
effectively filter out any packets that we've just installed new
handlers for.
This simplifies the client code both in ssh.c and in the client side
of Pageant.
I've cheated a tiny bit by preparing agent requests in a strbuf that
has space reserved at the front for the packet frame, which makes life
easier for the code that sends them off.
Just as I did a few commits ago with the low-level SHA_Bytes type
functions, the ssh_hash and ssh_mac abstract types now no longer have
a direct foo->bytes() update method at all. Instead, each one has a
foo->sink() function that returns a BinarySink with the same lifetime
as the hash context, and then the caller can feed data into that in
the usual way.
This lets me get rid of a couple more duplicate marshalling routines
in ssh.c: hash_string(), hash_uint32(), hash_mpint().
This affects all the functions that generate public and private key
and signature blobs of all kinds, plus ssh_ecdhkex_getpublic. Instead
of returning a bare block of memory and taking an extra 'int *length'
parameter, all these functions now write to a BinarySink, and it's the
caller's job to have prepared an appropriate one where they want the
output to go (usually a strbuf).
The main value of this change is that those blob-generation functions
were chock full of ad-hoc length-counting and data marshalling. You
have only to look at rsa2_{public,private}_blob, for example, to see
the kind of thing I was keen to get rid of!
There's no reason to pass a 'void *' through and then cast it back to
a packet. All those functions are really doing is serialising a pile
of output on to an arbitrary receiver, so it's nicer to use the new
abstract BinarySink type, and then the do_mode function doesn't have
to cast it at all.
This also removes one lingering ugliness from the previous commit, in
the form of the variable 'stringstart' I introduced in ssh2_setup_pty
to work around the removal of the addstring_start system. Now that's
done the sensible way, by constructing the subsidiary terminal-modes
string in a strbuf (taking advantage of the new type-genericity
meaning I can pass that to the underlying functions just as easily as
a struct Packet), and then putting that into the final packet
afterwards.
All previous uses of them are replaced by the new put_* family.
This is a fairly mechanical transformation, which doesn't take full
advantage of the new ways things can be done more usefully. I'll come
back and clean parts of it up in separate patches later; muddling that
together with this main search-and-replace part would make (even more
of) a giant unreadable monolith.
In fact, those functions don't even exist any more. The only way to
get data into a primitive hash state is via the new put_* system. Of
course, that means put_data() is a viable replacement for every
previous call to one of the per-hash update functions - but just
mechanically doing that would have missed the opportunity to simplify
a lot of the call sites.
Lots of functions had really generic names (like 'makekey'), or names
that missed out an important concept (like 'rsakey_pubblob', which
loads a public blob from a _file_ and doesn't generate it from an
in-memory representation at all). Also, the opaque 'int order' that
distinguishes the two formats of public key blob is now a mnemonic
enumeration, and while I'm at it, rsa_ssh1_public_blob takes one of
those as an extra argument.
Added a missing pq_push_front (see commit a41eefff4), without which
the SSH_MSG_USERAUTH_FAILURE was being dropped from the userauth
packet queue before returning to the top of the loop which waits for
it to arrive.
Another piece of fallout from this morning's patch series, which I
didn't notice until I left a session running for more than an hour:
once do_ssh2_transport is told to begin a rekey, it has no way of
knowing _not_ to immediately do another one, and another, and so on.
Added a value RK_NONE to the rekey class enumeration, and set
rekey_class to that immediately after a key exchange completes. Then a
new one won't start until some code actually sets rekey_class to a
nonzero value again.
Returning from the coroutine every time we finish putting together a
packet is wrong, because it means that any further data in the
incoming queue won't get processed until something triggers another
call to the coroutine. In ssh2_rdpkt I got this right - the only
crReturn that _isn't_ due to running out of data is the special one
immediately after a NEWKEYS - but I forgot to fix it the same way in
ssh1_rdpkt.
ssh->current_user_input_fn was not set to NULL when the ssh structure
was initially set up, which meant that with sufficiently eager
typeahead it could accidentally be called while still full of garbage.
And ssh->connshare is freed in more than one place (by ssh_free and
also by do_ssh_close), but only one of those places nulls it out to
stop the other one trying to free it a second time.
This fixes further embarrassing hangs in the wake of the big
restructuring, in which we'd assign a new function pointer into
ssh->current_user_input_fn but not immediately call it to let it
process whatever might already be waiting for it in the user input
queue.
A large part of the purpose of the 20-patch series just past is to
arrange that user input is never dropped on the floor: if you type it
at a moment where the active protocol coroutine has nothing it can
usefully do with it, the default action will now be to leave it on the
user_input queue where it will eventually be picked up by some later
coroutine, or later phase of this one, that does.
And did I _test_ this feature, end to end, just once, before pushing
the giant patch series?
I did not.
It doesn't work, and the reason why it doesn't work is because various
loops that spin round alternating a crReturn with a check of the input
queue do the first crReturn _before_ the first queue check. So if
there's already data in the queue, well, it won't be _dropped_, but it
also won't be passed on immediately to where it needs to be - instead,
it will sit in the queue until you press _another_ key, at which point
a queue check will happen and all your backed-up typeahead data will
come out.
Fixed by restructuring those loops to do a queue check first. This
applies to the final loop in do_ssh2_connection, and all the little
loops during userauth that prompt for usernames, passwords,
passphrases etc via get_userpass_input.
do_ssh2_authconn is no more! In its place are two separate functions
do_ssh2_userauth and do_ssh2_connection, each handling just one of the
three main protocol components of SSH-2, and each with its own
separate packet queue feeding from pq_full via the dispatch table.
This should be entirely NFC: no bugs are fixed as a side effect of
this change. It's mostly done in the hope that further refactoring in
the future might finally be able to split up ssh.c into smaller source
files, for which it's useful to have these two routines separate,
since each one will be talking to a completely different set of
supporting machinery.
This is not as trivial as it sounds, because although
do_ssh2_transport never uses user input, its in/inlen parameter pair
were nonetheless actually doing something, because by setting inlen to
special values -1 or -2 they doubled as a way to initiate a rekey with
a stated textual reason. So _that_ use of in and inlen has to be
replaced. (Good thing too - that was a horribly ugly API!)
I've replaced them with a new pair of fields in the main ssh
structure, a textual "rekey_reason" for the Event Log and an
enumeration "rekey_class" for discrimination within the code. In
particular, that allows me to get rid of the ugly strcmp that was
checking the textual rekey reason to find out whether the rekey was
due to a GSSAPI credentials update - now there's a separate enum value
for that kind of rekey and we can test for that more sensibly.
Even with my love of verbose commit messages, I don't think there's
anything interesting to say about this commit that the previous few
similar ones haven't mentioned already. This is just more of the same
transformations.
The connection protocol didn't have to do anything too complicated
with user input - just remember to check for it and turn it into
SSH1_CMSG_STDIN_DATA - so this is a much less involved change than the
previous user input conversions.
The userauth loop has always had a rather awkward feature whereby some
of its branches have _already_ received an SSH_MSG_USERAUTH_SUCCESS or
SSH_MSG_USERAUTH_FAILURE packet (e.g. because they have to wait for a
packet that might be one of those _or_ a continuation packet of some
kind), whereas other branches go back round to the top of the loop at
the moment that they know one of those will be the next packet to
arrive. So we had a flag 's->gotit' which tells the start of the loop
whether it's already sitting on the success or failure message, or
whether the first thing it should do is to crWait for one.
Now that the packets are coming from a linked list, there's a simpler
way to handle this: the top of the userauth loop _always_ expects to
find a success/failure message as the first thing in the queue, and if
any branch of the auth code finds it's already removed such a message
from the queue, it can simply put it back on the front again before
going back round.
Just like do_ssh1_login, do_ssh2_authconn is now no longer receiving
its packets via function arguments; instead, it's pulling them off a
dedicated PacketQueue populated by particular dispatch table entries,
with the same reference-counting system to stop them being freed when
the pq_full processing function regains control.
This eliminates further ugly bombout()s from the code while waiting
for authentication responses from the user or the SSH agent.
As I mentioned in the previous commit, I've had to keep
ssh2_authconn_input separate from do_ssh2_authconn, and this time the
reason why is thoroughly annoying: it's because do_ssh2_authconn has
changed its prototype to take the Ssh structure as a void * so that it
can be called from the idempotent callback system, whereas
ssh2_authconn_input has to have the type of ssh->current_user_input_fn
which takes the more sensible pointer type 'Ssh'.