Some upcoming restructuring I've got planned will need to pass output
packets back and forth on queues, as well as input ones. So here's a
change that arranges that we can have a PktInQueue and a PktOutQueue,
sharing most of their implementation via a PacketQueueBase structure
which links together the PacketQueueNode fields in the two packet
structures.
There's a tricksy bit of macro manoeuvring to get all of this
type-checked, so that I can't accidentally link a PktOut on to a
PktInQueue or vice versa. It works by having the main queue functions
wrapped by macros; when receiving a packet structure on input, they
type-check it against the queue structure and then automatically look
up its qnode field to pass to the underlying PacketQueueBase function;
on output, they translate a returned PacketQueueNode back to its
containing packet type by calling a 'get' function pointer.
Now there's a centralised routine in misc.c to do the sanitisation,
which copies data on to an outgoing bufchain. This allows me to remove
from_backend_untrusted() completely from the frontend API, simplifying
code in several places.
Two use cases for untrusted-terminal-data sanitisation were in the
terminal.c prompts handler, and in the collection of SSH-2 userauth
banners. Both of those were writing output to a bufchain anyway, so
it was very convenient to just replace a bufchain_add with
sanitise_term_data and then not have to worry about it again.
There was also a simplistic sanitiser in uxcons.c, which I've now
replaced with a call to the good one - and in wincons.c there was a
FIXME saying I ought to get round to that, which now I have!
Getting it out of the overgrown ssh.c is worthwhile in itself! But
there are other benefits of this reorganisation too.
One is that I get to remove ssh->current_incoming_data_fn, because now
_all_ incoming network data is handled by whatever the current BPP is.
So now we only indirect through the BPP, not through some other
preliminary function pointer _and_ the BPP.
Another is that all _outgoing_ network data is now handled centrally,
including our outgoing version string - which means that a hex dump of
that string now shows up in the raw-data log file, from which it was
previously conspicuous by its absence.
This replaces the previous log(n)^2 algorithm for channel-number
allocation, which binary-searched the space of tree indices using a
log-time call to index234() at each step, with a single log-time pass
down the tree which only has to check the returned channel number
against the returned tree index at each step.
I'm under no illusions that this was a critical performance issue, but
it's been offending my sense of algorithmic elegance for a while.
This is a vtable that wraps up all the functionality required from the
SSH connection layer by associated modules like port forwarding and
connection sharing. This extra layer of indirection adds nothing
useful right now, but when I later separate the SSH-1 and SSH-2
connection layer implementations, it will be convenient for each one
to be able to implement this vtable in terms of its own internal data
structures.
To simplify this vtable, I've moved a lot of the logging duties
relating to connection sharing out of ssh.c into sshshare.c: now it
handles nearly all the logging itself relating to setting up
connection sharing in the first place and downstreams connecting and
disconnecting. The only exception is the 'Reusing a shared connection'
announcement in the console window, which is now done in ssh.c by
detecting downstream status immediately after setup.
The tree234 storing currently active port forwardings - both local and
remote - now lives in portfwd.c, as does the complicated function that
updates it based on a Conf listing the new set of desired forwardings.
Local port forwardings are passed to ssh.c via the same route as
before - once the listening port receives a connection and portfwd.c
knows where it should be directed to (in particular, after the SOCKS
exchange, if any), it calls ssh_send_port_open.
Remote forwardings are now initiated by calling ssh_rportfwd_alloc,
which adds an entry to the rportfwds tree (which _is_ still in ssh.c,
and still confusingly sorted by a different criterion depending on SSH
protocol version) and sends out the appropriate protocol request.
ssh_rportfwd_remove cancels one again, sending a protocol request too.
Those functions look enough like ssh_{alloc,remove}_sharing_rportfwd
that I've merged those into the new pair as well - now allocating an
rportfwd allows you to specify either a destination host/port or a
sharing context, and returns a handy pointer you can use to cancel the
forwarding later.
Clients outside ssh.c - all implementations of Channel - will now not
see the ssh_channel data type itself, but only a subobject of the
interface type SshChannel. All the sshfwd_* functions have become
methods in that interface type's vtable (though, wrapped in the usual
kind of macros, the call sites look identical).
This paves the way for me to split up the SSH-1 and SSH-2 connection
layers and have each one lay out its channel bookkeeping structure as
it sees fit; as long as they each provide an implementation of the
sshfwd_ method family, the types behind that need not look different.
A minor good effect of this is that the sshfwd_ methods are no longer
global symbols, so they don't have to be stubbed in Unix Pageant to
get it to compile.
This was mildly fiddly because there's a single vtable structure that
implements two distinct interface types, one for compression and one
for decompression - and I have actually confused them before now
(commit d4304f1b7), so I think it's important to make them actually be
separate types!
The new version of ssh_hash has the same nice property as ssh2_mac,
that I can make the generic interface object type function directly as
a BinarySink so that clients don't have to call h->sink() and worry
about the separate sink object they get back from that.
This piece of tidying-up has come out particularly well in terms of
saving tedious repetition and boilerplate. I've managed to remove
three pointless methods from every MAC implementation by means of
writing them once centrally in terms of the implementation-specific
methods; another method (hmacmd5_sink) vanished because I was able to
make the interface type 'ssh2_mac' be directly usable as a BinarySink
by way of a new delegation system; and because all the method
implementations can now find their own vtable, I was even able to
merge a lot of keying and output functions that had previously
differed only in length parameters by having them look up the lengths
in whatever vtable they were passed.
This is more or less the same job as the SSH-1 case, only more
extensive, because we have a wider range of ciphers.
I'm a bit disappointed about the AES case, in particular, because I
feel as if it ought to have been possible to arrange to combine this
layer of vtable dispatch with the subsidiary one that selects between
hardware and software implementations of the underlying cipher. I may
come back later and have another try at that, in fact.
The interchangeable system of SSH-1 ciphers previously followed the
same pattern as the backends and the public-key algorithms, in that
all the clients would maintain two separate pointers, one to the
vtable and the other to the individual instance / context. Now I've
merged them, just as I did with those other two, so that you only cart
around a single pointer, which has a vtable pointer inside it and a
type distinguishing it from an instance of any of the other
interchangeable sets of algorithms.
This was a particularly confusing piece of type-danger, because three
different types were passed outside sshshare.c as 'void *' and only
human vigilance prevented one coming back as the wrong one. Now they
all keep their opaque structure tags when they move through other
parts of the code.
There's now an interface called 'Channel', which handles the local
side of an SSH connection-layer channel, in terms of knowing where to
send incoming channel data to, whether to close the channel, etc.
Channel and the previous 'struct ssh_channel' mutually refer. The
latter contains all the SSH-specific parts, and as much of the common
logic as possible: in particular, Channel doesn't have to know
anything about SSH packet formats, or which SSH protocol version is in
use, or deal with all the fiddly stuff about window sizes - with the
exception that x11fwd.c's implementation of it does have to be able to
ask for a small fixed initial window size for the bodgy system that
distinguishes upstream from downstream X forwardings.
I've taken the opportunity to move the code implementing the detailed
behaviour of agent forwarding out of ssh.c, now that all of it is on
the far side of a uniform interface. (This also means that if I later
implement agent forwarding directly to a Unix socket as an
alternative, it'll be a matter of changing just the one call to
agentf_new() that makes the Channel to plug into a forwarding.)
This is another major source of unexplained 'void *' parameters
throughout the code.
In particular, the currently unused testback.c actually gave the wrong
pointer type to its internal store of the frontend handle - it cast
the input void * to a Terminal *, from which it got implicitly cast
back again when calling from_backend, and nobody noticed. Now it uses
the right type internally as well as externally.
Nearly every part of the code that ever handles a full backend
structure has historically done it using a pair of pointer variables,
one pointing at a constant struct full of function pointers, and the
other pointing to a 'void *' state object that's passed to each of
those.
While I'm modernising the rest of the code, this seems like a good
time to turn that into the same more or less type-safe and less
cumbersome system as I'm using for other parts of the code, such as
Socket, Plug, BinaryPacketProtocol and so forth: the Backend structure
contains a vtable pointer, and a system of macro wrappers handles
dispatching through that vtable.
Same principle again - the more of these structures have globally
visible tags (even if the structure contents are still opaque in most
places), the fewer of them I can mistake for each other.
That's one fewer anonymous 'void *' which might be accidentally
confused with some other pointer type if I misremember the order of
function arguments.
While I'm here, I've made its pointer-nature explicit - that is,
'Ldisc' is now a typedef for the structure type itself rather than a
pointer to it. A stylistic change only, but it feels more natural to
me these days for a thing you're going to eventually pass to a 'free'
function.
Now when we construct a packet containing sensitive data, we just set
a field saying '... and make it take up at least this much space, to
disguise its true size', and nothing in the rest of the system worries
about that flag until ssh2bpp.c acts on it.
Also, I've changed the strategy for doing the padding. Previously, we
were following the real packet with an SSH_MSG_IGNORE to make up the
size. But that was only a partial defence: it works OK against passive
traffic analysis, but an attacker proxying the TCP stream and
dribbling it out one byte at a time could still have found out the
size of the real packet by noting when the dribbled data provoked a
response. Now I put the SSH_MSG_IGNORE _first_, which should defeat
that attack.
But that in turn doesn't work when we're doing compression, because we
can't predict the compressed sizes accurately enough to make that
strategy sensible. Fortunately, compression provides an alternative
strategy anyway: if we've got zlib turned on when we send one of these
sensitive packets, then we can pad out the compressed zlib data as
much as we like by adding empty RFC1951 blocks (effectively chaining
ZLIB_PARTIAL_FLUSHes). So both strategies should now be dribble-proof.
The return value wasn't used to indicate failure; it only indicated
whether any compression was being done at all or whether the
compression method was ssh_comp_none, and we can tell the latter just
as well by the fact that its init function returns a null context
pointer.
Yesterday's reinstatement of ssh_free_pktout revealed - via valgrind
spotting the use-after-free - that the code that prefixed sensible
packets with IV-muddling SSH_MSG_IGNOREs was actually sending a second
copy of the sensible packet in place of the IGNORE, due to a typo.
On the post-userauth rekey, when we're specifically rekeying in order
to turn on delayed compression, we shouldn't write the Event Log
"Server supports delayed compression; will try this later" message
that we did in the original key exchange. At this point, it _is_
later, and we're about to turn on compression right now!
When the whole SSH connection is throttled and then unthrottled, we
need to requeue the callback that transfers data to the Socket from
the new outgoing_data queue introduced in commit 9e3522a97.
The user-visible effect of this missing call was that outgoing SFTP
transactions would lock up, because in SFTP mode we enable the
"simple@putty.projects.tartarus.org" mode and essentially turn off the
per-channel window management, so throttling of the whole connection
becomes the main source of back-pressure.
Now we have the new marshalling system, I think it's outlived its
usefulness, because the new system allows us to directly express
various things (e.g. uint16 and non-zero-terminated strings) that were
actually _more_ awkward to do via the variadic interface. So here's a
rewrite that removes send_packet(), and replaces all its call sites
with something that matches our SSH-2 packet construction idioms.
This diff actually _reduces_ the number of lines of code in ssh.c.
Since the variadic system was trying to save code by centralising
things, that seems like the best possible evidence that it wasn't
pulling its weight!
sshbpp.h now defines a classoid that encapsulates both directions of
an SSH binary packet protocol - that is, a system for reading a
bufchain of incoming data and turning it into a stream of PktIn, and
another system for taking a PktOut and turning it into data on an
outgoing bufchain.
The state structure in each of those files contains everything that
used to be in the 'rdpkt2_state' structure and its friends, and also
quite a lot of bits and pieces like cipher and MAC states that used to
live in the main Ssh structure.
One minor effect of this layer separation is that I've had to extend
the packet dispatch table by one, because the BPP layer can no longer
directly trigger sending of SSH_MSG_UNIMPLEMENTED for a message too
short to have a type byte. Instead, I extend the PktIn type field to
use an out-of-range value to encode that, and the easiest way to make
that trigger an UNIMPLEMENTED message is to have the dispatch table
contain an entry for it.
(That's a system that may come in useful again - I was also wondering
about inventing a fake type code to indicate network EOF, so that that
could be propagated through the layers and be handled by whichever one
currently knew best how to respond.)
I've also moved the packet-censoring code into its own pair of files,
partly because I was going to want to do that anyway sooner or later,
and mostly because it's called from the BPP code, and the SSH-2
version in particular has to be called from both the main SSH-2 BPP
and the bare unencrypted protocol used for connection sharing. While I
was at it, I took the opportunity to merge the outgoing and incoming
censor functions, so that the parts that were common between them
(e.g. CHANNEL_DATA messages look the same in both directions) didn't
need to be repeated.
This mirrors the use of one for incoming wire data: now when we send
raw data (be it the initial greeting, or the output of binary packet
construction), we put it on ssh->outgoing_data, and schedule a
callback to transfer that into the socket.
Partly this is in preparation for delegating the task of appending to
that bufchain to a separate self-contained module that won't have
direct access to the connection's Socket. But also, it has the very
nice feature that I get to throw away the ssh_pkt_defer system
completely! That was there so that we could construct more than one
packet in rapid succession, concatenate them into a single blob, and
pass that blob to the socket in one go so that the TCP headers
couldn't contain any trace of where the boundary between them was. But
now we don't need a separate function to do that: calling the ordinary
packet-send routine twice in the same function before returning to the
main event loop will have that effect _anyway_.
ssh.c has been an unmanageably huge monolith of a source file for too
long, and it's finally time I started breaking it up into smaller
pieces. The first step is to move some declarations - basic types like
packets and packet queues, standard constants, enums, and the
coroutine system - into headers where other files can see them.
It is to put_data what memset is to memcpy. Several places
in the code wanted it already, but not _quite_ enough for me to
have written it with the rest of the BinarySink infrastructure
originally.
This saves a malloc and free every time we add or remove a packet from
a packet queue - it can now be done by pure pointer-shuffling instead
of allocating a separate list node structure.
The body pointer was used after encryption to mark the start of the
fully wire-ready packet by ssh2_pkt_construct, and before encryption
by the log_outgoing_packet functions. Now the former returns a nice
modern ptrlen (it never really needed to store the pointer _in_ the
packet structure anyway), and the latter uses an integer 'prefix'
field, which isn't very different in concept but saves effort on
reallocs.
These were only used in the rdpkt coroutines, during construction of
the incoming packet; once it's complete, they're never touched again.
So really they should have been fields in the rdpkt coroutines' state
- and now they are.
The new memory allocation strategy for incoming packets is to defer
creation of the returned pktin structure until we know how big its
data buffer will really need to be, and then use snew_plus to make the
PktIn and the payload block in the same allocation.
When we have to read and keep some amount of the packet before
allocating the returned structure, we do it by having a persistent
buffer in the rdpkt state, which is retained for the whole connection
and only freed once in ssh_free.
They were duplicating values stored in the BinarySource substructure.
Mostly they're not referred to directly any more (instead, we call
get_foo to access the BinarySource); and when they are, we can switch
to reading the same values back out of the BinarySource anyway.
This is the first stage of massively tidying up this very confused
data structure. In this commit, I replace the unified 'struct Packet'
with two structures PktIn and PktOut, each of which contains only the
fields of struct Packet that are actually used for packets going in
that direction - most notably, PktIn doesn't implement BinarySink, and
PktOut doesn't implement BinarySource.
All uses of the old structure were statically determinable to be one
or the other, so I've done that determination and changed all the
types of variables and function signatures.
Unlike PktIn, PktOut is not reference-counted, so there's a new
ssh_pktout_free function.
The most immediately pleasing thing about this change is that it lets
me finally get rid of the tedious comment explaining how the 'length'
field in struct Packet meant something different depending on
direction. Now it's two fields of the same name in two different
structures, I can comment the same thing much less verbosely!
(I've also got rid of the comment claiming the type field was only
used for incoming packets. That wasn't even true! It might have been
once, because you can write an outgoing packet's type byte straight
into its data buffer, but in fact in the current code pktout->type is
nonetheless used in various other places, e.g. log_outgoing_packet.)
In this commit I've only removed the fields from each structure that
were _already_ unused. There are still quite a few we can get rid of
by _making_ them unused.
After Pavel Kryukov pointed out that I have to put _something_ in the
'ssh_key' structure, I thought of an actually useful thing to put
there: why not make it store a pointer to the ssh_keyalg structure?
Then ssh_key becomes a classoid - or perhaps 'traitoid' is a closer
analogy - in the same style as Socket and Plug. And just like Socket
and Plug, I've also arranged a system of wrapper macros that avoid the
need to mention the 'object' whose method you're invoking twice at
each call site.
The new vtable pointer directly replaces an existing field of struct
ec_key (which was usable by several different ssh_keyalgs, so it
already had to store a pointer to the currently active one), and also
replaces the 'alg' field of the ssh2_userkey structure that wraps up a
cryptographic key with its comment field.
I've also taken the opportunity to clean things up a bit in general:
most of the methods now have new and clearer names (e.g. you'd never
know that 'newkey' made a public-only key while 'createkey' made a
public+private key pair unless you went and looked it up, but now
they're called 'new_pub' and 'new_priv' you might be in with a
chance), and I've completely removed the openssh_private_npieces field
after realising that it was duplicating information that is actually
_more_ conveniently obtained by calling the new_priv_openssh method
(formerly openssh_createkey) and throwing away the result.
This parameter returned a substring of the input, which was used for
two purposes. Firstly, it was used to hash the host and server keys
during the initial SSH-1 key setup phase; secondly, it was used to
check the keys in Pageant against the public key blob of a key
specified on the command line.
Unfortunately, those two purposes didn't agree! The first one needs
just the bare key modulus bytes (without even the SSH-1 mpint length
header); the second needs the entire key blob. So, actually, it seems
to have never worked in SSH-1 to say 'putty -i keyfile' and have PuTTY
find that key in Pageant and not have to ask for the passphrase to
decrypt the version on disk.
Fixed by removing that parameter completely, which simplifies all the
_other_ call sites, and replacing it by custom code in those two
places that each does the actually right thing.
It's an SSH-1 specific function, so it should have a name reflecting
that, and it didn't. Also it had one of those outdated APIs involving
passing it a client-allocated buffer and size. Now it has a sensible
name, and internally it constructs the output string using a strbuf
and returns it dynamically allocated.
Another piece of half-finished machinery that I can't have tested
properly when I set up connection sharing: I had the function
ssh_alloc_sharing_rportfwd which is how sshshare.c asks ssh.c to start
sending it channel-open requests for a given remote forwarded port,
but I had no companion function that removes one of those requests
again when a downstream remote port forwarding goes away (either by
mid-session cancel-tcpip-forward or by the whole downstream
disconnecting).
As a result, the _second_ attempt to set up the same remote port
forwarding, after a sharing downstream had done so once and then
stopped, would quietly fail.
This must have been a bug introduced during the SSH-2 connection
sharing rework. Apparently nobody's ever re-tested SSH-1 X forwarding
since then - until I did so yesterday in the course of testing my
enormous refactor of the packet unmarshalling code.
I must have introduced this bug yesterday when I rewrote the packet
censoring functions using BinarySource. The base pointer passed to
log_packet was pointing at the right place, but the accompanying
length was the gross rather than net one, as it were - it counted the
extra header data we're about to insert at the _start_ of the packet,
so log_packet() was trying to print that many extra bytes at the _end_
and overrunning its buffer.
If the server sends an SSH_MSG_DISCONNECT, then we call
connection_fatal(). But if the server closes the network connection,
then we call connection_fatal(). In situations where the former
happens, the latter happens too.
Currently, calling connection_fatal twice is especially bad on GTK
because all dialogs are now non-modal and an assertion fails in the
GTK front end when two fatal message boxes try to exist at the same
time (the register_dialog system finds that slot is already occupied).
But regardless of that, we'd rather not even _try_ to print two fatal
boxes, because even if the front end doesn't fail an assertion,
there's no guarantee that the _more useful_ one of the messages will
end up being displayed. So a better fix is to have ssh.c make a
sensible decision about which message is the helpful one - in this
case, the actual error message out of the SSH_MSG_DISCONNECT, rather
than the predictable fact of the connection having been slammed shut
immediately afterwards - and only pass that one to the front end in
the first place.
Quite a few of the function pointers in the ssh_keyalg vtable now take
ptrlen arguments in place of separate pointer and length pairs.
Meanwhile, the various key types' implementations of those functions
now work by initialising a BinarySource with the input ptrlen and
using the new decode functions to walk along it.
One exception is the openssh_createkey method which reads a private
key in the wire format used by OpenSSH's SSH-2 agent protocol, which
has to consume a prefix of a larger data stream, and tell the caller
how much of that data was the private key. That function now takes an
actual BinarySource, and passes that directly to the decode functions,
so that on return the caller finds that the BinarySource's read
pointer has been advanced exactly past the private key.
This let me throw away _several_ reimplementations of mpint-reading
functions, one in each of sshrsa, sshdss.c and sshecc.c. Worse still,
they didn't all have exactly the SSH-2 semantics, because the thing in
sshrsa.c whose name suggested it was an mpint-reading function
actually tolerated the wrong number of leading zero bytes, which it
had to be able to do to cope with the "ssh-rsa" signature format which
contains a thing that isn't quite an SSH-2 mpint. Now that deviation
is clearly commented!
This is the function that breaks apart a signature blob (generated
locally or received from an SSH agent) and adds leading zero bytes in
front of the signature integer, if we think we're talking to a server
that will incorrectly insist on that. The breaking-apart process is
just another instance of SSH-style data unmarshalling, so it should be
done by the new centralised routines.
The 'savedpos' field in 'struct Packet', which was already unused on
the output side after I threw away ssh_pkt_addstring_start, is now
unused on the input side too because a BinarySource implementation has
taken over. So it's now completely gone.
I had a one-off 'Strange packet received' error yesterday for an
SSH_MSG_GLOBAL_REQUEST at connection startup. I wasn't able to
reproduce, but examining the packet logs from the successful retry
suggested that the global request in question was an OpenSSH 'here are
my hostkeys' message.
My belief is that in the failing connection that request packet must
have arrived exceptionally promptly, and been added to pq_full before
the preceding SSH_MSG_USERAUTH_SUCCESS had been processed. So the code
that loops along pq_full feeding everything to the dispatch table
would have moved the USERAUTH_SUCCESS on to pq_ssh2_userauth and
scheduled a callback to handle it, and then moved on to the
GLOBAL_REQUEST which would have gone straight to the dispatch table
handler for 'help, we weren't expecting this message'. The userauth
callback would later on have installed a more sensible dispatch table
handler for global requests, but by then, it was too late.
Solution: make a special case during pq_full processing, so that when
we see USERAUTH_SUCCESS we _immediately_ - before continuing to
traverse pq_full - install the initial dispatch table entries for the
ssh-connection protocol. That way, even if connection messages are
already in the queue, we won't get confused by them.