This should have been moved over from the main ssh_free function back
when I did the original splitting-up of ssh.c: the transport layer
schedules a timer for rekeying (and also for GSSAPI credential
checks), so when it's freed, it needs to ensure the timer doesn't get
called anyway on a stale pointer.
Two users reported this in the form of an assertion failure in
conf_get_int (when ssh2_transport_timer asks for CONF_ssh_rekey_time,
if the tree234 call inside conf_get_int is confused by the contents of
the freed memory into returning failure). In other circumstances (if
the freed memory has different contents) it manifests as a segfault,
but it's the same underlying bug either way.
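
As a rough illustration of the fix described above (not PuTTY's actual timing code), the sketch below uses a hypothetical miniature timer table. Every name ending in _sketch is invented for the example. The point it shows is that the free function cancels any timer registered against the object before releasing the memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical miniature timer table; PuTTY's real timing code differs. */
typedef void (*timer_fn_sketch)(void *ctx);

struct timer_sketch { timer_fn_sketch fn; void *ctx; bool active; };

#define MAX_TIMERS_SKETCH 8
static struct timer_sketch timers_sketch[MAX_TIMERS_SKETCH];

/* Register a callback for ctx; returns false if the table is full. */
static bool schedule_timer_sketch(timer_fn_sketch fn, void *ctx)
{
    assert(fn != NULL);
    for (int i = 0; i < MAX_TIMERS_SKETCH; i++) {
        if (!timers_sketch[i].active) {
            timers_sketch[i].fn = fn;
            timers_sketch[i].ctx = ctx;
            timers_sketch[i].active = true;
            return true;
        }
    }
    return false;
}

/* Cancel every pending timer that would be called with this context. */
static void expire_timers_for_sketch(void *ctx)
{
    for (int i = 0; i < MAX_TIMERS_SKETCH; i++)
        if (timers_sketch[i].active && timers_sketch[i].ctx == ctx)
            timers_sketch[i].active = false;
}

/* Fire all pending timers once; returns how many callbacks ran. */
static int run_timers_sketch(void)
{
    int fired = 0;
    for (int i = 0; i < MAX_TIMERS_SKETCH; i++) {
        if (timers_sketch[i].active) {
            timers_sketch[i].active = false;
            timers_sketch[i].fn(timers_sketch[i].ctx);
            fired++;
        }
    }
    return fired;
}

struct transport_sketch { int timer_calls; };

static void transport_timer_sketch(void *ctx)
{
    ((struct transport_sketch *)ctx)->timer_calls++;
}

static void transport_free_sketch(struct transport_sketch *t)
{
    expire_timers_for_sketch(t);   /* must happen before the memory goes away */
    free(t);
}
```

Without the expire call in transport_free_sketch, run_timers_sketch would later invoke the callback on freed memory, which is the class of failure the two reports describe.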
My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
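
A small sketch of the style described above, using invented names (blob_sketch, blob_has_prefix_sketch) rather than anything from the PuTTY source: the conceptually boolean variable is declared bool and assigned true or false, while an integer length is still tested implicitly for zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct blob_sketch { const char *data; size_t len; };

/* Conceptually boolean, so declared bool and assigned true/false. */
static bool blob_has_prefix_sketch(struct blob_sketch blob, char prefix)
{
    bool found = false;

    /* An implicit test for zero on an integer is still fine. */
    if (!blob.len)
        return false;

    assert(blob.data != NULL);
    if (blob.data[0] == prefix)
        found = true;
    return found;
}
```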
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
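
To illustrate why such values stay as int, here is a hedged sketch of a three-valued matcher in the spirit of the wc_match item above. It is not PuTTY's wildcard code: wc_match_sketch and its pattern syntax ('*', '?', backslash escape) are assumptions made for the example. It returns 1 for a match, 0 for no match, and -1 when it runs into a malformed pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 = match, 0 = no match, -1 = syntax error in the pattern.
 * Errors are only noticed if matching reaches the malformed part. */
static int wc_match_sketch(const char *pat, const char *s)
{
    assert(pat != NULL && s != NULL);
    while (*pat) {
        if (*pat == '*') {
            /* Try every possible split point for the rest of the pattern. */
            for (;;) {
                int r = wc_match_sketch(pat + 1, s);
                if (r != 0)
                    return r;       /* a match or a syntax error propagates */
                if (!*s)
                    return 0;
                s++;
            }
        }
        if (*pat == '\\') {
            if (!pat[1])
                return -1;          /* dangling escape: malformed pattern */
            pat++;
            if (*pat != *s)
                return 0;
        } else if (*pat == '?') {
            if (!*s)
                return 0;
        } else if (*pat != *s) {
            return 0;
        }
        pat++;
        s++;
    }
    return *s == '\0';
}

/* A caller that cares about the third value spells out the comparisons. */
static const char *wc_describe_sketch(int r)
{
    return r < 0 ? "bad pattern" : r > 0 ? "match" : "no match";
}
```

A caller that only writes 'if (wc_match_sketch(p, s))' would treat a syntax error as a match, which is why avoiding 'true' and 'false' at such call sites is a useful signal.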
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a,b,c were int and TRUE was just a
macro expanding to 1, but now that they're bool and the value is
the 'true' defined by stdbool.h, that idiom provokes a warning from
gcc: 'suggest parentheses around assignment used as truth value'!
I think this is the full set of things that ought logically to be
boolean.
One annoyance is that quite a few radio-button controls in config.c
address Conf fields that are now bool rather than int, which means
that the shared handler function can't just access them all with
conf_{get,set}_int. Rather than back out the rigorous separation of
int and bool in conf.c itself, I've just added a similar alternative
handler function for the bool-typed ones.
This commit includes <stdbool.h> from defs.h and deletes my
traditional definitions of TRUE and FALSE, but other than that, it's a
100% mechanical search-and-replace transforming all uses of TRUE and
FALSE into the C99-standardised lowercase spellings.
No actual types are changed in this commit; that will come next. This
is just getting the noise out of the way, so that subsequent commits
can have a higher proportion of signal.
If values are boolean, it's confusing to use & and | in place of &&
and ||. In two of these three cases it was simply a typo and I've used
the other one; in the third, it was a deliberate avoidance of short-
circuit evaluation (and commented as such), but having seen how easy
it is to make the same typo by accident, I've decided it's clearer to
just move the LHS and RHS evaluations outside the expression.
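
The third case can be sketched as follows. The function names are invented for the example; the point is only that both operands are evaluated unconditionally before being combined with &&, so nothing depends on a non-short-circuiting & that looks like a typo.

```c
#include <assert.h>
#include <stdbool.h>

static int calls_sketch;

/* Stand-in for an operation with a side effect that must always run. */
static bool step_sketch(bool result)
{
    calls_sketch++;
    return result;
}

static bool both_steps_sketch(bool r1, bool r2)
{
    /* Evaluate both sides first, so neither is skipped by short-circuiting. */
    bool first = step_sketch(r1);
    bool second = step_sketch(r2);
    return first && second;
}
```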
This server is NOT SECURE! If anyone is reading this commit message,
DO NOT DEPLOY IT IN A HOSTILE-FACING ENVIRONMENT! Its purpose is to
speak the server end of everything PuTTY speaks on the client side, so
that I can test that I haven't broken PuTTY when I reorganise its
code, even things like RSA key exchange or chained auth methods which
it's hard to find a server that speaks at all.
(For this reason, it's declared with [UT] in the Recipe file, so that
it falls into the same category as programs like testbn, which won't
be installed by 'make install'.)
Working title is 'Uppity', partly for 'Universal PuTTY Protocol
Interaction Test Yoke', but mostly because it looks quite like the
word 'PuTTY' with part of it reversed. (Apparently 'test yoke' is a
very rarely used term meaning something not altogether unlike 'test
harness', which is a bit of a stretch, but it'll do.)
It doesn't actually _support_ everything I want yet. At the moment,
it's a proof of concept only. But it has most of the machinery
present, and the parts it's missing - such as chained auth methods -
should be easy enough to add because I've built in the required
flexibility, in the form of an AuthPolicy object which can request
them if it wants to. However, the current AuthPolicy object is
entirely trivial, and will let in any user with the password "weasel".
(Another way in which this is not a production-ready server is that it
also has no interaction with the OS's authentication system. In
particular, it will not only let in any user with the same password,
but it won't even change uid - it will open shells and forwardings
under whatever user id you started it up as.)
Currently, the program can only speak the SSH protocol on its standard
I/O channels (using the new FdSocket facility), so if you want it to
listen on a network port, you'll have to run it from some kind of
separate listening program similar to inetd. For my own tests, I'm not
even doing that: I'm just having PuTTY spawn it as a local proxy
process, which also conveniently eliminates the risk of anyone hostile
connecting to it.
The bulk of the actual code reorganisation is already done by previous
commits, so this change is _mostly_ just dropping in a new set of
server-specific source files alongside the client-specific ones I
created recently. The remaining changes in the shared SSH code are
numerous, but all minor:
- a few extra parameters to BPP and PPL constructors (e.g. 'are you
in server mode?'), and pass both sets of SSH-1 protocol flags from
the login to the connection layer
- in server mode, unconditionally send our version string _before_
waiting for the remote one
- a new hook in the SSH-1 BPP to handle enabling compression in
server mode, where the message exchange works the other way round
- new code in the SSH-2 BPP to do _deferred_ compression the other
way round (the non-deferred version is still nicely symmetric)
- in the SSH-2 transport layer, some adjustments to do key derivation
either way round (swapping round the identifying letters in the
various hash preimages, and making sure to list the KEXINITs in the
right order)
- also in the SSH-2 transport layer, an if statement that controls
whether we send SERVICE_REQUEST and wait for SERVICE_ACCEPT, or
vice versa
- new ConnectionLayer methods for opening outgoing channels for X and
agent forwardings
- new functions in portfwd.c to establish listening sockets suitable
for remote-to-local port forwarding (i.e. not under the direction
of a Conf the way it's done on the client side).
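
The key-derivation item in the list above can be illustrated with a short sketch. RFC 4253 assigns the letters A, C, E to the client-to-server IV, cipher key and MAC key, and B, D, F to the server-to-client ones. The struct and function names here are hypothetical, and this is not the code from the transport layer itself; it only shows how 'outbound' maps to a different set of letters depending on which end we are.

```c
#include <assert.h>
#include <stdbool.h>

struct kex_letters_sketch { char iv, cipher_key, mac_key; };

/* Pick the hash-preimage letters for one direction of the connection. */
static struct kex_letters_sketch derivation_letters_sketch(bool server_mode,
                                                           bool outbound)
{
    /* Client-to-server is outbound for a client, inbound for a server. */
    bool client_to_server = (outbound != server_mode);
    struct kex_letters_sketch l;

    l.iv         = client_to_server ? 'A' : 'B';
    l.cipher_key = client_to_server ? 'C' : 'D';
    l.mac_key    = client_to_server ? 'E' : 'F';
    return l;
}
```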
Lots of user-facing messages that claim that the 'server' just did
something or other unexpected will now need to be issued _by_ the
server, when the client does the same unexpected thing. So I've
reworded them all to talk about the 'remote side' instead of the
'server', and the SSH-2 key setup messages talk about initialising
inbound and outbound crypto primitives rather than client->server and
server->client.
This is a major code reorganisation in preparation for making this
code base into one that can build an SSH server as well as a client.
(Mostly for purposes of using the server as a regression test suite
for the client, though I have some other possible uses in mind too.
However, it's currently no part of my plan to harden the server to the
point where it can sensibly be deployed in a hostile environment.)
In this preparatory commit, I've broken up the SSH-2 transport and
connection layers, and the SSH-1 connection layer, into multiple
source files, with each layer having its own header file containing
the shared type definitions. In each case, the new source file
contains code that's specific to the client side of the protocol, so
that a new file can be swapped in in its place when building the
server.
Mostly this is just a straightforward moving of code without changing
it very much, but there are a couple of actual changes in the process:
The parsing of SSH-2 global-request and channel-open messages is now
done by a new pair of functions in the client module. For channel
opens, I've invented a new union data type to be the return value from
that function, representing either failure (plus error message),
success (plus Channel instance to manage the new channel), or an
instruction to hand the channel over to a sharing downstream (plus a
pointer to the downstream in question).
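
One plausible shape for such a return type is a tagged union, sketched below. All the type and function names are invented for illustration and the real definitions may well differ; in particular, the rule deciding which outcome applies is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ChannelSketch { const char *type; };
struct DownstreamSketch { int id; };

enum chanopen_outcome_sketch {
    CHANOPEN_FAILURE_SKETCH,
    CHANOPEN_SUCCESS_SKETCH,
    CHANOPEN_DOWNSTREAM_SKETCH
};

struct chanopen_result_sketch {
    enum chanopen_outcome_sketch outcome;   /* says which union arm is valid */
    union {
        struct { const char *error_message; } failure;
        struct { struct ChannelSketch *channel; } success;
        struct { struct DownstreamSketch *share; } downstream;
    } u;
};

static struct ChannelSketch fwd_channel_sketch = { "forwarded-tcpip" };

/* Placeholder policy: a non-null owner means "hand over to a downstream". */
static struct chanopen_result_sketch
parse_chanopen_sketch(const char *type, struct DownstreamSketch *owner)
{
    struct chanopen_result_sketch r;

    assert(type != NULL);
    if (owner) {
        r.outcome = CHANOPEN_DOWNSTREAM_SKETCH;
        r.u.downstream.share = owner;
    } else if (!strcmp(type, "forwarded-tcpip")) {
        r.outcome = CHANOPEN_SUCCESS_SKETCH;
        r.u.success.channel = &fwd_channel_sketch;
    } else {
        r.outcome = CHANOPEN_FAILURE_SKETCH;
        r.u.failure.error_message = "Unsupported channel type requested";
    }
    return r;
}
```

The caller switches on the outcome field and touches only the matching arm of the union.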
Also, the tree234 of remote port forwardings in ssh2connection is now
initialised on first use by the client-specific code, so that's where
its compare function lives. The shared ssh2connection_free() still
takes responsibility for freeing it, but now has to check if it's
non-null first.
The outer shell of the ssh2_lportfwd_open method, for making a
local-to-remote port forwarding, is still centralised in
ssh2connection.c, but the part of it that actually constructs the
outgoing channel-open message has moved into the client code, because
that will have to change depending on whether the channel-open has to
have type direct-tcpip or forwarded-tcpip.
In the SSH-1 connection layer, half the filter_queue method has moved
out into the new client-specific code, but not all of it -
bidirectional channel maintenance messages are still handled
centrally. One exception is SSH_MSG_PORT_OPEN, which can be sent in
both directions, but with subtly different semantics - from server to
client, it's referring to a previously established remote forwarding
(and must be rejected if there isn't one that matches it), but from
client to server it's just a "direct-tcpip" request with no prior
context. So that one is in the client-specific module, and when I add
the server code it will have its own different handler.
The function takes the two KEXINIT packets in their string form,
together with a list of mappings from names to known algorithm
implementations, and returns the selected one of each kind, along with
all the other necessary auxiliary stuff.
This has nice effects on code tidiness (quite a few variables now
become local to the new function instead of living permanently in the
transport layer), but mostly, the idea is to add flexibility by
introducing a convenient place to change the policy for how we write
the negotiation lists in our KEXINIT.
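
For a flavour of the selection step, here is a much-simplified sketch that works on one pair of comma-separated name-lists and applies the RFC 4253 rule: the first name on the client's list that also appears on the server's list wins. It omits the mapping from names to algorithm implementations and all the auxiliary outputs, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the name of length len is an item of the comma-separated list. */
static bool list_contains_sketch(const char *list, const char *name, size_t len)
{
    while (*list) {
        size_t n = strcspn(list, ",");
        if (n == len && !memcmp(list, name, len))
            return true;
        list += n;
        if (*list == ',')
            list++;
    }
    return false;
}

/* Copy the agreed name into out; returns false if there is no common name. */
static bool negotiate_sketch(const char *client_list, const char *server_list,
                             char *out, size_t outsize)
{
    const char *p = client_list;

    assert(client_list && server_list && out);
    while (*p) {
        size_t n = strcspn(p, ",");
        if (n > 0 && n < outsize && list_contains_sketch(server_list, p, n)) {
            memcpy(out, p, n);
            out[n] = '\0';
            return true;
        }
        p += n;
        if (*p == ',')
            p++;
    }
    return false;
}
```

Because the client's ordering decides the outcome, changing how we write our own KEXINIT lists is exactly the kind of policy knob the paragraph above has in mind.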
Somehow I managed to leave that line out in both SSH-1 and SSH-2's
functions for handling DISCONNECT, IGNORE and DEBUG, and in both
cases, only for DISCONNECT. Oops.
This is a new vtable-based abstraction which is passed to a backend in
place of Frontend, and it implements only the subset of the Frontend
functions needed by a backend. (Many other Frontend functions still
exist, notably the wide range of things called by terminal.c providing
platform-independent operations on the GUI terminal window.)
The purpose of making it a vtable is that this opens up the
possibility of creating a backend as an internal implementation detail
of some other activity, by providing just that one backend with a
custom Seat that implements the methods differently.
For example, this refactoring should make it feasible to directly
implement an SSH proxy type, aka the 'jump host' feature supported by
OpenSSH, aka 'open a secondary SSH session in MAINCHAN_DIRECT_TCP
mode, and then expose the main channel of that as the Socket for the
primary connection'. (Which of course you can already do by spawning
'plink -nc' as a separate proxy process, but this would permit it in
the _same_ process without anything getting confused.)
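
A hedged sketch of what the vtable approach buys: below, a hypothetical two-method 'seat' interface is implemented by a custom object that simply captures output in memory, the sort of thing an internal backend could be handed instead of a real GUI or console. The method set and all names are assumptions for the example, not the actual Seat definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct SeatSketch SeatSketch;

struct SeatVtableSketch {
    /* Deliver session output; returns the number of bytes accepted. */
    size_t (*output)(SeatSketch *seat, const char *data, size_t len);
    /* Report that the connection has failed. */
    void (*connection_fatal)(SeatSketch *seat, const char *message);
};

struct SeatSketch { const struct SeatVtableSketch *vt; };

/* Wrappers of the kind a backend would call. */
static size_t seat_output_sketch(SeatSketch *seat, const char *data, size_t len)
{
    return seat->vt->output(seat, data, len);
}
static void seat_connection_fatal_sketch(SeatSketch *seat, const char *message)
{
    seat->vt->connection_fatal(seat, message);
}

/* A custom seat that collects output in a buffer instead of displaying it. */
struct CaptureSeatSketch {
    char buf[64];
    size_t used;
    const char *error;
    SeatSketch seat;            /* embedded interface handed to the backend */
};

static struct CaptureSeatSketch *capture_from_seat_sketch(SeatSketch *seat)
{
    return (struct CaptureSeatSketch *)
        ((char *)seat - offsetof(struct CaptureSeatSketch, seat));
}

static size_t capture_output_sketch(SeatSketch *seat, const char *data,
                                    size_t len)
{
    struct CaptureSeatSketch *cs = capture_from_seat_sketch(seat);
    size_t room = sizeof cs->buf - cs->used;

    if (len > room)
        len = room;             /* accept only what fits */
    memcpy(cs->buf + cs->used, data, len);
    cs->used += len;
    return len;
}

static void capture_fatal_sketch(SeatSketch *seat, const char *message)
{
    capture_from_seat_sketch(seat)->error = message;
}

static const struct SeatVtableSketch capture_vt_sketch = {
    capture_output_sketch, capture_fatal_sketch
};

static void capture_seat_init_sketch(struct CaptureSeatSketch *cs)
{
    cs->used = 0;
    cs->error = NULL;
    cs->seat.vt = &capture_vt_sketch;
}
```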
I've centralised a full set of stub methods in misc.c for the new
abstraction, which allows me to get rid of several annoying stubs in
the previous code. Also, while I'm here, I've moved a lot of
duplicated modalfatalbox() type functions from application main
program files into wincons.c / uxcons.c, which I think saves
duplication overall. (A minor visible effect is that the prefixes on
those console-based fatal error messages will now be more consistent
between applications.)
The variable s->e in ssh2_transport_state should never be freed by
ssh2transport itself, because it's owned by the dh_ctx, so it will be
freed by dh_cleanup.
The problem with OpenSSH delayed compression is that the spec has a
race condition. Compression is enabled when the server sends
USERAUTH_SUCCESS. In the server->client direction, that's fine: the
USERAUTH_SUCCESS packet is not itself compressed, and the next packet
in the same direction is. But in the client->server direction, this
specification relies on there being a moment of half-duplex in the
connection: the client can't send any outgoing packet _after_ whatever
userauth packet the USERAUTH_SUCCESS was a response to, and _before_
finding out whether the response is USERAUTH_SUCCESS or something
else. If it emitted, say, an SSH_MSG_IGNORE or initiated a rekey
(perhaps due to a timeout), then that might cross in the network with
USERAUTH_SUCCESS and the server wouldn't be able to know whether to
treat it as compressed.
My previous solution was to note the presence of delayed compression
options in the server KEXINIT, but not to negotiate them in the
initial key exchange. Instead, we conduct the userauth exchange with
compression="none", and then once userauth has concluded, we trigger
an immediate rekey in which we do accept delayed compression methods -
because of course by that time they're no different from the non-
delayed versions. And that means compression is enabled by the
bidirectional NEWKEYS exchange, which lacks that race condition.
I think OpenSSH itself gets away with this because its layers are
structured so as never to send any such asynchronous transport-layer
message in the middle of userauth. Ours is not. But my cunning plan is
that now that my BPP abstraction includes a queue of packets to be
sent and a callback that processes that queue on to the output raw
data bufchain, it's possible to make that callback terminate early, to
leave any dangerous transport-layer messages unsent while we wait for
a userauth response.
Specifically: if we've negotiated a delayed compression method and not
yet seen USERAUTH_SUCCESS, then ssh2_bpp_handle_output will emit all
packets from its queue up to and including the last one in the
userauth type-code range, and keep back any further ones. The idea is
that _if_ that last userauth message was one that might provoke
USERAUTH_SUCCESS, we don't want to send any difficult things after it;
if it's not (e.g. it's in the middle of some ongoing userauth process
like k-i or GSS) then the userauth layer will know that, and will emit
some further userauth packet on its own initiative which will clue us
in that it's OK to release everything up to and including that one.
(So in particular it wasn't even necessary to forbid _all_ transport-
layer packets during userauth. I could have done that by reordering
the output queue - packets in that queue haven't been assigned their
sequence numbers yet, so that would have been safe - but it's more
elegant not to have to.)
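
The hold-back rule can be sketched as a function over the message types waiting in the output queue. This is an illustration only, not ssh2_bpp_handle_output itself: the queue is reduced to an array of type codes, the function name is invented, and the userauth type-code range is taken to be 50 to 79 as assigned by the SSH-2 RFCs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define USERAUTH_TYPE_MIN_SKETCH 50
#define USERAUTH_TYPE_MAX_SKETCH 79

/* Returns how many packets from the front of the queue may be sent now.
 * holding_back means: delayed compression was negotiated and we have not
 * yet seen USERAUTH_SUCCESS. */
static size_t sendable_prefix_sketch(const int *types, size_t n,
                                     bool holding_back)
{
    size_t sendable = 0;

    assert(types != NULL || n == 0);
    if (!holding_back)
        return n;               /* nothing to worry about: send it all */

    /* Release everything up to and including the last userauth packet. */
    for (size_t i = 0; i < n; i++)
        if (types[i] >= USERAUTH_TYPE_MIN_SKETCH &&
            types[i] <= USERAUTH_TYPE_MAX_SKETCH)
            sendable = i + 1;
    return sendable;
}
```

With a queue of IGNORE (2), USERAUTH_REQUEST (50), KEXINIT (20), the first two packets go out and the KEXINIT waits, which is also why a rekey started at that moment could deadlock.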
One particular case we do have to be careful about is not trying to
initiate a _rekey_ during userauth, if delayed compression is in the
offing. That's because when we start rekeying, ssh2transport stops
sending any higher-layer packets at all, to discourage servers from
trying to ignore the KEXINIT and press on regardless - you don't get
your higher-layer replies until you actually respond to the
lower-layer interrupt. But in this case, if ssh2transport sent a
KEXINIT, which ssh2bpp kept back in the queue to avoid a delayed
compression race and would only send if another userauth packet
followed it, which ssh2transport would never pass on to ssh2bpp's
output queue, there'd be a complete protocol deadlock. So instead I
defer any attempt to start a rekey until after userauth finishes
(using the existing system for starting a deferred rekey at that
moment, which was previously used for the _old_ delayed-compression
strategy, and still has to be here anyway for GSSAPI purposes).
The sshverstring quasi-frontend is passed a Frontend pointer at setup
time, so that it can generate Event Log entries containing the local
and remote version strings and the results of remote bug detection.
I'm promoting that field of sshverstring to a field of the public BPP
structure, so now all BPPs have the right to talk directly to the
frontend if they want to. This means I can move all the log messages
of the form 'Initialised so-and-so cipher/MAC/compression' down into
the BPPs themselves, where they can live exactly alongside the actual
initialisation of those primitives.
It also means BPPs will be able to log interesting things they detect
at any point in the packet stream, which is about to come in useful
for another purpose.
Ian Jackson points out that the Linux kernel has a macro of this name
with the same purpose, and suggests that it's a good idea to use the
same name as they do, so that at least some people reading one code
base might recognise it from the other.
I never really thought very hard about what order FROMFIELD's
parameters should go in, and therefore I'm pleasantly surprised to
find that my order agrees with the kernel's, so I don't have to
permute every call site as part of making this change :-)
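
Assuming the kernel macro in question is container_of (the message itself doesn't spell the name out), the idea looks like this in portable C. The kernel's own definition adds type checking via compiler extensions, which this sketch leaves out, and the struct names are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Given a pointer to a member, recover a pointer to the enclosing struct.
 * Parameter order (ptr, type, member) follows the kernel's macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct PlugSketch { int dummy; };

struct ConnSketch {
    int id;
    struct PlugSketch plug;     /* embedded interface object */
};

/* A callback that is handed only the embedded member. */
static int conn_id_from_plug_sketch(struct PlugSketch *plug)
{
    struct ConnSketch *conn = container_of(plug, struct ConnSketch, plug);
    return conn->id;
}
```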
When I separated out the transport layer into its own source file, I
also reworked the logic deciding when to rekey, and apparently that
rework introduced a braino in which I compared rekey_reason (which is
a pointer) to RK_NONE (which is a value of the enumerated type that
lives in the similarly named variable rekey_class). Oops. The result
was that after the first rekey, the loop would terminate the next time
the transport coroutine got called, because the code just before the
loop had zeroed out rekey_class but not rekey_reason. So there'd be a
rekey on every keypress, or similar.
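
A reduced sketch of the two variables involved, to show which one the test has to look at. Apart from RK_NONE, rekey_class and rekey_reason, which are named above, everything here (the other enumerator, the struct, the helper functions) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rekey_class_sketch { RK_NONE = 0, RK_TIMER_SKETCH };

struct rekey_state_sketch {
    enum rekey_class_sketch rekey_class;   /* what kind of rekey, if any */
    const char *rekey_reason;              /* human-readable explanation */
};

/* Mirrors the situation described: only rekey_class is zeroed out. */
static void clear_rekey_sketch(struct rekey_state_sketch *s)
{
    s->rekey_class = RK_NONE;
}

/* The correct test consults the enumerated class, not the reason pointer. */
static bool rekey_pending_sketch(const struct rekey_state_sketch *s)
{
    assert(s != NULL);
    return s->rekey_class != RK_NONE;
}
```

Testing rekey_reason instead would still report a pending rekey after clear_rekey_sketch, because the stale reason string is left in place.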
I've tried to separate out as many individually coherent changes from
this work as I could into their own commits, but here's where I run
out and have to commit the rest of this major refactoring as a
big-bang change.
Most of ssh.c is now no longer in ssh.c: all five of the main
coroutines that handle layers of the SSH-1 and SSH-2 protocols now
each have their own source file to live in, and a lot of the
supporting functions have moved into the appropriate one of those too.
The new abstraction is a vtable called 'PacketProtocolLayer', which
has an input and output packet queue. Each layer's main coroutine is
invoked from the method ssh_ppl_process_queue(), which is usually
(though not exclusively) triggered automatically when things are
pushed on the input queue. In SSH-2, the base layer is the transport
protocol, and it contains a pair of subsidiary queues by which it
passes some of its packets to the higher SSH-2 layers - first userauth
and then connection, which are peers at the same level, with the
former abdicating in favour of the latter at the appropriate moment.
SSH-1 is simpler: the whole login phase of the protocol (crypto setup
and authentication) is all in one module, and since SSH-1 has no
repeat key exchange, that setup layer abdicates in favour of the
connection phase when it's done.
ssh.c itself is now about a tenth of its old size (which all by itself
is cause for celebration!). Its main job is to set up all the layers,
hook them up to each other and to the BPP, and to funnel data back and
forth between that collection of modules and external things such as
the network and the terminal. Once it's set up a collection of packet
protocol layers, it communicates with them partly by calling methods
of the base layer (and if that's ssh2transport then it will delegate
some functionality to the corresponding methods of its higher layer),
and partly by talking directly to the connection layer no matter where
it is in the stack by means of the separate ConnectionLayer vtable
which I introduced in commit 8001dd4cb, and to which I've now added
quite a few extra methods replacing services that used to be internal
function calls within ssh.c.
(One effect of this is that the SSH-1 and SSH-2 channel storage is now
no longer shared - there are distinct struct types ssh1_channel and
ssh2_channel. That means a bit more code duplication, but on the plus
side, a lot fewer confusing conditionals in the middle of half-shared
functions, and less risk of a piece of SSH-1 escaping into SSH-2 or
vice versa, which I remember has happened at least once in the past.)
The bulk of this commit introduces the five new source files, their
common header sshppl.h and some shared supporting routines in
sshcommon.c, and rewrites nearly all of ssh.c itself. But it also
includes a couple of other changes that I couldn't separate easily
enough:
Firstly, there's new handling for socket EOF, in which ssh.c sets an
'input_eof' flag in the BPP, and the BPP responds by checking a flag that
tells it whether to report the EOF as an error or not. (This is the
main reason for those new BPP_READ / BPP_WAITFOR macros - they can
check the EOF flag every time the coroutine is resumed.)
Secondly, the error reporting itself is changed around again. I'd
expected to put some data fields in the public PacketProtocolLayer
structure that it could set to report errors in the same way as the
BPPs have been doing, but in the end, I decided propagating all those
data fields around was a pain and that even the BPPs shouldn't have
been doing it that way. So I've reverted to a system where everything
calls back to functions in ssh.c itself to report any connection-
ending condition. But there's a new family of those functions,
categorising the possible such conditions by semantics, and each one
has a different set of detailed effects (e.g. how rudely to close the
network connection, what exit status should be passed back to the
whole application, whether to send a disconnect message and/or display
a GUI error box).
I don't expect this to be immediately perfect: of course, the code has
been through a big upheaval, new bugs are expected, and I haven't been
able to do a full job of testing (e.g. I haven't tested every auth or
kex method). But I've checked that it _basically_ works - both SSH
protocols, all the different kinds of forwarding channel, more than
one auth method, Windows and Linux, connection sharing - and I think
it's now at the point where the easiest way to find further bugs is to
let it out into the wild and see what users can spot.