A user reported a phenomenon where running 'plink -shareexists' very
early in the connection would cause the receiving upstream PuTTY to
exit cleanly with the message 'All channels closed' in the log.
That wasn't hard to track down: that happens as a result of the
connection layer callback sharing_no_more_downstreams(), which causes
the connection layer to check whether it has any channels left open,
and if not, to terminate the connection on the grounds that everything
has finished. But it's premature to draw that conclusion if the reason
no channels are open if we haven't _started_ yet! Now we have a
'started' flag which is set when we initialise mainchan, and the
'we're all done now' check will never fire before that flag is set.
But in the course of investigating that, I found a second problem in
the same area: at an even earlier stage of an SSH connection, the
connshare system doesn't _even_ have the real ConnectionLayer pointer
yet. Instead, it has a pointer to a dummy one provided by the
top-level ssh.c, which contains a NULL vtable pointer. So if it calls
sharing_no_more_downstreams on _that_ ConnectionLayer, it will
dereference NULL and crash. So I've filled in cl_dummy's vtable
pointer with a trivial vtable, containing only the one callback
sharing_no_more_downstreams, which itself is a no-op function.
Hopefully that should all be stable now.
Now, instead of a 'const char *' in the static data segment, error
messages returned from backend setup are dynamically allocated and
freed by the caller.
This will allow me to make the messages much more specific (including
errno values and the like). However, this commit is pure refactoring:
I've _just_ changed the allocation policy, and left all the messages
alone.
This is a sweeping change applied across the whole code base by a spot
of Emacs Lisp. Now, everywhere I declare a vtable filled with function
pointers (and the occasional const data member), all the members of
the vtable structure are initialised by name using the '.fieldname =
value' syntax introduced in C99.
We were already using this syntax for a handful of things in the new
key-generation progress report system, so it's not new to the code
base as a whole.
The advantage is that now, when a vtable only declares a subset of the
available fields, I can initialise the rest to NULL or zero just by
leaving them out. This is most dramatic in a couple of the outlying
vtables in things like psocks (which has a ConnectionLayerVtable
containing only one non-NULL method), but less dramatically, it means
that the new 'flags' field in BackendVtable can be completely left out
of every backend definition except for the SUPDUP one which defines it
to a nonzero value. Similarly, the test_for_upstream method only used
by SSH doesn't have to be mentioned in the rest of the backends;
network Plugs for listening sockets don't have to explicitly null out
'receive' and 'sent', and vice versa for 'accepting', and so on.
While I'm at it, I've normalised the declarations so they don't use
the unnecessarily verbose 'struct' keyword. Also a handful of them
weren't const; now they are.
This is the same protocol that PuTTY's connection sharing has been
using for years, to communicate between the downstream and upstream
PuTTYs. I'm now promoting it to be a first-class member of the
protocols list: if you have a server for it, you can select it in the
GUI or on the command line, and write out a saved session that
specifies it.
This would be completely insecure if you used it as an ordinary
network protocol, of course. Not only is it non-cryptographic and wide
open to eavesdropping and hijacking, but it's not even _authenticated_
- it begins after the userauth phase of SSH. So there isn't even the
mild security theatre of entering an easy-to-eavesdrop password, as
there is with, say, Telnet.
However, that's not what I want to use it for. My aim is to use it for
various specialist and niche purposes, all of which involve speaking
it over an 8-bit-clean data channel that is already set up, secured
and authenticated by other methods. There are lots of examples of such
channels:
- a userv(1) invocation
- the console of a UML kernel
- the stdio channels into other kinds of container, such as Docker
- the 'adb shell' channel (although it seems quite hard to run a
custom binary at the far end of that)
- a pair of pipes between PuTTY and a Cygwin helper process
- and so on.
So this protocol is intended as a convenient way to get a client at
one end of any those to run a shell session at the other end. Unlike
other approaches, it will give you all the SSH-flavoured amenities
you're already used to, like forwarding your SSH agent into the
container, or forwarding selected network ports in or out of it, or
letting it open a window on your X server, or doing SCP/SFTP style
file transfer.
Of course another way to get all those amenities would be to run an
ordinary SSH server over the same channel - but this approach avoids
having to manage a phony password or authentication key, or taking up
your CPU time with pointless crypto.
The previous 'name' field was awkwardly serving both purposes: it was
a machine-readable identifier for the backend used in the saved
session format, and it was also used in error messages when Plink
wanted to complain that it didn't support a particular backend. Now
there are two separate name fields for those purposes.
Ever since I reworked the SSH code to have multiple internal packet
queues, there's been a long-standing FIXME in ssh_sendbuffer() saying
that we ought to include the data buffered in those queues as part of
reporting how much data is buffered on standard input.
Recently a user reported that 'proftpd', or rather its 'mod_sftp'
add-on that implements an SFTP-only SSH server, exposes a bug related
to that missing piece of code. The xfer_upload system in sftp.c starts
by pushing SFTP write messages into the SSH code for as long as
sftp_sendbuffer() (which ends up at ssh_sendbuffer()) reports that not
too much data is buffered locally. In fact what happens is that all
those messages end up on the packet queues between SSH protocol
layers, so they're not counted by sftp_sendbuffer(), so we just keep
going until there's some other reason to stop.
Usually the reason we stop is because we've filled up the SFTP
channel's SSH-layer window, so we need the server to send us a
WINDOW_ADJUST before we're allowed to send any more data. So we return
to the main event loop and start waiting for reply packets. And when
the window is moderate (e.g. OpenSSH currently seems to present about
2MB), this isn't really noticeable.
But proftpd presents the maximum-size window of 2^32-1 bytes, and as a
result we just keep shovelling more and more packets into the internal
packet queues until PSFTP has grown to 4GB in size, and only then do
we even return to the event loop and start actually sending them down
the network. Moreover, this happens again at rekey time, because while
a rekey is in progress, ssh2transport stops emptying the queue of
outgoing packets sent by its higher layer - so, again, everything just
keeps buffering up somewhere that sftp_sendbuffer can't see it.
But this commit fixes it! Each PacketProtocolLayer now provides a
vtable method for asking how much data it currently has queued. Most
of them share a default implementation which just returns the newly
added total_size field from their pq_out; the exception is
ssh2transport, which also has to account for data queued in its higher
layer. And ssh_sendbuffer() adds that on to the quantity it already
knew about in other locations, to give a more realistic idea of the
currently buffered data.
The global 'int flags' has always been an ugly feature of this code
base, and I suddenly thought that perhaps it's time to start throwing
it out, one flag at a time, until it's totally unused.
My first target is FLAG_VERBOSE. This was usually set by cmdline.c
when it saw a -v option on the program's command line, except that GUI
PuTTY itself sets it unconditionally on startup. And then various bits
of the code would check it in order to decide whether to print a given
message.
In the current system of front-end abstraction traits, there's no
_one_ place that I can move it to. But there are two: every place that
checked FLAG_VERBOSE has access to either a Seat or a LogPolicy. So
now each of those traits has a query method for 'do I want verbose
messages?'.
A good effect of this is that subsidiary Seats, like the ones used in
Uppity for the main SSH server module itself and the server end of
shell channels, now get to have their own verbosity setting instead of
inheriting the one global one. In fact I don't expect any code using
those Seats to be generating any messages at all, but if that changes
later, we'll have a way to control it. (Who knows, perhaps logging in
Uppity might become a thing.)
As part of this cleanup, I've added a new flag to cmdline_tooltype,
called TOOLTYPE_NO_VERBOSE_OPTION. The unconditionally-verbose tools
now set that, and it has the effect of making cmdline.c disallow -v
completely. So where 'putty -v' would previously have been silently
ignored ("I was already verbose"), it's now an error, reminding you
that that option doesn't actually do anything.
Finally, the 'default_logpolicy' provided by uxcons.c and wincons.c
(with identical definitions) has had to move into a new file of its
own, because now it has to ask cmdline.c for the verbosity setting as
well as asking console.c for the rest of its methods. So there's a new
file clicons.c which can only be included by programs that link
against both cmdline.c _and_ one of the *cons.c, and I've renamed the
logpolicy to reflect that.
These were just too footling for even me to bother splitting up into
multiple commits:
- a couple of int -> size_t changes left out of the big-bang commit
0cda34c6f
- a few 'const' added to pointer-type casts that are only going to be
read from (leaving out the const provokes a warning if the pointer
was const _before_ the cast)
- a couple of 'return' statements trying to pass the void return of
one function through to another.
- another missing (void) in a declaration in putty.h (but this one
didn't cause any knock-on confusion).
- a few tweaks to macros, to arrange that they eat a semicolon after
the macro call (extra do ... while (0) wrappers, mostly, and one
case where I had to do it another way because the macro included a
variable declaration intended to remain in scope)
- reworked key_type_to_str to stop putting an unreachable 'break'
statement after every 'return'
- removed yet another type-check of a function loaded from a Windows
system DLL
- and finally, a totally spurious semicolon right after an open brace
in mainchan.c.
A trawl through the code with -Wmissing-prototypes and
-Wmissing-variable-declarations turned up a lot of things that should
have been internal to a particular source file, but were accidentally
global. Keep the namespace clean by making them all static.
(Also, while I'm here, a couple of them were missing a 'const': the
ONE and ZERO arrays in sshcrcda.c, and EMPTY_WINDOW_TITLE in
terminal.c.)
An assortment of errors: int vs size_t confusion (probably undetected
since the big switchover in commit 0cda34c6f), some outright spurious
parameters after the format string (copy-paste errors), a particularly
silly one in pscp.c (a comma between two halves of what should have
been a single string literal), and a _missing_ format string in ssh.c
(but luckily in a context where the only text that would be wrongly
treated as a format string was error messages generated elsewhere in
PuTTY).
When I'm declaring a local instance of some context structure type to
pass to a function which will pass it in turn to a callback, I've
tended to use a declaration of the form
struct context actx, *ctx = &actx;
so that the outermost caller can initialise the context, and/or read
out fields of it afterwards, by the same syntax 'ctx->foo' that the
callback function will be using. So you get visual consistency between
the two functions that share this context.
It only just occurred to me that there's a much neater way to declare
a context struct of this kind, which still makes 'ctx' behave like a
pointer in the owning function, and doesn't need all that weird
verbiage or a spare variable name:
struct context ctx[1];
That's much nicer! I've switched to doing that in all existing cases I
could find, and also in a couple of cases where I hadn't previously
bothered to do the previous more cumbersome idiom.
The number of people has been steadily increasing who read our source
code with an editor that thinks tab stops are 4 spaces apart, as
opposed to the traditional tty-derived 8 that the PuTTY code expects.
So I've been wondering for ages about just fixing it, and switching to
a spaces-only policy throughout the code. And I recently found out
about 'git blame -w', which should make this change not too disruptive
for the purposes of source-control archaeology; so perhaps now is the
time.
While I'm at it, I've also taken the opportunity to remove all the
trailing spaces from source lines (on the basis that git dislikes
them, and is the only thing that seems to have a strong opinion one
way or the other).
Apologies to anyone downstream of this code who has complicated patch
sets to rebase past this change. I don't intend it to be needed again.
I noticed today that when you use Unix Plink interactively, to run a
tty session on a remote host starting from a local tty, it doesn't
copy the local terminal size into the "pty-req" channel request.
It looks as if this is a bug introduced in 2ca0070f8, when I broke up
ssh.c. Before that, the monolithic Ssh init procedure set
ssh->term_width and ssh->term_height from the Conf you passed in.
Afterwards, variables of the same name existed in both ssh.c *and*
ssh2connection.c (the former so that it can buffer window-size changes
before a connection layer yet exists to pass them on to), but somehow,
*neither* source file remembered to initialise them from the Conf.
Fixed by reinstating the initialisation in ssh.c. (For the same reason
as above: you want the values in Conf to be overwritten if the window
size changes between ssh_init and ssh2_connection_new.)
This is another case where a stale pointer bug could have arisen from
a toplevel callback going off after an object was freed.
But here, just adding delete_callbacks_for_context wouldn't help. The
actual context parameter for the callback wasn't mainchan itself; it
was a tiny separate object, allocated to hold just the parameters of
the function the callback wanted to call. So if _those_ parameters
became stale before the callback was triggered, then even
delete_callbacks_for_context wouldn't have been able to help.
Also, mainchan itself would have been freed moments after this
callback was queued, so moving it to be a callback on mainchan itself
wouldn't help.
Solution: move the callback right out to Ssh, by introducing a new
ssh_sw_abort_deferred() which is just like ssh_sw_abort but does its
main work in a toplevel callback. Then ssh.c's existing call to
delete_callbacks_for_context will clean it up if necessary.
On Windows, PuTTY using the SSH backend could emit messages like
"Network error: Network error: Software caused connection abort".
When ssh.c regained the ability to emit plug_closing error messages in
ed0104c2fe (after it disappeared in fe6caf563c), it came with an extra
"Network error:". But winnet.c already adds that prefix.
This changes it back the way it was, which is also consistent with the
other backends.
This is much simpler than Conf, because I don't expect to have to copy
it around, load or save it to disk (or the Windows registry), or
serialise it between processes. So it can be a straightforward struct.
As yet there's nothing actually _in_ it. I've just created the
structure and arranged to pass it through to all the SSH layers. But
now it's here, it will be a place I can add configuration items as I
find I need them.
This is a fairly shallow patch, which removes the UI and interactions
with external libraries. Some other machinery (which is dead code in
this configuration) is left in place.
Adapted by me from a patch by Jeroen Roovers.
If an Ssh structure is destroyed while its IdempotentCallback
ssh->ic_out_raw is prnding, then the latter will stay on callback.c's
list pointing at the freed memory.
The idea of these is that they centralise the common idiom along the
lines of
if (logical_array_len >= physical_array_size) {
physical_array_size = logical_array_len * 5 / 4 + 256;
array = sresize(array, physical_array_size, ElementType);
}
which happens at a zillion call sites throughout this code base, with
different random choices of the geometric factor and additive
constant, sometimes forgetting them completely, and generally doing a
lot of repeated work.
The new macro sgrowarray(array,size,n) has the semantics: here are the
array pointer and its physical size for you to modify, now please
ensure that the nth element exists, so I can write into it. And
sgrowarrayn(array,size,n,m) is the same except that it ensures that
the array has size at least n+m (so sgrowarray is just the special
case where m=1).
Now that this is a single centralised implementation that will be used
everywhere, I've also gone to more effort in the implementation, with
careful overflow checks that would have been painful to put at all the
previous call sites.
This commit also switches over every use of sresize(), apart from a
few where I really didn't think it would gain anything. A consequence
of that is that a lot of array-size variables have to have their types
changed to size_t, because the macros require that (they address-take
the size to pass to the underlying function).
In commit 0f405ae8a, I arranged to stop reading from the SSH
connection if the in_raw bufchain got too big. But in at least some
tools (this bit me just now with PSCP), nothing actually calls
ssh_check_frozen again when the bufchain clears, so it stays frozen.
Now ssh_check_frozen is non-static, and all the BPP implementations
call it whenever they consume data from ssh->in_raw.
Now that all the call sites are expecting a size_t instead of an int
length field, it's no longer particularly difficult to make it
actually return the pointer,length pair in the form of a ptrlen.
It would be nice to say that simplifies call sites because those
ptrlens can all be passed straight along to other ptrlen-consuming
functions. Actually almost none of the call sites are like that _yet_,
but this makes it possible to move them in that direction in future
(as part of my general aim to migrate ptrlen-wards as much as I can).
But also it's just nicer to keep the pointer and length together in
one variable, and not have to declare them both in advance with two
extra lines of boilerplate.
This is a general cleanup which has been overdue for some time: lots
of length fields are now the machine word type rather than the (in
practice) fixed 'int'.
If the SSH socket is readable, GTK will preferentially give us a
callback to read from it rather than calling its idle functions. That
means the ssh->in_raw bufchain can just keep accumulating data, and
the callback that gets the BPP to take data back off that bufchain
will never be called at all.
The solution is to use sk_set_frozen after a certain point, to stop
reading further data from the socket (and, more importantly, disable
GTK's I/O callback for that fd) until we've had a chance to process
some backlog, and then unfreeze the socket again afterwards.
Annoyingly, that means adding a _second_ 'frozen' flag to Ssh, because
the one we already had has exactly the wrong semantics - it prevents
us from _processing_ our backlog, which is the last thing we want if
the entire problem is that we need that backlog to get smaller! So now
there are two frozen flags, and a big comment explaining the
difference.
The backend_unthrottle function gets called when the backlog on stdout
clears, and it's possible for that to happen _after_ the SSH backend
has terminated the connection and freed all its protocol modules (e.g.
if a protocol error occurred on the network while data was still
waiting to be written to stdout). So ssh_unthrottle should check that
ssh->cl still exists before calling any method of it.
misc.c has always contained a combination of things that are tied
tightly into the PuTTY code base (e.g. they use the conf system, or
work with our sockets abstraction) and things that are pure standalone
utility functions like nullstrcmp() which could quite happily be
dropped into any C program without causing a link failure.
Now the latter kind of standalone utility code lives in the new source
file utils.c, whose only external dependency is on memory.c (for snew,
sfree etc), which in turn requires the user to provide an
out_of_memory() function. So it should now be much easier to link test
programs that use PuTTY's low-level functions without also pulling in
half its bulky infrastructure.
In the process, I came across a memory allocation logging system
enabled by -DMALLOC_LOG that looks long since bit-rotted; in any case
we have much more advanced tools for that kind of thing these days,
like valgrind and Leak Sanitiser, so I've just removed it rather than
trying to transplant it somewhere sensible. (We can always pull it
back out of the version control history if really necessary, but I
haven't used it in at least a decade.)
The other slightly silly thing I did was to give bufchain a function
pointer field that points to queue_idempotent_callback(), and disallow
direct setting of the 'ic' field in favour of calling
bufchain_set_callback which will fill that pointer in too. That allows
the bufchain system to live in utils.c rather than misc.c, so that
programs can use it without also having to link in the callback system
or provide an annoying stub of that function. In fact that's just
allowed me to remove stubs of that kind from PuTTYgen and Pageant!
My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a,b,c were int and TRUE was just a
the 'true' defined by stdbool.h, that idiom provokes a warning from
gcc: 'suggest parentheses around assignment used as truth value'!
I think this is the full set of things that ought logically to be
boolean.
One annoyance is that quite a few radio-button controls in config.c
address Conf fields that are now bool rather than int, which means
that the shared handler function can't just access them all with
conf_{get,set}_int. Rather than back out the rigorous separation of
int and bool in conf.c itself, I've just added a similar alternative
handler function for the bool-typed ones.
This commit includes <stdbool.h> from defs.h and deletes my
traditional definitions of TRUE and FALSE, but other than that, it's a
100% mechanical search-and-replace transforming all uses of TRUE and
FALSE into the C99-standardised lowercase spellings.
No actual types are changed in this commit; that will come next. This
is just getting the noise out of the way, so that subsequent commits
can have a higher proportion of signal.
This server is NOT SECURE! If anyone is reading this commit message,
DO NOT DEPLOY IT IN A HOSTILE-FACING ENVIRONMENT! Its purpose is to
speak the server end of everything PuTTY speaks on the client side, so
that I can test that I haven't broken PuTTY when I reorganise its
code, even things like RSA key exchange or chained auth methods which
it's hard to find a server that speaks at all.
(For this reason, it's declared with [UT] in the Recipe file, so that
it falls into the same category as programs like testbn, which won't
be installed by 'make install'.)
Working title is 'Uppity', partly for 'Universal PuTTY Protocol
Interaction Test Yoke', but mostly because it looks quite like the
word 'PuTTY' with part of it reversed. (Apparently 'test yoke' is a
very rarely used term meaning something not altogether unlike 'test
harness', which is a bit of a stretch, but it'll do.)
It doesn't actually _support_ everything I want yet. At the moment,
it's a proof of concept only. But it has most of the machinery
present, and the parts it's missing - such as chained auth methods -
should be easy enough to add because I've built in the required
flexibility, in the form of an AuthPolicy object which can request
them if it wants to. However, the current AuthPolicy object is
entirely trivial, and will let in any user with the password "weasel".
(Another way in which this is not a production-ready server is that it
also has no interaction with the OS's authentication system. In
particular, it will not only let in any user with the same password,
but it won't even change uid - it will open shells and forwardings
under whatever user id you started it up as.)
Currently, the program can only speak the SSH protocol on its standard
I/O channels (using the new FdSocket facility), so if you want it to
listen on a network port, you'll have to run it from some kind of
separate listening program similar to inetd. For my own tests, I'm not
even doing that: I'm just having PuTTY spawn it as a local proxy
process, which also conveniently eliminates the risk of anyone hostile
connecting to it.
The bulk of the actual code reorganisation is already done by previous
commits, so this change is _mostly_ just dropping in a new set of
server-specific source files alongside the client-specific ones I
created recently. The remaining changes in the shared SSH code are
numerous, but all minor:
- a few extra parameters to BPP and PPL constructors (e.g. 'are you
in server mode?'), and pass both sets of SSH-1 protocol flags from
the login to the connection layer
- in server mode, unconditionally send our version string _before_
waiting for the remote one
- a new hook in the SSH-1 BPP to handle enabling compression in
server mode, where the message exchange works the other way round
- new code in the SSH-2 BPP to do _deferred_ compression the other
way round (the non-deferred version is still nicely symmetric)
- in the SSH-2 transport layer, some adjustments to do key derivation
either way round (swapping round the identifying letters in the
various hash preimages, and making sure to list the KEXINITs in the
right order)
- also in the SSH-2 transport layer, an if statement that controls
whether we send SERVICE_REQUEST and wait for SERVICE_ACCEPT, or
vice versa
- new ConnectionLayer methods for opening outgoing channels for X and
agent forwardings
- new functions in portfwd.c to establish listening sockets suitable
for remote-to-local port forwarding (i.e. not under the direction
of a Conf the way it's done on the client side).
The word 'PuTTY' in the outgoing SSH version string has always
represented the name of the *SSH implementation* as opposed to the
name of the specific program containing it (for example, PSCP and
PSFTP don't announce themselves with a different banner). But I think
that a change from client to server merits a change in that
implementation name, so I'm removing the prefix "PuTTY" from the
constant string sshver[], and moving it to a parameter passed in
separately to ssh_verstring_new, so that the upcoming server can pass
in a different one.
In commit ad0c502ce I forgot to arrange for ssh_connect_ppl to fill in
ppl->logctx, and without it, logevent() was cheerfully throwing away
all those log messages.
This is a new vtable-based abstraction which is passed to a backend in
place of Frontend, and it implements only the subset of the Frontend
functions needed by a backend. (Many other Frontend functions still
exist, notably the wide range of things called by terminal.c providing
platform-independent operations on the GUI terminal window.)
The purpose of making it a vtable is that this opens up the
possibility of creating a backend as an internal implementation detail
of some other activity, by providing just that one backend with a
custom Seat that implements the methods differently.
For example, this refactoring should make it feasible to directly
implement an SSH proxy type, aka the 'jump host' feature supported by
OpenSSH, aka 'open a secondary SSH session in MAINCHAN_DIRECT_TCP
mode, and then expose the main channel of that as the Socket for the
primary connection'. (Which of course you can already do by spawning
'plink -nc' as a separate proxy process, but this would permit it in
the _same_ process without anything getting confused.)
I've centralised a full set of stub methods in misc.c for the new
abstraction, which allows me to get rid of several annoying stubs in
the previous code. Also, while I'm here, I've moved a lot of
duplicated modalfatalbox() type functions from application main
program files into wincons.c / uxcons.c, which I think saves
duplication overall. (A minor visible effect is that the prefixes on
those console-based fatal error messages will now be more consistent
between applications.)
LogContext is now the owner of the logevent() function that back ends
and so forth are constantly calling. Previously, logevent was owned by
the Frontend, which would store the message into its list for the GUI
Event Log dialog (or print it to standard error, or whatever) and then
pass it _back_ to LogContext to write to the currently open log file.
Now it's the other way round: LogContext gets the message from the
back end first, writes it to its log file if it feels so inclined, and
communicates it back to the front end.
This means that lots of parts of the back end system no longer need to
have a pointer to a full-on Frontend; the only thing they needed it
for was logging, so now they just have a LogContext (which many of
them had to have anyway, e.g. for logging SSH packets or session
traffic).
LogContext itself also doesn't get a full Frontend pointer any more:
it now talks back to the front end via a little vtable of its own
called LogPolicy, which contains the method that passes Event Log
entries through, the old askappend() function that decides whether to
truncate a pre-existing log file, and an emergency function for
printing an especially prominent message if the log file can't be
created. One minor nice effect of this is that console and GUI apps
can implement that last function subtly differently, so that Unix
console apps can write it with a plain \n instead of the \r\n
(harmless but inelegant) that the old centralised implementation
generated.
One other consequence of this is that the LogContext has to be
provided to backend_init() so that it's available to backends from the
instant of creation, rather than being provided via a separate API
call a couple of function calls later, because backends have typically
started doing things that need logging (like making network
connections) before the call to backend_provide_logctx. Fortunately,
there's no case in the whole code base where we don't already have
logctx by the time we make a backend (so I don't actually remember why
I ever delayed providing one). So that shortens the backend API by one
function, which is always nice.
While I'm tidying up, I've also moved the printf-style logeventf() and
the handy logevent_and_free() into logging.c, instead of having copies
of them scattered around other places. This has also let me remove
some stub functions from a couple of outlying applications like
Pageant. Finally, I've removed the pointless "_tag" at the end of
LogContext's official struct name.
If you start up two sharing-enabled PuTTYs to the same host
simultaneously, the one that ends up being the downstream can connect
to the upstream before the upstream has provided a ConnectionLayer to
the sharestate, which means that log_downstream() will dereference
cs->parent->cl->frontend to find its Frontend and fail because cl is
NULL.
Fixed by providing a dummy initial ConnectionLayer containing nothing
but a frontend pointer, which is then replaced by the real one later.
The sshverstring quasi-frontend is passed a Frontend pointer at setup
time, so that it can generate Event Log entries containing the local
and remote version strings and the results of remote bug detection.
I'm promoting that field of sshverstring to a field of the public BPP
structure, so now all BPPs have the right to talk directly to the
frontend if they want to. This means I can move all the log messages
of the form 'Initialised so-and-so cipher/MAC/compression' down into
the BPPs themselves, where they can live exactly alongside the actual
initialisation of those primitives.
It also means BPPs will be able to log interesting things they detect
at any point in the packet stream, which is about to come in useful
for another purpose.
Ian Jackson points out that the Linux kernel has a macro of this name
with the same purpose, and suggests that it's a good idea to use the
same name as they do, so that at least some people reading one code
base might recognise it from the other.
I never really thought very hard about what order FROMFIELD's
parameters should go in, and therefore I'm pleasantly surprised to
find that my order agrees with the kernel's, so I don't have to
permute every call site as part of making this change :-)
I think that means that _every_ one of my traitoids is now a struct
containing a vtable pointer as one of its fields (albeit sometimes the
only field), and never just a bare pointer.
All the main backend structures - Ssh, Telnet, Pty, Serial etc - now
describe structure types themselves rather than pointers to them. The
same goes for the codebase-wide trait types Socket and Plug, and the
supporting types SockAddr and Pinger.
All those things that were typedefed as pointers are older types; the
newer ones have the explicit * at the point of use, because that's
what I now seem to be preferring. But whichever one of those is
better, inconsistently using a mixture of the two styles is worse, so
let's make everything consistent.
A few types are still implicitly pointers, such as Bignum and some of
the GSSAPI types; generally this is either because they have to be
void *, or because they're typedefed differently on different
platforms and aren't always pointers at all. Can't be helped. But I've
got rid of the main ones, at least.
A second bug in the area of clean SSH-connection closure: I was
setting the pending_close flag (formerly send_outgoing_eof) and
expecting that once the outgoing backlog was cleared it would cause a
socket closure. But of course the function that does that -
ssh_bpp_output_raw_data_callback() - will only get called if there
_is_ any outgoing backlog to be cleared! So if there was already no
backlog, I would set the pending_close flag and nothing would ever
check it again.
Fixed by manually re-queuing the callback that will check the backlog
and the pending_close flag.
When PuTTY wants to cleanly close an SSH connection, my policy has
been to use shutdown(SHUT_WR) (or rather, sk_write_eof, which ends up
translating into that) to close just the outgoing side of the TCP
connection, and then wait for the server to acknowledge that by
closing its own end.
Mainly the purpose of doing this rather than just immediately closing
the whole socket was that I wanted to make sure any remaining outgoing
packets of ours got sent before the connection was wound up. In
particular, when we send SSH_MSG_DISCONNECT immediately before the
close, we do want that to get through.
But I now think this was a mistake, because it puts us at the mercy of
the server remembering to respond by closing the other direction of
the connection. It might absent-mindedly just continue to sit there
holding the connection open, which would be silly, but if it did
happen, we wouldn't want to sit around waiting in order to close the
client application - we'd rather abandon a socket in that state and
leave it to the OS's network layer to tell the server how silly it was
being.
So now I'm using an in-between strategy: I still wait for outgoing
data to be sent before closing the socket (so the DISCONNECT should
still go out), but once it's gone, I _do_ just close the whole thing
instead of just sending outgoing EOF.