Previously, the instant at which we send to the server a request to
enable agent forwarding (the "auth-agent-req@openssh.com" channel
request, or SSH1_CMSG_AGENT_REQUEST_FORWARDING) was also the instant
at which we set a flag indicating that we're prepared to accept
attempts from the server to open a channel to talk to the forwarded
agent. If the server attempts that when we haven't sent a forwarding
request, we treat it with suspicion, and reject it.
But it turns out that at least one SSH server does this, for what
seems to be a _somewhat_ sensible purpose, and OpenSSH accepts it. So,
on the basis that the @openssh.com domain suffix makes them the
arbiters of this part of the spec, I'm following their practice. I've
removed the 'agent_fwd_enabled' flag from both connection layer
implementations, together with the ConnectionLayer method that sets
it; now agent-forwarding CHANNEL_OPENs are gated only on the questions
of whether agent forwarding was permitted in the configuration and
whether an agent actually exists to talk to, and not also whether we
had previously sent a message to the server announcing it.
(The change to this condition is also applied in the SSH-1 agent
forwarding code, mostly for the sake of keeping things parallel where
possible. I think it doesn't actually make a difference in SSH-1,
because in SSH-1, it's not _possible_ for the server to try to open an
agent channel before the main channel is set up, due to the entirely
separate setup phase of the protocol.)
The use case is a proxy host which makes a secondary SSH connection to
a real destination host. A user has run into one of these recently,
announcing a version banner of "SSH-2.0-FudoSSH", which relies on
agent forwarding to authenticate the secondary connection. You connect
to the proxy host and authenticate with a username string of the form
"realusername#real.destination.host", and then, at the start of the
connection protocol, the server immediately opens a channel back to
your SSH agent which it uses to authenticate to the destination host.
And it delays answering any CHANNEL_OPEN requests from the client
until that's all done. For example (seen from the client's POV,
although the server's CHANNEL_OPEN may well have been _sent_ up front
rather than in response to the client's):
client: SSH2_MSG_CHANNEL_OPEN "session"
server: SSH2_MSG_CHANNEL_OPEN "auth-agent@openssh.com"
client: SSH2_MSG_CHANNEL_OPEN_CONFIRMATION to the auth-agent request
<- data is exchanged on the agent channel; proxy host uses
that signature to log in to the destination host ->
server: SSH2_MSG_CHANNEL_OPEN_CONFIRMATION to the session request
With PuTTY, this wasn't working, because at the point when the server
sends the auth-agent CHANNEL_OPEN, we had not yet had any opportunity
to send auth-agent-req (because that has to wait until we've had a
CHANNEL_OPEN_CONFIRMATION). So we were rejecting the server's
CHANNEL_OPEN, which broke this workflow:
client: SSH2_MSG_CHANNEL_OPEN "session"
server: SSH2_MSG_CHANNEL_OPEN "auth-agent@openssh.com"
client: SSH2_MSG_CHANNEL_OPEN_FAILURE to the auth-agent request
(hey, I haven't told you you can do that yet!)
server: SSH2_MSG_CHANNEL_OPEN_FAILURE to the session request
(in that case, no shell session for you!)
This is a sweeping change applied across the whole code base by a spot
of Emacs Lisp. Now, everywhere I declare a vtable filled with function
pointers (and the occasional const data member), all the members of
the vtable structure are initialised by name using the '.fieldname =
value' syntax introduced in C99.
We were already using this syntax for a handful of things in the new
key-generation progress report system, so it's not new to the code
base as a whole.
The advantage is that now, when a vtable only declares a subset of the
available fields, I can initialise the rest to NULL or zero just by
leaving them out. This is most dramatic in a couple of the outlying
vtables in things like psocks (which has a ConnectionLayerVtable
containing only one non-NULL method), but less dramatically, it means
that the new 'flags' field in BackendVtable can be completely left out
of every backend definition except for the SUPDUP one which defines it
to a nonzero value. Similarly, the test_for_upstream method only used
by SSH doesn't have to be mentioned in the rest of the backends;
network Plugs for listening sockets don't have to explicitly null out
'receive' and 'sent', and vice versa for 'accepting', and so on.
While I'm at it, I've normalised the declarations so they don't use
the unnecessarily verbose 'struct' keyword. Also a handful of them
weren't const; now they are.
These were just too footling for even me to bother splitting up into
multiple commits:
- a couple of int -> size_t changes left out of the big-bang commit
0cda34c6f
- a few 'const' added to pointer-type casts that are only going to be
read from (leaving out the const provokes a warning if the pointer
was const _before_ the cast)
- a couple of 'return' statements trying to pass the void return of
one function through to another.
- another missing (void) in a declaration in putty.h (but this one
didn't cause any knock-on confusion).
- a few tweaks to macros, to arrange that they eat a semicolon after
the macro call (extra do ... while (0) wrappers, mostly, and one
case where I had to do it another way because the macro included a
variable declaration intended to remain in scope)
- reworked key_type_to_str to stop putting an unreachable 'break'
statement after every 'return'
- removed yet another type-check of a function loaded from a Windows
system DLL
- and finally, a totally spurious semicolon right after an open brace
in mainchan.c.
The number of people has been steadily increasing who read our source
code with an editor that thinks tab stops are 4 spaces apart, as
opposed to the traditional tty-derived 8 that the PuTTY code expects.
So I've been wondering for ages about just fixing it, and switching to
a spaces-only policy throughout the code. And I recently found out
about 'git blame -w', which should make this change not too disruptive
for the purposes of source-control archaeology; so perhaps now is the
time.
While I'm at it, I've also taken the opportunity to remove all the
trailing spaces from source lines (on the basis that git dislikes
them, and is the only thing that seems to have a strong opinion one
way or the other).
Apologies to anyone downstream of this code who has complicated patch
sets to rebase past this change. I don't intend it to be needed again.
This is another case where a stale pointer bug could have arisen from
a toplevel callback going off after an object was freed.
But here, just adding delete_callbacks_for_context wouldn't help. The
actual context parameter for the callback wasn't mainchan itself; it
was a tiny separate object, allocated to hold just the parameters of
the function the callback wanted to call. So if _those_ parameters
became stale before the callback was triggered, then even
delete_callbacks_for_context wouldn't have been able to help.
Also, mainchan itself would have been freed moments after this
callback was queued, so moving it to be a callback on mainchan itself
wouldn't help.
Solution: move the callback right out to Ssh, by introducing a new
ssh_sw_abort_deferred() which is just like ssh_sw_abort but does its
main work in a toplevel callback. Then ssh.c's existing call to
delete_callbacks_for_context will clean it up if necessary.
This is a general cleanup which has been overdue for some time: lots
of length fields are now the machine word type rather than the (in
practice) fixed 'int'.
In the past, I've had a lot of macros which you call with double
parentheses, along the lines of debug(("format string", params)), so
that the inner parens protect the commas and permit the macro to treat
the whole printf-style argument list as one macro argument.
That's all very well, but it's a bit inconvenient (it doesn't leave
you any way to implement such a macro by prepending another argument
to the list), and now this code base's rules allow C99isms, I can
switch all those macros to using a single pair of parens, using the
C99 ability to say '...' in the parameter list of the #define and get
at the corresponding suffix of the arguments as __VA_ARGS__.
So I'm doing it. I've made the following printf-style macros variadic:
bpp_logevent, ppl_logevent, ppl_printf and debug.
While I'm here, I've also fixed up a collection of conditioned-out
calls to debug() in the Windows front end which were clearly expecting
a macro with a different calling syntax, because they had an integer
parameter first. If I ever have a need to condition those back in,
they should actually work now.
Looks as if I introduced this bug in commit 431f92ade, when I moved
mainchan out into its own source file: the previous version of
mainchan_send_eof conditionalised the setting of mc->eof_sent in the
same if statement that actually sent the EOF, but somehow, in the new
version, only one of those operations was inside the if.
The effect is that in plink -nc mode, if the server sends EOF first,
the client stops listening to standard input at its own end, so it
never knows when to send EOF back and clean things up.
My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a,b,c were int and TRUE was just a
the 'true' defined by stdbool.h, that idiom provokes a warning from
gcc: 'suggest parentheses around assignment used as truth value'!
I think this is the full set of things that ought logically to be
boolean.
One annoyance is that quite a few radio-button controls in config.c
address Conf fields that are now bool rather than int, which means
that the shared handler function can't just access them all with
conf_{get,set}_int. Rather than back out the rigorous separation of
int and bool in conf.c itself, I've just added a similar alternative
handler function for the bool-typed ones.
This commit includes <stdbool.h> from defs.h and deletes my
traditional definitions of TRUE and FALSE, but other than that, it's a
100% mechanical search-and-replace transforming all uses of TRUE and
FALSE into the C99-standardised lowercase spellings.
No actual types are changed in this commit; that will come next. This
is just getting the noise out of the way, so that subsequent commits
can have a higher proportion of signal.
If values are boolean, it's confusing to use & and | in place of &&
and ||. In two of these three cases it was simply a typo and I've used
the other one; in the third, it was a deliberate avoidance of short-
circuit evaluation (and commented as such), but having seen how easy
it is to make the same typo by accident, I've decided it's clearer to
just move the LHS and RHS evaluations outside the expression.
In some compiler modes - notably the one that gtk-config selects when
GTK PuTTY is built for GTK 1 - it's an error to typedef the same thing
twice. 'mainchan' is defined in defs.h, so it doesn't need defining
again where the structure contents are specified.
ssh2connection.c now knows how to unmarshal the message formats for
all the channel requests we'll need to handle when we're the server
and a client sends them. Each one is translated into a call to a new
method in the Channel vtable, which is implemented by a trivial
'always fail' routine in every channel type we know about so far.
Some kinds of channel, even after they've sent EOF in both directions,
still have something to do before they initiate the CLOSE mechanism
and wind up the channel completely. For example, a session channel
with a subprocess running inside it will want to be sure to send the
"exit-status" or "exit-signal" notification, even if that happens
after bidirectional EOF of the data channels.
Previously, the SSH-2 connection layer had the standard policy that
once EOF had been both sent and received, it would start the final
close procedure. There's a method chan_want_close() by which a Channel
could vary this policy in one direction, by indicating that it wanted
the close procedure to commence after EOF was sent in only one
direction. Its parameters are a pair of booleans saying whether EOF
has been sent, and whether it's been received.
Now chan_want_close can vary the policy in the other direction as
well: if it returns FALSE even when _both_ parameters are true, the
connection layer will honour that, and not send CHANNEL_CLOSE. If it
does that, the Channel is responsible for indicating when it _does_
want close later, by calling sshfwd_initiate_close.
Previously, it returned a human-readable string suitable for log
files, which tried to say something useful about the remote end of a
socket. Now it returns a whole SocketPeerInfo structure, of which that
human-friendly log string is just one field, but also some of the same
information - remote IP address and port, in particular - is provided
in machine-readable form where it's available.
In SSH-1, the channel system isn't rich enough to represent the
complicated main shell session, so it's all done with a separate set
of custom message types. But PuTTY now abstracts away that difference,
by representing both as different implementations of the SshChannel
class: ssh1channel is for things that the protocol thinks are 'really'
channels, and ssh1mainchan is for the shell session. All the same
methods are implemented, but generate different wire messages.
This means that the logic to decide _when_ to enable X forwarding,
agent forwarding etc is all centralised into mainchan.c, where it
doesn't have to be repeated for both protocol versions.
It also simplifies the final loop in the connection protocol, which no
longer has to contain the code to move data from the user input
bufchain to the channel's output; that's now done by the mainchan
write method, the same as it is in SSH-2 where mainchan is just like
other channels.
This is in much the same style as the ttymodes revamp, using a header
file which can be included in different ways to either iterate over
_all_ the signals in the known list or just the ones for which a
definition exists on the target OS.
So this doesn't actually _remove_ the horrid pile of ifdefs in
mainchan_rcvd_exit_signal, but at least it puts it somewhere less
intrusive and more reusable.
This gets another big pile of logic out of ssh2connection and puts it
somewhere more central. Now the only thing left in ssh2connection is
the formatting and parsing of the various channel requests; the logic
deciding which ones to issue and what to do about them is devolved to
the Channel implementation, as it properly should be.