Just as I did to do_ssh1_login, I'm removing the 'in' and 'inlen'
parameters from the combined SSH-2 userauth + connection coroutine, in
favour of it reading directly from ssh->user_input, and in particular
also allowing get_userpass_input to do so on its behalf.
Similarly to the previous case, I've completely emptied the wrapper
function ssh2_authconn_input(), and again, there is a reason I can't
quite get away with removing it...
This introduces the first of several filtered PacketQueues that
receive subsets of pq_full destined for a particular coroutine. The
wrapper function ssh1_coro_wrapper_initial, whose purpose I just
removed in the previous commit, now gains the replacement purpose of
accepting a packet as a function argument and putting it on the new
queue for do_ssh1_login to handle when it's ready. That wrapper in
turn is called from the packet-type dispatch table, meaning that the
new pq_ssh1_login will be filtered down to only the packets destined
for this coroutine.
This is the point where I finally start using the reference counting
system that I added to 'struct Packet' about a dozen commits ago. The
general packet handling calls ssh_unref_packet for everything that
it's just pulled off pq_full and handed to a dispatch-table function -
so _this_ dispatch-table function, which needs the packet not to be
freed because it's going to go on to another queue and wait to be
handled there, can arrange that by incrementing its ref count.
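In outline, the login-phase wrapper now amounts to something like this
(a sketch only: pq_push, the refcount field and the callback field are
assumed names, not necessarily the exact identifiers in ssh.c):

    static void ssh1_coro_wrapper_initial(Ssh ssh, struct Packet *pktin)
    {
        pktin->refcount++;   /* keep the packet alive past the dispatch loop's unref */
        pq_push(&ssh->pq_ssh1_login, pktin);
        queue_idempotent_callback(&ssh->ssh1_login_icb);  /* wake do_ssh1_login */
    }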
This completes the transformation of do_ssh1_login into a function
with a trivial argument list, whose job is to read from a pair of
input queues (one for user keyboard input and one for SSH packets) and
respond by taking action directly rather than returning a value to its
caller to request action.
It also lets me get rid of a big pile of those really annoying
bombout() calls that I used to work around the old coroutine system's
inability to deal with receiving an SSH packet when the control flow
was in the middle of waiting for some other kind of input. That was
always the weakness of the coroutine structure of this code, which I
accepted as the price for the convenience of coroutines the rest of
the time - but now I think I've got the benefits without that cost :-)
The one remaining argument to do_ssh1_login is the main Ssh structure,
although I've had to turn it into a void * to make the function's type
compatible with the idempotent callback mechanism, since that will be
calling do_ssh1_login directly when its input queue needs looking at.
Now it does its post-completion work itself, instead of telling its
caller to do the same. So that caller, ssh1_coro_wrapper_initial, is
now a _completely_ trivial wrapper - but I'm not taking the
opportunity to fold the two functions together completely, because the
wrapper is going to acquire a new purpose in the next commit :-)
This is the first refactoring of a major coroutine enabled by adding
the ssh->user_input queue. Now, instead of receiving a fixed block of
parameter data, do_ssh1_login reads directly from the user input
bufchain.
In particular, I can get rid of all the temporary bufchains I
constructed to pass to get_userpass_input (or rather, the ones in this
particular function), because now we can let get_userpass_input read
directly from the main user_input bufchain, and it will read only as
much data as it has an interest in, and leave the rest as type-ahead
for future prompts or the main session.
After the last few commits, neither incoming SSH packets nor incoming
user input goes through those functions any more - each of those
directions of data goes into a queue and from there to a callback
specifically processing that queue. So the centralised top-level
protocol switching functions have nothing left to switch, and can go.
This change introduces a new bufchain ssh->user_input, into which we
put every byte received via back->send() (i.e. keyboard input from the
GUI PuTTY window, data read by Plink from standard input, or outgoing
SCP/SFTP protocol data made up by the file transfer utilities).
Just like ssh->incoming_data, there's also a function pointer
ssh->current_user_input_fn which says who currently has responsibility
for pulling data back off that bufchain and processing it. So that can
be changed over when the connection moves into a different major phase.
At the moment, nothing very interesting is being done with this
bufchain: each phase of the connection has its own little function
that pulls chunks back out of it with bufchain_prefix and passes them
to the unchanged main protocol coroutines. But this is groundwork for
being able to switch over each of those coroutines in turn to read
directly from ssh->user_input, with the aim of eliminating a
collection of annoying bugs in which typed-ahead data is accidentally
discarded at an SSH phase transition.
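A sketch of the shape of this (illustrative only; the exact signature
of current_user_input_fn and the surrounding helper names are
assumptions rather than the real code):

    static int ssh_send(void *handle, char *buf, int len)
    {
        Ssh ssh = (Ssh) handle;

        if (ssh == NULL || ssh->s == NULL)
            return 0;

        bufchain_add(&ssh->user_input, buf, len);
        if (ssh->current_user_input_fn)
            ssh->current_user_input_fn(ssh);   /* whichever phase owns the queue right now */

        return ssh_sendbuffer(ssh);            /* report the backlog, as before */
    }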
This flag was used to indicate that ssh1_protocol (or, as of the
previous commit, ssh1_coro_wrapper) should stop passing packets to
do_ssh1_login and start passing them to do_ssh1_connection.
Now, instead of using a flag, we simply have two separate versions of
ssh1_coro_wrapper for the two phases, and indicate the change by
rewriting all the entries in the dispatch table. So now we _just_ have
a function-pointer dereference per packet, rather than one of those
and then a flag check.
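So the phase transition reduces to a loop along these lines (a sketch;
the name of the session-phase wrapper is an assumption):

    for (i = 0; i < 256; i++)
        if (ssh->packet_dispatch[i] == ssh1_coro_wrapper_initial)
            ssh->packet_dispatch[i] = ssh1_coro_wrapper_session;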
After the previous two refactorings, there's no longer any need to
pass packets to ssh1_protocol or ssh2_protocol so that each one can do
its own thing with them, because now the handling is the same in both
cases: first call the general type-independent packet processing code
(if any), and then call the dispatch table entry for the packet type
(which now always exists).
In SSH-2, every possible packet type code has a non-NULL entry in the
dispatch table, even if most of them are just ssh2_msg_unimplemented.
In SSH-1, some dispatch table entries are NULL, which means that the
code processing the dispatch table has to have some SSH-1 specific
fallback logic.
Now I've put the fallback logic in a separate function, and replaced
the NULL table entries with pointers to that function, so that another
pointless difference between the SSH-1 and SSH-2 code is removed.
NFC: I'm just moving a small piece of code out into a separate
function, which does processing on incoming SSH-2 packets that is
completely independent of the packet type. (Specifically, we count up
the total amount of data so far transferred, and use it to trigger a
rekey when we get over the per-session-key data limit.)
The aim is that I'll be able to call this function from a central
location that's not SSH-2 specific, by using a function pointer that
points to this function in SSH-2 mode or is null in SSH-1 mode.
I've completely removed the top-level coroutine ssh_gotdata(), and
replaced it with a system in which ssh_receive (which is a plug
function, i.e. called directly from the network code) simply adds the
incoming data to a new bufchain called ssh->incoming_data, and then
queues an idempotent callback to ensure that whatever function is
currently responsible for the top-level handling of wire data will be
invoked in the near future.
So the decisions that ssh_gotdata was previously making are now made
by changing which function is invoked by that idempotent callback:
when we finish doing SSH greeting exchange and move on to the packet-
structured main phase of the protocol, we just change
ssh->current_incoming_data_fn and ensure that the new function gets
called to take over anything still outstanding in the queue.
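A sketch of the two pieces (field names beyond incoming_data and
current_incoming_data_fn are illustrative):

    static int ssh_receive(Plug plug, int urgent, char *data, int len)
    {
        Ssh ssh = (Ssh) plug;
        bufchain_add(&ssh->incoming_data, data, len);
        queue_idempotent_callback(&ssh->incoming_data_consumer);
        return 1;
    }

    static void ssh_process_incoming_data(void *ctx)  /* the callback's target */
    {
        Ssh ssh = (Ssh) ctx;
        if (!ssh->frozen)
            ssh->current_incoming_data_fn(ssh);  /* greeting handler, then rdpkt, ... */
    }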
This simplifies the _other_ end of the API of the rdpkt functions. In
the previous commit, they stopped returning their 'struct Packet'
directly, and instead put it on a queue; in this commit, they're no
longer receiving a (data, length) pair in their parameter list, and
instead, they're just reading from ssh->incoming_data. So now, API-
wise, they take no arguments at all except the main 'ssh' state
structure.
It's not just the rdpkt functions that needed to change, of course.
The SSH greeting handlers have also had to switch to reading from
ssh->incoming_data, and are quite substantially rewritten as a result.
(I think they look simpler in the new style, personally.)
This new bufchain takes over from the previous queued_incoming_data,
which was only used at all in cases where we throttled the entire SSH
connection. Now, data is unconditionally left on the new bufchain
whether we're throttled or not, and the only question is whether we're
currently bothering to read it; so all the special-purpose code to
read data from a bufchain and pass it to rdpkt can go away, because
rdpkt itself already knows how to do that job.
One slightly fiddly point is that we now have to defer processing of
EOF from the SSH server: if we have data already in the incoming
bufchain and then the server slams the connection shut, we want to
process the data we've got _before_ reacting to the remote EOF, just
in case that data gives us some reason to change our mind about how we
react to the EOF, or a last-minute important piece of data we might
need to log.
Each of the coroutines that parses the incoming wire data into a
stream of 'struct Packet' now delivers those packets to a PacketQueue
called ssh->pq_full (containing the full, unfiltered stream of all
packets received on the SSH connection), replacing the old API in
which each coroutine would directly return a 'struct Packet *' to its
caller, or NULL if it didn't have one ready yet.
This simplifies the function-call API of the rdpkt coroutines (they
now return void). It increases the complexity at the other end,
because we've now got a function ssh_process_pq_full (scheduled as an
idempotent callback whenever rdpkt appends anything to the queue)
which pulls things out of the queue and passes them to ssh->protocol.
But that's only a temporary complexity increase; by the time I finish
the upcoming stream of refactorings, there won't be two chained
functions there any more.
One small workaround I had to add in this commit is a flag called
'pending_newkeys', which ssh2_rdpkt sets when it has just queued an
SSH2_MSG_NEWKEYS packet; it then waits for the transport layer to
process the NEWKEYS and set up the new encryption context before
processing any more wire data. This wasn't necessary before, because
the old architecture was naturally synchronous - ssh2_rdpkt would
return a NEWKEYS, which would be immediately passed to
do_ssh2_transport, which would finish processing it immediately, and
by the time ssh2_rdpkt was next called, the keys would already be in
place.
This change adds a big while loop around the whole of each rdpkt
function, so it's easiest to read it as a whitespace-ignored diff.
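The tail of the per-packet loop now looks roughly like this (a sketch;
pq_push and the callback field are illustrative names):

        pq_push(&ssh->pq_full, st->pktin);
        queue_idempotent_callback(&ssh->pq_full_consumer);  /* runs ssh_process_pq_full */

        if (st->pktin->type == SSH2_MSG_NEWKEYS) {
            /*
             * Can't parse any further wire data until do_ssh2_transport
             * has processed the NEWKEYS and installed the new incoming
             * cipher and MAC, so park the coroutine here.
             */
            ssh->pending_newkeys = TRUE;
            while (ssh->pending_newkeys)
                crReturnV;
        }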
NFC for the moment, because the bufchain is always specially
constructed to hold exactly the same data that would have been passed
in to the function as a (pointer,length) pair. But this API change
allows get_userpass_input to express the idea that it consumed some
but not all of the data in the bufchain, which means that later on
I'll be able to point the same function at a longer-lived bufchain
containing the full stream of keyboard input and avoid dropping
keystrokes that arrive too quickly after the end of an interactive
password prompt.
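So each call site becomes, roughly, the following transitional idiom
(a sketch; the variable names are whatever the surrounding coroutine
already used):

        bufchain tmp;
        int ret;

        bufchain_init(&tmp);
        if (in != NULL)
            bufchain_add(&tmp, in, inlen);       /* exactly the same bytes as before */
        ret = get_userpass_input(s->cur_prompt, &tmp);
        bufchain_clear(&tmp);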
The crWaitUntil macros have do-while type semantics, i.e. they always
crReturn _at least_ once, and then perhaps more times if their
termination condition is still not met. But sometimes a coroutine will
want to wait for a condition that may _already_ be true - the key
examples being non-emptiness of a bufchain or a PacketQueue, which may
already be non-empty in spite of you having just removed something
from its head.
In that situation, it's obviously more convenient not to bother with a
crReturn in the first place than to do one anyway and have to fiddle
about with toplevel callbacks to make sure we resume later. So here's
a new pair of macros crMaybeWaitUntil{,V}, which have the semantics of
while rather than do-while, i.e. they test the condition _first_ and
don't return at all if it's already met.
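Expressed in terms of the existing macros, the difference is roughly
this (the real definitions in ssh.c may differ in detail):

    /* old: always hand control back at least once */
    #define crWaitUntilV(c)       do { crReturnV; } while (!(c))

    /* new: test first; don't return at all if the condition already holds */
    #define crMaybeWaitUntilV(c)  while (!(c)) { crReturnV; }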
This is another piece of not-yet-used infrastructure, which later on
will simplify my life when I start processing PacketQueues and adding
some of their packets to other PacketQueues, because this way the code
can unref every packet removed from the source queue in the same way,
whether or not the packet is actually finished with.
This is just a linked list of 'struct Packet' with a convenience API
on the front. As yet it's unused, so ssh.c will currently not compile
with gcc -Werror unless you also add -Wno-unused-function. But all the
functions I've added here will be used in later commits in the current
patch series, so that's only a temporary condition.
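Something along these lines (a sketch; the real structure and function
names may differ, and the 'next' link in struct Packet is an
assumption):

    struct PacketQueue {
        struct Packet *head, *tail;
    };

    static void pq_push(struct PacketQueue *pq, struct Packet *pkt)
    {
        pkt->next = NULL;
        if (pq->tail)
            pq->tail->next = pkt;
        else
            pq->head = pkt;
        pq->tail = pkt;
    }

    static struct Packet *pq_pop(struct PacketQueue *pq)
    {
        struct Packet *pkt = pq->head;
        if (pkt) {
            pq->head = pkt->next;
            if (!pq->head)
                pq->tail = NULL;
        }
        return pkt;
    }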
Previously it was local, which _mostly_ worked, except that if the SSH
host key needed verifying via a non-modal dialog box, there could be a
crReturn in between writing it and reading it.
It's pretty tempting to suggest that because nobody has noticed this
before, SSH-1 can't be needed any more! But actually I suspect the
intervening crReturn has only appeared since the last release,
probably around November when I was messing about with GTK dialog box
modality. (I observed the problem just now on the GTK build, while
trying to check that a completely different set of changes hadn't
broken SSH-1.)
It hasn't been used since 2012, when commit 8e0ab8be5 introduced a new
method of getting the do_ssh2_authconn coroutine started, and didn't
notice that the variable we were previously using was now completely
unused.
The 2-minutely check to see whether new GSS credentials need to be
forwarded to the server is pointless if we're not even in the mode
where we _have_ forwarded a previous set.
This was made obvious by the overly verbose diagnostic fixed in the
previous commit, so it's a good thing that bug was temporarily there!
Now we don't generate that message as a side effect of the periodic
check for new GSS credentials; we only generate it as part of the much
larger slew of messages that happen during a rekey.
Commit d515e4f1a went through a lot of very different shapes before it
was finally pushed. In some of them, GSS kex had its own value in the
kex enumeration, but it was used in ssh.c but not in config.c
(because, as in the final version, it wasn't configured by the same
drag-list system as the rest of them). So we had to distinguish the
set of key exchange ids known to the program as a whole from the set
controllable in the configuration.
In the final version, GSS kex ended up even more separated from the
kex enumeration than that: the enum value KEX_GSS_SHA1_K5 isn't used
at all. Instead, GSS key exchange appears in the list at the point of
translation from the list of enum values into the list of pointers to
data structures full of kex methods.
But after all the changes, everyone involved forgot to revert the part
of the patch which split KEX_MAX in two and introduced the pointless
value KEX_GSS_SHA1_K5! Better late than never: I'm reverting it now,
to avoid confusion, and because I don't have any reason to think the
distinction will be useful for any other purpose.
The former has advantages in terms of keeping Kerberos credentials up
to date, but it also does something sufficiently weird to the usual
SSH host key system that I think it's worth making sure users have a
means of turning it off separately from the less intrusive GSS
userauth.
This is a heavily edited (by me) version of a patch originally due to
Nico Williams and Viktor Dukhovni. Their comments:
* Don't delegate credentials when rekeying unless there's a new TGT
  or the old service ticket is nearly expired.
* Check for the above conditions more frequently (every two minutes
  by default) and rekey when we would delegate credentials.
* Do not rekey with very short service ticket lifetimes; some GSSAPI
  libraries may lose the race to use an almost expired ticket. Adjust
  the timing of rekey checks to try to avoid this possibility.
My further comments:
The most interesting thing about this patch to me is that the use of
GSS key exchange causes a switch over to a completely different model
of what host keys are for. This comes from RFC 4462 section 2.1: the
basic idea is that when your session is mostly bidirectionally
authenticated by the GSSAPI exchanges happening in initial kex and
every rekey, host keys become more or less vestigial, and their
remaining purpose is to allow a rekey to happen if the requirements of
the SSH protocol demand it at an awkward moment when the GSS
credentials are not currently available (e.g. timed out and haven't
been renewed yet). As such, there's no need for host keys to be
_permanent_ or to be a reliable identifier of a particular host, and
RFC 4462 allows for the possibility that they might be purely
transient and only for this kind of emergency fallback purpose.
Therefore, once PuTTY has done a GSS key exchange, it disconnects
itself completely from the permanent host key cache functions in
storage.h, and instead switches to a _transient_ host key cache stored
in memory with the lifetime of just that SSH session. That cache is
populated with keys received from the server as a side effect of GSS
kex (via the optional SSH2_MSG_KEXGSS_HOSTKEY message), and used if
later in the session we have to fall back to a non-GSS key exchange.
However, in practice servers we've tested against do not send a host
key in that way, so we also have a fallback method of populating the
transient cache by triggering an immediate non-GSS rekey straight
after userauth (reusing the code path we also use to turn on OpenSSH
delayed encryption without the race condition).
This is a preliminary refactoring for an upcoming change which will
need to affect every use of schedule_timer to wait for the next rekey:
those calls to schedule_timer are now centralised into a function that
does an organised piece of thinking about when the next timer should
be.
A side effect of this change is that the translation from
CONF_ssh_rekey_time to an actual tick count is now better proofed
against integer overflow (just in case the user entered a completely
silly value).
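The overflow-proofing amounts to something like this (a sketch;
TICKSPERSEC and conf_get_int are the real PuTTY names, the rest is
illustrative):

        int rekey_mins = conf_get_int(ssh->conf, CONF_ssh_rekey_time);
        unsigned long ticks;

        if (rekey_mins <= 0)
            ticks = 0;                          /* timed rekeys disabled */
        else if ((unsigned long)rekey_mins > ULONG_MAX / (60ul * TICKSPERSEC))
            ticks = ULONG_MAX;                  /* clamp a silly value instead of overflowing */
        else
            ticks = (unsigned long)rekey_mins * 60ul * TICKSPERSEC;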
In the 'SSH packets + raw data' logging mode, one of these occurs
immediately after the initial key exchange, at the point where the
transport routine releases any queued higher-layer packets that had
been waiting for KEX to complete. Of course, in the initial KEX there
are never any of those, so we do a zero-length s_write(), which is
harmless but has the side effect of a zero-length raw-data log entry.
If ssh_init encounters a synchronous error, it will call random_unref
before returning. But the Ssh structure it created will still exist,
and if the caller (sensibly) responds by freeing it, then that will
cause a second random_unref, leading to the RNG's refcount going below
zero and failing an assertion.
We never noticed this before because with only one PuTTY connection
per process it was easier to just exit(1) without bothering to clean
things up. Now, with all the multi-sessions-per-process fixes I'm
doing, this has shown up as a problem. But other front ends may
legitimately still just exit - I don't think I can sensibly enforce
_not_ doing so at this late stage - so I've had to arrange to set a
flag in the Ssh saying whether a random_unref is still pending or not.
I think these began to appear as a consequence of replacing
fatalbox() calls with more sensible error reports: the more specific a
direction I send a report in, the greater the annoying possibility of
re-entrance when the resulting error handler starts closing stuff.
ssh1_rdpkt claimed to be handling SSH1_MSG_DEBUG and SSH1_MSG_IGNORE
packets, but in fact, the handling of those has long since been moved
into the dispatch table; those particular entries are set up in
ssh1_protocol_setup().
When it calls through ocr->handler() to process the response to a
channel request, sometimes that call ends up back in the main SSH-2
authconn coroutine, and sometimes _that_ will call bomb_out(), which
closes the whole SSH connection and frees all the channels - so that
when control returns back up the call stack to
ssh2_msg_channel_response itself which continues working with the
channel it was passed, it's using freed memory and things go badly.
This is the sort of thing I'd _like_ to fix using some kind of
large-scale refactoring along the lines of moving all the actual
free() calls out into top-level callbacks, so that _any_ function
which is holding a pointer to something can rely on that pointer still
being valid after it calls a subroutine. But I haven't worked out all
the details of how that system should work, and doubtless it will turn
out to have problems of its own once I do, so here's a point fix which
simply checks if the whole SSH session has been closed (which is easy
- much easier than checking if that _channel_ structure still exists)
and fixes the immediate bug.
(I think this is the real fix for the problem reported by the user I
mention in commit f0126dd19, because I actually got the details wrong
in the log message for that previous commit: the user's SSH server
wasn't rejecting the _opening_ of the main session channel, it was
rejecting the "shell" channel request, so this code path was the one
being exercised. Still, the other bug was real too, so no harm done!)
A user reported a nonsensical assertion failure (claiming that
ssh->version != 2) which suggested that a channel had somehow outlived
its parent Ssh in the situation where the opening of the main session
channel is rejected by the server. Checking with valgrind suggested
that things start to go wrong at the point where we free the half-set-
up ssh->mainchan before having filled in its type field, so that the
switch in ssh_channel_close_local() picks an arbitrary wrong action.
I haven't reproduced the same failure the user reported, but with this
change, Unix plink is now valgrind-clean in that failure situation.
This has been a FIXME in the code for ages, because back when the main
channel was always a pty session or a program run in a pipe, there
weren't that many circumstances in which the actual CHANNEL_OPEN could
return failure, so it never seemed like a priority to get round to
pulling the error information out of the CHANNEL_OPEN_FAILURE response
message and including it in PuTTY or Plink's local error message.
However, 'plink -nc' is the real reason why this is actually
important; if you tell the SSH server to make a direct-tcpip network
connection as its main channel, then that can fail for all the usual
network-unreliability reasons, and you actually do want to know which
(did you misspell the hostname, or is the target server refusing
connections, or has network connectivity failed?). This actually bit
me today when I had such a network failure, and had to debug it by
pulling that information manually out of a packet log. Time to
eliminate that FIXME.
So I've pulled the error-extracting code out of the previous handler
for OPEN_FAILURE on non-main channels into a separate function, and
arranged to call that function if the main channel open fails too. In
the process I've made a couple of minor tweaks, e.g. if the server
sends back a reason code we haven't heard of, we say _what_ that
reason code was, and also we at least make a token effort to spot if
we see a packet other than OPEN_{CONFIRMATION,FAILURE} reaching the
main loop in response to the main channel-open.
2ce0b680c inadvertently removed this ability in trying to ensure that
everyone got the new IUTF8 mode by default; you could remove a mode from
the list in the UI, but this would just revert PuTTY to its default.
The UI and storage have been revamped; the storage format now explicitly
says when a mode is not to be sent, and the configuration UI always
shows all modes known to PuTTY; if a mode is not to be sent it now shows
up as "(don't send)" in the list.
Old saved settings are migrated so as to preserve previous removals of
longstanding modes, while automatically adding IUTF8.
(In passing, this removes a bug where pressing the 'Remove' button of
the previous UI would populate the value edit box with garbage.)
I think an agent sending a string length exceeding the buffer bounds
by less than 4 could have made PuTTY read beyond its own buffer end.
Not that I really think a hostile SSH agent is likely to be attacking
PuTTY, but it's as well to fix these things anyway!
Mostly so that we don't have to malloc contiguous space for them
inside PuTTY; since we've already got a handy constant saying how big
is too big, we might as well use it to sanity-check the contents of
our agent forwarding channels.
The previous agent-forwarding system worked by passing each complete
query received from the input to agent_query() as soon as it was
ready. So if the remote client were to pipeline multiple requests,
then Unix PuTTY (in which agent_query() works asynchronously) would
parallelise them into many _simultaneous_ connections to the real
agent - and would not track which query went out first, so that if the
real agent happened to send its replies (to what _it_ thought were
independent clients) in the wrong order, then PuTTY would serialise
the replies on to the forwarding channel in whatever order it got
them, which wouldn't be the order the remote client was expecting.
To solve this, I've done a considerable rewrite, which keeps the
request stream in a bufchain, and only removes data from the bufchain
when it has a complete request. Then, if agent_query decides to be
asynchronous, the forwarding system waits for _that_ agent response
before even trying to extract the next request's worth of data from
the bufchain.
As an added bonus (in principle), this gives agent-forwarding channels
some actual flow control for the first time ever! If a client spams us
with an endless stream of rapid requests, and never reads its
responses, then the output side of the channel will run out of window,
which causes us to stop processing requests until we have space to
send responses again, which in turn causes us to stop granting extra
window on the input side, which serves the client right.
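The request-extraction loop is, in outline (a sketch;
bufchain_size/bufchain_fetch/bufchain_consume are the kind of
primitives PuTTY's bufchain provides, and the channel field names here
are illustrative):

        while (!c->u.a.pending_query) {
            unsigned char lenbuf[4], *msg;
            unsigned msglen;

            if (bufchain_size(&c->u.a.inbuffer) < 4)
                break;                         /* length prefix incomplete */
            bufchain_fetch(&c->u.a.inbuffer, lenbuf, 4);
            msglen = GET_32BIT(lenbuf);
            /* (the real code also sanity-checks msglen against the
             * existing too-big constant, as described above) */
            if (bufchain_size(&c->u.a.inbuffer) < msglen + 4)
                break;                         /* request body incomplete */

            msg = snewn(msglen + 4, unsigned char);
            bufchain_fetch(&c->u.a.inbuffer, msg, msglen + 4);
            bufchain_consume(&c->u.a.inbuffer, msglen + 4);

            /* hand 'msg' to agent_query(); if it goes asynchronous, record
             * the pending query and wait for its callback before looping */
        }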
Now, instead of returning a boolean indicating whether the query has
completed or is still pending, agent_query() returns NULL to indicate
that the query _has_ completed, and if it hasn't, it returns a pointer
to a context structure representing the pending query, so that the
latter can be used to cancel the query if (for example) you later
decide you need to free the thing its callback was using as a context.
This should fix a potential race-condition segfault if you overload an
agent forwarding channel and then close it abruptly. (Which nobody
will be doing for sensible purposes, of course! But I ran across this
while stress-testing other aspects of agent forwarding.)
From ssh2_channel_got_eof() to ssh2_msg_channel_eof(). This removes
the only SSH-2 specificity from the former. ssh2_channel_got_eof()
can also be called from ssh2_msg_channel_close(), but that calls
ssh2_channel_check_close() already.
Nothing ever sets them to NULL, and the various paths by which the
channel types can be set to CHAN_X11 or CHAN_SOCKDATA all ensure that
the relevant union members are non-NULL. All the removed conditionals
have been converted into assertions, just in case I'm wrong.
It's redundant with the halfopen flag and is a misuse of the channel
type field. Happily, everything that depends on CHAN_SOCKDATA_DORMANT
also checks halfopen, so removing it is trivial.
Now it disconnects if the server sends
SSH_MSG_CHANNEL_OPEN_CONFIRMATION or SSH_MSG_CHANNEL_OPEN_FAILURE for
a channel that isn't half-open. Assertions in the SSH-2 handlers for
these messages rely on this behaviour even though it's never been
enforced before.
All but one caller was doing this unconditionally. The one conditional
call was when initialising the main channel, and in consequence PuTTY
leaked a channel structure when the server refused to open the main
channel. Now it doesn't.
An opcode for this was recently published in
https://tools.ietf.org/html/draft-sgtatham-secsh-iutf8-00 .
The default setting is conditional on frontend_is_utf8(), which is
consistent with the pty back end's policy for setting the same flag
locally. Of course, users can override the setting either way in the
GUI configurer, the same as all other tty modes.
Previously, the code that marshalled tty settings into the "pty-req"
request was iterating through the subkeys stored in ssh->conf, meaning
that if a session had been saved before we gained support for a
particular tty mode, the iteration wouldn't visit that mode at all and
hence wouldn't send even the default setting for it.
Now we iterate over the array of known mode identifiers in
ssh_ttymodes[] and look each one up in ssh->conf, rather than vice
versa. This means that when we add support for a new tty mode with a
nontrivial policy for choosing its default state, we should start
using the default handler immediately, rather than bizarrely waiting
for users to save a session after the change.
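The new iteration order looks roughly like this (a sketch; the field
and value conventions shown are approximate, not the exact ssh.c code):

        for (i = 0; i < lenof(ssh_ttymodes); i++) {
            const char *mode = ssh_ttymodes[i].mode;
            char *val = conf_get_str_str_opt(ssh->conf, CONF_ttymodes, mode);

            if (!val || !strcmp(val, "A")) {
                /* not in the saved session, or set to Auto: apply this
                 * mode's default policy */
            } else if (val[0] == 'V') {
                /* explicit value follows the 'V' */
            }
            /* ... marshal the chosen setting into the "pty-req" request ... */
        }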
Also add an assertion to do_ssh2_transport to catch this.
This bug would be highly unlikely to manifest accidentally, but I
think you could trigger it by setting the data-based rekey threshold
very low.
All calls to ssh2_add_channel_data() were followed by a call to
ssh2_try_send(), so it seems sensible to replace ssh2_add_channel_data()
with ssh2_send_channel_data(), which does both.
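That is, roughly (a sketch; the channel field names are from memory):

    static int ssh2_send_channel_data(struct ssh_channel *c,
                                      const char *buf, int len)
    {
        bufchain_add(&c->v.v2.outbuffer, buf, len);  /* what ssh2_add_channel_data did */
        return ssh2_try_send(c);
    }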
Specifically, don't try to unblock all channels just because we've got
something to send on the main one. It looks like the code to do that
was left over from when SSH_MSG_CHANNEL_ADJUST was handled in
do_ssh2_authconn().
The UI now only has "1" and "2" options for SSH protocol version, which
behave like the old "1 only" and "2 only" options; old
SSH-N-with-fallback settings are interpreted as SSH-N-only.
This prevents any attempt at a protocol downgrade attack.
Most users should see no difference; those poor souls who still have to
work with SSH-1 equipment now have to explicitly opt in.
If you're connecting to a new server and it _only_ provides host key
types you've configured to be below the warning threshold, it's OK to
give the standard askalg() message. But if you've newly demoted a host
key type and now reconnect to some server for which that type was the
best key you had cached, the askalg() wording isn't really appropriate
(it's not that the key we've settled on is the first type _supported
by the server_, it's that it's the first type _cached by us_), and
also it's potentially helpful to list the better algorithms so that
the user can pick one to cross-certify.
When Jacob introduced this message in d0d3c47a0, he was right to
assume that hostkey_algs[] and ssh->uncert_hostkeys[] were sorted in
the same order. Unfortunately, he became wrong less than an hour later
when I committed d06098622. Now we avoid making any such assumption.
Since we got a dynamic preference order, it's been bailing out at a
random point, and listing keys we wouldn't use.
(It would still be nice to only mention keys that we'd actually use, but
that's now quite fiddly.)
I noticed this in passing while tinkering with the hostkey_algs array:
these arrays are full of pointers-to-const, but are not also
themselves declared const, which they should have been all along.
Now we actually have enough of them to worry about, and especially
since some of the types we support are approved by organisations that
people might make their own decisions about whether to trust, it seems
worth having a config list for host keys the same way we have one for
kex types and ciphers.
To make room for this, I've created an SSH > Host Keys config panel,
and moved the existing host-key related configuration (manually
specified fingerprints) into there from the Kex panel.
I got momentarily confused between whether the special code
(TS_LOCALSTART+i) meant the ith entry in the variable
uncert_hostkeys[] array, or the ith entry in the fixed hostkey_algs[]
array. Now I think everything agrees on it being the latter.
If a server offers host key algorithms that we don't have a stored key
for, they will now appear in a submenu of the Special Commands menu.
Selecting one will force a repeat key exchange with that key, and if
it succeeds, will add the new host key to the cache. The idea is that
the new key sent by the server is protected by the crypto established
in the previous key exchange, so this is just as safe as typing some
command like 'ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub' at the
server prompt and transcribing the results manually.
This allows switching over to newer host key algorithms if the client
has begun to support them (e.g. people using PuTTY's new ECC
functionality for the first time), or if the server has acquired a new
key (e.g. due to a server OS upgrade).
At the moment, it's only available manually, for a single host key
type at a time. Automating it is potentially controversial for
security policy reasons (what if someone doesn't agree this is what
they want in their host key cache, or doesn't want to switch over to
using whichever of the keys PuTTY would now put top of the list?), for
code plumbing reasons (chaining several of these rekeys might be more
annoying than doing one at a time) and for CPU usage reasons (rekeys
are expensive), but even so, it might turn out to be a good idea in
future.
The last list we returned is now stored in the main Ssh structure
rather than being a static array in ssh_get_specials.
The main point of this is that I want to start adding more dynamic
things to it, for which I can't predict the array's max length in
advance.
But also this fixes a conceptual wrongness, in that if a process had
more than one Ssh instance in it then their specials arrays would have
taken turns occupying the old static array, and although the current
single-threaded client code in the GUI front ends wouldn't have minded
(it would have read out the contents just once immediately after
get_specials returned), it still feels as if it was a bug waiting to
happen.
ssh_pkt_getstring can return (NULL,0) if the input packet is too short
to contain a valid string.
In quite a few places we were passing the returned pointer,length pair
to a printf function with "%.*s" type format, which seems in practice
to have not been dereferencing the pointer, but the C standard doesn't
actually guarantee that. In one place we were doing the same job by
hand with memcpy, and apparently that _can_ dereference the pointer in
practice (so a server could have caused a NULL-dereference crash by
sending an appropriately malformed "x11" type channel open request).
And also I spotted a logging call in the "forwarded-tcpip" channel
open handler which had forgotten the field width completely, so it was
erroneously relying on the string happening to be NUL-terminated in
the received packet.
I've tightened all of this up in general by normalising (NULL,0) to
("",0) before calling printf("%.*s"), and replacing the two even more
broken cases with the corrected version of that same idiom.
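The normalising idiom is simply this (a sketch; the log message text is
illustrative):

        char *str;
        int len;

        ssh_pkt_getstring(pktin, &str, &len);
        if (!str) {
            str = "";                  /* normalise (NULL,0) to ("",0) */
            len = 0;
        }
        logeventf(ssh, "Remote debug message: %.*s", len, str);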
It has three settings: on, off, and 'only until session starts'. The
idea of the last one is that if you use something like 'ssh -v' as
your proxy command, you probably wanted to see the initial SSH
connection-setup messages while you were waiting to see if the
connection would be set up successfully at all, but probably _didn't_
want a slew of diagnostics from rekeys disrupting your terminal in
mid-emacs once the session had got properly under way.
Default is off, to avoid startling people used to the old behaviour. I
wonder if I should have set it more aggressively, though.
I'm about to want to make a change to all those functions at once, and
since they're almost identical, it seemed easiest to pull them out
into a common helper. The new source file be_misc.c is intended to
contain helper code common to all network back ends (crypto and
non-crypto, in particular), and initially it contains a
backend_socket_log() function which is the common part of ssh_log(),
telnet_log(), rlogin_log() etc.
We've always had the back-end code unconditionally print 'Looking up
host' before calling name_lookup. But name_lookup doesn't always do an
actual lookup - in cases where the connection will be proxied and
we're configured to let the proxy do the DNS for us, it just calls
sk_nonamelookup to return a dummy SockAddr with the unresolved name
still in it. It's better to print a message that varies depending on
whether we're _really_ doing DNS or not, e.g. so that people can tell
the difference between DNS failure and proxy misconfiguration.
Hence, those log messages are now generated inside name_lookup(),
which takes a couple of extra parameters for the purpose - a frontend
pointer to pass to logevent(), and a reason string so that it can say
what the hostname it's (optionally) looking up is going to be used
for. (The latter is intended for possible use in logging subsidiary
lookups for port forwarding, though for the moment I haven't changed
the current setup where those connection setups aren't logged in
detail - we just pass NULL in that situation.)
When we set ssh->sc{cipher,mac} to s->sc{cipher,mac}_tobe
conditionally, we should be conditionalising on the values we're
_reading_, not the ones we're about to overwrite.
Thanks to Colin Harrison for this patch.
Apparently if you maintain a branch for a long time where you only
compile with a non-default ifdef enabled, it becomes possible to not
notice a typo you left in the default branch :-)
This adds the "none" cipher and MAC, and also disables kex signure
verification and host-key checking. Since a client like this is
completely insecure, it also rewrites the client version string to
start "ISH", which should make it fail to interoperate with a real SSH
server. The server version string is still expected to begin "SSH", so
that the real packet captures can be used against it.
The previous assertion failure is obviously wrong, but RFC 4253 doesn't
explicitly declare them to be a protocol error. Currently, the incoming
packet isn't logged, which might cause some confusion for log parsers.
Bug found with the help of afl-fuzz.
The previous assertion failure is obviously wrong, but RFC 4253 doesn't
explicitly declare them to be a protocol error. Currently, the incoming
packet isn't logged, which might cause some confusion for log parsers.
Bug found with the help of afl-fuzz.
This protects the Unix platform sharing code in the case where no salt
file exists yet in the connection-sharing directory, in which case
make_dirname() will want to create one by using some random bytes, and
prior to this commit, would fail an assertion because the random
number generator wasn't set up.
It would be neater to just return FALSE from ssh_test_for_upstream in
that situation - if there's no salt file, then no sharing socket can
be valid anyway - but that would involve doing more violence to the
code structure than I'm currently prepared to do for a minor elegance
gain.
A user reports that in a particular situation one of the calls to
LoadLibrary from wingss.c has unwanted side effects, and points out
that this happens even when the saved session has GSSAPI disabled. So
I've evaluated as much as possible of the condition under which we
check the results of GSS library loading, and deferred the library
loading itself until after that condition says we even care about the
results.
(cherry picked from commit 9a08d9a7c1)
A Plink invocation of the form 'plink -shareexists <session>' tests
for a currently live connection-sharing upstream for the session in
question. <session> can be any syntax you'd use with Plink to make the
actual connection (a host/port number, a bare saved session name,
-load, whatever).
I envisage this being useful for things like adaptive proxying - e.g.
if you want to connect to host A which you can't route to directly,
and you might already have a connection to either of hosts B or C
which are viable proxies, then you could write a proxy shell script
which checks whether you already have an upstream for B or C and goes
via whichever one is currently active.
Testing for the upstream's existence has to be done by actually
connecting to its socket, because on Unix the mere existence of a
Unix-domain socket file doesn't guarantee that there's a process
listening to it. So we make a test connection, and then immediately
disconnect; hence, that shows up in the upstream's event log.
This is the part of ssh.c's connect_to_host() which figures out the
host name and port number that logically identify the connection -
i.e. not necessarily where we physically connected to, but what we'll
use to look up the saved session cache, put in the window title bar,
and give to the connection sharing code to identify other connections
to share with.
I'm about to want to use it for another purpose, so it needs to be
moved out into a separate function.
If we've chosen the ChaCha20-Poly1305 option for a cipher, then that
forces the use of its associated MAC. In that situation, we should
avoid even _trying_ to figure out a MAC by examining the MAC string
from the server's KEXINIT, because we won't use the MAC selected by
that method anyway, so there's no point imposing the requirement on
servers to present a MAC we believe in just so we know it's there.
This was breaking interoperation with tinysshd, and is in violation of
OpenSSH's spec for the "chacha20-poly1305@openssh.com" cipher.
The revamp of key generation in commit e460f3083 made the assumption
that you could decide how many bytes of key material to generate by
converting cipher->keylen from bits to bytes. This is a good
assumption for all ciphers except DES/3DES: since the SSH DES key
setup ignores one bit in every byte of key material it's given, you
need more bytes than its keylen field would have you believe. So
currently the DES ciphers aren't being keyed correctly.
The original keylen field is used for deciding how big a DH group to
request, and on that basis I think it still makes sense to keep it
reflecting the true entropy of a cipher key. So it turns out we need
two _separate_ key length fields per cipher - one for the real
entropy, and one for the much more obvious purpose of knowing how much
data to ask for from ssh2_mkkey.
A compensatory advantage, though, is that we can now measure the
latter directly in bytes rather than bits, so we no longer have to
faff about with dividing by 8 and rounding up.
Tim Kosse points out that we now support some combinations of crypto
primitives which break the hardwired assumption that two blocks of
hash output from the session-key derivation algorithm are sufficient
to key every cipher and MAC in the system.
So now ssh2_mkkey is given the desired key length, and performs as
many iterations as necessary.
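Conceptually, the expansion is the one from RFC 4253 section 7.2,
iterated until enough material exists. In rough pseudo-PuTTY (the
hash-vtable usage and the hash_mpint helper are approximations of
ssh.c's real idioms, and keyspace/keylen_needed/chr are illustrative
parameter names):

        unsigned char block[MAX_HASH_LEN];             /* MAX_HASH_LEN: assumed bound */
        int generated = 0, n;

        while (generated < keylen_needed) {
            void *s = h->init();                       /* h: the kex hash's ssh_hash vtable */
            hash_mpint(h, s, K);                       /* shared secret K, as an mpint */
            h->bytes(s, exchange_hash, h->hlen);       /* exchange hash H */
            if (generated == 0) {
                h->bytes(s, &chr, 1);                  /* the letter 'A'..'F' */
                h->bytes(s, ssh->v2_session_id, ssh->v2_session_id_len);
            } else {
                h->bytes(s, keyspace, generated);      /* K1 || K2 || ... so far */
            }
            h->final(s, block);
            n = keylen_needed - generated;
            if (n > h->hlen)
                n = h->hlen;
            memcpy(keyspace + generated, block, n);    /* copy only as much as was asked for */
            generated += n;
        }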
The key derivation code has been assuming (though non-critically, as
it happens) that the size of the MAC output is the same as the size of
the MAC key. That isn't even a good assumption for the HMAC family,
due to HMAC-SHA1-96 and also the bug-compatible versions of HMAC-SHA1
that only use 16 bytes of key material; so now we have an explicit
key-length field separate from the MAC-length field.
A user reports that in a particular situation one of the calls to
LoadLibrary from wingss.c has unwanted side effects, and points out
that this happens even when the saved session has GSSAPI disabled. So
I've evaluated as much as possible of the condition under which we
check the results of GSS library loading, and deferred the library
loading itself until after that condition says we even care about the
results.
If a sharing downstream disconnected while we were still in userauth
(probably by deliberate user action, since such a downstream would
have just been sitting there waiting for upstream to be ready for it)
then we could crash by attempting to count234(ssh->channels) before
the ssh->channels tree had been set up in the first place.
A simple null-pointer check fixes it. Thanks to Antti Seppanen for the
report.
I removed a vital line of code while fixing the merge conflicts when
cherry-picking 1eb578a488 as
26fe1e26c0, causing Diffie-Hellman key
exchange to be completely broken because the server's host key was
never constructed to verify the signature with. Reinstate it.
Assorted calls to ssh_pkt_getstring in handling the later parts of key
exchange (post-KEXINIT) were not checked for NULL afterwards, so that
a variety of badly formatted key exchange packets would cause a crash
rather than a sensible error message.
None of these is an exploitable vulnerability - the server can only
force a clean null-deref crash, not an access to actually interesting
memory.
Thanks to '3unnym00n' for pointing out one of these, causing me to
find all the rest of them too.
(cherry picked from commit 1eb578a488)
Conflicts:
ssh.c
Cherry-picker's notes: the main conflict arose because the original
commit also made fixes to the ECDH branch of the big key exchange if
statement, which doesn't exist on this branch. Also there was a minor
and purely textual conflict, when an error check was added right next
to a function call that had acquired an extra parameter on master.
The final main loop in do_ssh2_authconn will sometimes loop over all
currently open channels calling ssh2_try_send_and_unthrottle. If the
channel is a sharing one, however, that will reference fields of the
channel structure like 'remwindow', which were never initialised in
the first place (thanks, valgrind). Fix by excluding CHAN_SHARING
channels from that loop.
(cherry picked from commit 7366fde1d4)
When anyone connects to a PuTTY tool's listening socket - whether it's
a user of a local->remote port forwarding, a connection-sharing
downstream or a client of Pageant - we'd like to log as much
information as we can find out about where the connection came from.
To that end, I've implemented a function sk_peer_info() in the socket
abstraction, which returns a freeform text string as best it can (or
NULL, if it can't get anything at all) describing the thing at the
other end of the connection. For TCP connections, this is done using
getpeername() to get an IP address and port in the obvious way; for
Unix-domain sockets, we attempt SO_PEERCRED (conditionalised on some
moderately hairy autoconfery) to get the pid and owner of the peer. I
haven't implemented anything for Windows named pipes, but I will if I
hear of anything useful.
(cherry picked from commit c8f83979a3)
Conflicts:
pageant.c
Cherry-picker's notes: the conflict was because the original commit
also added a use of the same feature in the centralised Pageant code,
which doesn't exist on this branch. Also I had to remove 'const' from
the type of the second parameter to wrap_send_port_open(), since this
branch hasn't had the same extensive const-fixing as master.
PuTTY now uses the updated version of Diffie-Hellman group exchange,
except for a few old OpenSSH versions which Darren Tucker reports only
support the old version.
FIXME: this needs further work because the Bugs config panel has now
overflowed.
(cherry picked from commit 62a1bce7cb)
Assorted calls to ssh_pkt_getstring in handling the later parts of key
exchange (post-KEXINIT) were not checked for NULL afterwards, so that
a variety of badly formatted key exchange packets would cause a crash
rather than a sensible error message.
None of these is an exploitable vulnerability - the server can only
force a clean null-deref crash, not an access to actually interesting
memory.
Thanks to '3unnym00n' for pointing out one of these, causing me to
find all the rest of them too.
The final main loop in do_ssh2_authconn will sometimes loop over all
currently open channels calling ssh2_try_send_and_unthrottle. If the
channel is a sharing one, however, that will reference fields of the
channel structure like 'remwindow', which were never initialised in
the first place (thanks, valgrind). Fix by excluding CHAN_SHARING
channels from that loop.
I'd rather see the cipher and MAC named separately, with a hint that
the two are linked together in some way, than see the cipher called by
a name including the MAC and the MAC init message have an ugly
'<implicit>' in it.
This allows for sharing a bit of code, and it also means that
deduplication of KEXINIT algorithms can be done while working out the
list of algorithms rather than when constructing the KEXINIT packet
itself.
The general plan is that if PuTTY knows a host key for a server, it
should preferentially ask for the same type of key so that there's some
chance of actually getting the same key again. This should mean that
when a server (or PuTTY) adds a new host key type, PuTTY doesn't
gratuitously switch to that key type and then warn the user about an
unrecognised key.
It seems like quite an important thing to mention in the event log!
Suppose there's a bug affecting only one curve, for example? Fixed-
group Diffie-Hellman has always logged the group, but the ECDH log
message just told you the hash and not also the curve.
To implement this, I've added a 'textname' field to all elliptic
curves, whether they're used for kex or signing or both, suitable for
use in this log message and any others we might find a need for in
future.
When anyone connects to a PuTTY tool's listening socket - whether it's
a user of a local->remote port forwarding, a connection-sharing
downstream or a client of Pageant - we'd like to log as much
information as we can find out about where the connection came from.
To that end, I've implemented a function sk_peer_info() in the socket
abstraction, which returns a freeform text string as best it can (or
NULL, if it can't get anything at all) describing the thing at the
other end of the connection. For TCP connections, this is done using
getpeername() to get an IP address and port in the obvious way; for
Unix-domain sockets, we attempt SO_PEERCRED (conditionalised on some
moderately hairy autoconfery) to get the pid and owner of the peer. I
haven't implemented anything for Windows named pipes, but I will if I
hear of anything useful.
The new code remembers the contents and meaning of the outgoing KEXINIT
and uses this to drive the algorithm negotiation, rather than trying to
reconstruct what the outgoing KEXINIT probably said. This removes the
need to maintain the KEXINIT generation and parsing code precisely in
parallel.
It also fixes a bug whereby PuTTY would have selected the wrong host key
type in cases where the server gained a host key type between rekeys.
Having found a lot of unfixed constness issues in recent development,
I thought perhaps it was time to get proactive, so I compiled the
whole codebase with -Wwrite-strings. That turned up a huge load of
const problems, which I've fixed in this commit: the Unix build now
goes cleanly through with -Wwrite-strings, and the Windows build is as
close as I could get it (there are some lingering issues due to
occasional Windows API functions like AcquireCredentialsHandle not
having the right constness).
Notable fallout beyond the purely mechanical changing of types:
- the stuff saved by cmdline_save_param() is now explicitly
  dupstr()ed, and freed in cmdline_run_saved.
- I couldn't make both string arguments to cmdline_process_param()
  const, because it intentionally writes to one of them in the case
  where it's the argument to -pw (in the vain hope of being at least
  slightly friendly to 'ps'), so elsewhere I had to temporarily
  dupstr() something for the sake of passing it to that function
- I had to invent a silly parallel version of const_cmp() so I could
  pass const string literals in to lookup functions.
- stripslashes() in pscp.c and psftp.c has the annoying strchr nature
The ec_name_to_curve and ec_curve_to_name functions shouldn't really
have had to exist at all: whenever any part of the PuTTY codebase
starts using sshecc.c, it's starting from an ssh_signkey or ssh_kex
pointer already found by some other means. So if we make sure not to
lose that pointer, we should never need to do any string-based lookups
to find the curve we want, and conversely, when we need to know the
name of our curve or our algorithm, we should be able to look it up as
a straightforward const char * starting from the algorithm pointer.
This commit cleans things up so that that is indeed what happens. The
ssh_signkey and ssh_kex structures defined in sshecc.c now have
'extra' fields containing pointers to all the necessary stuff;
ec_name_to_curve and ec_curve_to_name have been completely removed;
struct ec_curve has a string field giving the curve's name (but only
for those curves which _have_ a name exposed in the wire protocol,
i.e. the three NIST ones); struct ec_key keeps a pointer to the
ssh_signkey it started from, and uses that to remember the algorithm
name rather than reconstructing it from the curve. And I think I've
got rid of all the ad-hockery scattered around the code that switches
on curve->fieldBits or manually constructs curve names using stuff
like sprintf("nistp%d"); the only remaining switch on fieldBits
(necessary because that's the UI for choosing a curve in PuTTYgen) is
at least centralised into one place in sshecc.c.
One user-visible result is that the format of ed25519 host keys in the
registry has changed: there's now no curve name prefix on them,
because I think it's not really right to make up a name to use. So any
early adopters who've been using snapshot PuTTY in the last week will
be inconvenienced; sorry about that.
This gives families of public key and kex functions (by which I mean
those sharing a set of methods) a place to store parameters that allow
the methods to vary depending on which exact algorithm is in use.
The ssh_kex structure already had a set of parameters specific to
Diffie-Hellman key exchange; I've moved those into sshdh.c and made
them part of the 'extra' structure for that family only, so that
unrelated kex methods don't have to faff about saying NULL,NULL,0,0.
(This required me to write an extra accessor function for ssh.c to ask
whether a DH method was group-exchange style or fixed-group style, but
that doesn't seem too silly.)
Not all of them, but the ones that don't get a 'void *key' parameter.
This means I can share methods between multiple ssh_signkey
structures, and still give those methods an easy way to find out which
public key method they're dealing with, by loading parameters from a
larger structure in which the ssh_signkey is the first element.
(In OO terms, I'm arranging that all static methods of my public key
classes get a pointer to the class vtable, to make up for not having a
pointer to the class instance.)
I haven't actually done anything with the new facility in this commit,
but it will shortly allow me to clean up the constant lookups by curve
name in the ECDSA code.
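One way to use the new argument, along the lines described above (all
names here are illustrative, not the real sshecc.c identifiers):

    struct ec_signkey_extra {
        unsigned fieldBits;                   /* which curve this instance serves */
    };

    struct ec_signkey {
        struct ssh_signkey sk;                /* must be the first member */
        struct ec_signkey_extra extra;
    };

    static void *ecdsa_newkey(const struct ssh_signkey *self,
                              const char *data, int len)
    {
        /* Because sk is the first member, 'self' also points at the
         * enclosing ec_signkey, so the shared method can recover its
         * per-algorithm parameters. */
        const struct ec_signkey *es = (const struct ec_signkey *)self;
        /* ... use es->extra.fieldBits to select the right curve ... */
        return NULL;                          /* placeholder */
    }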
All the name strings in ssh_cipher, ssh_mac, ssh_hash, ssh_signkey
point to compile-time string literals, hence should obviously be const
char *.
Most of these const-correctness patches are just a mechanical job of
adding a 'const' in the one place you need it right now, and then
chasing the implications through the code adding further consts until
it compiles. But this one has actually shown up a bug: the 'algorithm'
output parameter in ssh2_userkey_loadpub was sometimes returning a
pointer to a string literal, and sometimes a pointer to dynamically
allocated memory, so callers were forced to either sometimes leak
memory or sometimes free a bad thing. Now it's consistently
dynamically allocated, and should be freed everywhere too.
There were ad-hoc functions for fingerprinting a bare key blob in both
cmdgen.c and pageant.c, not quite doing the same thing. Also, every
SSH-2 public key algorithm in the code base included a dedicated
fingerprint() method, which is completely pointless since SSH-2 key
fingerprints are computed in an algorithm-independent way (just hash
the standard-format public key blob), so each of those methods was
just duplicating the work of the public_blob() method with a less
general output mechanism.
Now sshpubk.c centrally provides an ssh2_fingerprint_blob() function
that does all the real work, plus an ssh2_fingerprint() function that
wraps it and deals with calling public_blob() to get something to
fingerprint. And the fingerprint() method has been completely removed
from ssh_signkey and all its implementations, and good riddance.
Obviously PuTTY can't actually do public-key authentication itself, if
you give it a public rather than private key file. But it can still
match the supplied public key file against the list of keys in the
agent, and narrow down to that. So if for some reason you're
forwarding an agent to a machine you don't want to trust with your
_private_ key file (even encrypted), you can still use the '-i' option
to select which key from the agent to use, by uploading just the
public key file to that machine.
It's just ssh_pkt_addstring_data but using strlen to get the length of the
string to add, so make that explicit by having it call
ssh_pkt_addstring_data. Good compilers should be unaffected by this
change.
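In other words, the changed function becomes a one-line wrapper along
these lines (the name and signatures here are approximate):

    void ssh_pkt_addstring_str(struct Packet *pkt, const char *str)
    {
        ssh_pkt_addstring_data(pkt, str, strlen(str));
    }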
This is the kex protocol id "curve25519-sha256@libssh.org", so called
because it's over the prime field of order 2^255 - 19.
Arithmetic in this curve is done using the Montgomery representation,
rather than the Weierstrass representation. So 'struct ec_curve' has
grown a discriminant field and a union of subtypes.
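A rough picture of the resulting layout (field names illustrative, not the
exact declarations):

    enum ec_curve_type { EC_WEIERSTRASS, EC_MONTGOMERY };

    struct ec_curve {
        enum ec_curve_type type;         /* the new discriminant */
        unsigned fieldBits;
        union {
            struct {                     /* y^2 = x^3 + ax + b */
                void *a, *b;             /* Bignum coefficients */
            } w;
            struct {                     /* b*y^2 = x^3 + a*x^2 + x */
                void *a, *b;
            } m;
        } u;
    };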
Several of the functions in ssh2_signkey, and one or two SSH-1 key
functions too, were still taking assorted non-const buffer parameters
that had never been properly constified. Sort them all out.
This causes the initial length field of the SSH-2 binary packet to be
unencrypted (with the knock-on effect that now the packet length not
including MAC must be congruent to 4 rather than 0 mod the cipher
block size), and then the MAC is applied over the unencrypted length
field and encrypted ciphertext (prefixed by the sequence number as
usual). At the cost of exposing some information about the packet
lengths to an attacker (but rarely anything they couldn't have
inferred from the TCP headers anyway), this closes down any
possibility of a MITM using the client as a decryption oracle, unless
they can _first_ fake a correct MAC.
ETM mode is enabled by means of selecting a different MAC identifier,
all the current ones of which are constructed by appending
"-etm@openssh.com" to the name of a MAC that already existed.
We currently prefer the original SSH-2 binary packet protocol (i.e. we
list all the ETM-mode MACs last in our KEXINIT), on the grounds that
it's better tested and more analysed, so at the moment the new mode is
only activated if a server refuses to speak anything else.
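To make the difference concrete, here is an outline of the two packet
constructions (a summary of the layout, not the packet-assembly code
itself):

    /*
     * Classic SSH-2 binary packet protocol:
     *   plaintext  = length || padlen || payload || padding
     *   ciphertext = Encrypt(plaintext)            (length is encrypted)
     *   mac        = MAC(seqno || plaintext)
     *   wire       = ciphertext || mac
     *
     * ETM ("-etm@openssh.com") variant:
     *   plaintext  = padlen || payload || padding  (no length field)
     *   ciphertext = Encrypt(plaintext)
     *   mac        = MAC(seqno || length || ciphertext)
     *   wire       = length || ciphertext || mac   (length in clear)
     */

In the ETM layout a receiver checks the MAC before decrypting anything, so
a tampered packet is rejected without ever being fed to the cipher.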
PuTTY now uses the updated version of Diffie-Hellman group exchange,
except for a few old OpenSSH versions which Darren Tucker reports only
support the old version.
FIXME: this needs further work because the Bugs config panel has now
overflowed.
Florent Daigniere of Matta points out that RFC 4253 actually
_requires_ us to refuse to accept out-of-range values, though it isn't
completely clear to me why this should be a MUST on the receiving end.
Matta considers this to be a security vulnerability, on the grounds
that if a server should accidentally send an obviously useless value
such as 1 then we will fail to reject it and agree a key that an
eavesdropper could also figure out. Their id for this vulnerability is
MATTA-2015-002.
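The principle of the check is simple; as a sketch of the idea only (PuTTY
does this with its Bignum type rather than native integers, and the exact
bounds live in the DH code):

    /* Refuse a received Diffie-Hellman value f unless 1 < f < p-1.
     * Degenerate values such as 0, 1 or p-1 produce shared secrets
     * that an eavesdropper can compute without knowing either
     * exponent. */
    int dh_value_acceptable(unsigned long f, unsigned long p)
    {
        return f > 1 && f < p - 1;
    }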
I'm not actually sure why we've always had back ends notify ldisc of
changes to echo/edit settings by giving ldisc_send(ldisc,NULL,0,0) a
special meaning, instead of by having a separate dedicated notify
function with its own prototype and parameter set. Coverity's recent
observation that the two kinds of call don't even have the same
requirements on the ldisc (particularly, whether ldisc->term can be
NULL) makes me realise that it's really high time I separated the two
conceptually different operations into actually different functions.
While I'm here, I've renamed the confusing ldisc_update() function
which that special operation ends up feeding to, because it's not
actually a function applying to an ldisc - it applies to a front end.
So ldisc_send(ldisc,NULL,0,0) is now ldisc_echoedit_update(ldisc), and
that in turn figures out the current echo/edit settings before passing
them on to frontend_echoedit_update(). I think that should be clearer.
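So the two conceptually different operations now look something like this
(prototypes illustrative rather than exact):

    /* Send actual keyboard data through the line discipline. */
    void ldisc_send(Ldisc *ldisc, const char *buf, int len, int interactive);

    /* Tell the front end that the back end's echo/edit settings may
     * have changed; no terminal data is involved at all. */
    void ldisc_echoedit_update(Ldisc *ldisc);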
This ought to happen in ssh_do_close alongside the code that shuts
down other local listening things like port forwardings, for the same
obvious reason. In particular, we should get through this _before_ we
put up a modal dialog box telling the user what just went wrong with
the SSH connection, so that further sessions started while that box is
active don't try futilely to connect to the not-really-listening
zombie upstream.
This provides support for ECDSA public keys, for both hosts and users,
and also ECDH key exchange. Supported curves are currently just the
three NIST curves required by RFC 5656.
I had initially assumed that, since all of a user's per-connection
subdirectories live inside a top-level putty-connshare.$USER directory
that's not accessible to anyone else, there would be no need to
obfuscate the names of the internal directories for privacy, because
nobody would be able to look at them anyway.
Unfortunately, that's not true: 'netstat -ax' run by any user will
show up the full pathnames of Unix-domain sockets, including pathname
components that you wouldn't have had the access to go and look at
directly. So the Unix connection sharing socket names do need to be
obfuscated after all.
Since Unix doesn't have Windows's CryptProtectMemory, we have to do
this manually, by creating a file of random salt data inside the
top-level putty-connshare directory (if there isn't one there already)
and then hashing that salt with the "user@host" connection identifier
to get the socket directory name. What a pain.
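The obfuscation step itself is conceptually just this (a sketch with
approximate interfaces, not the real Unix sharing code; 'hash_oneshot'
stands in for whichever hash function is actually used):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Derive an unrevealing directory name: hash the per-user random
     * salt together with the "user@host" identifier and use the hex
     * digest as the name. */
    char *obfuscate_sockdir_name(const unsigned char *salt, int saltlen,
                                 const char *ident)
    {
        unsigned char digest[32];
        char *name = malloc(2 * sizeof(digest) + 1);
        int i;

        hash_oneshot(salt, saltlen, ident, strlen(ident), digest);
        for (i = 0; i < (int)sizeof(digest); i++)
            sprintf(name + 2 * i, "%02x", digest[i]);
        return name;
    }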
[originally from svn r10222]
This option is available from the command line as '-hostkey', and is
also configurable through the GUI. When enabled, it completely
replaces all of the automated host key management: the server's host
key will be checked against the manually configured list, and the
connection will be allowed or disconnected on that basis, and the host
key store in the registry will be neither consulted nor updated.
The main aim is to provide a means of automatically running Plink,
PSCP or PSFTP deep inside Windows services where HKEY_CURRENT_USER
isn't available to have stored the right host key in. But it also
permits you to specify a list of multiple host keys, which means a
second use case for the same mechanism will probably be round-robin
DNS names that select one of several servers with different host keys.
Host keys can be specified as the standard MD5 fingerprint or as an
SSH-2 base64 blob, and are canonicalised on input. (The base64 blob is
more unwieldy, especially with Windows command-line length limits, but
provides a means of specifying the _whole_ public key in case you
don't trust MD5. I haven't bothered to provide an analogous mechanism
for SSH-1, on the basis that anyone worrying about MD5 should have
stopped using SSH-1 already!)
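For example (the fingerprint shown is a placeholder, not a real key):

    plink -hostkey aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99 user@host

and the option can be given more than once to build up a list of
acceptable keys, or given the base64 blob form instead of a fingerprint.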
[originally from svn r10220]
This is the same code I previously fixed for failing to check NULL
pointers coming back from ssh_pkt_getstring if the server's KEXINIT
ended early, leading to an embarrassing segfault in place of a fatal
error message. But I've now also had it pointed out to me that the
fatal error message passes the string as %s, which is inappropriate
because (being read straight out of the middle of an SSH packet) it
isn't necessarily zero-terminated!
This is still just an embarrassing segfault in place of a fatal error
message, and not exploitable as far as I can see, because the string
is passed to a dupprintf, which will either read off the end of
allocated address space and segfault non-exploitably, or else it will
find a NUL after all and carefully allocate enough space to format an
error message containing all of the previous junk. But still, how
embarrassing to have messed up the same code _twice_.
[originally from svn r10211]
We now expect that after the server has sent us CHANNEL_CLOSE, we
should not expect to see any replies to our outstanding channel
requests, and conversely after we have sent CHANNEL_CLOSE we avoid
sending any reply to channel requests from the server. This was the
consensus among implementors discussing the problem on ietf-ssh in
April 2014.
To cope with current OpenSSH's (and perhaps other servers we don't
know about yet) willingness to send request replies after
CHANNEL_CLOSE, I introduce a bug-compatibility flag which is detected
for every OpenSSH version up to and including the current 6.6 - but
not beyond, since https://bugzilla.mindrot.org/show_bug.cgi?id=1818
promises that 6.7 will also implement the new consensus behaviour.
[originally from svn r10200]
Martin Prikryl reports that it had the exact same bug as old OpenSSH
(insisting that RSA signature integers be padded with leading zero
bytes to the same length as the RSA modulus, where in fact RFC 4253
section 6.6 says it ought to have _no_ padding), but is recently
fixed. The first version string to not have the bug is reported to be
"mod_sftp/0.9.9", so here we recognise everything less than that as
requiring our existing workaround.
[originally from svn r10161]
If we search for a colon by computing ptr + host_strcspn(ptr,":"),
then the resulting pointer is always non-NULL, and the 'not found'
condition is not !p but !*p.
This typo could have caused PuTTY to overrun a string, but not in a
security-bug sense because any such string would have to have been
loaded from the configuration rather than received from a hostile
source.
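That is, the corrected pattern is roughly:

    const char *p = ptr + host_strcspn(ptr, ":");
    if (*p) {
        /* found a colon at p */
    } else {
        /* no colon: p points at the terminating NUL, and is never NULL */
    }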
[originally from svn r10123]
Both GUI PuTTY front ends have a piece of logic whereby a string is
interpreted as host:port if there's _one_ colon in it, but if there's
more than one colon then it's assumed to be an IPv6 literal with no
trailing port number. This permits the PuTTY command line to take
strings such as 'host', 'host:22' or '[::1]:22', but also cope with a
bare v6 literal such as '::1'.
This logic is also required in the two Plink front ends and in the
processing of CONF_loghost for host key indexing in ssh.c, but was
missing in all those places. Add it.
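The rule is easy to state in code; a sketch of the idea (the bracketed
form '[::1]:22' is handled separately by the host_str* helpers, which know
how to skip over the brackets):

    /* Return nonzero iff 'name' should be split as host:port, i.e. it
     * contains exactly one colon. More than one colon means an
     * unbracketed IPv6 literal with no port suffix. */
    static int looks_like_host_port(const char *name)
    {
        int colons = 0;
        const char *p;
        for (p = name; *p; p++)
            if (*p == ':')
                colons++;
        return colons == 1;
    }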
[originally from svn r10121]
I've gone through everywhere we handle host names / addresses (on
command lines, in PuTTY config, in port forwarding, in X display
names, in host key storage...) and tried to make them handle IPv6
literals sensibly, by using the host_str* functions I introduced in my
previous commit. Generally it's now OK to use a bracketed IPv6 literal
anywhere a hostname might have been valid; in a few cases where no
ambiguity exists (e.g. no :port suffix is permitted anyway)
unbracketed IPv6 literals are also acceptable.
[originally from svn r10120]
The line that resets st->pktin->length to cover only the semantic
payload of the SSH message was overwriting the modification to
st->pktin->length performed by the optional decompression step. I
didn't notice because I don't habitually enable compression.
[originally from svn r10103]
[r10070 == 9f5d51a4ac]
I've enabled gcc's format-string checking on dupprintf, by declaring
it in misc.h to have the appropriate GNU-specific attribute. This
pointed out a selection of warnings, which I've fixed.
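The declaration change is the standard GNU-specific one, roughly:

    #ifdef __GNUC__
    char *dupprintf(const char *fmt, ...)
        __attribute__ ((format (printf, 1, 2)));
    #else
    char *dupprintf(const char *fmt, ...);
    #endif

which tells gcc to check the variadic arguments against the format string
just as it does for printf itself.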
[originally from svn r10084]
The basic strategy is described at the top of the new source file
sshshare.c. In very brief: an 'upstream' PuTTY opens a Unix-domain
socket or Windows named pipe, and listens for connections from other
PuTTYs wanting to run sessions on the same server. The protocol spoken
down that socket/pipe is essentially the bare ssh-connection protocol,
using a trivial binary packet protocol with no encryption, and the
upstream has to do some fiddly transformations that I've been
referring to as 'channel-number NAT' to avoid resource clashes between
the sessions it's managing.
This is quite different from OpenSSH's approach of using the Unix-
domain socket as a means of passing file descriptors around; the main
reason for that is that fd-passing is Unix-specific but this system
has to work on Windows too. However, there are additional advantages,
such as making it easy for each downstream PuTTY to run its own
independent set of port and X11 forwardings (though the method for
making the latter work is quite painful).
Sharing is off by default, but configuration is intended to be very
easy in the normal case - just tick one box in the SSH config panel
and everything else happens automatically.
[originally from svn r10083]
Now that it doesn't actually make a network connection because that's
deferred until after the X authorisation exchange, there's no point in
having it return an error message and write the real output through a
pointer argument. Instead, we can just have it return xconn directly
and simplify the call sites.
[originally from svn r10081]
Rather than the top-level component of X forwarding being an
X11Display structure which owns some auth data, it's now a collection
of X11FakeAuth structures, each of which owns a display. The idea is
that when we receive an X connection, we wait to see which of our
available auth cookies it matches, and then connect to whatever X
display that auth cookie identifies. At present the tree will only
have one thing in it; this is all groundwork for later changes.
[originally from svn r10079]
Now we wait to open the socket to the X server until we've seen the
authorisation data. This prepares us to do something else with the
channel if we see different auth data, which will come up in
connection sharing.
[originally from svn r10078]
I don't know that this can ever be triggered in the current state of
the code, but when I start mucking around with SSH session closing in
the near future, it may be handy to have it.
[originally from svn r10076]
The most important change is that, where previously ssh.c held the
Socket pointer for each X11 and port forwarding, and the support
modules would find their internal state structure by calling
sk_get_private_ptr on that Socket, it's now the other way round. ssh.c
now directly holds the internal state structure pointer for each
forwarding, and when the support module needs the Socket it looks it
up in a field of that. This will come in handy when I decouple socket
creation from logical forwarding setup, so that X forwardings can
delay actually opening a connection to an X server until they look at
the authentication data and see which server it has to be.
However, while I'm here, I've also taken the opportunity to clean up a
few other points, notably error message handling, and also the fact
that the same kind of state structure was used for both
connection-type and listening-type port forwardings. Now there are
separate PortForwarding and PortListener structure types, which seems
far more sensible.
[originally from svn r10074]
Because the upcoming connection sharing changes are going to involve
us emitting outgoing SSH packets into our log file that we didn't
construct ourselves, we can no longer rely on metadata inserted at
packet construction time to tell us which parts of which packets have
to be blanked or omitted in the SSH packet log. Instead, we now have
functions that deal with constructing the blanks array just before
passing all kinds of packet (both SSH-1 and SSH-2, incoming and
outgoing) to logging.c; the blanks/nblanks fields in struct Packet are
therefore no longer needed.
[originally from svn r10071]
There's always been some confusion over exactly what it all means. I
haven't cleaned it up to the point of complete sensibleness, but I've
got it to a point where I can at least understand and document the
remaining non-sensibleness.
[originally from svn r10070]
It's now indexed by source hostname as well as source port (so that
separate requests for the server to listen on addr1:1234 and
addr2:1234 can be disambiguated), and also its destination host name is
now dynamically allocated rather than held in a fixed-size buffer.
[originally from svn r10062]
Anthony Ho reports that this can occur naturally in some situations
involving Windows 8 + IE 11 and dynamic port forwarding: apparently we
get through the SOCKS negotiation, send our CHANNEL_OPEN, and then
*immediately* suffer a local WSAECONNABORTED error before the server
has sent back its OPEN_CONFIRMATION or OPEN_FAILURE. In this situation
ssh2_channel_check_close was failing to notice that the channel didn't
yet have a valid server id, and sending out a CHANNEL_CLOSE anyway
containing 32 bits of uninitialised nonsense.
We now handle this by turning our half-open CHAN_SOCKDATA_DORMANT into
a half-open CHAN_ZOMBIE, which means in turn that our handler
functions for OPEN_CONFIRMATION and OPEN_FAILURE have to recognise and
handle that case, the former by immediately initiating channel closure
once we _do_ have the channel's server id to do it with.
[originally from svn r10039]
CHAN_AGENT channels need c->u.a.message to be either NULL or valid
dynamically allocated memory, because it'll be freed by
ssh_channel_destroy. This bug triggers if an agent forwarding channel
is opened and closed without having sent any queries.
[originally from svn r10032]
We now only present the full set of host key algorithms we can handle
in the first key exchange. In subsequent rekeys, we present only the
host key algorithm that we agreed on the previous time, and then we
verify the host key by simply enforcing that it's exactly the same as
the one we saw at first and disconnecting rudely if it isn't.
[originally from svn r10027]
sitting on a pile of buffered data waiting for WINDOW_ADJUSTs, we
should throw away that buffered data, because the CHANNEL_CLOSE tells
us that we won't be receiving those WINDOW_ADJUSTs, and if we hang on
to the data and keep trying then it'll prevent ssh_channel_try_eof
from sending the CHANNEL_EOF which is a prerequisite of sending our
own CHANNEL_CLOSE.
[originally from svn r9953]
crWaitUntilV(pktin) with plain crReturnV, because those coroutines can
be called back either with a response packet from the channel request
_or_ with NULL by ssh_free meaning 'please just clean yourself up'.
[originally from svn r9927]
warnings about insecure crypto components. The latter may crReturn
(though not in any current implementation, I believe), which
invalidates pktin, which is used by the former.
[originally from svn r9921]
of the GET_32BIT macros and then used as length fields. Missing bounds
checks against zero have been added, and also I've introduced a helper
function toint() which casts from unsigned to int in such a way as to
avoid C undefined behaviour, since I'm not sure I trust compilers any
more to do the obviously sensible thing.
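The helper is short; something along these lines (the real one lives in
misc.c, this is just the gist):

    #include <limits.h>

    /* Convert an unsigned to an int without relying on what a plain
     * cast does for out-of-range values. Anything too big to fit
     * comes back negative, so callers' '< 0' sanity checks still
     * reject it. */
    int toint(unsigned u)
    {
        if (u <= INT_MAX)
            return (int)u;
        else if (u >= (unsigned)INT_MIN)   /* two's-complement wrap */
            return INT_MIN + (int)(u - (unsigned)INT_MIN);
        else
            return INT_MIN;                /* shouldn't happen */
    }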
[originally from svn r9918]
since there is a theoretical code path (via the crReturn loop after
asking an interactive question about a host key or crypto algorithm)
on which we can leave and return to do_ssh1_login between allocating
and freeing those keys.
(In practice it shouldn't come up anyway with any of the current
implementations of the interactive question functions, not to mention
the unlikelihood of anyone non-specialist still using SSH-1, but
better safe than sorry.)
[originally from svn r9895]
as specified in RFC 6668. This is not so much because I think it's
necessary, but because scrypt uses HMAC-SHA-256 and once we've got it we
may as well use it.
Code very closely derived from the HMAC-SHA-1 code.
Tested against OpenSSH 5.9p1 Debian-5ubuntu1.
[originally from svn r9759]
RFC 4254 section 7.1 specifies the meaning of the "address to bind"
parameter in a "tcpip-forward" request. "0.0.0.0" and "127.0.0.1" are
specified to be all interfaces and the loopback interface respectively
in IPv4, while "" and "localhost" are the address-family-agnostic
equivalents. Switch PuTTY to using the latter, since it doesn't seem
right to force IPv4.
There's an argument that PuTTY should provide a means of configuring the
address family used for remote forwardings like it does for local ones.
[originally from svn r9668]
First, make absolute times unsigned. This means that it's safe to
depend on their overflow behaviour (which is undefined for signed
integers). This requires a little extra care in handling comparisons,
but I think I've correctly adjusted them all.
Second, functions registered with schedule_timer() are guaranteed to be
called with precisely the time that was returned by schedule_timer().
Thus, it's only necessary to check these values for equality rather than
doing risky range checks, so do that.
The timing code still does lots that's undefined, unnecessary, or just
wrong, but this is a good start.
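So a timer callback can now just test for equality; a sketch of the
pattern (the callback signature approximates PuTTY's timer_fn_t):

    static unsigned long next_due;   /* value returned by schedule_timer */

    static void my_timer(void *ctx, unsigned long now)
    {
        if (now == next_due) {
            /* do the periodic work, then reschedule */
            next_due = schedule_timer(1000 /* ms */, my_timer, ctx);
        }
    }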
[originally from svn r9667]
confused if they receive a request followed by immediate EOF, since we
currently send outgoing EOF as soon as we see the incoming one - and
then, when the response comes back from the real SSH agent, we send it
along anyway as channel data in spite of having sent EOF.
To fix this, I introduce a new field for each agent channel which
counts the number of calls to ssh_agentf_callback that are currently
expected, and we don't send EOF on an agent channel until we've both
received EOF and that value drops to zero.
[originally from svn r9651]
move the primary conditions out of them into their callers. Fixes a
crash in 'plink -N', since those functions would be called with a NULL
channel parameter and immediately dereference it to try to get c->ssh.
[originally from svn r9644]
They're only likely to be useful for freeing a coroutine state
structure, in which case there's no need to reset the line number
(since all such coroutines keep their line number in the state
structure) and the state structure pointer is always called "s".
[originally from svn r9632]
In sshfwd_unclean_close(), get ssh2_check_close() to handle sending
SSH_MSG_CHANNEL_CLOSE. That way, it can hold off doing so until any
outstanding channel requests are processed.
Also add event log message for unclean channel closures.
[originally from svn r9631]
crFinish or crFinishV, since they will attempt to write to the
coroutine state variable contained in that structure. Introduced some
new all-in-one macros crFinishFree and crFinishFreeV, and used those
instead. Should fix today's report of a crash just after authentication.
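For reference, the gist of the new macro (assuming the usual PuTTY
coroutine convention that crBegin opens a switch on a line-number variable
inside the state structure; the exact text may differ):

    /* Like crFinishV, but frees the state structure instead of
     * writing a 'finished' line number back into it, since by this
     * point that structure is about to stop existing. */
    #define crFinishFreeV(s)  } sfree(s); return; }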
[originally from svn r9630]
Part the first: make sure that all structures describing channel
requests are freed when the SSH connection is freed. This involves
adding a means to ask a response handler to free any memory it holds.
Part the second: in ssh_channel_try_eof(), call
ssh2_channel_check_close() rather than emitting an SSH_MSG_CHANNEL_EOF
directly. This avoids the possibility of closing the channel while a
CHANNEL_REQUEST is outstanding.
Also add some assertions that helped with tracking down the latter
problem.
[originally from svn r9623]
This reduces code size a little and also makes it harder to
accidentally request a reply without putting in place a handler for
it or vice versa.
[originally from svn r9620]
The various setup routines can only receive CHANNEL_SUCCESS or
CHANNEL_FAILURE, so there's no need for them to worry about receiving
anything else. Strange packets will end up in do_ssh2_authconn
instead.
[originally from svn r9619]