In the new modular SSH architecture, ssh2transport.c delegates the
actual KEX packet exchange to ssh2kex_coroutine, which has different
implementations for client and server. The KEX code actually in
ssh2transport.c consists of looping on the coroutine until it zeroes
out its state field in the ssh2transport state.
But if something goes wrong enough during KEX that we call
ssh_proto_error or any other fatal connection-terminating function,
then when we return to ssh2transport.c, the ssh2transport state won't
even exist for it to check that flag. Address Sanitiser pointed that
out to me recently, so here's a fix in which we set an 'aborted' flag
to tell the caller that its state has already been freed.
A user points out that the call to close_directory() in pscp.c's
rsource() function should have been inside rather than outside the if
statement that checks whether the directory handle is NULL. As a
result, any failed attempt to open a directory during a 'pscp -r'
recursive upload leads to a null-pointer dereference.
Moved the close_directory call to where it should be, and also
arranged to print the OS error code if the directory-open fails, by
also changing the API of open_directory to return an error string on
failure.
In the previous commit I happened to notice a %.150s in a ppl_logevent
call, which was probably an important safety precaution a couple of
decades ago when that format string was being used for an sprintf into
a fixed-size buffer, but now it's just pointless cruft.
This commit removes all printf string formatting directives with a
compile-time fixed size, with the one exception of a %.3s used to cut
out a 3-letter month name in scpserver.c. In cases where the format
string in question was already going to an arbitrary-length function
like dupprintf or ppl_logevent, that's all I've done; in cases where
there was still a fixed-size buffer, I've replaced it with a dynamic
buffer and dupprintf.
In the past, I've had a lot of macros which you call with double
parentheses, along the lines of debug(("format string", params)), so
that the inner parens protect the commas and permit the macro to treat
the whole printf-style argument list as one macro argument.
That's all very well, but it's a bit inconvenient (it doesn't leave
you any way to implement such a macro by prepending another argument
to the list), and now this code base's rules allow C99isms, I can
switch all those macros to using a single pair of parens, using the
C99 ability to say '...' in the parameter list of the #define and get
at the corresponding suffix of the arguments as __VA_ARGS__.
So I'm doing it. I've made the following printf-style macros variadic:
bpp_logevent, ppl_logevent, ppl_printf and debug.
While I'm here, I've also fixed up a collection of conditioned-out
calls to debug() in the Windows front end which were clearly expecting
a macro with a different calling syntax, because they had an integer
parameter first. If I ever have a need to condition those back in,
they should actually work now.
In the big boolification commit (3214563d8) I accidentally rewrote
"term->wrap == 0" as "term->wrap" instead of as "!term->wrap", so now
sending the backspace character to the terminal at the start of a line
causes the cursor to wrap round to the end of the previous line if and
only if it _shouldn't_ have done.
A long time ago, in commit 4d77b6567, I moved the generation of the
arrow-key escape sequences into a function format_arrow_key(). Mostly
the reason for that was a special purpose I had in mind at the time
which involved auto-generating the same sequences in response to
things other than a keypress, but I always thought it would be nice to
centralise a lot more of PuTTY's complicated keyboard handling in the
same way - at least the handling of the function keys and their
numerous static and dynamic config options.
In this year's general spirit of tidying up and refactoring, I think
it's finally time. So here I introduce three more centralised
functions for dealing with the numbered function keys, the small
keypad (Ins, Home, PgUp etc) and the numeric keypad. Lots of horrible
and duplicated code from the key handling functions in window.c and
gtkwin.c is now more sensibly centralised: each platform keyboard
handler concerns itself with the local format of a keyboard event and
platform-specific enumeration of key codes, and once it's decided what
the logical key press actually _is_, it hands off to the new functions
in terminal.c to generate the appropriate escape code.
Mostly this is intended to be a refactoring without functional change,
leaving the keyboard handling how it's always been. But in cases where
the Windows and GTK handlers were accidentally inconsistent, I've
fixed the inconsistency rather than carefully keeping both sides how
they were. Known consistency fixes:
- swapping the arrow keys between normal (ESC [ A) and application
(ESC O A) is now done by pressing Ctrl with them, and _not_ by
pressing Shift. That was how it was always supposed to work, and
how it's worked on GTK all along, but on Windows it's been done by
Shift as well since 2010, due to a bug at the call site of
format_arrow_key() introduced when I originally wrote that function.
- in Xterm function key mode plus application keypad mode, the /*-
keys on the numeric keypad now send ESC O {o,j,m} in place of ESC O
{Q,R,S}. That's how the Windows keyboard handler has worked all
along (it was a deliberate behaviour tweak for the Xterm-like
function key mode, because in that mode ESC O {Q,R,S} are generated
by F2-F4). But the GTK keyboard handler omitted that particular
special case and was still sending ESC O {Q,R,S} for those keys in
all application keypad modes.
- also in Xterm function key mode plus app keypad mode, we only
generates the app-keypad escape sequences if Num Lock is on; with
Num Lock off, the numeric keypad becomes arrow keys and
Home/End/etc, just as it would in non-app-keypad mode. Windows has
done this all along, but again, GTK lacked that special case.
Now $(CC) is defined to be nothing but the name of the clang-cl binary
itself, which makes it easier to drop in a different one for a special
purpose.
(I tried to use this for static analysis recently - unsuccessfully, as
yet, but I think this change will make anything else along the same
lines easier as well.)
It's just silly to have _two_ systems for traversing a string of
comma-separated protocol ids. I think the new get_commasep_word
technique for looping over the elements of a string is simpler and
more general than the old membership-testing approach, and also it's
necessary for the modern KEX untangling system (which has to be able
to loop over one string, even if it used a membership test to check
things in the other). So this commit rewrites the two remaining uses
of in_commasep_string to use get_commasep_word instead, and deletes
the former.
If GSSAPI authentication fails because we call the GSS acquire_cred
function on the client side and find it doesn't give us anything
useful, then that authentication attempt has to terminate - but since
_we_ decided to terminate it, on the client side, the server will be
sending us neither a formal USERAUTH_FAILURE nor any other kind of
packet.
So when we go back round to the top of the auth loop, we have to avoid
_either_ assuming we're sitting on a USERAUTH_FAILURE we can parse for
its method list, _or_ waiting to receive one. Instead we just have to
push on and try the next auth method in the list from the last
USERAUTH_FAILURE we did see.
Hence, a new flag lets us suppress the usual behaviour of waiting
until we have a response packet on the queue, and then all references
to pktin after that are tested for NULL.
There are situations - or _should_ be, at any rate - in which we
terminate a userauth attempt without having received a
USERAUTH_FAILURE from the server, which means that we can't depend on
always starting a userauth loop iteration by extracting the server's
list of permitted methods from the current failure message. If there
isn't a current failure message, the best we can do is remember the
state from last time.
That's already what we do for actually deciding which methods to
attempt (we set s->can_foo from the methods string). But we should
also keep the full original version of the string, for use in error
message.
The flag 'cross_certifying' in the SSH-2 transport layer state is now
a pointer to the host key algorithm we expect to be certifying,
instead of a plain bool. That lets me check by assertion that it's
what we expected it to be after all the complicated key exchange has
happened.
(I have no reason to think this _will_ go wrong. When we cross-
certify, the desired algorithm should be the only one we put into our
KEXINIT host key algorithm list, so it should also be the only one we
can come out of the far end of KEXINIT having selected. But if
anything ever does go wrong with my KEXINIT handling then I'd prefer
an assertion failure to silently certifying the wrong key, and also,
this makes it clearer to static analysers - and perhaps also humans
reading the code - what we expect the situation to be.)
It hasn't been possible for canonify() to return a null pointer since
commit 094dd30d9, in 2001. But the whole of psftp.c is full of error
checking clauses that allow for the possibility that it might!
This fixes a batch of clang-analyzer warnings of the form 'you
declared / assigned this variable and then never use it'. It doesn't
fix _all_ of them - some are there so that when I add code in the
future _it_ can use the variable without me having to remember to
start setting it - but these are the ones I thought it would make the
code better instead of worse to fix.
This is mostly to make static analysers and compiler warnings a bit
happier - now they know that a call to, say, modalfatalbox() means
they don't have to worry about what the rest of the function will do.
The ad-hoc code that received data from the SCP or SFTP server
predated even not-very-modern conveniences such as bufchain, and was
quite horrible and cumbersome.
Particularly nasty was the part where ssh_scp_recv set a _global_
pointer variable to the buffer it was in the middle of writing to, and
then recursed and expected a callback to use that pointer. That caused
clang-analyzer to grumble at me, in a particular case where the output
buffer was in the ultimate caller's stack frame; even though I'm
confident the code _worked_, I can't blame clang for being unhappy!
So now we do things the modern and much simpler way: the callback when
data comes in just puts it on a bufchain, and the top-level
ssh_scp_recv repeatedly waits until data arrives in the bufchain and
then copies it to the output buffer.
'struct str' in terminal.c was an earlier and less good implementation
of the same concept as misc.h's strbuf, so I've replaced it with the
same strbuf we have everywhere. As a bonus, this means I can also use
put_uint{16,32} to save a bit of effort writing out the compressed
scrollback data.
On the decompression side, I've also switched to using BinarySource,
which has the advantage that now if the decoding goes wrong we can at
least be sure of not reading beyond the end of the buffer.
(The flip side of that is that now we _store_ the length of each
compressed line buffer, which costs a bit of memory. But I think it's
worth it for the safety and code consistency.)
uxnet.c's sk_namelookup and the sorting-key construction in
pangofont_enum_fonts() were both using s[n]printf and strncpy into
buffers that had no real need to be fixed-size; format_telnet_command
and the GTK Event Log selection-data builder were doing their own
sresize loops, but now we have strbuf they can just use that and save
redoing the same work.
The code that parses hexadecimal test arguments out of test lines
writes them into a buffer in binary form, and sets ptrs[i] to be the
starting point of each argument. The idea is that ptrs[i+1]-ptrs[i] is
the length of each argument - but for that to apply to the _final_
argument, we need to have set one final element in ptrs[], which I
forgot to do.
The variable 'toret' in ssh2_transport_get_specials would have been
returned while uninitialised in the case where neither of the if
statements in the function set it to true.
The BPP_READ macros in all four BPP implementations (including
sshverstring) had the same bug: if EOF had been seen on the network
input but there was _also_ enough data in the input queue to satisfy
the current request, they would jump straight to complaining about the
EOF rather than processing the available data first.
I spotted this while trying to pipe in test data from a disk file, but
it could easily also lead to us failing to handle the final message in
the connection, e.g. losing the error message sent by the remote in a
DISCONNECT message.
An empty display number matches any display number.
For example xauth list :1 returns auth cookies where the
display number matches and where the display number is empty.
Now, instead of getting the zlib test/helper program by manually
compiling a source file with unusual options, it gets built as
standard by the ordinary Makefile.
Now they live in their own file memory.c. The advantage of this is
that you can link them into a binary without also pulling in the rest
of misc.c with its various dependencies on other parts of the code,
such as conf.c.
It stopped being useful in commit 20a9bd564, where I removed the only
code that called it. I've only just noticed that this part of the
mechanism is still lying around.
In commit 884a7df94 I claimed that all my trait-like vtable systems
now had the generic object type being a struct rather than a bare
vtable pointer (e.g. instead of 'Socket' being a typedef for a pointer
to a const Socket_vtable, it's a typedef for a struct _containing_ a
vtable pointer).
In fact, I missed a few. This commit converts ssh_key, ssh2_cipher and
ssh1_cipher into the same form as the rest.
This should have been moved over from the main ssh_free function back
when I did the original splitting-up of ssh.c: the transport layer
schedules a timer for rekeying (and also for GSSAPI credential
checks), so when it's freed, it needs to ensure the timer doesn't get
called anyway on a stale pointer.
Two users reported this in the form of an assertion failure in
conf_get_int (when ssh2_transport_timer asks for CONF_ssh_rekey_time,
if the tree234 call inside conf_get_int is confused by the contents of
the freed memory into returning failure). In other circumstances (if
the freed memory has different contents) it manifests as a segfault,
but it's the same underlying bug either way.
Of course this wouldn't have prevented me from making that mistake
myself - it's not as if I carefully re-read the design principles
appendix before writing each code change! - but it might help explain
to _someone_ at some point...
When I added a use of PRIx32 to one of Pageant's debugging messages a
couple of days ago, I forgot that one of my build setups can't cope
with inclusion of <inttypes.h>, and somehow also forgot the
precautionary pre-push full build that would have reminded me.
Now the RSA signing function supports the two flags defined in
draft-miller-ssh-agent-02, and uses them to generate RSA signatures
based on SHA-256 and SHA-512, which look exactly like the ordinary
kind of RSA SHA-1 signature except that the decoded signature integer
has a different hash at the bottom and an ASN.1 identifying prefix to
match, and also the signature-type string prefixing the integer
changes from "ssh-rsa" to "rsa-sha2-256" or "rsa-sha2-512" as
appropriate.
We don't _accept_ signatures of these new types - that would need an
entirely different protocol extension - and we don't generate them
under any circumstances other than Pageant receiving a sign request
with one of those flags set.
Now each public-key algorithm gets to indicate what flags it supports,
and the ones it specifies support for may turn up in a call to its
sign() method.
We still don't actually support any flags yet, though.
A couple of people have mentioned to me recently that these days
OpenSSH is appending a uint32 flags word to the agent sign request,
with flags that ask for an RSA signature to be over a SHA-256 or
SHA-512 hash instead of the SHA-1 standardised in ssh-rsa.
This commit adds support for the mandatory part of this protocol: we
notice the flags word at all (previously we stopped parsing the packet
before even finding it there), and return failure to the signing
request if it has any flag set that we don't support, which currently
means if it has any flag set whatsoever.
While I'm here, I've also added an error check for an undecodable sign
request. (It seemed silly to be checking get_err(msg) _after_ trying
to read the flags word without also having checked it before.)
The event log messages generated during DH key exchange now include both the
modulus size and hash algorithm used as well as whether the DH parameters
are from one of the standardized groups or were supplied by the server
during Group Exchange.
The gdb version of container_of can do better than the C function,
because you don't have to specify the structure field name if it can
be inferred from the type of the input expression.
And $list234 can be made to automatically list the contents of each
tree element, not just a pointer to it - just the thing for looking
quickly through sktree or s->channels to find the one you're after.
Jacob ran across a server which terminated a GSS userauth attempt by
sending that message before USERAUTH_FAILURE. The result was that
PuTTY interpreted the ERRTOK as indicating a failure of authentication
(so far, fair enough), and then handled it exactly as it would have
handled USERAUTH_FAILURE itself: it pushed it back on in_pq and went
back round the main userauth loop. But the thing that loop is
expecting to find at the head of in_pq is an _actual_
USERAUTH_FAILURE, not some arbitrary thing that preceded it. So this
led to some confusion, especially when the real USERAUTH_FAILURE that
immediately followed the ERRTOK got interpreted as the response to the
_next_ auth attempt.
One of the root causes is that we had no handler for ERRTOK at all. We
now do. (Though it's trivial - we ignore the content of the message
and just wait for the followup USERAUTH_FAILURE that the GSSAPI RFC
says MUST come next. Possibly there's something nicer we could do
involving handing the error token to the GSSAPI library and letting it
print a final user-facing message? But that's beyond my GSS expertise.)
But this also exposed another problem: we shouldn't be pushing _any_
packet back on in_pq unless it's actually a USERAUTH_FAILURE. Any
other unexpected packet should have _different_ confused handling. So
now that call to pq_push_front is conditionalised on the packet type,
and only triggers for USERAUTH_FAILURE proper.
Looks as if I introduced this bug in commit 431f92ade, when I moved
mainchan out into its own source file: the previous version of
mainchan_send_eof conditionalised the setting of mc->eof_sent in the
same if statement that actually sent the EOF, but somehow, in the new
version, only one of those operations was inside the if.
The effect is that in plink -nc mode, if the server sends EOF first,
the client stops listening to standard input at its own end, so it
never knows when to send EOF back and clean things up.
It claimed they were only found in ssh.c, which is no longer true:
after I broke up ssh.c into smaller pieces, they're now found all over
the place.
Also, one of the things I did during that refactoring was to arrange
that each protocol layer's cleanup function (hopefully) reliably frees
everything the coroutine might have allocated and been in the middle
of using, which was something I knew the old code was quite bad at. So
I've mentioned that in the coroutines section too, while I'm here.
I've recently started using several C99 features in PuTTY, after
finally reaching the point where it didn't break my builds to do so,
even on Windows. So it's now outright inaccurate for the documented
design principles to claim that we're sticking to C90.
While I'm here, I've filled in a bit more detail about the assumptions
we do permit.
ssh2_channel_check_throttle should only be called on channels for
which c->chan != NULL - that is, only for channels that are not
delegated to a sharing downstream. But throttle_all_channels was
calling it for _all_ channels, so if it had the bad luck to be called
while a sharing downstream was active, ssh2_channel_check_throttle
would dereference the null c->chan for the first downstream channel it
found.
This is another cleanup I felt a need for while I was doing
boolification. If you define a function or variable in one .c file and
declare it extern in another, then nothing will check you haven't got
the types of the two declarations mismatched - so when you're
_changing_ the type, it's a pain to make sure you've caught all the
copies of it.
It's better to put all those extern declarations in header files, so
that the declaration in the header is also in scope for the
definition. Then the compiler will complain if they don't match, which
is what I want.
sk_startup and sk_nextaddr are entirely internal to winnet.c; nearly
all of import.c and minibidi.c's internal routines should have been
static and weren't; {read,write}_utf8 are internal to charset/utf8.c
(and didn't even need separate declarations at all); do_sftp_cleanup
is internal to psftp.c, and get_listitemheight to gtkdlg.c.
While I was editing those prototypes anyway, I've also added missing
'const' to the 'char *' passphrase parameters in import,c.
For a start, they now have different names on Windows and Unix,
reflecting their different roles: on Windows they apply escaping to
any string that's going to be used as a registry key (be it a session
name, or a host name for host key storage), whereas on Unix they're
for constructing saved-session file names in particular (and also
handle the special case of filling in "Default Settings" for NULL).
Also, they now produce output by writing to a strbuf, which simplifies
a lot of the call sites. In particular, the strbuf output idiom is
passed on to enum_settings_next, which is especially nice because its
only actual caller was doing an ad-hoc realloc loop that I can now get
rid of completely.
Thirdly, on Windows they're centralised into winmisc.c instead of
living in winstore.c, because that way Pageant can use the unescape
function too. (It was spotting the duplication there that made me
think of doing this in the first place, but once I'd started, I had to
keep unravelling the thread...)
There's one of these centralised in win_strerror() in winmisc.c, and
it doesn't seem worth keeping an earlier iteration of the same idea
entirely separate in winsock_error_string.
This removal means that non-network-specific error codes received in a
network context will no longer have "Network error:" prefixed to them.
But I think that's OK, because it seems unlikely to be critically
important that such an error was received from a network function - if
anything like that comes up then it's probably some kind of systemwide
chaos.