In both stripctrl_locale_put_wc() and stripctrl_term_put_wc(), we get
a character width from either wcwidth or term_char_width, which might
be -1 for a control character. Coverity complained that that width
value is later passed to to stripctrl_check_line_limit, which takes a
size_t parameter, i.e. it expects the width to be non-negative.
This is another bug that doesn't actually cause damage, but the
reasons why are quite fiddly:
- The only width<0 characters we let through are those that got
through stripctrl_ctrlchar_ok. (At both call sites, if that
function is unhappy then we either take an early return or else
replace the character _and_ the value of 'width' with a
substitution.)
- stripctrl_ctrlchar_ok only lets through \n, or \r if permit_cr is
set.
- stripctrl_check_line_limit ignores its width parameter if line
limiting is turned off, or if it gets \n.
- So the only way this can cause a problem is if permit_cr is set so
that \r can get through ctrlchar_ok, *and* line limiting is turned
on so that we get far enough through check_line_limit to choke on
its negative width.
- But in fact all the call sites that ever call
stripctrl_enable_line_limiting do it with a StripCtrlChars that
they got from a seat, and all the nontrivial implementations of
seat_stripctrl_new pass false for permit_cr.
So the combination of circumstances in which -1 gets implicitly
converted to size_t *and* stripctrl_check_line_limit actually pays
attention to it can never arise. But it's clearly foolish to make the
correctness of the code depend on a proof that long - now Coverity has
pointed it out, I'm not happy with it either! Fixed by substituting 0
for the width in the questionable cases.
Coverity points out that we're inconsistent about checking it for
NULL: seat_stripctrl_new() *can* return NULL, so in principle, we can
go through the initialisation code for s->ki_scc and have it still be
NULL afterwards. I check that in the code that uses it to sanitise the
prompt strings out of USERAUTH_INFO_REQUEST, but not in the code that
sanitises the name or instruction strings. Now all uses are checked in
the same way.
This is only a latent bug, because the four main Seat implementations
(GUI and console, on Windows and Unix) never return NULL. The only
implementation of seat_stripctrl_new which _can_ return NULL is the
trivial nullseat_stripctrl_new, currently only used by Uppity.
However, of course, if that changes in future, this latent bug could
turn into a real one, so better to fix it before that happens. Thanks
Coverity.
At Coverity's urging, I put in an instance of the Fortuna PRNG in
place of the use of rand() to select which drawing area to light up
next on a keypress. But Coverity turns out to be unhappy that I'm
still using rand() to select the _initial_ area, before the user
enters any input at all.
It's even harder to imagine this being a genuine information leak than
the previous complaint. But I don't really see a reason _not_ to
switch it over, now it's been pointed out.
Notable changes:
- mentioned ASan, which now seems to be good enough to take over from
valgrind as my first-choice memory debugging tool
- release announcements now live in putty-aux
- a couple of clarifications of which directory to be in for which
commands, and adjusted relative paths appropriately
- remove obsolete mention of entering the GPG release key passphrase
numerous times (that was for GPG 1, without gpg-agent)
- remind myself that in the final signing procedure, you have to undo
the chmod you did for safety a couple of weeks earlier!
If the agent sent a response whose length field describes an interval
of memory larger than the file-mapping object the message is supposed
to be stored in, we shouldn't return that message to the client as if
nothing is wrong. Treat that the same as a failure to receive any
response at all.
Looking over this function today, I spotted several questionable uses
of strcat to concatenate "\PUTTY.RND" to the end of a pathname,
without having checked whether the pathname had filled up the static
fixed-size buffer already.
I don't think this is exploitable (because you'd have to be in control
of the local account already to control any of the data sources used
to fill those buffers). But it's horrible anyway, of course. Now all
of those are replaced with sensible dupcats.
(This patch re-indents a lot of the function, to give variables
tighter scopes. So the diff is best viewed with whitespace ignored.)
If a server sent a very large number as extended_count, and didn't
actually send any extended attributes, we could loop around and around
calling get_string, which would carefully not overrun any buffer or
leak any memory, and we weren't paying attention to extended
attributes anyway, but it would still pointlessly consume CPU.
Now we bail as soon as the BinarySource flags an error. Current
callers will then spot the error and complain that the packet was
malformed.
We were carefully checking for overflow inside safemalloc() before
multiplying together the two factors of the desired allocation size.
But snew_plus() did an addition _outside_ safemalloc, without the same
guard. Now that addition also happens inside safemalloc.
If _both_ the host key and the server key were less than 32 bytes
long, then less than 32 bytes would be allocated for the buffer
s->rsabuf, into which the 32-byte session id is then copied.
If the packet length field was in the range 0 <= x < 5, then it would
pass the initial range check, but underflow to something in the region
of 0xFFFFFFFF when the BPP code subtracted 5 from it, leading to an
overlarge memory allocation, and/or allocation failure, and perhaps
worse.
Introduced by 61f3e3e29, as part of my periodic efforts to make the
GTK front end work usefully on OS X: the 'Unable to open connection to
[host]: [error]' message box is accidentally passed through two layers
of printf format-string parsing, the second of which has no argument
list. So if you pass in a host name like '%s' on the command line,
bad things will happen when that error message is constructed.
This happened because in that commit I changed a call to
fatal_message_box() into a call to connection_fatal(), without
noticing that the latter was printf-style variadic and the former
wasn't. On the plus side, that means now I can remove the explicit
dupprintf/free around the error message.
This mitigates a borderline-DoS in which a malicious SFTP server sends
a ludicrously large number of file names in response to a SFTP
opendir/readdir request sequence, causing the client to buffer them
all and use up all the system's memory simply so that it can produce
the output in sorted order.
I call it a 'borderline' DoS because it's very likely that this is the
same server that you'll also trust to actually send you the _contents_
of some entire file or directory, in which case, if they want to DoS
you they can do that anyway at that point and you have no way to tell
a legit very large file from a bad one. So it's unclear to me that
anyone would get any real advantage out of 'exploiting' this that they
couldn't have got anyway by other means.
That said, it may have practical benefits in the occasional case.
Imagine a _legit_ gigantic directory (something like a maildir,
perhaps, and perhaps stored on a server-side filesystem specialising
in not choking on really huge single directories), together with a
client workflow that involves listing the whole directory but then
downloading only one particular file in it.
For the moment, the threshold size is fixed at 8Mb of total data
(counting the lengths of the file names as well as just the number of
files). If that needs to become configurable later, we can always add
an option.
While testing the previous fix, I also noticed that s->num_prompts is
an ordinary signed int. So if the server sends a _really_ large value,
it'll be treated as negative.
That's kind of harmless: our loop wouldn't read any prompts at all
from the packet, and then it would send back the same nonsense count
with no responses. But it's inelegant: now, if the server violates the
protocol in this way, we respond by sending an even wronger packet in
return.
Changed the type of num_prompts and the loop variable to uint32_t, so
now we'll respond by actually trying to read that many prompts, which
will fail by reaching the new error check. I think that's a more
sensible way to handle this one.
If a server sends a very large number as the num-prompts field, we'd
loop round calling get_string and get_bool, which would run off the
end of the buffer but still carefully return legal default values (the
empty string and false), so we'd carry on piling pointless stuff into
s->cur_prompt and using up lots of memory.
Coverity pointed this out by warning that an untrusted server-provided
value was being used as a loop bound. I'm not convinced that's an
error in every case, but I must admit it pointed out a useful thing
here!
The fix is to check the error indicator on the BinarySource after
reading each prompt from the input packet. Then we'll only keep
allocating memory as long as there's actually data in the packet.
Coverity points out that wcwidth is capable of returning a negative
number, which suggests that it's a mistake to pass its return value
unchecked to stripctrl_check_line_limit.
This shouldn't cause a problem _in principle_, because by the time
we're making that call, we should already have ruled out the kind of
dangerous control character that might provoke that return value from
wcwidth. But in practice, I couldn't absolutely guarantee that
everyone's idea of what is or is not a control character agrees in
every detail, so I think Coverity is right to urge caution.
Fixed by calling wcwidth (or its wrapper term_char_width) up front,
and ensuring that any character provoking a negative return value is
included in the 'control characters to sanitise out' branch of the
preceding logic.
When I set up the simplified version of just the ssh-connection
protocol we use for connection sharing, I carefully documented at the
top of sshshare.c that packets in that protocol variant are limited to
just over 0x4000 bytes. And did I remember to actually honour that, by
restricting the sizes of outgoing packets when actually speaking the
bare connection protocol? I did not.
Well, better late than never. Here I introduce a packet size limit
that can be imposed by the BPP, and arrange for sshconnection.c to
take the min of that and any given channel's max packet size as sent
by the remote end. All BPPs except ssh2bpp-bare set it to the no-op
value of the largest possible uint32_t.
In normal builds this makes no difference, but in Windows builds with
the Minefield diagnostic system turned on, free()ing a Minefield-
allocated object causes a crash. Apparently I haven't tested Pageant
under Minefield for ages - only PuTTY, talking to an ordinary Pageant
I'd already started up.
The field s->username in the ssh2userauth state was sometimes a
secondary reference to s->default_username, and sometimes a
dynamically allocated string. So not freeing it can leak memory, but
freeing it (if non-null) can cause a double-free crash.
Solution: add a new field s->locally_allocated_username which is the
thing that needs freeing (if any), and make s->username into a const
char * which is always a secondary reference to _something_.
How embarrassing: we were looping back round to prompt for a username
again if s->change_username was *true*, but then the test at the top
would only have reissued the prompt if it was *false*. So that
configuration checkbox must have been broken for ages. It's a good job
most SSH servers don't reward enabling it!
It's a secondary reference to existing data (namely an argv word, if
anything at all), and never needs to be written through, so we should
mark it const, if nothing else to catch any absent-minded attempts to
free it.
This assertion was supposed to be checking for the buffer overrun
fixed by the previous commit, but because it checks the buffer index
just _after_ writing into the buffer, it would have permitted a
one-byte overrun before failing the assertion.
Commit 71e42b04a's refactoring of terminal keyboard input, in the case
where a Unicode string derived from a keystroke is translated into
some other charset to put on the wire, had allocated the output buffer
for that translation one byte too small.
If you select an entry in the saved sessions list box, but without
double-clicking to actually load it, and then you hit OK, the config-
box code will automatically load it. That just saves one click in a
common situation.
But in order to load that session, the config-box system first has to
ask the list-box control _which_ session is selected. On Windows, this
causes an assertion failure if the user has switched away from the
Session panel to some other panel of the config box, because when the
list box isn't on screen, its Windows control object is actually
destroyed.
I think a sensible answer is that we shouldn't be doing that implicit
load behaviour in any case if the list box isn't _visible_: silently
loading and launching a session someone selected a lot of UI actions
ago wasn't really the point. So now I make that behaviour only happen
when the list box (i.e. the Session panel) _is_ visible. That should
prevent the assertion failure on Windows, but the UI effect is cross-
platform, applying even on GTK where the control objects for invisible
panels persist and so the assertion failure didn't happen. I think
it's a reasonable UI change to make globally.
In order to implement it, I've had to invent a new query function so
that config.c can tell whether a given control is visible. In order to
do that on GTK, I had to give each control a pointer to the 'selparam'
structure describing its config-box pane, so that query function could
check it against the current one - and in order to do _that_, I had to
first arrange that those 'selparam' structures have stable addresses
from the moment they're first created, which meant adding a layer of
indirection so that the array of them in the top-level dlgparam
structure is now an array of _pointers_ rather than of actual structs.
(That way, realloc half way through config box creation can't
invalidate the important pointer values.)
One of those things you'd never notice if it weren't for Leak
Sanitiser happening to be turned on while you were doing something
else: freersakey() frees all the things pointed to _from_ an RSAKey
structure, but not the structure itself.
Thanks to Ralf Esswein for pointing out the slip, which should have
been corrected as part of 867e69187.
It didn't cause a test failure, because the ecdsa and eddsa structs
are currently so similar in layout (they differ only in the pointed-to
public key point structure, and on this failure path that pointer is
NULL anyway). As a result it's rather hard to add a regression test
against it failing again; I think I just have to chalk this one up to
the lack of OO-aware type checking when doing this kind of manual
vtable management in C, unfortunately.
Normally I never notice warnings in this build, because it runs inside
bob and dumps all the warnings in a part of the build log I never look
at. But I've had these fixes lying around for a while and should
commit them.
They're benign: all we need is an explicit declaration of strtoumax to
replace the one that stdlib.h doesn't provide, and a couple more of
those annoying NO_TYPECHECK modifiers on GET_WINDOWS_FUNCTION calls.
Having decided that the terminal's local echo setting shouldn't be
allowed to propagate through to termios, I think the local edit
setting shouldn't either. Also, no other terminal emulator I know
seems to implement this sequence, and if you enable it, things get
very confused in general. I think it's generally better off absent; if
somebody turns out to have been using it, then we'll at least be able
to find out what it's good for.
This sequence (ESC[12l, ESC[12h) enables and disables local echo in
the terminal. We were previously implementing it by gatewaying it
directly through to the local echo facility in the line discipline,
which in turn would pass it on to the terminal it was running in (if
it was Plink).
This seems to be at odds with how other terminals do it: they treat
SRM as its own entirely separate thing, in which the terminal
_emulator_ performs its own echoing of input keypress data,
independently of whether the Unix terminal device (or closest
equivalent) is doing the same thing or not.
Now we're doing it the same way as everyone else (or at least I think
so): the new internal terminal function that the term_keyinput pair
feed to is also implementing SRM-driven local echo as another of its
side effects. One observable effect is that SRM now doesn't interfere
with the termios settings of the terminal it's running in; another is
that the echo now only applies to real keypress data, and not
sequences auto-generated by the terminal.
The functions that previously lived in it now live in terminal.c
itself; they've been renamed term_keyinput and term_keyinputw, and
their function is to add data to the terminal's user input buffer from
a char or wchar_t string respectively.
They sit more comfortably in terminal.c anyway, because their whole
point is to translate into the character encoding that the terminal is
currently configured to use. Also, making them part of the terminal
code means they can also take care of calling term_seen_key_event(),
which simplifies most of the call sites in the GTK and Windows front
ends.
Generation of text _inside_ terminal.c, from responses to query escape
sequences, is therefore not done by calling those external entry
points: we send those responses directly to the ldisc, so that they
don't count as keypresses for all the user-facing purposes like bell
overload handling and scrollback reset. To make _that_ convenient,
I've arranged that most of the code that previously lived in
lpage_send and luni_send is now in separate translation functions, so
those can still be called from situations where you're not going to do
the default thing with the translated data.
(However, pasted data _does_ still count as close enough to a keypress
to call term_seen_key_event - but it clears the 'interactive' flag
when the data is passed on to the line discipline, which tweaks a
minor detail of control-char handling in line ending mode but mostly
just means pastes aren't interrupted.)
They're the intermediate product between preprocessing a resource file
and feeding it to the resource compiler proper, so they certainly need
cleaning.
The idea was that if we found a host key already cached for the given
algorithm, we should remove it from the tree and free it. In fact, I
forgot the 'remove from tree' step, so we freed a key that was still
linked from the tree234. Depending on luck and platform, this could
either cause a segfault, or an assertion failure on the subsequent
attempt to add the new key in place of the not-removed-after-all old
one.
If the user closes the Change Settings dialog box using the close
button provided by the window manager (or some analogous thing that
generates the same X11 event) instead of using the Cancel button
within the dialog itself, then after_change_settings_dialog() gets
called with retval < 0, which triggers an early return path in which
we forget to call unregister_dialog(), and as a result, assertions
fail all over the place the _next_ time you try to put up a Change
Settings dialog.
Also, the early return causes ctx.newconf to be memory-leaked. So
rather than just moving the unregister_dialog() call to above the
early return, a better fix is to remove the early return completely,
and simply treat retval<0 the same as retval==0: it doesn't matter
_how_ the user closed the config box without committing the changes,
it only matters that they did.
I needed to send a server-side disconnect message in a test just now,
which caused me to notice that I'd never got round to filling in
ssh_proto_error properly. Now I've done that, and added the associated
machinery for waiting for the remote EOF and winding up the SSH
connection.
The rest of the error functions are still stubs, though.
Without this missing line, if you tried to download a file in SCP mode
using the -p option, the payload of the 'T' command (file times) would
still be sitting in act->buf when we went back round the loop, so the
payload of the followup 'C' or 'D' would be appended to it, leading to
a massive misparse and a complaint about illegal file renaming.
When I start Uppity in listening mode, it's useful to have it
acknowledge that it _has_ started up in that mode, and isn't (for
example) stuck somewhere in my local wrapper script.
Coverity complained that it was wrong to use rand() in a security
context, and although in this case it's _very_ marginal, I can't
actually disagree that the choice of which light to light up to avoid
giving information about passphrase length is a security context.
So, no more rand(); instead we instantiate a shiny Fortuna PRNG
instance, seed it in more or less the usual way, and use that as an
overkill-level method of choosing which light to light up next.
(Acknowledging that this is a slightly unusual application and less
critical than most, I don't actually put the passphrase characters
themselves into the PRNG, and I don't use a random-seed file.)
It's identical in uxnoise and winnoise, being written entirely in
terms of existing cross-platform functions. Might as well centralise
it into sshrand.c.
Coverity points out that under a carefully checked compound condition,
we assign s->gss_cred_expiry to itself. :-)
Before commit 2ca0070f8 split the SSH code into separate modules, that
assignment read 'ssh->gss_cred_expiry = s->gss_cred_expiry', and the
point was that the value had to be copied out of the private state of
the transport-layer coroutine into the general state of the SSH
protocol as a whole so that ssh2_timer() (or rather, ssh2_gss_update,
called from ssh2_timer) could check it.
But now that timer function is _also_ local to the transport layer,
and shares the transport coroutine's state. So on the one hand, we
could just remove that assignment, folding the two variables into one;
on the other hand, we could reinstate the two variables and the
explicit 'commit' action that copies one into the other. The question
is, which?
Any _successful_ key exchange must have gone through that commit step
of the kex that copied the one into the other. And an unsuccessful key
exchange always aborts the whole SSH connection - there's nothing in
the SSH transport protocol that allows recovering from a failed rekey
by carrying on with the previous session keys. So if I made two
variables, then between key exchanges, they would always have the same
value anyway.
So the only effect of the separation between those variables is
_during_ a GSS key exchange: should the version of gss_cred_expiry
checked by the timer function be the new one, or the old one?
This can only make a difference if a timer goes off _during_ a rekey,
and happens to notice that the GSS credentials have expired. And
actually I think it's mildly _better_ if that checks the new expiry
time rather than the old one - otherwise, the timer routine would set
the flag indicating a need for another rekey, when actually the
currently running one was going to get the job done anyway.
So, in summary, I think conflating those two variables is actually an
improvement to the code. I did it by accident, but if I'd realised, I
would have done the same thing on purpose!
Hence, I've just removed the pointless self-assignment, with no
functional change.
If the input key_wanted field were set to an out-of-range value, the
parent structure retkey would not become NULL as a whole: instead, its
field 'key' would never be set to a non-null pointer. So I was testing
the wrong pointer.
Fortunately, this couldn't have come up, because we don't actually
have any support yet for loading the nth key from an OpenSSH new-style
key file containing more than one. So key_wanted was always set to 0
by load_openssh_new_key(), which also checked that the file's key
count was exactly 1 (guarding against the possibility that even 0
might have been an out-of-range index).
Coverity points out that in (the misnamed) psftp_main(), I allow for
the possibility that 'backend' might be null, up until the final
unconditional backend_free().
I haven't actually been able to find a way to make backend() come out
as NULL at all in that part of the code: obvious failure modes like
trying to scp to a nonexistent host trigger a connection_fatal() which
terminates the program without going through that code anyway. But I'm
not confident I tried everything, so I can't be sure there _isn't_ a
way to get NULL to that location, so let's put in the missing check
just in case.
In load_openssh_new_key, ret->keyblob is never null any more: now that
it's a strbuf instead of a bare realloc()ed string, it's at worst an
_empty_ strbuf. Secondly, as Coverity pointed out, the null pointer
check was too late to do any good in the first place - the previous
clause of the same if condition would already have dereferenced it!
In test_mac in the auxiliary testsc program, there's no actual reason
I would ever have called it with null ssh_mac pointer - it would mean
'don't test anything'! Looks as if I just copy-pasted the MAC parts of
the cipher+MAC setup code in test_cipher.
We have to use the file name we just failed to open to format an error
message _before_ freeing it, not after. If that use-after-free managed
not to cause a crash, we'd also leak the file descriptor 'outfd'.
Both spotted by Coverity (which is probably the first thing in years
to look seriously at any of the code designed for Ben's AFL exercise).
Coverity pointed out a call to sftp_pkt_free(pktin) straight after an
unconditional return statement, which is obviously silly.
Fortunately, it was benign: pktin was freed anyway by the function
being called _by_ the return statement, so the unreachability of this
free operation prevented a double-free rather than allowing a memory
leak. So the right fix is to just remove the code, rather than
arranging for it to be run.