I carefully made it return a bool to indicate that the whole PPL had
been freed, and then never actually checked that return value, so any
kind of connection-fatal event inside filter_queue (such as reporting
a DISCONNECT message) would cause a reference to freed memory on
return.
There was a race between toplevel callbacks: if we read enough data to
receive an SSH_MSG_DISCONNECT, and then returned, then whether we
reported the DISCONNECT message or the followup EOF would depend on
whether the BPP or the master PPL got called back first. Now the BPP
politely waits its turn (i.e. waits to see if it even gets called back
at all) before reporting EOF.
Turns out that my assertion that term->cols == line->cols can
sometimes fail, because if the window is shrunk, scrlineptr()
deliberately _doesn't_ shrink the line (so that the columns on the
right can be recovered if the window is then resized larger again). So
clear_line() should _make_ the line the right width, instead of
asserting that it already is.
I've factored out clear_line() (wipe out everything on a terminal line
including its line attrs) and also line_cols() (determine how many
columns are on this particular line, taking into account
LATTR_WRAPPED2 which reduces it by one).
Also, newline() and freeline() were badly named. Now they're called
newtermline() and freetermline(), which include the full actual type
name they deal with, and also means that now neither of them is named
the same as a control character!
Now instead of taking raw arguments to configure the output
StripCtrlChars with, it takes an enumerated value giving the context
of what's being sanitised, and allows the seat to decide what the
output parameters for that context should be.
The only context currently used is SIC_BANNER (SSH login banners).
I've also added a not-yet-used one for keyboard-interactive prompts.
Now instead of making a StripCtrlChars just for that function call, it
uses an existing one, pointing it at the output strbuf via
stripctrl_retarget.
This adds flexibility (now you can use the same convenient string-
sanitising function with a StripCtrl configured in any way you like)
and also saves pointless setting-up and tearing-down of identical sccs
all the time.
The existing call sites in PSCP and PSFTP now use a static
StripCtrlChars instance that was made at program startup.
stripctrl_retarget() points the StripCtrlChars at a new BinarySink, to
avoid having to pointlessly throw it away and make a new one all the
time.
Since that probably means the same scc is going to be reused for
processing a fresh data stream, we also don't want any character-set
conversion state hanging over from the previous stream, so we also
reset the state in the process. Just in case it's needed,
stripctrl_reset() is also provided to do that operation on its own.
Now if a pathname ends with a slash already, we detect that (using the
shiny new ptrlen_endswith), and don't bother putting another one in.
No functional change, but this should improve the occasional error
message, e.g. 'pscp remote:some.filename /' will now say it can't
create /some.filename instead of //some.filename.
I'd never even heard of these before. We don't need to do anything
unusual with these when passing them to our own getaddrinfo, but
host_strduptrim considered them a violation of its expectations about
what an IPv6 literal looked like, and hence wasn't stripping square
brackets off one. So a port-forwarding command-line option such as
'-L 12345:[fe80::%eth0]:22' would cause the address string in the
direct-tcpip CHANNEL_OPEN packet to be "[fe80::%eth0]" instead of the
correct "fe80::%eth0", leading to getaddrinfo failure on the SSH
server side.
The previous commit removed its last use, so now we can garbage-
collect it, including its long-standing FIXME comment which is now
fulfilled by the new StripCtrlChars system.
Local functions in uxcons.c and wincons.c were calling the old
simplistic sanitise_term_data to print console-based prompts. Now they
use the same new system as everything else.
This removes the last use of the ASCII-centric sanitise_term_data.
Now the banner can contain printable Unicode while still having escape
characters removed, in a way that works in both console and GUI
applications.
In the case of SSH banners, ssh2userauth.c does the sanitisation
itself, so it has to use the new Seat method to get an appropriately
configured StripCtrlChars.
(This wasn't an issue for the prompts_t system, because sanitisation
of prompt text is devolved to the local implementation of the prompt
system, in case a local implementation wants to present prompts in a
way that doesn't need sanitisation in any case, e.g. a dialog box.)
If centralised code like the SSH implementation wants to sanitise
escape sequences out of a piece of server-provided text, it will need
to do it by making a locale-based StripCtrlChars if it's running in a
console context, or a Terminal-based one if it's in a GUI terminal-
window application.
All the other changes of behaviour needed between those two contexts
are handled by providing reconfigurable methods in the Seat vtable;
this one is no different. So now there's a new method in the Seat
vtable that will construct a StripCtrlChars appropriate to that kind
of seat. Terminal-window seats (gtkwin.c, window.c) implement it by
calling the new stripctrl_new_term(), and console ones use the locale-
based stripctrl_new().
SSH authentication prompts (passwords, passphrases and keyboard-
interactive) were previously sanitised to remove escape sequences by
the simplistic sanitise_term_data() in utils.c. Now they're fed
through the new mode of StripCtrlChars instead, which means they
should permit printable Unicode (if the terminal is in UTF-8 mode)
while still disallowing escape sequences. Hopefully this will be a
usability improvement to everyone whose login prompts are in a
language not representable in plain ASCII.
If you use the new stripctrl_new_term() to construct a StripCtrlChars
instead of the existing stripctrl_new(), then the resulting object
will align itself with the character-set configuration of the Terminal
object you point it at. (In fact, it'll reuse the same actual
translation code, courtesy of the last few refactoring commits.) So it
will interpret things as control characters precisely if that Terminal
would also have done so.
The previous locale-based sanitisation is appropriate if you're
sending the sanitised output to an OS terminal device managed outside
this process - the LC_CTYPE setting has the best chance of knowing how
that terminal device will interpret a byte stream. But I want to start
using the same sanitisation system for data intended for PuTTY's own
internal terminal emulator, in which case there's no reason why
LC_CTYPE should be expected to match that terminal's configuration,
and no reason to need it to either since we can check the internal
terminal configuration directly.
One small bodge: stripctrl_new_term() is actually a macro, which
passes in the function pointer term_translate() to the underlying real
constructor. That's just so that console-only tools can link in
stripctrl.c without acquiring a dependency on terminal.c (similarly to
how we pass random_read in to the mp_random functions).
Also, instead of insisting on modifying the UTF-8 decoding state
inside the Terminal structure, it now takes a separate pointer to a
small struct containing that decode state. The idea is that if a
separate module wants to decode characters the same way the real
terminal would, it can pass its own mutable state structure, but the
same main Terminal pointer.
When ssh2_connection_filter_queue is _receiving_ messages about a
channel from the other end, it carefully checks if the channel
referred to is half-open. But we weren't exercising the same caution
before beginning to _send_ channel data, and we should, because in
that situation important fields like c->remwinsize aren't even
initialised yet.
This can come up, for example, due to typeahead in the main session
window before the server has sent OPEN_CONFIRMATION.
If an Ssh structure is destroyed while its IdempotentCallback
ssh->ic_out_raw is prnding, then the latter will stay on callback.c's
list pointing at the freed memory.
It's defined in the C standard to return an int, not a size_t, and we
should honour that since the subsequent code checks it for <0.
A knock-on effect is that I reorganise the addends in one of the
sgrowarrays, to be extra careful about overflow when adding something
to that int.
The _nm strategy is slower, so I don't want to just change everything
over no matter what its contents. In this pass I've tried to catch
everything that holds the _really_ sensitive things like passwords,
private keys and session keys.
These versions, distinguished by the _nm suffix on their names, avoid
using realloc to grow the array, in case it moves the block and leaves
a copy of the data in the freed memory at the old address. (The suffix
'nm' stands for 'no moving'.) Instead, the array is grown by making a
new allocation, manually copying the data over, and carefully clearing
the old block before freeing it.
(An alternative would be to give this code base its own custom heap in
which the ordinary realloc takes care about this kind of thing, but I
don't really feel like going to that much effort!)
A minor precaution against leaving secrets visible in process memory:
bufchains are used all over the place and probably _something_ in one
of them would be better wiped out of RAM.
If a proxy command jabbers on standard error in a way that doesn't
involve any newline characters, we now won't keep buffering data for
ever.
(Not that I've heard of it happening, but I noticed the theoretical
possibility on the way past in a recent cleanup pass.)
The idea of these is that they centralise the common idiom along the
lines of
if (logical_array_len >= physical_array_size) {
physical_array_size = logical_array_len * 5 / 4 + 256;
array = sresize(array, physical_array_size, ElementType);
}
which happens at a zillion call sites throughout this code base, with
different random choices of the geometric factor and additive
constant, sometimes forgetting them completely, and generally doing a
lot of repeated work.
The new macro sgrowarray(array,size,n) has the semantics: here are the
array pointer and its physical size for you to modify, now please
ensure that the nth element exists, so I can write into it. And
sgrowarrayn(array,size,n,m) is the same except that it ensures that
the array has size at least n+m (so sgrowarray is just the special
case where m=1).
Now that this is a single centralised implementation that will be used
everywhere, I've also gone to more effort in the implementation, with
careful overflow checks that would have been painful to put at all the
previous call sites.
This commit also switches over every use of sresize(), apart from a
few where I really didn't think it would gain anything. A consequence
of that is that a lot of array-size variables have to have their types
changed to size_t, because the macros require that (they address-take
the size to pass to the underlying function).
I haven't tried compiling with /DMINEFIELD in a while, and when I just
did, I found that the declarations in winstuff.h weren't actually
being included by memory.c where they're needed.
I've just noticed that the MSDN docs for WinSock gethostname()
guarantee that a size-256 buffer is large enough. That seems a lot
simpler than the previous faff.
pageant.c and sshshare.c each had an extra copy of crBegin and
crFinishV, dating from when the main versions were kept in ssh.c where
they couldn't be conveniently #included by other modules. Now they're
in sshcr.h, where they can be, so there's no need to have extra copies
of them anywhere.
(But I've left the crGetChar macro in each of those files, because
those really are specific to the particular context, referring to an
extra variable that clients of the more general sshcr.h macros won't
all have.)
The live versions of the dmemdump macros had a trailing semicolon in
the expansion, which would cause them to break if used in the wrong
syntactic context (e.g. between if and else with the natural semicolon
at the call site). The conditioned-out versions of those and of
debug() itself expanded to the empty string in place of the more usual
((void)0). And SECOND_PASS_ONLY in gtkmain.c's command-line handling
should have had the standard do ... while(0) wrapper to make it
reliably a single statement.
Mostly noticed in passing while using Address / Leak Sanitiser to
check over the previous commit. One highlight here is freeing of the
previous iqmp value in rsa_verify, which was actually a potentially
sensitive leak, introduced in the mp_int rewrite (commit 25b034ee3).
I've fixed a handful of these where I found them in passing, but when
I went systematically looking, there were a lot more that I hadn't
found!
A particular highlight of this collection is the code that formats
Windows clipboard data in RTF, which was absolutely crying out for
strbuf_catf, and now it's got it.
If the user clicks 'ok' to a prompt such as 'should we carry on even
though the server only supports diffie-hellman-stage-whisper-sha0',
then we've done our duty to warn them about weak crypto, and shouldn't
nag them with the same confirmation prompt again and again in
subsequent rekeys. So now we keep a tree234 of all the algorithms the
user has consented to, so as to ask about each one at most once.
Plink's hostname argument can refer to a saved session instead of a
hostname. We test this by trying to load it as a session, and seeing
if the resulting Conf is launchable. But if Default Settings itself is
launchable (e.g. if it has the protocol set to Serial) then that can
happen even if there is no saved session with that name - in which
case we'll _never_ fall back to treating the argument as a host name.
Fixed by also checking the new success flag returned from do_defaults.
Previously, we returned a valid settings_r containing a null HKEY.
That didn't actually cause trouble (I think all the registry API
functions must have spotted the null HKEY and returned a clean error
code instead of crashing), but it means the caller can't tell if the
session really existed or not. Now we return NULL in that situation,
and close_settings_r avoids crashing if we pass the NULL to it later.
My trawl of all the vtable systems in the code spotted a couple of
other function-like macros in passing, which might as well be
rewritten as inline functions too for the same reasons.
This replaces all the macros like ssh_key_sign() and win_draw_text()
which take an object containing a vtable pointer and do the
dereferencing to find the actual concrete method to call. Now they're
all inline functions, which means more sensible type-checking and more
comprehensible error reports when the types go wrong, and also means
that there's no risk of double-evaluating the object argument.
Commit fec93d5e0 missed a piece: when we hand wcTo to
term_bidi_cache_store and it uses it to set up the mapping between
physical and logical character positions for cursor and selection
handling, it will assume wcTo has as many entries as there are columns
in the terminal. But in fact now wcTo may be shorter than that, so
term_bidi_cache_store also needs to pay attention to the nchars field.
Instead of repeatedly looping on the random number generator until it
comes up with two values that have a large enough product, the new
version guarantees only one use of random numbers, by first counting
up all the possible pairs of values that would work, and then
inventing a single random number that's used as an index into that
list.
I've done the selection from the list using constant-time techniques,
not particularly because I think key generation can be made CT in
general, but out of sheer habit after the last few months, and who
knows, it _might_ be useful.
While I'm at it, I've also added an option to make sure the two
firstbits values differ by at least a given value. For RSA, I set that
value to 2, guaranteeing that even if the smaller prime has a very
long string of 1 bits after the firstbits value and the larger has a
long string of 0, they'll still have a relative difference of at least
2^{-12}. Not that there was any serious chance of the primes having
randomly ended up so close together as to make the key in danger of
factoring, but it seems like a silly thing to leave out if I'm
rewriting the function anyway.
The actual calls to win_draw_{text,cursor} in do_paint() were
duplicated in two places, and I may want to change them soon, so it's
convenient to centralise them.
Previously, any double-width character would break the bidi algorithm,
because of the quirk of data representation in which we store UCSWIDE
(0xDFFF) in the right-hand termchar overlapped by the character.
UCSWIDE has bidirectional character class L according to minibidi's
getType(), so it disrupted the algorithm.
Now we remove UCSWIDE from the input line before passing it to
do_bidi(), replacing it with an 'nchars' field in the bidi_char
structure indicating single or double width, and put the UCSWIDEs back
afterwards once do_bidi returns.