Up until now, it's been a variadic _function_, whose argument list
consists of 'const char *' ASCIZ strings to concatenate, terminated by
one containing a null pointer. Now, that function is dupcat_fn(), and
it's wrapped by a C99 variadic _macro_ called dupcat(), which
automatically suffixes the null-pointer terminating argument.
This has three benefits. Firstly, it's just less effort at every call
site. Secondly, it protects against the risk of accidentally leaving
off the NULL, causing arbitrary words of stack memory to be
dereferenced as char pointers. And thirdly, it protects against the
more subtle risk of writing a bare 'NULL' as the terminating argument,
instead of casting it explicitly to a pointer. That last one is
necessary because C permits the macro NULL to expand to an integer
constant such as 0, so NULL by itself may not have pointer type, and
worse, it may not be marshalled in a variadic argument list in the
same way as a pointer. (For example, on a 64-bit machine it might only
occupy 32 bits. And yet, on another 64-bit platform, it might work
just fine, so that you don't notice the mistake!)
I was inspired to do this by happening to notice one of those bare
NULL terminators, and thinking I'd better check if there were any
more. Turned out there were quite a few. Now there are none.
In the variable-length address slot, the main SOCKS5 reply packet can
contain a binary IP address (4- or 16-byte for v4/v6 respectively), or
a string intended to be interpreted as a domain name.
I was trying out the Python SOCKS5 proxy 'pproxy' today, which sends a
string-typed reply if you send it a string-typed domain name to
connect to. This caused me to notice that PuTTY mishandles the latter
case, by failing to account for the prefix length byte of that string
when computing the total size of the reply packet. So we would
misinterpret the final byte of its reply packet as the initial byte of
the actual connection, causing us to fail to recognise the SSH greeting.
Thanks to Ulrich Jannet for pointing this out: in
ssh2_connection_filter_queue, when we process a disconnect message, we
carefully avoid dereferencing the input 'ppl' pointer after
ssh_remote_error returns, because it will have been freed. But
ssh1_connection_filter_queue didn't have the same safety precaution.
Thanks to Patrick Stekovic for pointing out that, unlike sensible IP
stacks, Windows requires a non-default socket option to prevent a
second application from binding to a port you were already listening
on, causing some of your incoming connections to be diverted.
This replaces the previous setsockopt that enabled SO_REUSEADDR, which
I put there a long time ago in order to fix an annoying behaviour if
you used the same listening socket twice in rapid succession (e.g. for
successive PuTTYs forwarding the same port) and the second one failed
to bind the listening port because a left-over connection from the
first one was still in TIME_WAIT and causing the port number to be
marked as used.
As far as I can see, SO_EXCLUSIVEADDRUSE and SO_REUSEADDR are mutually
exclusive - if I try to set both, either way round, then setsockopt
returns failure on the second one - so if I have to set the former
then I _can't_ set the latter. And fortunately, re-testing on Windows
10, the TIME_WAIT problem that SO_REUSEADDR was supposed to solve
doesn't seem to exist any more: I deliberately tried listening on a
port that had a TIME_WAIT connection sitting on it, and it worked for
me even without SO_REUSEADDR.
(I can't remember now whether I definitely confirmed the TIME_WAIT
problem on a previous version of Windows, or whether I just assumed it
would happen on Windows in the same way as Linux, where I definitely
do remember observing it.)
While I'm changing that setsockopt call, I've also fixed its 'on'
parameter so that it's a BOOL rather than an int, in accordance with
the docs for WinSock setsockopt.
The redesign in commit 9fccb065a arranged that all keystroke data goes
via term_keyinput_internal, which calls term_bracketed_paste_stop just
in case the keystroke had interrupted an in-progress paste.
But, embarrassingly, I forgot that _pasted_ data also goes via
term_keyinput_internal, and bracketed paste mode certainly should not
be terminated before _that_ is sent! I should have conditionalised the
call to term_bracketed_paste_stop on the 'interactive' flag parameter,
which is precisely there to tell the difference between pastes and
true keyboard input.
The message "Reusing a shared connection to this server" is sent to
the seat's output method during the call to ssh_init. In Windows
Plink, that output method wants to talk to the BinarySink stderr_bs
(or stdout_bs, but for this particular message, stderr). So we have to
have already set up stderr_bs by the time the backend init function is
called.
When I introduced the unreachable() macro in commit 0112936ef, I
searched the source code for assert(0) and assert(false), together
with their variant form assert(0 && "explanatory text"). But I didn't
search for assert(!"explanatory text"), which is the form I used to
use before finding that assert(0 && "text") seemed to be preferred in
other code bases.
So, here's a belated replacement of all the assert(!"stuff") macros
with further instances of unreachable().
The number of people has been steadily increasing who read our source
code with an editor that thinks tab stops are 4 spaces apart, as
opposed to the traditional tty-derived 8 that the PuTTY code expects.
So I've been wondering for ages about just fixing it, and switching to
a spaces-only policy throughout the code. And I recently found out
about 'git blame -w', which should make this change not too disruptive
for the purposes of source-control archaeology; so perhaps now is the
time.
While I'm at it, I've also taken the opportunity to remove all the
trailing spaces from source lines (on the basis that git dislikes
them, and is the only thing that seems to have a strong opinion one
way or the other).
Apologies to anyone downstream of this code who has complicated patch
sets to rebase past this change. I don't intend it to be needed again.
The RESIZE_EITHER resizing mode responds to a window resize by
changing the logical terminal size if the window is shown normally, or
by changing the font size to keep the terminal size the same if the
resize is a transition between normal and maximised state.
But a user pointed out that it's also possible for a window to receive
a WM_SIZE message while _remaining_ in maximised state, and that
PuTTY's resize logic didn't allow for that possibility. It occurs when
there's a change in the amount of available screen space for the
window to be maximised _in_: e.g. when the video resolution is
reconfigured, or when you reconnect to a Remote Desktop session using
a client window of a different size, or even when you toggle the
'Automatically hide the taskbar' option in the Windows taskbar settings.
In that situation, the right thing seems to be for PuTTY to continue
to go with the policy of changing the font size rather than the
logical terminal size. In other words, we prefer to change the font
size when the resize is _from_ maximised state, _to_ maximised state,
_or both_.
That's easily implemented by removing the check of the 'was_zoomed'
flag, in the case where we've received a WM_SIZE message with the
state SIZE_MAXIMIZED: once we know the transition is _to_ maximised
state, it doesn't matter whether or not it was also _from_ it. (But we
still set the was_zoomed flag to the most recent maximised status, so
that we can recognise transitions _out_ of maximised mode.)
In Python 3, open(path, "w") defaults to text files encoded using some
default encoding, which of course isn't what we want for image data.
Use open(path, "wb") instead, and adjust the write calls to use forms
that work with Python 3. bytes() and bytearray() are available as of
Python 2.6.
(This could be simplified to e.g. b"%c%c%c%c" % (r,g,b,a), and similarly
avoiding the manual .encode call; but %-formatting on bytes requires
Python 3.5, and I thought it might be better to be compatible with older
versions of Python 3.)
"key in dict" was introduced in Python 2.2, and the older
"dict.has_key(key)" form was removed in Python 3. Use the more portable
"key in dict" form instead.
Python 3 changed the built-in round() function to use round-to-even for
exact half-way cases. While this eliminates a bias away from zero and
is thus a better choice in many cases, it's not what mkicon.py expected.
Shadow the built-in with an invocation of the decimal module instead:
this produces identical results for mkicon.py's purposes on both Python
2 and 3.
In Python 2, x/y means floor division if both x and y are ints, and true
division otherwise: that is, 1/2 == 0. In Python 3, x/y always means
true division even if both x and y are ints, so 1/2 == 0.5.
To prepare for porting to Python 3, change all the places that relied on
floor division to use the more explicit // operator, and use "from
__future__ import division" to change the / operator to use Python 3's
semantics even in Python 2. Both of these features have been available
since Python 2.2.
I noticed today that when you use Unix Plink interactively, to run a
tty session on a remote host starting from a local tty, it doesn't
copy the local terminal size into the "pty-req" channel request.
It looks as if this is a bug introduced in 2ca0070f8, when I broke up
ssh.c. Before that, the monolithic Ssh init procedure set
ssh->term_width and ssh->term_height from the Conf you passed in.
Afterwards, variables of the same name existed in both ssh.c *and*
ssh2connection.c (the former so that it can buffer window-size changes
before a connection layer yet exists to pass them on to), but somehow,
*neither* source file remembered to initialise them from the Conf.
Fixed by reinstating the initialisation in ssh.c. (For the same reason
as above: you want the values in Conf to be overwritten if the window
size changes between ssh_init and ssh2_connection_new.)
Commit f2e61275f converted the integer casts in cmpforsearch to
uintptr_t from unsigned long. But it left the companion function
cmpfortree alone, presumably on the grounds that the compiler didn't
report a warning for that one.
But those two functions (cmpfortree and cmpforsearch) are used with
the same tree234, so they're supposed to implement the same sorting
criterion. And the thing they're actually comparing, namely the
Windows API typedef SOCKET, is a pointer-sized integer. So there was a
latent bug here in which cmpforsearch was comparing all 64 bits of the
pointer, while cmpfortree was only comparing the low-order 32.
If the cursor is on the rightmost column of the terminal and
term->wrapnext is set, and the user asks to erase from the current
position to the end of (at least) the line, what should happen?
PuTTY's previous behaviour was to ignore term->wrapnext, and do the
same thing we would have done without it: erase from the current
physical cursor position to EOL inclusive, i.e. blank the character
cell we just printed.
But this is unfortunate if a program writes an interleaving of
printing characters and ESC[K, which I recently found out is what gcc
does in its colour-highlighted error messages: if the last printed
char just before an ESC[K pushes the cursor into the deferred-wrap
state, then the ESC[K blanks that character, and then we wrap to the
next line. So one character of the error message ends up missing.
xfce4-terminal and gnome-terminal take the approach in this situation
of regarding the cursor position as _right_ at the end of the line, so
no character cells get cleared at all, and the error message displays
as intended. I think that's more sensible, so I've switched to doing
the same thing.
(xterm has different behaviour again: it blanks the character cell and
also clears its analogue of the wrapnext flag. So in _their_ handling
of this sequence of output, one character of the error message is
still missing, but it looks as if it's been _omitted_ rather than
replaced by a space.)
Secondly, in the course of fixing that, I looked at the check_boundary
call in erase_lots, which is supposed to ensure that if a wide CJK
character straddles the boundary between what's being erased and what
isn't, then both halves of the character are deleted. I had to modify
that anyway because I was moving that very boundary, and in doing so,
I noticed that even according to the previous behaviour, it had an
off-by-one error. In the case where you send ESC[1K (meaning erase up
to and including the cursor position), the call to check_boundary was
performed on the _left_ edge of the cursor's character cell, when it
should have been the right edge. So you could end up with an
erase_char in the left half (i.e. a space) and still have the magic
value UCSWIDE in the right half, causing the terminal to think you had
a double-width U+0020 on the screen, which isn't supposed to be able
to happen.
Set the initial buffer size to MAX_PATH + 1 (261). Increment e->i before
the function returns instead of incrementing it in the call to
RegEnumKey.
The initial buffer size was too small to fit a subkey with a 256
characters long name plus '\0', the first call to RegEnumKey would fail
with ERROR_MORE_DATA, sgrowarray would grow the buffer, and RegEnumKey
would be called again.
However, because e->i was incremented in the first RegEnumKey call, the
second call would get the next subkey and the subkey with the long name
would be skipped.
Saving a session with a 256 characters long name would trigger this
problem. The session would be saved in the registry, but Putty would not
be able to display it in the saved sessions list.
Pageant didn't have this problem since it uses a different function to get
the saved sessions and the size of the buffer used is MAX_PATH + 1. Pageant
and Putty would display slightly different lists of saved sessions.
It's silly to use the higher-level name_lookup(), because the way in
which they differ is that name_lookup() allows DNS lookups to be
deferred to the other side of a network proxy over which your
connection is travelling - and when you're trying to find out your own
host name for use as a database key, there's no _connection_ involved
at all, and no potential proxy either.
A tiny piece I missed from commit 9545199ea: if the function that sets
that flag is gone, and so is the code that acts on it, then the flag
doesn't need to be there any more either.
I've only just noticed that it doesn't do anything at all!
Almost every implementation of the Socket vtable provides a flush()
method which does nothing, optionally with a comment explaining why
it's OK to do nothing. The sole exception is the wrapper Proxy_Socket,
which implements the method during its setup phase by setting a
pending_flush flag, so that when its sub-socket is later created, it
can call sk_flush on that. But since the sub-socket's sk_flush will do
nothing, even that is completely pointless!
Source control history says that sk_flush was introduced by Dave
Hinton in 2001 (commit 7b0e08270), who was going to use it for some
purpose involving the SSL Telnet support he was working on at the
time. That SSL support was never finished, and its vestigial
declarations in network.h were removed in 2015 (commit 42334b65b). So
sk_flush is just another vestige of that abandoned work, which I
should have removed in the latter commit but overlooked.
Any terminal event that modifies the range of screen+scrollback
currently covered by the mouse selection should call deselect(), so
that users won't be misled into thinking that the new version of the
highlighted text would be pasted, and also so that using the
MBT_EXTEND action to modify the selection bounds won't start from
a selection you never had in the first place.
Clearing the scrollback is such an event, if the selection covered any
of the scrollback at all; so we should apply the same policy here as
everywhere else.
This reverts commit 80f5a009f6.
After a bit more thought, I've decided it's the wrong way to solve the
problem. We shouldn't really be _changing_ the current selection
bounds in response to an event that touches the range they cover. With
this fix in place, if you clear the scrollback while a selection
partly overlaps it, and then extend the modified selection, you'll get
a selection one of whose endpoints is something you never specified as
a selection endpoint at all, and possibly paste the wrong text.
A better fix is to do the same thing we do about any other event that
touches the range covered by the selection: get rid of the selection
completely. For ease of cherry-picking (in case anyone needs to apply
the good fix in some downstream branch, or whatever), I'll make that
change separately in the next commit.
term_clrsb() was emptying the tree234 of scrollback, without checking
whether term->selstart, term->selend and term->selanchor were pointing
at places in the now-removed scrollback. If they were, then a
subsequent extend-selection operation could give rise to the dreaded
'line==NULL' assertion box.
Thanks to the user who sent in one of those debugging dumps, that
finally enabled us to track down (at least one case of) this
long- standing but extremely rare crash!
These are now inline functions (mostly, except for a couple of macro
wrappers to preserve the old call syntax), and they live in terminal.h
instead of at the top of terminal.c.
Also, I've added a few comments for clarity, and renamed the posPlt()
function, which didn't do at all what the name made it look (at least
to a mathematician familiar with product orders) as if it did.
In both stripctrl_locale_put_wc() and stripctrl_term_put_wc(), we get
a character width from either wcwidth or term_char_width, which might
be -1 for a control character. Coverity complained that that width
value is later passed to to stripctrl_check_line_limit, which takes a
size_t parameter, i.e. it expects the width to be non-negative.
This is another bug that doesn't actually cause damage, but the
reasons why are quite fiddly:
- The only width<0 characters we let through are those that got
through stripctrl_ctrlchar_ok. (At both call sites, if that
function is unhappy then we either take an early return or else
replace the character _and_ the value of 'width' with a
substitution.)
- stripctrl_ctrlchar_ok only lets through \n, or \r if permit_cr is
set.
- stripctrl_check_line_limit ignores its width parameter if line
limiting is turned off, or if it gets \n.
- So the only way this can cause a problem is if permit_cr is set so
that \r can get through ctrlchar_ok, *and* line limiting is turned
on so that we get far enough through check_line_limit to choke on
its negative width.
- But in fact all the call sites that ever call
stripctrl_enable_line_limiting do it with a StripCtrlChars that
they got from a seat, and all the nontrivial implementations of
seat_stripctrl_new pass false for permit_cr.
So the combination of circumstances in which -1 gets implicitly
converted to size_t *and* stripctrl_check_line_limit actually pays
attention to it can never arise. But it's clearly foolish to make the
correctness of the code depend on a proof that long - now Coverity has
pointed it out, I'm not happy with it either! Fixed by substituting 0
for the width in the questionable cases.
Coverity points out that we're inconsistent about checking it for
NULL: seat_stripctrl_new() *can* return NULL, so in principle, we can
go through the initialisation code for s->ki_scc and have it still be
NULL afterwards. I check that in the code that uses it to sanitise the
prompt strings out of USERAUTH_INFO_REQUEST, but not in the code that
sanitises the name or instruction strings. Now all uses are checked in
the same way.
This is only a latent bug, because the four main Seat implementations
(GUI and console, on Windows and Unix) never return NULL. The only
implementation of seat_stripctrl_new which _can_ return NULL is the
trivial nullseat_stripctrl_new, currently only used by Uppity.
However, of course, if that changes in future, this latent bug could
turn into a real one, so better to fix it before that happens. Thanks
Coverity.
At Coverity's urging, I put in an instance of the Fortuna PRNG in
place of the use of rand() to select which drawing area to light up
next on a keypress. But Coverity turns out to be unhappy that I'm
still using rand() to select the _initial_ area, before the user
enters any input at all.
It's even harder to imagine this being a genuine information leak than
the previous complaint. But I don't really see a reason _not_ to
switch it over, now it's been pointed out.
Notable changes:
- mentioned ASan, which now seems to be good enough to take over from
valgrind as my first-choice memory debugging tool
- release announcements now live in putty-aux
- a couple of clarifications of which directory to be in for which
commands, and adjusted relative paths appropriately
- remove obsolete mention of entering the GPG release key passphrase
numerous times (that was for GPG 1, without gpg-agent)
- remind myself that in the final signing procedure, you have to undo
the chmod you did for safety a couple of weeks earlier!
If the agent sent a response whose length field describes an interval
of memory larger than the file-mapping object the message is supposed
to be stored in, we shouldn't return that message to the client as if
nothing is wrong. Treat that the same as a failure to receive any
response at all.
Looking over this function today, I spotted several questionable uses
of strcat to concatenate "\PUTTY.RND" to the end of a pathname,
without having checked whether the pathname had filled up the static
fixed-size buffer already.
I don't think this is exploitable (because you'd have to be in control
of the local account already to control any of the data sources used
to fill those buffers). But it's horrible anyway, of course. Now all
of those are replaced with sensible dupcats.
(This patch re-indents a lot of the function, to give variables
tighter scopes. So the diff is best viewed with whitespace ignored.)
If a server sent a very large number as extended_count, and didn't
actually send any extended attributes, we could loop around and around
calling get_string, which would carefully not overrun any buffer or
leak any memory, and we weren't paying attention to extended
attributes anyway, but it would still pointlessly consume CPU.
Now we bail as soon as the BinarySource flags an error. Current
callers will then spot the error and complain that the packet was
malformed.
We were carefully checking for overflow inside safemalloc() before
multiplying together the two factors of the desired allocation size.
But snew_plus() did an addition _outside_ safemalloc, without the same
guard. Now that addition also happens inside safemalloc.
If _both_ the host key and the server key were less than 32 bytes
long, then less than 32 bytes would be allocated for the buffer
s->rsabuf, into which the 32-byte session id is then copied.
If the packet length field was in the range 0 <= x < 5, then it would
pass the initial range check, but underflow to something in the region
of 0xFFFFFFFF when the BPP code subtracted 5 from it, leading to an
overlarge memory allocation, and/or allocation failure, and perhaps
worse.
Introduced by 61f3e3e29, as part of my periodic efforts to make the
GTK front end work usefully on OS X: the 'Unable to open connection to
[host]: [error]' message box is accidentally passed through two layers
of printf format-string parsing, the second of which has no argument
list. So if you pass in a host name like '%s' on the command line,
bad things will happen when that error message is constructed.
This happened because in that commit I changed a call to
fatal_message_box() into a call to connection_fatal(), without
noticing that the latter was printf-style variadic and the former
wasn't. On the plus side, that means now I can remove the explicit
dupprintf/free around the error message.
This mitigates a borderline-DoS in which a malicious SFTP server sends
a ludicrously large number of file names in response to a SFTP
opendir/readdir request sequence, causing the client to buffer them
all and use up all the system's memory simply so that it can produce
the output in sorted order.
I call it a 'borderline' DoS because it's very likely that this is the
same server that you'll also trust to actually send you the _contents_
of some entire file or directory, in which case, if they want to DoS
you they can do that anyway at that point and you have no way to tell
a legit very large file from a bad one. So it's unclear to me that
anyone would get any real advantage out of 'exploiting' this that they
couldn't have got anyway by other means.
That said, it may have practical benefits in the occasional case.
Imagine a _legit_ gigantic directory (something like a maildir,
perhaps, and perhaps stored on a server-side filesystem specialising
in not choking on really huge single directories), together with a
client workflow that involves listing the whole directory but then
downloading only one particular file in it.
For the moment, the threshold size is fixed at 8Mb of total data
(counting the lengths of the file names as well as just the number of
files). If that needs to become configurable later, we can always add
an option.
While testing the previous fix, I also noticed that s->num_prompts is
an ordinary signed int. So if the server sends a _really_ large value,
it'll be treated as negative.
That's kind of harmless: our loop wouldn't read any prompts at all
from the packet, and then it would send back the same nonsense count
with no responses. But it's inelegant: now, if the server violates the
protocol in this way, we respond by sending an even wronger packet in
return.
Changed the type of num_prompts and the loop variable to uint32_t, so
now we'll respond by actually trying to read that many prompts, which
will fail by reaching the new error check. I think that's a more
sensible way to handle this one.
If a server sends a very large number as the num-prompts field, we'd
loop round calling get_string and get_bool, which would run off the
end of the buffer but still carefully return legal default values (the
empty string and false), so we'd carry on piling pointless stuff into
s->cur_prompt and using up lots of memory.
Coverity pointed this out by warning that an untrusted server-provided
value was being used as a loop bound. I'm not convinced that's an
error in every case, but I must admit it pointed out a useful thing
here!
The fix is to check the error indicator on the BinarySource after
reading each prompt from the input packet. Then we'll only keep
allocating memory as long as there's actually data in the packet.
Coverity points out that wcwidth is capable of returning a negative
number, which suggests that it's a mistake to pass its return value
unchecked to stripctrl_check_line_limit.
This shouldn't cause a problem _in principle_, because by the time
we're making that call, we should already have ruled out the kind of
dangerous control character that might provoke that return value from
wcwidth. But in practice, I couldn't absolutely guarantee that
everyone's idea of what is or is not a control character agrees in
every detail, so I think Coverity is right to urge caution.
Fixed by calling wcwidth (or its wrapper term_char_width) up front,
and ensuring that any character provoking a negative return value is
included in the 'control characters to sanitise out' branch of the
preceding logic.
When I set up the simplified version of just the ssh-connection
protocol we use for connection sharing, I carefully documented at the
top of sshshare.c that packets in that protocol variant are limited to
just over 0x4000 bytes. And did I remember to actually honour that, by
restricting the sizes of outgoing packets when actually speaking the
bare connection protocol? I did not.
Well, better late than never. Here I introduce a packet size limit
that can be imposed by the BPP, and arrange for sshconnection.c to
take the min of that and any given channel's max packet size as sent
by the remote end. All BPPs except ssh2bpp-bare set it to the no-op
value of the largest possible uint32_t.
In normal builds this makes no difference, but in Windows builds with
the Minefield diagnostic system turned on, free()ing a Minefield-
allocated object causes a crash. Apparently I haven't tested Pageant
under Minefield for ages - only PuTTY, talking to an ordinary Pageant
I'd already started up.
The field s->username in the ssh2userauth state was sometimes a
secondary reference to s->default_username, and sometimes a
dynamically allocated string. So not freeing it can leak memory, but
freeing it (if non-null) can cause a double-free crash.
Solution: add a new field s->locally_allocated_username which is the
thing that needs freeing (if any), and make s->username into a const
char * which is always a secondary reference to _something_.
How embarrassing: we were looping back round to prompt for a username
again if s->change_username was *true*, but then the test at the top
would only have reissued the prompt if it was *false*. So that
configuration checkbox must have been broken for ages. It's a good job
most SSH servers don't reward enabling it!
It's a secondary reference to existing data (namely an argv word, if
anything at all), and never needs to be written through, so we should
mark it const, if nothing else to catch any absent-minded attempts to
free it.