In the course of recent refactorings I noticed a couple of cases where
we were doing old-fashioned preallocation of a char array with some
conservative maximum size, then writing into it via *p++ or similar
and hoping we got the calculation right.
Now we have strbuf and dupcat, so we shouldn't ever have to do that.
Fixed as many cases as I could find by searching for allocations of
the form 'snewn(foo, char)'.
Particularly worth a mention was the Windows GSSAPI setup code, which
was directly using the Win32 Registry API, and looks much more legible
using the windows/utils/registry.c wrappers. (But that was why I had
to enhance them in the previous commit so as to be able to open
registry keys read-only: without that, the open operation would
actually fail on this key, which is not user-writable.)
Also unix/askpass.c, which was doing a careful reallocation of its
buffer to avoid secrets being left behind in the vacated memory -
which is now just a matter of ensuring we called strbuf_new_nm().
These handy wrappers on the verbose underlying Win32 registry API have
to lose some expressiveness, and one thing they lost was the ability
to open a registry key without asking for both read and write access.
This meant they couldn't be used for accessing keys not owned by the
calling user.
So far, I've only used them for accessing PuTTY's own saved data,
which means that hasn't been a problem. But I want to use them
elsewhere in an upcoming commit, so I need to fix that.
The obvious thing would be to change the meaning of the existing
'create' boolean flag so that if it's false, we also don't request
write access. The rationale would be that you're either reading or
writing, and if you're writing you want both RW access and to create
keys that don't already exist. But in fact that's not true: you do
want to set create==false and have write access in the case where
you're _deleting_ things from the key (or the whole key). So we really
do need three ways to call the wrapper function.
Rather than add another boolean field to every call site or mess about
with an 'access type' enum, I've taken an in-between route: the
underlying open_regkey_fn *function* takes a 'create' and a 'write'
flag, but at call sites, it's wrapped with a macro anyway (to append
NULL to the variadic argument list), so I've just made three macros
whose names request different access. That makes call sites marginally
_less_ verbose, while still
dh_is_gex() expects to find a 'struct dh_extra' in the 'extra' field
of the kex_alg you pass in, and won't look kindly on finding an
instance of some totally different structure type. We were being
careful about that everywhere in the GSSAPI kex code except for the
final free step.
The conditionalisation of that call on 'protocol == PROT_SSH' has been
around since the beginning of our git history. But in those days,
random_save_seed() was unconditional _internally_ - it would always
create and write to the seed file regardless of whether the random
pool had even been initialised, let alone used.
Now random_save_seed() has its own internal condition which prevents
it doing anything if the random subsystem was never started up in the
first place. So it's better to call it unconditionally from
cleanup_exit, and then it'll be able to do its thing whenever needed,
without having to second-guess based on the top-level protocol.
(In fact, that's what all the other implementations of cleanup_exit()
have done all along. On Unix, and in Windows console apps, we do call
random_save_seed() unconditionally, and expect it to uncomplainingly
do nothing if there's nothing to do.)
(cherry picked from commit 260aad5fca)
In cases where we refuse a resize request, either because it's too
large or because the window is not currently resizable due to being
maximised, we were failing to communicate that back to the Terminal so
that it could stop waiting for the resize and resume processing input.
Those have been there since around 2001. They're in a piece of code
that calls get_fullscreen_rect to find the overall screen size, and
then prevents attempts to resize the window larger than that. The
static variables were arranging that we don't have to call
get_fullscreen_rect more than once.
But, firstly, computers are faster 20 years on; secondly, remote
window-resize requests are intentionally rate-limited (as of commit
d74308e90e), so this shouldn't be the limiting factor anyway; and
thirdly, multi-monitor support has appeared since then, which means
that if the window has been dragged from one monitor to another then
get_fullscreen_rect might _legitimately_ return a different bounding
rectangle when called a second time.
So we should just do the full check every time unconditionally.
(cherry picked from commit 4b3a8cbf61)
This is a piece of refactoring that's been overdue forever. In the
Unix front end, all the variables specific to a particular SSH session
live in a big 'inst' structure, and calls to any of the trait APIs
like Seat or TermWin can recover the right 'inst' from their input
context pointer. But the Windows frontend was written before that kind
of thing became habitual to me, and all its variables have been just
lying around at the top level of window.c, and most of those context
pointers were just ignored.
Now I've swept them all up and put them where they ought to be, in a
big context structure. This was made immeasurably easier by the very
nifty 'clang-rename' tool, which can rename a single variable in a C
source file, and find just the right set of identifiers to change even
if other variables in the file have the same name, even in sub-scopes.
So I started by clang-renaming all the top-level variables in question
to have names beginning with a prefix that didn't previously appear
anywhere in the code; checked that still worked; and then moved all
the declarations into the struct type and did a purely textual
search-and-replace of the prefix with 'wgs->'.
One potential benefit of this change is that it allows more than one
instance of the WinGuiSeat structure to exist in the same process. I
don't have any immediate plans to actually do that, but it's nice to
know it wouldn't be ruled out if we ever did need it.
But that's not the main reason I did it. The main reason is that I
recently looked at the output of a Windows build using clang -Wall,
and was horrified by the number of casual shadowings of
generically-named global variables like 'term' and 'conf' with local
variables of the same name. That seemed like a recipe for confusion,
and I decided the best way to disambiguate them all was to do this
refactoring that I'd wanted anyway for years.
A few uses of the global variables occurred in functions that didn't
have convenient access to the right WinGuiSeat via a callback
parameter of some kind. Those had to be treated specially. Most were
cleaned up in advance by the previous few commits; the remaining fixes
in this commit itself were in functions like modalfatalbox(),
nonfatal() and cleanup_exit(). The error reporting functions want the
terminal HWND to use as a MessageBox parameter; they also have the
side effect of un-hiding the mouse pointer in the terminal window, in
case it's hidden; and cleanup_exit wanted to free some resources
dangling off the WinGuiSeat.
For most of those cases, I've made a linked list of all currently live
WinGuiSeat structures, so that they can loop over _all_ live instances
(whether there's one as usual, none, or maybe more than one in
future). The parent window for non-connection-specific error messages
is found by simply picking one arbitrarily off the linked list (if
any); the cleanups are done by iterating over the _whole_ list.
The mouse-pointer unhiding is dealt with by simply allowing
show_mouseptr to take a _null_ WinGuiSeat pointer. The only thing it
needs the context for at all is to check whether pointer-hiding is
enabled in the session's Conf; so if we're _un_-hiding the pointer we
don't need to check, and can unhide it unconditionally.
The remaining global variables in window.c are the new linked list of
all WinGuiSeat structures; 'wm_mousewheel' and 'trust_icon' which
really should be global across all WinGuiSeats even if we were to have
more than one; and there's a static variable 'cursor_visible' in
show_mouseptr() which is likewise legitimately Seat-independent (since
it records the last value we passed to the Win32 API ShowCursor
function, and _that's_ global across the whole process state).
All of this should cause no functional change whatsoever.
Those have been there since around 2001. They're in a piece of code
that calls get_fullscreen_rect to find the overall screen size, and
then prevents attempts to resize the window larger than that. The
static variables were arranging that we don't have to call
get_fullscreen_rect more than once.
But, firstly, computers are faster 20 years on; secondly, remote
window-resize requests are intentionally rate-limited (as of commit
d74308e90e), so this shouldn't be the limiting factor anyway; and
thirdly, multi-monitor support has appeared since then, which means
that if the window has been dragged from one monitor to another then
get_fullscreen_rect might _legitimately_ return a different bounding
rectangle when called a second time.
So we should just do the full check every time unconditionally.
This is like the seat-independent nonfatal(), but specifies a Seat,
which allows the GUI dialog box to have the right terminal window as
its parent (if there are multiple ones).
Changed over all the nonfatal() calls in the code base that could be
localised to a Seat, which means all the ones that come up if
something goes horribly wrong in host key storage. To make that
possible, I've added a 'seat' parameter to store_host_key(); it turns
out that all its call sites had one available already.
Previously, the timing.c subsystem worked in Windows PuTTY by means of
scheduling WM_TIMER messages to be sent to the terminal window. Now it
uses a separate hidden window instead, and all the machinery for
handling that window lives on its own in windows/utils/gui-timing.c.
Most immediately, this removes a use of wgs.term_hwnd that will become
awkward when I move that structure in a following commit. But also, it
will make it easier to add the same timing subsystem to unrelated GUI
programs, such as Windows Pageant: if we ever decide to implement
automatic deletion or re-encryption of unused keys after a timeout,
this will help make that easier.
That clipboard-writing function is called just once, from the Event
Log dialog procedure, for when the user deliberately copies to the
clipboard. That call always passes must_deselect = true, which means
the conditional WM_IGNORE_CLIP messages are not sent. So it's simpler
to remove that parameter completely, and the conditional calls which
are never used.
Also, the clipboard data copied from the Event Log dialog is being put
in the clipboard associated with the main PuTTY terminal window. But
anything else we copy from a dialog box using Windows's built-in
copy-paste mechanisms would surely be associated with the _dialog_,
not its parent window. So we should do the same thing here. Therefore,
I've added a HWND parameter to write_aclip() and used that in place of
wgs.term_hwnd, so that we can pass in the HWND of the dialog itself.
The conditionalisation of that call on 'protocol == PROT_SSH' has been
around since the beginning of our git history. But in those days,
random_save_seed() was unconditional _internally_ - it would always
create and write to the seed file regardless of whether the random
pool had even been initialised, let alone used.
Now random_save_seed() has its own internal condition which prevents
it doing anything if the random subsystem was never started up in the
first place. So it's better to call it unconditionally from
cleanup_exit, and then it'll be able to do its thing whenever needed,
without having to second-guess based on the top-level protocol.
(In fact, that's what all the other implementations of cleanup_exit()
have done all along. On Unix, and in Windows console apps, we do call
random_save_seed() unconditionally, and expect it to uncomplainingly
do nothing if there's nothing to do.)
I don't know why I never bothered to write it down before, but it's a
good idea to let a pre-release build actually *happen* between turning
them on and updating the website to claim they exist.
Also, for the first time ever, I've just sent out an announcement
email for the 0.78 *pre-releases*, soliciting testing in advance of
the actual release. So, add that to the checklist as well.
We withdrew our FTP download links in July, when chiark's OS upgrade
made its previous ftpd go away. We've had no complaints at all about
that, so I think it's time to decide that FTP is officially obsolete,
and remove it from the script that does the uploads, and from the
release checklist.
In the initial commit 031d86ed5b that introduced them, I
accidentally put them below the 'warn about insecurity' line, which I
didn't mean to. Moved them up to just above the existing group14.
Also, I've arranged them in a slightly weird order, so that the most
preferred group of this collection is the medium-sized group16,
followed by the larger ones (17 and 18) and then the smaller 15.
Rationale: larger is better _until_ it starts costing way too much CPU
time, and group18 can grind quite painfully on a slow machine. (And of
course users are free to reconfigure if they have different
preferences.)
This isn't really ideal, of course. The idea that you might not want
to use group18 *because it's slow* contradicts the basic concept of
PuTTY's current crypto-preferences UI, which assumes that you rank
things by security, which is why there's a dividing line below which
things are assumed insecure. I hope that in a future release we'll
rework the UI so that you can express more subtle ideas of what crypto
you do and don't like. But this will do for the moment.
The GSS versions of the same DH methods are reordered similarly.
Jacob points out that the output of 'puttygen --dump', where the
key_components are used, is much more likely to need to be machine-
than human-readable, and so it makes more sense to use a date/time
format that's invariant under external changes such as locale.
(He also points out that Windows's time zone description strings are
overly verbose!)
With the new larger fixed-group methods, it's less clearly always the
right answer. (Really it seems more sensible to use ECDH over any of
the integer DH, these days.)
Also, reword other kex descriptions a bit.
Simon tells me he was pondering whether chacha20-poly1305 could be
reworked to use the new facilities, but on reflection there's no way to
use it to improve matters.
This went missing in the migration to CMake, and broke compatibility
of the standard 32-bit builds with Windows XP. (Of course, the
'buildold' versions should still have run.)
There doesn't seem to be a convenient CMake option to configure it
cleanly, so I had to do a bodgy string-replace on the variable
containing the linker flags, which I found by source-diving in CMake.
That's fragile enough that I've also put in a check after the fact, so
that we'll find out if it ever stops working.
Just spotted this in eyeball review: we're about to construct our new
outgoing KEXINIT and write it into the strbuf s->outgoing_kexinit. So
we should clear that strbuf first. But in fact we were clearing
s->client_kexinit, which aliases s->outgoing_kexinit in an SSH client,
but in a server, aliases s->incoming_kexinit.
This was harmless in PuTTY (since the strbuf we cleared was the right
one anyway). And it was harmless in Uppity's initial kex (since the
strbuf we _meant_ to clear was empty anyway). But if Uppity had ever
initiated a rekey, this would have exploded messily.
I only realised this bug while writing up the feature for the
wishlist:
It's one thing _at connection startup_ to delay sending your KEXINIT
until the server has sent its: the server is very likely to send it
anyway (unless it's attempting the same workaround against us), so
probably nothing goes wrong.
But if we want to initiate a rekey, we do that _by_ sending a KEXINIT.
In that situation we can't just wait until the server sends one,
because it has no idea it's supposed to be doing so!
Happily, in that situation, we already have a KEXINIT from the server,
left over from the previous key exchange. So we can filter against
that, and still have the intended effect of not spending KEXINIT space
on algorithms the server doesn't know about.
Somehow I missed that Coverity reported that complaint about a
(theoretically) uninitialised pointer twice, against the two
platforms' console.c files. Now fixed the same way in the other one.
It complained in console_confirm_ssh_host_key that if the caller
passed in a SeatDialogText containing no SDT_PROMPT record, then
'prompt' would be uninitialised.
The answer is "don't do that, then", but fair enough that Coverity
didn't know that. Added an assertion, which should keep it happy, and
also cause better error handling if we ever _do_ make that mistake.
Coverity complained in keylist_update_callback that in one if
statement I was allowing for the possibility that alg == NULL, and in
the next, I was assuming it would always be non-null.
Right now I'm not actually convinced that _either_ check is necessary
- it would make sense in an agent _client_, where you might be talking
to an agent that knows key algorithms you don't, but this is the GUI
built into Pageant itself, so any key it can store internally ought to
have a known algorithm name.
Still, this fix is certainly _correct_ even if not optimal, and it'll
do for now.
Coverity spotted me copying an uninitialised variable into it, which
made me wonder how I hadn't noticed. The answer is that nothing
actually _uses_ that variable - it's written, but never read. I must
have put it in during development, thinking I was going to need it for
something, and then didn't end up using it after all.
Coverity points out that if we refer to cp_list[codepage - 65536], we
ought to have ensured that codepage - 65536 was _less_ than
lenof(cp_list), not just less or equal.
The protocol selector widgets were misaligned in GTK as well as on
Windows, but for a completely different reason. (I guess both bugs
must have been introduced at the same time when I reworked the system
to tolerate more than two aligned widgets in commit b5ab90143a2df7f.)
To vertically align N widgets, you have to first figure out what range
of y-coordinates they jointly occupy, and then centre each one within
that range. We were trying to do both jobs in the same pass, which
meant trying to place the first widget before finding out where the
last one will be. To do this, we were separately computing the
y-range's start and width, the former by taking max of the
y-coordinates _seen so far_, and the latter by taking max of _all_ the
widgets' heights.
This has two problems. One is that if you later find out that the
y-coordinate of the top of the range needs to be lower than you'd
previously realised, it's too late to go back and reposition the
widgets you've already placed. But that's a theoretical issue that
would only come up with more complicated column layouts than we've
actually used. (And probably more complicated than would even be
_sensible_ to use.)
The other, more immediate, problem: the y-coordinates we were using
for already-placed widgets in the set were the ones _after_ we
adjusted each one for vertical centring. So if the first widget is
short and the second taller (say, heights 20 and 30 pixels), then the
first widget will be offset downwards by 5 pixels, but the second
widget will use that offset y-coordinate as the _top_ of the range to
fit itself into, and hence, will also be 5 pixels downward from where
it should have been.
I think only the second of those problems is immediately concerning,
but it's easier to fix both at once. I've removed the y-adjustment for
vertical centring from the main layout loop, and put it in a separate
pass run after the main layout finishes.
We had carefully calculated, for each control in an aligned group, how
much _that control_ should move downwards by. But then, because I
carelessly referred to the wrong variable name, we actually moved the
wrong one - not the control we'd just calculated the offset for, but
always the _last_ one in the group, which was the one the top-level
alignment code was processing at the point we began this loop.
As a result, the dropdown list in the front-page protocol selector was
hilariously misaligned. Now it's back where it should be.
Another of this weekend's warnings pointed out that this function
contained a pattern I now regard as a cardinal sin: variables called
'ret' that aren't clear whether they've _been_ returned from a
subroutine, or whether they're _planned_ to be returned from the
containing function. Worse, psftp_main had both: two of the former
kind shadowing a case of the latter in sub-scopes.
I decided that the 'namemaker' system introduced recently in commit
fbb979aa98 was just too marginal to be sensible, and it's easier
to simply quote the full SSH id for each protocol.
Also, included an empty argument at the end of each macro invocation,
so that the variadic "..." is never completely missing.
My experimental build with clang-cl at -Wall did show up a few things
that are safe enough to fix right now. One was this list of missing
includes, which was causing a lot of -Wmissing-prototype warnings, and
is a real risk because it means the declarations in headers weren't
being type-checked against the actual function definitions.
Happily, no actual mismatches.
I was wondering what this was doing here at all when strbuf_chomp is a
better choice. The answer turned out to be 'nothing' - it wasn't even
used any more.
Now we should get warned if we do anything that breaks the new
stricter MinGW warning level, not to mention anything generating
warnings in the clang-cl builds.
Unfortunately, it gives an absolutely huge number of warnings, and it
wouldn't be feasible to fix them all without risking introducing
further bugs. Perhaps _after_ the next release branch it might be
worth looking at some of them, but I don't think fixing them right now
is viable.
I've left it on for actual gcc, though, since the MinGW build seems OK
with it.