Mostly I'm rearranging things because of the new workflows that git
makes available - it's now possible (and indeed sensible) to prepare a
lot of stuff in a fairly relaxed manner in local checkouts, and then
the process of going live with the release has a lot less manual
writing of stuff and a lot more mechanical 'git push' and running of
update scripts.
However, there's one new item that was actually missed off the
previous checklist: turning off nightly pre-release builds after
making the release they were a pre-release of. Ahem.
(cherry picked from commit 8af53d1b69)
encodelib.py is a Python library which implements some handy SSH-2
encoding primitives; samplekex.py uses that to fabricate the start of
an SSH connection, up to the point where key exchange totally fails
its crypto.
The idea is that you adapt samplekex.py to construct initial-kex
sequences with particular properties, in order to test robustness and
security fixes that affect the initial-kex sequence. For example, I
used an adaptation of this to test the Diffie-Hellman range check
that's just gone into 0.64.
(cherry picked from commit 12d5b00d62)
To understand the handle leak bug that I fixed in git commit
7549f2da40, I had to think fairly hard
to remind myself what all this code was doing, which means the
comments weren't good enough. Expanded and rewritten some of them in
the hope that things will be clearer next time.
(cherry picked from commit a87a14ae0f)
Cherry-picker's notes: this apparently pointless commit is required on
this branch because it's a dependency of the rather less pointless
9fec2e7738.
Our config boxes are constructed using the CreateDialog() API
function, rather than the modal DialogBox(). CreateDialog() is not
that different from CreateWindow(), so windows created with it don't
appear on the screen automatically; MSDN says that they must be shown
via ShowWindow(), just like non-dialog windows have to be. But we
weren't doing that at any point!
So how was our config box ever getting displayed at all? Apparently by
sheer chance, it turns out. The handler for a selection change in the
tree view, which has to delete a whole panel of controls and creates a
different set, surrounds that procedure with some WM_SETREDRAW calls
and an InvalidateRect(), to prevent flicker while lots of changes were
being made. And the creation of the _first_ panelful of controls, at
dialog box setup, was done by simply selecting an item in the treeview
and expecting that handler to be recursively called. And it appears
that calling WM_SETREDRAW(TRUE) and then InvalidateRect was
undocumentedly having an effect equivalent to the ShowWindow() we
should have called, so that we never noticed the latter was missing.
But a recent Vista update (all reports implicate KB3057839) has caused
that not to work any more: on an updated Vista machine, in some
desktop configurations, it seems that any attempt to fiddle with
WM_SETREDRAW during dialog setup can leave the dialog box in a really
unhelpful invisible state - the window is _physically there_ (you can
see its taskbar entry, and the mouse pointer changes as you move over
where its edit boxes are), but 100% transparent.
So now we're doing something a bit more sensible. The first panelful
of controls is created directly by the WM_INITDIALOG handler, rather
than recursing into code that wasn't really designed to run at setup
time. To be on the safe side, that handler for treeview selection
change is also disabled until the WM_INITDIALOG handler has finished
(like we already did with the WM_COMMAND handler), so that we can be
sure of not accidentally messing about with WM_SETREDRAW at all during
setup. And at the end of setup, we show the window in the sensible
way, by a docs-approved call to ShowWindow().
This appears (on the one machine I've so far tested it on) to fix the
Vista invisible-window issue, and also it should be more API-compliant
and hence safer in future.
Assorted calls to ssh_pkt_getstring in handling the later parts of key
exchange (post-KEXINIT) were not checked for NULL afterwards, so that
a variety of badly formatted key exchange packets would cause a crash
rather than a sensible error message.
None of these is an exploitable vulnerability - the server can only
force a clean null-deref crash, not an access to actually interesting
memory.
Thanks to '3unnym00n' for pointing out one of these, causing me to
find all the rest of them too.
gcc and clang both provide a type called __uint128_t when compiling
for 64-bit targets, code-generated more or less similarly to the way
64-bit long longs are handled on 32-bit targets (spanning two
registers, using ADD/ADC, that sort of thing). Where this is available
(and they also provide a handy macro to make it easy to detect), we
should obviously use it, so that we can handle bignums a larger chunk
at a time and make use of the full width of the hardware's multiplier.
Preliminary benchmarking using 'testbn' suggests a factor of about 2.5
improvement.
I've added the new possibility to the ifdefs in sshbn.h, and also
re-run contrib/make1305.py to generate a set of variants of the
poly1305 arithmetic for the new size of BignumInt.
In many places I was using an 'unsigned int', or an implicit int by
virtue of writing an undecorated integer literal, where what was
really wanted was a BignumInt. In particular, this substitution breaks
in any situation where BignumInt is _larger_ than unsigned - which it
is shortly about to be.
The final main loop in do_ssh2_authconn will sometimes loop over all
currently open channels calling ssh2_try_send_and_unthrottle. If the
channel is a sharing one, however, that will reference fields of the
channel structure like 'remwindow', which were never initialised in
the first place (thanks, valgrind). Fix by excluding CHAN_SHARING
channels from that loop.
If the real SSH connection goes away and we call sharestate_free with
downstreams still active, then that in turn calls share_connstate_free
on all those downstreams, freeing the things their sockets are using
as Plugs but not actually closing the sockets, so further data coming
in from downstream gives rise to a use-after-free bug.
(Thanks to Timothe Litt for a great deal of help debugging this.)
Rather than doing arithmetic mod 2^130-5 using the general-purpose
Bignum library, which requires lots of mallocs and frees per operation
and also uses a general-purpose divide routine for each modular
reduction, we now have some dedicated routines in sshccp.c to do
arithmetic mod 2^130-5 in a more efficient way, and hopefully also
with data-independent performance.
Because PuTTY's target platforms don't all use the same size of bignum
component, I've arranged to auto-generate the arithmetic functions
using a Python script living in the 'contrib' directory. As and when
we need to support an extra BignumInt size, that script should still
be around to re-run with different arguments.
This allows files other than sshbn.c to work with the primitives
necessary to build multi-word arithmetic functions satisfying all of
PuTTY's portability constraints.
I'd rather see the cipher and MAC named separately, with a hint that
the two are linked together in some way, than see the cipher called by
a name including the MAC and the MAC init message have an ugly
'<implicit>' in it.
This allows for sharing a bit of code, and it also means that
deduplication of KEXINIT algorithms can be done while working out the
list of algorithms rather than when constructing the KEXINIT packet
itself.
The general plan is that if PuTTY knows a host key for a server, it
should preferentially ask for the same type of key so that there's some
chance of actually getting the same key again. This should mean that
when a server (or PuTTY) adds a new host key type, PuTTY doesn't
gratuitously switch to that key type and then warn the user about an
unrecognised key.
Installing this systemwide as the Windows text selection cursor is a
workaround for 'black-pointer'. It's a white I-beam with a one-pixel
black outline around it, so it should be visible on any background
colour. (I suppose that a backdrop of tightly packed I-beams looking
just like it might successfully hide it, but that's unlikely :-)
I constructed this some years ago for personal use; I needed it again
this week and had to go and recover it from a backup of a defunct
system, which made me think I really ought to check it in somewhere,
and this 'contrib' directory seems like the ideal place.
It doesn't seem to be all that good a spec, in that it seems to be
specified in terms of functions in libssh and hence based on the
assumption that you already know exactly what those functions do. But
it's something, at least.
We were doing an endianness flip on the output elliptic-curve point.
Endianness flips of bignums, of course, have to specify how many bytes
they're imagining the value to have (that's how you decide whether to
convert 0xA1A2 into 0xA2A1 or 0xA2A10000 or 0xA2A1000000000000 etc),
and we had chosen our byte count based on the highest set bit in the
_output value_ - but in fact we should have chosen it based on the
size of the curve's modulus, leading to a failure about 1/256 of the
time when the MSB happened to come out zero so the two byte counts
differed.
(Also added a missing smemclr, while I was there.)
It seems like quite an important thing to mention in the event log!
Suppose there's a bug affecting only one curve, for example? Fixed-
group Diffie-Hellman has always logged the group, but the ECDH log
message just told you the hash and not also the curve.
To implement this, I've added a 'textname' field to all elliptic
curves, whether they're used for kex or signing or both, suitable for
use in this log message and any others we might find a need for in
future.
An unguarded write() in the dputs function caused gcc -Werror to fail
to compile. I'm confused that this hasn't bitten me before, though -
obviously normal builds of PuTTY condition out the faulty code, but
_surely_ this can't be the first time I've enabled the developer
diagnostics since gcc started complaining about unchecked syscall
returns!
When anyone connects to a PuTTY tool's listening socket - whether it's
a user of a local->remote port forwarding, a connection-sharing
downstream or a client of Pageant - we'd like to log as much
information as we can find out about where the connection came from.
To that end, I've implemented a function sk_peer_info() in the socket
abstraction, which returns a freeform text string as best it can (or
NULL, if it can't get anything at all) describing the thing at the
other end of the connection. For TCP connections, this is done using
getpeername() to get an IP address and port in the obvious way; for
Unix-domain sockets, we attempt SO_PEERCRED (conditionalised on some
moderately hairy autoconfery) to get the pid and owner of the peer. I
haven't implemented anything for Windows named pipes, but I will if I
hear of anything useful.
Caused an embarrassing failure just now trying to run the test program
from a command prompt - I had Return still held down by the time it
started up, and my release of it immediately terminated input :-)
The new code remembers the contents and meaning of the outgoing KEXINIT
and uses this to drive the algorithm negotiation, rather than trying to
reconstruct what the outgoing KEXINIT probably said. This removes the
need to maintain the KEXINIT generation and parsing code precisely in
parallel.
It also fixes a bug whereby PuTTY would have selected the wrong host key
type in cases where the server gained a host key type between rekeys.
The OpenSSH PEM format contains a big integer with the top bit
potentially set, which we handle by copying the data into a faked up
instance of our own private key format, and passing that to
ecdsa_createkey(). But our own private key format expects an SSH-2
standard mpint, i.e. with the top bit reliably clear, so this might
fail for no good reason.
Fixed by prefixing a zero byte unconditionally when constructing the
fake private blob.
snew(), and most of the bignum functions, are deliberately written to
fail an assertion and terminate the program rather than return NULL,
so there's no point carefully checking their every return value for
NULL. This removes a huge amount of pointless error-checking code, and
makes the elliptic curve arithmetic almost legible in places :-)
I've kept error checks after modinv(), because that can return NULL if
asked to invert zero. bigsub() can also fail in principle, because our
bignums are non-negative only, but in the couple of cases where it's
used there's a preceding compare that should prevent it, so I've just
added assertions.
The menu options and radio buttons for key type were not consistently
setting each other when selected: in particular, selecting from the
menu did not cause the ED25519 radio button to be either set or unset
when that would have been appropriate.
Looks as if I failed to catch in code review the fact that we should
have _one_ call to each of CheckRadioButton and CheckMenuRadioItem,
and they should both have the right 'first' and 'last' parameters
Having found a lot of unfixed constness issues in recent development,
I thought perhaps it was time to get proactive, so I compiled the
whole codebase with -Wwrite-strings. That turned up a huge load of
const problems, which I've fixed in this commit: the Unix build now
goes cleanly through with -Wwrite-strings, and the Windows build is as
close as I could get it (there are some lingering issues due to
occasional Windows API functions like AcquireCredentialsHandle not
having the right constness).
Notable fallout beyond the purely mechanical changing of types:
- the stuff saved by cmdline_save_param() is now explicitly
dupstr()ed, and freed in cmdline_run_saved.
- I couldn't make both string arguments to cmdline_process_param()
const, because it intentionally writes to one of them in the case
where it's the argument to -pw (in the vain hope of being at least
slightly friendly to 'ps'), so elsewhere I had to temporarily
dupstr() something for the sake of passing it to that function
- I had to invent a silly parallel version of const_cmp() so I could
pass const string literals in to lookup functions.
- stripslashes() in pscp.c and psftp.c has the annoying strchr nature
Removed another set of ad-hoc tests of the key size to decide which
hash to use for the signature system, and replaced them with a
straightforward pointer to an ssh_hash structure in the 'extra' area.
The ec_name_to_curve and ec_curve_to_name functions shouldn't really
have had to exist at all: whenever any part of the PuTTY codebase
starts using sshecc.c, it's starting from an ssh_signkey or ssh_kex
pointer already found by some other means. So if we make sure not to
lose that pointer, we should never need to do any string-based lookups
to find the curve we want, and conversely, when we need to know the
name of our curve or our algorithm, we should be able to look it up as
a straightforward const char * starting from the algorithm pointer.
This commit cleans things up so that that is indeed what happens. The
ssh_signkey and ssh_kex structures defined in sshecc.c now have
'extra' fields containing pointers to all the necessary stuff;
ec_name_to_curve and ec_curve_to_name have been completely removed;
struct ec_curve has a string field giving the curve's name (but only
for those curves which _have_ a name exposed in the wire protocol,
i.e. the three NIST ones); struct ec_key keeps a pointer to the
ssh_signkey it started from, and uses that to remember the algorithm
name rather than reconstructing it from the curve. And I think I've
got rid of all the ad-hockery scattered around the code that switches
on curve->fieldBits or manually constructs curve names using stuff
like sprintf("nistp%d"); the only remaining switch on fieldBits
(necessary because that's the UI for choosing a curve in PuTTYgen) is
at least centralised into one place in sshecc.c.
One user-visible result is that the format of ed25519 host keys in the
registry has changed: there's now no curve name prefix on them,
because I think it's not really right to make up a name to use. So any
early adopters who've been using snapshot PuTTY in the last week will
be inconvenienced; sorry about that.
This gives families of public key and kex functions (by which I mean
those sharing a set of methods) a place to store parameters that allow
the methods to vary depending on which exact algorithm is in use.
The ssh_kex structure already had a set of parameters specific to
Diffie-Hellman key exchange; I've moved those into sshdh.c and made
them part of the 'extra' structure for that family only, so that
unrelated kex methods don't have to faff about saying NULL,NULL,0,0.
(This required me to write an extra accessor function for ssh.c to ask
whether a DH method was group-exchange style or fixed-group style, but
that doesn't seem too silly.)
Not all of them, but the ones that don't get a 'void *key' parameter.
This means I can share methods between multiple ssh_signkey
structures, and still give those methods an easy way to find out which
public key method they're dealing with, by loading parameters from a
larger structure in which the ssh_signkey is the first element.
(In OO terms, I'm arranging that all static methods of my public key
classes get a pointer to the class vtable, to make up for not having a
pointer to the class instance.)
I haven't actually done anything with the new facility in this commit,
but it will shortly allow me to clean up the constant lookups by curve
name in the ECDSA code.
All the name strings in ssh_cipher, ssh_mac, ssh_hash, ssh_signkey
point to compile-time string literals, hence should obviously be const
char *.
Most of these const-correctness patches are just a mechanical job of
adding a 'const' in the one place you need it right now, and then
chasing the implications through the code adding further consts until
it compiles. But this one has actually shown up a bug: the 'algorithm'
output parameter in ssh2_userkey_loadpub was sometimes returning a
pointer to a string literal, and sometimes a pointer to dynamically
allocated memory, so callers were forced to either sometimes leak
memory or sometimes free a bad thing. Now it's consistently
dynamically allocated, and should be freed everywhere too.
Adding an extra radio button to the key-type selector caused it to
wrap on to another line and push the bottom of the containing control
box down off the bottom of the window.
In the long term, should more public key formats continue to appear,
we'll probably have to replace the radio buttons with something more
extensible like a drop-down list. For the moment, though, I've fixed
it by just reducing the space per radio button to bring all five
controls back on to the same line.
To fit the text into the smaller space, I also removed the 'SSH-2'
prefix on each key type, which ought to be unnecessary these days
since SSH-2 is a well established default. Only the SSH-1 RSA key type
is still labelled with an SSH version. (And I've moved it to the far
end rather than the start of the line, while I'm here.)
I've no reason to believe it will _currently_ be called with the
'passphrases' tree not even set up yet, but I managed to get that to
happen while playing about with experimental code just now, and it
seemed like a good safety check to keep in general.
I've written my own analogue of OpenSSH's ssh-askpass. At the moment,
it's contained inside Pageant proper, though it could easily be
compiled into a standalone binary as well or instead.
Unlike OpenSSH's version, I don't use a GTK edit box; instead I just
process key events myself and append them to a buffer. The big
advantage of doing this is that I can arrange for ^W and ^U to
function as they do in terminal line editing, i.e. delete a word or
delete the whole line.
^W in particular is really valuable when typing a multiple-word
passphrase unseen. If you feel yourself making the kind of typo in
which you're not sure if you pressed six keys or just five, you can
hit ^W and restart just that word, without either having to go right
back to the beginning or carry on and see if you feel lucky.
A delete-word function would of course be an information leak in even
an obscured edit box (displaying a blob per character), so instead I
give a visual acknowledgment of keypresses by a more ad-hoc means: I
display three lights in the box, and every meaningful keypress turns
off the currently active one and instead turns on a randomly selected
one of the others. (So the lit light doesn't even indicate _mod 3_ how
many keys have been pressed.)
I had freed the comment string coming back from pageant_add_keyfile,
but not NULLed out the pointer, so that the cleanup code at the end of
the function would have freed it again.