Now, instead of a 'const char *' in the static data segment, error
messages returned from backend setup are dynamically allocated and
freed by the caller.
This will allow me to make the messages much more specific (including
errno values and the like). However, this commit is pure refactoring:
I've _just_ changed the allocation policy, and left all the messages
alone.
If a terminal window closed with a popup (due to a network error,
for instance) while the mouse pointer was hidden by 'Hide mouse
pointer when typing in window', the mouse pointer could remain hidden
while over the terminal window, making it hard to navigate to the
popup.
Colin reports that on betas of Ubuntu 20.04, Pango has switched to
getting its font metrics from HarfBuzz, and a side effect is
apparently that they're being returned in the full precision of
PANGO_SCALE fixed point.
Previously, Pango appears to have been returning values that were
always a whole number of pixels scaled by PANGO_SCALE. Moreover, it
looks as if it was rounding the font ascent and descent _up_ to a
whole number of pixels, rather than rounding to nearest. But our code
rounds to nearest, which means that now the same font gets allocated
fewer vertical pixels, which can be enough to cut off some ascenders
or descenders.
Pango already provides the macro PANGO_PIXELS_CEIL, so it's easy to
switch over to using it. This should arrange that any text that fits
within the font's stated ascent/descent measurements will also fit in
the character cell.
If gdk_event_get_scroll_deltas() return failure for a given
GdkEventScroll, it doesn't follow that that event has no usable
scrolling action in it at all. The fallback is to call
gdk_event_get_scroll_direction() instead, which is less precise but
still gives _something_ you can use. So in that situation, instead of
just returning false, we can fall through to the handling we use for
pre-GTK3 scroll events (which are always imprecise).
In particular, I've noticed recently that if you run GTK 3 PuTTY in
the virtual X display created by vnc4server, and connect to it using
xtightvncviewer, then scroll-wheel actions passed through from the VNC
client will cause scroll_event() to receive low-res GdkEventScroll
structures of exactly this kind. So scroll-wheel activity on the
terminal window wasn't causing a scroll in that environment, and with
this patch, it does.
In the SSH-2 connection layer, an outstanding_channel_request
structure comes with a handler to be called back with the reply
packet, when the other end sends one. But sometimes it doesn't - if
the channel begins to close before the request has been replied to -
in which case the handler function is called with a NULL packet
pointer.
The common ssh2_channel_response function that handles most of the
client-side channel requests was not prepared to cope with that
pointer being null. Fixed by making it handle a null return the same
as CHANNEL_FAILURE.
This fills in the missing piece of Windows Pageant's story on deferred
decryption: we now actually know how to put up a dialog box asking for
the passphrase, when a not-yet-decrypted key is used.
This is quite a rough implementation so far, but it's a start. Known
issues:
- these new non-modal dialog boxes are serialised with respect to
each other by the Pageant core, but they can run in parallel with a
passphrase prompt popping up from the ordinary GUI 'Add Key'
operation. That may be too confusing; perhaps I should fix it.
- I'm not confident that the passphrase dialog box gets the keyboard
focus in all situations where I'd like it to (or what I can do
about it if not).
- the text in the non-modal box has two copies of the instruction
'enter passphrase for key'.
This is unlikely in most situations, but 'psusan' in particular is
intended to be run in a lot of weird environments where things aren't
properly set up yet. I just found out that if you use a Cygwin-built
psusan as the proxy process for Windows PuTTY (to get a local Cygwin
xterm) then it starts up with SHELL unset, and uxpty's forked
subprocess segfaults when it tries to exec a null pointer.
That causes the config dialog to terminate with result -1, which
wasn't handled at all by the result-receiving code. So GTK PuTTY would
continue running its main loop even though it had no windows open and
wasn't ever planning to do anything.
I just happened to notice in a re-read of the code that we were
computing b^2-4a and feeding it to mp_sqrt to check if it was a
perfect square, without having first checked that the subtraction
didn't overflow and deliver some arbitrary large positive number when
the true mathematical value was negative.
Fortunately, if this came up at all, it would have been as a false
_negative_ in Pockle's primality verification: it might have managed
to reject a genuine prime with a valid certificate on rare occasions.
So that's not too serious. But even so, now I've spotted it, fix it.
In commit 1f399bec58 I had the idea of generating the protocol radio
buttons in the GUI configurer by looping over the backends[] array,
which gets the reliably correct list of available backends for a given
binary rather than having to second-guess. That's given me an idea: we
can do the same for the per-backend config panels too.
Now the GUI config panel for every backend is guarded by a check of
backend_vt_from_proto, and we won't display the config for that
backend unless it's present.
In particular, this allows me to move the serial-port configuration
back into config.c from the separate file sercfg.c: we find out
whether to apply it by querying backend_vt_from_proto(PROT_SERIAL),
the same as any other backend.
In _particular_ particular, that also makes it much easier for me to
move the serial config up the pecking order, so that it's now second
only to SSH in the list of per-protocol config panes, which I think is
now where it deserves to be.
(A side effect of that is that I now have to come up with a different
method of having each serial backend specify the subset of parity and
flow control schemes it supports. I've done it by adding an extra pair
of serial-port specific bitmask fields to BackendVtable, taking
advantage of the new vtable definition idiom to avoid having to
boringly declare them as zero in all the other backends.)
This is a sweeping change applied across the whole code base by a spot
of Emacs Lisp. Now, everywhere I declare a vtable filled with function
pointers (and the occasional const data member), all the members of
the vtable structure are initialised by name using the '.fieldname =
value' syntax introduced in C99.
We were already using this syntax for a handful of things in the new
key-generation progress report system, so it's not new to the code
base as a whole.
The advantage is that now, when a vtable only declares a subset of the
available fields, I can initialise the rest to NULL or zero just by
leaving them out. This is most dramatic in a couple of the outlying
vtables in things like psocks (which has a ConnectionLayerVtable
containing only one non-NULL method), but less dramatically, it means
that the new 'flags' field in BackendVtable can be completely left out
of every backend definition except for the SUPDUP one which defines it
to a nonzero value. Similarly, the test_for_upstream method only used
by SSH doesn't have to be mentioned in the rest of the backends;
network Plugs for listening sockets don't have to explicitly null out
'receive' and 'sent', and vice versa for 'accepting', and so on.
While I'm at it, I've normalised the declarations so they don't use
the unnecessarily verbose 'struct' keyword. Also a handful of them
weren't const; now they are.
The contribution of SUPDUP has pushed SSH off the first page in my
default GTK font settings. It probably won't do that the same way for
everyone, but it did make me realise that the [Telnet, Rlogin, SSH]
order was a relic of 1997 when the SSH backend was new and added at
the bottom of the list. These days, SSH ought to be at the top for
easy access!
(And Serial ought to be next, but since it's added in a separate file,
that involves a bit more faffing about which I haven't done yet.)
The motivation is for the SUPDUP protocol. The server may send a
signal for the terminal to reset any input buffers. After this, the
server will not know the state of the terminal, so it is required to
send its cursor position back.
A known_hosts line can have multiple comma-separated hostnames on it,
or more usually a hostname and an IP address.
In the RSA and DSA key handlers, I was making a list of the integer
parameters of the public key by using the 'map' function, and then
iterating over it once per hostname on the line. But in Python 3, the
'map' function returns an iterator, not a list, so after you've
iterated to its end once, it's empty, and iterating over it a second
time stops immediately. As a result, the registry line for the second
hostname was coming out empty.
The testcrypt protocol expects a string literal to be a concatenation
of literal bytes other than '%' and '\n', and %-escaped hex digit
pairs. But testcrypt.py was only ever using the latter format, so even
a legible ASCII string like "123" was being sent to testcrypt as the
unreadable and needlessly long "%31%32%33".
When debugging, I often arrange to save the testcrypt input stream to
a file, and sometimes I use that file as the starting point for
editing. So it is actually useful to have the protocol exchange be
legible to humans. Hence, here's a change to testcrypt.py which makes
it only use the %-escape encoding for byte values that aren't
printable ASCII.
A 'strong' prime, as defined by the Handbook of Applied Cryptography,
is a prime p such that each of p-1 and p+1 has a large prime factor,
and that the large factor q of p-1 is such that q-1 in turn _also_ has
a large prime factor.
HoAC says that making your RSA key using primes of this form defeats
some factoring algorithms - but there are other faster algorithms to
which it makes no difference. So this is probably not a useful
precaution in practice. However, it has been recommended in the past
by some official standards, and it's easy to implement given the new
general facility in PrimeCandidateSource that lets you ask for your
prime to satisfy an arbitrary modular congruence. (And HoAC also says
there's no particular reason _not_ to use strong primes.) So I provide
it as an option, just in case anyone wants to select it.
The change to the key generation algorithm is entirely in sshrsag.c,
and is neatly independent of the prime-generation system in use. If
you're using Maurer provable prime generation, then the known factor q
of p-1 can be used to help certify p, and the one for q-1 to help with
q in turn; if you switch to probabilistic prime generation then you
still get an RSA key with the right structure, except that every time
the definition says 'prime factor' you just append '(probably)'.
(The probabilistic version of this procedure is described as 'Gordon's
algorithm' in HoAC section 4.4.2.)
Since our prime-generation code contains facilities not used by the
main key generators - Sophie Germain primes, user-specified modular
congruences, and MPU certificate output - it's probably going to be
useful sooner or later to have a command-line tool to access those
facilities. So here's a simple script that glues a Python argparse
interface on to the front of it all.
It would be nice to put this in 'contrib' rather than 'test', on the
grounds that it's at least potentially useful for purposes other than
testing PuTTY during development. But it's a client of the testcrypt
system, so it can't live anywhere other than the same directory as
testcrypt.py without me first having to do a lot of faffing about with
Python module organisation. So it can live here for the moment.
If you want to generate a Sophie Germain / safe prime pair with this
code, then after you've made p, you need to test the primality of
2p+1.
The easiest way to do that is to make a PrimeCandidateSource that is
so constrained as to only be able to deliver 2p+1 as a candidate, and
then run the ordinary prime generation system. The problem is that the
prime generation loops forever, so if 2p+1 _isn't_ prime, it will just
keep testing the same number over and over again and failing the test.
To solve this, I add a 'one-shot mode' to the PrimeCandidateSource
itself, which will cause it to return NULL if asked to generate a
second candidate. Then the prime-generation loops all detect that and
return NULL in turn. However, for clients that _don't_ set a pcs into
one-shot mode, the API remains unchanged: pcs_generate will not return
until it's found a prime (by its own criteria).
This feels like a bit of a bodge, API-wise. But the other two obvious
approaches turn out more awkward.
One option is to extract the Pockle from the PrimeGenerationContext
and use that to directly test primality of 2p+1 based on p - but that
way you don't get to _probabilistically_ generate safe primes, because
that kind of PGC has no Pockle in the first place. (And you can't make
one separately, because you can't convince it that an only
probabilistically generated p is prime!)
Another option is to add a test() method to PrimeGenerationPolicy,
that sits alongside generate(). Then, having generated p, you just
_test_ 2p+1. But then in the provable case you have to explain to it
that it should use p as part of the proof, so that API would get
awkward in its own way.
So this is actually the least disruptive way to do it!
A Sophie Germain prime is a prime p such that 2p+1 is also prime. The
larger prime of the pair 2p+1 is also known as a 'safe prime', and is
the preferred kind of modulus for conventional Diffie-Hellman.
Generating these is harder work than normal prime generation. There's
not really much of a technique except to just keep generating
candidate primes p and then testing 2p+1. But what you _can_ do to
speed things up is to get the prime-candidate generator to help a bit:
it's already enforcing that no small prime divides p, and it's easy to
get it to also enforce that no small prime divides 2p+1. That check
can filter out a lot of bad candidates early, before you waste time on
the more expensive checks, so you have a better chance of success with
each number that gets as far as Miller-Rabin.
Here I add an extra setup function for PrimeCandidateSource which
enables those extra checks. After you call pcs_try_sophie_germain(),
the PCS will only deliver you numbers for which both p and 2p+1 are
free of small factors.
Conveniently checkable certificates of primality aren't a new concept.
I didn't invent them, and I wasn't the first to implement them. Given
that, I thought it might be useful to be able to independently verify
a prime generated by PuTTY's provable prime system. Then, even if you
don't trust _this_ code, you might still trust someone else's
verifier, or at least be less willing to believe that both were
colluding.
The Perl module Math::Prime::Util is the only free software I've found
that defines a specific text-file format for certificates of
primality. The MPU format (as it calls it) supports various different
methods of certifying the primality of a number (most of which, like
Pockle's, depend on having previously proved some smaller number(s) to
be prime). The system implemented by Pockle is on its list: MPU calls
it by the name "BLS5".
So this commit introduces extra stored data inside Pockle so that it
remembers not just _that_ it believes certain numbers to be prime, but
also _why_ it believed each one to be prime. Then there's an extra
method in the Pockle API to translate its internal data structures
into the text of an MPU certificate for any number it knows about.
Math::Prime::Util doesn't come with a command-line verification tool,
unfortunately; only a Perl function which you feed a string argument.
So also in this commit I add test/mpu-check.pl, which is a trivial
command-line client of that function.
At the moment, this new piece of API is only exposed via testcrypt. I
could easily put some user interface into the key generation tools
that would save a few primality certificates alongside the private
key, but I have yet to think of any good reason to do it. Mostly this
facility is intended for debugging and cross-checking of the
_algorithm_, not of any particular prime.
Most of them are now _mandatory_ P3 scripts, because I'm tired of
maintaining everything to be compatible with both versions.
The current exceptions are gdb.py (which has to live with whatever gdb
gives it), and kh2reg.py (which is actually designed for other people
to use, and some of them might still be stuck on P2 for the moment).
This enables Pageant to act as a client for OpenSSH's agent, which
nowadays refuses to respond to SSH1_AGENTC_REQUEST_RSA_IDENTITIES, or
any other SSH1_AGENTC_* message. It now treats SSH_AGENT_FAILURE in
response to either 'list identities' request the same as successfully
receiving an empty list.
We're no longer doing the delta step at all, and even if we were, it's
not really part of _that_ particular algorithm - candidate selection
is centralised between all of them.
When judging how many bits of the generated prime we can afford to
consume with factors of p-1 and still have enough last few bits to
vary to find an actual prime in the range, I started by setting
max_bits_needed to the total size of the required output number, and
then subtracting a safety margin.
But that doesn't account for the fact that some bits may _already_
have been used by prior requirements from the PrimeCandidateSource,
such as the 'firstbits' used in RSA generation, or the 160-bit factor
of p-1 used in DSA.
So now we start by initialising max_bits_needed by asking the PCS how
many bits of entropy it still has left, and making sure not to reduce
_that_ by too much. Should fix another cause of hangs during prime
generation.
(Also, while I'm here, I've tweaked one of the compiled-out
diagnostics so that it reports how many bits it _does_ have left once
it starts trying to find a prime. That should make it easier to spot
any further problems in this area.)
I generated a list of sizes in bits for factors of p-1, knowing a
range of bit sizes that I wanted their product to fall in. But I
forgot that when you multiply together two or more numbers of
particular bit sizes, you can get more than one possible bit size
back. My code was estimating at the low end of the possible range, so
sometimes it would end up with more bits of the output prime specified
than it expected, and be left without enough variable bits to actually
be able to find a prime somewhere in the remaining space.
Now when I'm planning the factor list, I compute both the min and max
sizes of the product of the factors, and abort if any part of the
possible range is outside the safe zone.
How embarrassing - this morning's triumphant push of a shiny new
public-key method managed to break the entire GUI configuration system
so that it dereferences a null pointer during setup. That's what I get
for only testing the crypto side.
settings.c generates a preference list of host-key enum values that
included HK_ED448. So then hklist_handler() in config.c tries to look
that id up in its list of names, and doesn't find one, because I
forgot to add it there. Now reinstated.
The comparison functions between an mp_int and an integer worked by
walking along the mp_int, comparing each of its words to the
corresponding word of the integer. When they ran out of mp_int, they'd
stop.
But this overlooks the possibility that they might not have run out of
_integer_ yet! If BIGNUM_INT_BITS is defined to be less than the size
of a uintmax_t, then comparing (say) the uintmax_t 0x8000000000000001
against a one-word mp_int containing 0x0001 would return equality,
because it would never get as far as spotting the high bit of the
integer.
Fixed by iterating up to the max of the number of BignumInts in the
mp_int and the number that cover a uintmax_t. That means we have to
use mp_word() instead of a direct array lookup to get the mp_int words
to compare against, since now the word indices might be out of range.
Functions like mp_copy_integer_into, mp_add_integer_into and
mp_hs_integer all take an ordinary C integer in the form of a
uintmax_t, and perform an operation between that and an mp_int. In
order to do that, they have to break it up into some number of
BignumInt, via bit shifts.
But in C, shifting by an amount equal to or greater than the width of
the type is undefined behaviour, and you risk the compiler generating
nonsense or complaining at compile time. I did various dodges in those
functions to try to avoid that, but didn't manage to use the same
idiom everywhere. Sometimes I'd leave the integer in its original form
and shift it right by increasing multiples of BIGNUM_INT_BITS;
sometimes I'd shift it down in place every time. And mostly I'd do the
conditional shift by checking against sizeof(n), but once I did it by
shifting by half the word and then the other half.
Now refactored so that there's a pair of functions to shift a
uintmax_t left or right by BIGNUM_INT_BITS in what I hope is a UB-safe
manner, and changed all the code I could find to use them.
This is standardised by RFC 8709 at SHOULD level, and for us it's not
too difficult (because we use general-purpose elliptic-curve code). So
let's be up to date for a change, and add it.
This implementation uses all the formats defined in the RFC. But we
also have to choose a wire format for the public+private key blob sent
to an agent, and since the OpenSSH agent protocol is the de facto
standard but not (yet?) handled by the IETF, OpenSSH themselves get to
say what the format for a key should or shouldn't be. So if they don't
support a particular key method, what do you do?
I checked with them, and they agreed that there's an obviously right
format for Ed448 keys, which is to do them exactly like Ed25519 except
that you have a 57-byte string everywhere Ed25519 had a 32-byte
string. So I've done that.
In the Windows GUI, all the controls that were previously named or
labelled Ed25519 are now labelled EdDSA, and when you select that
top-level key type, there's a dropdown for the specific curve (just
like for ECDSA), whose only current value is Ed25519.
In command-line PuTTYgen, you can say '-t eddsa' and give a number of
bits, just like '-t ecdsa'. You can also still say '-t ed25519', for
backwards compatibility.
Also in command-line PuTTYgen, I've reworked the error messages if you
give a number of bits that doesn't correspond to a known elliptic
curve. Now the messages are generated by consulting the list of
curves, so that that list has to be updated by hand in one fewer
place.
In Ed25519, when you hash things, you just feed them straight to
SHA-512. But in Ed448, you prefix them with a magic string before
feeding them to SHAKE256, and the magic string varies depending on
which variant of Ed448 is in use.
Add support for such a prefix, and for Ed25519, set it to the empty
string.
An EdDSA public key is transmitted as an integer just big enough to
store the curve point's y value plus one extra bit. The topmost bit of
the integer is set to indicate the parity of the x value.
In Ed25519, this works out very nicely: the curve modulus is just
under 2^255, so the y value occupies all but one bit of the topmost
byte of the integer, leaving the y parity to fit neatly in the spare
bit. But in Ed448, the curve modulus occupies a whole number of bytes
(56 of them) and the format standardises that you therefore have to
use a 57-byte string for the public key, with the highest-order
(final) byte having zero in the low 7 of its bits.
To support this, I've arranged that the Edwards-curve modes in
sshecc.c now set curve->fieldBytes to the right number of bytes to
hold curve->fieldBits+1 bits, instead of the obvious curve->fieldBits.
Doing it that way means fewer special cases everywhere else in the
code.
Also, this means changing the order of the validity check: we now wait
until after we've checked and cleared the topmost bit to see if the
rest is within range. That deals with checking the intervening 7 zero
bits.
This works more or less like the similar refactoring for Montgomery
curves in 7fa0749fcb: where we previously hardwired the clearing of 3
low bits of a private exponent, we now turn that 3 into a curve-
specific constant, so that Ed448 will be able to set it to a different
value.
With all the preparation now in place, this is more or less trivial.
We add a new curve setup function in sshecc.c, and an ssh_kex linking
to it; we add the curve parameters to the reference / test code
eccref.py, and use them to generate the list of low-order input values
that should be rejected by the sanity check on the kex output; we add
the standard test vectors from RFC 7748 in cryptsuite.py, and the
low-order values we just generated.
There's always something. I added the actual option, but forgot to
advertise it in the online help.
While I'm here, I've also allowed the word 'proven' as an alternative
spelling for 'provable', because having the approved spellings be
'probable' and 'provable' is just asking for hilarious typos.
In Windows PuTTYgen, this is selected by an extra set of radio-button
style menu options in the Key menu. In the command-line version,
there's a new --primes=provable option.
This whole system is new, so I'm not enabling it by default just yet.
I may in future, though: it's running faster than I expected (in
particular, a lot faster than any previous prototype of the same
algorithm I attempted in standalone Python).
This uses all the facilities I've been adding in previous commits. It
implements Maurer's algorithm for generating a prime together with a
Pocklington certificate of its primality, by means of recursing to
generate smaller primes to be factors of p-1 for the Pocklington
check, then doing a test Miller-Rabin iteration to quickly exclude
obvious composites, and then doing the full Pocklington check.
In my case, this means I add each prime I generate to a Pockle. So the
algorithm says: recursively generate some primes and add them to the
PrimeCandidateSource, then repeatedly get a candidate value back from
the pcs, check it with M-R, and feed it to the Pockle. If the Pockle
accepts it, then we're done (and the Pockle will then know that value
is prime when our recursive caller uses it in turn, if we have one).
A small refinement to that algorithm is that I iterate M-R until the
witness value I tried is such that it at least _might_ be a primitive
root - which is to say that M-R didn't get 1 by evaluating any power
of it smaller than n-1. That way, there's less chance of the Pockle
rejecting the witness value. And sooner or later M-R must _either_
tell me I've got a potential primitive-root witness _or_ tell me it's
shown the number to be composite.
The old system I removed in commit 79d3c1783b had both linear and
exponential phase types, but the new one only had exponential, because
at that point I'd just thrown away all the clients of the linear phase
type. But I'm going to add another one shortly, so I have to put it
back in.
pcs_get_upper_bound lets the holder of a PrimeCandidateSource ask what
is the largest value it might ever generate. pcs_get_bits_remaining
lets it ask how much extra entropy it's going to generate on top of
the requirements that have already been input into it.
Both of these will be needed by the upcoming provable-prime work to
decide what sizes of subsidiary prime to generate.