The assorted host-key and warning prompt messages have no reason to
differ between the two platforms, so let's centralise them. Also,
while I'm here, some basic support functions that are the same in both
modules.
This removes both uses of SHA-1 in the file format: it was used as the
MAC protecting the key file against tamperproofing, and also used in
the key derivation step that converted the user's passphrase to cipher
and MAC keys.
The MAC is simply upgraded from HMAC-SHA-1 to HMAC-SHA-256; it is
otherwise unchanged in how it's applied (in particular, to what data).
The key derivation is totally reworked, to be based on Argon2, which
I've just added to the code base. This should make stolen encrypted
key files more resistant to brute-force attack.
Argon2 has assorted configurable parameters for memory and CPU usage;
the new key format includes all those parameters. So there's no reason
we can't have them under user control, if a user wants to be
particularly vigorous or particularly lightweight with their own key
files. They could even switch to one of the other flavours of Argon2,
if they thought side channels were an especially large or small risk
in their particular environment. In this commit I haven't added any UI
for controlling that kind of thing, but the PPK loading function is
all set up to cope, so that can all be added in a future commit
without having to change the file format.
While I'm at it, I've also switched the CBC encryption to using a
random IV (or rather, one derived from the passphrase along with the
cipher and MAC keys). That's more like normal SSH-2 practice.
This is going to be used in the new version of the PPK file format. It
was the winner of the Password Hashing Context, which I think makes it
a reasonable choice.
Argon2 comes in three flavours: one with no data dependency in its
memory addressing, one with _deliberate_ data dependency (intended to
serialise computation, to hinder parallel brute-forcing), and a hybrid
form that starts off data-independent and then switches over to the
dependent version once the sensitive input data has been adequately
mixed around. I test all three in the test suite; the side-channel
tester can only expect Argon2i to pass; and, following the spec's
recommendation, I'll be using Argon2id for the actual key file
encryption.
In commit 1f399bec58 I had the idea of generating the protocol radio
buttons in the GUI configurer by looping over the backends[] array,
which gets the reliably correct list of available backends for a given
binary rather than having to second-guess. That's given me an idea: we
can do the same for the per-backend config panels too.
Now the GUI config panel for every backend is guarded by a check of
backend_vt_from_proto, and we won't display the config for that
backend unless it's present.
In particular, this allows me to move the serial-port configuration
back into config.c from the separate file sercfg.c: we find out
whether to apply it by querying backend_vt_from_proto(PROT_SERIAL),
the same as any other backend.
In _particular_ particular, that also makes it much easier for me to
move the serial config up the pecking order, so that it's now second
only to SSH in the list of per-protocol config panes, which I think is
now where it deserves to be.
(A side effect of that is that I now have to come up with a different
method of having each serial backend specify the subset of parity and
flow control schemes it supports. I've done it by adding an extra pair
of serial-port specific bitmask fields to BackendVtable, taking
advantage of the new vtable definition idiom to avoid having to
boringly declare them as zero in all the other backends.)
This is standardised by RFC 8709 at SHOULD level, and for us it's not
too difficult (because we use general-purpose elliptic-curve code). So
let's be up to date for a change, and add it.
This implementation uses all the formats defined in the RFC. But we
also have to choose a wire format for the public+private key blob sent
to an agent, and since the OpenSSH agent protocol is the de facto
standard but not (yet?) handled by the IETF, OpenSSH themselves get to
say what the format for a key should or shouldn't be. So if they don't
support a particular key method, what do you do?
I checked with them, and they agreed that there's an obviously right
format for Ed448 keys, which is to do them exactly like Ed25519 except
that you have a 57-byte string everywhere Ed25519 had a 32-byte
string. So I've done that.
This implements an extended form of primality verification using
certificates based on Pocklington's theorem. You make a Pockle object,
and then try to convince it that one number after another is prime, by
means of providing it with a list of prime factors of p-1 and a
primitive root. (Or just by saying 'this prime is small enough for you
to check yourself'.)
Pocklington's theorem requires you to have factors of p-1 whose
product is at least the square root of p. I've extended that to
support factorisations only as big as the cube root, via an extension
of the theorem given in Maurer's paper on generating provable primes.
The Pockle object is more or less write-only: it has no methods for
reading out its contents. Its only output channel is the return value
when you try to insert a prime into it: if it isn't sufficiently
convinced that your prime is prime, it will return an error code. So
anything for which it returns POCKLE_OK you can be confident of.
I'm going to use this for provable prime generation. But exposing this
part of the system as an object in its own right means I can write a
set of unit tests for this specifically. My negative tests exercise
all the different ways a certification can be erroneous or inadequate;
the positive tests include proofs of primality of various primes used
in elliptic-curve crypto. The Poly1305 proof in particular is taken
from a proof in DJB's paper, which has exactly the form of a
Pocklington certificate only written in English.
This further cleans up the prime-generation code, to the point where
the main primegen() function has almost nothing in it. Also now I'll
be able to reuse M-R as a primitive in more sophisticated alternatives
to primegen().
This reverts commit a7bdefb394.
I had accidentally mashed it together with another commit. I did
actually want to push both of them, but I'd rather push them
separately! So I'm backing out the combined blob, and I'll re-push
them with their proper comments and explanations.
The old API was one of those horrible things I used to do when I was
young and foolish, in which you have just one function, and indicate
which of lots of things it's doing by passing in flags. It was crying
out to be replaced with a vtable.
While I'm at it, I've reworked the code on the Windows side that
decides what to do with the progress bar, so that it's based on
actually justifiable estimates of probability rather than magic
integer constants.
Since computers are generally faster now than they were at the start
of this project, I've also decided there's no longer any point in
making the fixed final part of RSA key generation bother to report
progress at all. So the progress bars are now only for the variable
part, i.e. the actual prime generations.
This is built more or less entirely out of pieces I already had. The
SOCKS server code is provided by the dynamic forwarding code in
portfwd.c. When that accepts a connection request, it wants to talk to
an SSH ConnectionLayer, which is already a trait with interchangeable
implementations - so I just provide one of my own which only supports
the lportfwd_open() method. And that in turn returns an SshChannel
object, with a special trait implementation all of whose methods
just funnel back to an ordinary Socket.
Result: you get a Socket-to-Socket SOCKS implementation with no SSH
anywhere, and even a minimal amount of need to _pretend_ internally to
be an SSH implementation.
Additional features include the ability to log all the traffic in the
form of diagnostics to standard error, or log each direction of each
connection separately to a file, or for anything more general, to log
each direction of each connection through a pipe to a subcommand that
can filter out whatever you think are the interesting parts. Also, you
can spawn a subcommand after the SOCKS server is set up, and terminate
automatically when that subcommand does - e.g. you might use this to
wrap the execution of a single SOCKS-using program.
This is a modernisation of a diagnostic utility I've had kicking
around out-of-tree for a long time. With all of last year's
refactorings, it now becomes feasible to keep it in-tree without
needing huge amounts of scaffolding. Also, this version runs on
Windows, which is more than the old one did. (On Windows I haven't
implemented the subprocess parts, although there's no reason I
_couldn't_.)
As well as diagnostic uses, this may also be useful in some situations
as a thing to forward ports to: PuTTY doesn't currently support
reverse dynamic port forwarding (in which the remote listening port
acts as a SOCKS server), but you could get the same effect by
forwarding a remote port to a local instance of this. (Although, of
course, that's nothing you couldn't achieve using any other SOCKS
server.)
I've replaced the random number generation and small delta-finding
loop in primegen() with a much more elaborate system in its own source
file, with unit tests and everything.
Immediate benefits:
- fixes a theoretical possibility of overflowing the target number of
bits, if the random number was so close to the top of the range
that the addition of delta * factor pushed it over. However, this
only happened with negligible probability.
- fixes a directional bias in delta-finding. The previous code
incremented the number repeatedly until it found a value coprime to
all the right things, which meant that a prime preceded by a
particularly long sequence of numbers with tiny factors was more
likely to be chosen. Now we select candidate delta values at
random, that bias should be eliminated.
- changes the semantics of the outermost primegen() function to make
them easier to use, because now the caller specifies the 'bits' and
'firstbits' values for the actual returned prime, rather than
having to account for the factor you're multiplying it by in DSA.
DSA client code is correspondingly adjusted.
Future benefits:
- having the candidate generation in a separate function makes it
easy to reuse in alternative prime generation strategies
- the available constraints support applications such as Maurer's
algorithm for generating provable primes, or strong primes for RSA
in which both p-1 and p+1 have a large factor. So those become
things we could experiment with in future.
Unlike the ones in mpint.c proper, these are not intended to respect
the constant-time guarantees. They're going to be the kind of thing
you use in key generation, which is too random to be constant-time in
any case.
I've arranged several precautions to try to make sure these functions
don't accidentally get linked into the main SSH application, only into
key generators:
- declare them in a separate header with "unsafe" in the name
- put "unsafe" in the name of every actual function
- don't even link the mpunsafe.c translation unit into PuTTY proper
- in fact, define global symbols of the same name in that file and
the SSH client code, so that there will be a link failure if we
ever try to do it by accident
The initial contents of the new source file consist of the subroutine
mp_mod_short() that previously lived in sshprime.c (and was not in
mpint.c proper precisely because it was unsafe). While I'm here, I've
turned it into mp_unsafe_mod_integer() and let it take a modulus of up
to 32 bits instead of 16.
Also added some obviously useful functions to shrink an mpint to the
smallest physical size that can hold the contained number (rather like
bn_restore_invariant in the old Bignum system), which I expect to be
using shortly.
Mostly because I just had a neat idea about how to expose that large
mutable array without it being a mutable global variable: make it a
static in its own module, and expose only a _pointer_ to it, which is
const-qualified.
While I'm there, changed the name to something more descriptive.
Also spelled '-O text', this takes a public or private key as input,
and produces on standard output a dump of all the actual numbers
involved in the key: the exponent and modulus for RSA, the p,q,g,y
parameters for DSA, the affine x and y coordinates of the public
elliptic curve point for ECC keys, and all the extra bits and pieces
in the private keys too.
Partly I expect this to be useful to me for debugging: I've had to
paste key files a few too many times through base64 decoders and hex
dump tools, then manually decode SSH marshalling and paste the result
into the Python REPL to get an integer object. Now I should be able to
get _straight_ to text I can paste into Python.
But also, it's a way that other applications can use the key
generator: if you need to generate, say, an RSA key in some format I
don't support (I've recently heard of an XML-based one, for example),
then you can run 'puttygen -t rsa --dump' and have it print the
elements of a freshly generated keypair on standard output, and then
all you have to do is understand the output format.
In the previous commit I introduced the ability for PuTTY to talk to a
server speaking the bare ssh-connection protocol, and listed several
applications for that ability. But none of those applications is any
use without a server that speaks the same protocol. Until now, the
only such server has been the Unix-domain socket presented by an
upstream connection-sharing PuTTY - and we already had a way to
connect to that.
So here's the missing piece: by reusing code that already existed for
the testing SSH server Uppity, I've created a program that will speak
the bare ssh-connection protocol on its standard I/O channels. If you
want to get a shell session over any of the transports I mentioned in
the last commit, this is the program you need to run at the far end of
it.
I have yet to write the documentation, but just in case I forget, the
name stands for 'Pseudo Ssh for Untappable, Separately Authenticated
Networks'.
Like other 'utils' modules, the point is that sshutils.c has no
external dependencies, so it's safe to include in a tool without
requiring you to bring in a cascade of other modules you didn't really
want.
Right now I'm only planning to use this change in an out-of-tree
experiment, but it's harmless to commit the change itself here.
This is a small cleanup that removes a couple of copies of some boring
stubs, in favour of having just one copy that you can link against.
Unix Pageant can't currently use this, because it's in a precarious
state of _nearly_ having a random number generator: it links against
sshprng but not sshrand, and only uses it for the randomised keypress
acknowledgments in the GUI askpass prompt. But that means it does use
uxnoise, unlike the truly randomness-free tools.
There aren't quite as many of these as there are on Unix, but Windows
Plink and PSFTP still share some suspiciously similar-looking code.
Now they're both clients of wincliloop.c.
Unix Plink, Unix Pageant in server mode, Uppity, and the post-
connection form of PSFTP's command-line reading code all had very
similar loops in them, which run a pollwrapper and mediate between
that, timers, and toplevel callbacks. It's long past time the common
code between all of those became a reusable shared routine.
So, this commit introduces uxcliloop.c, and turns all the previous
copies of basically the same loop into a call to cli_main_loop with
various callback functions to configure the parts that differ.
The global 'int flags' has always been an ugly feature of this code
base, and I suddenly thought that perhaps it's time to start throwing
it out, one flag at a time, until it's totally unused.
My first target is FLAG_VERBOSE. This was usually set by cmdline.c
when it saw a -v option on the program's command line, except that GUI
PuTTY itself sets it unconditionally on startup. And then various bits
of the code would check it in order to decide whether to print a given
message.
In the current system of front-end abstraction traits, there's no
_one_ place that I can move it to. But there are two: every place that
checked FLAG_VERBOSE has access to either a Seat or a LogPolicy. So
now each of those traits has a query method for 'do I want verbose
messages?'.
A good effect of this is that subsidiary Seats, like the ones used in
Uppity for the main SSH server module itself and the server end of
shell channels, now get to have their own verbosity setting instead of
inheriting the one global one. In fact I don't expect any code using
those Seats to be generating any messages at all, but if that changes
later, we'll have a way to control it. (Who knows, perhaps logging in
Uppity might become a thing.)
As part of this cleanup, I've added a new flag to cmdline_tooltype,
called TOOLTYPE_NO_VERBOSE_OPTION. The unconditionally-verbose tools
now set that, and it has the effect of making cmdline.c disallow -v
completely. So where 'putty -v' would previously have been silently
ignored ("I was already verbose"), it's now an error, reminding you
that that option doesn't actually do anything.
Finally, the 'default_logpolicy' provided by uxcons.c and wincons.c
(with identical definitions) has had to move into a new file of its
own, because now it has to ask cmdline.c for the verbosity setting as
well as asking console.c for the rest of its methods. So there's a new
file clicons.c which can only be included by programs that link
against both cmdline.c _and_ one of the *cons.c, and I've renamed the
logpolicy to reflect that.
They're not much use for 'real' key generation, since like all the
other randomness-using testcrypt functions, they need you to have
explicitly queued up some random data. But for generating keys for
test purposes, they have the great virtue that they deliver the key in
the internal format, where we can generate all the various public and
private blobs from it as well as the on-disk formats.
A minor change to one of the keygen functions itself: rsa_generate now
fills in the 'bits' and 'bytes' fields of the returned RSAKey, without
which it didn't actually work to try to generate a public blob from
it. (We'd never noticed before, because no previous client of
rsa_generate even tried that.)
As in the previous commit, this means that agent_query() is now able
to operate in an asynchronous mode, so that if Pageant takes time to
answer a request, the GUI of the PuTTY instance making the request
won't be blocked.
Also as in the previous commit, we still fall back to the WM_COPYDATA
protocol if the new named pipe protocol isn't available.
This reuses all the named-pipe IPC code I set up for connection
sharing a few years ago, to set up a named pipe with a predictable
name and speak the stream-oriented SSH agent protocol over it.
In this commit, we just set up the server, and there's no code that
speaks the client end of the new IPC yet. But my plan is that clients
should switch over to using this interface if possible, because it's
generally better: it doesn't have to be handled synchronously in the
middle of a GUI event loop (either in Pageant itself _or_ in its
client), and it's a better fit to the connection-oriented nature of
forwarded agent connections (so if any features ever appear in the
agent protocol that require state within a connection, we'll now be
able to support them).
Windows Plink and PSFTP had very similar implementations, and now they
share one that lives in a new file winselcli.c. I've similarly moved
GUI PuTTY's implementation out of window.c into winselgui.c, where
other GUI programs wanting to do networking will be able to access
that too.
In the spirit of centralisation, I've also taken the opportunity to
make both functions use the reasonably complete winsock_error_string()
rather than (for some historical reason) each inlining a minimal
version that reports most errors as 'unknown'.
The number of people has been steadily increasing who read our source
code with an editor that thinks tab stops are 4 spaces apart, as
opposed to the traditional tty-derived 8 that the PuTTY code expects.
So I've been wondering for ages about just fixing it, and switching to
a spaces-only policy throughout the code. And I recently found out
about 'git blame -w', which should make this change not too disruptive
for the purposes of source-control archaeology; so perhaps now is the
time.
While I'm at it, I've also taken the opportunity to remove all the
trailing spaces from source lines (on the basis that git dislikes
them, and is the only thing that seems to have a strong opinion one
way or the other).
Apologies to anyone downstream of this code who has complicated patch
sets to rebase past this change. I don't intend it to be needed again.
This mitigates a borderline-DoS in which a malicious SFTP server sends
a ludicrously large number of file names in response to a SFTP
opendir/readdir request sequence, causing the client to buffer them
all and use up all the system's memory simply so that it can produce
the output in sorted order.
I call it a 'borderline' DoS because it's very likely that this is the
same server that you'll also trust to actually send you the _contents_
of some entire file or directory, in which case, if they want to DoS
you they can do that anyway at that point and you have no way to tell
a legit very large file from a bad one. So it's unclear to me that
anyone would get any real advantage out of 'exploiting' this that they
couldn't have got anyway by other means.
That said, it may have practical benefits in the occasional case.
Imagine a _legit_ gigantic directory (something like a maildir,
perhaps, and perhaps stored on a server-side filesystem specialising
in not choking on really huge single directories), together with a
client workflow that involves listing the whole directory but then
downloading only one particular file in it.
For the moment, the threshold size is fixed at 8Mb of total data
(counting the lengths of the file names as well as just the number of
files). If that needs to become configurable later, we can always add
an option.
The functions that previously lived in it now live in terminal.c
itself; they've been renamed term_keyinput and term_keyinputw, and
their function is to add data to the terminal's user input buffer from
a char or wchar_t string respectively.
They sit more comfortably in terminal.c anyway, because their whole
point is to translate into the character encoding that the terminal is
currently configured to use. Also, making them part of the terminal
code means they can also take care of calling term_seen_key_event(),
which simplifies most of the call sites in the GTK and Windows front
ends.
Generation of text _inside_ terminal.c, from responses to query escape
sequences, is therefore not done by calling those external entry
points: we send those responses directly to the ldisc, so that they
don't count as keypresses for all the user-facing purposes like bell
overload handling and scrollback reset. To make _that_ convenient,
I've arranged that most of the code that previously lived in
lpage_send and luni_send is now in separate translation functions, so
those can still be called from situations where you're not going to do
the default thing with the translated data.
(However, pasted data _does_ still count as close enough to a keypress
to call term_seen_key_event - but it clears the 'interactive' flag
when the data is passed on to the line discipline, which tweaks a
minor detail of control-char handling in line ending mode but mostly
just means pastes aren't interrupted.)
Coverity complained that it was wrong to use rand() in a security
context, and although in this case it's _very_ marginal, I can't
actually disagree that the choice of which light to light up to avoid
giving information about passphrase length is a security context.
So, no more rand(); instead we instantiate a shiny Fortuna PRNG
instance, seed it in more or less the usual way, and use that as an
overkill-level method of choosing which light to light up next.
(Acknowledging that this is a slightly unusual application and less
critical than most, I don't actually put the passphrase characters
themselves into the PRNG, and I don't use a random-seed file.)
Uppity is not secure enough to listen on a TCP port as if it was a
normal SSH server. Until now, I've been using it by means of a local
proxy command, i.e. PuTTY invokes Uppity in the same way it might
invoke 'plink -nc'. This rigorously prevents any hostile user from
connecting to my utterly insecure test server, but it's a thundering
inconvenience as soon as you want to attach a debugger to the Uppity
process itself - you have to stick a gdbserver somewhere in the middle
of your already complicated shell pipeline, and then find a way to
connect back to it from a gdb in a terminal window.
So I've added an option to make Uppity listen on a TCP port in the
normal way - but it's protected using that /proc/net/tcp trick I just
added in the previous commit.
Now it can optionally check that output lines don't go beyond a
certain length (measured in terminal columns, via wcwidth, rather than
bytes or characters). In this mode, lines are prefixed with a
distinctive character (namely '|'), and if a line is too long, then it
is broken and the continuation line gets a different prefix ('>').
When StripCtrlChars is targeting a terminal, it asks the terminal to
call wcwidth on its behalf, so it can be sure to use the same idea as
the real terminal about which characters are wide (i.e. depending on
the configuration of ambiguous characters).
This mode isn't yet used anywhere.
I've always thought poll was more hassle to set up, because if you
want to reuse part of your pollfds list between calls then you have to
index every fd by its position in the list as well as the fd number
itself, which gives you twice as many indices to keep track of than if
the fd is always its own key.
But the problem is that select is fundamentally limited to the range
of fds that can fit in an fd_set, which is not the range of fds that
can _exist_, so I've had a change of heart and now have to go with
poll.
For the moment, I've surrounded it with a 'pollwrapper' structure that
lets me treat it more or less like select, containing a tree234 that
maps each fd to its location in the list, and also translating between
the simple select r/w/x classification and the richer poll flags.
That's let me do the migration with minimal disruption to the call
sites.
In future perhaps I can start using poll more directly, and/or using
the richer flag system (though the latter might be fiddly because of
sometimes being constrained to use the glib event loop). But this will
do for now.
With this change, we stop expecting to find putty.chm alongside the
executable file. That was a security hazard comparable to DLL
hijacking, because of the risk that a malicious CHM file could be
dropped into the same directory as putty.exe (e.g. if someone ran
PuTTY from their browser's download dir)..
Instead, the standalone putty.exe (and other binaries needing help)
embed the proper CHM file within themselves, as a Windows resource,
and if called on to display the help then they write the file out to a
temporary location. This has the advantage that if you download and
run the standalone putty.exe then you actually _get_ help, which
previously didn't happen!
The versions of the binaries in the installer don't each contain a
copy of the help file; that would be extravagant. Instead, the
installer itself writes a registry entry pointing at the proper help
file, and the executables will look there.
Another effect of this commit is that I've withdrawn support for the
older .HLP format completely. It's now entirely outdated, and
supporting it through this security fix would have been a huge pain.
If you use the new stripctrl_new_term() to construct a StripCtrlChars
instead of the existing stripctrl_new(), then the resulting object
will align itself with the character-set configuration of the Terminal
object you point it at. (In fact, it'll reuse the same actual
translation code, courtesy of the last few refactoring commits.) So it
will interpret things as control characters precisely if that Terminal
would also have done so.
The previous locale-based sanitisation is appropriate if you're
sending the sanitised output to an OS terminal device managed outside
this process - the LC_CTYPE setting has the best chance of knowing how
that terminal device will interpret a byte stream. But I want to start
using the same sanitisation system for data intended for PuTTY's own
internal terminal emulator, in which case there's no reason why
LC_CTYPE should be expected to match that terminal's configuration,
and no reason to need it to either since we can check the internal
terminal configuration directly.
One small bodge: stripctrl_new_term() is actually a macro, which
passes in the function pointer term_translate() to the underlying real
constructor. That's just so that console-only tools can link in
stripctrl.c without acquiring a dependency on terminal.c (similarly to
how we pass random_read in to the mp_random functions).
I've fixed a handful of these where I found them in passing, but when
I went systematically looking, there were a lot more that I hadn't
found!
A particular highlight of this collection is the code that formats
Windows clipboard data in RTF, which was absolutely crying out for
strbuf_catf, and now it's got it.
This commit adds sanitisation to PSCP and PSFTP in the same style as
I've just put it into Plink. This time, standard error is sanitised
without reference to whether it's redirected (at least unless you give
an override option), on the basis that where Plink is _sometimes_ an
SSH transport for some other protocol, PSCP and PSFTP _always_ are.
But also, the sanitiser is run over any remote filename sent by the
server, substituting ? for any control characters it finds. That
removes another avenue for the server to deliberately confuse the
display.
This commit fixes our bug 'pscp-unsanitised-server-output', aka the
two notional 'vulnerabilities' CVE-2019-6109 and CVE-2019-6110.
(Although we regard those in isolation as only bugs, not serious
vulnerabilities, because their main threat was in hiding the evidence
of a server having exploited other more serious vulns that we never
had.)
If Plink's standard output and/or standard error points at a Windows
console or a Unix tty device, and if Plink was not configured to
request a remote pty (and hence to send a terminal-type string), then
we apply the new control-character stripping facility.
The idea is to be a mild defence against malicious remote processes
sending confusing escape sequences through the standard error channel
when Plink is being used as a transport for something like git: it's
OK to have actual sensible error messages come back from the server,
but when you run a git command, you didn't really intend to give the
remote server the implicit licence to write _all over_ your local
terminal display. At the same time, in that scenario, the standard
_output_ of Plink is left completely alone, on the grounds that git
will be expecting it to be 8-bit clean. (And Plink can tell that
because it's redirected away from the console.)
For interactive login sessions using Plink, this behaviour is
disabled, on the grounds that once you've sent a terminal-type string
it's assumed that you were _expecting_ the server to use it to know
what escape sequences to send to you.
So it should be transparent for all the use cases I've so far thought
of. But in case it's not, there's a family of new command-line options
like -no-sanitise-stdout and -sanitise-stderr that you can use to
forcibly override the autodetection of whether to do it.
This all applies the same way to both Unix and Windows Plink.
All the work I've put in in the last few months to eliminate timing
and cache side channels from PuTTY's mp_int and cipher implementations
has been on a seat-of-the-pants basis: just thinking very hard about
what kinds of language construction I think would be safe to use, and
trying not to absentmindedly leave a conditional branch or a cast to
bool somewhere vital.
Now I've got a test suite! The basic idea is that you run the same
crypto primitive multiple times, with inputs differing only in ways
that are supposed to avoid being leaked by timing or leaving evidence
in the cache; then you instrument the code so that it logs all the
control flow, memory access and a couple of other relevant things in
each of those runs, and finally, compare the logs and expect them to
be identical.
The instrumentation is done using DynamoRIO, which I found to be well
suited to this kind of work: it lets you define custom modifications
of the code in a reasonably low-effort way, and it lets you work at
both the low level of examining single instructions _and_ the higher
level of the function call ABI (so you can give things like malloc
special treatment, not to mention intercepting communications from the
program being instrumented). Build instructions are all in the comment
at the top of testsc.c.
At present, I've found this test to give a 100% pass rate using gcc
-O0 and -O3 (Ubuntu 18.10). With clang, there are a couple of
failures, which I'll fix in the next commit.
I just found I wanted to generate a prime with particular properties,
and I knew PuTTY's prime generator could manage it, so it was easier
to add this function to testcrypt for occasional manual use than to
look for another prime-generator with the same feature set!
I've wrapped the function so as to remove the three progress-
reporting parameters.
This tears out the entire previous random-pool system in sshrand.c. In
its place is a system pretty close to Ferguson and Schneier's
'Fortuna' generator, with the main difference being that I use SHA-256
instead of AES for the generation side of the system (rationale given
in comment).
The PRNG implementation lives in sshprng.c, and defines a self-
contained data type with no state stored outside the object, so you
can instantiate however many of them you like. The old sshrand.c still
exists, but in place of the previous random pool system, it's just
become a client of sshprng.c, whose job is to hold a single global
instance of the PRNG type, and manage its reference count, save file,
noise-collection timers and similar administrative business.
Advantages of this change include:
- Fortuna is designed with a more varied threat model in mind than my
old home-grown random pool. For example, after any request for
random numbers, it automatically re-seeds itself, so that if the
state of the PRNG should be leaked, it won't give enough
information to find out what past outputs _were_.
- The PRNG type can be instantiated with any hash function; the
instance used by the main tools is based on SHA-256, an improvement
on the old pool's use of SHA-1.
- The new PRNG only uses the completely standard interface to the
hash function API, instead of having to have privileged access to
the internal SHA-1 block transform function. This will make it
easier to revamp the hash code in general, and also it means that
hardware-accelerated versions of SHA-256 will automatically be used
for the PRNG as well as for everything else.
- The new PRNG can be _tested_! Because it has an actual (if not
quite explicit) specification for exactly what the output numbers
_ought_ to be derived from the hashes of, I can (and have) put
tests in cryptsuite that ensure the output really is being derived
in the way I think it is. The old pool could have been returning
any old nonsense and it would have been very hard to tell for sure.
This replaces all the separate HMAC-implementing wrappers in the
various source files implementing the underlying hashes.
The new HMAC code also correctly handles the case of a key longer than
the underlying hash's block length, by replacing it with its own hash.
This means I can reinstate the test vectors in RFC 6234 which exercise
that case, which I didn't add to cryptsuite before because they'd have
failed.
It also allows me to remove the ad-hoc code at the call site in
cproxy.c which turns out to have been doing the same thing - I think
that must have been the only call site where the question came up
(since MAC keys invented by the main SSH-2 BPP are always shorter than
that).
All those functions like aes256_encrypt_pubkey and des_decrypt_xdmauth
previously lived in the same source files as the ciphers they were
based on, and provided an alternative API to the internals of that
cipher's implementation. But there was no _need_ for them to have that
privileged access to the internals, because they didn't do anything
you couldn't access through the standard API. It was just a historical
oddity with no real benefit, whose main effect is to make refactoring
painful.
So now all of those functions live in a new file sshauxcrypt.c, and
they all work through the same vtable system as all other cipher
clients, by making an instance of the right cipher and configuring it
in the appropriate way. This should make them completely independent
of any internal changes to the cipher implementations they're based
on.
The refactored sshaes.c gives me a convenient slot to drop in a second
hardware-accelerated AES implementation, similar to the existing one
but using Arm NEON intrinsics in place of the x86 AES-NI ones.
This needed a minor structural change, because Arm systems are often
heterogeneous, containing more than one type of CPU which won't
necessarily all support the same set of architecture features. So you
can't test at run time for the presence of AES acceleration by
querying the CPU you're running on - even if you found a way to do it,
the answer wouldn't be reliable once the OS started migrating your
process between CPUs. Instead, you have to ask the OS itself, because
only that knows about _all_ the CPUs on the system. So that means the
aes_hw_available() mechanism has to extend a tentacle into each
platform subdirectory.
The trickiest part was the nest of ifdefs that tries to detect whether
the compiler can support the necessary parts. I had successful
test-compiles on several compilers, and was able to run the code
directly on an AArch64 tablet (so I know it passes cryptsuite), but
it's likely that at least some Arm platforms won't be able to build it
because of some path through the ifdefs that I haven't been able to
test yet.
I remembered the existence of that module while I was changing the API
of the CRC functions. It's still quite possibly the only code in PuTTY
not written specifically _for_ PuTTY, so it definitely deserves a bit
of a test suite.
In order to expose it through the ptrlen-centric testcrypt system,
I've added some missing 'const' in the detector module itself, but
otherwise I've left the detector code as it was.