I had a one-off 'Strange packet received' error yesterday for an
SSH_MSG_GLOBAL_REQUEST at connection startup. I wasn't able to
reproduce, but examining the packet logs from the successful retry
suggested that the global request in question was an OpenSSH 'here are
my hostkeys' message.
My belief is that in the failing connection that request packet must
have arrived exceptionally promptly, and been added to pq_full before
the preceding SSH_MSG_USERAUTH_SUCCESS had been processed. So the code
that loops along pq_full feeding everything to the dispatch table
would have moved the USERAUTH_SUCCESS on to pq_ssh2_userauth and
scheduled a callback to handle it, and then moved on to the
GLOBAL_REQUEST which would have gone straight to the dispatch table
handler for 'help, we weren't expecting this message'. The userauth
callback would later on have installed a more sensible dispatch table
handler for global requests, but by then, it was too late.
Solution: make a special case during pq_full processing, so that when
we see USERAUTH_SUCCESS we _immediately_ - before continuing to
traverse pq_full - install the initial dispatch table entries for the
ssh-connection protocol. That way, even if connection messages are
already in the queue, we won't get confused by them.
This fixes an oversight in commit 0fc2d3b45: if a key creation
function returns a null 'ssh_key *', then adjusting the pointer's
address using FROMFIELD is a mistake, both in technical C terms
(undefined behaviour) and practically speaking because it will foil
the subsequent check against NULL. Instead, if we're going to check a
pointer against NULL, we must do it _before_ applying this kind of
address-adjusting type conversion.
Now we're building four rather than two sets of Windows binaries, the
build time has gone up rather painfully. I've just added a feature to
bob where it will invent a sensible value to use in 'make -j' and the
like, so let's start using it.
I build both 32- and 64-bit versions of the .exe files, code-sign
them, and create the same .zip file as I do for x86 Windows. I don't
yet have a method of building Arm MSI installers, though.
Apparently Windows on Arm has an emulator that lets it run x86
binaries without it being obvious, which could get confusing when
people start reporting what version of what they're running where.
(Indeed, it might get confusing for _me_ during early testing.) So now
the Windows builds explicitly state 'x86' or 'Arm' as well as 32- or
64-bit.
Now we don't have to worry about which windres we're using (or whether
another target architecture's windres will do just as well), this is
very easy - just test for a couple of extra values of $(Platform).
To build on Arm with VS2017 includes and libraries, various blog posts
and websites explain that you have to #define a cumbersome macro
called _ARM_WINAPI_PARTITION_DESKTOP_SDK_AVAILABLE, without which the
headers will #error at you. But if you do that, then everything seems
to compile fine and I actually tested it on an Arm Windows machine
today.
Also, I had to disable stack protection (/GS-), because clang-cl
doesn't yet support the particular form of it for which the VS2017 Arm
C library provides the runtime support. Unfortunate in a security
application, of course, but there we go.
Previously I was using clang-cl as my compiler, lld as my linker, and
GNU windres as my resource compiler, which made a confusing hybrid of
the LLVM and GNU toolchains. This was because llvm-rc had about four
missing features that stopped it being able to handle PuTTY's resource
files. (Some dialog control types; dialog class names; handling
preprocessor output without getting confused by line markers and
snippets of stray C; not complaining about the DISCARDABLE keyword.
Although admittedly I could have dealt with the last of those by just
removing it from my .rc files, because it's pointless anyway.)
In the past month, the llvm-rc developers have been hard at work, and
now it has _all_ those features! So now I can switch over to a purely
LLVM-based toolchain for my Windows builds, which is easier to set up
(and easier to tell other people how to set up, since they get it for
free with the rest of LLVM); doesn't have a nominal architecture
dependency (windres has to built against a particular flavour of
binutils); and produces bit-identical output to Visual Studio's
resource compiler as far as I can see (whereas windres is more in the
'close enough' area).
This needed a small makefile restructuring, because unlike windres,
llvm-rc doesn't have a built-in option to run the resource file
through the preprocessor. So now Makefile.clangcl has separate rules
to preprocess $tool.rc into $tool.rcpp and then compile the latter
into $tool.res.
In the course of reworking the socket vtable system, I noticed that
both sshshare.c and x11fwd.c independently invented the idea of a Plug
none of whose methods do anything. We don't need more than one of
those, so let's centralise the idea to somewhere it can be easily
reused.
It wasn't freeing the key comment along with the key data, probably
because I originally based the code on the SSH-1 analogue and forgot
that freersakey() *does* free the comment.
Now I've got FROMFIELD, I can rework it so that structures providing
an implementation of the Socket or Plug trait no longer have to have
the vtable pointer as the very first thing in the structure. In
particular, this means that the ProxySocket structure can now directly
implement _both_ the Socket and Plug traits, which is always
_logically_ how it's worked, but previously it had to be implemented
via two separate structs linked to each other.
During last week's work, I made a mistake in which I got the arguments
backwards in one of the key-blob-generating functions - mistakenly
swapped the 'void *' key instance with the 'BinarySink *' output
destination - and I didn't spot the mistake until run time, because in
C you can implicitly convert both to and from void * and so there was
no compile-time failure of type checking.
Now that I've introduced the FROMFIELD macro that downcasts a pointer
to one field of a structure to retrieve a pointer to the whole
structure, I think I might start using that more widely to indicate
this kind of polymorphic subtyping. So now all the public-key
functions in the struct ssh_signkey vtable handle their data instance
in the form of a pointer to a subfield of a new zero-sized structure
type 'ssh_key', which outside the key implementations indicates 'this
is some kind of key instance but it could be of any type'; they
downcast that pointer internally using FROMFIELD in place of the
previous ordinary C cast, and return one by returning &foo->sshk for
whatever foo they've just made up.
The sshk member is not at the beginning of the structure, which means
all those FROMFIELDs and &key->sshk are actually adding and
subtracting an offset. Of course I could have put the member at the
start anyway, but I had the idea that it's actually a feature _not_ to
have the two types start at the same address, because it means you
should notice earlier rather than later if you absentmindedly cast
from one to the other directly rather than by the approved method (in
particular, if you accidentally assign one through a void * and back
without even _noticing_ you perpetrated a cast). In particular, this
enforces that you can't sfree() the thing even once without realising
you should instead of called the right freekey function. (I found
several bugs by this method during initial testing, so I think it's
already proved its worth!)
While I'm here, I've also renamed the vtable structure ssh_signkey to
ssh_keyalg, because it was a confusing name anyway - it describes the
_algorithm_ for handling all keys of that type, not a specific key. So
ssh_keyalg is the collection of code, and ssh_key is one instance of
the data it handles.
In Friday's testing of the BinarySink work, I noticed that if you
accidentally add a mathematically invalid RSA1 key to Pageant, it will
accept it, getting into a state where it can fail assertions when
asked to use the key later. Added a call to rsa_verify(), triggering
an SSH_AGENT_FAILURE response if it doesn't agree the key is good.
This is a cleanup I started to notice a need for during the BinarySink
work. It removes a lot of faffing about casting things to char * or
unsigned char * so that some API will accept them, even though lots of
such APIs really take a plain 'block of raw binary data' argument and
don't care what C thinks the signedness of that data might be - they
may well reinterpret it back and forth internally.
So I've tried to arrange for all the function call APIs that ought to
have a void * (or const void *) to have one, and those that need to do
pointer arithmetic on the parameter internally can cast it back at the
top of the function. That saves endless ad-hoc casts at the call
sites.
Similarly to commit 1c4f12252, this was one of those errors that C
can't catch because of the const-removing prototype of memchr. But
when you memchr() a const string, the value you get back is
_semantically_ a pointer to const, and should be assigned into a
variable of that type.
This removes a lot of pointless duplications of those constants.
Of course, _ideally_, I should upgrade to C99 bool throughout the code
base, replacing TRUE and FALSE with true and false and tagging
variables explicitly as bool when that's what they semantically are.
But that's a much bigger piece of work, and shouldn't block this
trivial cleanup!
There was a time, back when the USA was more vigorously against
cryptography, when we toyed with the idea of having a version of PuTTY
that outsourced its cryptographic primitives to the Microsoft optional
encryption API, which would effectively create a tool that acted like
PuTTY proper on a system with that API installed, but automatically
degraded to being PuTTYtel on a system without, and meanwhile (so went
the theory) it could be moved freely across national borders with
crypto restrictions, because it didn't _contain_ any of the actual
crypto.
I don't recall that we ever got it working at all. And certainly the
vestiges of it here and there in the current code are completely
unworkable - they refer to an 'mscrypto.c' that doesn't even exist,
and the ifdefs in the definitions of structures like RSAKey and
MD5Context are not matched by any corresponding ifdefs in the code. So
I ought to have got round to removing it long ago, in order to avoid
misleading anyone.
I spotted this at some point during this week's BinarySink
refactoring, but only just remembered to come back and fix it. snew
aborts the whole program rather than return NULL, so there's no need
to check its return value against NULL.
At the point where the do_ssh2_connection coroutine rewrites lots of
dispatch table entries to point to things other than itself, there's a
possibility that it's too late, because some packets of those types
have already arrived and been put into pq_ssh2_connection. So you'd
get weird errors like 'Strange packet received: type 94', in which the
only thing that's strange is why packet 94 might conceivably be
unexpected, since it's SSH_MSG_CHANNEL_DATA!
(I observed this when 'plink host -nc otherhost:port' became a sharing
downstream, but I have no reason to think that's the only circumstance
where this can possibly occur, or even a reliable one. It just
happened to have the right timing.)
I'm not completely sure this is the _best_ solution, but it seems to
work: at the point when I rewrite the dispatch table, I also return
the whole of the connection packet queue to the main queue, so that
it'll be run through the dispatch table a second time which will
effectively filter out any packets that we've just installed new
handlers for.
Another big pile of packet-construction now looks simpler and nicer,
although - as with the agent messages - I've done that tiny cheat of
filling in the length field at the start of the packet frame at the
very end of processing.
This gets rid of yet another huge pile of beating around the bush with
length-counting. Also, this time, the BinarySink in question is a
little more interesting than just being a strbuf every time: on
Windows, where the shared-memory Pageant IPC system imposes a hard
limit on the size of message we can return, I've written a custom
BinarySink implementation that collects up to that much data and then
gives up and sets an overflow flag rather than continue to allocate
memory.
So the main Pageant code no longer has to worry about checking
AGENT_MAX_MSGLEN all the time - and better still, the Unix version of
Pageant is no longer _limited_ by AGENT_MAX_MSGLEN in its outgoing
messages, i.e. it could store a really extra large number of keys if
it needed to. That limitation is now a local feature of Windows
Pageant rather than intrinsic to the whole code base.
(AGENT_MAX_MSGLEN is still used to check incoming agent messages for
sanity, however. Mostly that's because I feel I ought to check them
against _some_ limit, and this one seems sensible enough. Incoming
agent messages are more bounded anyway - they generally don't hold
more than _one_ private key.)
This simplifies the client code both in ssh.c and in the client side
of Pageant.
I've cheated a tiny bit by preparing agent requests in a strbuf that
has space reserved at the front for the packet frame, which makes life
easier for the code that sends them off.
The output routines in import.c and sshpubk.c were further horrifying
hotbeds of manual length-counting. Reworked it all so that it builds
up key file components in strbufs, and uses the now boringly standard
put_* functions to write into those strbufs.
This removes the write_* functions in import.c, which I had to hastily
rename a few commits ago when I introduced the new marshalling system
in the first place.
However, I wasn't quite able to get rid of _all_ of import.c's local
formatting functions; there are a couple still there (but now with new
BinarySink-style API) which output multiprecision integers in a couple
of different formats starting from an existing big-endian binary
representation, as opposed to starting from an internal Bignum.
Just as I did a few commits ago with the low-level SHA_Bytes type
functions, the ssh_hash and ssh_mac abstract types now no longer have
a direct foo->bytes() update method at all. Instead, each one has a
foo->sink() function that returns a BinarySink with the same lifetime
as the hash context, and then the caller can feed data into that in
the usual way.
This lets me get rid of a couple more duplicate marshalling routines
in ssh.c: hash_string(), hash_uint32(), hash_mpint().
This affects all the functions that generate public and private key
and signature blobs of all kinds, plus ssh_ecdhkex_getpublic. Instead
of returning a bare block of memory and taking an extra 'int *length'
parameter, all these functions now write to a BinarySink, and it's the
caller's job to have prepared an appropriate one where they want the
output to go (usually a strbuf).
The main value of this change is that those blob-generation functions
were chock full of ad-hoc length-counting and data marshalling. You
have only to look at rsa2_{public,private}_blob, for example, to see
the kind of thing I was keen to get rid of!
Now instead of iterating through conf twice in separate functions,
once to count up the size of the serialised data and once to write it
out, I just go through once and dump it all in a strbuf.
(Of course, I could still do a two-pass count-then-allocate approach
easily enough in this system; nothing would stop me writing a
BinarySink implementation that didn't actually store any data and just
counted its size, and then I could choose at each call site whether I
preferred to do it that way.)
There's no reason to pass a 'void *' through and then cast it back to
a packet. All those functions are really doing is serialising a pile
of output on to an arbitrary receiver, so it's nicer to use the new
abstract BinarySink type, and then the do_mode function doesn't have
to cast it at all.
This also removes one lingering ugliness from the previous commit, in
the form of the variable 'stringstart' I introduced in ssh2_setup_pty
to work around the removal of the addstring_start system. Now that's
done the sensible way, by constructing the subsidiary terminal-modes
string in a strbuf (taking advantage of the new type-genericity
meaning I can pass that to the underlying functions just as easily as
a struct Packet), and then putting that into the final packet
afterwards.
All previous uses of them are replaced by the new put_* family.
This is a fairly mechanical transformation, which doesn't take full
advantage of the new ways things can be done more usefully. I'll come
back and clean parts of it up in separate patches later; muddling that
together with this main search-and-replace part would make (even more
of) a giant unreadable monolith.
In fact, those functions don't even exist any more. The only way to
get data into a primitive hash state is via the new put_* system. Of
course, that means put_data() is a viable replacement for every
previous call to one of the per-hash update functions - but just
mechanically doing that would have missed the opportunity to simplify
a lot of the call sites.
We can now simply call the centralised functions to put uint32s and
mpints into hash states, so there's no need to have duplicate local
copies doing the same things less type-generically.
I've finally got tired of all the code throughout PuTTY that repeats
the same logic about how to format the SSH binary primitives like
uint32, string, mpint. We've got reasonably organised code in ssh.c
that appends things like that to 'struct Packet'; something similar in
sftp.c which repeats a lot of the work; utility functions in various
places to format an mpint to feed to one or another hash function; and
no end of totally ad-hoc stuff in functions like public key blob
formatters which actually have to _count up_ the size of data
painstakingly, then malloc exactly that much and mess about with
PUT_32BIT.
It's time to bring all of that into one place, and stop repeating
myself in error-prone ways everywhere. The new marshal.h defines a
system in which I centralise all the actual marshalling functions, and
then layer a touch of C macro trickery on top to allow me to (look as
if I) pass a wide range of different types to those functions, as long
as the target type has been set up in the right way to have a write()
function.
This commit adds the new header and source file, and sets up some
general centralised types (strbuf and the various hash-function
contexts like SHA_State), but doesn't use the new calls for anything
yet.
(I've also renamed some internal functions in import.c which were
using the same names that I've just defined macros over. That won't
last long - those functions are going to go away soon, so the changed
names are strictly temporary.)
Now, instead of being a black box that you shovel strings into and
eventually extract a final answer, it exposes enough structure fields
to the world that you can append things to it _and_ look inside its
current contents. For convenience, it exports its internal pointer as
both a char * and an unsigned char *.
This centralises a few things that multiple header files were
previously defining, and were protecting against each other's
redefinition with ifdefs - small things like structs and typedefs. Now
all those things are in a defs.h which is by definition safe to
include _first_ (out of all the codebase-local headers) and only need
to be defined once.
In the SSH1_AGENTC_ADD_RSA_IDENTITY message, the multiplicative
inverse integer is the inverse of the first prime mod the second one.
In our notation, that means we should send iqmp, then q, then p, which
is also how the Pageant server side expects to receive them.
Unfortunately, we were sending iqmp, p, q instead, which I think must
be a confusion resulting from the SSH 1.5 document naming the primes
the other way round (and talking about the auxiliary value 'inverse of
p mod q').
If we're called on to generate the one-line OpenSSH public key format
for a key that we don't have a comment field for, we were mistakenly
testing this by checking if '*comment' rather than 'comment' was zero,
i.e. if comment was NULL we'd dereference it by mistake.
If name resolution fails, pfd_connect tries to sfree(dummy_realhost)
when it's completely uninitialised - the failed resolution didn't
write to it, and we also didn't precautionarily initialise it to NULL
first. Now we do the latter.
I never noticed before because it only comes up in the case of an
agent sending back one particular kind of corrupt data, but if the
last-minute check that there's no trailing junk on the end of the
agent's SSH-2 key list fails, it prints an error message erroneously
mentioning SSH-1.
Lots of functions had really generic names (like 'makekey'), or names
that missed out an important concept (like 'rsakey_pubblob', which
loads a public blob from a _file_ and doesn't generate it from an
in-memory representation at all). Also, the opaque 'int order' that
distinguishes the two formats of public key blob is now a mnemonic
enumeration, and while I'm at it, rsa_ssh1_public_blob takes one of
those as an extra argument.
If you forget the '+' at the start of a continuation line with only
one word on it, then Perl would test $_[1] before checking that it
even existed to test. The test would give the right answer anyway, but
the warning was annoying.
Added a missing pq_push_front (see commit a41eefff4), without which
the SSH_MSG_USERAUTH_FAILURE was being dropped from the userauth
packet queue before returning to the top of the loop which waits for
it to arrive.
This seems to be a knock-on effect of my recent reworking of the SSH
code to be based around queues and callbacks. The loop iteration
function in uxsftp.c (ssh_sftp_do_select) would keep going round its
select loop until something had happened on one of its file
descriptors, and then return to the caller in the assumption that the
resulting data might have triggered whatever condition the caller was
waiting for - and if not, then the caller checks, finds nothing
interesting has happened, and resumes looping with no harm done.
But now, when something happens on an fd, it doesn't _synchronously_
trigger the follow-up condition PSFTP was waiting for (which, at
startup time, happens to be back->sendok() starting to return TRUE).
Instead, it schedules a callback, which will schedule a callback,
which ... ends up setting that flag. But by that time, the loop
function has already returned, the caller has found nothing
interesting and resumed looping, and _now_ the interesting thing
happens but it's too late because ssh_sftp_do_select will wait until
the next file descriptor activity before it next returns.
Solution: give run_toplevel_callbacks a return value which says
whether it's actually done something, and if so, return immediately in
case that was the droid the caller was looking for. As it were.
In commit 528513dde I absentmindedly replaced a write to the local
variable 'need_size' of drawing_area_setup with a write to
inst->drawing_area_setup_needed, imagining that they had the same
effect. But actually, need_size was doing two jobs and I only replaced
one of them: it was also the variable that indicated that the logical
terminal size had changed and so we had to call term_size() to make
the terminal.c data structures resize themselves appropriately. The
loss of that call also inhibited generation of SIGWINCH.
This causes the previous graphic character to be displayed another Pn
times (defaulting to 1, as usual). I just found out about it because
Ubuntu 18.04's ncurses expects it to be honoured.
According to all-escapes, REP is only supposed to be used when the
thing immediately preceding it in the terminal data stream _is_ a
printing character, and if not, then the behaviour is undefined. But
'undefined' is good enough for me to do the simple thing of just
remembering the last graphic character no matter whether anything else
has intervened since then.
To avoid DoS attacks using this escape sequence with a really huge Pn,
I clamp the value at the total size of the screen. There might be ways
to do that with more finesse (e.g. reduce it mod the width so that the
screen ends up looking the way it should even for huge parameters, or
reduce it even further if we notice the terminal isn't in wrapping
modes), but this will do for now.
I'm about to want to implement an escape sequence that causes a
graphic character to be printed, which means I'll need the code that
does so to be in a separate routine that I can call easily, instead of
buried a few loops deep in the middle of the main state machine.
Another piece of fallout from this morning's patch series, which I
didn't notice until I left a session running for more than an hour:
once do_ssh2_transport is told to begin a rekey, it has no way of
knowing _not_ to immediately do another one, and another, and so on.
Added a value RK_NONE to the rekey class enumeration, and set
rekey_class to that immediately after a key exchange completes. Then a
new one won't start until some code actually sets rekey_class to a
nonzero value again.