It is to put_data what memset is to memcpy. Several places
in the code wanted it already, but not _quite_ enough for me to
have written it with the rest of the BinarySink infrastructure
originally.
After Pavel Kryukov pointed out that I have to put _something_ in the
'ssh_key' structure, I thought of an actually useful thing to put
there: why not make it store a pointer to the ssh_keyalg structure?
Then ssh_key becomes a classoid - or perhaps 'traitoid' is a closer
analogy - in the same style as Socket and Plug. And just like Socket
and Plug, I've also arranged a system of wrapper macros that avoid the
need to mention the 'object' whose method you're invoking twice at
each call site.
The new vtable pointer directly replaces an existing field of struct
ec_key (which was usable by several different ssh_keyalgs, so it
already had to store a pointer to the currently active one), and also
replaces the 'alg' field of the ssh2_userkey structure that wraps up a
cryptographic key with its comment field.
I've also taken the opportunity to clean things up a bit in general:
most of the methods now have new and clearer names (e.g. you'd never
know that 'newkey' made a public-only key while 'createkey' made a
public+private key pair unless you went and looked it up, but now
they're called 'new_pub' and 'new_priv' you might be in with a
chance), and I've completely removed the openssh_private_npieces field
after realising that it was duplicating information that is actually
_more_ conveniently obtained by calling the new_priv_openssh method
(formerly openssh_createkey) and throwing away the result.
This parameter returned a substring of the input, which was used for
two purposes. Firstly, it was used to hash the host and server keys
during the initial SSH-1 key setup phase; secondly, it was used to
check the keys in Pageant against the public key blob of a key
specified on the command line.
Unfortunately, those two purposes didn't agree! The first one needs
just the bare key modulus bytes (without even the SSH-1 mpint length
header); the second needs the entire key blob. So, actually, it seems
to have never worked in SSH-1 to say 'putty -i keyfile' and have PuTTY
find that key in Pageant and not have to ask for the passphrase to
decrypt the version on disk.
Fixed by removing that parameter completely, which simplifies all the
_other_ call sites, and replacing it by custom code in those two
places that each does the actually right thing.
Quite a few of the function pointers in the ssh_keyalg vtable now take
ptrlen arguments in place of separate pointer and length pairs.
Meanwhile, the various key types' implementations of those functions
now work by initialising a BinarySource with the input ptrlen and
using the new decode functions to walk along it.
One exception is the openssh_createkey method which reads a private
key in the wire format used by OpenSSH's SSH-2 agent protocol, which
has to consume a prefix of a larger data stream, and tell the caller
how much of that data was the private key. That function now takes an
actual BinarySource, and passes that directly to the decode functions,
so that on return the caller finds that the BinarySource's read
pointer has been advanced exactly past the private key.
This let me throw away _several_ reimplementations of mpint-reading
functions, one in each of sshrsa, sshdss.c and sshecc.c. Worse still,
they didn't all have exactly the SSH-2 semantics, because the thing in
sshrsa.c whose name suggested it was an mpint-reading function
actually tolerated the wrong number of leading zero bytes, which it
had to be able to do to cope with the "ssh-rsa" signature format which
contains a thing that isn't quite an SSH-2 mpint. Now that deviation
is clearly commented!
This wraps up a (pointer, length) pair into a convenient struct that
lets me return it by value from a function, and also pass it through
to other functions in one go.
Ideally quite a lot of this code base could be switched over to using
ptrlen in place of separate pointer and length variables or function
parameters. (In fact, in my personal ideal conception of C, the usual
string type would be of this form, and all the string.h functions
would operate on ptrlens instead of zero-terminated 'char *'.)
For the moment, I'm just introducing it to make some upcoming
refactoring less inconvenient. Bulk migration of existing code to
ptrlen is a project for another time.
Along with the type itself, I've provided a convenient system of
including the contents of a ptrlen in a printf; a constructor function
that wraps up a pointer and length so you can make a ptrlen on the fly
in mid-expression; a function to compare a ptrlen against an ordinary
C string (which I mostly expect to use with string literals); and a
function 'mkstr' to make a dynamically allocated C string out of one.
That last function replaces a function of the same name in sftp.c,
which I'm promoting to a whole-codebase facility and adjusting its
API.
During last week's work, I made a mistake in which I got the arguments
backwards in one of the key-blob-generating functions - mistakenly
swapped the 'void *' key instance with the 'BinarySink *' output
destination - and I didn't spot the mistake until run time, because in
C you can implicitly convert both to and from void * and so there was
no compile-time failure of type checking.
Now that I've introduced the FROMFIELD macro that downcasts a pointer
to one field of a structure to retrieve a pointer to the whole
structure, I think I might start using that more widely to indicate
this kind of polymorphic subtyping. So now all the public-key
functions in the struct ssh_signkey vtable handle their data instance
in the form of a pointer to a subfield of a new zero-sized structure
type 'ssh_key', which outside the key implementations indicates 'this
is some kind of key instance but it could be of any type'; they
downcast that pointer internally using FROMFIELD in place of the
previous ordinary C cast, and return one by returning &foo->sshk for
whatever foo they've just made up.
The sshk member is not at the beginning of the structure, which means
all those FROMFIELDs and &key->sshk are actually adding and
subtracting an offset. Of course I could have put the member at the
start anyway, but I had the idea that it's actually a feature _not_ to
have the two types start at the same address, because it means you
should notice earlier rather than later if you absentmindedly cast
from one to the other directly rather than by the approved method (in
particular, if you accidentally assign one through a void * and back
without even _noticing_ you perpetrated a cast). In particular, this
enforces that you can't sfree() the thing even once without realising
you should instead of called the right freekey function. (I found
several bugs by this method during initial testing, so I think it's
already proved its worth!)
While I'm here, I've also renamed the vtable structure ssh_signkey to
ssh_keyalg, because it was a confusing name anyway - it describes the
_algorithm_ for handling all keys of that type, not a specific key. So
ssh_keyalg is the collection of code, and ssh_key is one instance of
the data it handles.
The output routines in import.c and sshpubk.c were further horrifying
hotbeds of manual length-counting. Reworked it all so that it builds
up key file components in strbufs, and uses the now boringly standard
put_* functions to write into those strbufs.
This removes the write_* functions in import.c, which I had to hastily
rename a few commits ago when I introduced the new marshalling system
in the first place.
However, I wasn't quite able to get rid of _all_ of import.c's local
formatting functions; there are a couple still there (but now with new
BinarySink-style API) which output multiprecision integers in a couple
of different formats starting from an existing big-endian binary
representation, as opposed to starting from an internal Bignum.
This affects all the functions that generate public and private key
and signature blobs of all kinds, plus ssh_ecdhkex_getpublic. Instead
of returning a bare block of memory and taking an extra 'int *length'
parameter, all these functions now write to a BinarySink, and it's the
caller's job to have prepared an appropriate one where they want the
output to go (usually a strbuf).
The main value of this change is that those blob-generation functions
were chock full of ad-hoc length-counting and data marshalling. You
have only to look at rsa2_{public,private}_blob, for example, to see
the kind of thing I was keen to get rid of!
In fact, those functions don't even exist any more. The only way to
get data into a primitive hash state is via the new put_* system. Of
course, that means put_data() is a viable replacement for every
previous call to one of the per-hash update functions - but just
mechanically doing that would have missed the opportunity to simplify
a lot of the call sites.
If we're called on to generate the one-line OpenSSH public key format
for a key that we don't have a comment field for, we were mistakenly
testing this by checking if '*comment' rather than 'comment' was zero,
i.e. if comment was NULL we'd dereference it by mistake.
Lots of functions had really generic names (like 'makekey'), or names
that missed out an important concept (like 'rsakey_pubblob', which
loads a public blob from a _file_ and doesn't generate it from an
in-memory representation at all). Also, the opaque 'int order' that
distinguishes the two formats of public key blob is now a mnemonic
enumeration, and while I'm at it, rsa_ssh1_public_blob takes one of
those as an extra argument.
Having found a lot of unfixed constness issues in recent development,
I thought perhaps it was time to get proactive, so I compiled the
whole codebase with -Wwrite-strings. That turned up a huge load of
const problems, which I've fixed in this commit: the Unix build now
goes cleanly through with -Wwrite-strings, and the Windows build is as
close as I could get it (there are some lingering issues due to
occasional Windows API functions like AcquireCredentialsHandle not
having the right constness).
Notable fallout beyond the purely mechanical changing of types:
- the stuff saved by cmdline_save_param() is now explicitly
dupstr()ed, and freed in cmdline_run_saved.
- I couldn't make both string arguments to cmdline_process_param()
const, because it intentionally writes to one of them in the case
where it's the argument to -pw (in the vain hope of being at least
slightly friendly to 'ps'), so elsewhere I had to temporarily
dupstr() something for the sake of passing it to that function
- I had to invent a silly parallel version of const_cmp() so I could
pass const string literals in to lookup functions.
- stripslashes() in pscp.c and psftp.c has the annoying strchr nature
Not all of them, but the ones that don't get a 'void *key' parameter.
This means I can share methods between multiple ssh_signkey
structures, and still give those methods an easy way to find out which
public key method they're dealing with, by loading parameters from a
larger structure in which the ssh_signkey is the first element.
(In OO terms, I'm arranging that all static methods of my public key
classes get a pointer to the class vtable, to make up for not having a
pointer to the class instance.)
I haven't actually done anything with the new facility in this commit,
but it will shortly allow me to clean up the constant lookups by curve
name in the ECDSA code.
All the name strings in ssh_cipher, ssh_mac, ssh_hash, ssh_signkey
point to compile-time string literals, hence should obviously be const
char *.
Most of these const-correctness patches are just a mechanical job of
adding a 'const' in the one place you need it right now, and then
chasing the implications through the code adding further consts until
it compiles. But this one has actually shown up a bug: the 'algorithm'
output parameter in ssh2_userkey_loadpub was sometimes returning a
pointer to a string literal, and sometimes a pointer to dynamically
allocated memory, so callers were forced to either sometimes leak
memory or sometimes free a bad thing. Now it's consistently
dynamically allocated, and should be freed everywhere too.
There were ad-hoc functions for fingerprinting a bare key blob in both
cmdgen.c and pageant.c, not quite doing the same thing. Also, every
SSH-2 public key algorithm in the code base included a dedicated
fingerprint() method, which is completely pointless since SSH-2 key
fingerprints are computed in an algorithm-independent way (just hash
the standard-format public key blob), so each of those methods was
just duplicating the work of the public_blob() method with a less
general output mechanism.
Now sshpubk.c centrally provides an ssh2_fingerprint_blob() function
that does all the real work, plus an ssh2_fingerprint() function that
wraps it and deals with calling public_blob() to get something to
fingerprint. And the fingerprint() method has been completely removed
from ssh_signkey and all its implementations, and good riddance.
There was a fair amount of duplication between Windows and Unix
PuTTYgen, and some confusion over writing things to FILE * and
formatting them internally into strings. I think all the public-key
output code now lives in sshpubk.c, and there's only one copy of the
code to generate each format.
The rsakey_pubblob() and ssh2_userkey_loadpub() functions, which
expected to be given a private key file and load only the unencrypted
public half, now also cope with any of the public-only formats I know
about (SSH-1 only has one, whereas SSH-2 has the RFC 4716 format and
OpenSSH's one-line format) and return an appropriate public key blob
from each of those too.
cmdgen now supports this functionality, by permitting public key files
to be loaded and used by any operation that doesn't need the private
key: so you can convert back and forth between the SSH-2 public
formats, or list the file's fingerprint.
When I implemented reading and writing of the new format a couple of
weeks ago, I kept them strictly separate in the UI, so you have to ask
for the format you want when exporting. But in fact this is silly,
because not every key type can be saved in both formats, and OpenSSH
itself has the policy of using the old format for key types it can
handle, unless specifically asked to use the new one.
So I've now arranged that the key file format enum has three values
for OpenSSH: PEM, NEW and AUTO. Files being loaded are identified as
either PEM or NEW, which describe the two physical file formats. But
exporting UIs present either AUTO or NEW, where AUTO is the virtual
format meaning 'save in the old format if possible, otherwise the new
one'.
The only reason those couldn't be replaced with a call to the
centralised find_pubkey_alg is because that function takes a zero-
terminated string and instead we had a (length,pointer) string. Easily
fixed; there's now a find_pubkey_alg_len(), and we call that.
This also fixes a string-matching bug in which the sense of memcmp was
reversed by mistake for ECDSA keys!
It's all very well for these two different formats to share a type
code as long as we're only loading them and not saving, but as soon as
we need to save one or the other, we'll need different type codes
after all.
This commit introduces the openssh_new_write() function, but for the
moment, it always returns failure.
The absence of these could have prevented sensitive private key
information from being properly cleared out of memory that PuTTY tools
had finished with.
Thanks to Patrick Coleman for spotting this and sending a patch.
We incremented buf by a few bytes, so we must decrement the
corresponding length by the same amount, or else makekey() could
overrun.
Thanks to Patrick Coleman for the patch.
This provides support for ECDSA public keys, for both hosts and users,
and also ECDH key exchange. Supported curves are currently just the
three NIST curves required by RFC 5656.
We should NULL out mac after freeing it, so that the cleanup code
doesn't try to free it again; also if the final key creation fails, we
should avoid freeing ret->comment when we're going to go to that same
cleanup code which will free 'comment' which contains the same pointer.
Thanks to Christopher Staite for pointing these out.
I'm about to need to refer to it from a source file that won't
necessarily always be linked against sshpubk.c, so it needs to live
somewhere less specialist. Now it sits alongside base64_encode_atom
(already in misc.c for another reason), which is neater anyway.
[originally from svn r10218]
header text from a PuTTY key file.
(It's silly to have both while (len > 0) at the top of the loop _and_
an if (len == 0) return in the middle, and in fact the former was the
erroneous one since it would have prohibited a 39-character header,
which I intended to be permitted.)
[originally from svn r9926]
of the GET_32BIT macros and then used as length fields. Missing bounds
checks against zero have been added, and also I've introduced a helper
function toint() which casts from unsigned to int in such a way as to
avoid C undefined behaviour, since I'm not sure I trust compilers any
more to do the obviously sensible thing.
[originally from svn r9918]
zero but does it in such a way that over-clever compilers hopefully
won't helpfully optimise the call away if you do it just before
freeing something or letting it go out of scope. Use this for
(hopefully) every memset whose job is to destroy sensitive data that
might otherwise be left lying around in the process's memory.
[originally from svn r9586]
takes a third argument which is TRUE if the file is being opened for
writing and wants to be created in such a way that it's readable
only to the owner. This is used when saving private keys.
While I'm here, I also use this option when writing session logs, on
the general principle that they probably contain _something_
sensitive.
The new argument is only supported on Unix, for the moment. (I think
writing owner-accessible-only files is the default on Windows.)
[originally from svn r7084]
(Since we choose to compile with -Werror, this is particularly important.)
I haven't yet checked that the resulting source actually compiles cleanly with
GCC 4, hence not marking `gcc4-warnings' as fixed just yet.
[originally from svn r7041]