Ian Jackson points out that the Linux kernel has a macro of this name
with the same purpose, and suggests that it's a good idea to use the
same name as they do, so that at least some people reading one code
base might recognise it from the other.
I never really thought very hard about what order FROMFIELD's
parameters should go in, and therefore I'm pleasantly surprised to
find that my order agrees with the kernel's, so I don't have to
permute every call site as part of making this change :-)
The new version of ssh_hash has the same nice property as ssh2_mac,
that I can make the generic interface object type function directly as
a BinarySink so that clients don't have to call h->sink() and worry
about the separate sink object they get back from that.
This piece of tidying-up has come out particularly well in terms of
saving tedious repetition and boilerplate. I've managed to remove
three pointless methods from every MAC implementation by means of
writing them once centrally in terms of the implementation-specific
methods; another method (hmacmd5_sink) vanished because I was able to
make the interface type 'ssh2_mac' be directly usable as a BinarySink
by way of a new delegation system; and because all the method
implementations can now find their own vtable, I was even able to
merge a lot of keying and output functions that had previously
differed only in length parameters by having them look up the lengths
in whatever vtable they were passed.
This is more or less the same job as the SSH-1 case, only more
extensive, because we have a wider range of ciphers.
I'm a bit disappointed about the AES case, in particular, because I
feel as if it ought to have been possible to arrange to combine this
layer of vtable dispatch with the subsidiary one that selects between
hardware and software implementations of the underlying cipher. I may
come back later and have another try at that, in fact.
Just as I did a few commits ago with the low-level SHA_Bytes type
functions, the ssh_hash and ssh_mac abstract types now no longer have
a direct foo->bytes() update method at all. Instead, each one has a
foo->sink() function that returns a BinarySink with the same lifetime
as the hash context, and then the caller can feed data into that in
the usual way.
This lets me get rid of a couple more duplicate marshalling routines
in ssh.c: hash_string(), hash_uint32(), hash_mpint().
In fact, those functions don't even exist any more. The only way to
get data into a primitive hash state is via the new put_* system. Of
course, that means put_data() is a viable replacement for every
previous call to one of the per-hash update functions - but just
mechanically doing that would have missed the opportunity to simplify
a lot of the call sites.
I've finally got tired of all the code throughout PuTTY that repeats
the same logic about how to format the SSH binary primitives like
uint32, string, mpint. We've got reasonably organised code in ssh.c
that appends things like that to 'struct Packet'; something similar in
sftp.c which repeats a lot of the work; utility functions in various
places to format an mpint to feed to one or another hash function; and
no end of totally ad-hoc stuff in functions like public key blob
formatters which actually have to _count up_ the size of data
painstakingly, then malloc exactly that much and mess about with
PUT_32BIT.
It's time to bring all of that into one place, and stop repeating
myself in error-prone ways everywhere. The new marshal.h defines a
system in which I centralise all the actual marshalling functions, and
then layer a touch of C macro trickery on top to allow me to (look as
if I) pass a wide range of different types to those functions, as long
as the target type has been set up in the right way to have a write()
function.
This commit adds the new header and source file, and sets up some
general centralised types (strbuf and the various hash-function
contexts like SHA_State), but doesn't use the new calls for anything
yet.
(I've also renamed some internal functions in import.c which were
using the same names that I've just defined macros over. That won't
last long - those functions are going to go away soon, so the changed
names are strictly temporary.)
Clang generates an internal failure if the same function
has different target attributes in definition and declaration.
To work around that, we made a proxy predeclared function
without target attribute.
SHA256-NI code is conditionally enabled if CPU supports SHA extensions.
The procedure is based on Jeffrey Walton's SHA256 implementation:
https://github.com/noloader/SHA-Intrinsics
These pointers will be required in next commits
where subroutines with new instructions are introduced.
Depending on CPUID dynamic check, pointers will refer to old
SW-only implementations or to new instructions subroutines
The key derivation code has been assuming (though non-critically, as
it happens) that the size of the MAC output is the same as the size of
the MAC key. That isn't even a good assumption for the HMAC family,
due to HMAC-SHA1-96 and also the bug-compatible versions of HMAC-SHA1
that only use 16 bytes of key material; so now we have an explicit
key-length field separate from the MAC-length field.
This permits a hash state to be cloned in the middle of being used, so
that multiple strings with the same prefix can be hashed without
having to repeat all the computation over the prefix.
Having done that, we'll also sometimes need to free a hash state that
we aren't generating actual hash output from, so we need a free method
as well.
Now that we have modes in which the MAC verification happens before
any other crypto operation and hence will be the only thing seen by an
attacker, it seems like about time we got round to doing it in a
cautious way that tries to prevent the attacker from using our memcmp
as a timing oracle.
So, here's an smemeq() function which has the semantics of !memcmp but
attempts to run in time dependent only on the length parameter. All
the MAC implementations now use this in place of !memcmp to verify the
MAC on input data.
This causes the initial length field of the SSH-2 binary packet to be
unencrypted (with the knock-on effect that now the packet length not
including MAC must be congruent to 4 rather than 0 mod the cipher
block size), and then the MAC is applied over the unencrypted length
field and encrypted ciphertext (prefixed by the sequence number as
usual). At the cost of exposing some information about the packet
lengths to an attacker (but rarely anything they couldn't have
inferred from the TCP headers anyway), this closes down any
possibility of a MITM using the client as a decryption oracle, unless
they can _first_ fake a correct MAC.
ETM mode is enabled by means of selecting a different MAC identifier,
all the current ones of which are constructed by appending
"-etm@openssh.com" to the name of a MAC that already existed.
We currently prefer the original SSH-2 binary packet protocol (i.e. we
list all the ETM-mode MACs last in our KEXINIT), on the grounds that
it's better tested and more analysed, so at the moment the new mode is
only activated if a server refuses to speak anything else.
as specified in RFC 6668. This is not so much because I think it's
necessary, but because scrypt uses HMAC-SHA-256 and once we've got it we
may as well use it.
Code very closely derived from the HMAC-SHA-1 code.
Tested against OpenSSH 5.9p1 Debian-5ubuntu1.
[originally from svn r9759]
diffie-hellman-group-exchange-sha256, which the last DHGEX draft defined.
Code lifted from Simon's "crypto" directory, with changes to make it look
more like sshsh512.c.
[originally from svn r6252]