An obvious goof - in SHA-384, you don't want to write out the last of
the four state vectors! Fortunately I spotted it only a couple of
hours after introducing it.
MD5 is structurally very similar to all the SHA-1 and SHA-2 hashes
(with the main difference being that the message schedule consists of
just repeating the 32-bit words of the message four times in different
permutations, instead of transforming them via an LFSR-style process).
So it helps legibility and maintainability if all the implementations
of these hashes are coded in a similar style - for example, that way,
the next time I need to make a change to the ssh_hash API, I can do it
the same way in all these modules without having to think everything
out again.
After the SHA-512 rewrite earlier today, all the hashes in that family
had been updated to a consistent new style as a side effect of adding
optional hardware acceleration, except for MD5, because there's no
hardware-accelerated version of it. (And not much chance of anyone
ever needing one, I hope!)
So this is a purely stylistic update which reworks MD5 so that it
looks just like all the SHA-1 and SHA-2 hash implementations. No
functional change.
The NEON support for SHA-512 acceleration looks very like SHA-256,
with a pair of chained instructions to generate a 128-bit vector
register full of message schedule, and another pair to update the hash
state based on those. But since SHA-512 is twice as big in all
dimensions, those four instructions between them only account for two
rounds of it, in place of four rounds of SHA-256.
Also, it's a tighter squeeze to fit all the data needed by those
instructions into their limited number of register operands. The NEON
SHA-256 implementation was able to keep its hash state and message
schedule stored as 128-bit vectors and then pass combinations of those
vectors directly to the instructions that did the work; for SHA-512,
in several places you have to make one of the input operands to the
main instruction by combining two halves of different vectors from
your existing state. But that operation is a quick single EXT
instruction, so no trouble.
The only other problem I've found is that clang - in particular the
version on M1 macOS, but as far as I can tell, even on current trunk -
doesn't seem to implement the NEON intrinsics for the SHA-512
extension. So I had to bodge my own versions with inline assembler in
order to get my implementation to compile under clang. Hopefully at
some point in the future the gap might be filled and I can relegate
that to a backwards-compatibility hack!
This commit adds the same kind of switching mechanism for SHA-512 that
we already had for SHA-256, SHA-1 and AES, and as with all of those,
plumbs it through to testcrypt so that you can explicitly ask for the
hardware or software version of SHA-512. So the test suite can run the
standard test vectors against both implementations in turn.
On M1 macOS, I'm testing at run time for the presence of SHA-512 by
checking a sysctl setting. You can perform the same test on the
command line by running "sysctl hw.optional.armv8_2_sha512".
As far as I can tell, on Windows there is not yet any flag to test for
this CPU feature, so for the moment, the new accelerated SHA-512 is
turned off unconditionally on Windows.
This builds on the previous refactoring by reworking the SHA-512
vtables and block layer to look more like the SHA-256 version, in
which the block and padding structure is a subroutine of the top-level
vtable methods instead of an owning layer around them.
This also organises the code in a way that makes it easy to drop in
hardware-accelerated versions alongside it: the block layer and the
big arrays of constants are now nicely separate from the inner
block-transform part.
It was written in an awkward roundabout way involving all the
arithmetic being done in horrible macros looking like assembler
instructions, and lots of explicit temp variables. That's because,
when I originally wrote it, I needed it to compile on platforms
without a 64-bit integer type.
In commit a647f2ba11 I switched it over to using uint64_t, but I did
it in a way that made minimal change to the code structure, by
rewriting the insides of those macros to contain ordinary uint64_t
arithmetic instead of faffing about with 32-bit halves. So it worked,
but it still looked disgusting.
Now I've reworked it so that individual arithmetic operations are
written directly in the sensible way, and the more complicated
SHA-specific operations are written as inline functions instead of
macros.
The M1 chip in the new range of Macs includes the crypto extension
that permits AES, SHA-1 and SHA-256 acceleration. But you can't find
that out by querying the ELF aux vector, because macOS isn't even
ELF-based at all, so there isn't an ELF aux vector, and no web search
I've tried has turned up any MachO thing obviously analogous to it.
Running 'sysctl -a' does show some flags indicating CPU architecture
extensions, but they're more advanced ones than this. So I think we
have to assume that if we're on the new M1 macOS at all, then we have
the basic crypto extension available.
Accordingly, I've added a special case to all the query functions that
simply returns true if defined __APPLE__.
If the autoconf/ifdef system ends up taking the trivial branch through
all the Arm-architecture ifdefs, then we define the always-fail
version of getauxval as a 'static inline' function, and then (because
none of our desired HWCAP_FOO values is defined at all) never call it.
This leads to a compiler warning because we defined a static function
and never called it - i.e. at the default -Werror, a build failure.
Of course it's perfectly sensible to define a static inline function
that never gets called! Header files do it all the time, and nobody is
expected to ensure that if they include a header file then they take
care to refer to every static inline function it defines.
But if the definition is in the _source_ file rather than a header
file, then clang (in particular on macOS) will give a warning. So the
easy solution is to move the inline definitions of getauxval into a
header file, which suppresses the warning without requiring me to faff
about with further ifdefs to make the definitions conditional on at
least one use.
With the new --open-unconditional-agent-socket option, every time
Uppity receives an SSH connection, it will immediately open a Unix-
domain socket and attempt to do agent forwarding on it, in the sense
that any connection to that socket will be turned into an
"auth-agent@openssh.com" CHANNEL_OPEN request on whichever SSH
connection it was associated with.
That connection-global socket is independent of any that are created
as part of setting up a session channel. The pathname of the socket
file is written to the server's event log (there being no other
sensible place to send it).
The aim is that this allows me to test the behaviour of an SSH client
if the server tries to open an agent-forwarding channel outside the
usual context. In particular, it allows me to test the change I just
made in the previous commit, that if you enable agent forwarding in
the client configuration, then auth-agent channels opened by the
server are accepted even if no session channel opened by the client
has sent an auth-agent-req. More importantly, it allows me to check
that I _haven't_ accidentally arranged that those channels are
accepted even when agent forwarding is _not_ permitted by the client
configuration!
Implementation details: the agent forwarding socket was previously
implemented as part of the internal sesschan structure. I've moved it
out into a little sub-struct of its own which can be created
independently of a sesschan.
Previously, the instant at which we send to the server a request to
enable agent forwarding (the "auth-agent-req@openssh.com" channel
request, or SSH1_CMSG_AGENT_REQUEST_FORWARDING) was also the instant
at which we set a flag indicating that we're prepared to accept
attempts from the server to open a channel to talk to the forwarded
agent. If the server attempts that when we haven't sent a forwarding
request, we treat it with suspicion, and reject it.
But it turns out that at least one SSH server does this, for what
seems to be a _somewhat_ sensible purpose, and OpenSSH accepts it. So,
on the basis that the @openssh.com domain suffix makes them the
arbiters of this part of the spec, I'm following their practice. I've
removed the 'agent_fwd_enabled' flag from both connection layer
implementations, together with the ConnectionLayer method that sets
it; now agent-forwarding CHANNEL_OPENs are gated only on the questions
of whether agent forwarding was permitted in the configuration and
whether an agent actually exists to talk to, and not also whether we
had previously sent a message to the server announcing it.
(The change to this condition is also applied in the SSH-1 agent
forwarding code, mostly for the sake of keeping things parallel where
possible. I think it doesn't actually make a difference in SSH-1,
because in SSH-1, it's not _possible_ for the server to try to open an
agent channel before the main channel is set up, due to the entirely
separate setup phase of the protocol.)
The use case is a proxy host which makes a secondary SSH connection to
a real destination host. A user has run into one of these recently,
announcing a version banner of "SSH-2.0-FudoSSH", which relies on
agent forwarding to authenticate the secondary connection. You connect
to the proxy host and authenticate with a username string of the form
"realusername#real.destination.host", and then, at the start of the
connection protocol, the server immediately opens a channel back to
your SSH agent which it uses to authenticate to the destination host.
And it delays answering any CHANNEL_OPEN requests from the client
until that's all done. For example (seen from the client's POV,
although the server's CHANNEL_OPEN may well have been _sent_ up front
rather than in response to the client's):
client: SSH2_MSG_CHANNEL_OPEN "session"
server: SSH2_MSG_CHANNEL_OPEN "auth-agent@openssh.com"
client: SSH2_MSG_CHANNEL_OPEN_CONFIRMATION to the auth-agent request
<- data is exchanged on the agent channel; proxy host uses
that signature to log in to the destination host ->
server: SSH2_MSG_CHANNEL_OPEN_CONFIRMATION to the session request
With PuTTY, this wasn't working, because at the point when the server
sends the auth-agent CHANNEL_OPEN, we had not yet had any opportunity
to send auth-agent-req (because that has to wait until we've had a
CHANNEL_OPEN_CONFIRMATION). So we were rejecting the server's
CHANNEL_OPEN, which broke this workflow:
client: SSH2_MSG_CHANNEL_OPEN "session"
server: SSH2_MSG_CHANNEL_OPEN "auth-agent@openssh.com"
client: SSH2_MSG_CHANNEL_OPEN_FAILURE to the auth-agent request
(hey, I haven't told you you can do that yet!)
server: SSH2_MSG_CHANNEL_OPEN_FAILURE to the session request
(in that case, no shell session for you!)
When we invent a movzx instruction as part of shift-count logging on
x86, we apparently need to set its 'translation' field to point at a
pre-existing instruction that it's logically related to. Later
versions of DynamoRIO than I was running with will complain if this
isn't done.
The callback function to pageant_enum_keys now takes a flags
parameter, which receives the flags word from the extended key list
request, if available. (If not, then the flags word is passed as
zero.)
The only callback that uses this parameter is the one for printing
text output from 'pageant -l', which uses it to print a suffix on each
line, indicating whether the key is stored encrypted only (so it will
need a passphrase on next use), or whether it's stored both encrypted
_and_ unencrypted (so that 'pageant -R' will be able to return it to
the former state).
Now, if you have a given key stored encrypted in your agent and you
say 'pageant -a [same key]' (without -E), Pageant will notice (via the
new extended key list request) that the key is currently encrypted in
the agent, and that you're trying to add it unencrypted. In this
situation it won't abort the attempt, and will try to add the key
anyway, so that it becomes decrypted in your agent.
Now, if you send SSH2_AGENTC_ADD_IDENTITY with a cleartext private key
blob, and the agent already contains an encrypted-only version of the
same key, it will drop the cleartext version in alongside it,
effectively decrypting the key as if the passphrase had been typed.
This is an extended version of SSH2_AGENTC_REQUEST_IDENTITIES, which
augments each entry in the returned key list with an extra field
containing additional data about the key.
The initial contents of that extra field are a pair of flags
indicating whether the key is currently stored in the agent encrypted,
decrypted or both.
The idea is that this will permit a Pageant-aware client to make
decisions based on that. For a start, the output key list can mention
it to the user; also, if you try to add a key unencrypted when it's
already present, we can discriminate based on whether it's already
present _unencrypted_
Somehow it had acquired a lot of internal 2-space indentation, which
is out of step with the rest of this code base's style. Before I get
into making more changes in here, let's clean it up.
When I added the psusan man page, I noticed that they've all got
impenetrable names like 'man-pl.but' to fit within 8.3 naming. But
this source base hasn't had to worry about 8.3 naming conventions in a
long time, so I think I can safely rename all those files to ones
whose purpose is more obvious.
I've been collecting actual examples of things I've used psusan for,
and now I think I have enough of them to make some kind of case for
why it's a useful tool. So I've written a man page, and dumped all my
collected examples in there.
In some applications of psusan, it's useful to establish a fixed
listening endpoint on a Unix-domain socket. You can make this happen
using an external helper program (effectively behaving like a
specialised inetd), but it's more convenient to have it built in to
psusan itself, and not really very difficult since Uppity had all the
necessary code already.
I've also added the --listen-once option from Uppity, and for good
measure, the --verbose option (so that psusan in listening mode can
show connections and disconnections on its original standard error).
'Uppity' is the name of a program that's only useful for debugging, so
I'd rather not have its name reused by psusan which I'm polishing up
to be actually useful to end users (if rather specialist ones).
So SshServerConfig now has an 'application name' field which is used
as the application name in the SSH banner, and Uppity sets it to
"Uppity" while psusan sets it to "PSUSAN".
This occurred to me recently as a (very small) hole in the logging
strategy: if the size of an allocated memory block depended on some
secret data, it certainly would change the control flow and memory
access pattern inside malloc, but since we disable logging inside
malloc, the log file from this test suite would never see the
difference.
Easily fixed by printing the size of each block in the code that
intercepts malloc and realloc. As expected, no test actually fails as
a result of filling in this gap.
On AArch64, there are unexpectedly malloc and free functions in ld.so,
so the module-load function finds them there, wraps them, and then
misses the real versions in libc.
This allows my side-channel test system to at least _compile_ on other
architectures without failing for the lack of OP_xxx enum constants,
although it now won't log all the things it needs to be a proper test.
Apart from being pointless, it also triggers a bug in OpenSSH pre-8.1
that causes it to send a repeat EXT_INFO after the rekey concludes,
which trips our quite draconian check for whether EXT_INFO has been
sent at the right time.
The OpenSSH bug: https://bugzilla.mindrot.org/show_bug.cgi?id=2929
Again in the GDK Broadway backend, where we never get a non-Unicode
translation of any keystroke: when we come to the code that handles
Ctrl+letter (and other symbols), we were basing the Ctrl transform on
output[1], that is, the non-Unicode translation we had so far. But we
didn't have one.
Now we check the use_ucsoutput flag to decide which of output and
ucsoutput to start from, and as a result, Ctrl+keys works again on
Broadway.
When you run using the GDK Broadway backend, this turns out to happen,
and it's new in my experience - I was cheerfully iterating over
event->string and calling strlen on it without ever checking it for
NULL.
I must have not recompiled with debug printouts enabled since updating
the internal printf functions to have the gcc printf attribute, or
these warnings would surely have come up before.
If it's worth having command-line options to _specify_ a log file,
it's also worth having options to avoid having to answer an
interactive prompt _about_ that log file every time.
(Particularly useful when debugging, in which I often want to run a
zillion instances of the same quite temporary command line that
involves writing a log file.)
This adds the framework to be able to send it in both client _and_
server (in the post-NEWKEYS slot); it's just that currently only the
server has anything it wants to put in it.
Uppity now announces its public key type list, which is enough by
itself to allow it to accept RFC 8332 rsa-sha2-* signatures during
userauth. (Because the key verification code receives an ssh-rsa host
key and validates it against the SHA2-based key algorithm structure
derived from the id string that was sent separately.)
The information was already centralised in find_pubkey_alg, but that
had a query-based API that couldn't enumerate the key types. Now I
expose an underlying array so that it's possible to iterate over them.
Also, I'd forgotten to add the two new rsa-sha2-* algorithms to
find_pubkey_alg. That's also done as part of this commit.
As with the userauth keys, there's a localised bodge when sending
algorithm names, where I just write a couple of extra entries into the
list when I notice that a key is RSA-typed. Then I arrange that the
selection of those entries sets the new variable s->hkflags to the
right value to pass to ssh_key_sign.
We parse enough of EXT_INFO to spot when the server advertises support
for them, and if it does, we upgrade the key algorithm name from
"ssh-rsa" to one of the other two, and set appropriate signing flags.
This doesn't actually end up using the ssh_rsa_sha256 / ssh_rsa_sha512
vtables I set up two commits ago, because it's easier to just vary the
flags word we pass to ssh_key_sign.
The upgrade is done by ad-hoc special-case code in ssh2userauth.c. I
could have done it by introducing a new ssh_keyalg vtable method for
'please upgrade to your favourite version of yourself according to
some set of flags from the BPP', but it just didn't seem like a good
idea at this stage, because it presupposes that quirks in the
algorithm selection are going to follow a consistent pattern, and I
think it's much more likely that the next weird thing in this area
will be something totally different. So I've left it as a localised
bodge for now, and we can always refactor it into something nicer once
we have more information and know what the nicer thing _is_.
We now add the appropriate advertisement to our KEXINIT that indicates
a willingness to receive EXT_INFO. Code in the BPP enforces that it
must appear in one of the permitted locations in the protocol (in
particular, this ensures a pre-key-exchange MITM can't get away with
inserting it into the initial cleartext segment of the protocol). And
when we receive it, we look through it for extension names we know
about.
No functional change (except for the advertisement in KEXINIT): we
don't yet actually do anything in response to any extension reported
in EXT_INFO.
This is the cleanest part of the RFC 8332 support: I simply add two
more RSA-based SSH-2 key algorithm vtables, both almost identical to
the existing one, with different ssh_id strings and signature flags.
Adding those to the HOSTKEY_ALGORITHMS list macro is enough to ensure
that we advertise support for the new identifiers in our client
KEXINIT, select the appropriate algorithm if the server announces one
or both of them too, and use the right version of the signature
validation.
Its boolean parameters were typed 'int', a hangover from before the
int-to-bool migration last year. It's not used in the current state of
the code base, so nobody noticed until now.
Uppity's built-in SFTP server makes up its file handle identifiers
using random_read(). But when that server is reused in psusan, which
doesn't have the random number generator enabled, you get an assertion
failure.
Introduced by commit 5891142aee, in which I invented
strbuf_shrink_to() and went round replacing lots of assignments of the
form 'sb->len = smaller_value' with calls to it. The bug is also in
0.74, because 34a0460f05 is a cherry-pick of the commit that
introduced it.
The difference between that assignment and strbuf_shrink_to is partly
that the latter checks by assertion that the new length really is
_smaller_ - it doesn't let you accidentally grow a strbuf's length
field beyond the limit of its buffer (or indeed at all). But also,
strbuf_shrink_to re-establishes the strbuf invariant that the text
logically in the buffer is always followed by a zero byte, so that
it's a valid C string.
Unfortunately, in one of the places I made this change, I was storing
binary data in the strbuf (so the terminating NUL is unimportant), and
immediately after decreasing the strbuf's length, I was doing a memcmp
one of whose arguments was the data I'd just chopped off the end of
the strbuf. So it _mattered_ that no random NUL had been splurged over
it.
Specifically, this happened in the run-length encoder used to compress
scrollback data, and had the effect that two components of the
compressed scrollback could be spuriously considered equal, if one of
them started with a legitimate zero byte and the other had a zero byte
written over it by this bug. Thanks to Michael Weller for a nice test
case that demonstrated a compressed scrollback line being decompressed
again as the wrong thing:
"NORMAL TEXT, \033[42mGREEN BACKGROUND\033[0m, NORMAL TEXT AGAIN"
If the above line is printed to the terminal (after being decoded as
if it was a C string literal), then only the words "GREEN BACKGROUND"
get a green background. But after that line is scrolled off the top of
the window, if you find it in the scrollback, then the rest of the
line to the right has also become green-backgrounded due to this bug.
This is a piece of conditioned-out code that I haven't used since I
originally invented the compressed scrollback format, which
decompressed every scrollback line immediately after compressing it to
check that the round-trip conversion worked. Now I have occasion to
actually use it, I find that (of course) changes around it have made
it not quite work any more: the thing the diagnostic code is passing
to decompressline hasn't had its length field filled in yet, because
that gets done 20 lines later.
Now you can compile with -DTERM_CC_DIAGS again, and it doesn't crash
_unless_ it detects the actual bug it was intended to spot.
We use this for detecting the Arm crypto extension and using it to
enable accelerated AES and/or SHA-{1,2}. Previously, I had code that
called glibc's getauxval(3) function, conditioned on #ifdef __linux__.
Now, instead, I do an autoconf test to query the presence of getauxval
itself (so that any other system with the same API can still work),
and alongside it, also check for the analogous FreeBSD libc function
elf_aux_info(3). As a result, building on Arm FreeBSD now gets the
accelerated-crypto autodetection.
I carefully set a 'finished' flag in the main source file on receipt
of the server_instance_terminated() callback, and then I plain forgot
to hook it up to the uxcliloop callback that says whether the program
should carry on running each time round the main loop. Now we actually
check the finished flag, and terminate the program if it's set.
We keep an internal 128-bit counter that's used as part of the hash
preimages. There's no real need to import all the mp_int machinery in
order to implement that: we can do it by hand using a small fixed-size
array and a trivial use of BignumADC. This is another inter-module
dependency that's easy to remove and useful to spinoff programs.
This changes the hash preimage calculation in the PRNG, because we're
now formatting our 128-bit integer in the fixed-length representation
of 16 little-endian bytes instead of as an SSH-2 mpint. This is
harmless (perhaps even mildly beneficial, due to the length now not
depending on how long the PRNG has been running), but means I have to
update the PRNG tests as well.
For use in spinoff programs: this is an alternative to proxy.c, which
provides the same API (to avoid link failures in modules like
x11fwd.c) but implements it in the trivial way, supporting no proxying
at all and just wrapping the underlying sk_new() and friends.