The immediate usefulness of this is in pterm.exe: when the user uses
-e to specify a command to run in the pterm, we retrieve the command
in Unicode, store it in CONF_remote_cmd as UTF-8, and then in conpty.c
we can extract it in the same form and convert it back to Unicode to
pass losslessly to CreateProcessW. So now non-ACP Unicode works in
that part of the pterm command line.
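Schematically, the Windows end of that round trip is just the standard
UTF-8-to-UTF-16 conversion (an illustrative sketch, not the exact
conpty.c code; the helper name is mine):

    #include <windows.h>
    #include <stdlib.h>

    /* Convert a UTF-8 string (e.g. one retrieved from Conf) to
     * UTF-16, so it can go to CreateProcessW without passing through
     * the system code page. Helper name is hypothetical. */
    static wchar_t *dup_utf8_to_wide(const char *utf8)
    {
        int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
        if (wlen <= 0)
            return NULL;
        wchar_t *wide = malloc(wlen * sizeof(wchar_t));
        if (wide)
            MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, wlen);
        return wide;    /* caller frees after CreateProcessW */
    }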
This is the pathfinding change that proves it's possible for _one_
Conf setting to become Unicode-capable.
That seems like quite a small reward for all the refactoring in the
previous patches this week! But changing over one configuration
setting is enough to get started with: once all the bugs are out of
this one, we can try switching over some more.
Changing the type to CONF_TYPE_STR_AMBI is enough by itself to make
the configuration dialog box write it into Conf as UTF-8, because
conf_editbox_handler automatically checks whether that possibility is
available. However, setting the same Conf entry from the command line
isn't automatic: I had to add code in the handler for the -l
command-line option in cmdline.c.
This commit also doesn't yet handle the _other_ way to specify a
username on the command line: including it as part of the hostname
argument via "putty user@host" or similar. That's more difficult,
because it also requires deciding what to do about UTF-8 in the actual
hostname.
(That looks as if it ought to be possible: Windows should be able to
handle looking up Unicode hostnames if you use GetAddrInfoW() in place
of getaddrinfo(). But plumbing it through everything in between
cmdline.c and windows/network.c is a bigger job than I'm prepared to
do in this proof-of-concept commit.)
This begins the process of making PuTTY more able to handle Unicode
strings as a first-class type in its configuration. One of the new
types, CONF_TYPE_UTF8, looks physically just like CONF_TYPE_STR but
the semantics are that it's definitely encoded in UTF-8, instead of
'shrug, whatever the system locale's encoding is'.
Unfortunately, we can't yet switch over any Conf items to having that
type, because our data representations in saved configuration (both on
Unix and Windows) store char strings in the system encoding. So we'll
have to change that representation at the same time, which risks
breaking backwards compatibility with old PuTTYs reading the same
configuration.
So the other new type, CONF_TYPE_STR_AMBI, is intended as a
transitional form, recording a configuration setting that _might_ be
explicitly UTF-8 or might have the legacy 'shrug, whatever' semantics,
depending on where we got it from.
My general migration plan is that first I _enable_ Unicode support in
a Conf item, by turning it into STR_AMBI; the Unicode version of the
string (if any) is saved in a new location, and a best-effort
local-charset version is saved where it's always been. That way new
PuTTY can read the Unicode version, and old PuTTY reading that
configuration will behave no worse than it would have done already.
It would be nice to think that in the far future we've migrated
everything to STR_AMBI and can move them all to mandatory UTF-8,
obsoleting the old configuration. I think it's more likely we'll never
get there. But at least _new_ Conf items, with no backwards
compatibility requirement in the first place, can be CONF_TYPE_UTF8
where appropriate.
(In conf_get_str_ambi(), I considered making it mandatory via assert()
to pass the 'utf8' output pointer as non-NULL, to defend against lazy
adaptation of existing code by just changing the function call. But in
fact I think there's a legitimate use case for not caring if the
output is UTF-8 or not, because some of the existing SSH code
currently just shoves strings like usernames directly on to the wire
whether they're in the right encoding or not; so if you want to do the
correct UTF-8 thing where possible and preserve legacy behaviour if
not, then treating both classes of string the same _is_ the right
thing to do.)
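A usage sketch, assuming a prototype along the lines of
const char *conf_get_str_ambi(Conf *conf, int key, bool *utf8) - the
exact shape may differ:

    /* Fragment only: Conf, CONF_username and conf_get_str_ambi come
     * from PuTTY's own headers. */
    static void example(Conf *conf)
    {
        bool is_utf8;
        const char *user =
            conf_get_str_ambi(conf, CONF_username, &is_utf8);
        if (is_utf8) {
            /* definitely UTF-8: safe to re-encode precisely */
        } else {
            /* legacy 'whatever the system locale says' string */
        }

        /* A caller that passes the string on unchanged either way
         * can legitimately pass NULL and not ask: */
        const char *raw =
            conf_get_str_ambi(conf, CONF_username, NULL);
        (void)user; (void)raw;
    }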
This also requires linking the Unicode support into many Unix
applications that hadn't previously needed it.
You could reproduce this, for example, by cutting the final line
reading "---- END SSH2 PUBLIC KEY ----" off the end of a file, and
feeding it to Unix 'puttygen -l'.
rfc4716_loadpub() was looping round on get_chomped_line() until it
found a line starting with "-" after the base64 data. But it failed to
check for the end of the file as well, so if the data was truncated,
it would just keep spinning at the end of the file.
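The shape of the fix, as a standalone sketch (the real code works on
PuTTY's own line-reading machinery rather than stdio):

    #include <stdio.h>
    #include <stdbool.h>

    /* Loop over lines looking for the "----" trailer, treating EOF
     * as an error instead of spinning forever. */
    static bool skip_to_trailer(FILE *fp)
    {
        char line[4096];
        while (fgets(line, sizeof(line), fp)) {   /* NULL at EOF */
            if (line[0] == '-')
                return true;    /* found the END line */
            /* otherwise it's base64 body: keep going */
        }
        return false;   /* truncated file: report an error */
    }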
A user sent a transcript from a curses-based tool 'ncmpc', which
carefully disables terminal autowrap when printing a character in the
bottom right corner of the display, and then turns it back on again.
After that, it expects that sending the backspace character really
moves the cursor back a space, instead of clearing the wrapnext flag.
But in PuTTY, we set the wrapnext flag even if we're not in wrapping
mode - it just doesn't _do_ anything when the next character is sent.
It nevertheless remains set, and still affects backspace. So the
display is corrupted by this mismatch of expectations.
(Specifically, ncmpc is printing a time display [m:ss] in the very
bottom right, so it disables wrap in order to print the final ']'.
Then the next thing it needs to do is to update the low-order digit of
the seconds field, so it sends \b as the simplest way to get to that
character. The effect on the display is that the updated seconds digit
appears where the ] was, instead of overwriting the old seconds digit.)
This is a tradeoff in desirable behaviours. The point of having a
backspace operation cancel the wrapnext flag and not actually move the
cursor is to preserve the invariant that sending 'x', backspace, 'y'
causes the y to overprint the x, even if that happens near the end of
the terminal's line length. In non-wrapping mode that invariant was
bound to break _eventually_, but with this change, it breaks one
character earlier than before. However, I think that's less bad than
breaking the expectations of curses-based full-screen applications,
especially since the _main_ need for that invariant arises from naïve
applications that don't want to have to think about the terminal width
at all - and those applications generally run in _wrapping_ mode,
where it's possible to continue the invariant across multiple lines in
any case.
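A minimal model of the new rule (field names simplified, not the real
'Terminal' structure):

    #include <stdbool.h>

    struct term_model {
        int curs_x, cols;
        bool wrap;       /* auto-wrap (DECAWM) enabled? */
        bool wrapnext;   /* deferred wrap pending? */
    };

    static void after_printing_char(struct term_model *t)
    {
        if (t->curs_x == t->cols - 1) {
            /* Rightmost column: only remember a deferred wrap if
             * auto-wrap is on; otherwise leave wrapnext clear, so a
             * following backspace really moves the cursor back. */
            if (t->wrap)
                t->wrapnext = true;
        } else {
            t->curs_x++;
        }
    }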
Python 3.12 has a new warning for backslash-character pairs that are not
valid escape sequences at the level of string literals, as opposed to in
some interior syntax such as regular expressions
(https://docs.python.org/3/whatsnew/3.12.html#other-language-changes).
Suppress it by using raw strings.
The "rsa-sha2-256" and "rsa-sha2-512" algorithms, as defined by RFC
8332, differ in one detail from "ssh-rsa" in addition to the change of
hash function. They also specify that the signature integer should be
encoded using the same number of bytes as the key modulus, even if
that means giving it a leading zero byte (or even more than one).
I hadn't noticed this, and had assumed that unrelated details wouldn't
have changed. But they had. Thanks to Ilia Mirkin for pointing this
out.
Nobody has previously reported a problem, so very likely most servers
are forgiving of people making this mistake! But now it's been pointed
out, we should comply with the spec. (Especially since the new spec is
more sensible, and only historical inertia justified sticking to the
old one.)
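The encoding rule itself is simple enough to sketch (simplified; the
real code works in terms of PuTTY's own bignum and marshalling types):

    #include <stddef.h>
    #include <string.h>

    /* Emit the signature integer in exactly as many bytes as the key
     * modulus, zero-padding on the left if necessary. Assumes
     * sig_bytes <= modulus_bytes, which RSA guarantees. */
    static void put_fixed_width(unsigned char *out, size_t modulus_bytes,
                                const unsigned char *sig, size_t sig_bytes)
    {
        memset(out, 0, modulus_bytes - sig_bytes);
        memcpy(out + (modulus_bytes - sig_bytes), sig, sig_bytes);
    }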
These were deliberately thrown away in our UTF-8 decoder, with a
comment apparently introduced by RDB in the 2001 big Unicode patch.
The purpose of this character range has changed completely since then,
and now they act as modifier characters on top of U+1F3F4 to construct
a space of flags (the standard examples being those of England,
Scotland and Wales). We were failing to display those flags, and even
pasting out of the terminal didn't give back the right Unicode.
This is the only one of the newly added cases in test/utf8.txt which I
can (try to) fix unilaterally just by changing PuTTY's display code,
because it doesn't change the number of character cells occupied by
the text, only the appearance of those cells.
In this commit I make the necessary changes in terminal.c, which makes
flags start working in GTK PuTTY and pterm, but not on Windows.
The system of encoding flags in Unicode is that there's a space of 26
regional-indicator letter code points (U+1F1E6 to U+1F1FF inclusive)
corresponding to the unaccented Latin alphabet, and an adjacent pair
of those letters represents the flag associated with that two-letter
code (usually a nation, although at least one non-nation pair exists,
namely EU).
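In code terms the recognition step is trivial (illustration only, not
PuTTY's actual helper):

    #include <stdbool.h>

    /* Regional-indicator letters: U+1F1E6 ("A") .. U+1F1FF ("Z"). */
    #define IS_REGIONAL_INDICATOR(cp) \
        ((cp) >= 0x1F1E6 && (cp) <= 0x1F1FF)

    /* Do two adjacent code points form a flag, e.g. U+1F1EC U+1F1E7
     * for "GB"? */
    static bool is_flag_pair(unsigned long cp1, unsigned long cp2)
    {
        return IS_REGIONAL_INDICATOR(cp1) && IS_REGIONAL_INDICATOR(cp2);
    }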
There are two plausible ways we could handle this in terminal.c:
(a) leave the regional indicators as they are in the internal data
model, so that each RI letter occupies its own character cell,
and at display time have do_paint() spot adjacent pairs of them
and send each pair to the frontend as a combined glyph.
(b) combine the pairs _in_ the internal data model, by
special-casing them in term_display_graphic_char().
This choice makes a semantic difference. What if a flag is displayed
in the terminal and something overprints one of its two character
cells? With option (a), overprinting one cell of an RI pair with a
different RI letter would change it into a different flag; with
option (b), flags behave like any other wide character, in that
overprinting one of the two cells blanks the other as a side effect.
I think we need (a), because not all terminal redraw systems
(curses-style libraries) will understand the Unicode flag glyph system
at all. So if a full-screen application genuinely wants to do a screen
redraw in which a flag changes to a different flag while keeping one
of its constituent letters the same (say, swapping between BA and CA,
or between AC and AD), then the redraw library might very well
implement that screen update by redrawing only the changed letter, and
we need not to corrupt the flag.
All of this is now implemented in terminal.c. The effect is that pairs
of RI characters are passed to the TermWin draw_text() method as if
they were a wide character with a combining mark: that is, you get a
two-character (or four-surrogate) string, with TATTR_COMBINING
indicating that it represents a single glyph, and ATTR_WIDE indicating
that that glyph occupies two character cells rather than one.
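For example, the 'GB' flag (U+1F1EC U+1F1E7) becomes four UTF-16 code
units via the standard surrogate-pair arithmetic (standalone sketch,
not PuTTY's buffer-handling code):

    #include <stdint.h>
    #include <stddef.h>

    /* Encode one supplementary-plane code point as a surrogate pair. */
    static size_t put_utf16(uint16_t *out, uint32_t cp)
    {
        cp -= 0x10000;
        out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
        out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
        return 2;
    }
    /* U+1F1EC U+1F1E7 -> D83C DDEC D83C DDE7, passed to draw_text()
     * with TATTR_COMBINING | ATTR_WIDE set. */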
In GTK, that's enough to make flag display Just Work. But on
Windows (at least the Win10 machine I have to test on), that doesn't
make flags start working all by itself. But then, the rest of the new
emoji tests look a bit confused on Windows too. Help would be
welcome from someone who knows how Windows emoji display is supposed
to work!
These samples all come from the 'emoji' parts of Unicode, although I
use the word a bit loosely because I'm not sure that flags count (they
have their own special system). But they're all things that ought to
display via a separate font, likely in colour.
The second line of this extra test already looks correct in PuTTY:
three code points each representing an emoji, for which wcwidth()
correctly reports that they occupy 2 cells each. On GTK, the emoji
even appear in colour; on Windows they come out in black and
white. (And I don't know what I can do to fix that; the problem is not
that I don't have any emoji font installed. I do.)
The first line consists of 'simpler' emoji in the sense of being more
common, but technically more complicated, because they're ordinary
Unicode characters such as U+2764 HEAVY BLACK HEART, modified into
emoji by U+FE0F VARIATION SELECTOR-16. This goes badly because
wcwidth() measures the primary character as having width 1 (which it
would do, by itself), and the variation selector as width 0 (also not
unreasonable), but the total is 1, where you'd like it to be 2. This
is also difficult to fix, because if we unilaterally changed it then
every curses-type library would mispredict the cursor position and
produce display corruption during partial screen redraws!
The third line uses a mechanism I've only found out about recently:
U+200D ZERO WIDTH JOINER glues together two code points that would
each be a valid emoji on its own, to make a single combined one. In
this case, WOMAN + PERSONAL COMPUTER ought to combine into a woman
using a computer. Again this doesn't work in PuTTY, which knows
nothing about ZWJ. But it comes out as expected in other tools viewing
this file, such as 'gedit', or Firefox.
The fourth line shows another complex emoji case: the WOMAN code point
is followed by U+1F3FB EMOJI MODIFIER FITZPATRICK TYPE-1-2, and
another one is followed by U+1F3FF EMOJI MODIFIER FITZPATRICK TYPE-6,
in each case selecting the woman's skin tone. PuTTY mishandles that
too, because it doesn't know that those should act as modifiers (again
because wcwidth gives them width 2 rather than 0), and so each one
occupies an extra two character cells.
And the last line contains some sample flags, each of which is
obtained by writing a 2-letter code for a country or region (here GB,
UA, EU) with each Latin letter replaced by the appropriate 'regional
indicator symbol letter' from the 26-code-point range U+1F1E6 to
U+1F1FF inclusive. PuTTY doesn't know anything about those either, but
they at least occupy the right number of cells if handled naïvely, so
_that_ one might be possible to fix!
I used this to confirm that the previous nonces generated by
dsa_gen_k() were indeed biased, and to check that the new RFC6979 ones
don't have the same problem.
Recovering the DSA nonce value is equivalent to recovering the private
key. One way round, this is well known: if you leak or reuse a nonce,
your private key is compromised. But the other direction of the
equivalence is also true - if you know the private key and have a
signed message, you can retrieve the input nonce. This is much less
obviously useful (certainly not to an attacker), but I found it
convenient for this particular test purpose, because it can operate on
the standard SSH data formats, without needing special access into the
signing algorithm to retrieve its internal variables. So I was able to
run this script unchanged against the 'before' and 'after' versions of
testcrypt, and observe the difference.
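For reference, the recovery drops straight out of the (EC)DSA signing
equation: since s = k^-1 * (H(m) + x*r) mod q, anyone who holds the
private key x can compute

    k = s^-1 * (H(m) + x*r) mod q

from the signature pair (r, s) and the hashed message, where q is the
group order - which is all the script needs from the standard SSH
signature and key formats.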
These tests also failed when I reran testsc, and looking at the code,
no wonder: in each test iteration, the hash object is allocated
_before_ logging begins, rather than after, so that its addresses
aren't normalised by the test suite to 'n bytes after allocation #0'.
So these tests only pass as long as all the allocations get lucky in
reusing the same address. I guess we got lucky on all previous
occasions and didn't notice until now.
Easy fix: now each iteration does alloc / do stuff / free within the
logged section.
This fixes a vulnerability that compromises NIST P521 ECDSA keys when
they are used with PuTTY's existing DSA nonce generation code. The
vulnerability has been assigned the identifier CVE-2024-31497.
PuTTY has been doing its DSA signing deterministically for literally
as long as it's been doing it at all, because I didn't trust Windows's
entropy generation. Deterministic nonce generation was introduced in
commit d345ebc2a5, as part of the initial version of our DSA
signing routine. At the time, there was no standard for how to do it,
so we had to think up the details of our system ourselves, with some
help from the Cambridge University computer security group.
More than ten years later, RFC 6979 was published, recommending a
similar system for general use, naturally with all the details
different. We didn't switch over to doing it that way, because we had
a scheme in place already, and as far as I could see, the differences
were not security-critical - just the normal sort of variation you
expect when any two people design a protocol component of this kind
independently.
As far as I know, the _structure_ of our scheme is still perfectly
fine, in terms of what data gets hashed, how many times, and how the
hash output is converted into a nonce. But the weak spot is the choice
of hash function: inside our dsa_gen_k() function, we generate 512
bits of random data using SHA-512, and then reduce that to the output
range by modular reduction, regardless of what signature algorithm
we're generating a nonce for.
In the original use case, this introduced a theoretical bias (the
output size is an odd prime, which doesn't evenly divide the space of
2^512 possible inputs to the reduction), but the theory was that since
integer DSA uses a modulus prime only 160 bits long (being based on
SHA-1, at least in the form that SSH uses it), the bias would be too
small to be detectable, let alone exploitable.
Then we reused the same function for NIST-style ECDSA, when it
arrived. This is fine for the P256 curve, and even P384. But in P521,
the order of the base point is _greater_ than 2^512, so when we
generate a 512-bit number and reduce it, the reduction never makes any
difference, and our output nonces are all in the first 2^512 elements
of the range of about 2^521. So this _does_ introduce a significant
bias in the nonces, compared to the ideal of uniformly random
distribution over the whole range. And it's been recently discovered
that a bias of this kind is sufficient to expose private keys, given a
manageably small number of signatures to work from.
(Incidentally, none of this affects Ed25519. The spec for that system
includes its own idea of how you should do deterministic nonce
generation - completely different again, naturally - and we did it
that way rather than our way, so that we could use the existing test
vectors.)
The simplest fix would be to patch our existing nonce generator to use
a longer hash, or concatenate a couple of SHA-512 hashes, or something
similar. But I think a more robust approach is to switch it out
completely for what is now the standard system. The main reason why I
prefer that is that the standard system comes with test vectors, which
adds a lot of confidence that I haven't made some other mistake in
following my own design.
So here's a commit that adds an implementation of RFC 6979, and
removes the old dsa_gen_k() function. Tests are added based on the
RFC's appendix of test vectors (as many as are compatible with the
more limited API of PuTTY's crypto code, e.g. we lack support for the
NIST P192 curve, or for doing integer DSA with many different hash
functions). One existing test changes its expected outputs, namely the
one that has a sample key pair and signature for every key algorithm
we support.
While trying to get an upcoming piece of code through testsc, I had
trouble - _yet again_ - with the way that control flow diverges inside
the glibc implementations of functions like memcpy and memset,
depending on the alignment of the input blocks _above_ the alignment
guaranteed by malloc, so that doing the same sequence of malloc +
memset can lead to different control flow. (I believe this is done
either for cache performance reasons or SIMD alignment requirements,
or both: on x86, some SIMD instructions require memory alignment
beyond what malloc guarantees, which is also awkward for our x86
hardware crypto implementations.)
My previous effort to normalise this problem out of sclog's log files
worked by wrapping memset and all its synonyms that I could find. But
this weekend, that failed for me, and the reason appears to be ifuncs.
I'm aware of the great irony of committing code to a security project
with a log message saying something vague about ifuncs, on the same
weekend that it came to light that commits matching that description
were one of the methods used to smuggle a backdoor into the XZ Utils
project (CVE-2024-3094). So I'll bend over backwards to explain both
what I think is going on, and why this _isn't_ a weird ifunc-related
backdooring attempt:
When I say I 'wrap' memset, I mean I use DynamoRIO's 'drwrap' API to
arrange that the side-channel test rig calls a function of mine before
and after each call to memset. The way drwrap works is to look up the
symbol address in either the main program or a shared library; in this
case, it's a shared library, namely libc.so. Then it intercepts call
instructions with exactly that address as the target.
Unfortunately, what _actually_ happens when the main program calls
memset is more complicated. First, control goes to the PLT entry for
memset (still in the main program). In principle, that loads a GOT
entry containing the address of memset (filled in by ld.so), and jumps
to it. But in fact the GOT entry varies its value through the program;
on the first call, it points to a resolver function, whose job is to
_find out_ the address of memset. And in the version of libc.so I'm
currently running, that resolver is an STT_GNU_IFUNC indirection
function, which tests the host CPU's capabilities, and chooses an
actual implementation of memset depending on what it finds. (In my
case, it looks as if it's picking one that makes extensive use of x86
SIMD.) To avoid the overhead of doing this on every call, the returned
function pointer is then written into the main program's GOT entry for
memset, overwriting the address of the resolver function, so that the
_next_ call the main program makes through the same PLT entry will go
directly to the memset variant that was chosen.
And the problem is that, after this has happened, none of the new
control flow ever goes near the _official_ address of memset, as read
out of libc.so's dynamic symbol table by DynamoRIO. The PLT entry
isn't at that address, and neither is the particular SIMD variant that
the resolver ended up choosing. So now my wrapper on memset is never
being invoked, and memset cheerfully generates different control flow
in runs of my crypto code that testsc expects to be doing exactly the
same thing as each other, and all my tests fail spuriously.
My solution, at least for the moment, is to completely abandon the
strategy of wrapping memset. Instead, let's just make it behave the
same way every time, by forcing all the affected memory allocations to
have extra-strict alignment. I found that 64-byte alignment is not
good enough to eliminate memset-related test failures, but 128-byte
alignment is.
This would be tricky in itself, if it weren't for the fact that PuTTY
already has its own wrapper function on malloc (for various reasons),
which everything in our code already uses. So I can divert to C11's
aligned_alloc() there. That in turn is done by adding a new #ifdef to
utils/memory.c, and compiling it with that #ifdef into a new object
library that is included in testsc, superseding the standard memory.o
that would otherwise be pulled in from our 'utils' static library.
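The shape of that diversion, schematically (the macro name here is
invented, e.g. defined to 128 on the testsc build; the real #ifdef may
be spelled differently):

    #include <stdlib.h>

    void *safemalloc_sketch(size_t size)
    {
        void *p;
        if (size == 0)
            size = 1;
    #ifdef TESTSC_ALLOC_ALIGN
        /* C11 aligned_alloc wants the size rounded up to a multiple
         * of the alignment. */
        p = aligned_alloc(TESTSC_ALLOC_ALIGN,
                          (size + TESTSC_ALLOC_ALIGN - 1)
                          / TESTSC_ALLOC_ALIGN * TESTSC_ALLOC_ALIGN);
    #else
        p = malloc(size);
    #endif
        if (!p)
            abort();    /* stand-in for PuTTY's out_of_memory() */
        return p;
    }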
With the previous memset-compensator removed, this means testsc is now
dependent on having aligned_alloc() available. So we test for it at
cmake time, and don't build testsc at all if it can't be found. This
shouldn't bother anyone very much; aligned_alloc() is available on
_my_ testsc platform, and if anyone else is trying to run this test
suite at all, I expect it will be on something at least as new as
that.
(One awkward thing here is that we can only replace _new_ allocations
with calls to aligned_alloc(): C11 provides no aligned version of
realloc. Happily, this doesn't currently introduce any new problems in
testsc. If it does, I might have to do something even more painful in
future.)
So, why isn't this an ifunc-related backdoor attempt? Because (and you
can check all of this from the patch):
1. The memset-wrapping code exists entirely within the DynamoRIO
plugin module that lives in test/sclog. That is not used in
production, only for running the 'testsc' side-channel tester.
2. The memset-wrapping code is _removed_ by this patch, not added.
3. None of this code is dealing directly with ifuncs - only working
around the unwanted effects on my test suite from the fact that
they exist somewhere else and introduce awkward behaviour.
These are now specified in conf.h and filled in by automated code,
which means test_conf can make sure we didn't forget to provide them.
The default for a mapping type (not that we currently have any unsaved
ones) is expected to be empty.
Also, while adding test_conf checks, I realised I hadn't filled in the
rest of the comment in conf.h. Belatedly updated that.
This allows a couple more settings to be treated automatically on
save, which are more complicated on load because they still honour
older alternative save keywords.
In particular, CONF_proxy_type and CONF_remote_qtitle_action now have
explicit enum mappings. These were needed for the automated save code,
but also, I've rewritten the custom load code to use them too. This
decouples the storage format of those settings from the order of
values in the internal enum, which is generally an advantage of
specifying storage enums explicitly.
Those two settings weren't already tested by test_conf, because I
wasn't changing them in previous commits. Now I've added extra code
that does test them, and verified it works when backported to commit
b567c9b2b5 where I introduced test_conf before beginning the main
refactoring.
A setting can also be specified explicitly as not loaded and saved at
all. There were quite a few commented that way, but now there's a
machine-readable indication of it.
test_conf will now check that all these settings make sense together -
things shouldn't have a save keyword unless they use it, should have
one if they do, and shouldn't specify combinations of options that
conflict.
(For that reason, test_conf is now also running the consistency check
before the main test, so that a missing keyword will cause an error
message _before_ it causes a segfault, saving some debugging!)
This is why I wrote conf.h in the form of macros that expanded to
named structure field assignments, instead of just filling it with
named structure field assignments directly. This way, I can #include
the same file again with different macro definitions, and build up a
list of what fields were set in what config options.
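A condensed generic illustration of the technique (names invented; the
real conf.h is re-#included with different macro definitions rather
than using a single list macro, but the principle is the same):

    #define OPTION_LIST(X) \
        X(host,     STR, "")  \
        X(port,     INT, 22)  \
        X(username, STR, "")

    /* Expansion 1: an enum of option keys. */
    #define MAKE_ENUM(name, type, def) KEY_##name,
    enum { OPTION_LIST(MAKE_ENUM) N_OPTIONS };
    #undef MAKE_ENUM

    /* Expansion 2: a table a test program can walk, e.g. to check
     * that each declared default matches the declared type. */
    struct opt_info { const char *name, *type; };
    #define MAKE_INFO(name, type, def) { #name, #type },
    static const struct opt_info opt_info[] = { OPTION_LIST(MAKE_INFO) };
    #undef MAKE_INFO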
This new code checks that if a config option has a default, then the
type of the default matches the declared type of the option value
itself. That's what caught the two goofs in the previous commit.
This is also the part of test_conf that I _won't_ want to delete once
I've finished with the refactoring: it can stay there forever, doing
type checking at test time that the compiler isn't doing for me at
build time.
Ahem. Of course I've been running it interactively until now, so I
never noticed that I'd forgotten to fill in that important point. But
now that it's run as part of my build, it should make sure to return a
failing exit status when its tests fail.
This aims to be a reasonably exhaustive test of what happens if you
set Conf values to various things, and then save your session, and
find out what ends up in the storage. Or vice versa.
Currently, the test program is written to match the existing
behaviour. The idea is that I can refactor the code that does the
loading and saving, and if this test still passes, I've probably done
it right.
However, in the long term, this test will be a liability: it's yet
another place you have to add every new config option. So my plan is
to get rid of it again once the refactorings I'm planning are
finished.
Or rather, I'll get rid of _that_ part of its functionality. I also
suspect I'll have added new kinds of consistency check by then, which
won't be a liability in the same way, and which I'll want to keep.
This can now happen if, for instance, the CLMUL implementation of
aesgcm is compiled in, but not available at runtime because we're on
an old Intel CPU.
In this situation, testcrypt would segfault when driven by
test/cryptsuite.py, and test/list-accel.py would erroneously claim the
CLMUL implementation was available when it wasn't.
I saw a post on comp.security.ssh just now where someone had
encountered an SSH server that would _only_ speak that, which makes it
worth bothering to implement.
The totally obvious implementation works, and passes the test cases
from RFC 6234.
(cherry picked from commit b77e985513)
I saw a post on comp.security.ssh just now where someone had
encountered an SSH server that would _only_ speak that, which makes it
worth bothering to implement.
The totally obvious implementation works, and passes the test cases
from RFC 6234.
These seem likely to carry on being useful, so let's make sure they
pass before allowing any build to complete successfully. I've added
code to both test programs to return a sensible exit status indicating
pass/fail, and added runs of both to Buildscr.
Another bug turned up by writing tests. The code that spots that the
character won't fit, and wraps it to the next line setting
LATTR_WRAPPED2, was not checking that wrap mode was _enabled_ before
doing that. So if you printed a DW character in the rightmost column
while the terminal was in non-auto-wrap mode, you'd get an unwanted
wrap.
Other terminals disagree on what to do here. xterm leaves the cursor
in the same place and doesn't print any character at all.
gnome-terminal, on the other hand, backspaces by a character so that
it _can_ print the requested DW character, in the rightmost _two_
columns.
I think I don't much like either of those, so instead I'm using the
same fallback we use for displaying a DW character when the whole
terminal is only one column wide: if there is physically no room to
print the requested character, turn it into U+FFFD REPLACEMENT
CHARACTER.
This is the first bug found as a direct result of writing that
terminal test program - I added some tests for things I expected to
work already, and some of them didn't, proving immediately that it was
a good idea!
If the terminal is one column wide, and you've printed a
character (hence, set the wrapnext flag), what should backspace do?
Surely it should behave like any other backspace with wrapnext set,
i.e. clear the wrapnext flag, returning the cursor's _logical_
position to the location of the most recently printed character. But
in fact it was anti-wrapping to the previous line, because I'd got the
cases in the wrong order in the if-else chain that forms the backspace
handler. So the handler for 'we're in column 0, wrapping time' was
coming before 'wrapnext is set, just clear it'.
Now wrapnext is checked _first_, before checking anything at all. Any
time we can just clear that, we should.
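A minimal model of the corrected ordering (field names simplified, not
the real handler):

    #include <stdbool.h>

    struct bs_model { int curs_x, curs_y, cols; bool wrapnext; };

    static void backspace(struct bs_model *t)
    {
        if (t->wrapnext) {
            t->wrapnext = false;    /* just cancel the deferred wrap */
        } else if (t->curs_x == 0) {
            if (t->curs_y > 0) {    /* anti-wrap to the previous line */
                t->curs_y--;
                t->curs_x = t->cols - 1;
            }
        } else {
            t->curs_x--;
        }
    }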
Suppose an application tries to print a double-width character
starting in the rightmost column of the screen, so that we apply our
emergency fix of wrapping to the next line immediately and printing
the character in the first two columns. Suppose they then backspace
twice, taking the cursor to the RHS and then the LHS of that
character. What should happen if they backspace a third time?
Our previous behaviour was to completely ignore the unusual situation,
and do the same thing we'd do in any other backspace from column 0:
anti-wrap the cursor to the last column of the previous line, leaving
it in the empty character cell that was skipped when the DW char
couldn't be printed in it.
But I think this isn't the best response, because it breaks the
invariant that printing N columns' worth of graphic characters and
then backspacing N times should leave the cursor on the first of those
characters. If I print "a가" (for example) and then backspace three
times, I want the cursor on the a, _even_ if weird line wrapping
behaviour happened somewhere in that sequence.
(Rationale: this helps naïve terminal applications which don't even
know what the terminal width is, and aren't tracking their absolute x
position. In particular, the simplistic line-based input systems that
appear in OS kernels and our own lineedit.c will want to emit a fixed
number of backspace-space-backspace sequences to delete characters
previously entered on to the line by the user. They still need to
check the wcwidth of the characters they're emitting, so that they can
BSB twice for a DW character or 0 times for a combining one, but it
would be *hugely* more awkward for them to ask the terminal where the
cursor is so that they can take account of difficult line wraps!)
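The pattern those clients use, as a standalone sketch:

    #define _XOPEN_SOURCE 700
    #include <stdio.h>
    #include <wchar.h>

    /* Erase the most recently echoed character by emitting one
     * backspace-space-backspace per terminal cell it occupied. */
    static void erase_last_char(wchar_t ch)
    {
        int cells = wcwidth(ch);   /* 2 for DW, 0 for combining */
        if (cells < 0)
            cells = 0;             /* non-printing: nothing to erase */
        for (int i = 0; i < cells; i++)
            fputs("\b \b", stdout);
    }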
We already have the ability to _recognise_ this situation: on a line
that was wrapped in this unusual way, we set the LATTR_WRAPPED2 line
attribute flag, to prevent the empty rightmost column from injecting
an unwanted space into copy-pastes from the terminal. Now we also use
the same flag to cause the backspace control character to do something
interesting.
This was the fix that inspired me to start writing test_terminal,
because I knew it was touching a delicate area. However, in the course
of writing this fix and its tests, I encountered two (!) further bugs,
which I'll fix in followup commits!
This has all the basic necessities to become a test of the terminal's
behaviour, in terms of how its data structures evolve as output is
sent to it, and perhaps also (by filling in the stub TermWin more
usefully) testing what it draws during updates and what it sends in
response to query sequences.
For the moment, all I've done is to set up the framework, and add one
demo test of printing some ordinary text and observing that it appears
in the data structures and the cursor has moved.
I expect that writing a full test of terminal.c will be a very big
job. But perhaps I or someone else will find time to prod it gradually
in the background of other work. In particular, when I'm _modifying_
any part of the terminal code, it would be good to add some tests for
the part I'm changing, before making the change, and check they still
work afterwards.
This takes over from both the implementation in ldisc.c and the one in
term_get_userpass_input, which were imperfectly duplicating each
other's functionality. The new version should be more consistent
between the two already, and also, it means further improvements can
now be made in just one place.
In the course of this, I've restructured the inside of ldisc.c by
moving the input_queue bufchain to the other side of the translation
code in ldisc_send. Previously, ldisc_send received a string, an
optional 'dedicated key' indication (bodgily signalled by a negative
length) and an 'interactive' flag, translated that somehow into a
combination of raw backend output and specials, and saved the latter
in input_queue. Now it saves the original (string, dedicated flag,
interactive flag) data in input_queue, and doesn't do the translation
until the data is pulled back _out_ of the queue. That's because the
new line editing system expects to receive something much closer to
the original data format.
The term_get_userpass_input system is also substantially restructured.
Instead of ldisc.c handing each individual keystroke to terminal.c so
that it can do line editing on it, terminal.c now just gives the Ldisc
a pointer to its instance of the new TermLineEditor object - and then
ldisc.c can put keystrokes straight into that, in the same way it
would put them into its own TermLineEditor, without having to go via
terminal.c at all. So the term_get_userpass_input edifice is only
called back when the line editor actually delivers the answer to a
username or password prompt.
(I considered not _even_ having a separate TermLineEditor for password
prompts, and just letting ldisc.c use its own. But the problem is that
some of the behaviour differences between the two line editors are
deliberate, for example the use of ^D to signal 'abort this prompt',
and the use of Escape as an alternative line-clearing command. So
TermLineEditor has a flags word that allows ldisc and terminal to set
it up differently. Also this lets me give the two TermLineEditors a
different vtable of callback functions, which is a convenient way for
terminal.c to get notified when a prompt has been answered.)
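Purely as an illustration of that shape (all of these names are
invented, not the real TermLineEditor API):

    #include <stddef.h>

    /* A flags word lets ldisc.c and terminal.c set their editors up
     * differently, and a small vtable of callbacks tells the owner
     * when a line has been completed or a prompt aborted. */
    typedef struct LineEditorCallbacks {
        void (*line_complete)(void *ctx, const char *utf8, size_t len);
        void (*prompt_aborted)(void *ctx);    /* e.g. ^D at a prompt */
    } LineEditorCallbacks;

    #define LE_CTRL_D_ABORTS   0x1    /* hypothetical flag bits */
    #define LE_ESC_CLEARS_LINE 0x2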
The new line editor still passes all the tests I wrote for the old
one. But it already has a couple of important improvements, both in
the area of UTF-8 handling:
Firstly, when we display a UTF-8 character on the terminal, we check
with the terminal how many character cells it occupied, and then if
the user deletes it again from the editing buffer, we can emit the
right number of backspace-space-backspace sequences. (The old ldisc
line editor incorrectly assumed all Unicode characters had terminal
width 1, partly because its buffer was byte- rather than character-
oriented and so it was more than enough work just finding where the
character _start_ was.)
Secondly, terminal.c's userpass line editor would never emit a byte in
the 80-BF range to the terminal at all, which meant that nontrivial
UTF-8 characters always came out as U+FFFD blobs!
I'm about to rewrite it completely, so the first thing I need to do is
to write tests for as much of the functionality as possible, so that I
can check the new implementation behaves in the same ways.
Similarly to the one I just added for FontSpec: in a cross-platform
main source file, you don't really want to mess about with
per-platform ifdefs just to initialise a 'struct unicode_data' from a
Conf. But until now, you had to, because init_ucs had a different
prototype on Windows and Unix.
I plan to use this in future test programs. But an immediate positive
effect is that it removes the only platform-dependent call from
fuzzterm.c. So now that could be built on Windows too, given only an
appropriate cmake stanza. (Not that I have much idea if it's useful to
fuzz the terminal separately on multiple platforms, but it's nice to
know that it's possible if anyone does need to.)
Constructing a FontSpec in platform-independent code is awkward,
because you can't call fontspec_new() outside the platform subdirs
(since its prototype varies per platform). But sometimes you just need
_some_ valid FontSpec, e.g. to put in a Conf that will be used in some
place where you don't actually care about font settings, such as a
purely CLI program.
Both Unix and Windows _have_ an idiom for this, but they're different,
because their FontSpec constructors have different prototypes. The
existing CLI tools have always had per-platform main source files, so
they just use the locally appropriate method of constructing a boring
don't-care FontSpec.
But if you want a _platform-independent_ main source file, such as you
might find in a test program, then that's rather awkward. Better to
have a platform-independent API for making a default FontSpec.
I just happened to notice ARG1 and ARGN in the code that builds the
dispatch table in process_line(), which aren't used at all, because
they date from a previous version of the testcrypt-func.h macro
system. They were supposed to be replaced everywhere with the unified
ARG.
So why didn't the missing definition of ARG break anything? Because
ARG only ever appears in the variadic part of a FUNC_INNER call - and
for this particular trawl of testcrypt-func.h, the variadic part isn't
ever used in the macro expansion in the first place. So there's no
need to define ARG and VOID to anything at all, not even the empty
string.
I only recently found out that OpenSSH defined their own protocol IDs
for AES-GCM, which work the same as the standard ones except that they
fix the semantics for how you select the linked cipher+MAC pair during
key exchange.
(RFC 5647 defines protocol IDs for AES-GCM in both the cipher and MAC
namespaces, and requires that you MUST select both or neither - but
this contradicts the selection policy set out in the base SSH RFCs,
and there's no discussion of how you resolve a conflict between them!
OpenSSH's answer is to do it the same way ChaCha20-Poly1305 works,
because that will ensure the two suites don't fight.)
People do occasionally ask us for this linked cipher/MAC pair, and now
I know it's actually feasible, I've implemented it, including a pair
of vector implementations for x86 and Arm using their respective
architecture extensions for multiplying polynomials over GF(2).
Unlike ChaCha20-Poly1305, I've kept the cipher and MAC implementations
in separate objects, with an arm's-length link between them that the
MAC uses when it needs to encrypt single cipher blocks to use as the
inputs to the MAC algorithm. That enables the cipher and the MAC to be
independently selected from their hardware-accelerated versions, just
in case someone runs on a system that has polynomial multiplication
instructions but not AES acceleration, or vice versa.
There's a fourth implementation of the GCM MAC, which is a pure
software implementation of the same algorithm used in the vectorised
versions. It's too slow to use live, but I've kept it in the code for
future testing needs, and because it's a convenient place to dump my
design comments.
The vectorised implementations are fairly crude as far as optimisation
goes. I'm sure serious x86 _or_ Arm optimisation engineers would look
at them and laugh. But GCM is a fast MAC compared to HMAC-SHA-256
(indeed compared to HMAC-anything-at-all), so it should at least be
good enough to use. And we've got a working version with some tests
now, so if someone else wants to improve them, they can.
This provides a convenient hook to be called between SSH messages, for
the crypto components to do any per-message processing like
incrementing a sequence number.
Not sure how I missed this! I tested ChaCha20, but not the MAC that
goes with it. Happily, it passes, so no harm done.
This also involved adding a general framework for testing MACs that
are tied to a specific cipher: we have to allocate, key and IV the
cipher before attempting to use the MAC, and free it all afterwards.
Instead of having separate subsidiary list macros for all the AES-NI
or NEON accelerated ciphers, the main list macro now contains each
individual thing conditionalised under an IF_FOO macro defined at the
top.
Makes relatively little difference in the current state of things, but
it will make it easier to do lots of differently conditionalised
single entries in a list, which will be coming up shortly.
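The shape in question, illustratively (macro and symbol names
invented):

    #if defined(HAVE_AESNI)
    #define IF_AESNI(...) __VA_ARGS__
    #else
    #define IF_AESNI(...)
    #endif

    #if defined(HAVE_NEON)
    #define IF_NEON(...) __VA_ARGS__
    #else
    #define IF_NEON(...)
    #endif

    #define CIPHER_LIST(X)          \
        X(aes256_sw)                \
        IF_AESNI(X(aes256_ni))      \
        IF_NEON(X(aes256_neon))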
This causes cmake to stop whinging that there isn't one. More
usefully, by specifying the LANGUAGES keyword as just C (rather than
the default of both C and CXX), the cmake configure step is sped up by
not having to faff about finding a C++ compiler.
The length test was pasted from the ordinary decrypt function, when it
should have been pasted from encrypt_length (which got this right).
I've never tried to test those functions before, so I never noticed.
OpenSSH, when called on to give the fingerprint of a certified public
key, will in many circumstances generate the hash of the public blob
of the _underlying_ key, rather than the hash of the full certificate.
I think the hash of the certificate is also potentially useful (if
nothing else, it provides a way to tell apart multiple certificates on
the same key). But I can also see that it's useful to be able to
recognise a key as the same one 'really' (since all certificates on
the same key share a private key, so they're unavoidably related).
So I've dealt with this by introducing an extra pair of fingerprint
types, giving the cross product of {MD5, SHA-256} x {base key only,
full certificate}. You can manually select which one you want to see
in some circumstances (notably PuTTYgen), and in others (such as
diagnostics) both fingerprints will be emitted side by side via the
new functions ssh2_double_fingerprint[_blob].
The default, following OpenSSH, is to just fingerprint the base key.
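Schematically, the cross product looks like this (identifier names
here are illustrative guesses, apart from ssh2_double_fingerprint
mentioned above):

    typedef enum FingerprintType {
        SSH_FPTYPE_MD5,           /* base key only */
        SSH_FPTYPE_SHA256,        /* base key only (the default) */
        SSH_FPTYPE_MD5_CERT,      /* full certificate */
        SSH_FPTYPE_SHA256_CERT,   /* full certificate */
    } FingerprintType;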
This patch fixes a few other whitespace and formatting issues which
were pointed out by the bulk-reindent or which I spotted in passing,
some involving manual editing to break lines more nicely.
I think the weirdest hunk in here is the one in windows/window.c
TranslateKey() where _half_ of an assignment statement inside an 'if'
was on the same line as the trailing paren of the if condition. No
idea at all how that one managed to happen!
My bulk indentation check also turned up a lot of cases where a run-on
function call or if statement didn't have its later lines aligned
correctly relative to the open paren.
I think this is quite easy to do by getting things out of
sync (editing the first line of the function call and forgetting to
update the rest, perhaps even because you never _saw_ the rest during
a search-replace). But a few didn't quite fit into that pattern, in
particular an outright misleading case in unix/askpass.c where the
second line of a call was aligned neatly below the _wrong_ one of the
open parens on the opening line.
Restored as many alignments as I could easily find.
In several pieces of development recently I've run across the
occasional code block in the middle of a function which suddenly
switched to 2-space indent from this code base's usual 4. I decided I
was tired of it, so I ran the whole code base through a re-indenter,
which made a huge mess, and then manually sifted out the changes that
actually made sense from that pass.
Indeed, this caught quite a few large sections with 2-space indent
level, a couple with 8, and a handful of even weirder things like 3
spaces or 12. This commit fixes them all.