1
0
mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-25 01:02:24 +00:00
Commit Graph

6287 Commits

Author SHA1 Message Date
Simon Tatham
c61158aa34 Add an IV argument to aes_{en,de}crypt_pubkey.
No functional change: currently, the IV passed in is always zero
(except in the test suite). But this prepares to change that in a
future revision of the key file format.
2021-02-20 16:49:52 +00:00
Simon Tatham
609502b04b Add utility function 'memxor'. 2021-02-20 16:49:52 +00:00
Simon Tatham
09fa3f0e80 testcrypt: ability to handle more function parameters.
I'm about to add a function of particularly large arity.
2021-02-20 16:49:52 +00:00
Simon Tatham
29b8e26f7d Fix duplicated exit code in ppk_load_s.
There's no need to have one bunch of free operations before returning
success and another version before returning error. Easier to just set
up the state in the former case so that we can fall through to the
latter.
2021-02-20 16:40:51 +00:00
Simon Tatham
ce60ca727c Correct documentation of PPK key derivation.
When I transcribed the code into this document, I misread 'put_data'
as 'put_string' in several places, and documented SSH-style string
length headers that do not actually exist in the format.
2021-02-20 10:15:22 +00:00
Jacob Nevins
eda4ca6e65 Typos. 2021-02-16 11:16:09 +00:00
Simon Tatham
4eff9ab958 Add another missing window-border recalculation.
This one is triggered by the following sequence:
 - fill up the terminal window with text ('ls -l /dev' or similar)
 - Win+Right then Win+Up to snap to the top right quadrant
 - interactively drag away from the top right quadrant with the title
   bar, which returns the window to its pre-snap size.

After the snap, the window border will have been recomputed to take
account of the window size not being an integer number of character
cells. So it needs recomputing back again the next time the window
size changes to something that _is_ an integer number - which happens
(or rather, we process it in a deferred manner) at the EXITSIZEMOVE.
So that's where we need to recompute the border (again).
2021-02-15 19:48:10 +00:00
Simon Tatham
147adf4e76 Move PPK format documentation into a manual appendix.
Somebody on comp.security.ssh asked about it recently, and I decided
that storing it in a comment in the key file was not really good
enough. Also, that comment was incomplete (it listed the private key
formats for RSA and DSA but not any of the newer ECC key types, simple
as their private-key formats may be).
2021-02-15 18:48:34 +00:00
Simon Tatham
83b07a5c67 Fix downright weird line breaking.
How on earth did that get there?!
2021-02-13 10:11:30 +00:00
Simon Tatham
767a3a0fd7 Add more _MSC_VER translations.
I just re-checked the official list page and found that VS2019 has
grown another five minor versions since commit 0a4e068ada.
2021-02-13 08:32:41 +00:00
Simon Tatham
438c980cf1 Move CSET_OEMCP and CSET_ACP into Unicode surrogate space.
Each of these #defines represents a block of 256 code values that are
used, internally to the terminal code, to indicate that a character is
in one of the currently selected single-byte character sets. One
effect of this is that reconfiguring the character set in mid-session
causes all the text already on screen to be redrawn.

Unfortunately, those 512 code points were allocated at 0xF000-0xF1FF,
which is inside the Unicode private-use area. So if a font uses that
area to define actually useful glyphs, then those glyphs won't be
displayed correctly by PuTTY; instead, outputting 0xF000+'A' (for
example) will display as 'A'. A user recently reported this problem
with the 'Hack' font from https://github.com/ryanoasis/nerd-fonts .

RDB's comment next to the #defines suggested that this was done on
purpose for consistency with Linux (though it's not clear what part of
Linux; perhaps the virtual console driver used to work this way?). But
now it's getting in the way of actually useful Unicode characters,
that consistency doesn't seem like the most important thing. (Also, it
just seems wrong to me that you even _can_ cause PuTTY's terminal
emulator to use these special internal character representations by
sending legal UTF-8.)

So I've moved this block of 512 characters to 0xDC00, which is in the
Unicode surrogate space, and hence can't be stored in the terminal by
sending UTF-8 at all (since our UTF-8 decoder rejects that range, as
per spec). That's where we were already keeping other magic blocks
like CSET_LINEDRW and CSET_SCOACS, and there's still room for two
more.

The net effect should be that in Windows PuTTY, U+F000 to U+F1FF are
now displayed as whatever your font wants to show.
2021-02-12 18:24:23 +00:00
Simon Tatham
e7771a21d9 Fix missing recalculation of window border.
If you open a Windows PuTTY session and press Win+Right, Windows
auto-sizes the terminal window to cover the right-hand half of the
screen. Then if you press Win+Up it will be auto-sized again, this
time to the top right quadrant. In the second resize (if you don't
have font-based resize handling turned on), the WM_SIZE handler code
will find a path through the twisty maze of ifs on which the border
between the text and the client-area edges is not recomputed, or
invalidated, or redrawn. So you can end up with half a line of text
from the previous window size still visible at the bottom of the new
window.

Fixed by factoring out the offset-recomputation code from the large
and complicated reset_window(), so that I can call just that snippet
on the dangerous code path.
2021-02-12 18:11:35 +00:00
Simon Tatham
6bf3b49471 noprint.c: fix compile failure.
This file isn't used by any current in-tree PuTTY tool, so I didn't
notice the missing const in printer_job_data's arguments.
2021-02-10 21:07:57 +00:00
Simon Tatham
8af1d90dca Test program for ancillary window updates.
I've just done a major rewrite of code structure and update policy for
most of the TermWin window-modification methods, and I wrote this test
program in the process to check that old and new versions of the
terminal still respond to all these escape sequences in the same way.
It's quite likely to come in useful again, so I'll commit it.
2021-02-07 19:59:21 +00:00
Simon Tatham
07d334c61d Windows: make the need_backend_resize mechanism consistent.
There were three separate clauses in the WM_SIZE message handler which
potentially called term_size() to resize the actual Terminal object.
Two of them (for maximisation and normal non-maximised resizing drags)
first checked if an interactive resize was in progress, and if so,
instead set the need_backend_resize, to defer the term_size call to
the end of the interactive operation. But the third, for
_un_-maximising a window, didn't have that check.

As a result, if you start with a maximised window, drag its title bar
downward from the top of the screen (which unmaximises it), and
without letting go, drag it back up again (which maximises it), the
effect would be that you'd get one call to term_size in the middle of
the drag, and a second at the end. This isn't what I intended, and it
can also cause a redraw failure in full-screen applications on the
server (such as a terminal-based text editor - I reproduced this with
emacs), in which after the second term_size the terminal doesn't
manage to redraw itself.

Now I've pulled out the common logic that was in two of those three
pieces of code (and should have been in all three) into a subroutine
wm_size_resize_term, and arranged to call that in all three cases.
This fixes the inconsistency, and also fixes the emacs redraw problem
in the edge case I describe above.
2021-02-07 19:59:21 +00:00
Simon Tatham
334688db81 Apply UPDATE_DELAY in arrears, not in advance.
The original aim of the rate limit was to avoid having too many
updates per second. I implemented this by a deferment mechanism: when
any change occurs that makes the terminal want an update, it instead
sets a timer to go off after UPDATE_DELAY (1/50 second), and does the
update at the end of that interval.

Now it's done the other way round: if there has not been an update
within the last UPDATE_DELAY, then we can simply do an update _right
now_, in immediate response to whatever triggered it. And _then_ we
set a timer to track a cooldown period, within which any further
requests for updates will be deferred until the end of the cooldown.

This mechanism should still rate-limit updates, but now the latency in
normal interactive use should be lowered, because terminal updates in
response to keystrokes (which typically arrive separated by more than
UPDATE_DELAY) can now each be enacted as soon as possible after the
triggering keystroke.

This also reverses (in the common case) the slowdown of non-textual
window modifications introduced by the previous commit, in which lots
of them were brought under the umbrella of term_update and therefore
became subject to UPDATE_DELAY. Now they'll only be delayed in
conditions of high traffic, and not in interactive use.
2021-02-07 19:59:21 +00:00
Simon Tatham
d74308e90e Fold ancillary window changes into main redraw.
This fixes a long-standing inconsistency in updates to the terminal
window: redrawing of actual text was deferred for 1/50 second, but all
the other kinds of change the terminal can make to the window
(position, size, z-order, title, mouse pointer shape, scrollbar...)
were enacted immediately. In particular, this could mean that two
updates requested by the terminal output stream happened in reverse
order.

Now they're all done as part of term_update, which should mean that
things requested in the same chunk of terminal input happen at the
same time, or at the very least, not in reverse order compared to the
order the requests came in.

Also, the same timer-based UPDATE_DELAY mechanism that applies to the
text updates now applies to all the other window modifications, which
should prevent any of those from being the limiting factor to how fast
this terminal implementation can process input data (which is exactly
why I set up that system for the main text update).

This makes everything happen with a bit more latency, but I'm about to
reverse that in a follow-up commit.
2021-02-07 19:59:21 +00:00
Simon Tatham
99dfc66457 Decouple frontend's raw mouse mode from pointer shape.
This paves the way for a followup commit that will make them happen at
slightly different times.
2021-02-07 19:59:21 +00:00
Simon Tatham
07aff63e22 Centralise check of CONF_no_mouse_rep into Terminal.
This removes code duplication between the front ends: now the terminal
itself knows when the Conf is asking it not to turn on mouse
reporting, and the front ends can assume that if the terminal asks
them to then they should just do it.

This also makes the behaviour on mid-session reconfiguration more
sensible, in both code organisation and consistent behaviour.
Previously, term_reconfig would detect that CONF_no_mouse_rep had been
*set* in mid-session, and turn off mouse reporting mode in response.
But it would do it by clearing term->xterm_mouse, which isn't how the
front end enabled and disabled that feature, so things could get into
different states from different sequences of events that should have
ended up in the same place.

Also, the terminal wouldn't re-enable mouse reporting if
CONF_no_mouse_rep was *cleared* and the currently running terminal app
had been asking for mouse reports all along. Also, it was silly to
have half the CONF_no_mouse_rep handling in term_reconfig and the
other half in the front ends.

Now it should all be sensible, and also all centralised.
term->xterm_mouse consistently tracks whether the terminal application
is _requesting_ mouse reports; term->xterm_mouse_forbidden tracks
whether the client user is vetoing them; every change to either one of
those settings triggers a call to term_update_raw_mouse_mode which
sets up the front end appropriately for the current combination.
2021-02-07 19:59:21 +00:00
Simon Tatham
696550a5f2 Flip direction of window pos/size queries.
Similarly to other recent changes, the frontend now proactively keeps
Terminal up to date with the current position and size of the terminal
window, so that escape-sequence queries can be answered immediately
from the Terminal's own internal data structures without needing a
call back to the frontend.

Mostly this has let me remove explicit window-system API calls that
retrieve the window position and size, in favour of having the front
ends listen for WM_MOVE / WM_SIZE / ConfigureNotify events and track
the position and size that way. One exception is that the window pixel
size is still requested by Seat via a callback, to put in the
wire-encoded termios settings. That won't be happening very much, so
I'm leaving it this way round for the moment.
2021-02-07 19:59:21 +00:00
Simon Tatham
ca9cd983e1 Centralise palette setup into terminal.c.
Now terminal.c makes nearly all the decisions about what the colour
palette should actually contain: it does the job of reading the
GUI-configurable colours out of Conf, and also the job of making up
the rest of the xterm-256 palette. The only exception is that TermWin
can provide a method to override some of the default colours, which on
Windows is used to implement the 'Use system colours' config option.

This saves code overall, partly because the front ends don't have to
be able to send palette data back to the Terminal any more (the
Terminal keeps the master copy and can answer palette-query escape
sequences from its own knowledge), and also because now there's only
one copy of the xterm-256 palette setup code (previously gtkwin.c and
window.c each had their own version of it).

In this rewrite, I've also introduced a multi-layered storage system
for the palette data in Terminal. One layer contains the palette
information derived from Conf; the next contains platform overrides
(currently just Windows's 'Use system colours'); the last one contains
overrides set by escape sequences in the middle of the session. The
topmost two layers can each _conditionally_ override the ones below.
As a result, if a server-side application manually resets (say) the
default fg and bg colours in mid-session to something that works well
in a particular application, those changes won't be wiped out by a
change in the Windows system colours or the Conf, which they would
have been before. Instead, changes in Conf or the system colours alter
the lower layers of the structure, but then when palette_rebuild is
called, the upper layer continues to override them, until a palette
reset (ESC]R) or terminal reset (e.g. ESC c) removes those upper-layer
changes. This seems like a more consistent strategy, in that the same
set of configuration settings will produce the same end result
regardless of what order they were applied in.

The palette-related methods in TermWin have had a total rework.
palette_get and palette_reset are both gone; palette_set can now set a
contiguous range of colours in one go; and the new
palette_get_overrides replaces window.c's old systopalette().
2021-02-07 19:59:21 +00:00
Simon Tatham
cd32ef8733 Indentation fixes in terminal.c.
While working on the palette code I noticed several sections where a
2-space indent policy was used. Let's keep things consistent.

This patch is whitespace-only, with the single exception that I've
removed a pair of braces where a for statement contains only a switch
(and, cheatingly, put the for and switch on the same indent level so
that the interior of it doesn't go _too_ far off to the right).
2021-02-07 19:59:20 +00:00
Simon Tatham
da3197f395 Bring some order to colour palette indexing.
There are three separate indexing schemes in use by various bits of
the PuTTY front ends, and _none_ of them was clearly documented, let
alone all in the same place. Worse, functions that looked obviously
related, like win_palette_set and win_palette_get, used different
encodings.

Now all the encodings are defined together in putty.h, with
explanation of why there are three in the first place and clear
documentation of where each one is used; terminal.c provides mapping
tables that convert between them; the terminology is consistent
throughout; and win_palette_set has been converted to use the sensible
encoding.
2021-02-07 19:59:20 +00:00
Simon Tatham
61571376cc Remove TermWin's is_minimised method.
Again, I've replaced it with a push-based notification going in the
other direction, so that when the terminal output stream includes a
query for 'is the window minimised?', the Terminal doesn't have to
consult the TermWin, because it already knows the answer.

The GTK API I'm using here (getting a GdkEventWindowState via
GtkWidget's window-state-event) is not present in GTK 1. The API I was
previously using (gdk_window_is_viewable) _is_, but it turns out that
that API doesn't reliably give the right answer: it only checks
visibility of GDK window ancestors, not X window ancestors. So in fact
GTK 1 PuTTY/pterm was only ever _pretending_ to reliably support the
'am I minimised' terminal query. Now it won't pretend any more.
2021-02-07 19:59:20 +00:00
Simon Tatham
42ad454f4f Move all window-title management into Terminal.
Previously, window title management happened in a bipartisan sort of
way: front ends would choose their initial window title once they knew
what host name they were connecting to, but then Terminal would
override that later if the server set the window title by escape
sequences.

Now it's all done the same way round: the Terminal object is always
where titles are invented, and they only propagate in one direction,
from the Terminal to the TermWin.

This allows us to avoid duplicating in multiple front ends the logic
for what the initial window title should be. The frontend just has to
make one initial call to term_setup_window_titles, to tell the
terminal what hostname should go in the default title (if the Conf
doesn't override even that). Thereafter, all it has to do is respond
to the TermWin title-setting methods.

Similarly, the logic that handles window-title changes as a result of
the Change Settings dialog is also centralised into terminal.c. This
involved introducing an extra term_pre_reconfig() call that each
frontend can call to modify the Conf that will be used for the GUI
configurer; that's where the code now lives that copies the current
window title into there. (This also means that GTK PuTTY now behaves
consistently with Windows PuTTY on that point; GTK's previous
behaviour was less well thought out.)

It also means there's no longer any need for Terminal to talk to the
front end when a remote query wants to _find out_ the window title:
the Terminal knows the answer already. So TermWin's get_title method
can go.
2021-02-07 19:59:20 +00:00
Simon Tatham
45b03419fd Remove TermWin's is_utf8 method.
All implementations of it work by checking the line_codepage field in
the ucsdata structure that the terminal itself already has a pointer
to. Therefore, it's a totally unnecessary query function: the terminal
can check the same thing directly by inspecting that structure!

(In fact, it already _does_ do that, for the purpose of actually
deciding how to decode terminal output data. It only uses this query
function at all for the auxiliary purpose of inventing useful tty
modes to pass to the backend.)
2021-02-07 19:59:20 +00:00
Simon Tatham
b63a66cd2c Add a few missing 'static'. 2021-02-02 18:54:39 +00:00
Simon Tatham
af278ac870 Unix Plink: fix tight loop after EOF on stdin.
When Plink saw EOF on stdin, it would continue to put stdin in its
list of poll fds, so that the poll loop would always terminate
instantly with stdin readable. Plink would read from it, see EOF
again, go back to the poll loop, and keep spinning like that.

This was supposed to be fixed by the 'sending' flag, which was set to
false on seeing EOF to indicate that we were no longer interested in
reading stdin data to send to the SSH server. But that flag was
ineffective, because it turns out it was _always_ set to false -
nothing in the code ever set it to true! And the reason why that
didn't totally prevent reading from stdin at all is because it was
also tested with the wrong sense. How embarrassing.

Changed the flag name to 'seen_stdin_eof', and made it behave
sensibly.
2021-02-02 18:22:41 +00:00
Simon Tatham
9e1ec093fd testcrypt: fix fake class methods on MACs.
I had the wrong function name prefix in the method_prefixes array: the
MAC functions all begin with ssh2_mac_* instead of ssh_mac_*. As a
result, MAC objects in the Python testcrypt system didn't provide
OO-like methods such as m.update() and m.genresult(); instead you had
to say ssh2_mac_update(m, ...) and ssh2_mac_genresult(m).
2021-02-02 18:17:46 +00:00
Simon Tatham
d851df486f Fix build failure at -DNOT_X_WINDOWS.
I had been indecisive about whether the definitions and calls of
store_cutbuffer and retrieve_cutbuffer should be compiled out
completely in GTK-without-X mode, or whether the definitions should be
left in as stubs and the calls still present. retrieve_cutbuffer ended
up with a definition but no call in that mode.

It was only an unused-function warning, but -Werror promoted it to an
error. Fixed by making up my mind: now the functions are completely
absent, and so are the calls to them.
2021-01-26 18:12:48 +00:00
Jacob Nevins
aef7640bba Fix out-of-bounds access in Windows CLI tools.
Commit c6ff548ae0 introduced this when not re-using an existing shared
connection.
2021-01-26 12:41:26 +00:00
Simon Tatham
fb130bf6da Cleanup: add some calls to dupstr.
I just happened to spot a couple of cases where I'd apparently
open-coded the dupstr() logic before writing dupstr() itself, and
never got round to replacing the long-winded version with a call to
the standard helper function.
2021-01-21 19:57:38 +00:00
Simon Tatham
c6ff548ae0 wincliloop: cope with winselcli_event not existing.
I found recently that if I ran Windows PSCP as a connection-sharing
downstream, it would send the SSH greeting down the named pipe, but
never receive anything back, though the upstream PuTTY was sending it.
PuTTY and Plink from the same build of the code would act happily as
downstreams.

It turned out that this was because the WaitForMultipleObjects call in
cli_main_loop() in wincliloop.c was failing with ERROR_ACCESS_DENIED.
That happened because it had an INVALID_HANDLE_VALUE in its list of
objects to wait for. That in turn happened because winselcli_event was
set to INVALID_HANDLE_VALUE.

Why was winselcli_event not set up? Because it's set up lazily by
do_select(), so if the program isn't handling any network sockets at
all (which is the case when PSCP is speaking over a named pipe
instead), then it never gets made into a valid event object.

So the problem wasn't that winselcli_event was in a bad state; it was
quite legitimately invalid. The problem was that wincliloop ought to
have _coped_ with it being invalid, by not inserting it in its list of
objects to wait for.

So now we check that case, and only insert winselcli_event in the list
if it's valid. And PSCP works again over connection sharing.
2021-01-19 20:35:13 +00:00
Simon Tatham
20d1c47484 wcwidth: update wide[] array to Unicode 13.0.0. 2021-01-19 18:34:15 +00:00
Simon Tatham
6fc0eb29ac Clarify wording in the new traits section.
Revisiting it today I realised that I'd written 'implementation
structure' where I meant 'instance structure'.
2021-01-17 09:18:42 +00:00
Simon Tatham
f7adf7bca0 Fix a few 'triple letter in place of double' typos.
A user wrote in to point out the one in winhandl.c, and out of sheer
curiosity, I grepped the whole source base for '([a-zA-Z])\1\1' to see
if there were any others. Of course there are a lot of perfectly
sensible ones, like 'www' or 'Grrr', not to mention any amount of
0xFFFF and the iiii/bbbb emphasis system in Halibut code paragraphs,
but I did spot one more in the recently added udp.but section on
traits, and another in a variable name in uxagentsock.c.
2021-01-17 09:18:42 +00:00
Jacob Nevins
7c27cdc16e It's a new year.
(Supposedly.)
2021-01-11 21:37:51 +00:00
Sean Ho
7d086184f8 gtkwin: solved unused variable error
solved unused variable error when KEY_EVENT_DIAGNOSTICS defined but
DEBUG not defined

although we intend to always define DEBUG when KEY_EVENT_DIAGNOSTICS
is going to be defined.
2021-01-11 20:53:52 +00:00
Sean Ho
476e09832f gtkwin: let ctrl-key fix works without debug mode
let 8dfc39bf works when KEY_EVENT_DIAGNOSTICS is not defined
2021-01-11 20:48:11 +00:00
Simon Tatham
1bcab77eb1 Upgrade random_setup_special to use SHA-3.
The idea of the especially large RNG that we use in key generation is
that it should have as much actual entropy as possible. The reason I
based it on SHA-512 previously was that that was the hash function in
our collection with the largest output. But that's no longer true!
Among the SHA-3 family that I added for Ed448 purposes, we have a
ready-made variant of SHAKE-256 that outputs a whopping 114 bytes of
hash. I see no reason not to upgrade to that from SHA-512's 64 bytes.

(I could probably extend it even further by manually making another
SHA-3 variant specially for the purpose, but I don't know that it
would be worth it. This is a one-line change which I think is already
a positive step.)
2020-12-27 08:30:51 +00:00
Simon Tatham
79de16732a Document PuTTY's local idiom for OO / traits.
A user mentioned having found this confusing recently, and fair
enough, because it's done in a way that doesn't quite match the
built-in OO system of any language I know about. But after the
rewriting in recent years, I think pretty much everything in PuTTY
that has a system of interchangeable implementations of the same
abstract type is now done basically the same way, so this seems like a
good moment to document the idiom we use and explain all its ins and
outs.
2020-12-26 16:14:13 +00:00
Simon Tatham
1f8b3b5535 Update docs section about use of global variables.
It referred to the global variable 'flags' as an example. But 'flags'
was retired (and good riddance) nearly a year ago, in commit
4ea811a0bf. So we should be using a different example now!
2020-12-26 15:40:04 +00:00
Pavel I. Kryukov
875a887c8f Include <sys/sysctl.h> for Intel builds 2020-12-25 06:57:35 +00:00
Simon Tatham
d594df9803 Fix build failure on Intel Macs.
sysctlbyname() turns out to be a new library function, so we can't
assume it's present just because defined __APPLE__. Add an autoconf
check to see if it's really there, before trying to call it.
2020-12-24 20:45:28 +00:00
Simon Tatham
31cd5ee19b Fix buffer overflow in NEON SHA-384 output.
An obvious goof - in SHA-384, you don't want to write out the last of
the four state vectors! Fortunately I spotted it only a couple of
hours after introducing it.
2020-12-24 17:39:54 +00:00
Simon Tatham
456120cfac Rewrite MD5 implementation in my modern style.
MD5 is structurally very similar to all the SHA-1 and SHA-2 hashes
(with the main difference being that the message schedule consists of
just repeating the 32-bit words of the message four times in different
permutations, instead of transforming them via an LFSR-style process).
So it helps legibility and maintainability if all the implementations
of these hashes are coded in a similar style - for example, that way,
the next time I need to make a change to the ssh_hash API, I can do it
the same way in all these modules without having to think everything
out again.

After the SHA-512 rewrite earlier today, all the hashes in that family
had been updated to a consistent new style as a side effect of adding
optional hardware acceleration, except for MD5, because there's no
hardware-accelerated version of it. (And not much chance of anyone
ever needing one, I hope!)

So this is a purely stylistic update which reworks MD5 so that it
looks just like all the SHA-1 and SHA-2 hash implementations. No
functional change.
2020-12-24 17:30:23 +00:00
Simon Tatham
a9763ce4ed Hardware-accelerated SHA-512 on the Arm architecture.
The NEON support for SHA-512 acceleration looks very like SHA-256,
with a pair of chained instructions to generate a 128-bit vector
register full of message schedule, and another pair to update the hash
state based on those. But since SHA-512 is twice as big in all
dimensions, those four instructions between them only account for two
rounds of it, in place of four rounds of SHA-256.

Also, it's a tighter squeeze to fit all the data needed by those
instructions into their limited number of register operands. The NEON
SHA-256 implementation was able to keep its hash state and message
schedule stored as 128-bit vectors and then pass combinations of those
vectors directly to the instructions that did the work; for SHA-512,
in several places you have to make one of the input operands to the
main instruction by combining two halves of different vectors from
your existing state. But that operation is a quick single EXT
instruction, so no trouble.

The only other problem I've found is that clang - in particular the
version on M1 macOS, but as far as I can tell, even on current trunk -
doesn't seem to implement the NEON intrinsics for the SHA-512
extension. So I had to bodge my own versions with inline assembler in
order to get my implementation to compile under clang. Hopefully at
some point in the future the gap might be filled and I can relegate
that to a backwards-compatibility hack!

This commit adds the same kind of switching mechanism for SHA-512 that
we already had for SHA-256, SHA-1 and AES, and as with all of those,
plumbs it through to testcrypt so that you can explicitly ask for the
hardware or software version of SHA-512. So the test suite can run the
standard test vectors against both implementations in turn.

On M1 macOS, I'm testing at run time for the presence of SHA-512 by
checking a sysctl setting. You can perform the same test on the
command line by running "sysctl hw.optional.armv8_2_sha512".

As far as I can tell, on Windows there is not yet any flag to test for
this CPU feature, so for the moment, the new accelerated SHA-512 is
turned off unconditionally on Windows.
2020-12-24 15:39:54 +00:00
Simon Tatham
c6d921add5 Reorganise SHA-512 to match SHA-256.
This builds on the previous refactoring by reworking the SHA-512
vtables and block layer to look more like the SHA-256 version, in
which the block and padding structure is a subroutine of the top-level
vtable methods instead of an owning layer around them.

This also organises the code in a way that makes it easy to drop in
hardware-accelerated versions alongside it: the block layer and the
big arrays of constants are now nicely separate from the inner
block-transform part.
2020-12-24 15:20:45 +00:00
Simon Tatham
43cdc3d910 Tidy up arithmetic in the SHA-512 implementation.
It was written in an awkward roundabout way involving all the
arithmetic being done in horrible macros looking like assembler
instructions, and lots of explicit temp variables. That's because,
when I originally wrote it, I needed it to compile on platforms
without a 64-bit integer type.

In commit a647f2ba11 I switched it over to using uint64_t, but I did
it in a way that made minimal change to the code structure, by
rewriting the insides of those macros to contain ordinary uint64_t
arithmetic instead of faffing about with 32-bit halves. So it worked,
but it still looked disgusting.

Now I've reworked it so that individual arithmetic operations are
written directly in the sensible way, and the more complicated
SHA-specific operations are written as inline functions instead of
macros.
2020-12-24 15:20:16 +00:00
Simon Tatham
092c51afed uxutils.c: add special case for M1 macOS.
The M1 chip in the new range of Macs includes the crypto extension
that permits AES, SHA-1 and SHA-256 acceleration. But you can't find
that out by querying the ELF aux vector, because macOS isn't even
ELF-based at all, so there isn't an ELF aux vector, and no web search
I've tried has turned up any MachO thing obviously analogous to it.

Running 'sysctl -a' does show some flags indicating CPU architecture
extensions, but they're more advanced ones than this. So I think we
have to assume that if we're on the new M1 macOS at all, then we have
the basic crypto extension available.

Accordingly, I've added a special case to all the query functions that
simply returns true if defined __APPLE__.
2020-12-24 13:37:08 +00:00