gcc and clang both provide a type called __uint128_t when compiling
for 64-bit targets, code-generated more or less similarly to the way
64-bit long longs are handled on 32-bit targets (spanning two
registers, using ADD/ADC, that sort of thing). Where this is available
(and they also provide a handy macro to make it easy to detect), we
should obviously use it, so that we can handle bignums a larger chunk
at a time and make use of the full width of the hardware's multiplier.
Preliminary benchmarking using 'testbn' suggests an improvement of
about a factor of 2.5.

I've added the new possibility to the ifdefs in sshbn.h, and also
re-run contrib/make1305.py to generate a set of variants of the
poly1305 arithmetic for the new size of BignumInt.
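
As a rough illustration of the idea (not the actual sshbn.h code), the
sketch below shows a 64x64 -> 128-bit multiply-and-add that uses
__uint128_t when it is available and falls back to 32-bit halves
otherwise. It assumes that __SIZEOF_INT128__ is the detection macro
referred to above; the names wide_t, HAVE_WIDE_MUL and mul_add_64 are
hypothetical and exist only for this example.

```c
#include <stdint.h>

#if defined(__SIZEOF_INT128__)
/* Assumption: gcc and clang define this macro when the 128-bit type exists. */
typedef __uint128_t wide_t;           /* hypothetical name */
#define HAVE_WIDE_MUL 1
#else
#define HAVE_WIDE_MUL 0
#endif

/* Compute hi:lo = a * b + addend. The result always fits in 128 bits. */
static void mul_add_64(uint64_t a, uint64_t b, uint64_t addend,
                       uint64_t *hi, uint64_t *lo)
{
#if HAVE_WIDE_MUL
    /* One widening multiply uses the full width of the hardware multiplier. */
    wide_t t = (wide_t)a * b + addend;
    *lo = (uint64_t)t;
    *hi = (uint64_t)(t >> 64);
#else
    /* Portable fallback: build the product from four 32x32 -> 64 multiplies. */
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;
    uint64_t p00 = a0 * b0, p01 = a0 * b1, p10 = a1 * b0, p11 = a1 * b1;
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;
    uint64_t l = (mid << 32) | (uint32_t)p00;
    uint64_t h = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    l += addend;
    h += (l < addend);                /* propagate the carry from the add */
    *lo = l;
    *hi = h;
#endif
}
```

With the 128-bit path, each call does the work that would otherwise
take four narrower multiplies plus carry handling, which is consistent
with the kind of speedup reported above.
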
(cherry picked from commit f8b27925ee)

Conflicts:
	sshccp.c

Cherry-picker's notes: the conflict arose because the original commit
also added new 64-bit autogenerated forms of dedicated Poly1305
arithmetic, which doesn't exist on this branch.

This allows files other than sshbn.c to work with the primitives
necessary to build multi-word arithmetic functions satisfying all of
PuTTY's portability constraints.
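
To give a feel for what such primitives make possible, here is a
minimal sketch of a multi-word addition built on a single
add-with-carry macro. It is an illustration under assumptions, not the
code from the header: word_t, dword_t, WORD_BITS, WORD_ADC and
multiword_add are hypothetical names, and the word size is chosen with
the same kind of ifdef described for the 128-bit case.

```c
#include <stddef.h>
#include <stdint.h>

/* Pick a word type and a type twice as wide (hypothetical names). */
#if defined(__SIZEOF_INT128__)
typedef uint64_t word_t;
typedef __uint128_t dword_t;
#define WORD_BITS 64
#else
typedef uint32_t word_t;
typedef uint64_t dword_t;
#define WORD_BITS 32
#endif

/* Add with carry: ret = low word of a + b + cin, cout = carry out (0 or 1).
 * cin and cout may be the same variable, since cin is read first. */
#define WORD_ADC(ret, cout, a, b, cin) do {                 \
        dword_t adc_tmp = (dword_t)(a) + (b) + (cin);       \
        (ret) = (word_t)adc_tmp;                            \
        (cout) = (word_t)(adc_tmp >> WORD_BITS);            \
    } while (0)

/* r = a + b over n words, least significant word first.
 * Returns the carry out of the top word. */
static word_t multiword_add(word_t *r, const word_t *a,
                            const word_t *b, size_t n)
{
    word_t carry = 0;
    size_t i;
    for (i = 0; i < n; i++)
        WORD_ADC(r[i], carry, a[i], b[i], carry);
    return carry;
}
```

Because the caller only ever sees the macro, the same multi-word code
compiles unchanged whichever word size the ifdefs select.
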
(cherry picked from commit 2c60070aad)

Cherry-picker's notes: required on this branch because it's a
dependency of f8b27925ee which we want.