mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-25 09:12:24 +00:00
putty-source/unix/uxutils.c

#include "putty.h"
#include "ssh.h"
#include "uxutils.h"
#if defined __arm__ || defined __aarch64__

bool platform_aes_hw_available(void)
{
#if defined HWCAP_AES
    return getauxval(AT_HWCAP) & HWCAP_AES;
#elif defined HWCAP2_AES
    return getauxval(AT_HWCAP2) & HWCAP2_AES;
#elif defined __APPLE__
    /* M1 macOS defines no optional sysctl flag indicating presence of
     * the AES extension, which I assume to be because it's always
     * present */
    return true;
#else
    return false;
#endif
}

bool platform_sha256_hw_available(void)
{
#if defined HWCAP_SHA2
    return getauxval(AT_HWCAP) & HWCAP_SHA2;
#elif defined HWCAP2_SHA2
    return getauxval(AT_HWCAP2) & HWCAP2_SHA2;
#elif defined __APPLE__
    /* Assume always present on M1 macOS, similarly to AES */
    return true;
#else
    return false;
#endif
}

bool platform_sha1_hw_available(void)
{
#if defined HWCAP_SHA1
    return getauxval(AT_HWCAP) & HWCAP_SHA1;
#elif defined HWCAP2_SHA1
    return getauxval(AT_HWCAP2) & HWCAP2_SHA1;
#elif defined __APPLE__
    /* Assume always present on M1 macOS, similarly to AES */
    return true;
#else
    return false;
#endif
}

Hardware-accelerated SHA-512 on the Arm architecture.

The NEON support for SHA-512 acceleration looks very like SHA-256, with a pair of chained instructions to generate a 128-bit vector register full of message schedule, and another pair to update the hash state based on those. But since SHA-512 is twice as big in all dimensions, those four instructions between them only account for two rounds of it, in place of four rounds of SHA-256.

Also, it's a tighter squeeze to fit all the data needed by those instructions into their limited number of register operands. The NEON SHA-256 implementation was able to keep its hash state and message schedule stored as 128-bit vectors and then pass combinations of those vectors directly to the instructions that did the work; for SHA-512, in several places you have to make one of the input operands to the main instruction by combining two halves of different vectors from your existing state. But that operation is a quick single EXT instruction, so no trouble.

The only other problem I've found is that clang - in particular the version on M1 macOS, but as far as I can tell, even on current trunk - doesn't seem to implement the NEON intrinsics for the SHA-512 extension. So I had to bodge my own versions with inline assembler in order to get my implementation to compile under clang. Hopefully at some point in the future the gap might be filled and I can relegate that to a backwards-compatibility hack!

This commit adds the same kind of switching mechanism for SHA-512 that we already had for SHA-256, SHA-1 and AES, and as with all of those, plumbs it through to testcrypt so that you can explicitly ask for the hardware or software version of SHA-512. So the test suite can run the standard test vectors against both implementations in turn.

On M1 macOS, I'm testing at run time for the presence of SHA-512 by checking a sysctl setting. You can perform the same test on the command line by running "sysctl hw.optional.armv8_2_sha512".

As far as I can tell, on Windows there is not yet any flag to test for this CPU feature, so for the moment, the new accelerated SHA-512 is turned off unconditionally on Windows.

2020-12-24 11:40:15 +00:00

bool platform_sha512_hw_available(void)
{
#if defined HWCAP_SHA512
    return getauxval(AT_HWCAP) & HWCAP_SHA512;
#elif defined HWCAP2_SHA512
    return getauxval(AT_HWCAP2) & HWCAP2_SHA512;
#elif defined __APPLE__
    return test_sysctl_flag("hw.optional.armv8_2_sha512");
#else
    return false;
#endif
}

#endif /* defined __arm__ || defined __aarch64__ */
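
The run-time sysctl check used for SHA-512 on macOS relies on test_sysctl_flag, whose definition is not shown in this file. A minimal sketch of how such a check might look follows, assuming the flag is exposed as an int-sized sysctl value. The function name sketch_sysctl_flag is hypothetical and chosen for illustration only; it is not claimed to match the real helper.

```c
#include <stdbool.h>
#include <stddef.h>

#if defined __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper, named for this sketch only. Returns true if the
 * named sysctl exists, is int-sized (an assumption), and is nonzero. */
bool sketch_sysctl_flag(const char *flagname)
{
#if defined __APPLE__
    int value = 0;
    size_t size = sizeof(value);
    /* sysctlbyname returns 0 on success and writes the value and its
     * size back through the pointers; a missing name makes it fail */
    return sysctlbyname(flagname, &value, &size, NULL, 0) == 0 &&
           size == sizeof(value) && value != 0;
#else
    /* No sysctl-by-name query on other platforms in this sketch */
    (void)flagname;
    return false;
#endif
}
```

Under these assumptions, sketch_sysctl_flag("hw.optional.armv8_2_sha512") would report the same thing as the command-line sysctl query mentioned in the commit message, and an unknown flag name yields false rather than an error.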