If the autoconf/ifdef system ends up taking the trivial branch through
all the Arm-architecture ifdefs, then we define the always-fail
version of getauxval as a 'static inline' function, and then (because
none of our desired HWCAP_FOO values is defined at all) never call it.
This leads to a compiler warning because we defined a static function
and never called it - i.e. at the default -Werror, a build failure.
Of course it's perfectly sensible to define a static inline function
that never gets called! Header files do it all the time, and nobody is
expected to ensure that if they include a header file then they take
care to refer to every static inline function it defines.
But if the definition is in the _source_ file rather than a header
file, then clang (in particular on macOS) will give a warning. So the
easy solution is to move the inline definitions of getauxval into a
header file, which suppresses the warning without requiring me to faff
about with further ifdefs to make the definitions conditional on at
least one use.
We use this for detecting the Arm crypto extension and using it to
enable accelerated AES and/or SHA-{1,2}. Previously, I had code that
called glibc's getauxval(3) function, conditioned on #ifdef __linux__.
Now, instead, I do an autoconf test to query the presence of getauxval
itself (so that any other system with the same API can still work),
and alongside it, also check for the analogous FreeBSD libc function
elf_aux_info(3). As a result, building on Arm FreeBSD now gets the
accelerated-crypto autodetection.
uClibc-ng does not provide <sys/auxv.h>, and a non-Linux-kernel-based
Unixlike system running on Arm will probably not provide
<asm/hwcap.h>. Now we check for both of those headers at autoconf
time, and if either one is absent, we don't do the runtime test for
Arm crypto acceleration.
This should only make a difference on systems where this module
previously failed to compile at all. But obviously it would be nicer
to find alternative ways to check for crypto acceleration on such
systems; patches welcome.
Similarly to my recent addition of NEON-accelerated AES, these new
implementations drop in alongside the SHA-NI ones, under a different
set of ifdefs. All the details of selection and detection are
essentially the same as they were for the AES code.
The refactored sshaes.c gives me a convenient slot to drop in a second
hardware-accelerated AES implementation, similar to the existing one
but using Arm NEON intrinsics in place of the x86 AES-NI ones.
This needed a minor structural change, because Arm systems are often
heterogeneous, containing more than one type of CPU which won't
necessarily all support the same set of architecture features. So you
can't test at run time for the presence of AES acceleration by
querying the CPU you're running on - even if you found a way to do it,
the answer wouldn't be reliable once the OS started migrating your
process between CPUs. Instead, you have to ask the OS itself, because
only that knows about _all_ the CPUs on the system. So that means the
aes_hw_available() mechanism has to extend a tentacle into each
platform subdirectory.
The trickiest part was the nest of ifdefs that tries to detect whether
the compiler can support the necessary parts. I had successful
test-compiles on several compilers, and was able to run the code
directly on an AArch64 tablet (so I know it passes cryptsuite), but
it's likely that at least some Arm platforms won't be able to build it
because of some path through the ifdefs that I haven't been able to
test yet.