putty-source

mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-09 17:38:00 +00:00

Author	SHA1	Message	Date
Simon Tatham	aab0892671	Side-channel tester: align memory allocations. While trying to get an upcoming piece of code through testsc, I had trouble - _yet again_ - with the way that control flow diverges inside the glibc implementations of functions like memcpy and memset, depending on the alignment of the input blocks _above_ the alignment guaranteed by malloc, so that doing the same sequence of malloc + memset can lead to different control flow. (I believe this is done either for cache performance reasons or SIMD alignment requirements, or both: on x86, some SIMD instructions require memory alignment beyond what malloc guarantees, which is also awkward for our x86 hardware crypto implementations.) My previous effort to normalise this problem out of sclog's log files worked by wrapping memset and all its synonyms that I could find. But this weekend, that failed for me, and the reason appears to be ifuncs. I'm aware of the great irony of committing code to a security project with a log message saying something vague about ifuncs, on the same weekend that it came to light that commits matching that description were one of the methods used to smuggle a backdoor into the XZ Utils project (CVE-2024-3094). So I'll bend over backwards to explain both what I think is going on, and why this _isn't_ a weird ifunc-related backdooring attempt: When I say I 'wrap' memset, I mean I use DynamoRIO's 'drwrap' API to arrange that the side-channel test rig calls a function of mine before and after each call to memset. The way drwrap works is to look up the symbol address in either the main program or a shared library; in this case, it's a shared library, namely libc.so. Then it intercepts call instructions with exactly that address as the target. Unfortunately, what _actually_ happens when the main program calls memset is more complicated. First, control goes to the PLT entry for memset (still in the main program). In principle, that loads a GOT entry containing the address of memset (filled in by ld.so), and jumps to it. But in fact the GOT entry varies its value through the program; on the first call, it points to a resolver function, whose job is to _find out_ the address of memset. And in the version of libc.so I'm currently running, that resolver is an STT_GNU_IFUNC indirection function, which tests the host CPU's capabilities, and chooses an actual implementation of memset depending on what it finds. (In my case, it looks as if it's picking one that makes extensive use of x86 SIMD.) To avoid the overhead of doing this on every call, the returned function pointer is then written into the main program's GOT entry for memset, overwriting the address of the resolver function, so that the _next_ call the main program makes through the same PLT entry will go directly to the memset variant that was chosen. And the problem is that, after this has happened, none of the new control flow ever goes near the _official_ address of memset, as read out of libc.so's dynamic symbol table by DynamoRIO. The PLT entry isn't at that address, and neither is the particular SIMD variant that the resolver ended up choosing. So now my wrapper on memset is never being invoked, and memset cheerfully generates different control flow in runs of my crypto code that testsc expects to be doing exactly the same thing as each other, and all my tests fail spuriously. My solution, at least for the moment, is to completely abandon the strategy of wrapping memset. Instead, let's just make it behave the same way every time, by forcing all the affected memory allocations to have extra-strict alignment. I found that 64-byte alignment is not good enough to eliminate memset-related test failures, but 128-byte alignment is. This would be tricky in itself, if it weren't for the fact that PuTTY already has its own wrapper function on malloc (for various reasons), which everything in our code already uses. So I can divert to C11's aligned_alloc() there. That in turn is done by adding a new #ifdef to utils/memory.c, and compiling it with that #ifdef into a new object library that is included in testsc, superseding the standard memory.o that would otherwise be pulled in from our 'utils' static library. With the previous memset-compensator removed, this means testsc is now dependent on having aligned_alloc() available. So we test for it at cmake time, and don't build testsc at all if it can't be found. This shouldn't bother anyone very much; aligned_alloc() is available on _my_ testsc platform, and if anyone else is trying to run this test suite at all, I expect it will be on something at least as new as that. (One awkward thing here is that we can only replace _new_ allocations with calls to aligned_alloc(): C11 provides no aligned version of realloc. Happily, this doesn't currently introduce any new problems in testsc. If it does, I might have to do something even more painful in future.) So, why isn't this an ifunc-related backdoor attempt? Because (and you can check all of this from the patch): 1. The memset-wrapping code exists entirely within the DynamoRIO plugin module that lives in test/sclog. That is not used in production, only for running the 'testsc' side-channel tester. 2. The memset-wrapping code is _removed_ by this patch, not added. 3. None of this code is dealing directly with ifuncs - only working around the unwanted effects on my test suite from the fact that they exist somewhere else and introduce awkward behaviour.	2024-04-01 13:10:49 +01:00
Simon Tatham	83ecb07296	sclog: add a 'project' line in CMakeLists.txt. This causes cmake to stop whinging that there isn't one. More usefully, by specifying the LANGUAGES keyword as just C (rather than the default of both C and CXX), the cmake configure step is sped up by not having to faff about finding a C++ compiler.	2022-08-16 18:23:52 +01:00
Simon Tatham	9cac27946a	Formatting: miscellaneous. This patch fixes a few other whitespace and formatting issues which were pointed out by the bulk-reindent or which I spotted in passing, some involving manual editing to break lines more nicely. I think the weirdest hunk in here is the one in windows/window.c TranslateKey() where _half_ of an assignment statement inside an 'if' was on the same line as the trailing paren of the if condition. No idea at all how that one managed to happen!	2022-08-03 20:48:46 +01:00
Simon Tatham	3a42a09dad	Formatting: normalise back to 4-space indentation. In several pieces of development recently I've run across the occasional code block in the middle of a function which suddenly switched to 2-space indent from this code base's usual 4. I decided I was tired of it, so I ran the whole code base through a re-indenter, which made a huge mess, and then manually sifted out the changes that actually made sense from that pass. Indeed, this caught quite a few large sections with 2-space indent level, a couple with 8, and a handful of even weirder things like 3 spaces or 12. This commit fixes them all.	2022-08-03 20:48:46 +01:00
Simon Tatham	1c78d18acb	sclog: wrap memmove. I had a testsc run fail because of alignment-dependent control flow divergence in a glibc function with 'memmove' in the name, which appears to have been an accident of different memory allocation between two runs of the test in question. sclog was already giving special handling to memset for the same reason, so it's no trouble to add memmove to the same list of functions that are treated as an opaque primitive for logging purposes.	2021-08-27 18:04:49 +01:00
Simon Tatham	04c50b6cfd	sclog: add missing instr_set_translation. When we invent a movzx instruction as part of shift-count logging on x86, we apparently need to set its 'translation' field to point at a pre-existing instruction that it's logically related to. Later versions of DynamoRIO than I was running with will complain if this isn't done.	2020-12-16 09:27:40 +00:00
Simon Tatham	7aca274789	sclog: log the size of allocated memory regions. This occurred to me recently as a (very small) hole in the logging strategy: if the size of an allocated memory block depended on some secret data, it certainly would change the control flow and memory access pattern inside malloc, but since we disable logging inside malloc, the log file from this test suite would never see the difference. Easily fixed by printing the size of each block in the code that intercepts malloc and realloc. As expected, no test actually fails as a result of filling in this gap.	2020-12-13 12:33:43 +00:00
Simon Tatham	e97a364d07	sclog: don't try to find libc functions outside libc. On AArch64, there are unexpectedly malloc and free functions in ld.so, so the module-load function finds them there, wraps them, and then misses the real versions in libc.	2020-11-26 18:04:49 +00:00
Simon Tatham	b3f2726b83	sclog: support AArch64 division and shift instructions. These need to be logged for the same reasons as on x86.	2020-11-26 18:04:49 +00:00
Simon Tatham	f65153ab5b	sclog: put x86-specific parts under ifdef. This allows my side-channel test system to at least _compile_ on other architectures without failing for the lack of OP_xxx enum constants, although it now won't log all the things it needs to be a proper test.	2020-11-26 17:52:11 +00:00
Simon Tatham	8d186c3c93	Formatting change to braces around one case of a switch. Sometimes, within a switch statement, you want to declare local variables specific to the handler for one particular case. Until now I've mostly been writing this in the form switch (discriminant) { case SIMPLE: do stuff; break; case COMPLICATED: { declare variables; do stuff; } break; } which is ugly because the two pieces of essentially similar code appear at different indent levels, and also inconvenient because you have less horizontal space available to write the complicated case handler in - particuarly undesirable because _complicated_ case handlers are the ones most likely to need all the space they can get! After encountering a rather nicer idiom in the LLVM source code, and after a bit of hackery this morning figuring out how to persuade Emacs's auto-indent to do what I wanted with it, I've decided to move to an idiom in which the open brace comes right after the case statement, and the code within it is indented the same as it would have been without the brace. Then the whole case handler (including the break) lives inside those braces, and you get something that looks more like this: switch (discriminant) { case SIMPLE: do stuff; break; case COMPLICATED: { declare variables; do stuff; break; } } This commit is a big-bang change that reformats all the complicated case handlers I could find into the new layout. This is particularly nice in the Pageant main function, in which almost _every_ case handler had a bundle of variables and was long and complicated. (In fact that's what motivated me to get round to this.) Some of the innermost parts of the terminal escape-sequence handling are also breathing a bit easier now the horizontal pressure on them is relieved. (Also, in a few cases, I was able to remove the extra braces completely, because the only variable local to the case handler was a loop variable which our new C99 policy allows me to move into the initialiser clause of its for statement.) Viewed with whitespace ignored, this is not too disruptive a change. Downstream patches that conflict with it may need to be reapplied using --ignore-whitespace or similar.	2020-02-16 11:26:21 +00:00
Pavel I. Kryukov	bac0a4dba7	sclog.c: print 'stores' for memory stores	2019-12-15 20:24:04 +00:00
Pavel I. Kryukov	7dd0351254	Wrap few more functions in side channel attack tests [SGT's note: Pavel Kryukov says he came across these while setting up CI runs of testsc. Two of the three guarded functions are obvious variants on ones we're already guarding; I hadn't heard of cfree before, but apparently it's an alternate form of ordinary free() from an obsolete standard, whose glibc man page says 'never use it'.]	2019-12-14 17:18:35 +00:00
Simon Tatham	83db341e8a	New test system to detect side channels in crypto code. All the work I've put in in the last few months to eliminate timing and cache side channels from PuTTY's mp_int and cipher implementations has been on a seat-of-the-pants basis: just thinking very hard about what kinds of language construction I think would be safe to use, and trying not to absentmindedly leave a conditional branch or a cast to bool somewhere vital. Now I've got a test suite! The basic idea is that you run the same crypto primitive multiple times, with inputs differing only in ways that are supposed to avoid being leaked by timing or leaving evidence in the cache; then you instrument the code so that it logs all the control flow, memory access and a couple of other relevant things in each of those runs, and finally, compare the logs and expect them to be identical. The instrumentation is done using DynamoRIO, which I found to be well suited to this kind of work: it lets you define custom modifications of the code in a reasonably low-effort way, and it lets you work at both the low level of examining single instructions _and_ the higher level of the function call ABI (so you can give things like malloc special treatment, not to mention intercepting communications from the program being instrumented). Build instructions are all in the comment at the top of testsc.c. At present, I've found this test to give a 100% pass rate using gcc -O0 and -O3 (Ubuntu 18.10). With clang, there are a couple of failures, which I'll fix in the next commit.	2019-02-10 13:09:53 +00:00

14 Commits