mirror of
https://git.tartarus.org/simon/putty.git
synced 2025-01-09 17:38:00 +00:00
aab0892671
While trying to get an upcoming piece of code through testsc, I had trouble - _yet again_ - with the way that control flow diverges inside the glibc implementations of functions like memcpy and memset, depending on the alignment of the input blocks _above_ the alignment guaranteed by malloc, so that doing the same sequence of malloc + memset can lead to different control flow. (I believe this is done either for cache performance reasons or SIMD alignment requirements, or both: on x86, some SIMD instructions require memory alignment beyond what malloc guarantees, which is also awkward for our x86 hardware crypto implementations.) My previous effort to normalise this problem out of sclog's log files worked by wrapping memset and all its synonyms that I could find. But this weekend, that failed for me, and the reason appears to be ifuncs. I'm aware of the great irony of committing code to a security project with a log message saying something vague about ifuncs, on the same weekend that it came to light that commits matching that description were one of the methods used to smuggle a backdoor into the XZ Utils project (CVE-2024-3094). So I'll bend over backwards to explain both what I think is going on, and why this _isn't_ a weird ifunc-related backdooring attempt: When I say I 'wrap' memset, I mean I use DynamoRIO's 'drwrap' API to arrange that the side-channel test rig calls a function of mine before and after each call to memset. The way drwrap works is to look up the symbol address in either the main program or a shared library; in this case, it's a shared library, namely libc.so. Then it intercepts call instructions with exactly that address as the target. Unfortunately, what _actually_ happens when the main program calls memset is more complicated. First, control goes to the PLT entry for memset (still in the main program). In principle, that loads a GOT entry containing the address of memset (filled in by ld.so), and jumps to it. But in fact the GOT entry varies its value through the program; on the first call, it points to a resolver function, whose job is to _find out_ the address of memset. And in the version of libc.so I'm currently running, that resolver is an STT_GNU_IFUNC indirection function, which tests the host CPU's capabilities, and chooses an actual implementation of memset depending on what it finds. (In my case, it looks as if it's picking one that makes extensive use of x86 SIMD.) To avoid the overhead of doing this on every call, the returned function pointer is then written into the main program's GOT entry for memset, overwriting the address of the resolver function, so that the _next_ call the main program makes through the same PLT entry will go directly to the memset variant that was chosen. And the problem is that, after this has happened, none of the new control flow ever goes near the _official_ address of memset, as read out of libc.so's dynamic symbol table by DynamoRIO. The PLT entry isn't at that address, and neither is the particular SIMD variant that the resolver ended up choosing. So now my wrapper on memset is never being invoked, and memset cheerfully generates different control flow in runs of my crypto code that testsc expects to be doing exactly the same thing as each other, and all my tests fail spuriously. My solution, at least for the moment, is to completely abandon the strategy of wrapping memset. Instead, let's just make it behave the same way every time, by forcing all the affected memory allocations to have extra-strict alignment. I found that 64-byte alignment is not good enough to eliminate memset-related test failures, but 128-byte alignment is. This would be tricky in itself, if it weren't for the fact that PuTTY already has its own wrapper function on malloc (for various reasons), which everything in our code already uses. So I can divert to C11's aligned_alloc() there. That in turn is done by adding a new #ifdef to utils/memory.c, and compiling it with that #ifdef into a new object library that is included in testsc, superseding the standard memory.o that would otherwise be pulled in from our 'utils' static library. With the previous memset-compensator removed, this means testsc is now dependent on having aligned_alloc() available. So we test for it at cmake time, and don't build testsc at all if it can't be found. This shouldn't bother anyone very much; aligned_alloc() is available on _my_ testsc platform, and if anyone else is trying to run this test suite at all, I expect it will be on something at least as new as that. (One awkward thing here is that we can only replace _new_ allocations with calls to aligned_alloc(): C11 provides no aligned version of realloc. Happily, this doesn't currently introduce any new problems in testsc. If it does, I might have to do something even more painful in future.) So, why isn't this an ifunc-related backdoor attempt? Because (and you can check all of this from the patch): 1. The memset-wrapping code exists entirely within the DynamoRIO plugin module that lives in test/sclog. That is not used in production, only for running the 'testsc' side-channel tester. 2. The memset-wrapping code is _removed_ by this patch, not added. 3. None of this code is dealing directly with ifuncs - only working around the unwanted effects on my test suite from the fact that they exist somewhere else and introduce awkward behaviour. |
||
---|---|---|
charset | ||
cmake | ||
contrib | ||
crypto | ||
doc | ||
icons | ||
keygen | ||
otherbackends | ||
proxy | ||
ssh | ||
stubs | ||
terminal | ||
test | ||
unix | ||
utils | ||
windows | ||
.gitignore | ||
aqsync.c | ||
be_list.c | ||
Buildscr | ||
Buildscr.cv | ||
callback.c | ||
cgtest.c | ||
CHECKLST.txt | ||
clicons.c | ||
CMakeLists.txt | ||
cmdgen.c | ||
cmdline.c | ||
config.c | ||
console.c | ||
console.h | ||
defs.h | ||
dialog.c | ||
dialog.h | ||
errsock.c | ||
import.c | ||
LATEST.VER | ||
ldisc.c | ||
LICENCE | ||
licence.pl | ||
logging.c | ||
marshal.h | ||
misc.h | ||
mksrcarc.sh | ||
mkunxarc.sh | ||
mpint.h | ||
network.h | ||
pageant.c | ||
pageant.h | ||
pinger.c | ||
pscp.c | ||
psftp.c | ||
psftp.h | ||
psftpcommon.c | ||
psocks.c | ||
psocks.h | ||
putty.h | ||
puttymem.h | ||
README | ||
release.pl | ||
settings.c | ||
sign.sh | ||
ssh.h | ||
sshcr.h | ||
sshkeygen.h | ||
sshpubk.c | ||
sshrand.c | ||
storage.h | ||
timing.c | ||
tree234.h | ||
version.h | ||
x11disp.c |
This is the README for PuTTY, a free Windows and Unix Telnet and SSH client. PuTTY is built using CMake <https://cmake.org/>. To compile in the simplest way (on any of Linux, Windows or Mac), run these commands in the source directory: cmake . cmake --build . Then, to install in the simplest way on Linux or Mac: cmake --build . --target install On Unix, pterm would like to be setuid or setgid, as appropriate, to permit it to write records of user logins to /var/run/utmp and /var/log/wtmp. (Of course it will not use this privilege for anything else, and in particular it will drop all privileges before starting up complex subsystems like GTK.) The cmake install step doesn't attempt to add these privileges, so if you want user login recording to work, you should manually ch{own,grp} and chmod the pterm binary yourself after installation. If you don't do this, pterm will still work, but not update the user login databases. Documentation (in various formats including Windows Help and Unix `man' pages) is built from the Halibut (`.but') files in the `doc' subdirectory. If you aren't using one of our source snapshots, you'll need to do this yourself. Halibut can be found at <https://www.chiark.greenend.org.uk/~sgtatham/halibut/>. The PuTTY home web site is https://www.chiark.greenend.org.uk/~sgtatham/putty/ If you want to send bug reports or feature requests, please read the Feedback section of the web site before doing so. Sending one-line reports saying `it doesn't work' will waste your time as much as ours. See the file LICENCE for the licence conditions.