1
0
mirror of https://git.tartarus.org/simon/putty.git synced 2025-01-10 01:48:00 +00:00
Commit Graph

12 Commits

Author SHA1 Message Date
Simon Tatham
aab0892671 Side-channel tester: align memory allocations.
While trying to get an upcoming piece of code through testsc, I had
trouble - _yet again_ - with the way that control flow diverges inside
the glibc implementations of functions like memcpy and memset,
depending on the alignment of the input blocks _above_ the alignment
guaranteed by malloc, so that doing the same sequence of malloc +
memset can lead to different control flow. (I believe this is done
either for cache performance reasons or SIMD alignment requirements,
or both: on x86, some SIMD instructions require memory alignment
beyond what malloc guarantees, which is also awkward for our x86
hardware crypto implementations.)

My previous effort to normalise this problem out of sclog's log files
worked by wrapping memset and all its synonyms that I could find. But
this weekend, that failed for me, and the reason appears to be ifuncs.

I'm aware of the great irony of committing code to a security project
with a log message saying something vague about ifuncs, on the same
weekend that it came to light that commits matching that description
were one of the methods used to smuggle a backdoor into the XZ Utils
project (CVE-2024-3094). So I'll bend over backwards to explain both
what I think is going on, and why this _isn't_ a weird ifunc-related
backdooring attempt:

When I say I 'wrap' memset, I mean I use DynamoRIO's 'drwrap' API to
arrange that the side-channel test rig calls a function of mine before
and after each call to memset. The way drwrap works is to look up the
symbol address in either the main program or a shared library; in this
case, it's a shared library, namely libc.so. Then it intercepts call
instructions with exactly that address as the target.

Unfortunately, what _actually_ happens when the main program calls
memset is more complicated. First, control goes to the PLT entry for
memset (still in the main program). In principle, that loads a GOT
entry containing the address of memset (filled in by ld.so), and jumps
to it. But in fact the GOT entry varies its value through the program;
on the first call, it points to a resolver function, whose job is to
_find out_ the address of memset. And in the version of libc.so I'm
currently running, that resolver is an STT_GNU_IFUNC indirection
function, which tests the host CPU's capabilities, and chooses an
actual implementation of memset depending on what it finds. (In my
case, it looks as if it's picking one that makes extensive use of x86
SIMD.) To avoid the overhead of doing this on every call, the returned
function pointer is then written into the main program's GOT entry for
memset, overwriting the address of the resolver function, so that the
_next_ call the main program makes through the same PLT entry will go
directly to the memset variant that was chosen.

And the problem is that, after this has happened, none of the new
control flow ever goes near the _official_ address of memset, as read
out of libc.so's dynamic symbol table by DynamoRIO. The PLT entry
isn't at that address, and neither is the particular SIMD variant that
the resolver ended up choosing. So now my wrapper on memset is never
being invoked, and memset cheerfully generates different control flow
in runs of my crypto code that testsc expects to be doing exactly the
same thing as each other, and all my tests fail spuriously.

My solution, at least for the moment, is to completely abandon the
strategy of wrapping memset. Instead, let's just make it behave the
same way every time, by forcing all the affected memory allocations to
have extra-strict alignment. I found that 64-byte alignment is not
good enough to eliminate memset-related test failures, but 128-byte
alignment is.

This would be tricky in itself, if it weren't for the fact that PuTTY
already has its own wrapper function on malloc (for various reasons),
which everything in our code already uses. So I can divert to C11's
aligned_alloc() there. That in turn is done by adding a new #ifdef to
utils/memory.c, and compiling it with that #ifdef into a new object
library that is included in testsc, superseding the standard memory.o
that would otherwise be pulled in from our 'utils' static library.

With the previous memset-compensator removed, this means testsc is now
dependent on having aligned_alloc() available. So we test for it at
cmake time, and don't build testsc at all if it can't be found. This
shouldn't bother anyone very much; aligned_alloc() is available on
_my_ testsc platform, and if anyone else is trying to run this test
suite at all, I expect it will be on something at least as new as
that.

(One awkward thing here is that we can only replace _new_ allocations
with calls to aligned_alloc(): C11 provides no aligned version of
realloc. Happily, this doesn't currently introduce any new problems in
testsc. If it does, I might have to do something even more painful in
future.)

So, why isn't this an ifunc-related backdoor attempt? Because (and you
can check all of this from the patch):

 1. The memset-wrapping code exists entirely within the DynamoRIO
    plugin module that lives in test/sclog. That is not used in
    production, only for running the 'testsc' side-channel tester.

 2. The memset-wrapping code is _removed_ by this patch, not added.

 3. None of this code is dealing directly with ifuncs - only working
    around the unwanted effects on my test suite from the fact that
    they exist somewhere else and introduce awkward behaviour.
2024-04-01 13:10:49 +01:00
Simon Tatham
cfda9585ac Build option to disable scrollback compression.
This was requested by a downstream of the code, who wanted to change
the time/space tradeoff in the terminal. I currently have no plans to
change this setting for upstream PuTTY, although there is a cmake
option for it just to make testing it easy.

To avoid sprinkling ifdefs over the whole terminal code, the strategy
is to keep the separate type 'compressed_scrollback_line', and turn it
into a typedef for a 'termline *'. So compressline() becomes almost
trivial, and decompressline() even more so.

Memory management is the fiddly part. To make this work sensibly on
both sides, I've broken up each of compressline() and decompressline()
into two versions, one of which takes ownership of (and logically
speaking frees) its input, and the other doesn't. So at call sites
where a function was followed by a free, it's now calling the
'and_free' version of the function, and where the input object was
reused afterwards, it's calling the 'no_free' version. This means that
in different branches of the #if, I can make one function call the
other or vice versa, and no call site is stuck with having to do
things in a more roundabout way than necessary.

The freeing of the _return_ value from decompressline() is handled for
us, because termlines already have a 'temporary' flag which is set
when they're returned from the decompressor, and anyone receiving a
termline from lineptr() calls unlineptr() when they're finished with
it, which will _conditionally_ free it, depending on that 'temporary'
flag. So in the new mode, 'temporary' is never set at all, and all
those unlineptr() calls do nothing.

However, we also still need to free compressed lines properly when
they're actually being thrown away (scrolled off the top of the
scrollback, or cleaned up in term_free), and for that, I've made a new
special-purpose free_compressed_line() function.

(cherry picked from commit 5f2eff2fea)
2023-04-19 14:28:36 +01:00
Jacob Nevins
02c745e186 Use correct flags for RelWithDebInfo.
Bug introduced in d163b37caf. (Ahem.)
2022-05-19 23:23:14 +01:00
Jacob Nevins
d163b37caf Squash NDEBUG for RelWithDebInfo build type too. 2022-05-18 18:57:45 +01:00
Simon Tatham
53f7da8ce7 Merge be_*.c into one ifdef-controlled module.
This commit replaces all those fiddly little linking modules
(be_all.c, be_none.c, be_ssh.c etc) with a single source file
controlled by ifdefs, and introduces a function be_list() in
setup.cmake that makes it easy to compile a version of it appropriate
to each application.

This is a net reduction in code according to 'git diff --stat', even
though I've introduced more comments. It also gets rid of another pile
of annoying little source files in the top-level directory that didn't
deserve to take up so much room in 'ls'.

More concretely, doing this has some maintenance advantages.
Centralisation means less to maintain (e.g. n_ui_backends is worked
out once in a way that makes sense everywhere), and also, 'appname'
can now be reliably set per program. Previously, some programs got the
wrong appname due to sharing the same linking module (e.g. Plink had
appname="PuTTY"), which was a latent bug that would have manifested if
I'd wanted to reuse the same string in another context.

One thing I've changed in this rework is that Windows pterm no longer
has the ConPTY backend in its backends[]: it now has an empty one. The
special be_conpty.c module shouldn't really have been there in the
first place: it was used in the very earliest uncommitted drafts of
the ConPTY work, where I was using another method of selecting that
backend, but now that Windows pterm has a dedicated
backend_vt_from_conf() that refers to conpty_backend by name, it has
no need to live in backends[] at all, just as it doesn't have to in
Unix pterm.
2021-11-26 17:58:55 +00:00
Simon Tatham
5eee8ca648 Compatibility with older versions of cmake.
After this change, the cmake setup now works even on Debian stretch
(oldoldstable), which runs cmake 3.7.

In order to support a version that early I had to:

 - write a fallback implementation of 'add_compile_definitions' for
   older cmakes, which is easy, because add_compile_definitions(FOO)
   is basically just add_compile_options(-DFOO)

 - stop using list(TRANSFORM) and string(JOIN), of which I had one
   case each, and they were easily replaced with simple foreach loops

 - stop putting OBJECT libraries in the target_link_libraries command
   for executable targets, in favour of adding $<TARGET_OBJECTS:foo>
   to the main sources list for the same target. That matches what I
   do with library targets, so it's probably more sensible anyway.

I tried going back by another Debian release and getting this cmake
setup to work on jessie, but that runs CMake 3.0.1, and in _that_
version of cmake the target_sources command is missing, and I didn't
find any alternative way to add extra sources to a target after having
first declared it. Reorganising to cope with _that_ omission would be
too much upheaval without a very good reason.
2021-10-29 18:08:18 +01:00
Simon Tatham
c931c7f02a gitcommit.cmake: stop needing TOPLEVEL_SOURCE_DIR.
It's always the same as the cwd when the script is invoked, and by
having the script get it _from_ its own cwd, we arrange a bit of
automatic normalisation in situations where you need to invoke it with
some non-canonical path like one ending in "/.." - which I'll do in
the next commit.
2021-05-08 10:25:34 +01:00
Simon Tatham
4a8fc43d81 Prepare gitcommit.cmake to support multiple output types.
I'm about to want to embed the current git commit into a Halibut
source file, for which I'll need to add a second output mode to the
existing script that finds it out.
2021-05-03 17:01:55 +01:00
Simon Tatham
77940f8fa3 Move some add_executable() calls to top-level CMakeLists.
Now that the main source file of Plink in each platform directory has
the same name, we can put centralise the main definition of the
program in the main CMakeLists.txt, and in the platform directory,
just add the few extra modules needed to clear up platform-specific
details.

The same goes for psocks. And PSCP and PSFTP could have been moved to
the top level already - I just hadn't done it in the initial setup.
2021-04-26 18:00:01 +01:00
Simon Tatham
70f6ce5628 Rename one of my cmake support functions. (NFC)
add_platform_sources_to_library() is now called
add_sources_from_current_dir(), so that it will make sense when I use
it in subdirectories that aren't for a particular platform.
2021-04-19 18:26:56 +01:00
Pavel I. Kryukov
3d55a2befb Add coverage flags 2021-04-18 08:30:44 +01:00
Simon Tatham
c19e7215dd Replace mkfiles.pl with a CMake build system.
This brings various concrete advantages over the previous system:

 - consistent support for out-of-tree builds on all platforms

 - more thorough support for Visual Studio IDE project files

 - support for Ninja-based builds, which is particularly useful on
   Windows where the alternative nmake has no parallel option

 - a really simple set of build instructions that work the same way on
   all the major platforms (look how much shorter README is!)

 - better decoupling of the project configuration from the toolchain
   configuration, so that my Windows cross-building doesn't need
   (much) special treatment in CMakeLists.txt

 - configure-time tests on Windows as well as Linux, so that a lot of
   ad-hoc #ifdefs second-guessing a particular feature's presence from
   the compiler version can now be replaced by tests of the feature
   itself

Also some longer-term software-engineering advantages:

 - other people have actually heard of CMake, so they'll be able to
   produce patches to the new build setup more easily

 - unlike the old mkfiles.pl, CMake is not my personal problem to
   maintain

 - most importantly, mkfiles.pl was just a horrible pile of
   unmaintainable cruft, which even I found it painful to make changes
   to or to use, and desperately needed throwing in the bin. I've
   already thrown away all the variants of it I had in other projects
   of mine, and was only delaying this one so we could make the 0.75
   release branch first.

This change comes with a noticeable build-level restructuring. The
previous Recipe worked by compiling every object file exactly once,
and then making each executable by linking a precisely specified
subset of the same object files. But in CMake, that's not the natural
way to work - if you write the obvious command that puts the same
source file into two executable targets, CMake generates a makefile
that compiles it once per target. That can be an advantage, because it
gives you the freedom to compile it differently in each case (e.g.
with a #define telling it which program it's part of). But in a
project that has many executable targets and had carefully contrived
to _never_ need to build any module more than once, all it does is
bloat the build time pointlessly!

To avoid slowing down the build by a large factor, I've put most of
the modules of the code base into a collection of static libraries
organised vaguely thematically (SSH, other backends, crypto, network,
...). That means all those modules can still be compiled just once
each, because once each library is built it's reused unchanged for all
the executable targets.

One upside of this library-based structure is that now I don't have to
manually specify exactly which objects go into which programs any more
- it's enough to specify which libraries are needed, and the linker
will figure out the fine detail automatically. So there's less
maintenance to do in CMakeLists.txt when the source code changes.

But that reorganisation also adds fragility, because of the trad Unix
linker semantics of walking along the library list once each, so that
cyclic references between your libraries will provoke link errors. The
current setup builds successfully, but I suspect it only just manages
it.

(In particular, I've found that MinGW is the most finicky on this
score of the Windows compilers I've tried building with. So I've
included a MinGW test build in the new-look Buildscr, because
otherwise I think there'd be a significant risk of introducing
MinGW-only build failures due to library search order, which wasn't a
risk in the previous library-free build organisation.)

In the longer term I hope to be able to reduce the risk of that, via
gradual reorganisation (in particular, breaking up too-monolithic
modules, to reduce the risk of knock-on references when you included a
module for function A and it also contains function B with an
unsatisfied dependency you didn't really need). Ideally I want to
reach a state in which the libraries all have sensibly described
purposes, a clearly documented (partial) order in which they're
permitted to depend on each other, and a specification of what stubs
you have to put where if you're leaving one of them out (e.g.
nocrypto) and what callbacks you have to define in your non-library
objects to satisfy dependencies from things low in the stack (e.g.
out_of_memory()).

One thing that's gone completely missing in this migration,
unfortunately, is the unfinished MacOS port linked against Quartz GTK.
That's because it turned out that I can't currently build it myself,
on my own Mac: my previous installation of GTK had bit-rotted as a
side effect of an Xcode upgrade, and I haven't yet been able to
persuade jhbuild to make me a new one. So I can't even build the MacOS
port with the _old_ makefiles, and hence, I have no way of checking
that the new ones also work. I hope to bring that port back to life at
some point, but I don't want it to block the rest of this change.
2021-04-17 13:53:02 +01:00