DIT, for 'Data-Independent Timing', is a bit you can set in the
processor state on sufficiently new Arm CPUs, which promises that a
long list of instructions will deliberately avoid varying their timing
based on the input register values. Just what you want for keeping
your constant-time crypto primitives constant-time.
As far as I'm aware, no CPU has _yet_ implemented any data-dependent
optimisations, so DIT is a safety precaution against them doing so in
future. It would be embarrassing to be caught without it if a future
CPU does do that, so we now turn on DIT in the PuTTY process state.
I've put a call to the new enable_dit() function at the start of every
main() and WinMain() belonging to a program that might do
cryptography (even testcrypt, in case someone uses it for something!),
and in case I missed one there, also added a second call at the first
moment that any cryptography-using part of the code looks as if it
might become active: when an instance of the SSH protocol object is
configured, when the system PRNG is initialised, and when selecting
any cryptographic authentication protocol in an HTTP or SOCKS proxy
connection. With any luck those precautions between them should ensure
it's on whenever we need it.
Arm's own recommendation is that you should carefully choose the
granularity at which you enable and disable DIT: there's a potential
time cost to turning it on and off (I'm not sure what, but plausibly
something of the order of a pipeline flush), so it's a performance hit
to do it _inside_ each individual crypto function, but if CPUs start
supporting significant data-dependent optimisation in future, then it
will also become a noticeable performance hit to just leave it on
across the whole process. So you'd like to do it somewhere in the
middle: for example, you might turn on DIT once around the whole
process of verifying and decrypting an SSH packet, instead of once for
decryption and once for MAC.
With all respect to that recommendation as a strategy for maximum
performance, I'm not following it here. I turn on DIT at the start of
the PuTTY process, and then leave it on. Rationale:
1. PuTTY is not otherwise a performance-critical application: it's
not likely to max out your CPU for any purpose _other_ than
cryptography. The most CPU-intensive non-cryptographic thing I can
imagine a PuTTY process doing is the complicated computation of
font rendering in the terminal, and that will normally be cached
(you don't recompute each glyph from its outline and hints for
every time you display it).
2. I think a bigger risk lies in accidental side channels from having
DIT turned off when it should have been on. I can imagine lots of
causes for that. Missing a crypto operation in some unswept corner
of the code; confusing control flow (like my coroutine macros)
jumping with DIT clear into the middle of a region of code that
expected DIT to have been set at the beginning; having a reference
counter of DIT requests and getting it out of sync.
In a more sophisticated programming language, it might be possible to
avoid the risk in #2 by cleverness with the type system. For example,
in Rust, you could have a zero-sized type that acts as a proof token
for DIT being enabled (it would be constructed by a function that also
sets DIT, have a Drop implementation that clears DIT, and be !Send so
you couldn't use it in a thread other than the one where DIT was set),
and then you could require all the actual crypto functions to take a
DitToken as an extra parameter, at zero runtime cost. Then "oops I
forgot to set DIT around this piece of crypto" would become a compile
error. Even so, you'd have to take some care with coroutine-structured
code (what happens if a Rust async function yields while holding a DIT
token?) and with nesting (if you have two DIT tokens, you don't want
dropping the inner one to clear DIT while the outer one is still there
to wrongly convince callees that it's set). Maybe in Rust you could
get this all to work reliably. But not in C!
DIT is an optional feature of the Arm architecture, so we must first
test to see if it's supported. This is done the same way as we already
do for the various Arm crypto accelerators: on ELF-based systems,
check the appropriate bit in the 'hwcap' words in the ELF aux vector;
on Mac, look for an appropriate sysctl flag.
On Windows I don't know of a way to query the DIT feature, _or_ of a
way to write the necessary enabling instruction in an MSVC-compatible
way. I've _heard_ that it might not be necessary, because Windows
might just turn on DIT unconditionally and leave it on, in an even
more extreme version of my own strategy. I don't have a source for
that - I heard it by word of mouth - but I _hope_ it's true, because
that would suit me very well! Certainly I can't write code to enable
DIT without knowing (a) how to do it, (b) how to know if it's safe.
Nonetheless, I've put the enable_dit() call in all the right places in
the Windows main programs as well as the Unix and cross-platform code,
so that if I later find out that I _can_ put in an explicit enable of
DIT in some way, I'll only have to arrange to set HAVE_ARM_DIT and
compile the enable_dit() function appropriately.
In 0.81 and before, we put an application manifest (XML-formatted
Windows resource) into all the GUI tools on purpose, and the CLI tools
like Plink didn't have one. But in 0.82, the CLI tools do have one,
and it's a small default one we didn't write ourselves, inserted by
some combination of cmake and clang-imitating-MSVC (I haven't checked
which of those is the cause).
This appears to have happened as a side effect of a build-tools
update, not on purpose. And its effect is that Windows XP now objects
to our plink.exe, because it's very picky about manifest format (we
have an old 'xp-wont-run' bug record about that).
Since it seemed to work fine to not have a manifest at all in 0.81,
let's go back to that. We were already passing /manifest:no to inhibit
the default manifest in the GUI tools, to stop it fighting with our
custom one; now I've moved /manifest:no into the global linker flags,
so it's applied to _all_ binaries, whether we're putting our own
manifest in or not.
This begins the process of enabling our Windows applications to handle
Unicode characters on their command lines which don't fit in the
system code page.
Instead of passing plain strings to cmdline_process_param, we now pass
a partially opaque and platform-specific thing called a CmdlineArg.
This has a method that extracts the argument word as a default-encoded
string, and another one that tries to extract it as UTF-8 (though it
may fail if the UTF-8 isn't available).
On Windows, the command line is now constructed by calling
split_into_argv_w on the Unicode command line returned by
GetCommandLineW(), and the UTF-8 method returns text converted
directly from that wide-character form, not going via the system code
page. So it _can_ include UTF-8 characters that wouldn't have
round-tripped via CP_ACP.
This commit introduces the abstraction and switches over the
cross-platform and Windows argv-handling code to use it, with minimal
functional change. Nothing yet tries to call cmdline_arg_get_utf8().
I say 'cross-platform and Windows' because on the Unix side there's
still a lot of use of plain old argv which I haven't converted. That
would be a much larger project, and isn't currently needed: the
_current_ aim of this abstraction is to get the right things to happen
relating to Unicode on Windows, so for code that doesn't run on
Windows anyway, it's not adding value. (Also there's a tension with
GTK, which wants to talk to standard argv and extract arguments _it_
knows about, so at the very least we'd have to let it munge argv
before importing it into this new system.)
This aims to be a reasonably exhaustive test of what happens if you
set Conf values to various things, and then save your session, and
find out what ends up in the storage. Or vice versa.
Currently, the test program is written to match the existing
behaviour. The idea is that I can refactor the code that does the
loading and saving, and if this test still passes, I've probably done
it right.
However, in the long term, this test will be a liability: it's yet
another place you have to add every new config option. So my plan is
to get rid of it again once the refactorings I'm planning are
finished.
Or rather, I'll get rid of _that_ part of its functionality. I also
suspect I'll have added new kinds of consistency check by then, which
won't be a liability in the same way, and which I'll want to keep.
Created in the simplest way, by parametrising the existing code using
macros.
Nothing actually uses this yet. I hope to gradually switch
command-line parsing from 'ANSI' to Unicode strings, but this isn't
the only preparation needed, so it might yet be a while.
This will contain test code and test subprograms that don't belong in
the top-level test directory due to not being cross-platform.
Initial contents are test_screenshot.c, which was already its own
source file in the windows subdir, and test_split_into_argv.c, which
I've sawn off the bottom of windows/utils/split_into_argv.c and moved
into its own source file.
I'm not conveniently set up to actually run it during my main build,
since that happens entirely on Linux and cross-builds the Windows
binaries. But it should at least be possible to build and run it by
hand.
I'm about to rewrite it completely, so the first thing I need to do is
to write tests for as much of the functionality as possible, so that I
can check the new implementation behaves in the same ways.
This removes one case from several of the individual tools'
command-line parsers, and moves it into a central place where it will
automatically be supported by any tool containing console.c.
In order to make that not cause a link failure, there's now a
stubs/no-console.c which GUI clients of cmdline.c must include.
Previously, the timing.c subsystem worked in Windows PuTTY by means of
scheduling WM_TIMER messages to be sent to the terminal window. Now it
uses a separate hidden window instead, and all the machinery for
handling that window lives on its own in windows/utils/gui-timing.c.
Most immediately, this removes a use of wgs.term_hwnd that will become
awkward when I move that structure in a following commit. But also, it
will make it easier to add the same timing subsystem to unrelated GUI
programs, such as Windows Pageant: if we ever decide to implement
automatic deletion or re-encryption of unused keys after a timeout,
this will help make that easier.
I made a specific subdirectory 'stubs' to keep all the link-time stub
modules in, like notiming.c. And I put _one_ run-time stub in it,
namely nullplug.c. But the rest of the runtime stubs went into utils.
I think it's better to keep all the stubs together, so I've moved all
the null*.c in utils into stubs (with the exception of nullstrcmp.c,
which means the 'null' in a different sense). Also, fiddled with the
naming to be a bit more consistent, and stated in the new CMakeLists
the naming policy that distinguishes no-*.c from null-*.c.
In the course of polishing up this dialog box, I'm going to want it to
actually do cryptographic things (such as checking validity of a
public key blob and printing its fingerprint), which means it will
need to link against SSH utility functions.
So I've moved the dialog-box setup and handling code out of config.c
into a new file in the ssh subdirectory and in the ssh library, where
those facilities will be conveniently available.
This also means that dialog-box setup code _won't_ be linked into
PuTTYtel or pterm (on either platform), so I've added a stub source
file to provide its entry-point function in those tools. Also,
provided a const bool to indicate whether that dialog is available,
which we use to decide whether to recognise that command-line option.
It's self-contained enough not to really need to live in dialog.c as a
set of static functions. Also, moving it means we can isolate the
implementation details - which also makes it easy to change them.
One such change is that I've added the ability to bake a context
pointer into the dialog - unused so far, but it will be shortly.
(Also, while I'm here, renamed the functions so they sound more as if
they're adding features than working around bugs - not to mention not
imputing mental illness to the usual versions.)
All the fiddly business where you have to check that a thing exists,
make sure of its type, find its size, allocate some memory, and then
read it again properly (or, alternatively, loop round dealing with
ERROR_MORE_DATA) just doesn't belong at every call site. It's crying
out to be moved out into some separate utility functions that present
a more ergonomic API, so that the code that decides _which_ Registry
entries to read and what to do with them can concentrate on that.
So I've written a fresh set of registry API wrappers in windows/utils,
and simplified windows/storage.c as a result. The jump-list handling
code in particular is almost legible now!
Using a new screenshot-taking module I just added in windows/utils,
these new options allow me to start up one of the tools with
demonstration window contents and automatically save a .BMP screenshot
to disk. This will allow me to keep essentially the same set of demo
images and update them easily to keep pace with the current appearance
of the real tools as PuTTY - and Windows itself - both evolve.
These two tools had ad-hoc command loops with similar options, and I
want to extend both (in particular, in a way that introduces options
with arguments). So I've started by throwing together some common code
to do all the tedious bits like finding option arguments wherever they
might be, throwing errors, handling "--" and so on.
Should be no functional change to the existing command-line syntax,
except that now all long options are recognised in both "-foo" and
"--foo" form.
In the previous state of the code, we first tested agent_exists() to
decide whether to be the long-running Pageant server or a short-lived
client; then, during the command-line parsing loop, we prompted for
passphrases to add keys presented on the command line (to ourself or
the server, respectively); *then* we set up the named pipe and
WM_COPYDATA receiver window to actually start functioning as a server,
if we decided that was our role.
A consequence is that if a user started up two Pageants each with an
encrypted key on the command line, there would be a race condition:
each one would decide that it was _going_ to be the server, then
prompt for a passphrase, and then try to set itself up as the server
once the passphrase is entered. So whichever one's passphrase prompt
was answered second would add its key to its own internal data
structures, then fail to set up the server's named pipe, terminate
with an error, and end up not having added its key to the _surviving_
server.
This change reorders the setup steps so that the command-line parsing
loop does not add the keys immediately; instead it merely caches the
key filenames provided. Then we make the decision about whether we're
the server, and set up both the named pipe and WM_COPYDATA window if
we are; and finally, we go back to our list of key filenames and
actually add them, either to ourself (if we're the server) or to some
other Pageant (if we're a client).
Moreover, the decision about whether to be the server is now wrapped
in an interprocess mutex similar to the one used in connection
sharing, which means that even if two or more Pageants are started up
as close to simultaneously as possible, they should avoid a race
condition and successfully manage to agree on exactly one of
themselves to be the server. For example, a user reported that this
could occur if you put shortcuts to multiple private key files in your
Windows Startup folder, so that they were all launched simultaneously
at startup.
One slightly odd behaviour that remains: if the server Pageant has to
prompt for private key passphrases at startup, then it won't actually
start _servicing_ requests from other Pageants until it's finished
dealing with its own prompts. As a result, if you do start up two
Pageants at once each with an encrypted key file on its command line,
the second one won't even manage to present its passphrase prompt
until the first one's prompt is dismissed, because it will block
waiting for the initial check of the key list. But it will get there
in the end, so that's only a cosmetic oddity.
It would be nice to arrange that Pageant GUI operations don't block
unrelated agent requests (e.g. by having the GUI and the agent code
run in separate threads). But that's a bigger problem, not specific to
startup time - the same thing happens if you interactively load a key
via Pageant's file dialog. And it would require a major reorganisation
to fix that fully, because currently the GUI code depends on being
able to call _internal_ Pageant query functions like
pageant_count_ssh2_keys() that don't work by constructing an agent
request at all.
All the seat functions that request an interactive prompt of some kind
to the user - both the main seat_get_userpass_input and the various
confirmation dialogs for things like host keys - were using a simple
int return value, with the general semantics of 0 = "fail", 1 =
"proceed" (and in the case of seat_get_userpass_input, answers to the
prompts were provided), and -1 = "request in progress, wait for a
callback".
In this commit I change all those functions' return types to a new
struct called SeatPromptResult, whose primary field is an enum
replacing those simple integer values.
The main purpose is that the enum has not three but _four_ values: the
"fail" result has been split into 'user abort' and 'software abort'.
The distinction is that a user abort occurs as a result of an
interactive UI action, such as the user clicking 'cancel' in a dialog
box or hitting ^D or ^C at a terminal password prompt - and therefore,
there's no need to display an error message telling the user that the
interactive operation has failed, because the user already knows,
because they _did_ it. 'Software abort' is from any other cause, where
PuTTY is the first to know there was a problem, and has to tell the
user.
We already had this 'user abort' vs 'software abort' distinction in
other parts of the code - the SSH backend has separate termination
functions which protocol layers can call. But we assumed that any
failure from an interactive prompt request fell into the 'user abort'
category, which is not true. A couple of examples: if you configure a
host key fingerprint in your saved session via the SSH > Host keys
pane, and the server presents a host key that doesn't match it, then
verify_ssh_host_key would report that the user had aborted the
connection, and feel no need to tell the user what had gone wrong!
Similarly, if a password provided on the command line was not
accepted, then (after I fixed the semantics of that in the previous
commit) the same wrong handling would occur.
So now, those Seat prompt functions too can communicate whether the
user or the software originated a connection abort. And in the latter
case, we also provide an error message to present to the user. Result:
in those two example cases (and others), error messages should no
longer go missing.
Implementation note: to avoid the hassle of having the error message
in a SeatPromptResult being a dynamically allocated string (and hence,
every recipient of one must always check whether it's non-NULL and
free it on every exit path, plus being careful about copying the
struct around), I've instead arranged that the structure contains a
function pointer and a couple of parameters, so that the string form
of the message can be constructed on demand. That way, the only users
who need to free it are the ones who actually _asked_ for it in the
first place, which is a much smaller set.
(This is one of the rare occasions that I regret not having C++'s
extra features available in this code base - a unique_ptr or
shared_ptr to a string would have been just the thing here, and the
compiler would have done all the hard work for me of remembering where
to insert the frees!)
This commit replaces all those fiddly little linking modules
(be_all.c, be_none.c, be_ssh.c etc) with a single source file
controlled by ifdefs, and introduces a function be_list() in
setup.cmake that makes it easy to compile a version of it appropriate
to each application.
This is a net reduction in code according to 'git diff --stat', even
though I've introduced more comments. It also gets rid of another pile
of annoying little source files in the top-level directory that didn't
deserve to take up so much room in 'ls'.
More concretely, doing this has some maintenance advantages.
Centralisation means less to maintain (e.g. n_ui_backends is worked
out once in a way that makes sense everywhere), and also, 'appname'
can now be reliably set per program. Previously, some programs got the
wrong appname due to sharing the same linking module (e.g. Plink had
appname="PuTTY"), which was a latent bug that would have manifested if
I'd wanted to reuse the same string in another context.
One thing I've changed in this rework is that Windows pterm no longer
has the ConPTY backend in its backends[]: it now has an empty one. The
special be_conpty.c module shouldn't really have been there in the
first place: it was used in the very earliest uncommitted drafts of
the ConPTY work, where I was using another method of selecting that
backend, but now that Windows pterm has a dedicated
backend_vt_from_conf() that refers to conpty_backend by name, it has
no need to live in backends[] at all, just as it doesn't have to in
Unix pterm.
While I'm in the mood for cleaning up the top-level directory here:
all the 'nostuff.c' files have moved into a new 'stubs' directory, and
I broke up be_misc.c into smaller modules that can live in 'utils'.
There are quite a few of them already, and I'm about to make another
one, so let's start with a bit of tidying up.
The CMake build organisation is unchanged: I haven't put the proxy
object files into a separate library, just moved the locations of the
source files. (Organising proxying as a library would be tricky
anyway, because of the various overrides for tools that want to avoid
cryptography.)
Before commit 6e69223dc2, Pageant would stop working after a
certain number of PuTTYs were active at the same time. (At most about
60, but maybe fewer - see below.)
This was because of two separate bugs. The easy one, fixed in
6e69223dc2 itself, was that PuTTY left each named-pipe connection
to Pageant open for the rest of its lifetime. So the real problem was
that Pageant had too many active connections at once. (And since a
given PuTTY might make multiple connections during userauth - one to
list keys, and maybe another to actually make a signature - that was
why the number of _PuTTYs_ might vary.)
It was clearly a bug that PuTTY was leaving connections to Pageant
needlessly open. But it was _also_ a bug that Pageant couldn't handle
more than about 60 at once. In this commit, I fix that secondary bug.
The cause of the bug is that the WaitForMultipleObjects function
family in the Windows API have a limit on the number of HANDLE objects
they can select between. The limit is MAXIMUM_WAIT_OBJECTS, defined to
be 64. And handle-io.c was using a separate event object for each I/O
subthread to communicate back to the main thread, so as soon as all
those event objects (plus a handful of other HANDLEs) added up to more
than 64, we'd start passing an overlarge handle array to
WaitForMultipleObjects, and it would start not doing what we wanted.
To fix this, I've reorganised handle-io.c so that all its subthreads
share just _one_ event object to signal readiness back to the main
thread. There's now a linked list of 'struct handle' objects that are
ready to be processed, protected by a CRITICAL_SECTION. Each subthread
signals readiness by adding itself to the linked list, and setting the
event object to indicate that the list is now non-empty. When the main
thread receives the event, it iterates over the whole list processing
all the ready handles.
(Each 'struct handle' still has a separate event object for the main
thread to use to communicate _to_ the subthread. That's OK, because no
thread is ever waiting on all those events at once: each subthread
only waits on its own.)
The previous HT_FOREIGN system didn't really fit into this framework.
So I've moved it out into its own system. There's now a handle-wait.c
which deals with the relatively simple job of managing a list of
handles that need to be waited for, each with a callback function;
that's what communicates a list of HANDLEs to event loops, and
receives the notification when the event loop notices that one of them
has done something. And handle-io.c is now just one client of
handle-wait.c, providing a single HANDLE to the event loop, and
dealing internally with everything that needs to be done when that
handle fires.
The new top-level handle-wait.c system *still* can't deal with more
than MAXIMUM_WAIT_OBJECTS. At the moment, I'm reasonably convinced it
doesn't need to: the only kind of HANDLE that any of our tools could
previously have needed to wait on more than one of was the one in
handle-io.c that I've just removed. But I've left some assertions and
a TODO comment in there just in case we need to change that in future.
This introduces a new entry to the radio-button list of proxy types,
in which the 'Proxy host' box is taken to be the name of an SSH server
or saved session. We make an entire subsidiary SSH connection to that
host, open a direct-tcpip channel through it, and use that as the
connection over which to run the primary network connection.
The result is basically the same as if you used a local proxy
subprocess, with a command along the lines of 'plink -batch %proxyhost
-nc %host:%port'. But it's all done in-process, by having an SshProxy
object implement the Socket trait to talk to the main connection, and
implement Seat and LogPolicy to talk to its subsidiary SSH backend.
All the refactoring in recent years has got us to the point where we
can do that without both SSH instances fighting over some global
variable or unique piece of infrastructure.
From an end user perspective, doing SSH proxying in-process like this
is a little bit easier to set up: it doesn't require you to bake the
full pathname of Plink into your saved session (or to have it on the
system PATH), and the SshProxy setup function automatically turns off
SSH features that would be inappropriate in this context, such as
additional port forwardings, or acting as a connection-sharing
upstream. And it has minor advantages like getting the Event Log for
the subsidiary connection interleaved in the main Event Log, as if it
were stderr output from a proxy subcommand, without having to
deliberately configure the subsidiary Plink into verbose mode.
However, this is an initial implementation only, and it doesn't yet
support the _big_ payoff for doing this in-process, which (I hope)
will be the ability to handle interactive prompts from the subsidiary
SSH connection via the same user interface as the primary one. For
example, you might need to answer two password prompts in succession,
or (the first time you use a session configured this way) confirm the
host keys for both proxy and destination SSH servers. Comments in the
new source file discuss some design thoughts on filling in this gap.
For the moment, if the proxy SSH connection encounters any situation
where an interactive prompt is needed, it will make the safe
assumption, the same way 'plink -batch' would do. So it's at least no
_worse_ than the existing technique of putting the proxy connection in
a subprocess.
This fulfills our long-standing Mayhem-difficulty wishlist item
'win-command-prompt': this is a Windows pterm in the sense that when
you run it you get a local cmd.exe running inside a PuTTY-style window.
Advantages of this: you get the same free choice of fonts as PuTTY has
(no restriction to a strange subset of the system's available fonts);
you get the same copy-paste gestures as PuTTY (no mental gear-shifting
when you have command prompts and SSH sessions open on the same
desktop); you get scrollback with the PuTTY semantics (scrolling to
the bottom gets you to where the action is, as opposed to the way you
could accidentally find yourself 500 lines past the end of the action
in a real console).
'win-command-prompt' was at Mayhem difficulty ('Probably impossible')
basically on the grounds that with Windows's old APIs for accessing
the contents of consoles, there was no way I could find to get this to
work sensibly. What was needed to make it feasible was a major piece
of re-engineering work inside Windows itself.
But, of course, that's exactly what happened! In 2019, the new ConPTY
API arrived, which lets you create an object that behaves like a
Windows console at one end, and round the back, emits a stream of
VT-style escape sequences as the screen contents evolve, and accepts a
VT-style input stream in return which it will parse function and arrow
keys out of in the usual way.
So now it's actually _easy_ to get this to basically work. The new
backend, in conpty.c, has to do a handful of magic Windows API calls
to set up the pseudo-console and its feeder pipes and start a
subprocess running in it, a further magic call every time the PuTTY
window is resized, and detect the end of the session by watching for
the subprocess terminating. But apart from that, all it has to do is
pass data back and forth unmodified between those pipes and the
backend's associated Seat!
That said, this is new and experimental, and there will undoubtedly be
issues. One that I already know about is that you can't copy and paste
a word that has wrapped between lines without getting an annoying
newline in the middle of it. As far as I can see this is a fundamental
limitation: the ConPTY system sends the _same_ escape sequence stream
for a line that wrapped as it would send for a line that had a logical
\n at what would have been the wrap point. Probably the best we can do
to mitigate this is to adopt a different heuristic for newline elision
that's right more often than it's wrong.
For the moment, that experimental-ness is indicated by the fact that
Buildscr will build, sign and deliver a copy of pterm.exe for each
flavour of Windows, but won't include it in the .zip file or in the
installer. (In fact, that puts it in exactly the same ad-hoc category
as PuTTYtel, although for completely different reasons.)
This prepares the ground for a second essentially similarly-shaped
program reusing most of window.c but handling its command line and
startup differently. A couple of large parts of WinMain() to do with
backend selection and command-line handling are now subfunctions in a
separate file putty.c.
Also, our custom AppUserModelId is defined in that file, so that it
can vary with the client application.
The code to find out the location of the c:\windows\system32 directory
was already present, in load_system32_dll(). Now it's moved out into a
function of its own, so it can be called in other contexts.
Now that the main source file of Plink in each platform directory has
the same name, we can put centralise the main definition of the
program in the main CMakeLists.txt, and in the platform directory,
just add the few extra modules needed to clear up platform-specific
details.
The same goes for psocks. And PSCP and PSFTP could have been moved to
the top level already - I just hadn't done it in the initial setup.
This gets rid of all those annoying 'win', 'ux' and 'gtk' prefixes
which made filenames annoying to type and to tab-complete. Also, as
with my other recent renaming sprees, I've taken the opportunity to
expand and clarify some of the names so that they're not such cryptic
abbreviations.
This clears up another large pile of clutter at the top level, and in
the process, allows me to rename source files to things that don't all
have that annoying 'ssh' prefix at the top.
add_platform_sources_to_library() is now called
add_sources_from_current_dir(), so that it will make sense when I use
it in subdirectories that aren't for a particular platform.
I found these while going through the code, and decided if we're going
to have them then we should compile them. They didn't all compile
first time, proving my point :-)
I've enhanced the tree234 test so that it has a verbose option, which
by default is off.
This new implementation uses the same optimisation-barrier technique
that I used in various places in testsc: have a no-op function, and a
volatile function pointer pointing at it, and then call through the
function pointer, so that nothing actually happens (apart from the
physical call and return) but the compiler has to assume that
_anything_ might have happened.
Doing this just after a memset enforces that the compiler can't have
thrown away the memset, because the called function might (for
example) check that all the memory really is zero and abort if not.
I've been turning this over in my mind ever since coming up with the
technique for testsc. I think it's far more robust than the previous
smemclr technique: so much so that I'm switching to using it
_everywhere_, and no longer using platform alternatives like Windows's
SecureZeroMemory().
This is a module that I'd noticed in the past was too monolithic.
There's a big pile of stub functions in uxpgnt.c that only have to be
there because the implementation of true X11 _forwarding_ (i.e.
actually managing a channel within an SSH connection), which Pageant
doesn't need, was in the same module as more general X11-related
utility functions which Pageant does need.
So I've broken up this awkward monolith. Now x11fwd.c contains only
the code that really does all go together for dealing with SSH X
forwarding: the management of an X forwarding channel (including the
vtables to make it behave as Channel at the SSH end and a Plug at the
end that connects to the local X server), and the management of
authorisation for those channels, including maintaining a tree234 of
possible auth values and verifying the one we received.
Most of the functions removed from this file have moved into the utils
subdir, and also into the utils library (i.e. further down the link
order), because they were basically just string and data processing.
One exception is x11_setup_display, which parses a display string and
returns a struct telling you everything about how to connect to it.
That talks to the networking code (it does name lookups and makes a
SockAddr), so it has to live in the network library rather than utils,
and therefore it's not in the utils subdirectory either.
The other exception is x11_get_screen_number, which it turned out
nothing called at all! Apparently the job it used to do is now done as
part of x11_setup_display. So I've just removed it completely.
Now that the new CMake build system is encouraging us to lay out the
code like a set of libraries, it seems like a good idea to make them
look more _like_ libraries, by putting things into separate modules as
far as possible.
This fixes several previous annoyances in which you had to link
against some object in order to get a function you needed, but that
object also contained other functions you didn't need which included
link-time symbol references you didn't want to have to deal with. The
usual offender was subsidiary supporting programs including misc.c for
some innocuous function and then finding they had to deal with the
requirements of buildinfo().
This big reorganisation introduces three new subdirectories called
'utils', one at the top level and one in each platform subdir. In each
case, the directory contains basically the same files that were
previously placed in the 'utils' build-time library, except that the
ones that were extremely miscellaneous (misc.c, utils.c, uxmisc.c,
winmisc.c, winmiscs.c, winutils.c) have been split up into much
smaller pieces.
This brings various concrete advantages over the previous system:
- consistent support for out-of-tree builds on all platforms
- more thorough support for Visual Studio IDE project files
- support for Ninja-based builds, which is particularly useful on
Windows where the alternative nmake has no parallel option
- a really simple set of build instructions that work the same way on
all the major platforms (look how much shorter README is!)
- better decoupling of the project configuration from the toolchain
configuration, so that my Windows cross-building doesn't need
(much) special treatment in CMakeLists.txt
- configure-time tests on Windows as well as Linux, so that a lot of
ad-hoc #ifdefs second-guessing a particular feature's presence from
the compiler version can now be replaced by tests of the feature
itself
Also some longer-term software-engineering advantages:
- other people have actually heard of CMake, so they'll be able to
produce patches to the new build setup more easily
- unlike the old mkfiles.pl, CMake is not my personal problem to
maintain
- most importantly, mkfiles.pl was just a horrible pile of
unmaintainable cruft, which even I found it painful to make changes
to or to use, and desperately needed throwing in the bin. I've
already thrown away all the variants of it I had in other projects
of mine, and was only delaying this one so we could make the 0.75
release branch first.
This change comes with a noticeable build-level restructuring. The
previous Recipe worked by compiling every object file exactly once,
and then making each executable by linking a precisely specified
subset of the same object files. But in CMake, that's not the natural
way to work - if you write the obvious command that puts the same
source file into two executable targets, CMake generates a makefile
that compiles it once per target. That can be an advantage, because it
gives you the freedom to compile it differently in each case (e.g.
with a #define telling it which program it's part of). But in a
project that has many executable targets and had carefully contrived
to _never_ need to build any module more than once, all it does is
bloat the build time pointlessly!
To avoid slowing down the build by a large factor, I've put most of
the modules of the code base into a collection of static libraries
organised vaguely thematically (SSH, other backends, crypto, network,
...). That means all those modules can still be compiled just once
each, because once each library is built it's reused unchanged for all
the executable targets.
One upside of this library-based structure is that now I don't have to
manually specify exactly which objects go into which programs any more
- it's enough to specify which libraries are needed, and the linker
will figure out the fine detail automatically. So there's less
maintenance to do in CMakeLists.txt when the source code changes.
But that reorganisation also adds fragility, because of the trad Unix
linker semantics of walking along the library list once each, so that
cyclic references between your libraries will provoke link errors. The
current setup builds successfully, but I suspect it only just manages
it.
(In particular, I've found that MinGW is the most finicky on this
score of the Windows compilers I've tried building with. So I've
included a MinGW test build in the new-look Buildscr, because
otherwise I think there'd be a significant risk of introducing
MinGW-only build failures due to library search order, which wasn't a
risk in the previous library-free build organisation.)
In the longer term I hope to be able to reduce the risk of that, via
gradual reorganisation (in particular, breaking up too-monolithic
modules, to reduce the risk of knock-on references when you included a
module for function A and it also contains function B with an
unsatisfied dependency you didn't really need). Ideally I want to
reach a state in which the libraries all have sensibly described
purposes, a clearly documented (partial) order in which they're
permitted to depend on each other, and a specification of what stubs
you have to put where if you're leaving one of them out (e.g.
nocrypto) and what callbacks you have to define in your non-library
objects to satisfy dependencies from things low in the stack (e.g.
out_of_memory()).
One thing that's gone completely missing in this migration,
unfortunately, is the unfinished MacOS port linked against Quartz GTK.
That's because it turned out that I can't currently build it myself,
on my own Mac: my previous installation of GTK had bit-rotted as a
side effect of an Xcode upgrade, and I haven't yet been able to
persuade jhbuild to make me a new one. So I can't even build the MacOS
port with the _old_ makefiles, and hence, I have no way of checking
that the new ones also work. I hope to bring that port back to life at
some point, but I don't want it to block the rest of this change.