Before commit 6e69223dc2, Pageant would stop working after a
certain number of PuTTYs were active at the same time. (At most about
60, but maybe fewer - see below.)
This was because of two separate bugs. The easy one, fixed in
6e69223dc2 itself, was that PuTTY left each named-pipe connection
to Pageant open for the rest of its lifetime. So the real problem was
that Pageant had too many active connections at once. (And since a
given PuTTY might make multiple connections during userauth - one to
list keys, and maybe another to actually make a signature - that was
why the number of _PuTTYs_ might vary.)
It was clearly a bug that PuTTY was leaving connections to Pageant
needlessly open. But it was _also_ a bug that Pageant couldn't handle
more than about 60 at once. In this commit, I fix that secondary bug.
The cause of the bug is that the WaitForMultipleObjects function
family in the Windows API have a limit on the number of HANDLE objects
they can select between. The limit is MAXIMUM_WAIT_OBJECTS, defined to
be 64. And handle-io.c was using a separate event object for each I/O
subthread to communicate back to the main thread, so as soon as all
those event objects (plus a handful of other HANDLEs) added up to more
than 64, we'd start passing an overlarge handle array to
WaitForMultipleObjects, and it would start not doing what we wanted.
To fix this, I've reorganised handle-io.c so that all its subthreads
share just _one_ event object to signal readiness back to the main
thread. There's now a linked list of 'struct handle' objects that are
ready to be processed, protected by a CRITICAL_SECTION. Each subthread
signals readiness by adding itself to the linked list, and setting the
event object to indicate that the list is now non-empty. When the main
thread receives the event, it iterates over the whole list processing
all the ready handles.
(Each 'struct handle' still has a separate event object for the main
thread to use to communicate _to_ the subthread. That's OK, because no
thread is ever waiting on all those events at once: each subthread
only waits on its own.)
The previous HT_FOREIGN system didn't really fit into this framework.
So I've moved it out into its own system. There's now a handle-wait.c
which deals with the relatively simple job of managing a list of
handles that need to be waited for, each with a callback function;
that's what communicates a list of HANDLEs to event loops, and
receives the notification when the event loop notices that one of them
has done something. And handle-io.c is now just one client of
handle-wait.c, providing a single HANDLE to the event loop, and
dealing internally with everything that needs to be done when that
handle fires.
The new top-level handle-wait.c system *still* can't deal with more
than MAXIMUM_WAIT_OBJECTS. At the moment, I'm reasonably convinced it
doesn't need to: the only kind of HANDLE that any of our tools could
previously have needed to wait on more than one of was the one in
handle-io.c that I've just removed. But I've left some assertions and
a TODO comment in there just in case we need to change that in future.
This introduces a new entry to the radio-button list of proxy types,
in which the 'Proxy host' box is taken to be the name of an SSH server
or saved session. We make an entire subsidiary SSH connection to that
host, open a direct-tcpip channel through it, and use that as the
connection over which to run the primary network connection.
The result is basically the same as if you used a local proxy
subprocess, with a command along the lines of 'plink -batch %proxyhost
-nc %host:%port'. But it's all done in-process, by having an SshProxy
object implement the Socket trait to talk to the main connection, and
implement Seat and LogPolicy to talk to its subsidiary SSH backend.
All the refactoring in recent years has got us to the point where we
can do that without both SSH instances fighting over some global
variable or unique piece of infrastructure.
From an end user perspective, doing SSH proxying in-process like this
is a little bit easier to set up: it doesn't require you to bake the
full pathname of Plink into your saved session (or to have it on the
system PATH), and the SshProxy setup function automatically turns off
SSH features that would be inappropriate in this context, such as
additional port forwardings, or acting as a connection-sharing
upstream. And it has minor advantages like getting the Event Log for
the subsidiary connection interleaved in the main Event Log, as if it
were stderr output from a proxy subcommand, without having to
deliberately configure the subsidiary Plink into verbose mode.
However, this is an initial implementation only, and it doesn't yet
support the _big_ payoff for doing this in-process, which (I hope)
will be the ability to handle interactive prompts from the subsidiary
SSH connection via the same user interface as the primary one. For
example, you might need to answer two password prompts in succession,
or (the first time you use a session configured this way) confirm the
host keys for both proxy and destination SSH servers. Comments in the
new source file discuss some design thoughts on filling in this gap.
For the moment, if the proxy SSH connection encounters any situation
where an interactive prompt is needed, it will make the safe
assumption, the same way 'plink -batch' would do. So it's at least no
_worse_ than the existing technique of putting the proxy connection in
a subprocess.
This fulfills our long-standing Mayhem-difficulty wishlist item
'win-command-prompt': this is a Windows pterm in the sense that when
you run it you get a local cmd.exe running inside a PuTTY-style window.
Advantages of this: you get the same free choice of fonts as PuTTY has
(no restriction to a strange subset of the system's available fonts);
you get the same copy-paste gestures as PuTTY (no mental gear-shifting
when you have command prompts and SSH sessions open on the same
desktop); you get scrollback with the PuTTY semantics (scrolling to
the bottom gets you to where the action is, as opposed to the way you
could accidentally find yourself 500 lines past the end of the action
in a real console).
'win-command-prompt' was at Mayhem difficulty ('Probably impossible')
basically on the grounds that with Windows's old APIs for accessing
the contents of consoles, there was no way I could find to get this to
work sensibly. What was needed to make it feasible was a major piece
of re-engineering work inside Windows itself.
But, of course, that's exactly what happened! In 2019, the new ConPTY
API arrived, which lets you create an object that behaves like a
Windows console at one end, and round the back, emits a stream of
VT-style escape sequences as the screen contents evolve, and accepts a
VT-style input stream in return which it will parse function and arrow
keys out of in the usual way.
So now it's actually _easy_ to get this to basically work. The new
backend, in conpty.c, has to do a handful of magic Windows API calls
to set up the pseudo-console and its feeder pipes and start a
subprocess running in it, a further magic call every time the PuTTY
window is resized, and detect the end of the session by watching for
the subprocess terminating. But apart from that, all it has to do is
pass data back and forth unmodified between those pipes and the
backend's associated Seat!
That said, this is new and experimental, and there will undoubtedly be
issues. One that I already know about is that you can't copy and paste
a word that has wrapped between lines without getting an annoying
newline in the middle of it. As far as I can see this is a fundamental
limitation: the ConPTY system sends the _same_ escape sequence stream
for a line that wrapped as it would send for a line that had a logical
\n at what would have been the wrap point. Probably the best we can do
to mitigate this is to adopt a different heuristic for newline elision
that's right more often than it's wrong.
For the moment, that experimental-ness is indicated by the fact that
Buildscr will build, sign and deliver a copy of pterm.exe for each
flavour of Windows, but won't include it in the .zip file or in the
installer. (In fact, that puts it in exactly the same ad-hoc category
as PuTTYtel, although for completely different reasons.)
This prepares the ground for a second essentially similarly-shaped
program reusing most of window.c but handling its command line and
startup differently. A couple of large parts of WinMain() to do with
backend selection and command-line handling are now subfunctions in a
separate file putty.c.
Also, our custom AppUserModelId is defined in that file, so that it
can vary with the client application.
The code to find out the location of the c:\windows\system32 directory
was already present, in load_system32_dll(). Now it's moved out into a
function of its own, so it can be called in other contexts.
Now that the main source file of Plink in each platform directory has
the same name, we can put centralise the main definition of the
program in the main CMakeLists.txt, and in the platform directory,
just add the few extra modules needed to clear up platform-specific
details.
The same goes for psocks. And PSCP and PSFTP could have been moved to
the top level already - I just hadn't done it in the initial setup.
This gets rid of all those annoying 'win', 'ux' and 'gtk' prefixes
which made filenames annoying to type and to tab-complete. Also, as
with my other recent renaming sprees, I've taken the opportunity to
expand and clarify some of the names so that they're not such cryptic
abbreviations.
This clears up another large pile of clutter at the top level, and in
the process, allows me to rename source files to things that don't all
have that annoying 'ssh' prefix at the top.
add_platform_sources_to_library() is now called
add_sources_from_current_dir(), so that it will make sense when I use
it in subdirectories that aren't for a particular platform.
I found these while going through the code, and decided if we're going
to have them then we should compile them. They didn't all compile
first time, proving my point :-)
I've enhanced the tree234 test so that it has a verbose option, which
by default is off.
This new implementation uses the same optimisation-barrier technique
that I used in various places in testsc: have a no-op function, and a
volatile function pointer pointing at it, and then call through the
function pointer, so that nothing actually happens (apart from the
physical call and return) but the compiler has to assume that
_anything_ might have happened.
Doing this just after a memset enforces that the compiler can't have
thrown away the memset, because the called function might (for
example) check that all the memory really is zero and abort if not.
I've been turning this over in my mind ever since coming up with the
technique for testsc. I think it's far more robust than the previous
smemclr technique: so much so that I'm switching to using it
_everywhere_, and no longer using platform alternatives like Windows's
SecureZeroMemory().
This is a module that I'd noticed in the past was too monolithic.
There's a big pile of stub functions in uxpgnt.c that only have to be
there because the implementation of true X11 _forwarding_ (i.e.
actually managing a channel within an SSH connection), which Pageant
doesn't need, was in the same module as more general X11-related
utility functions which Pageant does need.
So I've broken up this awkward monolith. Now x11fwd.c contains only
the code that really does all go together for dealing with SSH X
forwarding: the management of an X forwarding channel (including the
vtables to make it behave as Channel at the SSH end and a Plug at the
end that connects to the local X server), and the management of
authorisation for those channels, including maintaining a tree234 of
possible auth values and verifying the one we received.
Most of the functions removed from this file have moved into the utils
subdir, and also into the utils library (i.e. further down the link
order), because they were basically just string and data processing.
One exception is x11_setup_display, which parses a display string and
returns a struct telling you everything about how to connect to it.
That talks to the networking code (it does name lookups and makes a
SockAddr), so it has to live in the network library rather than utils,
and therefore it's not in the utils subdirectory either.
The other exception is x11_get_screen_number, which it turned out
nothing called at all! Apparently the job it used to do is now done as
part of x11_setup_display. So I've just removed it completely.
Now that the new CMake build system is encouraging us to lay out the
code like a set of libraries, it seems like a good idea to make them
look more _like_ libraries, by putting things into separate modules as
far as possible.
This fixes several previous annoyances in which you had to link
against some object in order to get a function you needed, but that
object also contained other functions you didn't need which included
link-time symbol references you didn't want to have to deal with. The
usual offender was subsidiary supporting programs including misc.c for
some innocuous function and then finding they had to deal with the
requirements of buildinfo().
This big reorganisation introduces three new subdirectories called
'utils', one at the top level and one in each platform subdir. In each
case, the directory contains basically the same files that were
previously placed in the 'utils' build-time library, except that the
ones that were extremely miscellaneous (misc.c, utils.c, uxmisc.c,
winmisc.c, winmiscs.c, winutils.c) have been split up into much
smaller pieces.
This brings various concrete advantages over the previous system:
- consistent support for out-of-tree builds on all platforms
- more thorough support for Visual Studio IDE project files
- support for Ninja-based builds, which is particularly useful on
Windows where the alternative nmake has no parallel option
- a really simple set of build instructions that work the same way on
all the major platforms (look how much shorter README is!)
- better decoupling of the project configuration from the toolchain
configuration, so that my Windows cross-building doesn't need
(much) special treatment in CMakeLists.txt
- configure-time tests on Windows as well as Linux, so that a lot of
ad-hoc #ifdefs second-guessing a particular feature's presence from
the compiler version can now be replaced by tests of the feature
itself
Also some longer-term software-engineering advantages:
- other people have actually heard of CMake, so they'll be able to
produce patches to the new build setup more easily
- unlike the old mkfiles.pl, CMake is not my personal problem to
maintain
- most importantly, mkfiles.pl was just a horrible pile of
unmaintainable cruft, which even I found it painful to make changes
to or to use, and desperately needed throwing in the bin. I've
already thrown away all the variants of it I had in other projects
of mine, and was only delaying this one so we could make the 0.75
release branch first.
This change comes with a noticeable build-level restructuring. The
previous Recipe worked by compiling every object file exactly once,
and then making each executable by linking a precisely specified
subset of the same object files. But in CMake, that's not the natural
way to work - if you write the obvious command that puts the same
source file into two executable targets, CMake generates a makefile
that compiles it once per target. That can be an advantage, because it
gives you the freedom to compile it differently in each case (e.g.
with a #define telling it which program it's part of). But in a
project that has many executable targets and had carefully contrived
to _never_ need to build any module more than once, all it does is
bloat the build time pointlessly!
To avoid slowing down the build by a large factor, I've put most of
the modules of the code base into a collection of static libraries
organised vaguely thematically (SSH, other backends, crypto, network,
...). That means all those modules can still be compiled just once
each, because once each library is built it's reused unchanged for all
the executable targets.
One upside of this library-based structure is that now I don't have to
manually specify exactly which objects go into which programs any more
- it's enough to specify which libraries are needed, and the linker
will figure out the fine detail automatically. So there's less
maintenance to do in CMakeLists.txt when the source code changes.
But that reorganisation also adds fragility, because of the trad Unix
linker semantics of walking along the library list once each, so that
cyclic references between your libraries will provoke link errors. The
current setup builds successfully, but I suspect it only just manages
it.
(In particular, I've found that MinGW is the most finicky on this
score of the Windows compilers I've tried building with. So I've
included a MinGW test build in the new-look Buildscr, because
otherwise I think there'd be a significant risk of introducing
MinGW-only build failures due to library search order, which wasn't a
risk in the previous library-free build organisation.)
In the longer term I hope to be able to reduce the risk of that, via
gradual reorganisation (in particular, breaking up too-monolithic
modules, to reduce the risk of knock-on references when you included a
module for function A and it also contains function B with an
unsatisfied dependency you didn't really need). Ideally I want to
reach a state in which the libraries all have sensibly described
purposes, a clearly documented (partial) order in which they're
permitted to depend on each other, and a specification of what stubs
you have to put where if you're leaving one of them out (e.g.
nocrypto) and what callbacks you have to define in your non-library
objects to satisfy dependencies from things low in the stack (e.g.
out_of_memory()).
One thing that's gone completely missing in this migration,
unfortunately, is the unfinished MacOS port linked against Quartz GTK.
That's because it turned out that I can't currently build it myself,
on my own Mac: my previous installation of GTK had bit-rotted as a
side effect of an Xcode upgrade, and I haven't yet been able to
persuade jhbuild to make me a new one. So I can't even build the MacOS
port with the _old_ makefiles, and hence, I have no way of checking
that the new ones also work. I hope to bring that port back to life at
some point, but I don't want it to block the rest of this change.