The immediate usefulness of this is in pterm.exe: when the user uses
-e to specify a command to run in the pterm, we retrieve the command
in Unicode, store it in CONF_remote_cmd as UTF-8, and then in conpty.c
we can extract it in the same form and convert it back to Unicode to
pass losslessly to CreateProcessW. So now non-ACP Unicode works in
that part of the pterm command line.
Turns out that standard C 'size_t' and the Win32 API's 'SIZE_T' aren't
the same integer type in all cases! They have the same _size_, but in
32-bit, one of them is a typedef for 'unsigned int' and the other for
'unsigned long', leading to annoying pointer-type mismatch warnings.
Unlike on Unix, a Windows process's exit status is a DWORD, i.e. a
32-bit unsigned integer. And exit statuses seen in practice can range
up into the high half of that space. For example, if a process dies of
an 'illegal instruction' exception, then the exit status retrieved by
GetExitCodeProcess will be 0xC000001D == STATUS_ILLEGAL_INSTRUCTION.
If this happens to the process running inside pterm.exe, then
conpty->exitstatus will be set to a value greater than INT_MAX, which
will cause conpty_exitcode to return -1. Unfortunately, a -1 return
from conpty_exitstatus is treated by front ends as saying that the
backend process hasn't exited yet, and is still running. So pterm will
sit around after its subprocess has already terminated, contrary to
your 'close on exit' setting.
Moreover, when cmd.exe exits, it apparently passes on to its parent
process the exit status of the last subcommand it ran. So if you run a
Windows pterm containing an ordinary interactive console session, and
the last subprogram you happen to run inside that session dies of a
fatal signal, then the same thing will happen after you type 'exit' at
the command prompt.
This has been happening to me intermittently ever since I created
pterm.exe in the first place, and I guessed completely wrong about the
cause (I feared some kind of subtle race condition in pterm's use of
the process API). I've only just managed to reproduce it reliably
enough to debug, and I'm relieved to find it's much simpler than that!
The immediate fix, in any case, is to ensure we don't return -1 from
conpty_exitcode unless the process really is still running. And don't
return INT_MAX either, because that indicates 'unclean exit' in a way
that triggers 'close window only on clean exit' (and even Unix pterm
doesn't consider the primary subprocess dying of a signal to count as
unclean). So we clip all out-of-range Windows exception codes to
INT_MAX-1.
In the longer term I think it would be nice to turn exit codes into a
full 32-bit space, and move the special values completely out of it.
That would permit actually keeping the exact exception code and
passing it on to a caller who needed it. For example, if we were to
write a Windows psusan (which I could imagine being occasionally
useful), this way it would be able to return the unmodified full
Windows exit code via the "exit-status" chanreq. But I don't think we
currently have any clients needing that much detail, so that's a more
intrusive cleanup for a later date.
Now every struct that implements the Backend trait is completely
cleared before we start initialising any of its fields. This will mean
I can add new fields that default to 0 or NULL, without having to mess
around initialising them explicitly everywhere.
The current 'displayname' field is designed for presenting in the
config UI, so it starts with a capital letter even when it's not a
proper noun. If I want to name the backend in the middle of a
sentence, I'll need a version that starts with lowercase where
appropriate.
The old field is renamed displayname_tc, to avoid ambiguity.
When a writable HANDLE is managed by the handle-io.c system, you ask
to send EOF on the handle by calling handle_write_eof. That waits
until all buffered data has been written, and then sends an EOF event
by simply closing the handle.
That is, of course, the only way to send an EOF signal on a handle at
all. And yet, it's a bug, because the handle_output system does not
take ownership of the handle you give it: the client of handle_output
retains ownership, keeps its own copy of the handle, and will expect
to close it itself.
In most cases, the extra close will harmlessly fail, and return
ERROR_INVALID_HANDLE (which the caller didn't notice anyway). But if
you're unlucky, in conditions of frantic handle opening and closing
(e.g. with a lot of separate named-pipe-style agent forwarding
connections being constantly set up and torn down), the handle value
might have been reused between the two closes, so that the second
CloseHandle closes an unrelated handle belonging to some other part of
the program.
We can't fix this by giving handle_output permanent ownership of the
handle, because it really _is_ necessary for copies of it to survive
elsewhere: in particular, for a bidirectional file such as a serial
port or named pipe, the reading side also needs a copy of the same
handle! And yet, we can't replace the handle_write_eof call in the
client with a direct CloseHandle, because that won't wait until
buffered output has been drained.
The solution is that the client still calls handle_write_eof to
register that it _wants_ an EOF sent; the handle_output system will
wait until it's ready, but then, instead of calling CloseHandle, it
will ask its _client_ to close the handle, by calling the provided
'sentdata' callback with the new 'close' flag set to true. And then
the client can not only close the handle, but do whatever else it
needs to do to record that that has been done.
On a similar theme of separating the query operation from the
attempted change, backend_send() now no longer has the side effect of
returning the current size of the send buffer. Instead, you have to
call backend_sendbuffer() every time you want to know that.
Before commit 6e69223dc2, Pageant would stop working after a
certain number of PuTTYs were active at the same time. (At most about
60, but maybe fewer - see below.)
This was because of two separate bugs. The easy one, fixed in
6e69223dc2 itself, was that PuTTY left each named-pipe connection
to Pageant open for the rest of its lifetime. So the real problem was
that Pageant had too many active connections at once. (And since a
given PuTTY might make multiple connections during userauth - one to
list keys, and maybe another to actually make a signature - that was
why the number of _PuTTYs_ might vary.)
It was clearly a bug that PuTTY was leaving connections to Pageant
needlessly open. But it was _also_ a bug that Pageant couldn't handle
more than about 60 at once. In this commit, I fix that secondary bug.
The cause of the bug is that the WaitForMultipleObjects function
family in the Windows API have a limit on the number of HANDLE objects
they can select between. The limit is MAXIMUM_WAIT_OBJECTS, defined to
be 64. And handle-io.c was using a separate event object for each I/O
subthread to communicate back to the main thread, so as soon as all
those event objects (plus a handful of other HANDLEs) added up to more
than 64, we'd start passing an overlarge handle array to
WaitForMultipleObjects, and it would start not doing what we wanted.
To fix this, I've reorganised handle-io.c so that all its subthreads
share just _one_ event object to signal readiness back to the main
thread. There's now a linked list of 'struct handle' objects that are
ready to be processed, protected by a CRITICAL_SECTION. Each subthread
signals readiness by adding itself to the linked list, and setting the
event object to indicate that the list is now non-empty. When the main
thread receives the event, it iterates over the whole list processing
all the ready handles.
(Each 'struct handle' still has a separate event object for the main
thread to use to communicate _to_ the subthread. That's OK, because no
thread is ever waiting on all those events at once: each subthread
only waits on its own.)
The previous HT_FOREIGN system didn't really fit into this framework.
So I've moved it out into its own system. There's now a handle-wait.c
which deals with the relatively simple job of managing a list of
handles that need to be waited for, each with a callback function;
that's what communicates a list of HANDLEs to event loops, and
receives the notification when the event loop notices that one of them
has done something. And handle-io.c is now just one client of
handle-wait.c, providing a single HANDLE to the event loop, and
dealing internally with everything that needs to be done when that
handle fires.
The new top-level handle-wait.c system *still* can't deal with more
than MAXIMUM_WAIT_OBJECTS. At the moment, I'm reasonably convinced it
doesn't need to: the only kind of HANDLE that any of our tools could
previously have needed to wait on more than one of was the one in
handle-io.c that I've just removed. But I've left some assertions and
a TODO comment in there just in case we need to change that in future.
This fulfills our long-standing Mayhem-difficulty wishlist item
'win-command-prompt': this is a Windows pterm in the sense that when
you run it you get a local cmd.exe running inside a PuTTY-style window.
Advantages of this: you get the same free choice of fonts as PuTTY has
(no restriction to a strange subset of the system's available fonts);
you get the same copy-paste gestures as PuTTY (no mental gear-shifting
when you have command prompts and SSH sessions open on the same
desktop); you get scrollback with the PuTTY semantics (scrolling to
the bottom gets you to where the action is, as opposed to the way you
could accidentally find yourself 500 lines past the end of the action
in a real console).
'win-command-prompt' was at Mayhem difficulty ('Probably impossible')
basically on the grounds that with Windows's old APIs for accessing
the contents of consoles, there was no way I could find to get this to
work sensibly. What was needed to make it feasible was a major piece
of re-engineering work inside Windows itself.
But, of course, that's exactly what happened! In 2019, the new ConPTY
API arrived, which lets you create an object that behaves like a
Windows console at one end, and round the back, emits a stream of
VT-style escape sequences as the screen contents evolve, and accepts a
VT-style input stream in return which it will parse function and arrow
keys out of in the usual way.
So now it's actually _easy_ to get this to basically work. The new
backend, in conpty.c, has to do a handful of magic Windows API calls
to set up the pseudo-console and its feeder pipes and start a
subprocess running in it, a further magic call every time the PuTTY
window is resized, and detect the end of the session by watching for
the subprocess terminating. But apart from that, all it has to do is
pass data back and forth unmodified between those pipes and the
backend's associated Seat!
That said, this is new and experimental, and there will undoubtedly be
issues. One that I already know about is that you can't copy and paste
a word that has wrapped between lines without getting an annoying
newline in the middle of it. As far as I can see this is a fundamental
limitation: the ConPTY system sends the _same_ escape sequence stream
for a line that wrapped as it would send for a line that had a logical
\n at what would have been the wrap point. Probably the best we can do
to mitigate this is to adopt a different heuristic for newline elision
that's right more often than it's wrong.
For the moment, that experimental-ness is indicated by the fact that
Buildscr will build, sign and deliver a copy of pterm.exe for each
flavour of Windows, but won't include it in the .zip file or in the
installer. (In fact, that puts it in exactly the same ad-hoc category
as PuTTYtel, although for completely different reasons.)