In the previous state of the code, we first tested agent_exists() to
decide whether to be the long-running Pageant server or a short-lived
client; then, during the command-line parsing loop, we prompted for
passphrases to add keys presented on the command line (to ourself or
the server, respectively); *then* we set up the named pipe and
WM_COPYDATA receiver window to actually start functioning as a server,
if we decided that was our role.
A consequence is that if a user started up two Pageants each with an
encrypted key on the command line, there would be a race condition:
each one would decide that it was _going_ to be the server, then
prompt for a passphrase, and then try to set itself up as the server
once the passphrase was entered. So whichever one's passphrase prompt
was answered second would add its key to its own internal data
structures, then fail to set up the server's named pipe, terminate
with an error, and end up not having added its key to the _surviving_
server.
This change reorders the setup steps so that the command-line parsing
loop does not add the keys immediately; instead it merely caches the
key filenames provided. Then we make the decision about whether we're
the server, and set up both the named pipe and WM_COPYDATA window if
we are; and finally, we go back to our list of key filenames and
actually add them, either to ourself (if we're the server) or to some
other Pageant (if we're a client).
Moreover, the decision about whether to be the server is now wrapped
in an interprocess mutex similar to the one used in connection
sharing, which means that even if two or more Pageants are started up
as close to simultaneously as possible, they should avoid a race
condition and successfully manage to agree on exactly one of
themselves to be the server. For example, a user reported that this
could occur if you put shortcuts to multiple private key files in your
Windows Startup folder, so that they were all launched simultaneously
at startup.
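The shape of that arrangement is roughly as follows (a sketch only: the
mutex name and the details around it are illustrative, not the real
code):

    /* Sketch: serialise the "who becomes the server?" decision.
     * The mutex name below is made up for illustration. */
    HANDLE mutex = CreateMutexA(NULL, FALSE, "Local\\pageant-startup-mutex");
    WaitForSingleObject(mutex, INFINITE);

    bool we_are_server = !agent_exists();
    if (we_are_server) {
        /* set up the named pipe and WM_COPYDATA window *now*, while we
         * still hold the mutex, so that a second Pageant started in
         * parallel will see agent_exists() return true */
    }

    ReleaseMutex(mutex);
    CloseHandle(mutex);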
One slightly odd behaviour that remains: if the server Pageant has to
prompt for private key passphrases at startup, then it won't actually
start _servicing_ requests from other Pageants until it's finished
dealing with its own prompts. As a result, if you do start up two
Pageants at once each with an encrypted key file on its command line,
the second one won't even manage to present its passphrase prompt
until the first one's prompt is dismissed, because it will block
waiting for the initial check of the key list. But it will get there
in the end, so that's only a cosmetic oddity.
It would be nice to arrange that Pageant GUI operations don't block
unrelated agent requests (e.g. by having the GUI and the agent code
run in separate threads). But that's a bigger problem, not specific to
startup time - the same thing happens if you interactively load a key
via Pageant's file dialog. And it would require a major reorganisation
to fix that fully, because currently the GUI code depends on being
able to call _internal_ Pageant query functions like
pageant_count_ssh2_keys() that don't work by constructing an agent
request at all.
All the seat functions that request an interactive prompt of some kind
to the user - both the main seat_get_userpass_input and the various
confirmation dialogs for things like host keys - were using a simple
int return value, with the general semantics of 0 = "fail", 1 =
"proceed" (and in the case of seat_get_userpass_input, answers to the
prompts were provided), and -1 = "request in progress, wait for a
callback".
In this commit I change all those functions' return types to a new
struct called SeatPromptResult, whose primary field is an enum
replacing those simple integer values.
The main purpose is that the enum has not three but _four_ values: the
"fail" result has been split into 'user abort' and 'software abort'.
The distinction is that a user abort occurs as a result of an
interactive UI action, such as the user clicking 'cancel' in a dialog
box or hitting ^D or ^C at a terminal password prompt - and therefore,
there's no need to display an error message telling the user that the
interactive operation has failed, because the user already knows,
because they _did_ it. 'Software abort' is from any other cause, where
PuTTY is the first to know there was a problem, and has to tell the
user.
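In outline, the new enumeration looks something like this (the constant
names here are approximate):

    typedef enum SeatPromptResultKind {
        SPRK_INCOMPLETE,   /* request in progress, wait for a callback   */
        SPRK_USER_ABORT,   /* user cancelled: no error message needed    */
        SPRK_SW_ABORT,     /* software abort: must tell the user why     */
        SPRK_OK            /* proceed (prompt answers are filled in)     */
    } SeatPromptResultKind;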
We already had this 'user abort' vs 'software abort' distinction in
other parts of the code - the SSH backend has separate termination
functions which protocol layers can call. But we assumed that any
failure from an interactive prompt request fell into the 'user abort'
category, which is not true. A couple of examples: if you configure a
host key fingerprint in your saved session via the SSH > Host keys
pane, and the server presents a host key that doesn't match it, then
verify_ssh_host_key would report that the user had aborted the
connection, and feel no need to tell the user what had gone wrong!
Similarly, if a password provided on the command line was not
accepted, then (after I fixed the semantics of that in the previous
commit) the same wrong handling would occur.
So now, those Seat prompt functions too can communicate whether the
user or the software originated a connection abort. And in the latter
case, we also provide an error message to present to the user. Result:
in those two example cases (and others), error messages should no
longer go missing.
Implementation note: to avoid the hassle of having the error message
in a SeatPromptResult being a dynamically allocated string (and hence,
every recipient of one must always check whether it's non-NULL and
free it on every exit path, plus being careful about copying the
struct around), I've instead arranged that the structure contains a
function pointer and a couple of parameters, so that the string form
of the message can be constructed on demand. That way, the only users
who need to free it are the ones who actually _asked_ for it in the
first place, which is a much smaller set.
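Concretely, the arrangement is something like this (field and helper
names are approximate, not the exact ones):

    typedef struct SeatPromptResult {
        SeatPromptResultKind kind;
        /* Only meaningful for software aborts: instead of storing an
         * allocated string, store how to construct one on demand. */
        void (*errfn)(struct SeatPromptResult spr, BinarySink *bs);
        const char *errdata_lit;   /* e.g. a string-literal parameter */
        unsigned errdata_u;        /* e.g. a numeric parameter        */
    } SeatPromptResult;

    /* Only a caller that actually wants the message pays for it: */
    char *spr_get_error_message(SeatPromptResult spr)
    {
        strbuf *sb = strbuf_new();
        spr.errfn(spr, BinarySink_UPCAST(sb));
        return strbuf_to_str(sb);      /* caller frees */
    }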
(This is one of the rare occasions that I regret not having C++'s
extra features available in this code base - a unique_ptr or
shared_ptr to a string would have been just the thing here, and the
compiler would have done all the hard work for me of remembering where
to insert the frees!)
This fills in the remaining gap in the interactive prompt rework of
the proxy system in general. If you used the Telnet proxy with a
command containing %user or %pass, and hadn't filled in those
variables in the PuTTY config, then proxy/telnet.c would prompt you at
run time to enter the proxy auth details. But the local proxy command,
which uses the same format_telnet_command function, would not do that.
Now it does!
I've implemented this by moving the formatting of the proxy command
into a new module proxy/local.c, shared between both the Unix and
Windows local-proxy implementations. That module implements a
DeferredSocketOpener, which constructs the proxy command (prompting
first if necessary), and once it's constructed, hands it to a
per-platform function platform_setup_local_proxy().
So each platform-specific proxy function, instead of starting a
subprocess there and then and passing its details to make_fd_socket or
make_handle_socket, now returns a _deferred_ version of one of those
sockets, with the DeferredSocketOpener being the thing in
proxy/local.c. When that calls back to platform_setup_local_proxy(),
we actually start the subprocess and pass the resulting fds/handles to
the deferred socket to un-defer it.
A side effect of the rewrite is that when proxy commands are logged in
the Event Log, they now get the same amenities as in the Telnet proxy
type: the proxy password is sanitised out, and any difficult
characters are escaped.
Previously, a setup function returning one of these socket types (such
as platform_new_connection) had to do all its setup synchronously,
because if it was going to call make_fd_socket or make_handle_socket,
it had to have the actual fds or HANDLEs ready-made. If some kind of
asynchronous operation were needed before those fds become available,
there would be no way the function could achieve it, except by
becoming a whole extra permanent Socket wrapper layer.
Now there is, because you can make an FdSocket when you don't yet have
the fds, or a HandleSocket without the HANDLEs. Instead, you provide
an instance of the new trait 'DeferredSocketOpener', which is
responsible for setting in motion whatever asynchronous setup
procedure it needs, and when that finishes, calling back to
setup_fd_socket / setup_handle_socket to provide the missing pieces.
In the meantime, the FdSocket or HandleSocket will sit there inertly,
buffering any data the client might eagerly hand it via sk_write(),
and waiting for its setup to finish. When it does finish, buffered
data will be released.
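The overall shape is something like this (a sketch: the
deferred-construction function name is my guess, and the declarations
may differ in detail from the real ones):

    typedef struct DeferredSocketOpener DeferredSocketOpener;
    typedef struct DeferredSocketOpenerVtable {
        void (*free)(DeferredSocketOpener *opener);   /* abandon / clean up */
    } DeferredSocketOpenerVtable;
    struct DeferredSocketOpener { const DeferredSocketOpenerVtable *vt; };

    /* make an FdSocket with no fds yet, owned by an opener ...           */
    Socket *make_deferred_fd_socket(DeferredSocketOpener *opener,
                                    SockAddr *addr, int port, Plug *plug);
    /* ... and when the opener's asynchronous setup completes, it calls
     * this to supply the fds, releasing any writes buffered meanwhile:   */
    void setup_fd_socket(Socket *s, int infd, int outfd, int inerrfd);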
In FdSocket, this is easy enough, because we were doing our own
buffering anyway - we called the uxsel system to find out when the fds
were readable/writable, and then wrote to them from our own bufchain.
So more or less all I had to do was make the try_send function do
nothing if the setup phase wasn't finished yet.
In HandleSocket, on the other hand, we're passing all our data to the
underlying handle-io.c system, and making _that_ deferrable in the
same way would be much more painful, because that's the place where
the scary threads live. So instead I've arranged it by replacing the
whole vtable, so that a deferred HandleSocket and a normal
HandleSocket are effectively separate trait implementations that can
share their state structure. And in fact that state struct itself now
contains a big anonymous union, containing one branch to go with each
vtable.
Nothing yet uses this system, but the next commit will do so.
This will mean that platform-specific proxy types will also be able to
set themselves up as child Interactors and prompt the user
interactively for passwords and the like.
NFC: nothing uses the new parameter yet.
The return value of term_data() is used as the return value from the
GUI-terminal versions of the Seat output method, which means backends
will take it to be the amount of standard-output data currently
buffered, and exert back-pressure on the remote peer if it gets too
big (e.g. by ceasing to extend the window in that particular SSH-2
channel).
Historically, as a comment in term_data() explained, we always just
returned 0 from that function, on the basis that we were processing
all the terminal data through our terminal emulation code immediately,
and never retained any of it in the buffer at all. If the terminal
emulation code were to start running slowly, then it would slow down
the _whole_ PuTTY system, due to single-threadedness, and
back-pressure of a sort would be exerted on the remote by it simply
failing to get round to reading from the network socket. But by the
time we got back to the top level of term_data(), we'd have finished
reading all the data we had, so it was still appropriate to return 0.
That comment is still correct if you're thinking about the limiting
factor on terminal data processing being the CPU usage in term_out().
But now that's no longer the whole story, because sometimes we leave
data in term->inbuf without having processed it: during drag-selects
in the terminal window, and (just introduced) while waiting for the
response to a pending window resize request. For both those reasons,
we _don't_ always have a buffer size of zero when we return from
term_data().
So now that hole in our buffer size management is filled in:
term_data() returns the true size of the remaining unprocessed
terminal output, so that back-pressure will be exerted if the terminal
is currently not consuming it. And when processing resumes and we
start to clear our backlog, we call backend_unthrottle to let the
backend know it can relax the back-pressure if necessary.
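In outline (simplified, not the literal code):

    size_t term_data(Terminal *term, const void *data, size_t len)
    {
        bufchain_add(&term->inbuf, data, len);
        term_out(term);                       /* consume as much as we can now */
        return bufchain_size(&term->inbuf);   /* honest backlog => back-pressure */
    }

    /* ...and wherever the backlog is later drained: */
    static void term_backlog_cleared(Terminal *term)   /* illustrative name */
    {
        backend_unthrottle(term->backend, bufchain_size(&term->inbuf));
    }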
Previously, the Windows implementation of win_request_resize would
call term_size() to tell the terminal about its new size, _before_
calling SetWindowPos to actually change the window size.
If SetWindowPos ends up not getting the exact window size it asks for,
Windows notifies the application by sending back a WM_SIZE message -
synchronously, by calling back to the window procedure from within
SetWindowPos. So after the first over-optimistic term_size(), the
WM_SIZE handler would trigger a second one, resetting the terminal
again to the _actual_ size.
This was more or less harmless, since current handling of terminal
resizes is such that no terminal data gets too confused: any lines
pushed into the scrollback by the first term_size will be pulled back
out by the second one if needed (or vice versa), and since commit
271de3e4ec, individual termlines are not eagerly truncated by
resizing them twice.
However, it's more work than we really wanted to be doing, and it
seems presumptuous to send term_size() before we know it's right! So
now we send term_size() after SetWindowPos returns, unless it already
got sent by a re-entrant call to the WM_SIZE handler _before_
SetWindowPos returned. That way, the terminal should get exactly one
term_size() call in response to win_request_resize(), whether it got
the size it was expecting or not.
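In outline, using an illustrative flag name (the fragments are shown in
their surrounding contexts rather than as complete functions):

    static bool term_size_already_sent;

    /* in the WM_SIZE handler: */
    term_size(term, rows, cols, savelines);
    term_size_already_sent = true;

    /* in win_request_resize(): */
    term_size_already_sent = false;
    SetWindowPos(hwnd, NULL, 0, 0, pixel_width, pixel_height,
                 SWP_NOMOVE | SWP_NOZORDER);     /* may re-enter WM_SIZE */
    if (!term_size_already_sent)
        term_size(term, rows, cols, savelines); /* now we know the real size */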
This commit replaces all those fiddly little linking modules
(be_all.c, be_none.c, be_ssh.c etc) with a single source file
controlled by ifdefs, and introduces a function be_list() in
setup.cmake that makes it easy to compile a version of it appropriate
to each application.
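The single source file has roughly this shape (the macro names and the
list contents here are illustrative; the CMake-side be_list() is what
arranges which of them get defined for each target):

    const struct BackendVtable *const backends[] = {
    #ifdef SSH
        &ssh_backend,
    #endif
    #ifdef SERIAL
        &serial_backend,
    #endif
    #ifdef OTHERBACKENDS
        &telnet_backend,
        &rlogin_backend,
        &raw_backend,
    #endif
        NULL
    };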
This is a net reduction in code according to 'git diff --stat', even
though I've introduced more comments. It also gets rid of another pile
of annoying little source files in the top-level directory that didn't
deserve to take up so much room in 'ls'.
More concretely, doing this has some maintenance advantages.
Centralisation means less to maintain (e.g. n_ui_backends is worked
out once in a way that makes sense everywhere), and also, 'appname'
can now be reliably set per program. Previously, some programs got the
wrong appname due to sharing the same linking module (e.g. Plink had
appname="PuTTY"), which was a latent bug that would have manifested if
I'd wanted to reuse the same string in another context.
One thing I've changed in this rework is that Windows pterm no longer
has the ConPTY backend in its backends[]: it now has an empty one. The
special be_conpty.c module shouldn't really have been there in the
first place: it was used in the very earliest uncommitted drafts of
the ConPTY work, where I was using another method of selecting that
backend, but now that Windows pterm has a dedicated
backend_vt_from_conf() that refers to conpty_backend by name, it has
no need to live in backends[] at all, just as it doesn't have to in
Unix pterm.
I'm actually quite surprised there was only _one_ copy of each of
these standard macros in the code base, given my general habit of
casually redefining them anywhere I need them! But each one was in a
silly place. Moved them up to the top level where they're available
globally.
While I'm in the mood for cleaning up the top-level directory here:
all the 'nostuff.c' files have moved into a new 'stubs' directory, and
I broke up be_misc.c into smaller modules that can live in 'utils'.
The Telnet proxy system is not a proper network protocol - we have no
reliable way to receive communication from the proxy telling us
whether a password is even required. However, we _do_ know (a) whether
the keywords '%user' or '%pass' appeared in the format string stored
in the Conf, and (b) whether we actually had a username or a password
to substitute into them. So that's how we know whether to ask for a
username or a password: if the format string asks for them and the
Conf doesn't provide them, we prompt for them at startup.
This involved turning TelnetProxyNegotiator into a coroutine (matching
all the other proxy types, but previously, it was the only one simple
enough not to need to be one), so that it can wait until a response
arrives to that prompt. (And also, as it turned out, so that it can
wait until setup is finished before even presenting the prompt!)
It also involves having format_telnet_command grow an extra output
parameter, in the form of 'unsigned *flags', with which it can
communicate back to the caller that a username or password was wanted
but not found. The other clients of that function (the local proxy
implementations) don't use those flags, but if necessary, they could.
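On the caller side that looks roughly like this (the flag names and the
exact argument list are assumptions):

    unsigned flags = 0;
    strbuf *cmd = strbuf_new();
    format_telnet_command(cmd, format, conf, &flags);
    if (flags & TELNET_CMD_MISSING_USERNAME)
        /* add a username prompt to the prompts_t */;
    if (flags & TELNET_CMD_MISSING_PASSWORD)
        /* add a password prompt to the prompts_t */;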
When I wanted to append an ordinary C string to a BinarySink, without
any prefix length field or suffix terminator, I was using the idiom
put_datapl(bs, ptrlen_from_asciz(string));
but I've finally decided that's too cumbersome, and it deserves a
shorter name. put_dataz(bs, string) now does the same thing - in fact
it's a macro expanding to exactly the above.
While I'm at it, I've also added put_datalit(), which is the same
except that it expects a C string literal (and will enforce that at
compile time, via PTRLEN_LITERAL which it calls in turn). You can use
that where possible to avoid the run-time cost of the strlen.
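In other words, the two new macros amount to:

    #define put_dataz(bs, val) \
        put_datapl(bs, ptrlen_from_asciz(val))
    #define put_datalit(bs, val) \
        put_datapl(bs, PTRLEN_LITERAL(val))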
marshal.h now provides a macro put_fmt() which allows you to write
arbitrary printf-formatted data to an arbitrary BinarySink.
We already had this facility for strbufs in particular, in the form of
strbuf_catf(). That was able to take advantage of knowing the inner
structure of a strbuf to minimise memory allocation (it would snprintf
directly into the strbuf's existing buffer if possible). For a general
black-box BinarySink we can't do that, so instead we dupvprintf into a
temporary buffer.
For consistency, I've removed strbuf_catf, and converted all uses of
it into the new put_fmt - and I've also added an extra vtable method
in the BinarySink API, so that put_fmt can still use strbuf_catf's
more efficient memory management when talking to a strbuf, and fall
back to the simpler strategy when that's not available.
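The generic fallback path is conceptually just this (names and details
approximate):

    void BinarySink_put_fmt(BinarySink *bs, const char *fmt, ...)
    {
        va_list ap;
        va_start(ap, fmt);
        char *str = dupvprintf(fmt, ap);   /* format into a temporary buffer */
        va_end(ap);
        bs->write(bs, str, strlen(str));
        sfree(str);
    }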
(TL;DR: to suppress redundant 'Press Return to begin session' prompts
in between hops of a jump-host configuration, in Plink.)
This new query method directly asks the Seat the question: is the same
stream of input used to provide responses to interactive login
prompts, and the session input provided after login concludes?
It's used to suppress the last-ditch anti-spoofing defence in Plink of
interactively asking 'Access granted. Press Return to begin session',
on the basis that any such spoofing attack works by confusing the user
about what's a legit login prompt before the session begins and what's
sent by the server after the main session begins - so if those two
things take input from different places, the user can't be confused.
This doesn't change the existing behaviour of Plink, which was already
suppressing the antispoof prompt in cases where its standard input was
redirected from something other than a terminal. But previously it was
doing it within the can_set_trust_status() seat query, and I've now
moved it out into a separate query function.
The reason why these need to be separate is for SshProxy, which needs
to give an unusual combination of answers when run inside Plink. For
can_set_trust_status(), it needs to return whatever the parent Seat
returns, so that all the login prompts for a string of proxy
connections in session will be antispoofed the same way. But you only
want that final 'Access granted' prompt to happen _once_, after all
the proxy connection setup phases are done, because up until then
you're still in the safe hands of PuTTY itself presenting an unbroken
sequence of legit login prompts (even if they come from a succession
of different servers). Hence, SshProxy unconditionally returns 'no' to
the query of whether it has a single mixed input stream, because
indeed, it never does - for purposes of session input it behaves like
an always-redirected Plink, no matter what kind of real Seat it ends
up sending its pre-session login prompts to.
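SshProxy's implementation is therefore trivial (the method name is an
assumption):

    static bool sshproxy_has_mixed_input_stream(Seat *seat)
    {
        return false;   /* proxy login prompts never share the session's input */
    }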
Passing an operating-system-specific error code to plug_closing(),
such as errno or GetLastError(), was always a bit weird, given that it
generally had to be handled by cross-platform receiving code in
backends. I had the platform.h implementations #define any error
values that the cross-platform code would have to handle specially,
but that's still not a great system, because it also doesn't leave
freedom to invent error representations of my own that don't
correspond to any OS code. (For example, the ones I just removed from
proxy.h.)
So now, the OS error code is gone from the plug_closing API, and in
its place is a custom enumeration of closure types: normal, error, and
the special case BROKEN_PIPE which is the only OS error code we have
so far needed to handle specially. (All others just mean 'abandon the
connection and print the textual message'.)
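So the closure-type argument is now an enumeration of roughly this
shape (constant names approximate):

    typedef enum PlugCloseType {
        PLUGCLOSE_NORMAL,        /* clean end of connection                   */
        PLUGCLOSE_ERROR,         /* abandon connection, print textual message */
        PLUGCLOSE_BROKEN_PIPE    /* the one OS error still handled specially  */
    } PlugCloseType;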
Having already centralised the handling of OS error codes in the
previous commit, we've now got a convenient place to add any further
type codes for errors needing special handling: each of Unix
plug_closing_errno(), Windows plug_closing_system_error(), and Windows
plug_closing_winsock_error() can easily grow extra special cases if
need be, and each one will only have to live in one place.
Having a single plug_closing() function covering various kinds of
closure is reasonably convenient from the point of view of Plug
implementations, but it's annoying for callers, who all have to fill
in pointless NULL and 0 parameters in the cases where they're not
used.
Added some inline helper functions in network.h alongside the main
plug_closing() dispatch wrappers, so that each kind of connection
closure can present a separate API for the Socket side of the
interface, without complicating the vtable for the Plug side.
Also, added OS-specific extra helpers in the Unix and Windows
directories, which centralise the job of taking an OS error code (of
whatever kind) and translating it into its error message.
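Callers therefore get wrappers along these lines (a sketch: the
signatures, and the way the message is built, are assumed):

    static inline void plug_closing_normal(Plug *p)
    { plug_closing(p, PLUGCLOSE_NORMAL, NULL); }
    static inline void plug_closing_error(Plug *p, const char *msg)
    { plug_closing(p, PLUGCLOSE_ERROR, msg); }

    /* and the OS-specific helpers build the message from an error code: */
    void plug_closing_errno(Plug *p, int error)
    { plug_closing_error(p, strerror(error)); }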
In passing, this removes the horrible ad-hoc made-up error codes in
proxy.h, which is OK, because nothing checked for them anyway, and
also I'm about to do an API change to plug_closing proper that removes
the need for them.
Previously, SSH authentication banners were displayed by calling the
ordinary seat_output function, and passing it a special value in the
SeatOutputType enumeration indicating an auth banner.
The awkwardness of this was already showing a little in SshProxy's
implementation of seat_output, where it had to check for that special
value and do totally different things for SEAT_OUTPUT_AUTH_BANNER and
everything else. Further work in that area is going to make it more
and more awkward if I keep the two output systems unified.
So let's split them up. Now, Seat has separate output() and banner()
methods, which each implementation can override differently if it
wants to.
All the 'end user' Seat implementations use the centralised
implementation function nullseat_banner_to_stderr(), which turns
banner text straight back into SEAT_OUTPUT_STDERR and passes it on to
seat_output. So I didn't have to tediously implement a boring version
of this function in GTK, Windows GUI, consoles, file transfer etc.
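That central helper is essentially:

    size_t nullseat_banner_to_stderr(Seat *seat, const void *data, size_t len)
    {
        return seat_output(seat, SEAT_OUTPUT_STDERR, data, len);
    }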
There are quite a few of them already, and I'm about to make another
one, so let's start with a bit of tidying up.
The CMake build organisation is unchanged: I haven't put the proxy
object files into a separate library, just moved the locations of the
source files. (Organising proxying as a library would be tricky
anyway, because of the various overrides for tools that want to avoid
cryptography.)
Now every struct that implements the Backend trait is completely
cleared before we start initialising any of its fields. This will mean
I can add new fields that default to 0 or NULL, without having to mess
around initialising them explicitly everywhere.
Previously, checking the host key against the persistent cache managed
by the storage.h API was done as part of the seat_verify_ssh_host_key
method, i.e. separately by each Seat.
Now that check is done by verify_ssh_host_key(), which is a new
function in ssh/common.c that centralises all the parts of host key
checking that don't need an interactive prompt. It subsumes the
previous verify_ssh_manual_host_key() that checked against the Conf,
and it does the check against the storage API that each Seat was
previously doing separately. If it can't confirm or definitively
reject the host key by itself, _then_ it calls out to the Seat, once
an interactive prompt is definitely needed.
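The resulting flow, in outline (a sketch: the helper names and
signatures here are simplified or hypothetical):

    int verify_ssh_host_key(Seat *seat, Conf *conf,
                            const char *host, int port, const char *keytype,
                            char *keystr, char *keydisp)
    {
        /* 1. manually-configured host keys in the Conf: accept or reject
         *    outright, never prompt */
        if (have_manual_host_keys(conf))                /* hypothetical helper */
            return manual_host_key_matches(conf, keytype, keystr) ? 1 : 0;

        /* 2. the persistent cache managed by storage.h */
        if (check_stored_host_key(host, port, keytype, keystr) == 0)
            return 1;                                   /* cache confirms it */

        /* 3. only now is an interactive prompt definitely needed */
        return seat_confirm_ssh_host_key(seat, host, port, keytype,
                                         keystr, keydisp);
    }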
The main point of doing this is so that when SshProxy forwards a Seat
call from the proxy SSH connection to the primary Seat, it won't print
an announcement of which connection is involved unless it's actually
going to do something interactive. (Not that we're printing those
announcements _yet_ anyway, but this is a piece of groundwork that
works towards doing so.)
But while I'm at it, I've also taken the opportunity to clean things
up a bit by renaming functions sensibly. Previously we had three very
similarly named functions verify_ssh_manual_host_key(), SeatVtable's
'verify_ssh_host_key' method, and verify_host_key() in storage.h. Now
the Seat method is called 'confirm' rather than 'verify' (since its
job is now always to print an interactive prompt, so it looks more
like the other confirm_foo methods), and the storage.h function is
called check_stored_host_key(), which goes better with store_host_key
and avoids having too many functions with similar names. And the
'manual' function is subsumed into the new centralised code, so
there's now just *one* host key function with 'verify' in the name.
Several functions are reindented in this commit. Best viewed with
whitespace changes ignored.
The current 'displayname' field is designed for presenting in the
config UI, so it starts with a capital letter even when it's not a
proper noun. If I want to name the backend in the middle of a
sentence, I'll need a version that starts with lowercase where
appropriate.
The old field is renamed displayname_tc, to avoid ambiguity.
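So the vtable ends up with something like this (the name of the
lowercase counterpart is my assumption; this commit only performs the
rename):

    const char *displayname_tc;   /* title-case, e.g. "Serial", for config UI   */
    const char *displayname_lc;   /* lowercase, e.g. "serial", for mid-sentence */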
It was totally unused. No implementation of the 'closing' method in a
Plug vtable was checking it for any reason at all, except for
ProxySocket which captured it from its client in order to pass on to
its server (which, perhaps after further iterations of ProxySocket,
would have ended up ignoring it similarly). And every caller of
plug_closing set it to 0 (aka false), except for the one in sshproxy.c
which passed true (but it would have made no difference to anyone).
The comment in network.h refers to a FIXME comment which was in
try_send() when that code was written (see winnet.c in commit
7b0e082700). That FIXME is long gone, replaced by a use of a
toplevel callback. So I think the aim must have been to avoid
re-entrancy when sk_write called try_send which encountered a socket
error and called back to plug_closing - but that's long since fixed by
other means now.
This is the same as the previous FUNKY_XTERM mode if you don't press
any modifier keys, but now Shift or Ctrl or Alt with function keys
adds an extra bitmap parameter. The bitmaps are the same as the ones
used by the new SHARROW_BITMAP arrow key mode.
As well as affecting the bitmap field in the escape sequence, the Alt
modifier was _also_ having its otherwise standard effect of prefixing
Esc to the whole sequence. It shouldn't do both.
This commit introduces a new config option for how to handle shifted
arrow keys.
In the default mode (SHARROW_APPLICATION), we do what we've always
done: Ctrl flips the arrow keys between sending their most usual
escape sequences (ESC [ A ... ESC [ D) and sending the 'application
cursor keys' sequences (ESC O A ... ESC O D). Whichever of those modes
is currently configured, Ctrl+arrow sends the other one.
In the new mode (SHARROW_BITMAP), application cursor key mode is
unaffected by any shift keys, but the default sequences acquire two
numeric arguments. The first argument is 1 (reflecting the fact that a
shifted arrow key still notionally moves just 1 character cell); the
second is the bitmap (1 for Shift) + (2 for Alt) + (4 for Ctrl),
offset by 1. (Except that if _none_ of those modifiers is pressed,
both numeric arguments are simply omitted.)
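For example, with application cursor key mode off, the Up arrow sends:

    Up                ESC [ A
    Shift+Up          ESC [ 1 ; 2 A
    Alt+Up            ESC [ 1 ; 3 A
    Shift+Alt+Up      ESC [ 1 ; 4 A
    Ctrl+Up           ESC [ 1 ; 5 A
    Ctrl+Shift+Up     ESC [ 1 ; 6 A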
The new bitmap mode is what current xterm generates, and also what
Windows ConPTY seems to expect. If you start an ordinary Command
Prompt and launch into WSL, those are the sequences it will generate
for shifted arrow keys; conversely, if you run a Command Prompt within
a ConPTY, then these sequences for Ctrl+arrow will have the effect you
expect in cmd.exe command-line editing (going backward or forward a
word). For that reason, I enable this mode unconditionally when
launching Windows pterm.
While fixing the previous commit I noticed that window titles don't
actually _work_ properly if you change the terminal character set,
because the text accumulated in the OSC string buffer is sent to the
TermWin as raw bytes, with no indication of what character set it
should interpret them as. You might get lucky if you happened to
choose the right charset (in particular, UTF-8 is a common default),
but if you change the charset half way through a run, then there's
certainly no way the frontend will know to interpret two window titles
sent before and after the change in two different charsets.
So, now win_set_title() and win_set_icon_title() both include a
codepage parameter along with the byte string, and it's up to them to
translate the provided window title from that encoding to whatever the
local window system expects to receive.
On Windows, that's wide-string Unicode, so we can just use the
existing dup_mb_to_wc utility function. But in GTK, it's UTF-8, so I
had to write an extra utility function to encode a wide string as
UTF-8.
In the Windows Pageant message loop, we were alternating between
MsgWaitForMultipleObjects with timeout=INFINITE, and
run_toplevel_callbacks. But run_toplevel_callbacks doesn't loop until
the callback queue is empty: it just runs at least one of the
currently queued callbacks, so that some progress was made. So if two
or more callbacks were queued, we'd leave the rest in the queue and go
back into MsgWaitForMultipleObjects, which could hang indefinitely.
(A very silly workaround was available: move the mouse over the
Pageant systray icon! Every mouse event would terminate the wait and
let you get one more iteration of run_toplevel_callbacks.)
Now we do the same thing as in other main loops: if any further
callbacks are pending, then we still run MsgWaitForMultipleObjects,
but we do it with timeout=0 instead of timeout=INFINITE, so that we
won't go back to a _blocking_ sleep until all callbacks have been
serviced.
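The fixed loop is, in outline (simplified):

    MSG msg;
    for (;;) {
        DWORD timeout = toplevel_callback_pending() ? 0 : INFINITE;
        MsgWaitForMultipleObjects(0, NULL, FALSE, timeout, QS_ALLINPUT);

        while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE)) {
            if (msg.message == WM_QUIT)
                goto finished;
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }

        run_toplevel_callbacks();   /* may still leave some callbacks queued */
    }
  finished:;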
When a writable HANDLE is managed by the handle-io.c system, you ask
to send EOF on the handle by calling handle_write_eof. That waits
until all buffered data has been written, and then sends an EOF event
by simply closing the handle.
That is, of course, the only way to send an EOF signal on a handle at
all. And yet, it's a bug, because the handle_output system does not
take ownership of the handle you give it: the client of handle_output
retains ownership, keeps its own copy of the handle, and will expect
to close it itself.
In most cases, the extra close will harmlessly fail, and return
ERROR_INVALID_HANDLE (which the caller didn't notice anyway). But if
you're unlucky, in conditions of frantic handle opening and closing
(e.g. with a lot of separate named-pipe-style agent forwarding
connections being constantly set up and torn down), the handle value
might have been reused between the two closes, so that the second
CloseHandle closes an unrelated handle belonging to some other part of
the program.
We can't fix this by giving handle_output permanent ownership of the
handle, because it really _is_ necessary for copies of it to survive
elsewhere: in particular, for a bidirectional file such as a serial
port or named pipe, the reading side also needs a copy of the same
handle! And yet, we can't replace the handle_write_eof call in the
client with a direct CloseHandle, because that won't wait until
buffered output has been drained.
The solution is that the client still calls handle_write_eof to
register that it _wants_ an EOF sent; the handle_output system will
wait until it's ready, but then, instead of calling CloseHandle, it
will ask its _client_ to close the handle, by calling the provided
'sentdata' callback with the new 'close' flag set to true. And then
the client can not only close the handle, but do whatever else it
needs to do to record that that has been done.
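The revised contract looks something like this (names approximate;
'our_handle' stands for whatever copy of the HANDLE the client keeps):

    typedef void (*handle_outfn_t)(struct handle *h, size_t new_backlog,
                                   int err, bool close);

    static void sentdata(struct handle *h, size_t new_backlog, int err, bool close)
    {
        if (close) {
            /* all buffered data (and the EOF) has gone out, so it's now
             * the *client* that closes its copy of the HANDLE */
            CloseHandle(our_handle);
            our_handle = INVALID_HANDLE_VALUE;
        }
        /* ...usual backlog / error handling continues here... */
    }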
Trying to debug a problem involving threads just now, it turned out
that the version of the diagnostics going to my debug.log was getting
data in a different order from the version going to the debug console.
Now I open and write to debug_fp by going directly to the Win32 API
instead of via a buffering userland stdio, and that seems to have
solved the problem.
Similarly to cmdgen's passphrase options, this replaces the password
on the command line with a filename to read the password out of, which
means it can't show up in 'ps' or the Windows task manager.
The jump host system ought really to be treating SSH authentication
banners as a distinct thing from the standard-error session output, so
that the former can be presented to the user in the same way as the
auth banner for the main session.
This change converts the 'bool is_stderr' parameter of seat_output()
into an enumerated type with three values. For the moment, stderr and
banners are treated the same, but the plan is for that to change.
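So the parameter is now an enumeration of roughly this shape:

    typedef enum SeatOutputType {
        SEAT_OUTPUT_STDOUT,
        SEAT_OUTPUT_STDERR,
        SEAT_OUTPUT_AUTH_BANNER
    } SeatOutputType;

    size_t seat_output(Seat *seat, SeatOutputType type,
                       const void *data, size_t len);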
Now that it's possible for a single invocation of PuTTY to connect to
multiple SSH servers (jump host followed by ultimate destination
host), it's rather unhelpful for host key prompts to just say "the
server". To check an unknown host key, users will need to know _which_
host it's purporting to be the key for.
Another possibility is to put a message in the terminal window
indicating which server we're currently in the SSH setup phase for.
That will certainly be what we have to end up doing for userpass
prompts that appear _in_ the terminal window. But that by itself is
still unhelpful for host key prompts in a separate dialog, because the
user would have to check both windows to get all the information they
need. Easier if the host key dialog itself tells you everything you
need to know to answer the question: is _this_ key the one you expect
for _that_ host?
The format _strings_ were previously centralised into the platform-
independent console.c, as const char arrays. Now the actual formatting
operation is centralised as well, by means of console.c providing a
function that takes all the necessary parameters and returns a
formatted piece of text for the console.
Mostly this is so that I can add extra parameters to the message with
some confidence: changing a format string in one file and two fprintf
statements in other files to match seems like the kind of situation
you wish you hadn't got into in the first place :-)
The system for handling seat_get_userpass_input has always been
structured differently between GUI PuTTY and CLI tools like Plink.
In the CLI tools, password input is read directly from the OS
terminal/console device by console_get_userpass_input; this means that
you need to ensure the same terminal input data _hasn't_ already been
consumed by the main event loop and sent on to the backend. This is
achieved by the backend_sendok() method, which tells the event loop
when the backend has finished issuing password prompts, and hence,
when it's safe to start passing standard input to backend_send().
But in the GUI tools, input generated by the terminal window has
always been sent straight to backend_send(), regardless of whether
backend_sendok() says it wants it. So the terminal-based
implementation of username and password prompts has to work by
consuming input data that had _already_ been passed to the backend -
hence, any backend that needs to do that must keep its input on a
bufchain, and pass that bufchain to seat_get_userpass_input.
It's awkward that these two totally different systems coexist in the
first place. And now that SSH proxying needs to present interactive
prompts of its own, it's clear which one should win: the CLI style is
the Right Thing. So this change reworks the GUI side of the mechanism
to be more similar: terminal data now goes into a queue in the Ldisc,
and is not sent on to the backend until the backend says it's ready
for it via backend_sendok(). So terminal-based userpass prompts can
now consume data directly from that queue during the connection setup
stage.
As a result, the 'bufchain *' parameter has vanished from all the
userpass_input functions (both the official implementations of the
Seat trait method, and term_get_userpass_input() to which some of
those implementations delegate). The only function that actually used
that bufchain, namely term_get_userpass_input(), now instead reads
from the ldisc's input queue via a couple of new Ldisc functions.
(Not _trivial_ functions, since input buffered by Ldisc can be a
mixture of raw bytes and session specials like SS_EOL! The input queue
inside Ldisc is a bufchain containing a fiddly binary encoding that
can represent an arbitrary interleaving of those things.)
This greatly simplifies the calls to seat_get_userpass_input in
backends, which now don't have to mess about with passing their own
user_input bufchain around, or toggling their want_user_input flag
back and forth to request data to put on to that bufchain.
But the flip side is that now there has to be some _other_ method for
notifying the terminal when there's more input to be consumed during
an interactive prompt, and for notifying the backend when prompt input
has finished so that it can proceed to the next stage of the protocol.
This is done by a pair of extra callbacks: when more data is put on to
Ldisc's input queue, it triggers a call to term_get_userpass_input,
and when term_get_userpass_input finishes, it calls a callback
function provided in the prompts_t.
Therefore, any use of a prompts_t which *might* be asynchronous must
fill in the latter callback when setting up the prompts_t. In SSH, the
callback is centralised into a common PPL helper function, which
reinvokes the same PPL's process_queue coroutine; in rlogin we have to
set it up ourselves.
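Setting up an asynchronous-capable prompts_t therefore looks roughly
like this (the field and helper names are assumptions):

    static void ssh_ppl_user_response(void *vctx)
    {
        PacketProtocolLayer *ppl = (PacketProtocolLayer *)vctx;
        ssh_ppl_process_queue(ppl);    /* re-enter the layer's coroutine */
    }

    static prompts_t *ssh_ppl_new_prompts(PacketProtocolLayer *ppl)
    {
        prompts_t *p = new_prompts();
        p->callback = ssh_ppl_user_response;
        p->callback_ctx = ppl;
        return p;
    }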
I'm sorry for this large and sprawling patch: I tried fairly hard to
break it up into individually comprehensible sub-patches, but I just
couldn't tease out any part of it that would stand sensibly alone.
In the case where these socket types are constructed because of a
local proxy command, we do actually have a SockAddr representing the
logical host we were trying to make a connection to. So we might as
well store it in the socket implementation, and then we can include it
in the PLUGLOG_CONNECT_SUCCESS call to make the log message more
informative.
Now the non-SSH backends critically depend on it, it's important not
to forget to send it, for any socket type that's going to be used for
any of those backends. But ProxySocket, and the Unix and Windows
'socket' types wrapping pipes to local subprocesses, were not doing
so.
Some of these socket types don't have a SockAddr available to
represent the destination host. (Sometimes the concept isn't even
meaningful). Therefore, I've also expanded the semantics of
PLUGLOG_CONNECT_SUCCESS so that the addr parameter is allowed to be
NULL, and invented a noncommittal fallback version of the log message
in that situation.
I just happened to notice that just below my huge comment explaining
the two command-line splitting policies, there's a smaller one that
refers to it as '(see large comment below)'. It's not below - it's
above!
That was because the older parts of that comment had previously been
inside split_into_argv(), until I moved the explanation further up the
file to the top level. Another consequence of that was that the older
section of the comment was wrapped to a strangely narrow line width,
because it had previously been indented further right.
Folded the two comments together, and rewrapped the narrow paragraphs.
This is called by the backend to notify the Seat that the connection
has progressed to the point where the main session channel (i.e. the
thing that would typically correspond to the client's stdin/stdout)
has been successfully set up.
The only Seat that implements this method nontrivially is the one in
SshProxy, which uses it as an indication that the proxied connection
to the remote host has succeeded, and sends the
PLUGLOG_CONNECT_SUCCESS notification to its own Plug.
Hence, the only backends that need to implement it at the moment are
the two SSH-shaped backends (SSH proper and bare-connection / psusan).
For other backends, it's not always obvious what 'main session
channel' would even mean, or whether it means anything very useful; so
I've also introduced a backend flag indicating whether the backend is
expecting to call that method at all, so as not to have to spend
pointless effort on defining an arbitrary meaning for it in other
contexts.
So a lot of this patch is just introducing the new method and putting
its trivial do-nothing implementation into all the existing Seat
methods. The interesting parts happen in ssh/mainchan.c (which
actually calls it), and sshproxy.c (which does something useful in
response).
On a similar theme of separating the query operation from the
attempted change, backend_send() now no longer has the side effect of
returning the current size of the send buffer. Instead, you have to
call backend_sendbuffer() every time you want to know that.
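So a typical caller changes along these lines:

    /* previously: */
    bufsize = backend_send(backend, data, len);

    /* now: */
    backend_send(backend, data, len);
    bufsize = backend_sendbuffer(backend);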
This complicates the API in one sense (more separate functions), but
in another sense, simplifies it (each function does something
simpler). When I start putting one Seat in front of another during SSH
proxying, the latter will be more important - in particular, it means
you can find out _whether_ a seat can support changing trust status
without having to actually attempt a destructive modification.
Ian Jackson recently tried to use the recipe in the psusan manpage for
talking to UML, and found that the connection was not successfully set
up, because at some point during startup, UML read the SSH greeting
(ok, the bare-ssh-connection greeting) from its input fd and threw it
away. So by the time psusan was run by the guest init process, the
greeting wasn't there to be read.
Ian's report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991958
I was also able to reproduce this locally, which makes me wonder why I
_didn't_ notice it when I originally wrote that part of the psusan man
page. It worked for me before, honest! But now it doesn't.
Anyway. The ssh verstring module already has a mode switch to decide
whether we ought to send our greeting before or after waiting for the
other side's greeting (because that decision varies between client and
server, and between SSH-1 and SSH-2). So it's easy to implement an
override that forces it to 'wait for the server greeting first'.
I've added this as yet another bug workaround flag. But unlike all the
others, it can't be autodetected from the server's version string,
because, of course, we have to act on it _before_ seeing the server's
greeting and version string! So it's a manual-only flag.
However, I've mentioned it in the UML section of the psusan man page,
since that's the place where I _know_ people are likely to need to use
this flag.
A user reports that if you have MIT KfW loaded, and your PuTTY session
terminates without the PuTTY process exiting, and you select 'Restart
Session' from the menu, then a crash occurs inside the Kerberos
library itself. Scuttlebutt on the Internet suggested this might be to
do with unloading and then reloading the DLL within the process
lifetime, which indeed we were doing.
Now we avoid doing that for the KfW library in particular, by keeping
a tree234 of module handles marked 'never unload this'.
This is a workaround at best, but it seems to stop the problem
happening in my own tests.
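A sketch of the mechanism (the function names are invented for
illustration; the tree234 calls are the real API):

    static int handle_cmp(void *av, void *bv)
    { return av < bv ? -1 : av > bv ? +1 : 0; }

    static tree234 *never_unload_handles = NULL;

    void mark_module_never_unload(HMODULE module)
    {
        if (!never_unload_handles)
            never_unload_handles = newtree234(handle_cmp);
        add234(never_unload_handles, module);
    }

    void safe_free_library(HMODULE module)
    {
        if (never_unload_handles &&
            find234(never_unload_handles, module, NULL))
            return;                    /* e.g. the KfW DLL: leave it loaded */
        FreeLibrary(module);
    }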
If you don't close the thread handles returned by CreateThread, they
are permanently leaked. A user points out that this
is particularly bad in Pageant, with the new named-pipe-based IPC,
since it will spawn an input and output I/O thread per named pipe
connection, leading to two handles being leaked every time.
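One way to avoid the leak is simply to close each thread handle as soon
as the thread is launched, since nothing ever waits on the handle
itself (the thread function name here is illustrative):

    HANDLE hThread = CreateThread(NULL, 0, handle_output_threadfunc,
                                  ctx, 0, NULL);
    if (hThread)
        CloseHandle(hThread);   /* we never need the thread handle again */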
In commit f69cf86a61, I added a call to term_update that happens
when we receive WM_VSCROLL / SB_THUMBPOSITION in the subsidiary
message loop that Windows creates during the handling of WM_SYSCOMMAND
/ SC_VSCROLL. The effect was that interactive dragging of the
scrollbar now redraws the window at every step, whereas previously it
didn't.
A user just pointed out that if you click on one of the scrollbar end
buttons and hold it down until it begins emulating key repeat, the
same bug occurs: the window isn't redrawn until you release the mouse
button and the subsidiary message loop ends.
This commit extends the previous fix to cover all of the WM_VSCROLL
subtypes, instead of just SB_THUMBPOSITION and SB_THUMBTRACK. Redraws
while holding down those scrollbar buttons now work again.
This is used to notify the Seat that some data has been cleared from
the backend's outgoing data buffer. In other words, it notifies the
Seat that it might be worth calling backend_sendbuffer() again.
We've never needed this before, because until now, Seats have always
been the 'main program' part of the application, meaning they were
also in control of the event loop. So they've been able to call
backend_sendbuffer() proactively, every time they go round the event
loop, instead of having to wait for a callback.
But now, the SSH proxy is the first example of a Seat without
privileged access to the event loop, so it has no way to find out that
the backend's sendbuffer has got smaller. And without that, it can't
pass that notification on to plug_sent, to unblock in turn whatever
the proxied connection might have been waiting to send.
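In SshProxy, passing that notification on is a one-liner (the method
name and struct layout are assumed):

    static void sshproxy_sent(Seat *seat, size_t new_bufsize)
    {
        SshProxy *sp = container_of(seat, SshProxy, seat);
        plug_sent(sp->plug, new_bufsize);   /* unblock the proxied connection */
    }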
In fact, before this commit, sshproxy.c never called plug_sent at all.
As a result, large data uploads over an SSH jump host would hang
forever as soon as the outgoing buffer filled up for the first time:
the main backend (to which sshproxy.c was acting as a Socket) would
carefully stop filling up the buffer, and then never receive the call
to plug_sent that would cause it to start again.
The new callback is ignored everywhere except in sshproxy.c. It might
be a good idea to remove backend_sendbuffer() entirely and convert all
previous uses of it into non-empty implementations of this callback,
so that we've only got one system; but for the moment, I haven't done
that.