The Telnet proxy system is not a proper network protocol - we have no
reliable way to receive communication from the proxy telling us
whether a password is even required. However, we _do_ know (a) whether
the keywords '%user' or '%pass' appeared in the format string stored
in the Conf, and (b) whether we actually had a username or a password
to substitute into them. So that's how we know whether to ask for a
username or a password: if the format string asks for them and the
Conf doesn't provide them, we prompt for them at startup.
This involved turning TelnetProxyNegotiator into a coroutine (matching
all the other proxy types, but previously, it was the only one simple
enough not to need to be one), so that it can wait until a response
arrives to that prompt. (And also, as it turned out, so that it can
wait until setup is finished before even presenting the prompt!)
It also involves having format_telnet_command grow an extra output
parameter, in the form of 'unsigned *flags', with which it can
communicate back to the caller that a username or password was wanted
but not found. The other clients of that function (the local proxy
implementations) don't use those flags, but if necessary, they could.
When I wanted to append an ordinary C string to a BinarySink, without
any prefix length field or suffix terminator, I was using the idiom
put_datapl(bs, ptrlen_from_asciz(string));
but I've finally decided that's too cumbersome, and it deserves a
shorter name. put_dataz(bs, string) now does the same thing - in fact
it's a macro expanding to exactly the above.
While I'm at it, I've also added put_datalit(), which is the same
except that it expects a C string literal (and will enforce that at
compile time, via PTRLEN_LITERAL which it calls in turn). You can use
that where possible to avoid the run-time cost of the strlen.
marshal.h now provides a macro put_fmt() which allows you to write
arbitrary printf-formatted data to an arbitrary BinarySink.
We already had this facility for strbufs in particular, in the form of
strbuf_catf(). That was able to take advantage of knowing the inner
structure of a strbuf to minimise memory allocation (it would snprintf
directly into the strbuf's existing buffer if possible). For a general
black-box BinarySink we can't do that, so instead we dupvprintf into a
temporary buffer.
For consistency, I've removed strbuf_catf, and converted all uses of
it into the new put_fmt - and I've also added an extra vtable method
in the BinarySink API, so that put_fmt can still use strbuf_catf's
more efficient memory management when talking to a strbuf, and fall
back to the simpler strategy when that's not available.
(TL;DR: to suppress redundant 'Press Return to begin session' prompts
in between hops of a jump-host configuration, in Plink.)
This new query method directly asks the Seat the question: is the same
stream of input used to provide responses to interactive login
prompts, and the session input provided after login concludes?
It's used to suppress the last-ditch anti-spoofing defence in Plink of
interactively asking 'Access granted. Press Return to begin session',
on the basis that any such spoofing attack works by confusing the user
about what's a legit login prompt before the session begins and what's
sent by the server after the main session begins - so if those two
things take input from different places, the user can't be confused.
This doesn't change the existing behaviour of Plink, which was already
suppressing the antispoof prompt in cases where its standard input was
redirected from something other than a terminal. But previously it was
doing it within the can_set_trust_status() seat query, and I've now
moved it out into a separate query function.
The reason why these need to be separate is for SshProxy, which needs
to give an unusual combination of answers when run inside Plink. For
can_set_trust_status(), it needs to return whatever the parent Seat
returns, so that all the login prompts for a string of proxy
connections in session will be antispoofed the same way. But you only
want that final 'Access granted' prompt to happen _once_, after all
the proxy connection setup phases are done, because up until then
you're still in the safe hands of PuTTY itself presenting an unbroken
sequence of legit login prompts (even if they come from a succession
of different servers). Hence, SshProxy unconditionally returns 'no' to
the query of whether it has a single mixed input stream, because
indeed, it never does - for purposes of session input it behaves like
an always-redirected Plink, no matter what kind of real Seat it ends
up sending its pre-session login prompts to.
Passing an operating-system-specific error code to plug_closing(),
such as errno or GetLastError(), was always a bit weird, given that it
generally had to be handled by cross-platform receiving code in
backends. I had the platform.h implementations #define any error
values that the cross-platform code would have to handle specially,
but that's still not a great system, because it also doesn't leave
freedom to invent error representations of my own that don't
correspond to any OS code. (For example, the ones I just removed from
proxy.h.)
So now, the OS error code is gone from the plug_closing API, and in
its place is a custom enumeration of closure types: normal, error, and
the special case BROKEN_PIPE which is the only OS error code we have
so far needed to handle specially. (All others just mean 'abandon the
connection and print the textual message'.)
Having already centralised the handling of OS error codes in the
previous commit, we've now got a convenient place to add any further
type codes for errors needing special handling: each of Unix
plug_closing_errno(), Windows plug_closing_system_error(), and Windows
plug_closing_winsock_error() can easily grow extra special cases if
need be, and each one will only have to live in one place.
Having a single plug_closing() function covering various kinds of
closure is reasonably convenient from the point of view of Plug
implementations, but it's annoying for callers, who all have to fill
in pointless NULL and 0 parameters in the cases where they're not
used.
Added some inline helper functions in network.h alongside the main
plug_closing() dispatch wrappers, so that each kind of connection
closure can present a separate API for the Socket side of the
interface, without complicating the vtable for the Plug side.
Also, added OS-specific extra helpers in the Unix and Windows
directories, which centralise the job of taking an OS error code (of
whatever kind) and translating it into its error message.
In passing, this removes the horrible ad-hoc made-up error codes in
proxy.h, which is OK, because nothing checked for them anyway, and
also I'm about to do an API change to plug_closing proper that removes
the need for them.
Previously, SSH authentication banners were displayed by calling the
ordinary seat_output function, and passing it a special value in the
SeatOutputType enumeration indicating an auth banner.
The awkwardness of this was already showing a little in SshProxy's
implementation of seat_output, where it had to check for that special
value and do totally different things for SEAT_OUTPUT_AUTH_BANNER and
everything else. Further work in that area is going to make it more
and more awkward if I keep the two output systems unified.
So let's split them up. Now, Seat has separate output() and banner()
methods, which each implementation can override differently if it
wants to.
All the 'end user' Seat implementations use the centralised
implementation function nullseat_banner_to_stderr(), which turns
banner text straight back into SEAT_OUTPUT_STDERR and passes it on to
seat_output. So I didn't have to tediously implement a boring version
of this function in GTK, Windows GUI, consoles, file transfer etc.
There are quite a few of them already, and I'm about to make another
one, so let's start with a bit of tidying up.
The CMake build organisation is unchanged: I haven't put the proxy
object files into a separate library, just moved the locations of the
source files. (Organising proxying as a library would be tricky
anyway, because of the various overrides for tools that want to avoid
cryptography.)
Now every struct that implements the Backend trait is completely
cleared before we start initialising any of its fields. This will mean
I can add new fields that default to 0 or NULL, without having to mess
around initialising them explicitly everywhere.
Previously, checking the host key against the persistent cache managed
by the storage.h API was done as part of the seat_verify_ssh_host_key
method, i.e. separately by each Seat.
Now that check is done by verify_ssh_host_key(), which is a new
function in ssh/common.c that centralises all the parts of host key
checking that don't need an interactive prompt. It subsumes the
previous verify_ssh_manual_host_key() that checked against the Conf,
and it does the check against the storage API that each Seat was
previously doing separately. If it can't confirm or definitively
reject the host key by itself, _then_ it calls out to the Seat, once
an interactive prompt is definitely needed.
The main point of doing this is so that when SshProxy forwards a Seat
call from the proxy SSH connection to the primary Seat, it won't print
an announcement of which connection is involved unless it's actually
going to do something interactive. (Not that we're printing those
announcements _yet_ anyway, but this is a piece of groundwork that
works towards doing so.)
But while I'm at it, I've also taken the opportunity to clean things
up a bit by renaming functions sensibly. Previously we had three very
similarly named functions verify_ssh_manual_host_key(), SeatVtable's
'verify_ssh_host_key' method, and verify_host_key() in storage.h. Now
the Seat method is called 'confirm' rather than 'verify' (since its
job is now always to print an interactive prompt, so it looks more
like the other confirm_foo methods), and the storage.h function is
called check_stored_host_key(), which goes better with store_host_key
and avoids having too many functions with similar names. And the
'manual' function is subsumed into the new centralised code, so
there's now just *one* host key function with 'verify' in the name.
Several functions are reindented in this commit. Best viewed with
whitespace changes ignored.
The current 'displayname' field is designed for presenting in the
config UI, so it starts with a capital letter even when it's not a
proper noun. If I want to name the backend in the middle of a
sentence, I'll need a version that starts with lowercase where
appropriate.
The old field is renamed displayname_tc, to avoid ambiguity.
It was totally unused. No implementation of the 'closing' method in a
Plug vtable was checking it for any reason at all, except for
ProxySocket which captured it from its client in order to pass on to
its server (which, perhaps after further iterations of ProxySocket,
would have ended up ignoring it similarly). And every caller of
plug_closing set it to 0 (aka false), except for the one in sshproxy.c
which passed true (but it would have made no difference to anyone).
The comment in network.h refers to a FIXME comment which was in
try_send() when that code was written (see winnet.c in commit
7b0e082700). That FIXME is long gone, replaced by a use of a
toplevel callback. So I think the aim must have been to avoid
re-entrancy when sk_write called try_send which encountered a socket
error and called back to plug_closing - but that's long since fixed by
other means now.
This is the same as the previous FUNKY_XTERM mode if you don't press
any modifier keys, but now Shift or Ctrl or Alt with function keys
adds an extra bitmap parameter. The bitmaps are the same as the ones
used by the new SHARROW_BITMAP arrow key mode.
As well as affecting the bitmap field in the escape sequence, it was
_also_ having its otherwise standard effect of prefixing Esc to the
whole sequence. It shouldn't do both.
This commit introduces a new config option for how to handle shifted
arrow keys.
In the default mode (SHARROW_APPLICATION), we do what we've always
done: Ctrl flips the arrow keys between sending their most usual
escape sequences (ESC [ A ... ESC [ D) and sending the 'application
cursor keys' sequences (ESC O A ... ESC O D). Whichever of those modes
is currently configured, Ctrl+arrow sends the other one.
In the new mode (SHARROW_BITMAP), application cursor key mode is
unaffected by any shift keys, but the default sequences acquire two
numeric arguments. The first argument is 1 (reflecting the fact that a
shifted arrow key still notionally moves just 1 character cell); the
second is the bitmap (1 for Shift) + (2 for Alt) + (4 for Ctrl),
offset by 1. (Except that if _none_ of those modifiers is pressed,
both numeric arguments are simply omitted.)
The new bitmap mode is what current xterm generates, and also what
Windows ConPTY seems to expect. If you start an ordinary Command
Prompt and launch into WSL, those are the sequences it will generate
for shifted arrow keys; conversely, if you run a Command Prompt within
a ConPTY, then these sequences for Ctrl+arrow will have the effect you
expect in cmd.exe command-line editing (going backward or forward a
word). For that reason, I enable this mode unconditionally when
launching Windows pterm.
While fixing the previous commit I noticed that window titles don't
actually _work_ properly if you change the terminal character set,
because the text accumulated in the OSC string buffer is sent to the
TermWin as raw bytes, with no indication of what character set it
should interpret them as. You might get lucky if you happened to
choose the right charset (in particular, UTF-8 is a common default),
but if you change the charset half way through a run, then there's
certainly no way the frontend will know to interpret two window titles
sent before and after the change in two different charsets.
So, now win_set_title() and win_set_icon_title() both include a
codepage parameter along with the byte string, and it's up to them to
translate the provided window title from that encoding to whatever the
local window system expects to receive.
On Windows, that's wide-string Unicode, so we can just use the
existing dup_mb_to_wc utility function. But in GTK, it's UTF-8, so I
had to write an extra utility function to encode a wide string as
UTF-8.
In the Windows Pageant message loop, we were alternating between
MsgWaitForMultipleObjects with timeout=INFINITE, and
run_toplevel_callbacks. But run_toplevel_callbacks doesn't loop until
the callback queue is empty: it just runs at least one of the
currently queued callbacks, so that some progress was made. So if two
or more callbacks were queued, we'd leave the rest in the queue and go
back into MsgWaitForMultipleObjects, which could hang indefinitely.
(A very silly workaround was available: move the mouse over the
Pageant systray icon! Every mouse event would terminate the wait and
let you get one more iteration of run_toplevel_callbacks.)
Now we do the same thing as in other main loops: if any further
callbacks are pending, then we still run MsgWaitForMultipleObjects,
but we do it with timeout=0 instead of timeout=INFINITE, so that we
won't go back to a _blocking_ sleep until all callbacks have been
serviced.
When a writable HANDLE is managed by the handle-io.c system, you ask
to send EOF on the handle by calling handle_write_eof. That waits
until all buffered data has been written, and then sends an EOF event
by simply closing the handle.
That is, of course, the only way to send an EOF signal on a handle at
all. And yet, it's a bug, because the handle_output system does not
take ownership of the handle you give it: the client of handle_output
retains ownership, keeps its own copy of the handle, and will expect
to close it itself.
In most cases, the extra close will harmlessly fail, and return
ERROR_INVALID_HANDLE (which the caller didn't notice anyway). But if
you're unlucky, in conditions of frantic handle opening and closing
(e.g. with a lot of separate named-pipe-style agent forwarding
connections being constantly set up and torn down), the handle value
might have been reused between the two closes, so that the second
CloseHandle closes an unrelated handle belonging to some other part of
the program.
We can't fix this by giving handle_output permanent ownership of the
handle, because it really _is_ necessary for copies of it to survive
elsewhere: in particular, for a bidirectional file such as a serial
port or named pipe, the reading side also needs a copy of the same
handle! And yet, we can't replace the handle_write_eof call in the
client with a direct CloseHandle, because that won't wait until
buffered output has been drained.
The solution is that the client still calls handle_write_eof to
register that it _wants_ an EOF sent; the handle_output system will
wait until it's ready, but then, instead of calling CloseHandle, it
will ask its _client_ to close the handle, by calling the provided
'sentdata' callback with the new 'close' flag set to true. And then
the client can not only close the handle, but do whatever else it
needs to do to record that that has been done.
Trying to debug a problem involving threads just now, it turned out
that the version of the diagnostics going to my debug.log was getting
data in a different order from the version going to the debug console.
Now I open and write to debug_fp by going directly to the Win32 API
instead of via a buffering userland stdio, and that seems to have
solved the problem.
Similarly to cmdgen's passphrase options, this replaces the password
on the command line with a filename to read the password out of, which
means it can't show up in 'ps' or the Windows task manager.
The jump host system ought really to be treating SSH authentication
banners as a distinct thing from the standard-error session output, so
that the former can be presented to the user in the same way as the
auth banner for the main session.
This change converts the 'bool is_stderr' parameter of seat_output()
into an enumerated type with three values. For the moment, stderr and
banners are treated the same, but the plan is for that to change.
Now that it's possible for a single invocation of PuTTY to connect to
multiple SSH servers (jump host followed by ultimate destination
host), it's rather unhelpful for host key prompts to just say "the
server". To check an unknown host key, users will need to know _which_
host it's purporting to be the key for.
Another possibility is to put a message in the terminal window
indicating which server we're currently in the SSH setup phase for.
That will certainly be what we have to end up doing for userpass
prompts that appear _in_ the terminal window. But that by itself is
still unhelpful for host key prompts in a separate dialog, because the
user would have to check both windows to get all the information they
need. Easier if the host key dialog itself tells you everything you
need to know to answer the question: is _this_ key the one you expect
for _that_ host?
The format _strings_ were previously centralised into the platform-
independent console.c, as const char arrays. Now the actual formatting
operation is centralised as well, by means of console.c providing a
function that takes all the necessary parameters and returns a
formatted piece of text for the console.
Mostly this is so that I can add extra parameters to the message with
some confidence: changing a format string in one file and two fprintf
statements in other files to match seems like the kind of situation
you wish you hadn't got into in the first place :-)
The system for handling seat_get_userpass_input has always been
structured differently between GUI PuTTY and CLI tools like Plink.
In the CLI tools, password input is read directly from the OS
terminal/console device by console_get_userpass_input; this means that
you need to ensure the same terminal input data _hasn't_ already been
consumed by the main event loop and sent on to the backend. This is
achieved by the backend_sendok() method, which tells the event loop
when the backend has finished issuing password prompts, and hence,
when it's safe to start passing standard input to backend_send().
But in the GUI tools, input generated by the terminal window has
always been sent straight to backend_send(), regardless of whether
backend_sendok() says it wants it. So the terminal-based
implementation of username and password prompts has to work by
consuming input data that had _already_ been passed to the backend -
hence, any backend that needs to do that must keep its input on a
bufchain, and pass that bufchain to seat_get_userpass_input.
It's awkward that these two totally different systems coexist in the
first place. And now that SSH proxying needs to present interactive
prompts of its own, it's clear which one should win: the CLI style is
the Right Thing. So this change reworks the GUI side of the mechanism
to be more similar: terminal data now goes into a queue in the Ldisc,
and is not sent on to the backend until the backend says it's ready
for it via backend_sendok(). So terminal-based userpass prompts can
now consume data directly from that queue during the connection setup
stage.
As a result, the 'bufchain *' parameter has vanished from all the
userpass_input functions (both the official implementations of the
Seat trait method, and term_get_userpass_input() to which some of
those implementations delegate). The only function that actually used
that bufchain, namely term_get_userpass_input(), now instead reads
from the ldisc's input queue via a couple of new Ldisc functions.
(Not _trivial_ functions, since input buffered by Ldisc can be a
mixture of raw bytes and session specials like SS_EOL! The input queue
inside Ldisc is a bufchain containing a fiddly binary encoding that
can represent an arbitrary interleaving of those things.)
This greatly simplifies the calls to seat_get_userpass_input in
backends, which now don't have to mess about with passing their own
user_input bufchain around, or toggling their want_user_input flag
back and forth to request data to put on to that bufchain.
But the flip side is that now there has to be some _other_ method for
notifying the terminal when there's more input to be consumed during
an interactive prompt, and for notifying the backend when prompt input
has finished so that it can proceed to the next stage of the protocol.
This is done by a pair of extra callbacks: when more data is put on to
Ldisc's input queue, it triggers a call to term_get_userpass_input,
and when term_get_userpass_input finishes, it calls a callback
function provided in the prompts_t.
Therefore, any use of a prompts_t which *might* be asynchronous must
fill in the latter callback when setting up the prompts_t. In SSH, the
callback is centralised into a common PPL helper function, which
reinvokes the same PPL's process_queue coroutine; in rlogin we have to
set it up ourselves.
I'm sorry for this large and sprawling patch: I tried fairly hard to
break it up into individually comprehensible sub-patches, but I just
couldn't tease out any part of it that would stand sensibly alone.
In the case where these socket types are constructed because of a
local proxy command, we do actually have a SockAddr representing the
logical host we were trying to make a connection to. So we might as
well store it in the socket implementation, and then we can include it
in the PLUGLOG_CONNECT_SUCCESS call to make the log message more
informative.
Now the non-SSH backends critically depend on it, it's important not
to forget to send it, for any socket type that's going to be used for
any of those backends. But ProxySocket, and the Unix and Windows
'socket' types wrapping pipes to local subprocesses, were not doing
so.
Some of these socket types don't have a SockAddr available to
represent the destination host. (Sometimes the concept isn't even
meaningful). Therefore, I've also expanded the semantics of
PLUGLOG_CONNECT_SUCCESS so that the addr parameter is allowed to be
NULL, and invented a noncommittal fallback version of the log message
in that situation.
I just happened to notice that just below my huge comment explaining
the two command-line splitting policies, there's a smaller one that
refers to it as '(see large comment below)'. It's not below - it's
above!
That was because the older parts of that comment had previously been
inside split_into_argv(), until I moved the explanation further up the
file to the top level. Another consequence of that was that the older
section of the comment was wrapped to a strangely narrow line width,
because it had previously been indented further right.
Folded the two comments together, and rewrapped the narrow paragraphs.
This is called by the backend to notify the Seat that the connection
has progressed to the point where the main session channel (i.e. the
thing that would typically correspond to the client's stdin/stdout)
has been successfully set up.
The only Seat that implements this method nontrivially is the one in
SshProxy, which uses it as an indication that the proxied connection
to the remote host has succeeded, and sends the
PLUGLOG_CONNECT_SUCCESS notification to its own Plug.
Hence, the only backends that need to implement it at the moment are
the two SSH-shaped backends (SSH proper and bare-connection / psusan).
For other backends, it's not always obvious what 'main session
channel' would even mean, or whether it means anything very useful; so
I've also introduced a backend flag indicating whether the backend is
expecting to call that method at all, so as not to have to spend
pointless effort on defining an arbitrary meaning for it in other
contexts.
So a lot of this patch is just introducing the new method and putting
its trivial do-nothing implementation into all the existing Seat
methods. The interesting parts happen in ssh/mainchan.c (which
actually calls it), and sshproxy.c (which does something useful in
response).
On a similar theme of separating the query operation from the
attempted change, backend_send() now no longer has the side effect of
returning the current size of the send buffer. Instead, you have to
call backend_sendbuffer() every time you want to know that.
This complicates the API in one sense (more separate functions), but
in another sense, simplifies it (each function does something
simpler). When I start putting one Seat in front of another during SSH
proxying, the latter will be more important - in particular, it means
you can find out _whether_ a seat can support changing trust status
without having to actually attempt a destructive modification.
Ian Jackson recently tried to use the recipe in the psusan manpage for
talking to UML, and found that the connection was not successfully set
up, because at some point during startup, UML read the SSH greeting
(ok, the bare-ssh-connection greeting) from its input fd and threw it
away. So by the time psusan was run by the guest init process, the
greeting wasn't there to be read.
Ian's report: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=991958
I was also able to reproduce this locally, which makes me wonder why I
_didn't_ notice it when I originally wrote that part of the psusan man
page. It worked for me before, honest! But now it doesn't.
Anyway. The ssh verstring module already has a mode switch to decide
whether we ought to send our greeting before or after waiting for the
other side's greeting (because that decision varies between client and
server, and between SSH-1 and SSH-2). So it's easy to implement an
override that forces it to 'wait for the server greeting first'.
I've added this as yet another bug workaround flag. But unlike all the
others, it can't be autodetected from the server's version string,
because, of course, we have to act on it _before_ seeing the server's
greeting and version string! So it's a manual-only flag.
However, I've mentioned it in the UML section of the psusan man page,
since that's the place where I _know_ people are likely to need to use
this flag.
A user reports that if you have MIT KfW loaded, and your PuTTY session
terminates without the PuTTY process exiting, and you select 'Restart
Session' from the menu, then a crash occurs inside the Kerberos
library itself. Scuttlebutt on the Internet suggested this might be to
do with unloading and then reloading the DLL within the process
lifetime, which indeed we were doing.
Now we avoid doing that for the KfW library in particular, by keeping
a tree234 of module handles marked 'never unload this'.
This is a workaround at best, but it seems to stop the problem
happening in my own tests.
If you don't, they are permanently leaked. A user points out that this
is particularly bad in Pageant, with the new named-pipe-based IPC,
since it will spawn an input and output I/O thread per named pipe
connection, leading to two handles being leaked every time.
In commit f69cf86a61, I added a call to term_update that happens
when we receive WM_VSCROLL / SB_THUMBPOSITION in the subsidiary
message loop that Windows creates during the handling of WM_SYSCOMMAND
/ SC_VSCROLL. The effect was that interactive dragging of the
scrollbar now redraws the window at every step, whereas previously it
didn't.
A user just pointed out that if you click on one of the scrollbar end
buttons and hold it down until it begins emulating key repeat, the
same bug occurs: the window isn't redrawn until you release the mouse
button and the subsidiary message loop ends.
This commit extends the previous fix to cover all of the WM_VSCROLL
subtypes, instead of just SB_THUMBPOSITION and SB_THUMBTRACK. Redraws
while holding down those scrollbar buttons now work again.
This is used to notify the Seat that some data has been cleared from
the backend's outgoing data buffer. In other words, it notifies the
Seat that it might be worth calling backend_sendbuffer() again.
We've never needed this before, because until now, Seats have always
been the 'main program' part of the application, meaning they were
also in control of the event loop. So they've been able to call
backend_sendbuffer() proactively, every time they go round the event
loop, instead of having to wait for a callback.
But now, the SSH proxy is the first example of a Seat without
privileged access to the event loop, so it has no way to find out that
the backend's sendbuffer has got smaller. And without that, it can't
pass that notification on to plug_sent, to unblock in turn whatever
the proxied connection might have been waiting to send.
In fact, before this commit, sshproxy.c never called plug_sent at all.
As a result, large data uploads over an SSH jump host would hang
forever as soon as the outgoing buffer filled up for the first time:
the main backend (to which sshproxy.c was acting as a Socket) would
carefully stop filling up the buffer, and then never receive the call
to plug_sent that would cause it to start again.
The new callback is ignored everywhere except in sshproxy.c. It might
be a good idea to remove backend_sendbuffer() entirely and convert all
previous uses of it into non-empty implementations of this callback,
so that we've only got one system; but for the moment, I haven't done
that.
Suggested by Manfred Kaiser, who also wrote most of this patch
(although outlying parts, like documentation and SSH-1 support, are by
me).
This is a second line of defence against the kind of spoofing attacks
in which a malicious or compromised SSH server rushes the client
through the userauth phase of SSH without actually requiring any auth
inputs (passwords or signatures or whatever), and then at the start of
the connection phase it presents something like a spoof prompt,
intended to be taken for part of userauth by the user but in fact with
some more sinister purpose.
Our existing line of defence against this is the trust sigil system,
and as far as I know, that's still working. This option allows a bit of
extra defence in depth: if you don't expect your SSH server to
trivially accept authentication in the first place, then enabling this
option will cause PuTTY to disconnect if it unexpectedly does so,
without the user having to spot the presence or absence of a fiddly
little sigil anywhere.
Several types of authentication count as 'trivial'. The obvious one is
the SSH-2 "none" method, which clients always try first so that the
failure message will tell them what else they can try, and which a
server can instead accept in order to authenticate you unconditionally.
But there are two other ways to do it that we know of: one is to run
keyboard-interactive authentication and send an empty INFO_REQUEST
packet containing no actual prompts for the user, and another even
weirder one is to send USERAUTH_SUCCESS in response to the user's
preliminary *offer* of a public key (instead of sending the usual PK_OK
to request an actual signature from the key).
This new option detects all of those, by clearing the 'is_trivial_auth'
flag only when we send some kind of substantive authentication response
(be it a password, a k-i prompt response, a signature, or a GSSAPI
token). So even if there's a further path through the userauth maze we
haven't spotted, that somehow avoids sending anything substantive, this
strategy should still pick it up.
Enthusiastic copy-paste: in commit 17c57e1078 I added the same
precautionary call to ensure_handlewaits_tree_exists() everywhere,
even in functions that didn't actually need to use the tree.
Before commit 6e69223dc2, Pageant would stop working after a
certain number of PuTTYs were active at the same time. (At most about
60, but maybe fewer - see below.)
This was because of two separate bugs. The easy one, fixed in
6e69223dc2 itself, was that PuTTY left each named-pipe connection
to Pageant open for the rest of its lifetime. So the real problem was
that Pageant had too many active connections at once. (And since a
given PuTTY might make multiple connections during userauth - one to
list keys, and maybe another to actually make a signature - that was
why the number of _PuTTYs_ might vary.)
It was clearly a bug that PuTTY was leaving connections to Pageant
needlessly open. But it was _also_ a bug that Pageant couldn't handle
more than about 60 at once. In this commit, I fix that secondary bug.
The cause of the bug is that the WaitForMultipleObjects function
family in the Windows API have a limit on the number of HANDLE objects
they can select between. The limit is MAXIMUM_WAIT_OBJECTS, defined to
be 64. And handle-io.c was using a separate event object for each I/O
subthread to communicate back to the main thread, so as soon as all
those event objects (plus a handful of other HANDLEs) added up to more
than 64, we'd start passing an overlarge handle array to
WaitForMultipleObjects, and it would start not doing what we wanted.
To fix this, I've reorganised handle-io.c so that all its subthreads
share just _one_ event object to signal readiness back to the main
thread. There's now a linked list of 'struct handle' objects that are
ready to be processed, protected by a CRITICAL_SECTION. Each subthread
signals readiness by adding itself to the linked list, and setting the
event object to indicate that the list is now non-empty. When the main
thread receives the event, it iterates over the whole list processing
all the ready handles.
(Each 'struct handle' still has a separate event object for the main
thread to use to communicate _to_ the subthread. That's OK, because no
thread is ever waiting on all those events at once: each subthread
only waits on its own.)
The previous HT_FOREIGN system didn't really fit into this framework.
So I've moved it out into its own system. There's now a handle-wait.c
which deals with the relatively simple job of managing a list of
handles that need to be waited for, each with a callback function;
that's what communicates a list of HANDLEs to event loops, and
receives the notification when the event loop notices that one of them
has done something. And handle-io.c is now just one client of
handle-wait.c, providing a single HANDLE to the event loop, and
dealing internally with everything that needs to be done when that
handle fires.
The new top-level handle-wait.c system *still* can't deal with more
than MAXIMUM_WAIT_OBJECTS. At the moment, I'm reasonably convinced it
doesn't need to: the only kind of HANDLE that any of our tools could
previously have needed to wait on more than one of was the one in
handle-io.c that I've just removed. But I've left some assertions and
a TODO comment in there just in case we need to change that in future.
This introduces a new entry to the radio-button list of proxy types,
in which the 'Proxy host' box is taken to be the name of an SSH server
or saved session. We make an entire subsidiary SSH connection to that
host, open a direct-tcpip channel through it, and use that as the
connection over which to run the primary network connection.
The result is basically the same as if you used a local proxy
subprocess, with a command along the lines of 'plink -batch %proxyhost
-nc %host:%port'. But it's all done in-process, by having an SshProxy
object implement the Socket trait to talk to the main connection, and
implement Seat and LogPolicy to talk to its subsidiary SSH backend.
All the refactoring in recent years has got us to the point where we
can do that without both SSH instances fighting over some global
variable or unique piece of infrastructure.
From an end user perspective, doing SSH proxying in-process like this
is a little bit easier to set up: it doesn't require you to bake the
full pathname of Plink into your saved session (or to have it on the
system PATH), and the SshProxy setup function automatically turns off
SSH features that would be inappropriate in this context, such as
additional port forwardings, or acting as a connection-sharing
upstream. And it has minor advantages like getting the Event Log for
the subsidiary connection interleaved in the main Event Log, as if it
were stderr output from a proxy subcommand, without having to
deliberately configure the subsidiary Plink into verbose mode.
However, this is an initial implementation only, and it doesn't yet
support the _big_ payoff for doing this in-process, which (I hope)
will be the ability to handle interactive prompts from the subsidiary
SSH connection via the same user interface as the primary one. For
example, you might need to answer two password prompts in succession,
or (the first time you use a session configured this way) confirm the
host keys for both proxy and destination SSH servers. Comments in the
new source file discuss some design thoughts on filling in this gap.
For the moment, if the proxy SSH connection encounters any situation
where an interactive prompt is needed, it will make the safe
assumption, the same way 'plink -batch' would do. So it's at least no
_worse_ than the existing technique of putting the proxy connection in
a subprocess.
This notifies the Seat that the entire backend session has finished
and closed its network connection - or rather, that it _might_ have
done, and that the frontend should check backend_connected() if it
wasn't planning to do so already.
The existing Seat implementations haven't needed this: the GUI ones
don't actually need to do anything specific when the network
connection goes away, and the CLI ones deal with it by being in charge
of their own event loop so that they can easily check
backend_connected() at every possible opportunity in any case. But I'm
about to introduce a new Seat implementation that does need to know
this, and doesn't have any other way to get notified of it.
Since ca9cd983e1, changing colour config mid-session had no effect
(until the palette was reset for some other reason). Now it does take
effect immediately (provided that the palette has not been overridden by
escape sequence -- this is new with ca9cd983e1).
This changes the semantics of palette_reset(): the only important
parameter when doing that is whether we keep escape sequence overrides
-- there's no harm in re-fetching config and platform colours whether or
not they've changed -- so that's what the parameter becomes (with a
sense that doesn't require changing the call sites). The other part of
this change is actually remembering to trigger this when the
configuration is changed.
I was cleaning up the 'struct handle', but not the underlying HANDLE.
As a result, any PuTTY process that makes a request to Pageant keeps
the named pipe connection open until the end of the process's
lifetime.
I was checking a HANDLE against INVALID_HANDLE_VALUE to decide whether
it should be closed. But ten lines further up, I was setting it
manually to NULL to suppress the close. Oops.
Less than 12 hours after 0.75 went out of the door, a user pointed out
that enabling the 'Use system colours' config option causes an
immediate NULL-dereference crash. The reason is because a chain of
calls from term_init() ends up calling back to the Windows
implementation of the palette_get_overrides() method, which responds
by trying to call functions on the static variable 'term' in window.c,
which won't be initialised until term_init() has returned.
Simple fix: palette_get_overrides() is now given a pointer to the
Terminal that it should be updating, because it can't find it out any
other way.
This fulfills our long-standing Mayhem-difficulty wishlist item
'win-command-prompt': this is a Windows pterm in the sense that when
you run it you get a local cmd.exe running inside a PuTTY-style window.
Advantages of this: you get the same free choice of fonts as PuTTY has
(no restriction to a strange subset of the system's available fonts);
you get the same copy-paste gestures as PuTTY (no mental gear-shifting
when you have command prompts and SSH sessions open on the same
desktop); you get scrollback with the PuTTY semantics (scrolling to
the bottom gets you to where the action is, as opposed to the way you
could accidentally find yourself 500 lines past the end of the action
in a real console).
'win-command-prompt' was at Mayhem difficulty ('Probably impossible')
basically on the grounds that with Windows's old APIs for accessing
the contents of consoles, there was no way I could find to get this to
work sensibly. What was needed to make it feasible was a major piece
of re-engineering work inside Windows itself.
But, of course, that's exactly what happened! In 2019, the new ConPTY
API arrived, which lets you create an object that behaves like a
Windows console at one end, and round the back, emits a stream of
VT-style escape sequences as the screen contents evolve, and accepts a
VT-style input stream in return which it will parse function and arrow
keys out of in the usual way.
So now it's actually _easy_ to get this to basically work. The new
backend, in conpty.c, has to do a handful of magic Windows API calls
to set up the pseudo-console and its feeder pipes and start a
subprocess running in it, a further magic call every time the PuTTY
window is resized, and detect the end of the session by watching for
the subprocess terminating. But apart from that, all it has to do is
pass data back and forth unmodified between those pipes and the
backend's associated Seat!
That said, this is new and experimental, and there will undoubtedly be
issues. One that I already know about is that you can't copy and paste
a word that has wrapped between lines without getting an annoying
newline in the middle of it. As far as I can see this is a fundamental
limitation: the ConPTY system sends the _same_ escape sequence stream
for a line that wrapped as it would send for a line that had a logical
\n at what would have been the wrap point. Probably the best we can do
to mitigate this is to adopt a different heuristic for newline elision
that's right more often than it's wrong.
For the moment, that experimental-ness is indicated by the fact that
Buildscr will build, sign and deliver a copy of pterm.exe for each
flavour of Windows, but won't include it in the .zip file or in the
installer. (In fact, that puts it in exactly the same ad-hoc category
as PuTTYtel, although for completely different reasons.)
icons/Makefile will now rebuild them, but also, as per this code
base's usual policy with Windows icons, they're committed directly in
the windows subdir.
Now they're done by putty.rc and puttytel.rc, before including
putty-common.rc2. So another user of putty-common.rc2 can disagree on
what icons to use.
This prepares the ground for a second essentially similarly-shaped
program reusing most of window.c but handling its command line and
startup differently. A couple of large parts of WinMain() to do with
backend selection and command-line handling are now subfunctions in a
separate file putty.c.
Also, our custom AppUserModelId is defined in that file, so that it
can vary with the client application.
The code to find out the location of the c:\windows\system32 directory
was already present, in load_system32_dll(). Now it's moved out into a
function of its own, so it can be called in other contexts.
Jacob spots that on Windows, current PuTTY is not compatible with
0.74, if one of them acts as a connection sharing upstream and the
other as a downstream. That's because commit 1344d4d1cd
accidentally changed the hash preimage in capi_obfuscate_string() so
that it no longer had an SSH-like string length field at the front. So
the two versions of PuTTY will expect the named pipe to have a
different pathname, and so they won't be able to find each other.
Interoperation between PuTTY versions is not the most important use
case of connection sharing - surely the typical user will invoke it by
activating the same session twice, or by using Duplicate Session. But
it was never intended to deliberately _not_ work, so let's fix it
before 0.75 goes out, so that at least the incompatible behaviour will
only ever have appeared in development snapshots.
Now that the main source file of Plink in each platform directory has
the same name, we can put centralise the main definition of the
program in the main CMakeLists.txt, and in the platform directory,
just add the few extra modules needed to clear up platform-specific
details.
The same goes for psocks. And PSCP and PSFTP could have been moved to
the top level already - I just hadn't done it in the initial setup.
This gets rid of all those annoying 'win', 'ux' and 'gtk' prefixes
which made filenames annoying to type and to tab-complete. Also, as
with my other recent renaming sprees, I've taken the opportunity to
expand and clarify some of the names so that they're not such cryptic
abbreviations.
GetFileType() takes a HANDLE, not a pathname. So passing it the
pathname of the agent named pipe would never have worked at all.
I hadn't noticed, because the only call to that function logical-ORs
its return value with that of wm_copydata_agent_exists(), and the
latter _does_ work.
So if you're running true Pageant, which presents both IPC interfaces,
then there's no problem. But if a Pageant-emulating system wanted to
present only the named-pipe version, then we wouldn't have detected
it. Now we should do.
I just discovered that they weren't happening, and the reason why is
thoroughly annoying. Details are in the long comment I've added to the
WM_VSCROLL handler in WndProc, but the short version is that when you
interactively drag the terminal window's scrollbar, a subsidiary
message loop is launched by DefWndProc, causing all our timer events
to go missing until the user lets go of the scrollbar again. So we
have to manually update the terminal window on scroll events, because
the normal system is out of action.
I assume this changed behaviour round about the big rework of terminal
updating in February. Good job I spotted it just _before_ 0.75, and
not just after!
As we do in other similar situations. (The resulting passphrase dialog
is annoyingly unsymmetric, but probably less annoying than a Help
button which does nothing, and the situation shouldn't arise with our
standard builds.)
Suggested by Jacob: if this dialog box is going to pop up
_unexpectedly_ - perhaps when people have momentarily forgotten
they're even running Pageant, or at least forgotten they added a key
encrypted,, or maybe haven't found out yet that their IT installed it
- then it could usefully come with a help button that pops up further
explanation of what the dialog box means, and from which you can find
your way to the rest of the help.
I continue to believe that there's nothing I can (or should) do about
the fact that on Windows, Pageant's async passphrase prompt dialog box
doesn't automatically get the input focus when it pops up in response
to a request received via invisible IPC.
However, one thing I can do is add some text to the box that _warns_
people about it, so that at least there's some kind of suggestion that
you should get into the habit of clicking on the passphrase prompt
before typing your passphrase into it.
(I would be less concerned about all of this if it weren't for the
fact that focus is surprisingly non-obvious on Windows 10, at least on
the machine I have here. When the window doesn't have focus, the title
bar has the same background colour, and only the text is fainter. And
perhaps more confusingly, the cursor in the edit box still flashes!
That fooled _me_ a few times to begin with.)
This code base has always been a bit confused about which spelling it
likes to use to refer to that signature algorithm. The SSH protocol id
is "ssh-dss". But everyone I know refers to it as the Digital
Signature _Algorithm_, not the Digital Signature _Standard_.
When I moved everything down into the crypto subdir, I took the
opportunity to rename sshdss.c to dsa.c. Now I'm doing the rest of the
job: all internal identifiers and code comments refer to DSA, and the
spelling "dss" only survives in externally visible identifiers that
have to remain constant.
(Such identifiers include the SSH protocol id, and also the string id
used to identify the key type in PuTTY's own host key cache. We can't
change the latter without causing everyone a backwards-compatibility
headache, and if we _did_ ever decide to do that, we'd surely want to
do a much more thorough job of making the cache format more sensible!)
This clears up another large pile of clutter at the top level, and in
the process, allows me to rename source files to things that don't all
have that annoying 'ssh' prefix at the top.
This applies to all of AES, SHA-1, SHA-256 and SHA-512. All those
source files previously contained multiple implementations of the
algorithm, enabled or disabled by ifdefs detecting whether they would
work on a given compiler. And in order to get advanced machine
instructions like AES-NI or NEON crypto into the output file when the
compile flags hadn't enabled them, we had to do nasty stuff with
compiler-specific pragmas or attributes.
Now we can do the detection at cmake time, and enable advanced
instructions in the more sensible way, by compile-time flags. So I've
broken up each of these modules into lots of sub-pieces: a file called
(e.g.) 'foo-common.c' containing common definitions across all
implementations (such as round constants), one called 'foo-select.c'
containing the top-level vtable(s), and a separate file for each
implementation exporting just the vtable(s) for that implementation.
One advantage of this is that it depends a lot less on compiler-
specific bodgery. My particular least favourite part of the previous
setup was the part where I had to _manually_ define some Arm ACLE
feature macros before including <arm_neon.h>, so that it would define
the intrinsics I wanted. Now I'm enabling interesting architecture
features in the normal way, on the compiler command line, there's no
need for that kind of trick: the right feature macros are already
defined and <arm_neon.h> does the right thing.
Another change in this reorganisation is that I've stopped assuming
there's just one hardware implementation per platform. Previously, the
accelerated vtables were called things like sha256_hw, and varied
between FOO-NI and NEON depending on platform; and the selection code
would simply ask 'is hw available? if so, use hw, else sw'. Now, each
HW acceleration strategy names its vtable its own way, and the
selection vtable has a whole list of possibilities to iterate over
looking for a supported one. So if someone feels like writing a second
accelerated implementation of something for a given platform - for
example, I've heard you can use plain NEON to speed up AES somewhat
even without the crypto extension - then it will now have somewhere to
drop in alongside the existing ones.
The old behaviour is still present under an ifdef based on _MSC_VER,
so it should still appear in the w32old builds we're still making.
(cherry picked from commit 49b91bc128)
The definition of HAVE_CMAKE_H is now at the very top of the main
CMakeLists.txt, so that it applies to all objects. And the consequent
include of cmake.h is at the very top of defs.h, so that it should be
included first by everything. This way, I don't have to worry any more
that the HAVE_FOO definitions in cmake.h might accidentally have
failed to reach some part of the code.
add_platform_sources_to_library() is now called
add_sources_from_current_dir(), so that it will make sense when I use
it in subdirectories that aren't for a particular platform.
PuTTYgen and its documentation are pretty consistent about calling their
encryption key a 'passphrase', as opposed to a 'password' supplied
directly to a server; but the Argon2 parameters UI reverted to
'password hash', which seemed unecessarily confusing.
I think it's better to use the term 'passphrase' consistently in the UI.
(People who are used to Argon2 being called a 'password hash' can
probably deal.)
This required tweaking the coordinates of the Windows PuTTYgen UI.
I've finally got round to updating this system for the fixed
(post-VS7) command-line splitting. That means I need to regenerate the
table in the big comment. So here's an automated method of doing it
that doesn't require me to read off the output of -generate in an
error-prone manual way.
Something weird was happening in the string handling which caused the
output to be full of the kind of gibberish you expect to see from
unterminated strings. Rather than debug it in detail, I've taken
advantage of now having the utils library conveniently available, and
simply used a strbuf, which I _know_ works sensibly.
It was there because of a limitation of mkfiles.pl, which had a single
list of include directories that it used on all platforms. CMake does
not. So now there's an easier and more sensible way to have a
different header file included on Windows and Unix: call it the same
name in the two subdirectories, and rely on CMake having put the right
one of those subdirs on the include path.
I found these while going through the code, and decided if we're going
to have them then we should compile them. They didn't all compile
first time, proving my point :-)
I've enhanced the tree234 test so that it has a verbose option, which
by default is off.
This new implementation uses the same optimisation-barrier technique
that I used in various places in testsc: have a no-op function, and a
volatile function pointer pointing at it, and then call through the
function pointer, so that nothing actually happens (apart from the
physical call and return) but the compiler has to assume that
_anything_ might have happened.
Doing this just after a memset enforces that the compiler can't have
thrown away the memset, because the called function might (for
example) check that all the memory really is zero and abort if not.
I've been turning this over in my mind ever since coming up with the
technique for testsc. I think it's far more robust than the previous
smemclr technique: so much so that I'm switching to using it
_everywhere_, and no longer using platform alternatives like Windows's
SecureZeroMemory().
This is the start of the payoff for all that reorganisation (and
perhaps also from having moved to a library-based build structure in
the first place): a collection of pointless stub functions in outlying
programs, which were only there to prevent link failures, now no
longer need to be there even for that purpose.
This is a module that I'd noticed in the past was too monolithic.
There's a big pile of stub functions in uxpgnt.c that only have to be
there because the implementation of true X11 _forwarding_ (i.e.
actually managing a channel within an SSH connection), which Pageant
doesn't need, was in the same module as more general X11-related
utility functions which Pageant does need.
So I've broken up this awkward monolith. Now x11fwd.c contains only
the code that really does all go together for dealing with SSH X
forwarding: the management of an X forwarding channel (including the
vtables to make it behave as Channel at the SSH end and a Plug at the
end that connects to the local X server), and the management of
authorisation for those channels, including maintaining a tree234 of
possible auth values and verifying the one we received.
Most of the functions removed from this file have moved into the utils
subdir, and also into the utils library (i.e. further down the link
order), because they were basically just string and data processing.
One exception is x11_setup_display, which parses a display string and
returns a struct telling you everything about how to connect to it.
That talks to the networking code (it does name lookups and makes a
SockAddr), so it has to live in the network library rather than utils,
and therefore it's not in the utils subdirectory either.
The other exception is x11_get_screen_number, which it turned out
nothing called at all! Apparently the job it used to do is now done as
part of x11_setup_display. So I've just removed it completely.
Now that the new CMake build system is encouraging us to lay out the
code like a set of libraries, it seems like a good idea to make them
look more _like_ libraries, by putting things into separate modules as
far as possible.
This fixes several previous annoyances in which you had to link
against some object in order to get a function you needed, but that
object also contained other functions you didn't need which included
link-time symbol references you didn't want to have to deal with. The
usual offender was subsidiary supporting programs including misc.c for
some innocuous function and then finding they had to deal with the
requirements of buildinfo().
This big reorganisation introduces three new subdirectories called
'utils', one at the top level and one in each platform subdir. In each
case, the directory contains basically the same files that were
previously placed in the 'utils' build-time library, except that the
ones that were extremely miscellaneous (misc.c, utils.c, uxmisc.c,
winmisc.c, winmiscs.c, winutils.c) have been split up into much
smaller pieces.
It's had its day. It was there to support pre-WinNT platforms, on
which the security APIs don't exist - but more specifically, it was
there to support _build tools_ that only knew about pre-WinNT versions
of Windows, so that you couldn't even compile a program that would
_try_ to refer to the interprocess security APIs.
But we don't support those build systems any more in any case: more
recent changes like the assumption of (most of) C99 will have stopped
this code from building with compilers that old. So there's no reason
to clutter the code with backwards compatibility features that won't
help.
I left NO_SECURITY in place during the CMake migration, so that _just_
in case it needs resurrecting, some version of it will be available in
the git history. But I don't expect it to be needed, and I'm deleting
the whole thing now.
The _runtime_ check for interprocess security libraries is still in
place. So PuTTY tools built with a modern toolchain can still at least
try to run on the Win95/98/ME series, and they should detect that
those system DLLs don't exist and proceed sensibly in their absence.
That may also be a thing to throw out sooner or later, but I haven't
thrown it out as part of this commit.
This brings various concrete advantages over the previous system:
- consistent support for out-of-tree builds on all platforms
- more thorough support for Visual Studio IDE project files
- support for Ninja-based builds, which is particularly useful on
Windows where the alternative nmake has no parallel option
- a really simple set of build instructions that work the same way on
all the major platforms (look how much shorter README is!)
- better decoupling of the project configuration from the toolchain
configuration, so that my Windows cross-building doesn't need
(much) special treatment in CMakeLists.txt
- configure-time tests on Windows as well as Linux, so that a lot of
ad-hoc #ifdefs second-guessing a particular feature's presence from
the compiler version can now be replaced by tests of the feature
itself
Also some longer-term software-engineering advantages:
- other people have actually heard of CMake, so they'll be able to
produce patches to the new build setup more easily
- unlike the old mkfiles.pl, CMake is not my personal problem to
maintain
- most importantly, mkfiles.pl was just a horrible pile of
unmaintainable cruft, which even I found it painful to make changes
to or to use, and desperately needed throwing in the bin. I've
already thrown away all the variants of it I had in other projects
of mine, and was only delaying this one so we could make the 0.75
release branch first.
This change comes with a noticeable build-level restructuring. The
previous Recipe worked by compiling every object file exactly once,
and then making each executable by linking a precisely specified
subset of the same object files. But in CMake, that's not the natural
way to work - if you write the obvious command that puts the same
source file into two executable targets, CMake generates a makefile
that compiles it once per target. That can be an advantage, because it
gives you the freedom to compile it differently in each case (e.g.
with a #define telling it which program it's part of). But in a
project that has many executable targets and had carefully contrived
to _never_ need to build any module more than once, all it does is
bloat the build time pointlessly!
To avoid slowing down the build by a large factor, I've put most of
the modules of the code base into a collection of static libraries
organised vaguely thematically (SSH, other backends, crypto, network,
...). That means all those modules can still be compiled just once
each, because once each library is built it's reused unchanged for all
the executable targets.
One upside of this library-based structure is that now I don't have to
manually specify exactly which objects go into which programs any more
- it's enough to specify which libraries are needed, and the linker
will figure out the fine detail automatically. So there's less
maintenance to do in CMakeLists.txt when the source code changes.
But that reorganisation also adds fragility, because of the trad Unix
linker semantics of walking along the library list once each, so that
cyclic references between your libraries will provoke link errors. The
current setup builds successfully, but I suspect it only just manages
it.
(In particular, I've found that MinGW is the most finicky on this
score of the Windows compilers I've tried building with. So I've
included a MinGW test build in the new-look Buildscr, because
otherwise I think there'd be a significant risk of introducing
MinGW-only build failures due to library search order, which wasn't a
risk in the previous library-free build organisation.)
In the longer term I hope to be able to reduce the risk of that, via
gradual reorganisation (in particular, breaking up too-monolithic
modules, to reduce the risk of knock-on references when you included a
module for function A and it also contains function B with an
unsatisfied dependency you didn't really need). Ideally I want to
reach a state in which the libraries all have sensibly described
purposes, a clearly documented (partial) order in which they're
permitted to depend on each other, and a specification of what stubs
you have to put where if you're leaving one of them out (e.g.
nocrypto) and what callbacks you have to define in your non-library
objects to satisfy dependencies from things low in the stack (e.g.
out_of_memory()).
One thing that's gone completely missing in this migration,
unfortunately, is the unfinished MacOS port linked against Quartz GTK.
That's because it turned out that I can't currently build it myself,
on my own Mac: my previous installation of GTK had bit-rotted as a
side effect of an Xcode upgrade, and I haven't yet been able to
persuade jhbuild to make me a new one. So I can't even build the MacOS
port with the _old_ makefiles, and hence, I have no way of checking
that the new ones also work. I hope to bring that port back to life at
some point, but I don't want it to block the rest of this change.
In commit bb59f27386 I changed a use of the constant GWL_ID to
GWLP_ID, on the grounds that the former caused a build failure under
winelib. But the GWLP constants are supposed to be used with
GetWindowLongPtr, and I was still calling GetWindowLong.
(Benign, since the two sets of constants are the same. But that is the
only case in the whole code base where I'd made that error, and since
it was only introduced a couple of days ago, there's no possibility of
a longstanding historical reason for carefully not touching it!)
Turns out that the precautions against winelib builds failing, which I
put in years ago because I was using winelib as a build setup for
Coverity testing, are all obsolete. My Coverity build scripts runs
fine now without any of them.
This will let us put two controls side by side (e.g. in disjoint
columns of a multi-col layout) and indicate that instead of the
default behaviour of aligning their top edges, their centreline (or,
even better if available, font baseline) should be aligned.
NFC: nothing uses this yet.
Coverity points out that it's theoretically possible for the main loop
in radioline_common() to read r.bottom without having gone through the
conditional setup at the start of the function _or_ a previous
iteration of the main loop. I think this can only happen in some silly
case that doesn't actually come up, but on the other hand, it's easy
to add the necessary robustness.
If named_pipe_agent_gotdata was called with an error or EOF status, it
would call agent_cancel_query(pq), but then accidentally fall through
to the non-error handler which would dereference pq. I meant to return
early in that situation, and Coverity spotted that I'd left out the
early return statement.
The winelib headers don't have GWL_foo, only GWLP_foo (which, fair
enough, I should have been using already). And a side effect was to
point out some slightly incautious integer types in printf argument
lists.
This has apparently been missing more or less forever (though Unix
Plink does have it). Without this, ssh.c can't call ldisc_update,
which can't pass the current editing and echoing settings through to
seat_echoedit_update. Windows Plink has always _had_ an implementation
of that seat method (and the static function that preceded it), but it
was never able to be called, because of that missing link.
The result was that manual overrides in the Conf to force local
editing/echoing to a particular state were not honoured by Windows
Plink, and neither were mainchan.c's attempts to set the state
automatically based on whether a pty had been allocated at the far end
of the connection.
Thanks to Jacob for spotting this one: when we hand a passphrase back
to pageant.c via pageant_passphrase_request_success(), if the key
doesn't decrypt successfully, pageant.c responds by immediately
issuing another passphrase prompt - and it does it _synchronously_, by
calling back from within pageant_passphrase_request_success(). In this
case, the effect is that we end up in ask_passphrase_common(), which
starts by asserting that nonmodal_passphrase_hwnd is NULL - but it
wasn't NULL _quite_ yet, because end_passphrase_dialog() was expecting
to clean it up immediately after pageant_passphrase_request_success()
returned, i.e. just too late.
The heavyweight fix would be to arrange a toplevel callback to defer
opening the new window until after the old one had been cleaned up.
But in this case I don't think there's any need: it's enough to simply
do the operations in end_passphrase_dialog() in the opposite order, so
that first we destroy the old window and set nonmodal_passphrase_hwnd
back to NULL, and _then_ we call into pageant.c which might call us
back and open a fresh window.
Now the Remove button is disabled if there aren't any keys at all
loaded, and the Re-encrypt button is disabled if no key is currently
in a state where it's decrypted but re-encryptable.
Not quite sure how that happened! But at some point in the past, a bunch
of other definitions in winpgnt.c managed to get in between the first
few IDM_FOO constants and the last few. Bring them all back together.
I'm tired of remembering all those fiddly magic numbers and copying
them back and forth between the .rc file and the source code. I'm even
more tired of having to remember that in the long string of numbers
after a dialog item definition, the first one of them _isn't_ one of
the position and size coordinates. I've given them all symbolic names,
like they should have had all along.
I think I originally didn't bother because this was such a small GUI
compared to the much larger one in PuTTY proper. But it's growing!
This causes the main key list window to open when Pageant starts up,
instead of waiting until you select 'View Keys' from the systray menu.
My main motivation for adding this option is for development: if I'm
_working_ on some detail of the key list window, it cuts down
keystrokes in my edit-compile-retry cycle if I can have it
automatically pop up in every new test run of Pageant.
Normally I'd solve that by hacking an extra couple of lines
temporarily into the code while I was doing that piece of development.
But it suddenly struck me that there's no reason _not_ to add an
option like this permanently (the space of word-length command-line
flags is huge, and that particular one is unlikely to be needed for a
different meaning), and who knows, it _might_ come in useful to
someone in normal use. And at the very least it'll save me doing
another temporary hack the next time I'm doing development work on the
Pageant GUI. So I'll leave it in.
On a system with 2 or more displays with different DPI settings,
moving the PuTTY window from one display to another will make Windows
resize the window using its "bitmap" strategy, stretching/compressing
the text, making it fuzzy and harder to read. This change makes PuTTY
resize its window and font size to accurately fit the DPI of the
display it is on.
We process the WM_DPICHANGED message, saving the new DPI, window size
and position. We proceed to then reset the window, recreating the
fonts using the new DPI and calculate the new window size and position
based on the new font size, user display options (ie. with/without
scrollbar) and the suggested window position provided by Windows. The
suggested window size is usually not a perfect fit, therefore we must
add a small offset to the new window position in order to avoid issues
with repeated DPI changes while dragging the window from one display
to another.
The GUI loop that responded to the 'Remove Key' button in the key list
worked by actually trying to retrieve a pointer to the ssh_key for a
stored key, and then passing that back to the delete function. But
when a key is encrypted, that pointer is NULL, so we segfaulted.
Fixed by changing pageant_delete_ssh2_key() to take a numeric index in
the list instead of a key pointer.
This makes Windows Pageant's slightly ad-hoc command-line handling a
bit more like a standard option loop: we start by deciding whether we
think any given argument _is_ an option or not, and if we think it is,
we give an error message if it's one we don't recognise.
Now Windows Pageant has two clearly distinct dialog boxes for
requesting a key passphrase: one to use synchronously when the user
has just used the 'Add Key' GUI action, and one to use asynchronously
in response to an agent client's attempt to use a key that was loaded
encrypted.
Also fixed the wording in the asynchronous box: there were two copies
of the 'enter passphrase' instruction, one from the dialog definition
in pageant.rc file and one from the cross-platform pageant.c. Now
pageant.c doesn't format a whole user-facing message any more: it
leaves that to the platform front end to do it the way it wants.
I've also added a call to SetForegroundWindow, to try to get the
passphrase prompt into the foreground. In my experience this doesn't
actually get it the keyboard focus, which I think is deliberate on
Windows's part and there's nothing I can do about it. But at least the
user should _see_ that the prompt is there, so they can focus it
themself.
Now they have '(encrypted)' or '(re-encryptable)' after them, the same
as Unix Pageant.
Mostly this just involved tinkering with the code in winpgnt.c that
makes up the entry to put in the list box. But I also had to sprinkle
a few more calls to keylist_update() into the cross-platform
pageant.c, to make sure that the key list window is proactively
updated whenever a key is decrypted, re-encrypted, or loaded in
encrypted-only form.
The advantage of this API is that it gives us the extra flags saying
whether each key is encrypted or re-encryptable.
NFC: we don't yet do anything with that information, just make it
available for future work.
Now you can press 'i' at the host key prompt, and it will print all
the key fingerprints we know about, plus the full public key. So if
you wanted to check against a fingerprint type that wasn't the one
shown in the default prompt, you can see all the ones we've got.
Now we pass the whole set of fingerprints, and also a displayable
format for the full host public key.
NFC: this commit doesn't modify any of the host key prompts to _use_
any of the new information. That's coming next.
There's now a drop-down list box below the key list, from which you
can select a fingerprint type. Also, like GUI PuTTYgen, I've widened
the key list window to make room for wider SHA256 fingerprints.
The fingerprint type shown in the PuTTYgen main dialog can now be
selected from the Key menu. Also, I've widened the dialog box, because
SHA256 fingerprints are wider than MD5 ones.
(In a fixed-pitch font, the fingerprint itself is slightly shorter -
43 base64 characters in place of 47 characters of colon-separated hex.
But the "SHA256:" prefix lengthens it, and also, in a non-fixed-pitch
font such as the default one in Windows dialogs, the colons are very
narrow, so the MD5 fingerprint has a far smaller pixel width.)
There's a new enumeration of fingerprint types, and you tell
ssh2_fingerprint() or ssh2_fingerprint_blob() which of them to use.
So far, this is only implemented behind the scenes, and exposed for
testcrypt to test. All the call sites of ssh2_fingerprint pass a fixed
default fptype, which is still set to the old MD5. That will change
shortly.
The assorted host-key and warning prompt messages have no reason to
differ between the two platforms, so let's centralise them. Also,
while I'm here, some basic support functions that are the same in both
modules.
I've replaced the old versions using the standard MessageBox with new
versions using custom-drawn dialog templates and dialog procedures.
The visible changes are that the acceptance buttons have custom text
describing the actions they'll take, like the GTK versions, instead of
having to stick with bog-standard "Yes" and "No" and hope the user
reads the explanation in the main box text.
Also, this gives me the opportunity to spiff up the looks a bit, by
making the "POTENTIAL SECURITY BREACH" in the wrong-host-key dialog
larger and boldface.
But those are minor cosmetic side effects of my real purpose, which is
to make it possible to add further controls to these boxes in future.
The About text is in a readonly edit control rather than a static
control, so that it can be copy-pasted. Previously, I haven't managed
to avoid the side effect of the edit control being surrounded by a
border - but now I've finally found out how you can do it: clear all
the border styles and _then_ use SetWindowPos to force a redraw of the
frame.
I left this out of yesterday's collection of cmdgen CLI options and
GUI PuTTYgen dialog box, but only because I forgot about it. I don't
know off the top of my head why someone would particularly want to
configure this detail, but given that it _is_ configurable, it seems
like no extra trouble to expose it along with the rest of the
parameters, just in case.
The GUI key generator doesn't need a --reencrypt option, because you
can already just click Load and then Save without changing anything in
between. But it does need a dialog box with all the fiddly Argon2
settings in it, plus a setting to go back to PPK v2.
This removes both uses of SHA-1 in the file format: it was used as the
MAC protecting the key file against tamperproofing, and also used in
the key derivation step that converted the user's passphrase to cipher
and MAC keys.
The MAC is simply upgraded from HMAC-SHA-1 to HMAC-SHA-256; it is
otherwise unchanged in how it's applied (in particular, to what data).
The key derivation is totally reworked, to be based on Argon2, which
I've just added to the code base. This should make stolen encrypted
key files more resistant to brute-force attack.
Argon2 has assorted configurable parameters for memory and CPU usage;
the new key format includes all those parameters. So there's no reason
we can't have them under user control, if a user wants to be
particularly vigorous or particularly lightweight with their own key
files. They could even switch to one of the other flavours of Argon2,
if they thought side channels were an especially large or small risk
in their particular environment. In this commit I haven't added any UI
for controlling that kind of thing, but the PPK loading function is
all set up to cope, so that can all be added in a future commit
without having to change the file format.
While I'm at it, I've also switched the CBC encryption to using a
random IV (or rather, one derived from the passphrase along with the
cipher and MAC keys). That's more like normal SSH-2 practice.
This one is triggered by the following sequence:
- fill up the terminal window with text ('ls -l /dev' or similar)
- Win+Right then Win+Up to snap to the top right quadrant
- interactively drag away from the top right quadrant with the title
bar, which returns the window to its pre-snap size.
After the snap, the window border will have been recomputed to take
account of the window size not being an integer number of character
cells. So it needs recomputing back again the next time the window
size changes to something that _is_ an integer number - which happens
(or rather, we process it in a deferred manner) at the EXITSIZEMOVE.
So that's where we need to recompute the border (again).
If you open a Windows PuTTY session and press Win+Right, Windows
auto-sizes the terminal window to cover the right-hand half of the
screen. Then if you press Win+Up it will be auto-sized again, this
time to the top right quadrant. In the second resize (if you don't
have font-based resize handling turned on), the WM_SIZE handler code
will find a path through the twisty maze of ifs on which the border
between the text and the client-area edges is not recomputed, or
invalidated, or redrawn. So you can end up with half a line of text
from the previous window size still visible at the bottom of the new
window.
Fixed by factoring out the offset-recomputation code from the large
and complicated reset_window(), so that I can call just that snippet
on the dangerous code path.
There were three separate clauses in the WM_SIZE message handler which
potentially called term_size() to resize the actual Terminal object.
Two of them (for maximisation and normal non-maximised resizing drags)
first checked if an interactive resize was in progress, and if so,
instead set the need_backend_resize, to defer the term_size call to
the end of the interactive operation. But the third, for
_un_-maximising a window, didn't have that check.
As a result, if you start with a maximised window, drag its title bar
downward from the top of the screen (which unmaximises it), and
without letting go, drag it back up again (which maximises it), the
effect would be that you'd get one call to term_size in the middle of
the drag, and a second at the end. This isn't what I intended, and it
can also cause a redraw failure in full-screen applications on the
server (such as a terminal-based text editor - I reproduced this with
emacs), in which after the second term_size the terminal doesn't
manage to redraw itself.
Now I've pulled out the common logic that was in two of those three
pieces of code (and should have been in all three) into a subroutine
wm_size_resize_term, and arranged to call that in all three cases.
This fixes the inconsistency, and also fixes the emacs redraw problem
in the edge case I describe above.
This removes code duplication between the front ends: now the terminal
itself knows when the Conf is asking it not to turn on mouse
reporting, and the front ends can assume that if the terminal asks
them to then they should just do it.
This also makes the behaviour on mid-session reconfiguration more
sensible, in both code organisation and consistent behaviour.
Previously, term_reconfig would detect that CONF_no_mouse_rep had been
*set* in mid-session, and turn off mouse reporting mode in response.
But it would do it by clearing term->xterm_mouse, which isn't how the
front end enabled and disabled that feature, so things could get into
different states from different sequences of events that should have
ended up in the same place.
Also, the terminal wouldn't re-enable mouse reporting if
CONF_no_mouse_rep was *cleared* and the currently running terminal app
had been asking for mouse reports all along. Also, it was silly to
have half the CONF_no_mouse_rep handling in term_reconfig and the
other half in the front ends.
Now it should all be sensible, and also all centralised.
term->xterm_mouse consistently tracks whether the terminal application
is _requesting_ mouse reports; term->xterm_mouse_forbidden tracks
whether the client user is vetoing them; every change to either one of
those settings triggers a call to term_update_raw_mouse_mode which
sets up the front end appropriately for the current combination.
Similarly to other recent changes, the frontend now proactively keeps
Terminal up to date with the current position and size of the terminal
window, so that escape-sequence queries can be answered immediately
from the Terminal's own internal data structures without needing a
call back to the frontend.
Mostly this has let me remove explicit window-system API calls that
retrieve the window position and size, in favour of having the front
ends listen for WM_MOVE / WM_SIZE / ConfigureNotify events and track
the position and size that way. One exception is that the window pixel
size is still requested by Seat via a callback, to put in the
wire-encoded termios settings. That won't be happening very much, so
I'm leaving it this way round for the moment.
Now terminal.c makes nearly all the decisions about what the colour
palette should actually contain: it does the job of reading the
GUI-configurable colours out of Conf, and also the job of making up
the rest of the xterm-256 palette. The only exception is that TermWin
can provide a method to override some of the default colours, which on
Windows is used to implement the 'Use system colours' config option.
This saves code overall, partly because the front ends don't have to
be able to send palette data back to the Terminal any more (the
Terminal keeps the master copy and can answer palette-query escape
sequences from its own knowledge), and also because now there's only
one copy of the xterm-256 palette setup code (previously gtkwin.c and
window.c each had their own version of it).
In this rewrite, I've also introduced a multi-layered storage system
for the palette data in Terminal. One layer contains the palette
information derived from Conf; the next contains platform overrides
(currently just Windows's 'Use system colours'); the last one contains
overrides set by escape sequences in the middle of the session. The
topmost two layers can each _conditionally_ override the ones below.
As a result, if a server-side application manually resets (say) the
default fg and bg colours in mid-session to something that works well
in a particular application, those changes won't be wiped out by a
change in the Windows system colours or the Conf, which they would
have been before. Instead, changes in Conf or the system colours alter
the lower layers of the structure, but then when palette_rebuild is
called, the upper layer continues to override them, until a palette
reset (ESC]R) or terminal reset (e.g. ESC c) removes those upper-layer
changes. This seems like a more consistent strategy, in that the same
set of configuration settings will produce the same end result
regardless of what order they were applied in.
The palette-related methods in TermWin have had a total rework.
palette_get and palette_reset are both gone; palette_set can now set a
contiguous range of colours in one go; and the new
palette_get_overrides replaces window.c's old systopalette().
There are three separate indexing schemes in use by various bits of
the PuTTY front ends, and _none_ of them was clearly documented, let
alone all in the same place. Worse, functions that looked obviously
related, like win_palette_set and win_palette_get, used different
encodings.
Now all the encodings are defined together in putty.h, with
explanation of why there are three in the first place and clear
documentation of where each one is used; terminal.c provides mapping
tables that convert between them; the terminology is consistent
throughout; and win_palette_set has been converted to use the sensible
encoding.
Again, I've replaced it with a push-based notification going in the
other direction, so that when the terminal output stream includes a
query for 'is the window minimised?', the Terminal doesn't have to
consult the TermWin, because it already knows the answer.
The GTK API I'm using here (getting a GdkEventWindowState via
GtkWidget's window-state-event) is not present in GTK 1. The API I was
previously using (gdk_window_is_viewable) _is_, but it turns out that
that API doesn't reliably give the right answer: it only checks
visibility of GDK window ancestors, not X window ancestors. So in fact
GTK 1 PuTTY/pterm was only ever _pretending_ to reliably support the
'am I minimised' terminal query. Now it won't pretend any more.
Previously, window title management happened in a bipartisan sort of
way: front ends would choose their initial window title once they knew
what host name they were connecting to, but then Terminal would
override that later if the server set the window title by escape
sequences.
Now it's all done the same way round: the Terminal object is always
where titles are invented, and they only propagate in one direction,
from the Terminal to the TermWin.
This allows us to avoid duplicating in multiple front ends the logic
for what the initial window title should be. The frontend just has to
make one initial call to term_setup_window_titles, to tell the
terminal what hostname should go in the default title (if the Conf
doesn't override even that). Thereafter, all it has to do is respond
to the TermWin title-setting methods.
Similarly, the logic that handles window-title changes as a result of
the Change Settings dialog is also centralised into terminal.c. This
involved introducing an extra term_pre_reconfig() call that each
frontend can call to modify the Conf that will be used for the GUI
configurer; that's where the code now lives that copies the current
window title into there. (This also means that GTK PuTTY now behaves
consistently with Windows PuTTY on that point; GTK's previous
behaviour was less well thought out.)
It also means there's no longer any need for Terminal to talk to the
front end when a remote query wants to _find out_ the window title:
the Terminal knows the answer already. So TermWin's get_title method
can go.
All implementations of it work by checking the line_codepage field in
the ucsdata structure that the terminal itself already has a pointer
to. Therefore, it's a totally unnecessary query function: the terminal
can check the same thing directly by inspecting that structure!
(In fact, it already _does_ do that, for the purpose of actually
deciding how to decode terminal output data. It only uses this query
function at all for the auxiliary purpose of inventing useful tty
modes to pass to the backend.)
I just happened to spot a couple of cases where I'd apparently
open-coded the dupstr() logic before writing dupstr() itself, and
never got round to replacing the long-winded version with a call to
the standard helper function.
I found recently that if I ran Windows PSCP as a connection-sharing
downstream, it would send the SSH greeting down the named pipe, but
never receive anything back, though the upstream PuTTY was sending it.
PuTTY and Plink from the same build of the code would act happily as
downstreams.
It turned out that this was because the WaitForMultipleObjects call in
cli_main_loop() in wincliloop.c was failing with ERROR_ACCESS_DENIED.
That happened because it had an INVALID_HANDLE_VALUE in its list of
objects to wait for. That in turn happened because winselcli_event was
set to INVALID_HANDLE_VALUE.
Why was winselcli_event not set up? Because it's set up lazily by
do_select(), so if the program isn't handling any network sockets at
all (which is the case when PSCP is speaking over a named pipe
instead), then it never gets made into a valid event object.
So the problem wasn't that winselcli_event was in a bad state; it was
quite legitimately invalid. The problem was that wincliloop ought to
have _coped_ with it being invalid, by not inserting it in its list of
objects to wait for.
So now we check that case, and only insert winselcli_event in the list
if it's valid. And PSCP works again over connection sharing.
A user wrote in to point out the one in winhandl.c, and out of sheer
curiosity, I grepped the whole source base for '([a-zA-Z])\1\1' to see
if there were any others. Of course there are a lot of perfectly
sensible ones, like 'www' or 'Grrr', not to mention any amount of
0xFFFF and the iiii/bbbb emphasis system in Halibut code paragraphs,
but I did spot one more in the recently added udp.but section on
traits, and another in a variable name in uxagentsock.c.
The NEON support for SHA-512 acceleration looks very like SHA-256,
with a pair of chained instructions to generate a 128-bit vector
register full of message schedule, and another pair to update the hash
state based on those. But since SHA-512 is twice as big in all
dimensions, those four instructions between them only account for two
rounds of it, in place of four rounds of SHA-256.
Also, it's a tighter squeeze to fit all the data needed by those
instructions into their limited number of register operands. The NEON
SHA-256 implementation was able to keep its hash state and message
schedule stored as 128-bit vectors and then pass combinations of those
vectors directly to the instructions that did the work; for SHA-512,
in several places you have to make one of the input operands to the
main instruction by combining two halves of different vectors from
your existing state. But that operation is a quick single EXT
instruction, so no trouble.
The only other problem I've found is that clang - in particular the
version on M1 macOS, but as far as I can tell, even on current trunk -
doesn't seem to implement the NEON intrinsics for the SHA-512
extension. So I had to bodge my own versions with inline assembler in
order to get my implementation to compile under clang. Hopefully at
some point in the future the gap might be filled and I can relegate
that to a backwards-compatibility hack!
This commit adds the same kind of switching mechanism for SHA-512 that
we already had for SHA-256, SHA-1 and AES, and as with all of those,
plumbs it through to testcrypt so that you can explicitly ask for the
hardware or software version of SHA-512. So the test suite can run the
standard test vectors against both implementations in turn.
On M1 macOS, I'm testing at run time for the presence of SHA-512 by
checking a sysctl setting. You can perform the same test on the
command line by running "sysctl hw.optional.armv8_2_sha512".
As far as I can tell, on Windows there is not yet any flag to test for
this CPU feature, so for the moment, the new accelerated SHA-512 is
turned off unconditionally on Windows.
This is mostly easy: it's just like drawing an underline, except that
you put it at a different height in the character cell. The only
question is _where_ in the character cell.
Pango, and Windows GetOutlineTextMetrics, will tell you exactly where
the font wants to have it. Following xterm, I fall back to 3/8 of the
font's ascent (above the baseline) if either of those is unavailable.
Two minor memory-leak fixes on 0.74 seem not to be needed on master:
the fix in an early exit path of pageant_add_keyfile is done already
on master in a different way, and the missing sfree(fdlist) in
uxsftp.c is in code that's been completely rewritten in the uxcliloop
refactoring.
Other minor conflicts: the rework in commit b52641644905 of
ssh1login.c collided with the change from FLAG_VERBOSE to
seat_verbose(), and master and 0.74 each added an unrelated extra
field to the end of struct SshServerConfig.
udata[uindex] is a wchar_t, so if we pass it to sprintf("%d") we
should cast it to int (because who knows what primitive integer type
that might have corresponded to otherwise). I had done this in the
first of the two sprintfs that use it, but missed the second one a few
lines further on. Spotted by Coverity.
This mitigates CVE-2020-14002: if you're in the habit of clicking OK
to unknown host keys (the TOFU policy - trust on first use), then an
active attacker looking to exploit that policy to substitute their own
host key in your first connection to a server can use the host key
algorithm order in your KEXINIT to (not wholly reliably) detect
whether you have a key already stored for this host, and if so, abort
their attack to avoid giving themself away.
However, for users who _don't_ use the TOFU policy and instead check
new host keys out of band, the dynamic policy is more useful. So it's
provided as a configurable option.
We received a report that if you enable Windows 10's high-contrast
mode, the text in PuTTY's installer UI becomes invisible, because it's
displayed in the system default foreground colour against a background
of the white right-hand side of our 'msidialog.bmp' image. That's fine
when the system default fg is black, but high-contrast mode flips it
to white, and now you have white on white text, oops.
Some research in the WiX bug tracker suggests that in Windows 10 you
don't actually have to use BMP files for your installer images any
more: you can use PNG, and PNGs can be transparent. However, someone
else reported that that only works in up-to-date versions of Windows.
And in fact there's no need to go that far. A more elegant answer is
to simply not cover the whole dialog box with our background image in
the first place. I've reduced the size of the background image so that
it _only_ contains the pretty picture on the left-hand side, and omits
the big white rectangle that used to sit under the text. So now the
RHS of the dialog is not covered by any image at all, which has the
same effect as it being covered with a transparent image, except that
it doesn't require transparency support from msiexec. Either way, the
background for the text ends up being the system's default dialog-box
background, in the absence of any images or controls placed on top of
it - so when the high-contrast mode is enabled, it flips to black at
the same time as the text flips to white, and everything works as it
should.
The slight snag is that the pre-cooked WiX UI dialog specifications
let you override the background image itself, but not the Width and
Height fields in the control specifications that refer to them. So if
you just try to drop in a narrow image in the most obvious way, it
gets stretched across the whole window.
But that's not a show-stopper, because we're not 100% dependent on
getting WiX to produce exactly the right output. We already have the
technology to postprocess the MSI _after_ it comes out of WiX: we're
using it to fiddle the target-platform field for the Windows on Arm
installers. So all I had to do was to turn msiplatform.py into a more
general msifixup.py, add a second option to change the width of the
dialog background image, and run it on the x86 installers as well as
the Arm ones.
If a terminal window closed with a popup (due to a network error,
for instance) while the mouse pointer was hidden by 'Hide mouse
pointer when typing in window', the mouse pointer could remain hidden
while over the terminal window, making it hard to navigate to the
popup.
(cherry picked from commit d9c4ce9fd8)
On Windows, due to a copy-paste goof, the message that should have
read "Configuring n stop bits" instead ended with "data bits".
While I'm here, I've arranged that the "1 stop bit" case of that
message is in the singular. And then I've done the same thing again on
Unix, because I noticed that message was unconditionally plural too.
(cherry picked from commit bdb7b47a5e)
Now you can see exactly what pathname the backend tried to open for
the serial port, and what error code it got back from the OS when it
tried. That should help users distinguish between (for example) a
permissions problem and a typo in the filename.
Now, instead of a 'const char *' in the static data segment, error
messages returned from backend setup are dynamically allocated and
freed by the caller.
This will allow me to make the messages much more specific (including
errno values and the like). However, this commit is pure refactoring:
I've _just_ changed the allocation policy, and left all the messages
alone.
If a terminal window closed with a popup (due to a network error,
for instance) while the mouse pointer was hidden by 'Hide mouse
pointer when typing in window', the mouse pointer could remain hidden
while over the terminal window, making it hard to navigate to the
popup.
This fills in the missing piece of Windows Pageant's story on deferred
decryption: we now actually know how to put up a dialog box asking for
the passphrase, when a not-yet-decrypted key is used.
This is quite a rough implementation so far, but it's a start. Known
issues:
- these new non-modal dialog boxes are serialised with respect to
each other by the Pageant core, but they can run in parallel with a
passphrase prompt popping up from the ordinary GUI 'Add Key'
operation. That may be too confusing; perhaps I should fix it.
- I'm not confident that the passphrase dialog box gets the keyboard
focus in all situations where I'd like it to (or what I can do
about it if not).
- the text in the non-modal box has two copies of the instruction
'enter passphrase for key'.
In commit 1f399bec58 I had the idea of generating the protocol radio
buttons in the GUI configurer by looping over the backends[] array,
which gets the reliably correct list of available backends for a given
binary rather than having to second-guess. That's given me an idea: we
can do the same for the per-backend config panels too.
Now the GUI config panel for every backend is guarded by a check of
backend_vt_from_proto, and we won't display the config for that
backend unless it's present.
In particular, this allows me to move the serial-port configuration
back into config.c from the separate file sercfg.c: we find out
whether to apply it by querying backend_vt_from_proto(PROT_SERIAL),
the same as any other backend.
In _particular_ particular, that also makes it much easier for me to
move the serial config up the pecking order, so that it's now second
only to SSH in the list of per-protocol config panes, which I think is
now where it deserves to be.
(A side effect of that is that I now have to come up with a different
method of having each serial backend specify the subset of parity and
flow control schemes it supports. I've done it by adding an extra pair
of serial-port specific bitmask fields to BackendVtable, taking
advantage of the new vtable definition idiom to avoid having to
boringly declare them as zero in all the other backends.)
This is a sweeping change applied across the whole code base by a spot
of Emacs Lisp. Now, everywhere I declare a vtable filled with function
pointers (and the occasional const data member), all the members of
the vtable structure are initialised by name using the '.fieldname =
value' syntax introduced in C99.
We were already using this syntax for a handful of things in the new
key-generation progress report system, so it's not new to the code
base as a whole.
The advantage is that now, when a vtable only declares a subset of the
available fields, I can initialise the rest to NULL or zero just by
leaving them out. This is most dramatic in a couple of the outlying
vtables in things like psocks (which has a ConnectionLayerVtable
containing only one non-NULL method), but less dramatically, it means
that the new 'flags' field in BackendVtable can be completely left out
of every backend definition except for the SUPDUP one which defines it
to a nonzero value. Similarly, the test_for_upstream method only used
by SSH doesn't have to be mentioned in the rest of the backends;
network Plugs for listening sockets don't have to explicitly null out
'receive' and 'sent', and vice versa for 'accepting', and so on.
While I'm at it, I've normalised the declarations so they don't use
the unnecessarily verbose 'struct' keyword. Also a handful of them
weren't const; now they are.
The motivation is for the SUPDUP protocol. The server may send a
signal for the terminal to reset any input buffers. After this, the
server will not know the state of the terminal, so it is required to
send its cursor position back.
A 'strong' prime, as defined by the Handbook of Applied Cryptography,
is a prime p such that each of p-1 and p+1 has a large prime factor,
and that the large factor q of p-1 is such that q-1 in turn _also_ has
a large prime factor.
HoAC says that making your RSA key using primes of this form defeats
some factoring algorithms - but there are other faster algorithms to
which it makes no difference. So this is probably not a useful
precaution in practice. However, it has been recommended in the past
by some official standards, and it's easy to implement given the new
general facility in PrimeCandidateSource that lets you ask for your
prime to satisfy an arbitrary modular congruence. (And HoAC also says
there's no particular reason _not_ to use strong primes.) So I provide
it as an option, just in case anyone wants to select it.
The change to the key generation algorithm is entirely in sshrsag.c,
and is neatly independent of the prime-generation system in use. If
you're using Maurer provable prime generation, then the known factor q
of p-1 can be used to help certify p, and the one for q-1 to help with
q in turn; if you switch to probabilistic prime generation then you
still get an RSA key with the right structure, except that every time
the definition says 'prime factor' you just append '(probably)'.
(The probabilistic version of this procedure is described as 'Gordon's
algorithm' in HoAC section 4.4.2.)
Most of them are now _mandatory_ P3 scripts, because I'm tired of
maintaining everything to be compatible with both versions.
The current exceptions are gdb.py (which has to live with whatever gdb
gives it), and kh2reg.py (which is actually designed for other people
to use, and some of them might still be stuck on P2 for the moment).
In the Windows GUI, all the controls that were previously named or
labelled Ed25519 are now labelled EdDSA, and when you select that
top-level key type, there's a dropdown for the specific curve (just
like for ECDSA), whose only current value is Ed25519.
In command-line PuTTYgen, you can say '-t eddsa' and give a number of
bits, just like '-t ecdsa'. You can also still say '-t ed25519', for
backwards compatibility.
Also in command-line PuTTYgen, I've reworked the error messages if you
give a number of bits that doesn't correspond to a known elliptic
curve. Now the messages are generated by consulting the list of
curves, so that that list has to be updated by hand in one fewer
place.
In Windows PuTTYgen, this is selected by an extra set of radio-button
style menu options in the Key menu. In the command-line version,
there's a new --primes=provable option.
This whole system is new, so I'm not enabling it by default just yet.
I may in future, though: it's running faster than I expected (in
particular, a lot faster than any previous prototype of the same
algorithm I attempted in standalone Python).
The old system I removed in commit 79d3c1783b had both linear and
exponential phase types, but the new one only had exponential, because
at that point I'd just thrown away all the clients of the linear phase
type. But I'm going to add another one shortly, so I have to put it
back in.
The functions primegen() and primegen_add_progress_phase() are gone.
In their place is a small vtable system with two methods corresponding
to them, plus the usual admin of allocating and freeing contexts.
This API change is the starting point for being able to drop in
different prime generation algorithms at run time in response to user
configuration.
The old API was one of those horrible things I used to do when I was
young and foolish, in which you have just one function, and indicate
which of lots of things it's doing by passing in flags. It was crying
out to be replaced with a vtable.
While I'm at it, I've reworked the code on the Windows side that
decides what to do with the progress bar, so that it's based on
actually justifiable estimates of probability rather than magic
integer constants.
Since computers are generally faster now than they were at the start
of this project, I've also decided there's no longer any point in
making the fixed final part of RSA key generation bother to report
progress at all. So the progress bars are now only for the variable
part, i.e. the actual prime generations.
(This is a reapplication of commit a7bdefb39, without the Miller-Rabin
refactoring accidentally folded into it. Also this time I've added -lm
to the link options, which for some reason _didn't_ cause me a link
failure last time round. No idea why not.)
This reverts commit a7bdefb394.
I had accidentally mashed it together with another commit. I did
actually want to push both of them, but I'd rather push them
separately! So I'm backing out the combined blob, and I'll re-push
them with their proper comments and explanations.
The old API was one of those horrible things I used to do when I was
young and foolish, in which you have just one function, and indicate
which of lots of things it's doing by passing in flags. It was crying
out to be replaced with a vtable.
While I'm at it, I've reworked the code on the Windows side that
decides what to do with the progress bar, so that it's based on
actually justifiable estimates of probability rather than magic
integer constants.
Since computers are generally faster now than they were at the start
of this project, I've also decided there's no longer any point in
making the fixed final part of RSA key generation bother to report
progress at all. So the progress bars are now only for the variable
part, i.e. the actual prime generations.
This is built more or less entirely out of pieces I already had. The
SOCKS server code is provided by the dynamic forwarding code in
portfwd.c. When that accepts a connection request, it wants to talk to
an SSH ConnectionLayer, which is already a trait with interchangeable
implementations - so I just provide one of my own which only supports
the lportfwd_open() method. And that in turn returns an SshChannel
object, with a special trait implementation all of whose methods
just funnel back to an ordinary Socket.
Result: you get a Socket-to-Socket SOCKS implementation with no SSH
anywhere, and even a minimal amount of need to _pretend_ internally to
be an SSH implementation.
Additional features include the ability to log all the traffic in the
form of diagnostics to standard error, or log each direction of each
connection separately to a file, or for anything more general, to log
each direction of each connection through a pipe to a subcommand that
can filter out whatever you think are the interesting parts. Also, you
can spawn a subcommand after the SOCKS server is set up, and terminate
automatically when that subcommand does - e.g. you might use this to
wrap the execution of a single SOCKS-using program.
This is a modernisation of a diagnostic utility I've had kicking
around out-of-tree for a long time. With all of last year's
refactorings, it now becomes feasible to keep it in-tree without
needing huge amounts of scaffolding. Also, this version runs on
Windows, which is more than the old one did. (On Windows I haven't
implemented the subprocess parts, although there's no reason I
_couldn't_.)
As well as diagnostic uses, this may also be useful in some situations
as a thing to forward ports to: PuTTY doesn't currently support
reverse dynamic port forwarding (in which the remote listening port
acts as a SOCKS server), but you could get the same effect by
forwarding a remote port to a local instance of this. (Although, of
course, that's nothing you couldn't achieve using any other SOCKS
server.)
The previous 'name' field was awkwardly serving both purposes: it was
a machine-readable identifier for the backend used in the saved
session format, and it was also used in error messages when Plink
wanted to complain that it didn't support a particular backend. Now
there are two separate name fields for those purposes.
Sometimes, within a switch statement, you want to declare local
variables specific to the handler for one particular case. Until now
I've mostly been writing this in the form
switch (discriminant) {
case SIMPLE:
do stuff;
break;
case COMPLICATED:
{
declare variables;
do stuff;
}
break;
}
which is ugly because the two pieces of essentially similar code
appear at different indent levels, and also inconvenient because you
have less horizontal space available to write the complicated case
handler in - particuarly undesirable because _complicated_ case
handlers are the ones most likely to need all the space they can get!
After encountering a rather nicer idiom in the LLVM source code, and
after a bit of hackery this morning figuring out how to persuade
Emacs's auto-indent to do what I wanted with it, I've decided to move
to an idiom in which the open brace comes right after the case
statement, and the code within it is indented the same as it would
have been without the brace. Then the whole case handler (including
the break) lives inside those braces, and you get something that looks
more like this:
switch (discriminant) {
case SIMPLE:
do stuff;
break;
case COMPLICATED: {
declare variables;
do stuff;
break;
}
}
This commit is a big-bang change that reformats all the complicated
case handlers I could find into the new layout. This is particularly
nice in the Pageant main function, in which almost _every_ case
handler had a bundle of variables and was long and complicated. (In
fact that's what motivated me to get round to this.) Some of the
innermost parts of the terminal escape-sequence handling are also
breathing a bit easier now the horizontal pressure on them is
relieved.
(Also, in a few cases, I was able to remove the extra braces
completely, because the only variable local to the case handler was a
loop variable which our new C99 policy allows me to move into the
initialiser clause of its for statement.)
Viewed with whitespace ignored, this is not too disruptive a change.
Downstream patches that conflict with it may need to be reapplied
using --ignore-whitespace or similar.
We received a report that if you enable Windows 10's high-contrast
mode, the text in PuTTY's installer UI becomes invisible, because it's
displayed in the system default foreground colour against a background
of the white right-hand side of our 'msidialog.bmp' image. That's fine
when the system default fg is black, but high-contrast mode flips it
to white, and now you have white on white text, oops.
Some research in the WiX bug tracker suggests that in Windows 10 you
don't actually have to use BMP files for your installer images any
more: you can use PNG, and PNGs can be transparent. However, someone
else reported that that only works in up-to-date versions of Windows.
And in fact there's no need to go that far. A more elegant answer is
to simply not cover the whole dialog box with our background image in
the first place. I've reduced the size of the background image so that
it _only_ contains the pretty picture on the left-hand side, and omits
the big white rectangle that used to sit under the text. So now the
RHS of the dialog is not covered by any image at all, which has the
same effect as it being covered with a transparent image, except that
it doesn't require transparency support from msiexec. Either way, the
background for the text ends up being the system's default dialog-box
background, in the absence of any images or controls placed on top of
it - so when the high-contrast mode is enabled, it flips to black at
the same time as the text flips to white, and everything works as it
should.
The slight snag is that the pre-cooked WiX UI dialog specifications
let you override the background image itself, but not the Width and
Height fields in the control specifications that refer to them. So if
you just try to drop in a narrow image in the most obvious way, it
gets stretched across the whole window.
But that's not a show-stopper, because we're not 100% dependent on
getting WiX to produce exactly the right output. We already have the
technology to postprocess the MSI _after_ it comes out of WiX: we're
using it to fiddle the target-platform field for the Windows on Arm
installers. So all I had to do was to turn msiplatform.py into a more
general msifixup.py, add a second option to change the width of the
dialog background image, and run it on the x86 installers as well as
the Arm ones.
Similarly to the previous commit, this function had an inconsistent
parameter list between Unix and Windows, because the Windows source
file that defines it (winnet.c) didn't include ssh.h where its
prototype lives, so the compiler never checked.
Luckily, the discrepancy was that the Windows version of the function
was declared as taking an extra parameter which it ignored, so the fix
is very easy.
(cherry picked from commit b7f011aed7)
This was pointed out as a compiler warning when I test-built with
up-to-date clang-cl. It looks as if it would cause the IDM_FULLSCREEN
item on the system menu to be wrongly greyed/ungreyed, but in fact I
think it's benign, because MF_BYCOMMAND == 0. So it's _just_ a
warning fix, luckily!
(cherry picked from commit 213723a718)
A user reports that Visual Studio 2013 and earlier have printf
implementations in their C library that don't support the 'z' modifier
to indicate that an integer argument is size_t. The 'I' modifier
apparently works in place of it.
To avoid littering ifdefs everywhere, I've invented my own inttypes.h
style macros to wrap size_t formatting directives, which are defined
to %zu and %zx normally, or %Iu and %Ix in old-VS mode. Those are in
defs.h, and they're used everywhere that a %z might otherwise get into
the Windows build.
(cherry picked from commit 82a7e8c4ac)
A user reports that the ReadFile call in console_get_userpass_input
fails with ERROR_NOT_ENOUGH_MEMORY on Windows 7, and further reports
that this problem only happens if you tell ReadFile to read more than
31366 bytes in a single call.
That seems to be a thing that other people have found as well: I
turned up a similar workaround in Ruby's Win32 support module, except
that there it's for WriteConsole. So I'm reducing my arbitrary read
size of 64K to 16K, which is well under that limit.
This issue became noticeable in PuTTY as of the recent commit
cd6bc14f0, which reworked console_get_userpass_input to use strbufs.
Previously we were trying to read an amount proportional to the
existing size of the buffer, so as to grow the buffer exponentially to
save quadratic-time reallocation. That was OK in practice, since the
initial read size was nice and small. But in principle, the same bug
was present in that version of the code, just latent - if we'd ever
been called on to read a _really large_ amount of data, then
_eventually_ the input size parameter to ReadFile would have grown
beyond that mysterious limit!
(cherry picked from commit 7b79d22021)
Those chomp operations in wincons.c and uxcons.c looked ugly, and I'm
not totally convinced they couldn't underrun the buffer by 1 byte in
weird circumstances. strbuf_chomp is neater.
(cherry picked from commit 7590d0625b)
UBsan pointed out another memcpy from NULL (again with length 0) in
the prompts_t system. When I looked at it, I realised that firstly
prompt_ensure_result_size was an early not-so-good implementation of
sgrowarray_nm that would benefit from being replaced with a call to
the real one, and secondly, the whole system for storing prompt
results should really have been replaced with strbufs with the no-move
option, because that's doing all the same jobs better.
So, now each prompt_t holds a strbuf in place of its previous manually
managed string. prompt_ensure_result_size is gone (the console
prompt-reading functions use strbuf_append, and everything else just
adds to the strbuf in the usual marshal.c way). New functions exist to
retrieve a prompt_t's result, either by reference or copied.
(cherry picked from commit cd6bc14f04)
These are better than my previous approach of just assigning to
sb->len, because firstly they check by assertion that the new length
is within range, and secondly they preserve the invariant that the
byte stored in the buffer just after the length runs out is \0.
Switched to using the new functions everywhere a grep could turn up
opportunities.
(cherry picked from commit 5891142aee)
In setting up the ECC tests for cmdgen, I noticed that OpenSSH and
PuTTYgen disagree on the bit length to put in a key fingerprint for an
ed25519 key: we think 255, they think 256.
On reflection, I think 255 is more accurate, which is why I bodged
get_fp() in the test suite to ignore that difference when checking our
key fingerprint against OpenSSH's. But having done that, it now seems
silly that if you unnecessarily specify a bit count at ed25519
generation time, cmdgen will insist that it be 256!
255 is now permitted everywhere an ed25519 bit count is input. 256 is
also still allowed for backwards compatibility but 255 is preferred by
the error message if you give any other value.
(cherry picked from commit 187cc8bfcc)
The do_select function is called with a boolean parameter indicating
whether we're supposed to start or stop paying attention to network
activity on a given socket. So if we freeze and unfreeze the socket in
mid-session because of backlog, we'll call do_select(s, false) to
freeze it, and do_select(s, true) to unfreeze it.
But the implementation of do_select in the Windows SFTP code predated
the rigorous handling of socket backlogs, so it assumed that
do_select(s, true) would only be called at initialisation time, i.e.
only once, and therefore that it was safe to use that flag as a cue to
set up the Windows event object to associate with socket activity.
Hence, every time the socket was frozen and unfrozen, we would create
a new netevent at unfreeze time, leaking the old one.
I think perhaps part of the reason why that was hard to figure out was
that the boolean parameter was called 'startup' rather than 'enable'.
To make it less confusing the next time I read this code, I've also
renamed it, and while I was at it, adjusted another related comment.
(cherry picked from commit bd5c957e5b)
This applies to both server modes ('pageant -E key.ppk [lifetime]')
and client mode ('pageant -a -E key.ppk').
I'm not completely confident that the CLI syntax is actually right
yet, but for the moment, it's enough that it _exists_. Now I don't
have to test the encrypted-key loading via manually mocked-up agent
requests.
On Windows, due to a copy-paste goof, the message that should have
read "Configuring n stop bits" instead ended with "data bits".
While I'm here, I've arranged that the "1 stop bit" case of that
message is in the singular. And then I've done the same thing again on
Unix, because I noticed that message was unconditionally plural too.
Now I've got an enum for PlugLogType, it's easier to add things to it.
We were giving a blow-by-blow account of each connection attempt, and
when it failed, saying what went wrong before we moved on to the next
candidate address, but when one finally succeeded, we never logged
_that_. Now we do.
There aren't quite as many of these as there are on Unix, but Windows
Plink and PSFTP still share some suspiciously similar-looking code.
Now they're both clients of wincliloop.c.
This begins to head towards the goal of storing a key file encrypted
in Pageant, and decrypting it on demand via a GUI prompt the first
time a client requests a signature from it. That won't be a facility
available in all situations, so we have to be able to return failure
from the prompt.
More precisely, there are two versions of this API, one in
PageantClient and one in PageantListenerClient: the stream
implementation of PageantClient implements the former API and hands it
off to the latter. Windows Pageant has to directly implement both (but
they will end up funnelling to the same function within winpgnt.c).
NFC: for the moment, the new API functions are never called, and every
implementation of them returns failure.
In the previous trawl of this, I didn't bother with the statics in
main-program modules, on the grounds that my main aim was to avoid
'library' objects (shared between multiple programs) from polluting
the global namespace. But I think it's worth being more strict after
all, so this commit adds 'static' to a lot more file-scope variables
that aren't needed outside their own module.
Now it's no longer used, we can get rid of it, and better still, get
rid of every #define PUTTY_DO_GLOBALS in the many source files that
previously had them.
We now have no remaining things in header files that switch from being
a declaration to a definition depending on an awkward #define at the
point of including that header. There are still a few mutable
variables with external linkage, but at least now each one is defined
in a specific source file file appropriate to its purpose and context.
The remaining globals as of this commit were:
- 'logctx' and 'term', which never needed to be globals in the first
place, because they were never actually shared between source
files. Now 'term' is just a static in window.c, and 'logctx' is a
static in each of that and winplink.c.
- 'hinst', which still has external linkage, but is now defined
separately in each source file that sets it up (i.e. those with a
WinMain)
- osMajorVersion, osMinorVersion and osPlatformId, whose definitions
now live in winmisc.c alongside the code which sets them up.
(Actually they were defined there all along, it turns out, but
every toolchain I've built with has commoned them together with the
version defined by the GLOBAL in the header.)
- 'hwnd', which nothing was actually _using_ any more after previous
commits, so all this commit had to do was delete it.
The declarations in header files now use ordinary 'extern'. That means
I have to arrange to put definitions matching those declarations in
the appropriate modules; so I've made a macro DEFINE_WINDOWS_FUNCTION
which performs a definition matching a prior DECLARE_WINDOWS_FUNCTION
(and reusing the typedef made by the latter).
This applies not only to the batch of functions that were marked
GLOBAL in winstuff.h, but also the auxiliary sets marked
WINCAPI_GLOBAL and WINSECUR_GLOBAL in wincapi.h and winsecur.h
respectively.
This was the difficult part of cleaning up that global variable. The
main Windows PuTTY GUI is split between source files, so that _does_
actually need to refer to the main window from multiple places.
But all the places where windlg.c needed to use 'hwnd' are seat
methods, so they were already receiving a Seat pointer as a parameter.
In other words, the methods of the Windows GUI Seat were already split
between source files. So it seems only fair that they should be able
to share knowledge of the seat's data as well.
Hence, I've created a small 'WinGuiSeat' structure which both window.c
and windlg.c can see the layout of, and put the main terminal window
handle in there. Then the seat methods implemented in windlg.c, like
win_seat_verify_ssh_host_key, can use container_of to turn the Seat
pointer parameter back into the address of that structure, just as the
methods in window.c can do (even though they currently don't need to).
(Who knows: now that it _exists_, perhaps that structure can be
gradually expanded in future to turn it into a proper encapsulation of
all the Windows frontend's state, like we should have had all
along...)
I've also moved the Windows GUI LogPolicy implementation into the same
object (i.e. WinGuiSeat implements both traits at once). That allows
win_gui_logging_error to recover the same WinGuiSeat from its input
LogPolicy pointer, which means it can get from there to the Seat facet
of the same object, so that I don't need the extern variable
'win_seat' any more either.
Windows Pageant doesn't really have a 'main window' any more, ever
since I separated the roles of system-tray management and IPC receiver
into two different hidden windows managed by different threads. So it
was already silly to be storing one of them in the global 'HWND hwnd'
variable, because it's no longer obvious which it should be.
So there's now a static variable 'traywindow' within winpgnt.c which
it uses in place of the global 'hwnd'.
The GUI version of pgp_fingerprints() is now a differently named
function that takes a parent HWND as a parameter, and so does my
help-enabled wrapper around MessageBox.
It's now a static in the main source file of each application that
uses it, and isn't accessible from any other source file unless the
main one passes it by reference.
In fact, there were almost no instances of the latter: only the
config-box functions in windlg.c were using 'conf' by virtue of its
globalness, and it's easy to make those take it as a parameter.
It's silly to set it at each call site of restrict_process_acl() if
that function returns success! More sensible to have it be a flag in
the same source file as restrict_process_acl(), set as an automatic
_side effect_ of success.
I've renamed the variable itself, and the global name 'restricted_acl'
is now a query function that asks winsecur.c whether that operation
has been (successfully) performed.
These global variables are only ever used by load_settings, which uses
them to vary the default protocol and port number in the absence of
any specification elsewhere. So there's no real need for them to be
universally accessible via the awkward GLOBAL mechanism: they can be
statics inside settings.c, with accessor functions that can set them.
That was the last GLOBAL in putty.h, so I've removed the definition of
the macro GLOBAL itself as well. There are still some GLOBALs in the
Windows subdirectory, though.
I haven't managed to make this one _not_ be a mutable variable, but at
least it's not global across all tools any more: it lives in cmdline.c
along with the code that decides what to set it to, and cmdline.c
exports a query method to ask for its value.
Another ugly mutable global variable gone: now, instead of this
variable being defined in cmdline.c and written to by everyone's
main(), it's defined _alongside_ everyone's main() as a constant, and
cmdline.c just refers to it.
A bonus is that now nocmdline.c doesn't have to define it anyway for
tools that don't use cmdline.c. But mostly, it didn't need to be
mutable, so better for it not to be.
While I'm at it, I've also fiddled with the bit flags that go in it,
to define their values automatically using a list macro instead of
manually specifying each one to be a different power of 2.
This is another piece of the old 2003 attempt at async agent requests.
Nothing ever calls this function (in particular, the new working
version of async-agent doesn't need it). Remove it completely, and all
its special-window-message implementations too.
(If we _were_ still using this function, then it would surely be
possible to fold it into the more recently introduced general
toplevel-callback system, and get rid of all this single-use special
code. But we're not, so removing it completely is even easier.)
In particular, this system was the only reason why Windows Plink paid
any attention to its message queue. So now I can make it call plain
WaitForMultipleObjects instead of MsgWaitForMultipleObjects.
This was the easiest flag to remove: nothing ever checks it at all!
It was part of an abandoned early attempt to make Pageant requests
asynchronous. The flag was added in commit 135abf244 (April 2003); the
code that used it was #ifdef-ed out in commit 98d735fde (January 2004),
and removed completely in commit f864265e3 (January 2017).
We now have an actually working system for async agent requests on
Windows, via the new named-pipe IPC. And we also have a perfectly good
way to force a particular agent request to work synchronously: just
pass NULL as the callback function pointer. All of that works just
fine, without ever using this flag. So begone!
The global 'int flags' has always been an ugly feature of this code
base, and I suddenly thought that perhaps it's time to start throwing
it out, one flag at a time, until it's totally unused.
My first target is FLAG_VERBOSE. This was usually set by cmdline.c
when it saw a -v option on the program's command line, except that GUI
PuTTY itself sets it unconditionally on startup. And then various bits
of the code would check it in order to decide whether to print a given
message.
In the current system of front-end abstraction traits, there's no
_one_ place that I can move it to. But there are two: every place that
checked FLAG_VERBOSE has access to either a Seat or a LogPolicy. So
now each of those traits has a query method for 'do I want verbose
messages?'.
A good effect of this is that subsidiary Seats, like the ones used in
Uppity for the main SSH server module itself and the server end of
shell channels, now get to have their own verbosity setting instead of
inheriting the one global one. In fact I don't expect any code using
those Seats to be generating any messages at all, but if that changes
later, we'll have a way to control it. (Who knows, perhaps logging in
Uppity might become a thing.)
As part of this cleanup, I've added a new flag to cmdline_tooltype,
called TOOLTYPE_NO_VERBOSE_OPTION. The unconditionally-verbose tools
now set that, and it has the effect of making cmdline.c disallow -v
completely. So where 'putty -v' would previously have been silently
ignored ("I was already verbose"), it's now an error, reminding you
that that option doesn't actually do anything.
Finally, the 'default_logpolicy' provided by uxcons.c and wincons.c
(with identical definitions) has had to move into a new file of its
own, because now it has to ask cmdline.c for the verbosity setting as
well as asking console.c for the rest of its methods. So there's a new
file clicons.c which can only be included by programs that link
against both cmdline.c _and_ one of the *cons.c, and I've renamed the
logpolicy to reflect that.
These were just too footling for even me to bother splitting up into
multiple commits:
- a couple of int -> size_t changes left out of the big-bang commit
0cda34c6f
- a few 'const' added to pointer-type casts that are only going to be
read from (leaving out the const provokes a warning if the pointer
was const _before_ the cast)
- a couple of 'return' statements trying to pass the void return of
one function through to another.
- another missing (void) in a declaration in putty.h (but this one
didn't cause any knock-on confusion).
- a few tweaks to macros, to arrange that they eat a semicolon after
the macro call (extra do ... while (0) wrappers, mostly, and one
case where I had to do it another way because the macro included a
variable declaration intended to remain in scope)
- reworked key_type_to_str to stop putting an unreachable 'break'
statement after every 'return'
- removed yet another type-check of a function loaded from a Windows
system DLL
- and finally, a totally spurious semicolon right after an open brace
in mainchan.c.
A trawl through the code with -Wmissing-prototypes and
-Wmissing-variable-declarations turned up a lot of things that should
have been internal to a particular source file, but were accidentally
global. Keep the namespace clean by making them all static.
(Also, while I'm here, a couple of them were missing a 'const': the
ONE and ZERO arrays in sshcrcda.c, and EMPTY_WINDOW_TITLE in
terminal.c.)
These are all intended to ensure that the declarations of things in
header files are in scope where the same thing is subsequently
defined, to make it harder to define it in a way that doesn't match.
(For example, the new #include in winnet.c would have caught the
just-fixed mis-definition of platform_get_x11_unix_address.)
Similarly to the previous commit, this function had an inconsistent
parameter list between Unix and Windows, because the Windows source
file that defines it (winnet.c) didn't include ssh.h where its
prototype lives, so the compiler never checked.
Luckily, the discrepancy was that the Windows version of the function
was declared as taking an extra parameter which it ignored, so the fix
is very easy.
In commit 09954a87c I introduced the portfwdmgr_connect_socket()
system, which opened a port forwarding given a callback to create the
Socket itself, with the aim of using it to make forwardings to Unix-
domain sockets and Windows named pipes (both initially for agent
forwarding).
But I forgot that a year and a bit ago, in commit 834396170, I already
introduced a similar low-level system for creating a PortForwarding
around an unusual kind of socket: the portfwd_raw_new() system, which
in place of a callback uses a two-phase setup protocol (you create the
socket in between the two setup calls, and can roll it back if the
socket can't be created).
There's really no need to have _both_ these systems! So now I'm
merging them, which is to say, I'm enhancing portfwd_raw_new to have
the one new feature it needs, and throwing away the newer system
completely. The new feature is to be able to control the initial state
of the 'ready' flag: portfwd_raw_new was always used for initiating
port forwardings in response to an incoming local connection, which
means you need to start off with ready=false and set it true when the
other end of the SSH connection sends back OPEN_CONFIRMATION. Now it's
being used for initiating port forwardings in response to a
CHANNEL_OPEN, we need to be able to start with ready=true.
This commit reverts 09954a87c2 and its
followup fix 12aa06ccc9, and simplifies
the agent_connect system down to a single trivial function that makes
a Socket given a Plug.
This was pointed out as a compiler warning when I test-built with
up-to-date clang-cl. It looks as if it would cause the IDM_FULLSCREEN
item on the system menu to be wrongly greyed/ungreyed, but in fact I
think it's benign, because MF_BYCOMMAND == 0. So it's _just_ a
warning fix, luckily!
A user reports that Visual Studio 2013 and earlier have printf
implementations in their C library that don't support the 'z' modifier
to indicate that an integer argument is size_t. The 'I' modifier
apparently works in place of it.
To avoid littering ifdefs everywhere, I've invented my own inttypes.h
style macros to wrap size_t formatting directives, which are defined
to %zu and %zx normally, or %Iu and %Ix in old-VS mode. Those are in
defs.h, and they're used everywhere that a %z might otherwise get into
the Windows build.
In an early draft of commit de38a4d82 I used 'void *' as the reqid
type, and then I thought better of it and made it a special type of
its own, in keeping with my usual idea that it's better to have your
casts somewhat checked than totally unchecked. One remnant of the
'void *' version got past me. Now fixed.
A user reports that the ReadFile call in console_get_userpass_input
fails with ERROR_NOT_ENOUGH_MEMORY on Windows 7, and further reports
that this problem only happens if you tell ReadFile to read more than
31366 bytes in a single call.
That seems to be a thing that other people have found as well: I
turned up a similar workaround in Ruby's Win32 support module, except
that there it's for WriteConsole. So I'm reducing my arbitrary read
size of 64K to 16K, which is well under that limit.
This issue became noticeable in PuTTY as of the recent commit
cd6bc14f0, which reworked console_get_userpass_input to use strbufs.
Previously we were trying to read an amount proportional to the
existing size of the buffer, so as to grow the buffer exponentially to
save quadratic-time reallocation. That was OK in practice, since the
initial read size was nice and small. But in principle, the same bug
was present in that version of the code, just latent - if we'd ever
been called on to read a _really large_ amount of data, then
_eventually_ the input size parameter to ReadFile would have grown
beyond that mysterious limit!
This is a pure refactoring: no functional change expected.
This commit introduces two new small vtable-style APIs. One is
PageantClient, which identifies a particular client of the Pageant
'core' (meaning the code that handles each individual request). This
changes pageant_handle_msg into an asynchronous operation: you pass in
an agent request message and an identifier, and at some later point,
the got_response method in your PageantClient will be called with the
answer (and the same identifier, to allow you to match requests to
responses). The trait vtable also contains a logging system.
The main importance of PageantClient, and the reason why it has to
exist instead of just passing pageant_handle_msg a bare callback
function pointer and context parameter, is that it provides robustness
if a client stops existing while a request is still pending. You call
pageant_unregister_client, and any unfinished requests associated with
that client in the Pageant core will be cleaned up, so that you're
guaranteed that after the unregister operation, no stray callbacks
will happen with a stale pointer to that client.
The WM_COPYDATA interface of Windows Pageant is a direct client of
this API. The other client is PageantListener, the system that lives
in pageant.c and handles stream-based agent connections for both Unix
Pageant and the new Windows named-pipe IPC. More specifically, each
individual connection to the listening socket is a separate
PageantClient, which means that if a socket is closed abruptly or
suffers an OS error, that client can be unregistered and any pending
requests cancelled without disrupting other connections.
Users of PageantListener have a second client vtable they can use,
called PageantListenerClient. That contains _only_ logging facilities,
and at the moment, only Unix Pageant bothers to use it (and even that
only in debugging mode).
Finally, internally to the Pageant core, there's a new trait called
PageantAsyncOp which describes an agent request in the process of
being handled. But at the moment, it has only one trivial
implementation, which is handed the full response message already
constructed, and on the next toplevel callback, passes it back to the
PageantClient.
This is preparation to allow Pageant to be able to return to its GUI
main loop in the middle of handling a request (e.g. present a dialog
box to the user related to that particular request, and wait for the
user's response). In order to do that, we need the main thread's
Windows message loop to never be blocked by a WM_COPYDATA agent
request.
So I've split Pageant's previous hidden window into two hidden
windows, each with a subset of the original roles, and created in
different threads so that they get independent message loops. The one
in the main thread receives messages relating to Pageant's system tray
icon; the one in the subthread has the identity known to (old) Pageant
clients, and receives WM_COPYDATA messages only. Each WM_COPYDATA is
handled by passing the request back to the main thread via an event
object integrated into the Pageant main loop, and then waiting for a
second event object that the main thread will signal when the answer
comes back, and not returning from the WndProc handler until the
response arrives.
Hence, if an agent request received via WM_COPYDATA requires GUI
activity, then the main thread's GUI message loop will be able to do
that in parallel with all Pageant's other activity, including other
GUI activity (like the key list dialog box) and including responding
to other requests via named pipe.
I can't stop WM_COPYDATA requests from blocking _each other_, but this
allows them not to block anything else. And named-pipe requests block
nothing at all, so as clients switch over to the new IPC, even that
blockage will become less and less common.
Those chomp operations in wincons.c and uxcons.c looked ugly, and I'm
not totally convinced they couldn't underrun the buffer by 1 byte in
weird circumstances. strbuf_chomp is neater.
UBsan pointed out another memcpy from NULL (again with length 0) in
the prompts_t system. When I looked at it, I realised that firstly
prompt_ensure_result_size was an early not-so-good implementation of
sgrowarray_nm that would benefit from being replaced with a call to
the real one, and secondly, the whole system for storing prompt
results should really have been replaced with strbufs with the no-move
option, because that's doing all the same jobs better.
So, now each prompt_t holds a strbuf in place of its previous manually
managed string. prompt_ensure_result_size is gone (the console
prompt-reading functions use strbuf_append, and everything else just
adds to the strbuf in the usual marshal.c way). New functions exist to
retrieve a prompt_t's result, either by reference or copied.
These are better than my previous approach of just assigning to
sb->len, because firstly they check by assertion that the new length
is within range, and secondly they preserve the invariant that the
byte stored in the buffer just after the length runs out is \0.
Switched to using the new functions everywhere a grep could turn up
opportunities.
In setting up the ECC tests for cmdgen, I noticed that OpenSSH and
PuTTYgen disagree on the bit length to put in a key fingerprint for an
ed25519 key: we think 255, they think 256.
On reflection, I think 255 is more accurate, which is why I bodged
get_fp() in the test suite to ignore that difference when checking our
key fingerprint against OpenSSH's. But having done that, it now seems
silly that if you unnecessarily specify a bit count at ed25519
generation time, cmdgen will insist that it be 256!
255 is now permitted everywhere an ed25519 bit count is input. 256 is
also still allowed for backwards compatibility but 255 is preferred by
the error message if you give any other value.
Now they have names that are more consistent (no more userkey_this but
that_userkey); a bit shorter; and, most importantly, all the current
functions end in _f to indicate that they deal with keys stored in
disk files. I'm about to add a second set of entry points that deal
with keys via the more general BinarySource / BinarySink interface,
which will sit alongside these with a different suffix.
As in the previous commit, this means that agent_query() is now able
to operate in an asynchronous mode, so that if Pageant takes time to
answer a request, the GUI of the PuTTY instance making the request
won't be blocked.
Also as in the previous commit, we still fall back to the WM_COPYDATA
protocol if the new named pipe protocol isn't available.
Now that Pageant runs a named-pipe server as well as a WM_COPYDATA
server, we prefer the former (if available) for agent forwarding, for
the same reasons as on Unix: it lets us establish a simple raw-data
streaming connection instead of agentf.c's complicated message
boundary detection and buffer management, and if agent connections
ever become stateful, this technique will cope.
On Windows, another advantage of this change is that forwarded agent
requests can now be asynchronous: if the agent takes time to respond
to a request for any reason, then the rest of PuTTY's GUI and SSH
connection are not blocked, and you can carry on working while the
agent is thinking about the request.
(I didn't list that as a benefit of doing the same thing for Unix in
commit ae1148267, because on Unix, agent_query() could _already_ run
asynchronously. It's only on Windows that that's new.)
This reuses all the named-pipe IPC code I set up for connection
sharing a few years ago, to set up a named pipe with a predictable
name and speak the stream-oriented SSH agent protocol over it.
In this commit, we just set up the server, and there's no code that
speaks the client end of the new IPC yet. But my plan is that clients
should switch over to using this interface if possible, because it's
generally better: it doesn't have to be handled synchronously in the
middle of a GUI event loop (either in Pageant itself _or_ in its
client), and it's a better fit to the connection-oriented nature of
forwarded agent connections (so if any features ever appear in the
agent protocol that require state within a connection, we'll now be
able to support them).
This contains most of the guts of the previously monolithic function
new_named_pipe_client(), but it directly returns the HANDLE to the
opened pipe, or a string error message on failure.
new_named_pipe_client() is now a thin veneer on top of that, which
returns a Socket * by wrapping up the HANDLE into a HandleSocket or
the error message into an ErrorSocket as appropriate.
So it's now possible to connect to a named pipe, using all our usual
infrastructure (including in particular the ownership check of the
server, to defend against spoofing attacks), without having to have a
Socket-capable event loop in progress.
Windows Plink and PSFTP had very similar implementations, and now they
share one that lives in a new file winselcli.c. I've similarly moved
GUI PuTTY's implementation out of window.c into winselgui.c, where
other GUI programs wanting to do networking will be able to access
that too.
In the spirit of centralisation, I've also taken the opportunity to
make both functions use the reasonably complete winsock_error_string()
rather than (for some historical reason) each inlining a minimal
version that reports most errors as 'unknown'.
Historically, because of the way Windows Pageant's IPC works, PuTTY's
agent forwarding has always been message-oriented. The channel
implementation in agentf.c deals with receiving a data stream from the
remote agent client and breaking it up into messages, and then it
passes each message individually to agent_query().
On Unix, this is more work than is really needed, and I've always
meant to get round to doing the more obvious thing: making an agent
forwarding channel into simply a stream-oriented proxy, passing raw
data back and forth between the SSH channel and the local AF_UNIX
socket without having to know or care about the message boundaries in
the stream.
The portfwdmgr_connect_socket() facility introduced by the previous
commit is the missing piece of infrastructure to make that possible.
Now, the agent client module provides an API that includes a callback
you can pass to portfwdmgr_connect_socket() to open a streamed agent
connection, and the agent forwarding setup function tries to use that
where possible, only falling back to the message-based agentf.c system
if it can't be done. On Windows, the new piece of agent-client API
returns failure, so we still fall back to agentf.c there.
There are two benefits to doing it this way. One is that it's just
simpler and more robust: if PuTTY isn't trying to parse the agent
connection, then it has less work to do and fewer places to introduce
bugs. The other is that it's futureproof against changes in the agent
protocol: if any kind of extension is ever introduced that requires
keeping state within a single agent connection, or that changes the
protocol itself so that agentf's message-boundary detection stops
working, then this forwarding system will still work.
When I'm declaring a local instance of some context structure type to
pass to a function which will pass it in turn to a callback, I've
tended to use a declaration of the form
struct context actx, *ctx = &actx;
so that the outermost caller can initialise the context, and/or read
out fields of it afterwards, by the same syntax 'ctx->foo' that the
callback function will be using. So you get visual consistency between
the two functions that share this context.
It only just occurred to me that there's a much neater way to declare
a context struct of this kind, which still makes 'ctx' behave like a
pointer in the owning function, and doesn't need all that weird
verbiage or a spare variable name:
struct context ctx[1];
That's much nicer! I've switched to doing that in all existing cases I
could find, and also in a couple of cases where I hadn't previously
bothered to do the previous more cumbersome idiom.
The do_select function is called with a boolean parameter indicating
whether we're supposed to start or stop paying attention to network
activity on a given socket. So if we freeze and unfreeze the socket in
mid-session because of backlog, we'll call do_select(s, false) to
freeze it, and do_select(s, true) to unfreeze it.
But the implementation of do_select in the Windows SFTP code predated
the rigorous handling of socket backlogs, so it assumed that
do_select(s, true) would only be called at initialisation time, i.e.
only once, and therefore that it was safe to use that flag as a cue to
set up the Windows event object to associate with socket activity.
Hence, every time the socket was frozen and unfrozen, we would create
a new netevent at unfreeze time, leaking the old one.
I think perhaps part of the reason why that was hard to figure out was
that the boolean parameter was called 'startup' rather than 'enable'.
To make it less confusing the next time I read this code, I've also
renamed it, and while I was at it, adjusted another related comment.
This commit switches as many ssh_hash_free / ssh_hash_new pairs as
possible to reuse the previous hash object via ssh_hash_reset. Also a
few other cleanups: use the wrapper function hash_simple() where
possible, and I've also introduced ssh_hash_digest_nondestructive()
and switched to that where possible as well.
Up until now, it's been a variadic _function_, whose argument list
consists of 'const char *' ASCIZ strings to concatenate, terminated by
one containing a null pointer. Now, that function is dupcat_fn(), and
it's wrapped by a C99 variadic _macro_ called dupcat(), which
automatically suffixes the null-pointer terminating argument.
This has three benefits. Firstly, it's just less effort at every call
site. Secondly, it protects against the risk of accidentally leaving
off the NULL, causing arbitrary words of stack memory to be
dereferenced as char pointers. And thirdly, it protects against the
more subtle risk of writing a bare 'NULL' as the terminating argument,
instead of casting it explicitly to a pointer. That last one is
necessary because C permits the macro NULL to expand to an integer
constant such as 0, so NULL by itself may not have pointer type, and
worse, it may not be marshalled in a variadic argument list in the
same way as a pointer. (For example, on a 64-bit machine it might only
occupy 32 bits. And yet, on another 64-bit platform, it might work
just fine, so that you don't notice the mistake!)
I was inspired to do this by happening to notice one of those bare
NULL terminators, and thinking I'd better check if there were any
more. Turned out there were quite a few. Now there are none.
Thanks to Patrick Stekovic for pointing out that, unlike sensible IP
stacks, Windows requires a non-default socket option to prevent a
second application from binding to a port you were already listening
on, causing some of your incoming connections to be diverted.
This replaces the previous setsockopt that enabled SO_REUSEADDR, which
I put there a long time ago in order to fix an annoying behaviour if
you used the same listening socket twice in rapid succession (e.g. for
successive PuTTYs forwarding the same port) and the second one failed
to bind the listening port because a left-over connection from the
first one was still in TIME_WAIT and causing the port number to be
marked as used.
As far as I can see, SO_EXCLUSIVEADDRUSE and SO_REUSEADDR are mutually
exclusive - if I try to set both, either way round, then setsockopt
returns failure on the second one - so if I have to set the former
then I _can't_ set the latter. And fortunately, re-testing on Windows
10, the TIME_WAIT problem that SO_REUSEADDR was supposed to solve
doesn't seem to exist any more: I deliberately tried listening on a
port that had a TIME_WAIT connection sitting on it, and it worked for
me even without SO_REUSEADDR.
(I can't remember now whether I definitely confirmed the TIME_WAIT
problem on a previous version of Windows, or whether I just assumed it
would happen on Windows in the same way as Linux, where I definitely
do remember observing it.)
While I'm changing that setsockopt call, I've also fixed its 'on'
parameter so that it's a BOOL rather than an int, in accordance with
the docs for WinSock setsockopt.
The message "Reusing a shared connection to this server" is sent to
the seat's output method during the call to ssh_init. In Windows
Plink, that output method wants to talk to the BinarySink stderr_bs
(or stdout_bs, but for this particular message, stderr). So we have to
have already set up stderr_bs by the time the backend init function is
called.
When I introduced the unreachable() macro in commit 0112936ef, I
searched the source code for assert(0) and assert(false), together
with their variant form assert(0 && "explanatory text"). But I didn't
search for assert(!"explanatory text"), which is the form I used to
use before finding that assert(0 && "text") seemed to be preferred in
other code bases.
So, here's a belated replacement of all the assert(!"stuff") macros
with further instances of unreachable().
The number of people has been steadily increasing who read our source
code with an editor that thinks tab stops are 4 spaces apart, as
opposed to the traditional tty-derived 8 that the PuTTY code expects.
So I've been wondering for ages about just fixing it, and switching to
a spaces-only policy throughout the code. And I recently found out
about 'git blame -w', which should make this change not too disruptive
for the purposes of source-control archaeology; so perhaps now is the
time.
While I'm at it, I've also taken the opportunity to remove all the
trailing spaces from source lines (on the basis that git dislikes
them, and is the only thing that seems to have a strong opinion one
way or the other).
Apologies to anyone downstream of this code who has complicated patch
sets to rebase past this change. I don't intend it to be needed again.
The RESIZE_EITHER resizing mode responds to a window resize by
changing the logical terminal size if the window is shown normally, or
by changing the font size to keep the terminal size the same if the
resize is a transition between normal and maximised state.
But a user pointed out that it's also possible for a window to receive
a WM_SIZE message while _remaining_ in maximised state, and that
PuTTY's resize logic didn't allow for that possibility. It occurs when
there's a change in the amount of available screen space for the
window to be maximised _in_: e.g. when the video resolution is
reconfigured, or when you reconnect to a Remote Desktop session using
a client window of a different size, or even when you toggle the
'Automatically hide the taskbar' option in the Windows taskbar settings.
In that situation, the right thing seems to be for PuTTY to continue
to go with the policy of changing the font size rather than the
logical terminal size. In other words, we prefer to change the font
size when the resize is _from_ maximised state, _to_ maximised state,
_or both_.
That's easily implemented by removing the check of the 'was_zoomed'
flag, in the case where we've received a WM_SIZE message with the
state SIZE_MAXIMIZED: once we know the transition is _to_ maximised
state, it doesn't matter whether or not it was also _from_ it. (But we
still set the was_zoomed flag to the most recent maximised status, so
that we can recognise transitions _out_ of maximised mode.)
Commit f2e61275f converted the integer casts in cmpforsearch to
uintptr_t from unsigned long. But it left the companion function
cmpfortree alone, presumably on the grounds that the compiler didn't
report a warning for that one.
But those two functions (cmpfortree and cmpforsearch) are used with
the same tree234, so they're supposed to implement the same sorting
criterion. And the thing they're actually comparing, namely the
Windows API typedef SOCKET, is a pointer-sized integer. So there was a
latent bug here in which cmpforsearch was comparing all 64 bits of the
pointer, while cmpfortree was only comparing the low-order 32.
Set the initial buffer size to MAX_PATH + 1 (261). Increment e->i before
the function returns instead of incrementing it in the call to
RegEnumKey.
The initial buffer size was too small to fit a subkey with a 256
characters long name plus '\0', the first call to RegEnumKey would fail
with ERROR_MORE_DATA, sgrowarray would grow the buffer, and RegEnumKey
would be called again.
However, because e->i was incremented in the first RegEnumKey call, the
second call would get the next subkey and the subkey with the long name
would be skipped.
Saving a session with a 256 characters long name would trigger this
problem. The session would be saved in the registry, but Putty would not
be able to display it in the saved sessions list.
Pageant didn't have this problem since it uses a different function to get
the saved sessions and the size of the buffer used is MAX_PATH + 1. Pageant
and Putty would display slightly different lists of saved sessions.
I've only just noticed that it doesn't do anything at all!
Almost every implementation of the Socket vtable provides a flush()
method which does nothing, optionally with a comment explaining why
it's OK to do nothing. The sole exception is the wrapper Proxy_Socket,
which implements the method during its setup phase by setting a
pending_flush flag, so that when its sub-socket is later created, it
can call sk_flush on that. But since the sub-socket's sk_flush will do
nothing, even that is completely pointless!
Source control history says that sk_flush was introduced by Dave
Hinton in 2001 (commit 7b0e08270), who was going to use it for some
purpose involving the SSL Telnet support he was working on at the
time. That SSL support was never finished, and its vestigial
declarations in network.h were removed in 2015 (commit 42334b65b). So
sk_flush is just another vestige of that abandoned work, which I
should have removed in the latter commit but overlooked.
If the agent sent a response whose length field describes an interval
of memory larger than the file-mapping object the message is supposed
to be stored in, we shouldn't return that message to the client as if
nothing is wrong. Treat that the same as a failure to receive any
response at all.
Looking over this function today, I spotted several questionable uses
of strcat to concatenate "\PUTTY.RND" to the end of a pathname,
without having checked whether the pathname had filled up the static
fixed-size buffer already.
I don't think this is exploitable (because you'd have to be in control
of the local account already to control any of the data sources used
to fill those buffers). But it's horrible anyway, of course. Now all
of those are replaced with sensible dupcats.
(This patch re-indents a lot of the function, to give variables
tighter scopes. So the diff is best viewed with whitespace ignored.)
This assertion was supposed to be checking for the buffer overrun
fixed by the previous commit, but because it checks the buffer index
just _after_ writing into the buffer, it would have permitted a
one-byte overrun before failing the assertion.
If you select an entry in the saved sessions list box, but without
double-clicking to actually load it, and then you hit OK, the config-
box code will automatically load it. That just saves one click in a
common situation.
But in order to load that session, the config-box system first has to
ask the list-box control _which_ session is selected. On Windows, this
causes an assertion failure if the user has switched away from the
Session panel to some other panel of the config box, because when the
list box isn't on screen, its Windows control object is actually
destroyed.
I think a sensible answer is that we shouldn't be doing that implicit
load behaviour in any case if the list box isn't _visible_: silently
loading and launching a session someone selected a lot of UI actions
ago wasn't really the point. So now I make that behaviour only happen
when the list box (i.e. the Session panel) _is_ visible. That should
prevent the assertion failure on Windows, but the UI effect is cross-
platform, applying even on GTK where the control objects for invisible
panels persist and so the assertion failure didn't happen. I think
it's a reasonable UI change to make globally.
In order to implement it, I've had to invent a new query function so
that config.c can tell whether a given control is visible. In order to
do that on GTK, I had to give each control a pointer to the 'selparam'
structure describing its config-box pane, so that query function could
check it against the current one - and in order to do _that_, I had to
first arrange that those 'selparam' structures have stable addresses
from the moment they're first created, which meant adding a layer of
indirection so that the array of them in the top-level dlgparam
structure is now an array of _pointers_ rather than of actual structs.
(That way, realloc half way through config box creation can't
invalidate the important pointer values.)
Normally I never notice warnings in this build, because it runs inside
bob and dumps all the warnings in a part of the build log I never look
at. But I've had these fixes lying around for a while and should
commit them.
They're benign: all we need is an explicit declaration of strtoumax to
replace the one that stdlib.h doesn't provide, and a couple more of
those annoying NO_TYPECHECK modifiers on GET_WINDOWS_FUNCTION calls.
Having decided that the terminal's local echo setting shouldn't be
allowed to propagate through to termios, I think the local edit
setting shouldn't either. Also, no other terminal emulator I know
seems to implement this sequence, and if you enable it, things get
very confused in general. I think it's generally better off absent; if
somebody turns out to have been using it, then we'll at least be able
to find out what it's good for.
The functions that previously lived in it now live in terminal.c
itself; they've been renamed term_keyinput and term_keyinputw, and
their function is to add data to the terminal's user input buffer from
a char or wchar_t string respectively.
They sit more comfortably in terminal.c anyway, because their whole
point is to translate into the character encoding that the terminal is
currently configured to use. Also, making them part of the terminal
code means they can also take care of calling term_seen_key_event(),
which simplifies most of the call sites in the GTK and Windows front
ends.
Generation of text _inside_ terminal.c, from responses to query escape
sequences, is therefore not done by calling those external entry
points: we send those responses directly to the ldisc, so that they
don't count as keypresses for all the user-facing purposes like bell
overload handling and scrollback reset. To make _that_ convenient,
I've arranged that most of the code that previously lived in
lpage_send and luni_send is now in separate translation functions, so
those can still be called from situations where you're not going to do
the default thing with the translated data.
(However, pasted data _does_ still count as close enough to a keypress
to call term_seen_key_event - but it clears the 'interactive' flag
when the data is passed on to the line discipline, which tweaks a
minor detail of control-char handling in line ending mode but mostly
just means pastes aren't interrupted.)
It's identical in uxnoise and winnoise, being written entirely in
terms of existing cross-platform functions. Might as well centralise
it into sshrand.c.
Having explicitly _stated_ in commit 4dcc0fddf the principle that if
you ever queue a toplevel callback on a freeable object then you
should also call delete_callbacks_for_context on that object before
freeing it, I realised I'd never actually gone through and checked
methodically at every call site of queue_toplevel_callback. So I did,
and naturally, I found several missing ones.
The recent rewriting in both the GTK and Windows keyboard handlers
left the keypad 'Enter' key in a bad state, when no override is
enabled that causes it to generate an escape sequence.
On Windows, a series of fallbacks was causing it to generate \r
regardless of configuration, whereas in Telnet mode it should default
to generating the special Telnet new-line sequence, and in response to
ESC[20h (enabling term->cr_lf_return) it should generate \r\n.
On GTK, it wasn't generating anything _at all_, and also, I can't see
any evidence that the GTK keyboard handler had ever remembered to
implement the cr_lf_return mode.
Now Keypad Enter in non-escape-sequence mode should behave just like
Return, on both platforms.
TranslateKey() on Windows passed all numeric-keypad key events to this
function in terminal.c, and accepted whatever it gave back. That
included the handling for the trivial case of the numeric keypad, when
Num Lock is on and application keypad mode hasn't overridden it, so
that the keypad should be returning actual digits. In that case,
format_numeric_keypad_key() itself was returning the same ASCII
character I had passed in to it as a keypad identifier, and
TranslateKey was returning that in turn as the final translation.
Unfortunately, that means that with Num Lock on, the numeric keypad
translates into what _I_ used as the logical keypad codes inside the
source code, not what the local keyboard layout thinks are the right
codes. In particular, the key I identified as keypad '.' would render
as '.' even on a German keyboard where it ought to produce ','.
Fixed by removing the fallback case in format_numeric_keypad_key()
itself, so now it returns the empty string if it didn't produce an
escape sequence as its translation. Instead, the special case is in
window.c, which checks for a zero-length output string and handles it
by falling through to the keyboard-layout specific ToUnicode code
further down TranslateKey().
On the GTK side, no change is needed here: the GTK keyboard handler
does things in the opposite order, by trying the local input method
_first_ (unless it can see a reason up front to override it), and only
calling format_numeric_keypad_key() if that didn't provide a
translation. So the fallback ASCII translation in the latter was
already not used.
Re-consider the icon in light of the font size, so that we pick the icon
whose size mostly closely matches the terminal font, rather than always
scaling the default icon.
Remove the 'winhelp-topic' IDs from the Halibut source, and from the
code. Now we have one fewer name to think of every time we add a
setting.
I've left the HELPCTX system in place, with the vague notion that it
might be a useful layer of indirection for some future help system on a
platform like Mac OS X.
(I've left the putty.hlp target in doc/Makefile, if nothing else because
this is a convenient test case for Halibut's WinHelp support. But the
resulting help file will no longer support context help.)
GetDlgItemText_alloc is often used to get passwords from text fields,
so the memory should be freed and erased properly. Otherwise parts
of passwords might leak in memory.
Signed-off-by: Sven Strickroth <email@cs-ware.de>
In my eagerness to make sure we didn't _accidentally_ change the
seat's trust status back to trusted at any point, I forgot to do it on
purpose if a second SSH login phase is legitimately run in the same
terminal after the first session has ended.
Fill in all the function pointers for 3rd party Windows GSS DLLs, not
just some of them. These were missed out when GSS key exchange was added
in d515e4f1a3.
At the point when we change over the seat's trust status to untrusted
for the last time, to finish authentication, Plink will now present a
final interactive prompt saying 'Press Return to begin session'. This
is a hint that anything after that that resembles an auth prompt
should be treated with suspicion, because _PuTTY_ thinks it's finished
authenticating.
This is of course an annoying inconvenience for interactive users, so
I've tried to reduce its impact as much as I can. It doesn't happen in
GUI PuTTY at all (because the trust sigil system is used instead); it
doesn't happen if you use plink -batch (because then the user already
knows that they _never_ expect an interactive prompt); and it doesn't
happen if Plink's standard input is being redirected from anywhere
other than the terminal / console (because then it would be pointless
for the server to try to scam passphrases out of the user anyway,
since the user isn't in a position to enter one in response to a spoof
prompt). So it should only happen to people who are using Plink in a
terminal for interactive login purposes, and that's not _really_ what
I ever intended Plink to be used for (which is why it's never had any
out-of-band control UI like OpenSSH's ~ system).
If anyone _still_ doesn't like this new prompt, it can also be turned
off using the new -no-antispoof flag, if the user is willing to
knowingly assume the risk.
In terminal-based GUI applications, this is passed through to
term_set_trust_status, to toggle whether lines are prefixed with the
new trust sigil. In console applications, the function returns false,
indicating to the backend that it should employ some other technique
for spoofing protection.
This is not yet used by anything, but the idea is that it'll be a
graphic in the terminal window that can't be replicated by a server
sending escape sequences, and hence can be used as a reliable
indication that the text on a particular terminal line is generated by
PuTTY itself and not passed through from the server. This will make it
possible to detect a malicious server trying to mimic local prompts to
trick you out of information that shouldn't be sent over the wire
(such as private-key passphrases).
The trust sigil I've picked is a small copy of the PuTTY icon, which
is thematically nice (it can be read as if the PuTTY icon is the name
of the speaker in a dialogue) and also convenient because we had that
graphic available already on all platforms. (Though the contortions I
had to go through to make the GTK 1 code draw it were quite annoying.)
The trust sigil has the same dimensions as a CJK double-width
character, i.e. it's 2 character cells wide by 1 high.
Now, instead of each seat's prompt-handling function doing the
control-char sanitisation of prompt text, the SSH code does it. This
means we can do it differently depending on the prompt.
In particular, prompts _we_ generate (e.g. a genuine request for your
private key's passphrase) are not sanitised; but prompts coming from
the server (in keyboard-interactive mode, or its more restricted SSH-1
analogues, TIS and CryptoCard) are not only sanitised but also
line-length limited and surrounded by uncounterfeitable headers, like
I've just done to the authentication banners.
This should mean that if a malicious server tries to fake the local
passphrase prompt (perhaps because it's somehow already got a copy of
your _encrypted_ private key), you can tell the difference.
The executables were already ignoring it.
This is a minimal change; PUTTY.HLP can still be built, and there's
still all the context IDs lying around.
Buildscr changes are untested.
With this change, we stop expecting to find putty.chm alongside the
executable file. That was a security hazard comparable to DLL
hijacking, because of the risk that a malicious CHM file could be
dropped into the same directory as putty.exe (e.g. if someone ran
PuTTY from their browser's download dir)..
Instead, the standalone putty.exe (and other binaries needing help)
embed the proper CHM file within themselves, as a Windows resource,
and if called on to display the help then they write the file out to a
temporary location. This has the advantage that if you download and
run the standalone putty.exe then you actually _get_ help, which
previously didn't happen!
The versions of the binaries in the installer don't each contain a
copy of the help file; that would be extravagant. Instead, the
installer itself writes a registry entry pointing at the proper help
file, and the executables will look there.
Another effect of this commit is that I've withdrawn support for the
older .HLP format completely. It's now entirely outdated, and
supporting it through this security fix would have been a huge pain.
Now instead of taking raw arguments to configure the output
StripCtrlChars with, it takes an enumerated value giving the context
of what's being sanitised, and allows the seat to decide what the
output parameters for that context should be.
The only context currently used is SIC_BANNER (SSH login banners).
I've also added a not-yet-used one for keyboard-interactive prompts.
Now if a pathname ends with a slash already, we detect that (using the
shiny new ptrlen_endswith), and don't bother putting another one in.
No functional change, but this should improve the occasional error
message, e.g. 'pscp remote:some.filename /' will now say it can't
create /some.filename instead of //some.filename.
Local functions in uxcons.c and wincons.c were calling the old
simplistic sanitise_term_data to print console-based prompts. Now they
use the same new system as everything else.
This removes the last use of the ASCII-centric sanitise_term_data.
If centralised code like the SSH implementation wants to sanitise
escape sequences out of a piece of server-provided text, it will need
to do it by making a locale-based StripCtrlChars if it's running in a
console context, or a Terminal-based one if it's in a GUI terminal-
window application.
All the other changes of behaviour needed between those two contexts
are handled by providing reconfigurable methods in the Seat vtable;
this one is no different. So now there's a new method in the Seat
vtable that will construct a StripCtrlChars appropriate to that kind
of seat. Terminal-window seats (gtkwin.c, window.c) implement it by
calling the new stripctrl_new_term(), and console ones use the locale-
based stripctrl_new().
If a proxy command jabbers on standard error in a way that doesn't
involve any newline characters, we now won't keep buffering data for
ever.
(Not that I've heard of it happening, but I noticed the theoretical
possibility on the way past in a recent cleanup pass.)
The idea of these is that they centralise the common idiom along the
lines of
if (logical_array_len >= physical_array_size) {
physical_array_size = logical_array_len * 5 / 4 + 256;
array = sresize(array, physical_array_size, ElementType);
}
which happens at a zillion call sites throughout this code base, with
different random choices of the geometric factor and additive
constant, sometimes forgetting them completely, and generally doing a
lot of repeated work.
The new macro sgrowarray(array,size,n) has the semantics: here are the
array pointer and its physical size for you to modify, now please
ensure that the nth element exists, so I can write into it. And
sgrowarrayn(array,size,n,m) is the same except that it ensures that
the array has size at least n+m (so sgrowarray is just the special
case where m=1).
Now that this is a single centralised implementation that will be used
everywhere, I've also gone to more effort in the implementation, with
careful overflow checks that would have been painful to put at all the
previous call sites.
This commit also switches over every use of sresize(), apart from a
few where I really didn't think it would gain anything. A consequence
of that is that a lot of array-size variables have to have their types
changed to size_t, because the macros require that (they address-take
the size to pass to the underlying function).
I haven't tried compiling with /DMINEFIELD in a while, and when I just
did, I found that the declarations in winstuff.h weren't actually
being included by memory.c where they're needed.
I've just noticed that the MSDN docs for WinSock gethostname()
guarantee that a size-256 buffer is large enough. That seems a lot
simpler than the previous faff.
I've fixed a handful of these where I found them in passing, but when
I went systematically looking, there were a lot more that I hadn't
found!
A particular highlight of this collection is the code that formats
Windows clipboard data in RTF, which was absolutely crying out for
strbuf_catf, and now it's got it.
Previously, we returned a valid settings_r containing a null HKEY.
That didn't actually cause trouble (I think all the registry API
functions must have spotted the null HKEY and returned a clean error
code instead of crashing), but it means the caller can't tell if the
session really existed or not. Now we return NULL in that situation,
and close_settings_r avoids crashing if we pass the NULL to it later.
My trawl of all the vtable systems in the code spotted a couple of
other function-like macros in passing, which might as well be
rewritten as inline functions too for the same reasons.
If Plink's standard output and/or standard error points at a Windows
console or a Unix tty device, and if Plink was not configured to
request a remote pty (and hence to send a terminal-type string), then
we apply the new control-character stripping facility.
The idea is to be a mild defence against malicious remote processes
sending confusing escape sequences through the standard error channel
when Plink is being used as a transport for something like git: it's
OK to have actual sensible error messages come back from the server,
but when you run a git command, you didn't really intend to give the
remote server the implicit licence to write _all over_ your local
terminal display. At the same time, in that scenario, the standard
_output_ of Plink is left completely alone, on the grounds that git
will be expecting it to be 8-bit clean. (And Plink can tell that
because it's redirected away from the console.)
For interactive login sessions using Plink, this behaviour is
disabled, on the grounds that once you've sent a terminal-type string
it's assumed that you were _expecting_ the server to use it to know
what escape sequences to send to you.
So it should be transparent for all the use cases I've so far thought
of. But in case it's not, there's a family of new command-line options
like -no-sanitise-stdout and -sanitise-stderr that you can use to
forcibly override the autodetection of whether to do it.
This all applies the same way to both Unix and Windows Plink.
Rather like isatty() on Unix, this tells you if a raw Windows HANDLE
points at a console or not. Useful to know if your standard output or
standard error is going to be shown to a user, or redirected to
something that will make automated use of it.