The variable 'strbuf *realhost' was only initialised in the branch of
the ifdefs where IPV6 is enabled, so in a NO_IPV6 build it's used
uninitialised, which in my usual build configuration means a compile
failure.
Similarly to my recent addition of NEON-accelerated AES, these new
implementations drop in alongside the SHA-NI ones, under a different
set of ifdefs. All the details of selection and detection are
essentially the same as they were for the AES code.
This tears out the entire previous random-pool system in sshrand.c. In
its place is a system pretty close to Ferguson and Schneier's
'Fortuna' generator, with the main difference being that I use SHA-256
instead of AES for the generation side of the system (rationale given
in comment).
The PRNG implementation lives in sshprng.c, and defines a self-
contained data type with no state stored outside the object, so you
can instantiate however many of them you like. The old sshrand.c still
exists, but in place of the previous random pool system, it's just
become a client of sshprng.c, whose job is to hold a single global
instance of the PRNG type, and manage its reference count, save file,
noise-collection timers and similar administrative business.
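As a rough sketch of the shape the new object's interface takes (the
names here are illustrative, not the exact ones in sshprng.c):

    #include <stddef.h>

    /* Illustrative sketch of a self-contained PRNG object; the real
     * sshprng.c interface differs in detail. */
    typedef struct prng prng;
    typedef struct ssh_hashalg ssh_hashalg;     /* hash algorithm vtable */

    prng *prng_new(const ssh_hashalg *hashalg);      /* e.g. SHA-256 */
    void prng_seed(prng *pr, const void *data, size_t len);
    void prng_add_noise(prng *pr, unsigned source_id,
                        const void *data, size_t len);
    void prng_read(prng *pr, void *out, size_t len); /* re-seeds afterwards */
    void prng_free(prng *pr);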
Advantages of this change include:
- Fortuna is designed with a more varied threat model in mind than my
old home-grown random pool. For example, after any request for
random numbers, it automatically re-seeds itself, so that if the
state of the PRNG should be leaked, it won't give enough
information to find out what past outputs _were_.
- The PRNG type can be instantiated with any hash function; the
instance used by the main tools is based on SHA-256, an improvement
on the old pool's use of SHA-1.
- The new PRNG only uses the completely standard interface to the
hash function API, instead of having to have privileged access to
the internal SHA-1 block transform function. This will make it
easier to revamp the hash code in general, and also it means that
hardware-accelerated versions of SHA-256 will automatically be used
for the PRNG as well as for everything else.
- The new PRNG can be _tested_! Because it has an actual (if not
quite explicit) specification for exactly what the output numbers
_ought_ to be derived from the hashes of, I can (and have) put
tests in cryptsuite that ensure the output really is being derived
in the way I think it is. The old pool could have been returning
any old nonsense and it would have been very hard to tell for sure.
The upcoming PRNG revamp will want to tell noise sources apart, so
that it can treat them all fairly. So I've added an extra parameter to
noise_ultralight and random_add_noise, which takes values in an
enumeration covering all the vague classes of entropy source I'm
collecting. In this commit, though, it's simply ignored.
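Purely for illustration, the sort of enumeration I mean (these
particular identifiers are made up):

    /* Hypothetical classification of entropy sources; the real names
     * differ, but the shape is the same. */
    typedef enum NoiseSourceId {
        NOISE_SRC_TIME,       /* clock readings sampled at various events */
        NOISE_SRC_NETWORK,    /* network packet timings */
        NOISE_SRC_KEYPRESS,   /* keystroke timings */
        NOISE_SRC_MOUSE,      /* mouse movements */
        NOISE_SRC_MISC,       /* anything that doesn't fit above */
        NOISE_SRC_MAX
    } NoiseSourceId;

    /* The extra parameter threaded through the existing entry points
     * (prototypes approximate): */
    void noise_ultralight(NoiseSourceId id, unsigned long data);
    void random_add_noise(NoiseSourceId id, const void *noise, int length);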
This is in preparation for a PRNG revamp which will want to have a
well defined boundary for any given request-for-randomness, so that it
can destroy the evidence afterwards. So no more looping round calling
random_byte() and then stopping when we feel like it: now you say up
front how many random bytes you want, and call random_read() which
gives you that many in one go.
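A schematic before/after of a typical call site (random_read's exact
prototype assumed here):

    #include <stddef.h>

    void random_read(void *out, size_t size);   /* assumed prototype */

    void make_cookie(unsigned char cookie[16])
    {
        /* Old style: loop over single-byte requests, with no clear
         * start or end to the overall request:
         *     for (int i = 0; i < 16; i++)
         *         cookie[i] = random_byte();
         */

        /* New style: say up front how much you want, in one call. */
        random_read(cookie, 16);
    }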
Most of the call sites that had to be fixed are fairly mechanical, and
quite a few ended up more concise afterwards. A few became more
cumbersome, such as mp_random_bits, in which the new API doesn't let
me load the random bytes directly into the target integer without
triggering undefined behaviour, so instead I have to allocate a
separate temporary buffer.
The _most_ interesting call site was in the PKCS#1 v1.5 padding code
in sshrsa.c (used in SSH-1), in which you need a stream of _nonzero_
random bytes. The previous code just looped on random_byte, retrying
if it got a zero. Now I'm doing a much more interesting thing with an
mpint, essentially scaling a binary fraction repeatedly to extract a
number in the range [0,255) and then adding 1 to it.
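As a much simplified illustration of the idea - not the real
mpint-based code, and glossing over the small bias that the
arbitrary-precision version takes care to avoid - treat a word of
fresh random bits as a binary fraction in [0,1), scale it by 255 to
land in [0,255), and add 1:

    #include <stdint.h>

    /* Toy version: interpret r as the binary fraction r / 2^32, scale
     * into [0,255), and add 1 to get a byte in [1,255]. The real code
     * does the scaling exactly with an mpint, extracting many nonzero
     * bytes from one large random number. */
    static uint8_t nonzero_byte(uint32_t r)
    {
        uint32_t scaled = (uint32_t)(((uint64_t)r * 255) >> 32);  /* [0,255) */
        return (uint8_t)(scaled + 1);                             /* [1,255] */
    }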
Mostly on the Unix side: there are lots of places the Windows code was
collecting noise that the corresponding Unix/GTK code wasn't bothering
to, such as mouse movements, keystrokes and various network events.
Also, both platforms had forgotten to collect noise when reading data
from a pipe to a local proxy process, even though in that
configuration that's morally equivalent to the network packet timings
that we'd normally be collecting from.
Another thing pointed out by ASan: new_unix_listener takes ownership
of the SockAddr you give it, so I shouldn't have been freeing it at
the end of platform_make_x11_server().
All the hash-specific state structures, and the functions that
directly accessed them, are now local to the source files implementing
the hashes themselves. Everywhere we previously used those types or
functions, we're now using the standard ssh_hash or ssh2_mac API.
The 'simple' functions (hmacmd5_simple, SHA_Simple etc) are now a pair
of wrappers in sshauxcrypt.c, each of which takes an algorithm
structure and can do the same conceptual thing regardless of what it
is.
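Conceptually each wrapper now amounts to this, for any hash algorithm
(a sketch using assumed declarations rather than the exact
sshauxcrypt.c prototypes):

    #include <stddef.h>

    /* Assumed stand-ins for the standard ssh_hash vtable interface. */
    typedef struct ssh_hashalg ssh_hashalg;
    typedef struct ssh_hash ssh_hash;
    ssh_hash *ssh_hash_new(const ssh_hashalg *alg);
    void ssh_hash_update(ssh_hash *h, const void *data, size_t len);
    void ssh_hash_final(ssh_hash *h, unsigned char *out); /* finalise + free */

    /* One-shot helper: the same conceptual operation whatever 'alg' is. */
    void hash_simple(const ssh_hashalg *alg,
                     const void *data, size_t len, unsigned char *out)
    {
        ssh_hash *h = ssh_hash_new(alg);
        ssh_hash_update(h, data, len);
        ssh_hash_final(h, out);
    }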
The SSH wire protocol for tty modes corresponding to control
characters (e.g. configuring what Ctrl-Foo you can press to generate
SIGINT or SIGQUIT) specifies (RFC 4254 section 8, under VINTR, saying
'similarly for the other characters') that you should send the value
255 on the wire if you want _no_ character code to map to the action
in question.
But in the <termios.h> API, that's indicated by setting the
appropriate field of 'struct termios' to _POSIX_VDISABLE, which is a
platform-dependent value and varies between (at least) Linux and *BSD.
On the client side, Unix Plink has always known this: when it copies
the local termios settings into a struct ssh_ttymodes to be sent on
the wire, it checks for _POSIX_VDISABLE and replaces it with 255. But
uxpty.c, mapping ssh_ttymodes back to termios for Uppity's pty
sessions, wasn't making the reverse transformation.
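The missing step is conceptually a one-liner (illustrative code, using
the standard termios names):

    #include <termios.h>
    #include <unistd.h>

    /* The SSH wire value 255 means 'no character performs this function';
     * termios spells that as _POSIX_VDISABLE, whose numeric value is
     * platform-dependent, so it has to be translated rather than copied. */
    static cc_t ssh_mode_char_to_termios(unsigned wire_value)
    {
        return (wire_value == 255) ? _POSIX_VDISABLE : (cc_t)wire_value;
    }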
The refactored sshaes.c gives me a convenient slot to drop in a second
hardware-accelerated AES implementation, similar to the existing one
but using Arm NEON intrinsics in place of the x86 AES-NI ones.
This needed a minor structural change, because Arm systems are often
heterogeneous, containing more than one type of CPU which won't
necessarily all support the same set of architecture features. So you
can't test at run time for the presence of AES acceleration by
querying the CPU you're running on - even if you found a way to do it,
the answer wouldn't be reliable once the OS started migrating your
process between CPUs. Instead, you have to ask the OS itself, because
only that knows about _all_ the CPUs on the system. So that means the
aes_hw_available() mechanism has to extend a tentacle into each
platform subdirectory.
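For example, on Linux/AArch64 the OS-level query can be as simple as
reading the hardware capability bits out of the auxiliary vector (a
sketch of just that one platform's tentacle; others need their own
mechanisms):

    #include <stdbool.h>

    #if defined __linux__ && defined __aarch64__
    #include <sys/auxv.h>
    #include <asm/hwcap.h>

    /* Ask the kernel, which knows about every CPU in the system, rather
     * than probing whichever CPU we happen to be scheduled on just now. */
    bool aes_hw_available(void)
    {
        return (getauxval(AT_HWCAP) & HWCAP_AES) != 0;
    }
    #endif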
The trickiest part was the nest of ifdefs that tries to detect whether
the compiler can support the necessary parts. I had successful
test-compiles on several compilers, and was able to run the code
directly on an AArch64 tablet (so I know it passes cryptsuite), but
it's likely that at least some Arm platforms won't be able to build it
because of some path through the ifdefs that I haven't been able to
test yet.
When the user clicks 'yes' to a 'weak crypto primitive' warning, and
another such warning is pending next in line, we were failing an
assertion when ssh2transport called register_dialog() for the second
warning box, because the result callback in gtkdlg.c had not called
unregister_dialog() for the previous one yet. Now that's done before
rather than after delivering the result to the dialog's client.
This is the commit that f3295e0fb _should_ have been. Yesterday I just
added some typedefs so that I didn't have to wear out my fingers
typing 'struct' in new code, but what I ought to have done is to move
all the typedefs into defs.h with the rest, and then go through
cleaning up the legacy 'struct's all through the existing code.
But I was mostly trying to concentrate on getting the test suite
finished, so I just did the minimum. Now it's time to come back and do
it better.
Taking a leaf out of the LLVM code base: this macro still includes an
assert(false) so that the message will show up in a typical build, but
it follows it up with a call to a function explicitly marked as no-
return.
So this ought to do a better job of convincing compilers that once a
code path hits this function it _really doesn't_ have to still faff
about with making up a bogus return value or filling in a variable
that 'might be used uninitialised' in the following code that won't be
reached anyway.
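The macro's shape is roughly this (a sketch; the real message handling
and function name differ):

    #include <assert.h>
    #include <stdbool.h>

    #if defined __GNUC__ || defined __clang__
    #define NORETURN __attribute__((noreturn))
    #else
    #define NORETURN
    #endif

    /* Assumed helper: report the message and terminate; never returns. */
    NORETURN void fatal_unreachable(const char *msg);

    /* The assert() makes the message visible in a debug build; the call
     * to a noreturn function convinces the compiler that control really
     * cannot continue past this point. */
    #define unreachable(msg) (assert(false && msg), fatal_unreachable(msg))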
I've gone through the existing code looking for the assert(false) /
assert(0) idiom and replaced all the ones I found with the new macro,
which also meant I could remove a few pointless return statements and
variable initialisations that I'd already had to put in to placate
compiler front ends.
The old 'Bignum' data type is gone completely, and so is sshbn.c. In
its place is a new thing called 'mp_int', handled by an entirely new
library module mpint.c, with API differences both large and small.
The main aim of this change is that the new library should be free of
timing- and cache-related side channels. I've written the code so that
it _should_ - assuming I haven't made any mistakes - do all of its
work without either control flow or memory addressing depending on the
data words of the input numbers. (Though, being an _arbitrary_
precision library, it does have to at least depend on the sizes of the
numbers - but there's a 'formal' size that can vary separately from
the actual magnitude of the represented integer, so if you want to
keep it secret that your number is actually small, it should work fine
to have a very long mp_int and just happen to store 23 in it.) So I've
done all my conditionalisation by means of computing both answers and
doing bit-masking to swap the right one into place, and all loops over
the words of an mp_int go up to the formal size rather than the actual
size.
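The basic trick, shown here at the level of a single word (mpint
applies the same thing across whole arrays of words):

    #include <stdint.h>

    typedef uint32_t BignumInt;   /* stand-in for the library's word type */

    /* Constant-time select: returns a if choose_a is 1 and b if it is 0,
     * with no data-dependent branch - compute both answers, then mask. */
    static BignumInt ct_select(BignumInt a, BignumInt b, unsigned choose_a)
    {
        BignumInt mask = (BignumInt)0 - (BignumInt)(choose_a & 1);
        return (a & mask) | (b & ~mask);
    }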
I haven't actually tested the constant-time property in any rigorous
way yet (I'm still considering the best way to do it). But this code
is surely at the very least a big improvement on the old version, even
if I later find a few more things to fix.
I've also completely rewritten the low-level elliptic curve arithmetic
from sshecc.c; the new ecc.c is closer to being an adjunct of mpint.c
than it is to the SSH end of the code. The new elliptic curve code
keeps all coordinates in Montgomery-multiplication transformed form to
speed up all the multiplications mod the same prime, and only converts
them back when you ask for the affine coordinates. Also, I adopted
extended coordinates for the Edwards curve implementation.
sshecc.c has also had a near-total rewrite in the course of switching
it over to the new system. While I was there, I've separated ECDSA and
EdDSA more completely - they now have separate vtables, instead of a
single vtable in which nearly every function had a big if statement in
it - and also made the externally exposed types for an ECDSA key and
an ECDH context different.
A minor new feature: since the new arithmetic code includes a modular
square root function, we can now support the compressed point
representation for the NIST curves. We seem to have been getting along
fine without that so far, but it seemed a shame not to put it in,
since it was suddenly easy.
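As a toy demonstration of the square-root step in the easy case of
p == 3 (mod 4), which the NIST primes in question satisfy (the real
code works on mp_ints, and the curve code then just picks whichever of
y and p-y has the low bit matching the compressed point's sign byte):

    #include <stdint.h>

    /* Modular exponentiation, for toy-sized moduli (p < 2^32) so the
     * intermediate products fit in 64 bits. */
    static uint64_t modpow(uint64_t base, uint64_t exp, uint64_t p)
    {
        uint64_t result = 1 % p;
        base %= p;
        while (exp) {
            if (exp & 1)
                result = result * base % p;
            base = base * base % p;
            exp >>= 1;
        }
        return result;
    }

    /* For p == 3 (mod 4), a square root of a quadratic residue a mod p
     * is a^((p+1)/4) mod p. */
    static uint64_t modsqrt_3mod4(uint64_t a, uint64_t p)
    {
        return modpow(a, (p + 1) / 4, p);
    }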
In sshrsa.c, one major change is that I've removed the RSA blinding
step in rsa_privkey_op, in which we randomise the ciphertext before
doing the decryption. The purpose of that was to avoid timing leaks
giving away the plaintext - but the new arithmetic code should take
that in its stride in the course of also being careful enough to avoid
leaking the _private key_, which RSA blinding had no way to do
anything about in any case.
Apart from those specific points, most of the rest of the changes are
more or less mechanical, just changing type names and translating code
into the new API.
A user points out that the call to close_directory() in pscp.c's
rsource() function should have been inside rather than outside the if
statement that checks whether the directory handle is NULL. As a
result, any failed attempt to open a directory during a 'pscp -r'
recursive upload leads to a null-pointer dereference.
Moved the close_directory call to where it should be, and also
arranged to print the OS error code if the directory-open fails, by
also changing the API of open_directory to return an error string on
failure.
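Schematically, the fixed shape of the code is this (types, names and
signatures here are illustrative, not the exact pscp.c ones):

    #include <stdio.h>

    /* Illustrative stand-ins; the real pscp.c prototypes differ. */
    typedef struct DirHandle DirHandle;
    DirHandle *open_directory(const char *name, const char **errmsg);
    const char *read_directory_entry(DirHandle *dir);
    void close_directory(DirHandle *dir);

    static void upload_dir(const char *dirname)
    {
        const char *errmsg;
        DirHandle *dir = open_directory(dirname, &errmsg);
        if (dir == NULL) {
            fprintf(stderr, "%s: unable to open directory: %s\n",
                    dirname, errmsg);
            return;                  /* no handle, so nothing to close */
        }
        const char *name;
        while ((name = read_directory_entry(dir)) != NULL) {
            /* ... recurse into each entry ... */
        }
        close_directory(dir);        /* only reached if the open succeeded */
    }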
In the previous commit I happened to notice a %.150s in a ppl_logevent
call, which was probably an important safety precaution a couple of
decades ago when that format string was being used for an sprintf into
a fixed-size buffer, but now it's just pointless cruft.
This commit removes all printf string formatting directives with a
compile-time fixed size, with the one exception of a %.3s used to cut
out a 3-letter month name in scpserver.c. In cases where the format
string in question was already going to an arbitrary-length function
like dupprintf or ppl_logevent, that's all I've done; in cases where
there was still a fixed-size buffer, I've replaced it with a dynamic
buffer and dupprintf.
In the past, I've had a lot of macros which you call with double
parentheses, along the lines of debug(("format string", params)), so
that the inner parens protect the commas and permit the macro to treat
the whole printf-style argument list as one macro argument.
That's all very well, but it's a bit inconvenient (it doesn't leave
you any way to implement such a macro by prepending another argument
to the list), and now this code base's rules allow C99isms, I can
switch all those macros to using a single pair of parens, using the
C99 ability to say '...' in the parameter list of the #define and get
at the corresponding suffix of the arguments as __VA_ARGS__.
So I'm doing it. I've made the following printf-style macros variadic:
bpp_logevent, ppl_logevent, ppl_printf and debug.
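For example, the before and after shapes of one of these macros
(simplified, and the underlying output function's name is
illustrative):

    /* Old style: double parentheses, so that the whole printf-style
     * argument list travels through the macro as one argument:
     *
     *     #define debug(params) debug_printf params
     *     debug(("connecting to %s port %d\r\n", host, port));
     */

    /* New style: a C99 variadic macro. Call sites look like an ordinary
     * function call, and the macro is free to prepend arguments. */
    #define debug(...) debug_printf(__VA_ARGS__)
    /*  debug("connecting to %s port %d\r\n", host, port);  */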
While I'm here, I've also fixed up a collection of conditioned-out
calls to debug() in the Windows front end which were clearly expecting
a macro with a different calling syntax, because they had an integer
parameter first. If I ever have a need to condition those back in,
they should actually work now.
A long time ago, in commit 4d77b6567, I moved the generation of the
arrow-key escape sequences into a function format_arrow_key(). Mostly
the reason for that was a special purpose I had in mind at the time
which involved auto-generating the same sequences in response to
things other than a keypress, but I always thought it would be nice to
centralise a lot more of PuTTY's complicated keyboard handling in the
same way - at least the handling of the function keys and their
numerous static and dynamic config options.
In this year's general spirit of tidying up and refactoring, I think
it's finally time. So here I introduce three more centralised
functions for dealing with the numbered function keys, the small
keypad (Ins, Home, PgUp etc) and the numeric keypad. Lots of horrible
and duplicated code from the key handling functions in window.c and
gtkwin.c is now more sensibly centralised: each platform keyboard
handler concerns itself with the local format of a keyboard event and
platform-specific enumeration of key codes, and once it's decided what
the logical key press actually _is_, it hands off to the new functions
in terminal.c to generate the appropriate escape code.
Mostly this is intended to be a refactoring without functional change,
leaving the keyboard handling how it's always been. But in cases where
the Windows and GTK handlers were accidentally inconsistent, I've
fixed the inconsistency rather than carefully keeping both sides how
they were. Known consistency fixes:
- swapping the arrow keys between normal (ESC [ A) and application
(ESC O A) is now done by pressing Ctrl with them, and _not_ by
pressing Shift. That was how it was always supposed to work, and
how it's worked on GTK all along, but on Windows it's been done by
Shift as well since 2010, due to a bug at the call site of
format_arrow_key() introduced when I originally wrote that function.
- in Xterm function key mode plus application keypad mode, the /*-
keys on the numeric keypad now send ESC O {o,j,m} in place of ESC O
{Q,R,S}. That's how the Windows keyboard handler has worked all
along (it was a deliberate behaviour tweak for the Xterm-like
function key mode, because in that mode ESC O {Q,R,S} are generated
by F2-F4). But the GTK keyboard handler omitted that particular
special case and was still sending ESC O {Q,R,S} for those keys in
all application keypad modes.
- also in Xterm function key mode plus app keypad mode, we only
generate the app-keypad escape sequences if Num Lock is on; with
Num Lock off, the numeric keypad becomes arrow keys and
Home/End/etc, just as it would in non-app-keypad mode. Windows has
done this all along, but again, GTK lacked that special case.
This fixes a batch of clang-analyzer warnings of the form 'you
declared / assigned this variable and then never use it'. It doesn't
fix _all_ of them - some are there so that when I add code in the
future _it_ can use the variable without me having to remember to
start setting it - but these are the ones I thought it would make the
code better instead of worse to fix.
uxnet.c's sk_namelookup and the sorting-key construction in
pangofont_enum_fonts() were both using s[n]printf and strncpy into
buffers that had no real need to be fixed-size; format_telnet_command
and the GTK Event Log selection-data builder were doing their own
sresize loops, but now we have strbuf they can just use that and save
redoing the same work.
This is another cleanup I felt a need for while I was doing
boolification. If you define a function or variable in one .c file and
declare it extern in another, then nothing will check you haven't got
the types of the two declarations mismatched - so when you're
_changing_ the type, it's a pain to make sure you've caught all the
copies of it.
It's better to put all those extern declarations in header files, so
that the declaration in the header is also in scope for the
definition. Then the compiler will complain if they don't match, which
is what I want.
sk_startup and sk_nextaddr are entirely internal to winnet.c; nearly
all of import.c and minibidi.c's internal routines should have been
static and weren't; {read,write}_utf8 are internal to charset/utf8.c
(and didn't even need separate declarations at all); do_sftp_cleanup
is internal to psftp.c, and get_listitemheight to gtkdlg.c.
While I was editing those prototypes anyway, I've also added missing
'const' to the 'char *' passphrase parameters in import.c.
For a start, they now have different names on Windows and Unix,
reflecting their different roles: on Windows they apply escaping to
any string that's going to be used as a registry key (be it a session
name, or a host name for host key storage), whereas on Unix they're
for constructing saved-session file names in particular (and also
handle the special case of filling in "Default Settings" for NULL).
Also, they now produce output by writing to a strbuf, which simplifies
a lot of the call sites. In particular, the strbuf output idiom is
passed on to enum_settings_next, which is especially nice because its
only actual caller was doing an ad-hoc realloc loop that I can now get
rid of completely.
Thirdly, on Windows they're centralised into winmisc.c instead of
living in winstore.c, because that way Pageant can use the unescape
function too. (It was spotting the duplication there that made me
think of doing this in the first place, but once I'd started, I had to
keep unravelling the thread...)
I came across this unexplained static variable in my boolification
trawl. It seems clearly unintentional that it has only one instance
instead of one per terminal window - the code in question closely
resembles the Windows front end, and I think this must just be a
variable that I never swept up into 'inst' in the very early days when
I was making gtkwin.c out of a cloned-and-hacked window.c in the first
place.
These days it's even a bug, now that the OS X port actually does run
multiple terminal windows in the same process: if one goes into mouse
reporting mode, I'm pretty sure this would have done confusing things
to the effects of mouse actions in the other.
My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a, b and c were int and TRUE was just
a macro for 1; but now that those variables are bool and the value
being assigned is the 'true' defined by stdbool.h, that idiom provokes
a warning from gcc: 'suggest parentheses around assignment used as
truth value'!
I think this is the full set of things that ought logically to be
boolean.
One annoyance is that quite a few radio-button controls in config.c
address Conf fields that are now bool rather than int, which means
that the shared handler function can't just access them all with
conf_{get,set}_int. Rather than back out the rigorous separation of
int and bool in conf.c itself, I've just added a similar alternative
handler function for the bool-typed ones.
This commit includes <stdbool.h> from defs.h and deletes my
traditional definitions of TRUE and FALSE, but other than that, it's a
100% mechanical search-and-replace transforming all uses of TRUE and
FALSE into the C99-standardised lowercase spellings.
No actual types are changed in this commit; that will come next. This
is just getting the noise out of the way, so that subsequent commits
can have a higher proportion of signal.
The annoying int64.h is completely retired, since C99 guarantees a
64-bit integer type that you can actually treat like an ordinary
integer. Also, I've replaced the local typedefs uint32 and word32
(scattered through different parts of the crypto code) with the
standard uint32_t.
If values are boolean, it's confusing to use & and | in place of &&
and ||. In two of these three cases it was simply a typo and I've used
the other one; in the third, it was a deliberate avoidance of short-
circuit evaluation (and commented as such), but having seen how easy
it is to make the same typo by accident, I've decided it's clearer to
just move the LHS and RHS evaluations outside the expression.
x11_char_struct returns a pointer or NULL, so while returning FALSE
would have ended up doing the right thing after macro expansion and
integer->pointer conversion, it wasn't actually how I _meant_ to spell
a failure return.
In GTK 2, this function was a new and convenient way to override the
order in which the Tab key cycled through the sub-widgets of a
container, replacing the more verbose mechanism in GTK 1 where you had
to provide a custom implementation of the "focus" method in
GtkContainerClass.
GTK 3.24 has now deprecated gtk_container_set_focus_chain(),
apparently on the grounds that that old system is what they think you
_ought_ to be doing. So I've abandoned set_focus_chain completely, and
switched back to doing it by a custom focus method for _all_ versions
of GTK, with the only slight wrinkle being that between GTK 1 and 2
the method in question moved from GtkContainer to GtkWidget (my guess
is so that an individual non-container widget can still have multiple
separately focusable UI components).
After the recent Seat and LogContext revamps, _nearly_ all the
remaining uses of the type 'Frontend' were in terminal.c, which needs
all sorts of interactions with the GUI window the terminal lives in,
from the obvious (actually drawing text on the window, reading and
writing the clipboard) to the obscure (minimising, maximising and
moving the window in response to particular escape sequences).
All of those functions are now provided by an abstraction called
TermWin. The few remaining uses of Frontend after _that_ are internal
to a particular platform directory, so as to spread the implementation
of that particular kind of Frontend between multiple source files; so
I've renamed all of those so that they take a more specifically named
type that refers to the particular implementation rather than the
general abstraction.
So now the name 'Frontend' no longer exists in the code base at all,
and everywhere one used to be used, it's completely clear whether it
was operating in one of Frontend's three abstract roles (and if so,
which), or whether it was specific to a particular implementation.
Another type that's disappeared is 'Context', which used to be a
typedef defined to something different on each platform, describing
whatever short-lived resources were necessary to draw on the terminal
window: the front end would provide a ready-made one when calling
term_paint, and the terminal could request one with get_ctx/free_ctx
if it wanted to do proactive window updates. Now that drawing context
lives inside the TermWin itself, because there was never any need to
have two of those contexts live at the same time.
(Another minor API change is that the window-title functions - both
reading and writing - have had a missing 'const' added to their char *
parameters / return values.)
I don't expect this change to enable any particularly interesting new
functionality (in particular, I have no plans that need more than one
implementation of TermWin in the same application). But it completes
the tidying-up that began with the Seat and LogContext rework.
This adds the server side of the SSH-2 keyboard-interactive protocol,
and the pair of very similar SSH-1 methods AUTH_TIS and AUTH_CCARD
(which basically differ only in message numbers, and each involve a
single challenge from the server and a response from the user).
We now show the --help output if invoked with no arguments, and the
help text also includes a big safety warning in the hope of stopping
anyone from mistaking this for a _secure_ SSH server implementation.
While I'm here, the errors now all use appname[] in place of
constantly repeating the program name. (Not because I anticipate a
change right now, but if nothing else, it makes things easier moving
errors out into shared source files or between applications.)
Naturally, there's one really glaring goof that I only found out about
instants after 'git push'! If Pageant starts a watchdog subprocess
which will wait
until the main process terminates and then clean up the socket, then
it had better not have that subprocess keep the standard I/O handles
open, or else commands like eval $(pageant -X) won't terminate.
I've applied the same fix in the X11 socket creation, though I think
it's less critical there.
Unlike the traditional Unix SSH server organisation, the SFTP server
is built into the same process as all the rest of the code. sesschan.c
spots a subsystem request for "sftp", and responds to it by
instantiating an SftpServer object and swapping out its own vtable for
one that talks to it.
(I rather like the idea of an object swapping its own vtable for a
different one in the middle of its lifetime! This is one of those
tricks that would be absurdly hard to implement in a 'proper' OO
language, but when you're doing vtables by hand in C, it's no more
difficult than any other piece of ordinary pointer manipulation. As
long as the methods in both vtables expect the same physical structure
layout, it doesn't cause a problem.)
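In outline, the trick looks like this (a schematic sketch with made-up
types, not the real sesschan.c ones):

    #include <stddef.h>
    #include <string.h>

    /* Both vtables must interpret the same structure layout; switching
     * behaviour mid-lifetime is then just one pointer assignment. */
    typedef struct SessChan SessChan;
    typedef struct SessChanVtable {
        void (*got_data)(SessChan *c, const void *data, size_t len);
        /* ... the other channel methods ... */
    } SessChanVtable;

    struct SessChan {
        const SessChanVtable *vt;
        /* ... state shared by both personalities ... */
    };

    extern const SessChanVtable shell_vtable, sftp_vtable;

    static void subsystem_request(SessChan *c, const char *name)
    {
        if (!strcmp(name, "sftp"))
            c->vt = &sftp_vtable;     /* same object, new set of methods */
    }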
The SftpServer object doesn't deal directly with SFTP packet formats;
it implements the SFTP server logic in a more abstract way, by having
a vtable method for each SFTP request type with an appropriate
parameter list. It sends its replies by calling methods in another
vtable called SftpReplyBuilder, which in the normal case will write an
SFTP reply packet to send back to the client. So SftpServer can focus
more or less completely on the details of a particular filesystem API
- and hence, the implementation I've got lives in the unix source
directory, and works directly with file descriptors and struct stat
and the like.
(One purpose of this abstraction layer is that I may well want to
write a second dummy implementation, for test-suite purposes, with
completely controllable behaviour, and now I have a handy place to
plug it in in place of the live filesystem.)
In between sesschan's parsing of the byte stream into SFTP packets and
the SftpServer object, there's a layer in the new file sftpserver.c
which does the actual packet decoding and encoding: each request
packet is passed to that, which pulls the fields out of the request
packet and calls the appropriate method of SftpServer. It also
provides the default SftpReplyBuilder which makes the output packet.
I've moved some code out of the previous SFTP client implementation -
basic packet construction code, and in particular the BinarySink/
BinarySource marshalling function for fxp_attrs - into sftpcommon.c,
so that the two directions can share as much as possible.
This server is NOT SECURE! If anyone is reading this commit message,
DO NOT DEPLOY IT IN A HOSTILE-FACING ENVIRONMENT! Its purpose is to
speak the server end of everything PuTTY speaks on the client side, so
that I can test that I haven't broken PuTTY when I reorganise its
code, even things like RSA key exchange or chained auth methods which
it's hard to find a server that speaks at all.
(For this reason, it's declared with [UT] in the Recipe file, so that
it falls into the same category as programs like testbn, which won't
be installed by 'make install'.)
Working title is 'Uppity', partly for 'Universal PuTTY Protocol
Interaction Test Yoke', but mostly because it looks quite like the
word 'PuTTY' with part of it reversed. (Apparently 'test yoke' is a
very rarely used term meaning something not altogether unlike 'test
harness', which is a bit of a stretch, but it'll do.)
It doesn't actually _support_ everything I want yet. At the moment,
it's a proof of concept only. But it has most of the machinery
present, and the parts it's missing - such as chained auth methods -
should be easy enough to add because I've built in the required
flexibility, in the form of an AuthPolicy object which can request
them if it wants to. However, the current AuthPolicy object is
entirely trivial, and will let in any user with the password "weasel".
(Another way in which this is not a production-ready server is that it
also has no interaction with the OS's authentication system. In
particular, it will not only let in any user with the same password,
but it won't even change uid - it will open shells and forwardings
under whatever user id you started it up as.)
Currently, the program can only speak the SSH protocol on its standard
I/O channels (using the new FdSocket facility), so if you want it to
listen on a network port, you'll have to run it from some kind of
separate listening program similar to inetd. For my own tests, I'm not
even doing that: I'm just having PuTTY spawn it as a local proxy
process, which also conveniently eliminates the risk of anyone hostile
connecting to it.
The bulk of the actual code reorganisation is already done by previous
commits, so this change is _mostly_ just dropping in a new set of
server-specific source files alongside the client-specific ones I
created recently. The remaining changes in the shared SSH code are
numerous, but all minor:
- a few extra parameters to BPP and PPL constructors (e.g. 'are you
in server mode?'), and pass both sets of SSH-1 protocol flags from
the login to the connection layer
- in server mode, unconditionally send our version string _before_
waiting for the remote one
- a new hook in the SSH-1 BPP to handle enabling compression in
server mode, where the message exchange works the other way round
- new code in the SSH-2 BPP to do _deferred_ compression the other
way round (the non-deferred version is still nicely symmetric)
- in the SSH-2 transport layer, some adjustments to do key derivation
either way round (swapping round the identifying letters in the
various hash preimages, and making sure to list the KEXINITs in the
right order)
- also in the SSH-2 transport layer, an if statement that controls
whether we send SERVICE_REQUEST and wait for SERVICE_ACCEPT, or
vice versa
- new ConnectionLayer methods for opening outgoing channels for X and
agent forwardings
- new functions in portfwd.c to establish listening sockets suitable
for remote-to-local port forwarding (i.e. not under the direction
of a Conf the way it's done on the client side).
If the child process's standard input is provided by a pipe that's
separate from its output channels, we can - and should - honour a
request to cause that process to receive input EOF, by closing the
output end of that pipe.
As usual, we do this by setting a pending-EOF flag and calling
try_write, to ensure that any buffered output data is sent before the
pipe actually closes.
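Schematically (structure fields and function names invented for this
sketch):

    #include <stdbool.h>
    #include <stddef.h>
    #include <unistd.h>

    typedef struct Pty {
        int master_i;        /* write end of the pipe to the child's stdin */
        size_t buffered;     /* output bytes still queued for the child */
        bool pending_eof;
    } Pty;

    static void pty_try_write(Pty *pty)
    {
        /* ... write as much queued data to pty->master_i as it will
         * accept, decreasing pty->buffered accordingly ... */
        if (pty->pending_eof && pty->buffered == 0 && pty->master_i >= 0) {
            close(pty->master_i);    /* now the child sees EOF on stdin */
            pty->master_i = -1;
        }
    }

    static void pty_send_input_eof(Pty *pty)
    {
        pty->pending_eof = true;     /* record the request ... */
        pty_try_write(pty);          /* ... and act once the buffer drains */
    }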
Not every "session" channel in SSH allocates a pty at all, of course,
and so I'll need a way to run a subprocess without doing so. The
simplest approach seems to be to expand uxpty's remit so that the pty
is optional: now it can open either a pty or a set of pipes for
stdin/out/err, according to an option provided to pty_backend_create.
(It amuses me that without this option I'd have an SSH server which is
incapable of _not_ honouring the "pty-req" channel request. That's
normally the easy part!)
This breaks the previous one-to-one coupling between pty backend
instances and file descriptors passed to uxsel, which I was using to
look up the Pty structure in a tree234 indexed by fd when an uxsel
notification came back. So now each Pty structure contains a
collection of subobjects of a new type PtyFd, and _those_ are what's
stored in the fd-indexed tree.
Another awkward part is that uxsel_set is not incremental: the rwx
flags you pass to it completely supersede the previous set for that
file descriptor, so I had to set up the logic that decides whether
we're trying to read or write each fd in a way that can cope equally
well with the fd aliasing another one (if it's the pty master) or not
(if there are three completely separate pipes).
The SS_SIGFOO family are implemented by sending a signal directly to
the pid of the immediate child process.
I had had the vague idea that it might be more desirable to send the
specified signal to the foreground process group in the tty. That way,
you'd be able to SIGINT (say) the foreground job in a shell session,
and return to the shell _prompt_ without terminating the whole
session, and you could do this in an emergency even if the job was a
full-screen application which had configured termios so that no
keystroke generated SIGINT.
But as far as I can see there's no actual way to do that. I wasn't
able to find any ioctl or termios call to send a signal to a pty's
foreground pgrp, and you can't even do it manually via kill(2) because
first you'd have to find out what the pgrp id _is_, and according to
the man pages, you can only call tcgetpgrp on the slave end of the pty
and even then only if it's your controlling terminal.
So SS_SIGFOO goes to the child process, because that's the only place
I can find that I _can_ send it to sensibly.
SS_BRK translates to tcsendbreak, of course (though I haven't actually
seen any effect of calling this on a pty master, not even if I set
PARMRK on the slave end which by my understanding _ought_ to show me
when break events occur).
This will be applied to the pty's termios settings at creation time,
superseding the default settings uxpty has always used. It works by
including the new sshttymodes.h with TTYMODES_LOCAL_ONLY defined, so
that modes not supported by a particular Unix system are automatically
quietly ignored.
Of course, a struct ssh_ttymodes always has the option of representing
"please make no change to the defaults", and of course, that's
precisely what is done by the one that pty_init constructs for clients
that aren't calling pty_backend_create directly.
The function that does the main pty setup is now called
pty_backend_create(), and has an API better suited to uxpty in
particular than the standard backend_init() virtual constructor. It
leaves off a load of standard parameters to backend_init() which
aren't really relevant to this backend, and it adds the 'argv'
parameter to pass in a split-up command line, which is unique to it.
The old creation function still exists, as a tiny wrapper that calls
the new pty_backend_create. And that version still gets the argv
parameter from the process-global variable pty_argv[], so the call
sites in pterm haven't had to change for this.
This will make it possible to instantiate a pty backend directly from
the SSH server code, without having to do anything really excessively
cumbersome to pass in a subcommand in the form of pre-split argv. (And
I'll add a few more specialist parameters to the new function shortly.)
There was a bit of a race condition depending on whether uxpty spotted
the EOF/EIO on the process's output first, or the SIGCHLD for its
actual termination: if the former came first, it would never bother to
reap the exit code at all.
It still doesn't bother if it's closing the session immediately and
the process genuinely _hasn't_ died (say, if it detaches itself
completely from the controlling tty to run in the background like a
weird parody of an old DOS TSR). But now when we see EOF, we make an
immediate (but nonblocking) attempt to wait for the child process, in
case its exit code was already available and we just hadn't noticed
yet.
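The immediate-but-nonblocking check is the usual waitpid idiom (a
sketch; the real code stashes the status in the Pty structure for
reporting later):

    #include <stdbool.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Called when we see EOF/EIO from the pty: the child may already
     * have died, in which case collect its exit status now. */
    static bool try_reap_child(pid_t child_pid, int *status_out)
    {
        int status;
        if (waitpid(child_pid, &status, WNOHANG) == child_pid &&
            (WIFEXITED(status) || WIFSIGNALED(status))) {
            *status_out = status;
            return true;             /* exit status was already available */
        }
        return false;                /* child still running; don't block */
    }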
The uxpty backend is going to be reused to implement the "session"
channel type in the upcoming SSH server implementation, which puts
quite a few new requirements on it. The first of them is that when we
get EOF from the subprocess's output channel (or rather, EIO from the
pty), we should actually notify the Seat of this.
In principle we should have been doing this all along, I'm pretty
sure. It hasn't happened to matter until now because the receiving
Seats haven't done much with that notification. But it will matter
when that's what controls the sending of SSH_MSG_CHANNEL_EOF.
ssh2connection.c now knows how to unmarshal the message formats for
all the channel requests we'll need to handle when we're the server
and a client sends them. Each one is translated into a call to a new
method in the Channel vtable, which is implemented by a trivial
'always fail' routine in every channel type we know about so far.
This will be used for the server side of X forwarding. It wraps up the
mechanics of listening on the right TCP port and (if possible) the
associated AF_UNIX socket, and also creates an appropriate X authority
file containing authorisation data provided by its caller.
Like the new platform_create_agent_socket, this function spawns a
watchdog subprocess to clean up the mess afterwards, in the hope of at
least _most_ of the time not leaving old sockets and authority files
lying around /tmp.
The code in Pageant that sets up the Unix socket and its containing
directory now lives in a separate file, uxagentsock.c, where it will
also be callable from the upcoming new SSH server when it wants to
create a similar socket for agent forwarding.
While I'm at it, I've also added a feature to create a watchdog
subprocess that will try to clean up the socket and directory once
Pageant itself terminates, in the hope of leaving less cruft lying
around /tmp.
Some kinds of channel, even after they've sent EOF in both directions,
still have something to do before they initiate the CLOSE mechanism
and wind up the channel completely. For example, a session channel
with a subprocess running inside it will want to be sure to send the
"exit-status" or "exit-signal" notification, even if that happens
after bidirectional EOF of the data channels.
Previously, the SSH-2 connection layer had the standard policy that
once EOF had been both sent and received, it would start the final
close procedure. There's a method chan_want_close() by which a Channel
could vary this policy in one direction, by indicating that it wanted
the close procedure to commence after EOF was sent in only one
direction. Its parameters are a pair of booleans saying whether EOF
has been sent, and whether it's been received.
Now chan_want_close can vary the policy in the other direction as
well: if it returns FALSE even when _both_ parameters are true, the
connection layer will honour that, and not send CHANNEL_CLOSE. If it
does that, the Channel is responsible for indicating when it _does_
want close later, by calling sshfwd_initiate_close.
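So a session channel's method can now look something like this (an
illustrative sketch, with a cut-down structure standing in for the
real one):

    #include <stdbool.h>

    /* Minimal stand-in for the real session-channel structure. */
    typedef struct SessionChannel {
        bool exit_status_sent;   /* set once "exit-status"/"exit-signal"
                                  * has been sent to the client */
    } SessionChannel;

    /* want_close: even after EOF has passed in both directions, keep the
     * channel alive until the exit notification has gone out; the channel
     * then calls sshfwd_initiate_close itself when it's ready. */
    static bool sesschan_want_close(SessionChannel *sess,
                                    bool sent_local_eof, bool rcvd_remote_eof)
    {
        return sent_local_eof && rcvd_remote_eof && sess->exit_status_sent;
    }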
Previously, it returned a human-readable string suitable for log
files, which tried to say something useful about the remote end of a
socket. Now it returns a whole SocketPeerInfo structure, of which that
human-friendly log string is just one field, but also some of the same
information - remote IP address and port, in particular - is provided
in machine-readable form where it's available.
That's more directly useful in uxpty.c (which is currently the only
actual client of the function), and also matches the data that SSH
clients send in "pty-req". Also, it makes that method behave more like
the GUI query function get_window_pixels used by terminal.c (with the
sole exception that unlike g_w_p it's allowed to return failure), so
it becomes even more trivial to implement in the GUI front ends.
The new FdSocket just takes an arbitrary pair of file descriptors to
read and write, optionally with an extra input fd providing the
standard error output from a command. uxproxy.c now just does the
forking and pipe setup, and once it's got all its fds, it hands off to
FdSocket to actually do the reading and writing.
This is very like the reorganisation I did on the Windows side in
commit 98a6a3553 (back in 2013, in preparation for named-pipe sockets
and connection sharing). The idea is that it should enable me to make
a thing that the PuTTY code base sees as a Socket, but which actually
connects to the standard I/O handles of the process it lives in.
Each of the new subroutines corresponds to one of the channel types
for which we know how to parse a CHANNEL_OPEN, and has a collection of
parameters corresponding to the fields of that message structure.
ssh2_connection_filter_queue now confines itself to parsing the
message, calling one of those functions, and constructing an
appropriate reply message if any.
Instead of the central code in ssh2_connection_filter_queue doing both
the job of parsing the channel request and deciding whether it's
acceptable, each Channel vtable now has a method for every channel
request type we recognise.
This is a new vtable-based abstraction which is passed to a backend in
place of Frontend, and it implements only the subset of the Frontend
functions needed by a backend. (Many other Frontend functions still
exist, notably the wide range of things called by terminal.c providing
platform-independent operations on the GUI terminal window.)
The purpose of making it a vtable is that this opens up the
possibility of creating a backend as an internal implementation detail
of some other activity, by providing just that one backend with a
custom Seat that implements the methods differently.
For example, this refactoring should make it feasible to directly
implement an SSH proxy type, aka the 'jump host' feature supported by
OpenSSH, aka 'open a secondary SSH session in MAINCHAN_DIRECT_TCP
mode, and then expose the main channel of that as the Socket for the
primary connection'. (Which of course you can already do by spawning
'plink -nc' as a separate proxy process, but this would permit it in
the _same_ process without anything getting confused.)
I've centralised a full set of stub methods in misc.c for the new
abstraction, which allows me to get rid of several annoying stubs in
the previous code. Also, while I'm here, I've moved a lot of
duplicated modalfatalbox() type functions from application main
program files into wincons.c / uxcons.c, which I think saves
duplication overall. (A minor visible effect is that the prefixes on
those console-based fatal error messages will now be more consistent
between applications.)
This was used by ldisc to communicate back to the front end that a key
had been pressed (or rather, that a keypress had caused a nonzero
amount of session input data). Its only nontrivial implementation was
in gtkwin.c, which used that notification to implement the Unix GUI's
"close window on keypress, if the session was already over" policy.
(Which in turn is Unix-specific, because the rationale is that
sometimes X servers don't have a functioning window manager, so it's
useful to have a way of telling any application to close without using
WM-provided facilities like a close button.)
But gtkwin.c doesn't need to be told by the ldisc that a keypress
happened - it's the one _sending_ those keypresses to ldisc in the
first place! So I've thrown away the three stub implementations of
frontend_keypress, removed the call to it in ldisc.c, and replaced it
with calls in gtkwin.c at all the points during keypress handling
that call ldisc_send.
A visible effect is that pterm's close-on-keypress behaviour will now
only trigger on an actual (input-generating) _keypress_, and not on
other input generation such as a paste action. I think that's an
improvement.
LogContext is now the owner of the logevent() function that back ends
and so forth are constantly calling. Previously, logevent was owned by
the Frontend, which would store the message into its list for the GUI
Event Log dialog (or print it to standard error, or whatever) and then
pass it _back_ to LogContext to write to the currently open log file.
Now it's the other way round: LogContext gets the message from the
back end first, writes it to its log file if it feels so inclined, and
communicates it back to the front end.
This means that lots of parts of the back end system no longer need to
have a pointer to a full-on Frontend; the only thing they needed it
for was logging, so now they just have a LogContext (which many of
them had to have anyway, e.g. for logging SSH packets or session
traffic).
LogContext itself also doesn't get a full Frontend pointer any more:
it now talks back to the front end via a little vtable of its own
called LogPolicy, which contains the method that passes Event Log
entries through, the old askappend() function that decides whether to
truncate a pre-existing log file, and an emergency function for
printing an especially prominent message if the log file can't be
created. One minor nice effect of this is that console and GUI apps
can implement that last function subtly differently, so that Unix
console apps can write it with a plain \n instead of the \r\n
(harmless but inelegant) that the old centralised implementation
generated.
One other consequence of this is that the LogContext has to be
provided to backend_init() so that it's available to backends from the
instant of creation, rather than being provided via a separate API
call a couple of function calls later, because backends have typically
started doing things that need logging (like making network
connections) before the call to backend_provide_logctx. Fortunately,
there's no case in the whole code base where we don't already have
logctx by the time we make a backend (so I don't actually remember why
I ever delayed providing one). So that shortens the backend API by one
function, which is always nice.
While I'm tidying up, I've also moved the printf-style logeventf() and
the handy logevent_and_free() into logging.c, instead of having copies
of them scattered around other places. This has also let me remove
some stub functions from a couple of outlying applications like
Pageant. Finally, I've removed the pointless "_tag" at the end of
LogContext's official struct name.
This is the structure that stores the truncated version of the Event
Log data to be displayed by the GTK Event Log dialog. It persists for
the lifetime of the parent SSH window, so it was deliberate that it
wasn't freed on destruction of the dialog itself, but I also forgot to
free it on destruction of the SSH window. (This will be more important
in multi-connection process architectures like the OS X port, of
course.)
While I'm at it, I'll follow my recent practice by exposing the
structure tag outside gtkdlg.c so that callers can more easily not
confuse it with some other kind of void *.
Looks as if I introduced this in commit 733fcca2c, where the pointer
returned from enum_settings_start() stopped being the same thing as
the underlying 'DIR *' - so as well as checking that the outer
containing structure isn't NULL, I also need to handle the case where
the DIR * inside it is NULL.
These are things where no fix was actually necessary in the code, but
the FIXME indicated that the comment itself was either in need of a
rewrite or removal.
It's never set to anything but NULL at any call site, and there's been
a FIXME comment in uxucs.c for ages saying it should be removed. I
think it only existed in the first place because it was a facility
supported by the underlying Windows API function and we couldn't see a
reason _not_ to pass it through. But I'm cleaning up FIXMEs, so we
should get rid of it.
(It stood for 'default used', incidentally - as in 'did the function
at any point have to make use of the parameter providing a default
fallback character?'. Nothing to do with _defusing_ things :-)
Ian Jackson points out that the Linux kernel has a macro of this name
with the same purpose, and suggests that it's a good idea to use the
same name as they do, so that at least some people reading one code
base might recognise it from the other.
I never really thought very hard about what order FROMFIELD's
parameters should go in, and therefore I'm pleasantly surprised to
find that my order agrees with the kernel's, so I don't have to
permute every call site as part of making this change :-)
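For reference, the macro in question is essentially the classic
offsetof trick (give or take any extra type-checking decoration):

    #include <stddef.h>

    /* Recover a pointer to a containing structure from a pointer to one
     * of its members - the same idea, and now the same name and parameter
     * order, as the Linux kernel's macro. */
    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))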
I don't actually know why this was ever here; it appeared in the very
first commit that invented Plug in the first place (7b0e08270) without
explanation. Perhaps Dave's original idea was that sometimes you'd
need those macros _not_ to be defined so that the same names could be
reused as the methods for a particular Plug instance? But I don't
think that ever actually happened, and the code base builds just fine
with those macros defined unconditionally just like all the other sets
of method macros we now have, so let's get rid of this piece of cruft
that was apparently unnecessary all along.
I think that means that _every_ one of my traitoids is now a struct
containing a vtable pointer as one of its fields (albeit sometimes the
only field), and never just a bare pointer.
Now that I'm doing that in so many of the new classes as a more
type-safe alternative to ordinary C casts, I should make sure all the
old code is also reaping the benefits. This commit converts the system
of unifont vtables in the GTK front end, and also the 'unifontsel'
structure that exposes only a few of its fields outside gtkfont.c.
All the main backend structures - Ssh, Telnet, Pty, Serial etc - now
describe structure types themselves rather than pointers to them. The
same goes for the codebase-wide trait types Socket and Plug, and the
supporting types SockAddr and Pinger.
All those things that were typedefed as pointers are older types; the
newer ones have the explicit * at the point of use, because that's
what I now seem to be preferring. But whichever one of those is
better, inconsistently using a mixture of the two styles is worse, so
let's make everything consistent.
A few types are still implicitly pointers, such as Bignum and some of
the GSSAPI types; generally this is either because they have to be
void *, or because they're typedefed differently on different
platforms and aren't always pointers at all. Can't be helped. But I've
got rid of the main ones, at least.
Otherwise we loop round repeatedly with the event loop continuing to
report the same EOF condition on them over and over again, consuming
CPU pointlessly and probably causing other knock-on trouble too.
Without this, we don't receive EOF notifications on pipes, because GTK
uses poll rather than select, which separates those out into distinct
event types.
If you call plug_closing directly from localproxy_try_send, which can
in turn be called directly from sk_write, then the plug's
implementation of plug_closing may well free things that the caller of
sk_write expected not to have vanished.
The corresponding routine in uxnet.c pushes that call to plug_closing
into a toplevel callback, so let's do that here too.
In order to list cross-certifiable host keys in the GUI specials menu,
the SSH backend has been inventing new values on the end of the
Telnet_Special enumeration, starting from the value TS_LOCALSTART.
This is inelegant, and also makes it awkward to break up special
handlers (e.g. to dispatch different specials to different SSH
layers), since if all you know about a special is that it's somewhere
in the TS_LOCALSTART+n space, you can't tell what _general kind_ of
thing it is. Also, if I ever need another open-ended set of specials
in future, I'll have to remember which TS_LOCALSTART+n codes are in
which set.
So here's a revamp that causes every special to take an extra integer
argument. For all previously numbered specials, this argument is
passed as zero and ignored, but there's a new main special code for
SSH host key cross-certification, in which the integer argument is an
index into the backend's list of available keys. TS_LOCALSTART is now
a thing of the past: if I need any other open-ended sets of specials
in future, I can add a new top-level code with a nicely separated
space of arguments.
While I'm at it, I've removed the legacy misnomer 'Telnet_Special'
from the code completely; the enum is now SessionSpecialCode, the
struct containing full details of a menu entry is SessionSpecial, and
the enum values now start SS_ rather than TS_.
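A hedged sketch of the resulting shape (the specific enumerators and
field names here are illustrative, not a faithful copy of the real
declarations):

    /* Every special now carries an integer argument. */
    typedef enum SessionSpecialCode {
        SS_BRK,     /* serial-line break                               */
        SS_EOF,     /* send end-of-file                                */
        SS_XCERT,   /* cross-certify host key; arg = index into the
                     * backend's list of available keys                */
        /* ... */
    } SessionSpecialCode;

    typedef struct SessionSpecial {
        const char *name;         /* text shown in the specials menu   */
        SessionSpecialCode code;
        int arg;                  /* passed as 0 and ignored for most  */
    } SessionSpecial;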
Originally, it controlled whether ssh.c should send terminal messages
(such as login and password prompts) to terminal.c or to stderr. But
we've had the from_backend() abstraction for ages now, which even has
an existing flag to indicate that the data is stderr rather than
stdout data; applications which set FLAG_STDERR are precisely those
that link against uxcons or wincons, so from_backend will do the
expected thing anyway with data sent to it with that flag set. So
there's no reason ssh.c can't just unconditionally pass everything
through that, and remove the special case.
FLAG_STDERR was also used by winproxy and uxproxy to decide whether to
capture standard error from a local proxy command, or whether to let
the proxy command send its diagnostics directly to the usual standard
error. On reflection, I think it's better to unconditionally capture
the proxy's stderr, for three reasons. Firstly, it means proxy
diagnostics are prefixed with 'proxy:' so that you can tell them apart
from any other stderr spew (which used to be particularly confusing if
both the main application and the proxy command were instances of
Plink); secondly, proxy diagnostics are now reliably copied to packet
log files along with all the other Event Log entries, even by
command-line tools; and thirdly, this means the option to suppress
proxy command diagnostics after the main session starts will actually
_work_ in the command-line tools, which it previously couldn't.
A more minor structure change is that copying of Event Log messages to
stderr in verbose mode is now done by wincons/uxcons, instead of
centrally in logging.c (since logging.c can now no longer check
FLAG_STDERR to decide whether to do it). The total amount of code to
do this is considerably smaller than the defensive-sounding comment in
logevent.c explaining why I did it the other way instead :-)
Now there's a centralised routine in misc.c to do the sanitisation,
which copies data on to an outgoing bufchain. This allows me to remove
from_backend_untrusted() completely from the frontend API, simplifying
code in several places.
Two use cases for untrusted-terminal-data sanitisation were in the
terminal.c prompts handler, and in the collection of SSH-2 userauth
banners. Both of those were writing output to a bufchain anyway, so
it was very convenient to just replace a bufchain_add with
sanitise_term_data and then not have to worry about it again.
There was also a simplistic sanitiser in uxcons.c, which I've now
replaced with a call to the good one - and in wincons.c there was a
FIXME saying I ought to get round to that, which now I have!
Clients outside ssh.c - all implementations of Channel - will now not
see the ssh_channel data type itself, but only a subobject of the
interface type SshChannel. All the sshfwd_* functions have become
methods in that interface type's vtable (though, wrapped in the usual
kind of macros, the call sites look identical).
This paves the way for me to split up the SSH-1 and SSH-2 connection
layers and have each one lay out its channel bookkeeping structure as
it sees fit; as long as they each provide an implementation of the
sshfwd_ method family, the types behind that need not look different.
A minor good effect of this is that the sshfwd_ methods are no longer
global symbols, so they don't have to be stubbed in Unix Pageant to
get it to compile.
Most of these were 'void *' because they weren't even reliably a
structure type underneath - the per-OS storage systems would directly
cast read/write/enum settings handles to and from random things like
FILE *, Unix DIR *, or Windows HKEY. So I've wrapped them in tiny
structs for the sake of having a sensible structure tag visible
elsewhere in the code.
'struct draw_ctx' has a structure tag inside gtkwin.c, so as per this
week's standard practice, let's expose the tag elsewhere so that
pointers declared that way can't be confused with anything else.
This was a particularly confusing piece of type-danger, because three
different types were passed outside sshshare.c as 'void *' and only
human vigilance prevented one coming back as the wrong one. Now they
all keep their opaque structure tags when they move through other
parts of the code.
There's now an interface called 'Channel', which handles the local
side of an SSH connection-layer channel, in terms of knowing where to
send incoming channel data to, whether to close the channel, etc.
Channel and the previous 'struct ssh_channel' mutually refer. The
latter contains all the SSH-specific parts, and as much of the common
logic as possible: in particular, Channel doesn't have to know
anything about SSH packet formats, or which SSH protocol version is in
use, or deal with all the fiddly stuff about window sizes - with the
exception that x11fwd.c's implementation of it does have to be able to
ask for a small fixed initial window size for the bodgy system that
distinguishes upstream from downstream X forwardings.
I've taken the opportunity to move the code implementing the detailed
behaviour of agent forwarding out of ssh.c, now that all of it is on
the far side of a uniform interface. (This also means that if I later
implement agent forwarding directly to a Unix socket as an
alternative, it'll be a matter of changing just the one call to
agentf_new() that makes the Channel to plug into a forwarding.)
This is another major source of unexplained 'void *' parameters
throughout the code.
In particular, the currently unused testback.c actually gave the wrong
pointer type to its internal store of the frontend handle - it cast
the input void * to a Terminal *, from which it got implicitly cast
back again when calling from_backend, and nobody noticed. Now it uses
the right type internally as well as externally.
Nearly every part of the code that ever handles a full backend
structure has historically done it using a pair of pointer variables,
one pointing at a constant struct full of function pointers, and the
other pointing to a 'void *' state object that's passed to each of
those.
While I'm modernising the rest of the code, this seems like a good
time to turn that into the same more or less type-safe and less
cumbersome system as I'm using for other parts of the code, such as
Socket, Plug, BinaryPacketProtocol and so forth: the Backend structure
contains a vtable pointer, and a system of macro wrappers handles
dispatching through that vtable.
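As a sketch of the idiom (names illustrative): the Backend structure
contains only a vtable pointer, and macro wrappers keep the call sites
looking like ordinary function calls:

    typedef struct Backend Backend;

    typedef struct BackendVtable {
        void (*send)(Backend *be, const char *buf, int len);
        int (*sendok)(Backend *be);
        /* ...other methods... */
    } BackendVtable;

    struct Backend { const BackendVtable *vt; };

    #define backend_send(be, buf, len) ((be)->vt->send(be, buf, len))
    #define backend_sendok(be)         ((be)->vt->sendok(be))

Each concrete backend then embeds a Backend as a field of its own state
structure and points its vt at a static vtable of its methods.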
Same principle again - the more of these structures have globally
visible tags (even if the structure contents are still opaque in most
places), the fewer of them I can mistake for each other.
That's one fewer anonymous 'void *' which might be accidentally
confused with some other pointer type if I misremember the order of
function arguments.
While I'm here, I've made its pointer-nature explicit - that is,
'Ldisc' is now a typedef for the structure type itself rather than a
pointer to it. A stylistic change only, but it feels more natural to
me these days for a thing you're going to eventually pass to a 'free'
function.
This commit adds the new ids and fingerprints in the keys appendix of
the manual, and moves the old ones down into the historic-keys
section. I've tweaked a few pieces of wording for ongoing use, so that
they don't imply a specific number of past key rollovers.
The -pgpfp option in all the tools now shows the new Master Key
fingerprint and the previous (2015) one. I've adjusted all the uses of
the #defines in putty.h so that future rollovers should only have to
modify the #defines themselves.
Most importantly, sign.sh bakes in the ids of the current release and
snapshot keys, so that snapshots will automatically be signed with the
new snapshot key and the -r option will invoke the new release key.
This formalises my occasional habit of using a single malloc to make a
block that contains a header structure and a data buffer that a field
of the structure will point to, allowing it to be freed in one go
later. Previously I had to do this by hand, losing the type-checking
advantages of snew; now I've written an snew-style macro to do the
job, plus an accessor macro to cleanly get the auxiliary buffer
pointer afterwards, and switched existing instances of the pattern
over to using that.
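A minimal sketch of the kind of macro pair described, using plain
malloc for self-containedness (the names are my illustrative guess at
an snew-style spelling, not necessarily the real ones):

    #include <stdlib.h>

    /* Allocate a header structure plus 'extra' bytes of auxiliary data
     * in a single block, so one free() releases both. */
    #define snew_plus(type, extra) \
        ((type *)malloc(sizeof(type) + (extra)))

    /* Pointer to the auxiliary buffer just after the header. */
    #define snew_plus_get_aux(ptr) ((void *)((ptr) + 1))

    /* Usage sketch:
     *   struct packet { size_t len; unsigned char *data; };
     *   struct packet *pkt = snew_plus(struct packet, 512);
     *   pkt->data = snew_plus_get_aux(pkt);
     *   ...
     *   free(pkt);    // frees header and buffer together
     */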
The general wisdom these days - in particular as given by the Linux
urandom(4) man page - seems to be that there's no need to use the
blocking /dev/random any more unless you're running at very early boot
time when the system random pool is at serious risk of not having any
entropy in it at all.
In case of non-Linux systems that don't think /dev/urandom is a
standard name, I fall back to /dev/random if /dev/urandom can't be
found.
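A minimal sketch of that fallback (function name illustrative):

    #include <fcntl.h>
    #include <unistd.h>

    /* Prefer the non-blocking /dev/urandom; fall back to /dev/random
     * only if the former doesn't exist on this system. */
    static int open_random_device(void)
    {
        int fd = open("/dev/urandom", O_RDONLY);
        if (fd < 0)
            fd = open("/dev/random", O_RDONLY);
        return fd;   /* still negative if neither could be opened */
    }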
This parameter returned a substring of the input, which was used for
two purposes. Firstly, it was used to hash the host and server keys
during the initial SSH-1 key setup phase; secondly, it was used to
check the keys in Pageant against the public key blob of a key
specified on the command line.
Unfortunately, those two purposes didn't agree! The first one needs
just the bare key modulus bytes (without even the SSH-1 mpint length
header); the second needs the entire key blob. So, actually, it seems
to have never worked in SSH-1 to say 'putty -i keyfile' and have PuTTY
find that key in Pageant and not have to ask for the passphrase to
decrypt the version on disk.
Fixed by removing that parameter completely, which simplifies all the
_other_ call sites, and replacing it by custom code in those two
places that each does the actually right thing.
There are several old functions that the previous commits have removed
all, or nearly all, of the references to. match_ssh_id is superseded
by ptrlen_eq_string; get_ssh_{string,uint32} is yet another replicated
set of decode functions (this time _partly_ centralised into misc.c);
the old APIs for the SSH-1 RSA decode functions are gone (together
with their last couple of holdout clients), as are
ssh{1,2}_{read,write}_bignum and ssh{1,2}_bignum_length.
Particularly odd was the use of ssh1_{read,write}_bignum in the SSH-2
Diffie-Hellman implementation. I'd completely forgotten I did that!
Now replaced with a raw bignum_from_bytes, which is simpler anyway.
Like the corresponding rewrite of conf serialisation, this affects not
just conf_deserialise itself but also the per-platform filename and
fontspec deserialisers.
Now I've got FROMFIELD, I can rework it so that structures providing
an implementation of the Socket or Plug trait no longer have to have
the vtable pointer as the very first thing in the structure. In
particular, this means that the ProxySocket structure can now directly
implement _both_ the Socket and Plug traits, which is always
_logically_ how it's worked, but previously it had to be implemented
via two separate structs linked to each other.
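Sketched in outline (field names illustrative, reusing the container_of
shape shown earlier): once the trait subobject no longer has to be the
first field, one structure can embed both traits, and each method
implementation recovers the outer struct from its own subobject:

    #include <stddef.h>
    #define container_of(p, type, member) \
        ((type *)((char *)(p) - offsetof(type, member)))

    typedef struct Socket { const struct SocketVtable *vt; } Socket;
    typedef struct Plug { const struct PlugVtable *vt; } Plug;

    typedef struct ProxySocket {
        /* ...proxy state... */
        Socket sock;   /* implements the Socket trait */
        Plug plug;     /* implements the Plug trait   */
    } ProxySocket;

    /* A Plug method implementation finds its ProxySocket like this: */
    static void proxy_plug_method(Plug *p)
    {
        ProxySocket *ps = container_of(p, ProxySocket, plug);
        (void)ps;   /* ...act on the proxy state... */
    }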
This is a cleanup I started to notice a need for during the BinarySink
work. It removes a lot of faffing about casting things to char * or
unsigned char * so that some API will accept them, even though lots of
such APIs really take a plain 'block of raw binary data' argument and
don't care what C thinks the signedness of that data might be - they
may well reinterpret it back and forth internally.
So I've tried to arrange for all the function call APIs that ought to
have a void * (or const void *) to have one, and those that need to do
pointer arithmetic on the parameter internally can cast it back at the
top of the function. That saves endless ad-hoc casts at the call
sites.
This removes a lot of pointless duplications of those constants.
Of course, _ideally_, I should upgrade to C99 bool throughout the code
base, replacing TRUE and FALSE with true and false and tagging
variables explicitly as bool when that's what they semantically are.
But that's a much bigger piece of work, and shouldn't block this
trivial cleanup!
This simplifies the client code both in ssh.c and in the client side
of Pageant.
I've cheated a tiny bit by preparing agent requests in a strbuf that
has space reserved at the front for the packet frame, which makes life
easier for the code that sends them off.
This affects all the functions that generate public and private key
and signature blobs of all kinds, plus ssh_ecdhkex_getpublic. Instead
of returning a bare block of memory and taking an extra 'int *length'
parameter, all these functions now write to a BinarySink, and it's the
caller's job to have prepared an appropriate one where they want the
output to go (usually a strbuf).
The main value of this change is that those blob-generation functions
were chock full of ad-hoc length-counting and data marshalling. You
have only to look at rsa2_{public,private}_blob, for example, to see
the kind of thing I was keen to get rid of!
Now instead of iterating through conf twice in separate functions,
once to count up the size of the serialised data and once to write it
out, I just go through once and dump it all in a strbuf.
(Of course, I could still do a two-pass count-then-allocate approach
easily enough in this system; nothing would stop me writing a
BinarySink implementation that didn't actually store any data and just
counted its size, and then I could choose at each call site whether I
preferred to do it that way.)
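As a sketch of that parenthetical idea, assuming a BinarySink-style
interface whose implementations supply a write method (the real
interface in the code is set up via macros and differs in detail):

    #include <stddef.h>

    typedef struct BinarySink BinarySink;
    struct BinarySink {
        void (*write)(BinarySink *bs, const void *data, size_t len);
    };

    /* A sink that stores nothing and just counts the bytes written. */
    typedef struct CountingSink {
        BinarySink bs;     /* first field, so the cast below is valid */
        size_t total;
    } CountingSink;

    static void counting_write(BinarySink *bs, const void *data,
                               size_t len)
    {
        CountingSink *cs = (CountingSink *)bs;
        (void)data;
        cs->total += len;
    }

Point the serialiser at &cs->bs (with cs->bs.write = counting_write and
cs->total = 0), run it once to learn the size, allocate, then run it
again at a real sink.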
In fact, those functions don't even exist any more. The only way to
get data into a primitive hash state is via the new put_* system. Of
course, that means put_data() is a viable replacement for every
previous call to one of the per-hash update functions - but just
mechanically doing that would have missed the opportunity to simplify
a lot of the call sites.
This centralises a few things that multiple header files were
previously defining, and were protecting against each other's
redefinition with ifdefs - small things like structs and typedefs. Now
all those things are in a defs.h which is by definition safe to
include _first_ (out of all the codebase-local headers) and only need
to be defined once.
Lots of functions had really generic names (like 'makekey'), or names
that missed out an important concept (like 'rsakey_pubblob', which
loads a public blob from a _file_ and doesn't generate it from an
in-memory representation at all). Also, the opaque 'int order' that
distinguishes the two formats of public key blob is now a mnemonic
enumeration, and while I'm at it, rsa_ssh1_public_blob takes one of
those as an extra argument.
This seems to be a knock-on effect of my recent reworking of the SSH
code to be based around queues and callbacks. The loop iteration
function in uxsftp.c (ssh_sftp_do_select) would keep going round its
select loop until something had happened on one of its file
descriptors, and then return to the caller in the assumption that the
resulting data might have triggered whatever condition the caller was
waiting for - and if not, then the caller checks, finds nothing
interesting has happened, and resumes looping with no harm done.
But now, when something happens on an fd, it doesn't _synchronously_
trigger the follow-up condition PSFTP was waiting for (which, at
startup time, happens to be back->sendok() starting to return TRUE).
Instead, it schedules a callback, which will schedule a callback,
which ... ends up setting that flag. But by that time, the loop
function has already returned, the caller has found nothing
interesting and resumed looping, and _now_ the interesting thing
happens but it's too late because ssh_sftp_do_select will wait until
the next file descriptor activity before it next returns.
Solution: give run_toplevel_callbacks a return value which says
whether it's actually done something, and if so, return immediately in
case that was the droid the caller was looking for. As it were.
In commit 528513dde I absentmindedly replaced a write to the local
variable 'need_size' of drawing_area_setup with a write to
inst->drawing_area_setup_needed, imagining that they had the same
effect. But actually, need_size was doing two jobs and I only replaced
one of them: it was also the variable that indicated that the logical
terminal size had changed and so we had to call term_size() to make
the terminal.c data structures resize themselves appropriately. The
loss of that call also inhibited generation of SIGWINCH.
NFC for the moment, because the bufchain is always specially
constructed to hold exactly the same data that would have been passed
in to the function as a (pointer,length) pair. But this API change
allows get_userpass_input to express the idea that it consumed some
but not all of the data in the bufchain, which means that later on
I'll be able to point the same function at a longer-lived bufchain
containing the full stream of keyboard input and avoid dropping
keystrokes that arrive too quickly after the end of an interactive
password prompt.
NFC: this is a preliminary refactoring, intended to make my life
easier when I start changing around the APIs used to pass user
keyboard input around. The fewer functions even _have_ such an API,
the less I'll have to do at that point.
Changing the window's font size with Alt-< or Alt-> was not setting
any of the flags that make drawing_area_setup consider itself to have
been non-spuriously called, so the real window would enlarge without
the backing surface also doing so.
Since Pageant contains its own passphrase prompt system rather than
delegating it to another process, it's not trivial to use it in other
contexts. But having gone to the effort of coming up with my own
askpass system that (I think) does a better job at not revealing the
length of the password, I _want_ to use it in other contexts where a
GUI passphrase or password prompt is needed. Solution: an --askpass
option.
Mostly for debugging purposes, because I'm tired of having to use
'setsid' to force Pageant to select the GUI passphrase prompt when I'm
trying to fix bugs in gtkask.c. But I can also imagine situations in
which the ability to force a GUI prompt window might be useful to end
users, for example if the process does _technically_ have a
controlling terminal but it's not a user-visible one (say, in the back
end of some automation tool like expect(1)).
For symmetry, I also provide an option to force the tty prompt. That's
less obviously useful, because that's already the preferred prompt
type when both methods are available - so the only use for it would be
if you wanted to ensure that Pageant didn't _accidentally_ try to
launch a GUI prompt, and aborted with an error if it couldn't use a
tty prompt.
I've found Unix Pageant's GTK password prompt to be a bit flaky on
Ubuntu 18.04. Part of the reason for that seems to be (I _think_) that
GTK has changed its internal order of setting things up, so that you
can no longer call gtk_widget_show_now() and expect that when it
returns everything is ready to do a gdk_seat_grab. Another part is
that - completely mysteriously as far as I can see - a _failed_
gdk_seat_grab(GDK_SEAT_CAPABILITY_KEYBOARD) has the side effect of
calling gdk_window_hide on the window you gave it!
So I've done a considerable restructuring that means we no longer
attempt to do the keyboard grab synchronously in gtk_askpass_setup.
Instead, we make keyboard grab attempts during the run of gtk_main,
scheduling each one on a timer if the previous attempt fails.
This means I need a visual indication of 'not ready for you to type
anything yet', which I've arranged by filling in the three drawing
areas to mid-grey. At the point when the keyboard grab completes and
the window becomes receptive to input, they turn into the usual one
black and two white.
In GTK 3.10 and above, high-DPI support is arranged by each window
having a property called a 'scale factor', which translates logical
pixels as seen by most of the GTK API (widget and window sizes and
positions, coordinates in the "draw" event, etc) into the physical
pixels on the screen. This is handled more or less transparently,
except that one side effect is that your Cairo-based drawing code had
better be able to cope with that scaling without getting confused.
PuTTY's isn't, because we do all our serious drawing on a separate
Cairo surface we made ourselves, and then blit subrectangles of that
to the window during updates. This has two bad consequences. Firstly,
our surface has a size derived from what GTK told us the drawing area
size is, i.e. corresponding to GTK's _logical_ pixels, so when the
scale factor is (say) 2, our drawing takes place at half size and then
gets scaled up by the final blit in the draw event, making it look
blurry and unpleasant. Secondly, those final blits seem to end up
offset by half a pixel, so that a second blit over the same
subrectangle doesn't _quite_ completely wipe out the previously
blitted data - so there's a ghostly rectangle left behind everywhere
the cursor has been.
It's not that GTK doesn't _let_ you find out the scale factor; it's
just that it's in an out-of-the-way piece of API that you have to call
specially. So now we do: our backing surface is now created at a pixel
resolution matching the screen's real pixels, and we translate GTK's
scale factor into an ordinary cairo_scale() before we commence
drawing. So we still end up drawing the same text at the same size -
and this strategy also means that non-text elements like cursor
outlines and underlining will be scaled up with the screen DPI rather
than stubbornly staying one physical pixel thick - but now it's nice
and sharp at full screen resolution, and the subrectangle blits in the
draw event are back to affecting the exact set of pixels we expect
them to.
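A rough sketch of the new arrangement, assuming GTK 3.10+ and leaving
out all the surrounding plumbing:

    #include <gtk/gtk.h>

    /* Create the backing surface at the screen's physical pixel size,
     * then scale drawing so the existing code can keep thinking in
     * logical pixels. */
    static cairo_surface_t *make_backing_surface(GtkWidget *area,
                                                 int logical_w,
                                                 int logical_h)
    {
        int scale = gtk_widget_get_scale_factor(area); /* 1 at normal DPI */
        cairo_surface_t *surf = cairo_image_surface_create(
            CAIRO_FORMAT_RGB24, logical_w * scale, logical_h * scale);

        cairo_t *cr = cairo_create(surf);
        cairo_scale(cr, scale, scale); /* draw in logical units as before */
        /* ...do the drawing with cr... */
        cairo_destroy(cr);

        return surf;
    }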
One silly consequence is that, immediately after removing the last
one, I've installed a handler for the GTK "configure-event" signal!
That's because the GTK 3 docs claim that that's how you get notified
that your scale factor has changed at run time (e.g. if you
reconfigure the scale factor of a whole monitor in the GNOME settings
dialog). Actually in practice I seem to find out via the "draw" event
before "configure" bothers to tell me, but now I've got a usefully
idempotent function for 'check whether the scale factor has changed
and sort it out if so', I don't see any harm in calling it from
anywhere it _might_ be useful.
I've been using that signal since the very first commit of this source
file, as a combined way to be notified when the size of the drawing
area changes (typically due to user window resizing actions) and also
when the drawing area is first created and available to be drawn on.
Unfortunately, testing on Ubuntu 18.04, I ran into an oddity, in which
the call to gtk_widget_show(inst->window) in new_session_window() has
the side effect of delivering a spurious configure_event on the
drawing area with size 1x46 pixels. This causes the terminal to resize
itself to 1 column wide, and the mistake isn't rectified until a
followup configure-event arrives after new_session_window returns to
the GTK main loop. But that means terminal output can occur between
those two configure events (the connection-sharing "Reusing a shared
connection to host.name" is a good example), and when it does, it gets
embarrassingly wrapped at one character per line down the left column.
I briefly tried to bodge around this by trying to heuristically guess
which configure events were real and which were spurious, but I have
no faith in that strategy continuing to work. I think a better
approach is to abandon configure-event completely, and move to a
system in which the two purposes I was using it for are handled by two
_different_ GTK signals, namely "size-allocate" (for knowing when we
get resized) and "realize" (for knowing when the drawing area
physically exists for us to start setting up Cairo or GDK machinery).
The result seems to have fixed the silly one-column wrapping bug, and
retained the ability to handle window resizes, on every GTK version I
have conveniently available to test on, including GTK 3 both before
and after these spurious configures started to happen.
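In outline, the two replacement signal hookups look something like this
('inst' standing in for whatever context pointer the handlers need):

    #include <gtk/gtk.h>

    static void area_realized(GtkWidget *widget, gpointer inst)
    {
        /* The drawing area physically exists now: safe to set up the
         * Cairo / GDK machinery. */
    }

    static void area_size_allocate(GtkWidget *widget,
                                   GdkRectangle *alloc, gpointer inst)
    {
        /* We have been resized to alloc->width x alloc->height. */
    }

    static void connect_area_signals(GtkWidget *area, gpointer inst)
    {
        g_signal_connect(area, "realize",
                         G_CALLBACK(area_realized), inst);
        g_signal_connect(area, "size-allocate",
                         G_CALLBACK(area_size_allocate), inst);
    }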
GTK 3 PuTTY/pterm has always assumed that if it was compiled with
_support_ for talking to the raw X11 layer underneath GTK and GDK,
then it was entitled to expect that raw X11 layer to exist at all
times, i.e. that GDK_DISPLAY_XDISPLAY would return a meaningful X
display that it could do useful things with. So if you ran it over the
GDK Wayland backend, it would immediately segfault.
Modern GTK applications need to cope with multiple GDK backends at run
time. It's fine for GTK PuTTY to _contain_ the code to find and use
underlying X11 primitives like the display and the X window id, but it
should be prepared to find that it's running on Wayland (or something
else again!) so those functions don't return anything useful - in
which case it should degrade gracefully to the subset of functionality
that can be accessed through backend-independent GTK calls.
Accordingly, I've centralised the use of GDK_DISPLAY_XDISPLAY into a
support function get_x_display() in gtkmisc.c, which starts by
checking that there actually is one first. All previous direct uses of
GDK_*_XDISPLAY now go via that function, and check the result for NULL
afterwards. (To save faffing about calling that function too many
times, I'm also caching the display pointer in more places, and
passing it as an extra argument to various subfunctions, mostly in
gtkfont.c.)
Similarly, the get_windowid() function that retrieves the window id to
put in the environment of pterm's child process has to be prepared for
there not to be a window id.
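A minimal sketch of the check involved, assuming a GTK 3 build with X11
support compiled in:

    #include <gtk/gtk.h>
    #include <gdk/gdkx.h>

    /* Return the raw Xlib display if GDK is really using its X11
     * backend at run time, and NULL otherwise (e.g. on Wayland). */
    static Display *get_x_display(void)
    {
        GdkDisplay *disp = gdk_display_get_default();
        if (GDK_IS_X11_DISPLAY(disp))
            return GDK_DISPLAY_XDISPLAY(disp);
        return NULL;
    }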
This isn't a complete fix for all Wayland-related problems. The other
one I'm currently aware of is that the default font is "server:fixed",
which is a bad default now that it won't be available on all backends.
And I expect that further problems will show up with more testing. But
it's a start.
This is a heavily edited (by me) version of a patch originally due to
Nico Williams and Viktor Dukhovni. Their comments:
* Don't delegate credentials when rekeying unless there's a new TGT
or the old service ticket is nearly expired.
* Check for the above conditions more frequently (every two minutes
by default) and rekey when we would delegate credentials.
* Do not rekey with very short service ticket lifetimes; some GSSAPI
libraries may lose the race to use an almost expired ticket. Adjust
the timing of rekey checks to try to avoid this possibility.
My further comments:
The most interesting thing about this patch to me is that the use of
GSS key exchange causes a switch over to a completely different model
of what host keys are for. This comes from RFC 4462 section 2.1: the
basic idea is that when your session is mostly bidirectionally
authenticated by the GSSAPI exchanges happening in initial kex and
every rekey, host keys become more or less vestigial, and their
remaining purpose is to allow a rekey to happen if the requirements of
the SSH protocol demand it at an awkward moment when the GSS
credentials are not currently available (e.g. timed out and haven't
been renewed yet). As such, there's no need for host keys to be
_permanent_ or to be a reliable identifier of a particular host, and
RFC 4462 allows for the possibility that they might be purely
transient and only for this kind of emergency fallback purpose.
Therefore, once PuTTY has done a GSS key exchange, it disconnects
itself completely from the permanent host key cache functions in
storage.h, and instead switches to a _transient_ host key cache stored
in memory with the lifetime of just that SSH session. That cache is
populated with keys received from the server as a side effect of GSS
kex (via the optional SSH2_MSG_KEXGSS_HOSTKEY message), and used if
later in the session we have to fall back to a non-GSS key exchange.
However, in practice servers we've tested against do not send a host
key in that way, so we also have a fallback method of populating the
transient cache by triggering an immediate non-GSS rekey straight
after userauth (reusing the code path we also use to turn on OpenSSH
delayed encryption without the race condition).
Colin Watson reports that on pre-releases of Ubuntu 18.04, configure
events which don't actually involve a change of window size show up
annoyingly often. Our handling of configure events involves throwing
away the backing Cairo surface, making a fresh blank one, and
scheduling a top-level callback to get terminal.c to do a repaint and
populate the new surface; so a draw event before that callback occurs
causes the window contents to flicker off and on again, not to mention
wasting a lot of time.
The simplest solution is to spot spurious configures, and respond by
not throwing away the previous Cairo surface in the first place.
Except in GTK1 (which doesn't have the former), via a gtkcompat.h
workaround.
Up-to-date GTK3 has deprecated gdk_beep(), causing build failures due
to the default -Werror setting.
Looks as if I haven't retried the GTK1 build for a while, and recent
GTK frontend development has broken it. The selection revamp has
pointed out that GTK1 didn't have the accessor function
gtk_selection_data_get_selection(), the standard GdkAtom value
GDK_SELECTION_CLIPBOARD, or keysyms for alphabetic characters; and
also I had an initialisation of one of my own structure fields
(dp->selparams) accidentally not guarded by the same GTK-versioning
ifdef that controls whether or not it was defined.
Ahem. I _spotted_ this in code review, and forgot to make the change
before pushing!
Because it's legitimate for a C implementation to define 'NULL' so
that it expands to just 0, it follows that if you use NULL in a
variadic argument list where the callee will expect to extract a
pointer, you run the risk of putting an int-sized rather than
pointer-sized argument on the list and causing the consumer to get out
of sync. So you have to add an explicit cast.
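The classic example is execl(), which reads its variadic arguments
until it finds a null pointer:

    #include <unistd.h>

    static void run_shell_command(const char *cmd)
    {
        /* Without the cast, a NULL that expands to plain 0 could be
         * passed as an int, and execl() would read past the end of
         * the intended argument list. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
    }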
The PuTTY GUIs (Unix and Windows) maintain an in-memory event log
for display to users as they request. This uses ints for tracking
eventlog size, which is subject to memory exhaustion and (given
enough heap space) overflow attacks by servers (via, e.g., constant
rekeying).
Also a bounded log is more user-friendly. It is rare to want more
than the initial logging and the logging from a few recent rekey
events.
The Windows fix has been tested using Dr. Memory as a valgrind
substitute. No errors corresponding to the affected code showed up.
The Dr. Memory results.txt was split into one file per error, and then
grep Error $(grep -l windlg *)|cut -d: -f3-|sort |uniq -c
was used to compare. Differences arose from different usage of the GUI,
but no error could be traced to the code modified in this commit.
The Unix fix has been tested using valgrind. We don't destroy the
eventlog_stuff eventlog arrays, so we can't be entirely sure that we
don't leak more than we did before, but from code inspection it looks
like we don't (and anyway, if we leaked as much as before, just without
the integer overflow, well, that's still an improvement).
Now we don't annoyingly print the 'askappend' prompt if you ask a
PuTTY tool to write its packet log to something that's not a regular
file, such as /dev/fd/1 or /dev/tty or a named pipe.
(In the case of a named pipe, another annoyance fixed by this change
is that we also don't open it for reading in the course of the
existence test.)
Apparently I haven't tried a GTK2 build since the most recent set of
GTK-related code reorganisation. Some functions that were ifdef'ed out
in GTK3 builds were now unused even in GTK2 builds (and, because they
were also declared static, caused a -Werror build failure); and the
pointless stub version of gtkapp.c was missing a stub version of a
recently added function referred to from another module.
gtk_application_set_accels_for_action() is new in Gtk 3.12, but (e.g.)
Ubuntu 14.04 LTS still ships with Gtk 3.10.
On the other hand, the function I've used instead,
gtk_application_add_accelerator(), is deprecated from Gtk 3.14 onwards,
indicating that it will disappear in some future version, so I've left
the newer code in against that day.
It actually doesn't seem to be necessary: running 'otool -L' on the
real binary in the application bundle (Pterm-bin or PuTTY-bin) lists a
lot of paths starting with "@executable_path/../Resources/", which I
take to mean that the application is already set up to automatically
load the GTK shared libraries out of its own bundle directory, without
me having to give it the extra hint of DYLD_LIBRARY_PATH.
Moreover, I just got round to upgrading my Mac to High Sierra, and now
the version of osxlaunch _with_ DYLD_LIBRARY_PATH is causing a crash
at program load time, when the libpng in the MacOS system library
directory tries to use the libz in the application bundle and finds
that it doesn't provide an entry point it was expecting
('inflateValidate'). I could try to fix that by updating the libz
version in my OS X PuTTY build environment, but that seems to me to
set a precedent of running to keep up with any further dependencies
the system libraries happen to acquire in later releases. Better to
reset DYLD_LIBRARY_PATH so that the system libpng will load the system
libz and not get confused in the first place.
I've been having intermittent segfaults in this launcher program, and
by means of the new TEST_COMPILE_ON_LINUX facility introduced by
commit eef8cac28, I ran it under valgrind which helpfully pointed out
several pointers between linked-list nodes which I'd been relying on
OS memory allocation to happen to have zeroed for me.
By default, the program still builds on Linux to a stub that just
prints 'nothing to see here'. But if you compile with
-DTEST_COMPILE_ON_LINUX, it compiles to a program that still doesn't
do anything _actually_ useful, but goes through all the same motions
that real osxlaunch would go through, until the final execv(2) fails
because of course it's not _really_ living in an application bundle
directory of the right shape.
That allows me to run all the setup code under the debugging tools I'm
most used to, in my preferred environment. (Same rationale as having
puttyapp / ptermapp build for Linux too.)
I've filled in the results of some not-entirely-conclusive
investigation into the trackpad scrolling issue, some thoughts on
resizing, and reordered the items into what currently seems the most
sensible order to me.
This still isn't complete: I also need to add the variable collections
of things like mid-session special commands and saved session names,
and also I need to try to grey out menu items when they're not
applicable. But it's a start.
Just to avoid an endless proliferation of functions too small to see,
I've arranged an enumeration of action ids and a single
app_menu_action function on the receiving end, and in gtkapp.c, a list
macro that means I at least don't have to define the tiny callback
functions and the GActionEntry records by hand and keep them in sync.
This fixes the problem I'd previously noticed, that if you don't
configure the "Command key acts as Meta" setting, then keystrokes like
Command-Q which _ought_ to function as accelerators for the
application menu bar don't.
Turns out that this was for the totally obvious reason: the keyboard
event was still being processed by gtkwin.c's key_event() and
translated via the GTK IM into ordinary keyboard input. If instead I
return FALSE from key_event on detecting that a key event has a
non-Meta-configured Command modifier, then it will go to the next-
level key-event handler inside GTK itself which implements the menu
accelerator behaviour. Another problem ticked off the OS X checklist.
That's what I get for not testing on all platforms before I push.
Forgot that, since OS X GTK mimics X11 GTK closely enough to still use
the name "CLIPBOARD" for the unique system clipboard, I had left this
code base's internal name for it as CLIP_CLIPBOARD and not the
CLIP_SYSTEM I used on Windows.
The gtkapp.c menu now has a Copy as well as Paste option; those menu
items, as well as the corresponding ones on the context menu and Copy
All, now address sets of clipboards parametrised between OS X and
ordinary GTK in unix.h. Also I've tweaked the wording of the
context-menu items to not use the X-specific terminology "CLIPBOARD"
on OS X.
I had a segfault on OS X today at Pterm.app shutdown. I wasn't able to
reproduce it in a debugger, but the cause seemed to be that
clipboard_clear called term_deselect (this was from before the patch
series that renamed that function) when inst->term was already NULL.
This must be because a clipboard_data_instance outlived its associated
inst->term, and quite likely its associated inst as well. But we can't
free those structures when a gui_data is freed, because GTK callbacks
will still depend on them; so instead we must have each gui_data keep
a list of active cdis pointing at it, and then at destruction time,
walk along the list nulling out each one's pointer to part of itself.
I've done the general clipboard revamp, and also, since I added
Ctrl-Shift-{C,V} as a new pair of UI actions for copy and paste, I've
also fulfilled the requirement that there should be some method of
non-menu-based pasting that doesn't depend on a middle mouse button or
an Ins key.
I think the list of OS X missing features is now down to details of
the OS X GTK port _itself_, as opposed to structural issues in the
general code base.
On all platforms, you can now configure which clipboard the mouse
pastes from, which clipboard Ctrl-Ins and Shift-Ins access, and which
Ctrl-Shift-C and Ctrl-Shift-V access. In each case, the options are:
- nothing at all
- a clipboard which is implicitly written by the act of mouse
selection (the PRIMARY selection on X, CLIP_LOCAL everywhere else)
- the standard clipboard written by explicit copy/paste UI actions
(CLIPBOARD on X, the unique system clipboard elsewhere).
Also, you can control whether selecting text with the mouse _also_
writes to the explicitly accessed clipboard.
The wording of the various messages changes between platforms, but the
basic UI shape is the same everywhere.
All the data fields referring to the selection in 'struct gui_data'
have been pulled out into a separate structure of which there are now
multiple instances, and I've plumbed through what should be the right
pointers and integer ids to everywhere they should go. So now the GTK
front end defines CLIP_PRIMARY and CLIP_CLIPBOARD in place of the
temporary cop-out CLIP_SYSTEM from the previous commit, and copying
and pasting can be done via either one.
The defaults should be the same as before, except that now the non-Mac
versions of the GtkApplication front ends will access CLIP_PRIMARY in
response to most actions but the 'Paste' menu item will paste from
CLIP_CLIPBOARD. (That's mostly just as a demonstration that accessing
multiple clipboards even works.)
This lays some groundwork for making PuTTY's cut and paste handling
more flexible in the area of which clipboard(s) it reads and writes,
if more than one is available on the system.
I've introduced a system of list macros which define an enumeration of
integer clipboard ids, some defined centrally in putty.h (at present
just a CLIP_NULL which never has any text in it, because that seems
like the sort of thing that will come in useful for configuring a
given copy or paste UI action to be ignored) and some defined per
platform. All the front end functions that copy and paste take a
clipboard id, and the Terminal structure is now configured at startup
to tell it which clipboard id it should paste from on a mouse click,
and which it should copy from on a selection.
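The list-macro pattern looks roughly like this (the entries and the
exact macro names are illustrative):

    /* Central list in putty.h... */
    #define CROSS_PLATFORM_CLIPBOARDS(X) \
        X(CLIP_NULL)       /* never has any text in it */ \
        X(CLIP_LOCAL)      /* internal implicitly-written clipboard */

    /* ...plus a per-platform list, e.g. for X11: */
    #define PLATFORM_CLIPBOARDS(X) \
        X(CLIP_PRIMARY)    /* mouse-selection clipboard */ \
        X(CLIP_CLIPBOARD)  /* explicit copy/paste clipboard */

    #define ALL_CLIPBOARDS(X) \
        CROSS_PLATFORM_CLIPBOARDS(X) PLATFORM_CLIPBOARDS(X)

    /* Expand the whole lot into one enumeration of clipboard ids. */
    #define CLIP_ID(id) id,
    enum { ALL_CLIPBOARDS(CLIP_ID) N_CLIPBOARDS };
    #undef CLIP_ID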
However, I haven't actually added _real_ support for multiple X11
clipboards, in that the Unix front end supports a single CLIP_SYSTEM
regardless of whether it's in OS X or GTK mode. So this is currently a
NFC refactoring which does nothing but prepare the way for real
changes to come.
Previously, both the Unix and Windows front ends would respond to a
paste action by retrieving data from the system clipboard, converting
it appropriately, _storing_ it in a persistent dynamic data block
inside the front end, and then calling term_do_paste(term), which in
turn would call back to the front end via get_clip() to retrieve the
current contents of that stored data block.
But, as far as I can tell, this was a completely pointless mechanism,
because after a data block was written into this storage area, it
would be immediately used for exactly one paste, and then never
accessed again until the next paste action caused it to be freed and
replaced with a new chunk of pasted data.
So why on earth was it stored persistently at all, and why that
callback mechanism from frontend to terminal back to frontend to
retrieve it for the actual paste action? I have no idea. This change
removes the entire system and replaces it with the completely obvious
alternative: the character-set-converted version of paste data is
allocated in a _local_ variable in the frontend paste functions,
passed directly to term_do_paste which now takes (buffer,length)
parameters, and freed immediately afterwards. get_clip() is gone.
When testing the previous commit, I went to great lengths to check all
the tricky corner cases of the detailed command-line argument handling
in Plink and PuTTY, on Windows and Unix. And did I also double-check
that I had not completely broken the very simplest possible invocation
of pterm? I did not.
The call to cmdline_host_ok() in gtkmain.c was failing an assertion in
pterm, because that function only expects to have been called by a
program that has the TOOLTYPE_HOST_ARG flag set - if that flag isn't
set, the program is expected to come up with its own answer to the
question (because I wasn't sure what the right fallback answer would
be). And I forgot to conditionalise the call between PuTTY and pterm.
This is another piece of long-overdue refactoring similar to the
recent commit e3796cb77. But where that one dealt with normalisation
of stuff already stored _in_ a Conf by whatever means (including, in
particular, handling a user typing 'username@host.name' into the
Hostname box of the GUI session dialog box), this one deals with
handling argv entries and putting them into the Conf.
This isn't exactly a pure no-functional-change-at-all refactoring. On
the other hand, it isn't a full-on cleanup that completely
rationalises all the user-visible behaviour as well as the code
structure. It's somewhere in between: I've preserved all the behaviour
quirks that I could imagine a reason for having intended, but taken
the opportunity to _not_ faithfully replicate anything I thought was
clearly just a bug.
So, for example, the following inconsistency is carefully preserved:
the command 'plink -load session nextword' treats 'nextword' as a host
name if the loaded session hasn't provided a hostname already, and
otherwise treats 'nextword' as the remote command to execute on the
already-specified remote host, but the same combination of arguments
to GUI PuTTY will _always_ treat 'nextword' as a hostname, overriding
a hostname (if any) in the saved session. That makes some sense to me
because of the different shapes of the overall command lines.
On the other hand, there are two behaviour changes I know of as a
result of this commit: a third argument to GUI PuTTY (after a hostname
and port) now provokes an error message instead of being silently
ignored, and in Plink, if you combine a -P option (specifying a port
number) with the historical comma-separated protocol selection prefix
on the hostname argument (which I'd completely forgotten even existed
until this piece of work), then the -P will now override the selected
protocol's default port number, whereas previously the default port
would win. For example, 'plink -P 12345 telnet,hostname' will now
connect via Telnet to port 12345 instead of to port 23.
There may be scope for removing or rethinking some of the command-
line syntax quirks in the wake of this change. If we do decide to do
anything like that, then hopefully having it all in one place will
make it easier to remove or change things consistently across the
tools.
Now 'putty user@host' will do what you wanted on Unix the same way it
always has on Windows.
(Thanks to Geoff Winkless for pointing out this inconsistency. I've
redone his actual patch my way, but he should still be credited for
the inspiration!)
A more or less identical piece of code to sanitise the CONF_host
string prior to session launch existed in Windows PuTTY and both
Windows and Unix Plink. It's long past time it was centralised.
While I'm here, I've added a couple of extra comments in the
centralised version, including one that - unfortunately - tries _but
fails_ to explain why a string of the form "host.name:1234" doesn't
get the suffix moved into CONF_port the way "user@host" moves the
prefix into CONF_username. Commit c1c1bc471 is the one I'm referring
to in the comment, and unfortunately it has an unexplained one-liner
log message from before I got into the habit of being usefully
verbose.
Stopping dialog boxes from being modal is now done; post_main() is
defunct; nothing left in gtkwin.c does an inappropriate whole-process
termination in response to a window-level error or closure condition.
(There is still modalfatalbox(), but that's not an _inappropriate_
process termination.)
This one's in frontend_keypress(), which is supposed to close the
window on the first keypress after the session inside it terminates
(that is, if your close-on-exit settings haven't made it close already
at that point).
It looks to me as if that behaviour doesn't currently _work_, and
hasn't worked for quite a while (certainly it was broken as of 0.70,
well before I started on this weekend's refactoring), because when the
session terminates we delete inst->ldisc and that's what would
otherwise be calling frontend_keypress. I should probably decide what
to do about that at some point. But for the moment, I'm satisfied to
simply not break this functionality any worse by making it not a
process-global exit :-)
For gtkapp-based tools, that will have to stop being a program-fatal
error, so I've turned it into a function called window_setup_error
(which I could in principle reuse for other problems in the long and
tortuous progress of new_session_window), and kept the original
handling in gtkmain.c's implementation of that function while gtkapp.c
does something more sensible with a message box.
Not all gtkwin-based tools use it. Only the ones with one session per
process, which parse a command line describing that session and might
reasonably want to report errors in that command line by writing to
standard error and exiting the program.
In other words, precisely the ones that link in gtkmain.c and not
gtkapp.c. So gtkmain.c is a more sensible place to put that
error-reporting function.
This was one of a handful of remaining places in gtkwin.c where exit()
is called incautiously. Of course, a failure to set up one SSH
connection should only be fatal to that connection, not the whole
process, so really we should be feeding into the connection_fatal
system.
This existed in order to avoid the various confusions that could
happen if a toplevel callback ran in the context of a subsidiary
instance of gtk_main(). Now there aren't any subsidiary gtk_main
instances any more, this mechanism is no longer needed, and I can
throw it out. It was horrible anyway.
I think these began to appear as a consequence of replacing
fatalbox() calls with more sensible error reports: the more specific a
direction I send a report in, the greater the annoying possibility of
re-entrance when the resulting error handler starts closing stuff.
This change requires me to break up the general cleanups in
delete_inst() into two halves: one runs when the error message box is
created, and cleans up the network connection and all the stuff
associated with it, and the other runs when the error message is
dismissed and the window can actually close.
It's an incoherent concept! There should not be any such thing as an
error box that terminates the entire program but is not modal. If it's
bad enough to terminate the whole program, i.e. _all_ currently live
connections, then there's no point in permitting progress to continue
in windows other than the affected one, because all windows are
affected anyway.
So all previous uses of fatalbox() have become modalfatalbox(), except
those which looked to me as if they shouldn't have been fatal in the
first place, e.g. lingering pieces of error handling in winnet.c which
ought to have had the severity of 'give up on this particular Socket
and close it' rather than 'give up on the ENTIRE UNIVERSE'.
I've also moved it out into gtkwin.c, because it seemed easier to do
the 'find existing instance of this dialog and raise it' dance there
than to split it across source files pointlessly.
Apart from the specific benefit of non-modality, this also makes it a
lot simpler compared to the previous code! I'm not completely sure why
I wasn't using the standard gtkdlg.c message box system all along.
This fits into a new dialog-box slot (because it might have to come up
at the same time as a network prompt), and makes use of the existing
callback system in logging.c which buffers the logging data until the
user says what they want done with it.
Now it has several 'slots', each named for a particular class of
subsidiary dialog box that a session window can have at most one of,
and register_network_prompt_dialog has a more general name and takes
an enum-typed argument identifying a slot. This lets me avoid writing
a zillion annoyingly similar function pairs and corresponding snippets
of cleanup code in delete_inst.
If you close a session window with an associated SSH back end, the
back end may call back to notify_remote_exit() from ssh_free(), which
queues a new top-level callback citing the inst structure we were
about to delete.
We could fix this by introducing a special 'moribund' flag which
inhibits notify_remote_exit from queueing a callback, but far easier
is to move the delete_callbacks_for_context() call to _after_ all
subsidiary things have been cleaned up, so that any last-minute
callbacks they might schedule will be promptly unscheduled again
before they do any damage.
This follows exactly the same pattern as for verify_ssh_host_key, but
the results of the dialog box are simpler (a plain yes-no response),
so the two dialog types can share a callback.
I've switched it to using the new non-modal create_message_box, and
provided a callback function which handles the cleanup afterwards.
I had expected this to be a lot more work, because I'd imagined that
I'd have to contort the coroutines in ssh.c to give them the ability
to wait for an asynchronously delivered result from that user prompt.
But in fact that wasn't necessary, because just such a mechanism has
been sitting there unused since commit 8574822b9 in 2005, when I added
it as part of my _previous_ attempt to write an OS X front end! (The
abandoned one written in native ObjC + Cocoa.)
When I switch verify_ssh_host_key() and friends over to creating
non-modal message boxes and returning to the main loop, there will be
a risk that their parent window will need to close for some other
reason while the user hasn't answered the pending question yet. (E.g.
if the user presses the main session window's close button, which will
no longer be a prohibited UI action once the transient dialog is not
modal.)
At that point we need to get rid of the pending dialog box, both for
UI purposes (it would look silly and be confusing to leave it lying
around) and for memory management (if the user subsequently clicks OK
in such a dialog it would probably try to leave its result somewhere
stale).
So now there's a mechanism for gtkwin.c remembering what the current
'network prompt dialog' is, if any (in which category I intend to
include everything triggered from ssh.c's various reasons for asking
crypto-related questions), and cleaning it up when the struct gui_data
it belongs to goes away.
If a dialog box is destroyed by the program before the user has
pressed one of the result-delivering buttons - e.g. because the parent
window closes so the dialog is no longer relevant to anything anyway -
then dlgparam_destroy would never call the client code's provided
callback. That makes sense in terms of the callback wanting to _take
action_ based on the result of the dialog box, but it ignores the
possibility that the callback may simply need to free its own context
structure.
So now dlgparam_destroy always calls the client's callback, even if
the result it passes is negative (meaning 'the user never got round to
pressing any of the dialog-ending buttons'), and all the existing
client callbacks handle the negative-result case by doing nothing
except freeing any allocated memory they might have.
This does the bulk of the work previously done by message_box()
proper, but takes a pointer to a result-reporting callback function
identical to the one we pass to create_config_box().
The modal version of message_box() still exists and is a small wrapper
on this function, running its own subsidiary gtk_main() loop which the
result callback terminates. But now I can start switching over
individual uses of message_box() to the non-modal version, and when
that's done, remove the modal function completely.
Now, in place of a variadic argument list with four parameters per
button and a terminating NULL, it takes a pointer to a struct which in
turn contains an (array,length) pair of small per-button structures.
In the process I've renamed the function from messagebox() to
message_box(). Partly that was just because it gave me a convenient
way to search the source for calls I hadn't converted yet, but also
I've thought for a while that that missing underscore didn't really
match the rest of my naming.
NFCI. Partly this minor refactor has the virtue that we can reuse the
more common button layouts without having to type them in at multiple
places in the code (and, indeed, I've provided buttons_yn and
buttons_ok for easy reuse, and could easily provide other things like
yesnocancel any time I need them). But mostly it's because I'm about
to split up message_box into multiple functions, and this saves me the
hassle of deciding which ones to make variadic and which to pass an
actual va_list to - particularly since messagebox() used to go over
its variadic argument list twice, which always makes delegating it to
another function that much more annoying.
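A sketch of the resulting shapes (field names illustrative):

    struct message_box_button {
        const char *label;     /* button text                          */
        char shortcut;         /* keyboard accelerator                 */
        int type;              /* e.g. +1 = default, -1 = cancel-ish   */
        int value;             /* result reported when this is pressed */
    };

    struct message_box_buttons {
        const struct message_box_button *buttons;
        int nbuttons;
    };

    /* The reusable layouts mentioned above: */
    static const struct message_box_button button_array_yn[] = {
        {"Yes", 'y', +1, 1},
        {"No",  'n', -1, 0},
    };
    static const struct message_box_buttons buttons_yn = {
        button_array_yn, 2,
    };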
The last few changes between them have fixed the problem of windows
not closing properly when their sessions terminated. The problem was
really more than one problem - pterm session termination wasn't even
detected due to the missing SIGCHLD handler, window-closing wasn't
done explicitly due to exit_callback() just calling gtk_main_quit
instead of a proper gtk_widget_destroy(), and that in turn wouldn't do
quite the right thing without the g_application_{hold,release} system
which I added in gtkapp.c as part of the non-modal config box rework.
Now that all of those are fixed, things seem to be working sensibly;
the OS X Pterm.app and PuTTY.app, and the ordinary X GTK ptermapp and
puttyapp too, now allow windows to be closed independently of each
other, close them automatically in the right way, and automatically
terminate the whole application when the last window is gone.
So I can clean up that TODO item, including its handwavy 'need to work
out some kind of mechanism'. Some kind of mechanism has now been
worked out, and given that there turned out to be a whole cluster of
interacting structural issues, no wonder I wasn't _quite_ sure what it
ought to be!
Now every call to do_config_box is replaced with a call to
create_config_box, which returns immediately having constructed the
new GTK window object, and is passed a callback function which it will
arrange to be called when the dialog terminates (whether by OK or by
Cancel). That callback is now what triggers the construction of a
session window after 'Open' is pressed in the initial config box, or
the actual mid-session reconfiguration action after 'Apply' is pressed
in a Change Settings box.
We were already prepared to ignore the re-selection of 'Change
Settings' from the context menu of a window that already had a Change
Settings box open (and not accidentally create a second config box for
the same window); but now we do slightly better, by finding the
existing config box and un-minimising and raising it, in case the user
had forgotten it was there.
That's a useful featurelet, but not the main purpose of this change.
The main point, of course, is that the multi-window GtkApplication-
based front ends now don't do anything confusing to the nesting of
gtk_main() when config boxes are involved. Whether you're changing the
settings of one (or more than one) of your already-running sessions,
preparing to start up a new PuTTY connection, or both at once, we stay
in the same top-level instance of gtk_main() and all sessions' top-
level callbacks continue to run sensibly.
This has been logically necessary in principle for ages, but we got
away without it because we just exited the program. But in the multi-
window GtkApplication front ends, we can't get away with that for
ever; we need to be able to free _one_ of our 'struct gui_data'
instances and everything dangling off it (or, at least, everything
that GTK's reference counting system doesn't clean up for us), without
also doing anything global to the process in which that gui_data is
contained.
Apparently I copied that rather too literally from osxlaunch.c, where
the text about OS X and 'launcher' made more sense. The stub main in
gtkapp.c has nothing to do with launchers and OS X, so I've corrected
the wording to say that a completely different thing won't work in
completely different circumstances :-)
People who use a packaging system other than jhbuild still ought to be
able to run the OS X GTK3 build, so now the gtk-mac-bundler command
finds out the locations of things by a more portable method.
(I've had this change lurking around uncommitted in a working tree for
a while, and only just found it in the course of doing other OS X-
related work. Oops.)
Without this, the Conf objects in a session and its duplicate were
aliases of each other, which could lead to confusing semantic effects
if one of the sessions was reconfigured in mid-run, and worse still, a
crash if one session got cleaned up and called conf_free on a Conf
that the other was still using.
None of that was intentional; it was just a matter of forgetting to
clone the Conf for the duplicated session. Now we do.
Detecting that the child process in a pterm has terminated is
important for _any_ kind of pterm, so it's a mistake to put the signal
handler setup _solely_ inside the optional pty_pre_init function which
does the privileged setup and forks off a utmp watchdog process. Now
the signal handler is installed even in the GtkApplication-based
multi-window front end to pterm, meaning it will exist even on OS X.
ignore_sbar is a flag that we set while manually changing the
scrollbar settings, so that when those half-finished changes trigger
GTK event callbacks, we know to ignore them, and wait until we've
finished setting everything up before actually updating the window.
But somehow I had managed to leave the functions that actually _have
the effect_ (at least in GTK1) outside the pair of statements that set
and unset the ignore flag.
The effect was that compiling pterm for GTK1, starting it up, and
issuing a command like 'ls -l' that scrolls off the bottom of the
window would lead to the _top_ half of the ls output being visible,
and the scrollbar at the top of the scrollback rather than the bottom.
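Schematically, the intended arrangement is something like this (the
function that actually pokes the scrollbar is given an illustrative
name here):

    inst->ignore_sbar = TRUE;
    /* Everything that modifies the scrollbar - including the calls
     * that, at least in GTK1, take effect immediately - must happen
     * while the flag is set, so the event callbacks they trigger
     * know to do nothing. */
    gtk_adjustment_set_value(inst->sbar_adjust, (gdouble)new_position);
    notify_scrollbar_of_changes(inst);
    /* Only now do we start paying attention to callbacks again. */
    inst->ignore_sbar = FALSE;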
Apparently I haven't tested this compile mode in a while: I had a
couple of compile errors due to new code not properly #ifdeffed (the
true-colour mode has to be effectively disabled in the palette-based
GTK1 graphics model) and one for an unused static function
(get_monitor_geometry is only used in GTK2 and above, and with -Werror
that means I mustn't even _define_ it in GTK1).
With these changes, I still didn't get a clean compile unless I also
configured CFLAGS=-std=gnu89, due to the GTK1 headers having an
outdated set of ifdefs to figure out the compiler's semantics of
'inline'. (They seem to expect old-style gcc, which inconveniently
treats 'inline' and 'extern inline' more or less the opposite way
round from the version standardised by C99.)
My custom GTK layout class 'Columns' includes a linked list of
dynamically allocated data, and apparently I forgot to write a
destructor that frees it all when the class is deallocated, and have
never noticed until now.
While debugging some new code, I ran valgrind in leak-checking mode
and it pointed out a handful of existing memory leaks, which got in the
way of spotting any _new_ leaks I might be introducing :-)
This was one: in the case where an asynchronous agent query on Unix is
aborted, the dynamically allocated buffer holding the response was not
freed.
ATTR_REVERSE was being handled in the front ends, and was causing the
foreground and background colours to be switched. (I'm not completely
sure why I made that design decision; it might be purely historical,
but then again, it might also be because reverse video is one effect
on the fg and bg colours that must still be performed even in unusual
frontend-specific situations like display-driven monochrome mode.)
This affected both explicit reverse video enabled using SGR 7, and
also the transient reverse video arising from mouse selection. Thanks
to Markus Gans for reporting the bug in the latter, which when I
investigated it turned out to affect the former as well.
I've done this on a 'where possible' basis: in Windows paletted mode
(in case anyone is still using an old enough graphics card to need
that!) I simply haven't bothered, and will completely ignore the dim
flag.
Markus Gans points out that some applications which (not at all
unreasonably) don't trust $TERM to tell them the full capabilities of
their terminal will use the sequence "OSC 4 ; nn ; ? BEL" to ask for
the colour-palette value in position nn, and they may not particularly
care _what_ the results are but they will use them to decide whether
the right number of colour palette entries even exist.
Otherwise, moving the cursor (at least in active, filled-cell mode) on
to a true-coloured character cell causes it to vanish completely
because the cell's colours override the thing that differentiates the
cursor.
I'm not sure if any X11 monochrome visuals or Windows paletted display
modes are still around, but just in case they are, we shouldn't
attempt true colour on either kind of display.
This is a heavily rewritten version of a patch originally by Lorenz
Diener; it was tidied up somewhat by Christian Brabandt, and then
tidied up more by me. The basic idea is to add to the termchar
structure a pair of small structs encoding 24-bit RGB values, each
with a flag indicating whether it's turned on; if it is, it overrides
any other specification of fg or bg colour for that character cell.
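In outline (types and field names here are illustrative), the addition
to termchar looks something like:

    typedef struct optional_rgb {
        unsigned char enabled;      /* if zero, no override is applied */
        unsigned char r, g, b;      /* the 24-bit colour, if enabled */
    } optional_rgb;

    typedef struct truecolour {
        optional_rgb fg, bg;        /* stored in each termchar */
    } truecolour;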
I've added a test line to colours.txt containing a few example colours
from /usr/share/X11/rgb.txt. In fact it makes quite a good demo to run
the whole of rgb.txt through this treatment, with a command such as
perl -pe 's!^\s*(\d+)\s+(\d+)\s+(\d+).*$!\e[38;2;$1;$2;$3m$&\e[m!' rgb.txt
[unix/osxlaunch.c:133] -> [unix/osxlaunch.c:134]: (warning) Either the condition '!qhead' is redundant or there is possible null pointer dereference: qhead.
Alamy Liu points out that asking for CONF_host will display the wrong
part of the configuration in the case where serial port setup fails.
The Windows front end's analogous message already got this right, but
I must have forgotten to change this one too when I introduced
conf_dest.
This seems to work around a GTK 3.22 display bug that Colin Watson and
I have both observed on Ubuntu (though I found that proxying the X
server, e.g. by SSH X forwarding or xtruss, inhibited the bug). The
effect of the bug was that the terminal window would appear completely
black and nothing would ever be displayed in it, though the terminal
session was still actually running and keystrokes would be sent to it.
But changing the call to cairo_set_source_surface() to some other
cairo_set_source_foo caused successful drawing of whatever other
source I selected; the problem seemed specific to the image surface.
Also, when I popped up the Ctrl-right-click menu over the terminal
window, the menu didn't disappear when dismissed, i.e. the drawing
area's redraw operation was not drawing in black, but failing to draw
_anything_.
That led me to hypothesise that the draw event handler for the
terminal drawing area might somehow be accidentally inventing 0 rather
than 255 for the implicit alpha channel when using our RGB-type image
surface as a source; so I tried setting the surface type to one with
an explicit alpha channel in the hope that there would no longer be a
need to make up any alpha value at all. And indeed, that seems to
solve the problem for me, so I might as well commit it.
However, I don't know the full details of what the previous problem
was, so this is only an empirical workaround. If it turns out I was
making some other mistake without which an RGB source surface would
have worked for me, then I should probably revert this and do whatever
other fix turns out to be a better plan.
Calling gtk_widget_realize to enforce the existence of an underlying
GdkWindow, followed by gdk_window_ensure_native to enforce an
underlying X window in turn, allows me to get hold of an X window id
on which I can call the Xlib function for setting WM_CLASS, still
before the window is mapped.
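Concretely, the sequence is along these lines (error handling omitted,
and the name strings are illustrative; it needs <gdk/gdkx.h> and
<X11/Xutil.h>):

    gtk_widget_realize(window);
    GdkWindow *gdkwin = gtk_widget_get_window(window);
    if (gdk_window_ensure_native(gdkwin)) {
        XClassHint *xch = XAllocClassHint();
        xch->res_name = (char *)"pterm";   /* instance name */
        xch->res_class = (char *)"Pterm";  /* class name */
        XSetClassHint(GDK_WINDOW_XDISPLAY(gdkwin),
                      GDK_WINDOW_XID(gdkwin), xch);
        XFree(xch);
    }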
With this change, plus Colin's preceding patches, the whole code base
_actually_ compiles and links against GTK 3.22 without any deprecation
warnings. (My claim in commit 8ce237234 that it previously did appears
to have been completely wrong - my guess is that I'd forgotten to
'make clean' before testing against 3.22 and so some source files had
already been compiled against earlier GTK headers.)
GTK+ 3.22 deprecates gdk_screen_{width,height} on the grounds that the
"screen" here actually refers to a virtual screen that may span multiple
monitors, and applications should generally be considering the width and
height of individual monitors. It's not entirely clear to me how this
fits with X geometry specifications, but I've gone with trying to get
hold of the geometry of the monitor that the window in question is on.
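In GDK 3.22 terms, the replacement looks roughly like:

    GdkDisplay *display = gtk_widget_get_display(widget);
    GdkWindow *gdkwin = gtk_widget_get_window(widget);
    GdkMonitor *monitor =
        gdk_display_get_monitor_at_window(display, gdkwin);
    GdkRectangle geometry;
    gdk_monitor_get_geometry(monitor, &geometry);
    /* geometry.width and .height now describe the monitor the window
     * is on, rather than the whole virtual screen. */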
gdk_window_set_background was already deprecated, but with GTK+ 3.22
even gdk_window_set_background_rgba is deprecated, so we need a better
approach. The best seems to be to go with the flow and inject a custom
CSS style for the appropriate widgets.
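The injection is along these lines (the selector and colour string are
illustrative, and the widget must previously have been given its name
with gtk_widget_set_name()):

    GtkCssProvider *provider = gtk_css_provider_new();
    char *css = dupprintf("#drawing-area { background-color: %s; }",
                          colour_string);
    gtk_css_provider_load_from_data(provider, css, -1, NULL);
    gtk_style_context_add_provider(
        gtk_widget_get_style_context(widget),
        GTK_STYLE_PROVIDER(provider),
        GTK_STYLE_PROVIDER_PRIORITY_APPLICATION);
    sfree(css);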
GTK+ 3.22 deprecates gtk_menu_popup in favour of various
gtk_menu_popup_at_* functions. gtk_menu_popup_at_pointer seems most
appropriate, but that requires being able to pass it a GdkEvent rather
than just some elements of it. In order to achieve that, I've
rearranged the scroll_event shim to construct a real GdkEventButton and
pass that down to button_internal.
The minimum version of GTK+ required to compile PuTTY after the GTK3
prep commits is 2.24. Provide more compatibility macros to allow
building against GTK+ 2.20.
Signed-off-by: Leonid Lisovskiy <lly.dev@gmail.com>
Unix PSCP, PSFTP, Plink and PuTTYgen now just report their build
platform as '64-bit Unix' or '32-bit Unix', without mentioning
irrelevant details of what flavour of GTK the other tools in the suite
might have been built against.
(In particular, they now won't imply anything outright untrue if there
was no GTK present at build time at all!)
Jacob pointed out the other day that the call to logevent with NULL
frontend handle can't possibly work, and the comment next to it saying
that it can is an outright lie (probably thoughtlessly copied from
some part of the Windows front end, where it actually would be true).
Furthermore, even if that logevent call didn't dereference NULL and
segfault, the followup call to fatalbox() would be inappropriate,
since proxied connections need not be the primary network connection
of the whole process.
Rewritten as a call to plug_closing, which is the proper channel
through which to report errors on an individual socket or equivalent.
When called with -V to ask for our version, return 0 rather than 1.
This is the usual behaviour observed by ssh(1) and other Unix commands.
Also use exit() rather than cleanup_exit() in pscp.c and psftp.c; at
this point we have nothing to clean up!
It's obvious to the trained eye whether GTK PuTTY was compiled against
GTK2 or GTK3, but the untrained eye would probably appreciate a little
help, and even the trained eye probably can't tell GTK 3.18 from 3.19
at a glance :-)
If we try to interpret a string argument as the name of a key file,
sometimes it's in circumstances where we _know_ it's a key file, so
we must print an error message and return failure if the file can't be
loaded. Other times it's not, and we just fall back to interpreting
the argument in some other way (e.g. as a pattern match against the
comment or fingerprint of a key already in the agent).
My code dealing with failure returns from the public-key loading
functions were mishandling the latter case, if they identified a file
as existing and looking more or less like some kind of key file but
then it turned out to have a format error; they would try to copy and
return a public key that they didn't actually have. Even if
pageant_pubkey_copy avoided crashing as a result, this would still
inhibit the fallback to treating the input string as some other kind
of pattern match.
I think these were not strictly necessary, since passing a null
pointer to access(2) would have resulted in EINVAL rather than a
segfault. But it's clearer to put them in (and keeps static checkers a
bit happier).
Thanks, Coverity - I must have been lucky that Unix Pageant in client
mode hasn't so far happened to have this field come out non-NULL, or
else pageant_pubkey_copy would have tried to dupstr a garbage pointer.
Partly to reassure the user that they got what they asked for, and
partly so that there's a clue for us in the logs when we get bug
reports.
This involved repurposing platform_psftp_post_option_setup() (no longer
used since e22120fe) as platform_psftp_pre_conn_setup(), and moving it
to after logging is set up.
It's a function that exists on all platforms, not just on Unix - it's
used in ldisc.c - so it shouldn't have been declared only in unix.h.
Score another for clang's warnings.
Now, instead of returning a boolean indicating whether the query has
completed or is still pending, agent_query() returns NULL to indicate
that the query _has_ completed, and if it hasn't, it returns a pointer
to a context structure representing the pending query, so that the
latter can be used to cancel the query if (for example) you later
decide you need to free the thing its callback was using as a context.
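In outline, and with illustrative variable names and an argument list
that may not match the code exactly, a caller now does something like:

    void *response; int response_len;
    agent_pending_query *pending =
        agent_query(request, request_len, &response, &response_len,
                    agent_reply_callback, callback_ctx);
    if (pending == NULL) {
        /* The query completed synchronously: the response is already
         * available and the callback will never be called. */
    } else {
        /* The query is still in flight. If callback_ctx later has to
         * be destroyed before the reply arrives, cancel first: */
        agent_cancel_query(pending);
    }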
This should fix a potential race-condition segfault if you overload an
agent forwarding channel and then close it abruptly. (Which nobody
will be doing for sensible purposes, of course! But I ran across this
while stress-testing other aspects of agent forwarding.)
backend_socket_log was generating the IP address in its error messages
by means of calling sk_getaddr(). But sk_getaddr only gets a SockAddr,
which may contain a whole list of candidate addresses; it doesn't also
get the information stored in the 'step' field of the Socket that was
actually trying to make the connection, which says _which_ of those
addresses we were in the middle of trying to connect to.
So now we construct a temporary SockAddr that points at the
appropriate one of the addresses, and use that for calls to plug_log
during connection setup.
If connect() returns EINPROGRESS, then previously we would detect a
successful connection by the socket becoming selectable for writing,
and spot an unsuccessful one by an error code being returned on the
first attempt to read from it.
This isn't the right way to do it: the right way is to respond to the
initial writability notification by calling getsockopt(SO_ERROR) to
retrieve the error code (if any) from the completed connection
attempt. Doing it the old way had the problem that when the socket
became writable, we could sometimes already have written some of our
outgoing data to it before finding out that the connect attempt failed
- which meant we'd discard that data from the bufchain, and no longer
have it to send through a later successful connection to a different
candidate address.
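The standard shape of the corrected logic, roughly as it now appears,
is:

    /* The socket has become writable after connect() returned
     * EINPROGRESS. Ask the kernel how the connection attempt ended. */
    int err = 0;
    socklen_t errlen = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        err = errno;              /* getsockopt itself failed */
    if (err == 0) {
        /* Connected: only now is it safe to start writing data. */
    } else {
        /* Connection failed with 'err'. Keep our buffered output and
         * move on to the next candidate address. */
    }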
In case of connection errors before and during the handshake,
net_select_result retries with the server's next candidate address.
However, it was jumping straight to the last address, because it was
not checking the return value of try_connect for the intermediate
addresses.
This shows the build platform (32- vs 64-bit in particular, and also
whether Unix GTK builds were compiled with or without the X11 pieces),
what compiler was used to build the binary, and any interesting build
options that might have been set on the make command line (especially,
but not limited to, the security-damaging ones like NO_SECURITY or
UNPROTECT). This will probably be useful all over the place, but in
particular it should allow the different Windows binaries to be told
apart!
Commits 21101c739 and 2eb952ca3 laid the groundwork for this, by
allowing the various About boxes to contain free text and also
ensuring they could be copied and pasted easily as part of a bug
report.
I had mistakenly pulled a 'char' value out of a string and passed it
to x11_font_has_glyph and x11_char_struct, each of which takes its two
index bytes as int-typed parameters. But if chars are signed, that
turns high-bit-set characters into out-of-range array indices. Oops.
The range checks in x11_char_struct prevented that from causing any
problem worse than refusal to display any affected glyph. Even so,
that's not particularly helpful. Fixed by changing the index byte
parameters to unsigned char type.
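The failure mode is the classic C one, in miniature:

    char c = (char)0xE9;            /* a high-bit-set character */
    int bad = c;                    /* -23 if char is signed: bogus index */
    int good = (unsigned char)c;    /* always 233, a valid index */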
I noticed today that Unix Plink responds to SIGWINCH by accidentally
dying of EINTR having interrupted its main select loop, and when I
checked, there turn out to be a couple of other select loops with the
same bug.
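The fix is the usual retry-on-EINTR idiom, along the lines of:

    int ret;
    do {
        ret = select(maxfd + 1, &rfds, &wfds, &xfds, NULL);
    } while (ret < 0 && errno == EINTR);
    if (ret < 0) {
        /* now it really is an error worth acting on */
    }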
The new font name configured by the keystrokes was missing its
"client:" or "server:" prefix, which could have led to the selection
of the wrong font in rare situations.
Each gtkfont back end now provides a routine that will return the name
of a similar font to the current one but one notch larger or smaller.
For Pango, this is just a matter of incrementing the font size field
in a standard way; for X11 server-side fonts, we have to go and do an
XListFonts query with a wildcard that requests fonts that vary only in
the size fields from the current one, and then iterate over the result
looking for the best one.
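For the Pango side, 'one notch larger or smaller' is essentially the
following, where delta is +1 or -1 (ignoring the real code's handling
of minimum sizes and font name prefixes):

    PangoFontDescription *desc = pango_font_description_from_string(name);
    int size = pango_font_description_get_size(desc);
    pango_font_description_set_size(desc, size + delta * PANGO_SCALE);
    char *newname = pango_font_description_to_string(desc);
    pango_font_description_free(desc);
    /* the caller g_free()s newname when it's finished with it */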
(I expect this will be more useful to Pango scalable-font users than
to X11 fonts, but it seemed a shame not to give the X11 side my best
shot while I was at it.)
Choice of hotkey: I know I'm being inconsistent with gnome-terminal's
use of Ctrl-plus and Ctrl-minus. I thought that was because I was
already using Ctrl-minus as a more convenient synonym for
Ctrl-underscore (which sends the actual control code 0x1F), but now I
actually try it, apparently I'm not. However, Ctrl-plus and Ctrl-minus
are quite horrible as a keystroke pair anyway (one has to be typed
with shift and one without!), and I feel as if the 'less' and
'greater' signs are more specific anyway, in that they specifically
indicate _size_ rather than just 'unspecified numerical value'.
There were already two places in the code (x11font_enum_fonts and
x11_guess_derived_font_name) where we retrieved an XLFD from the X
server, sawed it up ad-hoc into its '-'-separated parts and accessed
them by numeric index.
I'm about to add a third, so before I do, let's turn this into a
somewhat principled system where we get to do the decode/encode in
just one place and call all the individual fields by names that are
actually memorable.
No functional change intended by this commit.
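The decoded form is essentially a struct with one member per XLFD
field, along these lines (member names illustrative):

    struct xlfd_decomposed {
        /* The fourteen '-'-separated XLFD fields, by name. */
        char *foundry, *family_name, *weight_name, *slant,
            *setwidth_name, *add_style_name;
        int pixel_size, point_size;
        int resolution_x, resolution_y;
        char *spacing;
        int average_width;
        char *charset_registry, *charset_encoding;
    };

    struct xlfd_decomposed *xlfd_decompose(const char *xlfd);
    char *xlfd_recompose(const struct xlfd_decomposed *dec);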
The XDG configuration location ($XDG_CONFIG_HOME/putty, or
~/.config/putty) is now preferred over the old ~/.putty location, if the
XDG location already exists. If it doesn't exist, we try to use one of
the old locations ($HOME/.putty, [/etc/passwd home]/.putty, /.putty). If
none of the directories exist, we fall back to ~/.config/putty if the
XDG_DEFAULT macro is defined, or to ~/.putty otherwise. The
PUTTYDIR environment variable remains a definitive override of the
configuration location. This all ensures that the old location is still
used, unless the user explicitly requests otherwise.
The configuration directories are created using the make_dir_path()
function, to ensure that saving the configuration doesn't fail e.g.
because of a non-existent ~/.config directory.
Essentially 'mkdir -p' - we try to make each prefix of the pathname,
terminating on any error other than EEXIST. Semantics are similar to
make_dir_and_check_ours(): we return NULL on success or a dynamically
allocated error message string on failure.
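A sketch of the intended behaviour (the exact signature and error
message wording here are illustrative):

    char *make_dir_path(const char *path, mode_t mode)
    {
        /* Create each prefix of 'path' in turn, like 'mkdir -p'. */
        size_t pos = 0;
        while (path[pos]) {
            pos += strcspn(path + pos, "/");     /* extend to next '/' */
            if (pos > 0) {
                char *prefix = dupprintf("%.*s", (int)pos, path);
                if (mkdir(prefix, mode) < 0 && errno != EEXIST) {
                    char *err = dupprintf("%s: mkdir: %s",
                                          prefix, strerror(errno));
                    sfree(prefix);
                    return err;      /* dynamically allocated message */
                }
                sfree(prefix);
            }
            pos += strspn(path + pos, "/");      /* skip the '/' itself */
        }
        return NULL;                             /* success */
    }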