For gtkapp-based tools that will have to stop being a program-fatal
error, so I've turned it into a function called window_setup_error
(which I could in principle reuse for other problems in the long and
tortuous progress of new_session_window), and kept the original
handling in gtkmain.c's implementation of that function while gtkapp.c
does something more sensible with a message box.
Not all gtkwin-based tools use it. Only the ones with one session per
process, which parse a command line describing that session and might
reasonably want to report errors in that command line by writing to
standard error and exiting the program.
In other words, precisely the ones that link in gtkmain.c and not
gtkapp.c. So gtkmain.c is a more sensible place to put that
error-reporting function.
They don't do normal command-line processing, so they don't need it. A
few stray references to machinery provided in there are now satisfied
instead by a new stub module nocmdline.c.
This was one of a handful of remaining places in gtkwin.c where exit()
is called incautiously. Of course, a failure to set up one SSH
connection should only be fatal to that connection, not the whole
process, so really we should be feeding into the connection_fatal
system.
If ssh_init encounters a synchronous error, it will call random_unref
before returning. But the Ssh structure it created will still exist,
and if the caller (sensibly) responds by freeing it, then that will
cause a second random_unref, leading to the RNG's refcount going below
zero and failing an assertion.
We never noticed this before because with only one PuTTY connection
per process it was easier to just exit(1) without bothering to clean
things up. Now, with all the multi-sessions-per-process fixes I'm
doing, this has shown up as a problem. But other front ends may
legitimately still just exit - I don't think I can sensibly enforce
_not_ doing so at this late stage - so I've had to arrange to set a
flag in the Ssh saying whether a random_unref is still pending or not.
This existed in order to avoid the various confusions that could
happen if a toplevel callback ran in the context of a subsidiary
instance of gtk_main(). Now there aren't any subsidiary gtk_main
instances any more, this mechanism is no longer needed, and I can
throw it out. It was horrible anyway.
I think these began to appear as a consequencce of replacing
fatalbox() calls with more sensible error reports: the more specific a
direction I send a report in, the greater the annoying possibility of
re-entrance when the resulting error handler starts closing stuff.
This change requires me to break up the general cleanups in
delete_inst() into two halves: one runs when the error message box is
created, and cleans up the network connection and all the stuff
associated with it, and the other runs when the error message is
dismissed and the window can actually close.
It's an incoherent concept! There should not be any such thing as an
error box that terminates the entire program but is not modal. If it's
bad enough to terminate the whole program, i.e. _all_ currently live
connections, then there's no point in permitting progress to continue
in windows other than the affected one, because all windows are
affected anyway.
So all previous uses of fatalbox() have become modalfatalbox(), except
those which looked to me as if they shouldn't have been fatal in the
first place, e.g. lingering pieces of error handling in winnet.c which
ought to have had the severity of 'give up on this particular Socket
and close it' rather than 'give up on the ENTIRE UNIVERSE'.
I've also moved it out into gtkwin.c, because it seemed easier to do
the 'find existing instance of this dialog and raise it' dance there
than to split it across source files pointlessly.
Apart from the specific benefit of non-modality, this also makes it a
lot simpler compared to the previous code! I'm not completely sure why
I wasn't using the standard gtkdlg.c message box system all along.
This fits into a new dialog-box slot (because it might have to come up
at the same time as a network prompt), and makes use of the existing
callback system in logging.c which buffers the logging data until the
user says what they want done with it.
Now it has several 'slots', each named for a particular class of
subsidiary dialog box that a session window can have at most one of,
and register_network_prompt_dialog has a more general name and takes
an enum-typed argument identifying a slot. This lets me avoid writing
a zillion annoyingly similar function pairs and corresponding snippets
of cleanup code in delete_inst.
If you close a session window with an associated SSH back end, the
back end may call back to notify_remote_exit() from ssh_free(), which
queues a new top-level callback citing the inst structure we were
about to delete.
We could fix this by introducing a special 'moribund' flag which
inhibits notify_remote_exit from queueing a callback, but far easier
is to move the delete_callbacks_for_context() call to _after_ all
subsidiary things have been cleaned up, so that any last-minute
callbacks they might schedule will be promptly unscheduled again
before they do any damage.
This follows exactly the same pattern as for verify_ssh_host_key, but
the results of the dialog box are simpler (a plain yes-no response),
so the two dialog types can share a callback.
I've switched it to using the new non-modal create_message_box, and
provided a callback function which handles the cleanup afterwards.
I had expected this to be a lot more work, because I'd imagined that
I'd have to contort the coroutines in ssh.c to give them the ability
to wait for an asynchronously delivered result from that user prompt.
But in fact that wasn't necessary, because just such a mechanism has
been sitting there unused since commit 8574822b9 in 2005, when I added
it as part of my _previous_ attempt to write an OS X front end! (The
abandoned one written in native ObjC + Cocoa.)
When I switch verify_ssh_host_key() and friends over to creating
non-modal message boxes and returning to the main loop, there will be
a risk that their parent window will need to close for some other
reason while the user hasn't answered the pending question yet. (E.g.
if the user presses the main session window's close button, which will
no longer be a prohibited UI action once the transient dialog is not
modal.)
At that point we need to get rid of the pending dialog box, both for
UI purposes (it would look silly and be confusing to leave it lying
around) and for memory management (if the user subsequently clicks OK
in such a dialog it would probably try to leave its result somewhere
stale).
So now there's a mechanism for gtkwin.c remembering what the current
'network prompt dialog' is, if any (in which category I intend to
include everything triggered from ssh.c's various reasons for asking
crypto-related questions), and cleaning it up when the struct gui_data
it belongs to goes away.
If a dialog box is destroyed by the program before the user has
pressed one of the result-delivering buttons - e.g. because the parent
window closes so the dialog is no longer relevant to anything anyway -
then dlgparam_destroy would never call the client code's provided
callback. That makes sense in terms of the callback wanting to _take
action_ based on the result of the dialog box, but it ignores the
possibility that the callback may simply need to free its own context
structure.
So now dlgparam_destroy always calls the client's callback, even if
the result it passes is negative (meaning 'the user never got round to
pressing any of the dialog-ending buttons'), and all the existing
client callbacks handle the negative-result case by doing nothing
except freeing any allocated memory they might have.
This does the bulk of the work previously done by message_box()
proper, but takes a pointer to a result-reporting callback function
identical to the one we pass to create_config_box().
The modal version of message_box() still exists and is a small wrapper
on this function, running its own subsidiary gtk_main() loop which the
result callback terminates. But now I can start switching over
individual uses of message_box() to the non-modal version, and when
that's done, remove the modal function completely.
Now, in place of a variadic argument list with four parameters per
button and a terminating NULL, it takes a pointer to a struct which in
turn contains an (array,length) pair of small per-button structures.
In the process I've renamed the function from messagebox() to
message_box(). Partly that was just because it gave me a convenient
way to search the source for calls I hadn't converted yet, but also
I've thought for a while that that missing underscore didn't really
match the rest of my naming.
NFCI. Partly this minor refactor has the virtue that we can reuse the
more common button layouts without having to type them in at multiple
places in the code (and, indeed, I've provided buttons_yn and
buttons_ok for easy reuse, and could easily provide other things like
yesnocancel any time I need them). But mostly it's because I'm about
to split up message_box into multiple functions, and this saves me the
hassle of deciding which ones to make variadic and which to pass an
actual va_list to - particularly since messagebox() used to go over
its variadic argument list twice, which always makes delegating it to
another function that much more annoying.
ssh1_rdpkt claimed to be handling SSH1_MSG_DEBUG and SSH1_MSG_IGNORE
packets, but in fact, the handling of those has long since been moved
into the dispatch table; those particular entries are set up in
ssh1_protocol_setup().
The last few changes between them have fixed the problem of windows
not closing properly when their sessions terminated. The problem was
really more than one problem - pterm session termination wasn't even
detected due to the missing SIGCHLD handler, window-closing wasn't
done explicitly due to exit_callback() just calling gtk_main_quit
instead of a proper gtk_widget_destroy(), and that in turn wouldn't do
quite the right thing without the g_application_{hold,release} system
which I added in gtkapp.c as part of the non-model config box rework.
Now that all of those are fixed, things seem to be working sensibly;
the OS X Pterm.app and PuTTY.app, and the ordinary X GTK ptermapp and
puttyapp too, now allow windows to be closed independently of each
other, close them automatically in the right way, and automatically
terminate the whole application when the last window is gone.
So I can clean up that TODO item, including its handwavy 'need to work
out some kind of mechanism'. Some kind of mechanism has now been
worked out, and given that there turned out to be a whole cluster of
interacting structural issues, no wonder I wasn't _quite_ sure what it
ought to be!
Now every call to do_config_box is replaced with a call to
create_config_box, which returns immediately having constructed the
new GTK window object, and is passed a callback function which it will
arrange to be called when the dialog terminates (whether by OK or by
Cancel). That callback is now what triggers the construction of a
session window after 'Open' is pressed in the initial config box, or
the actual mid-session reconfiguration action after 'Apply' is pressed
in a Change Settings box.
We were already prepared to ignore the re-selection of 'Change
Settings' from the context menu of a window that already had a Change
Settings box open (and not accidentally create a second config box for
the same window); but now we do slightly better, by finding the
existing config box and un-minimising and raising it, in case the user
had forgotten it was there.
That's a useful featurelet, but not the main purpose of this change.
The mani point, of course, is that now the multi-window GtkApplication
based front ends now don't do anything confusing to the nesting of
gtk_main() when config boxes are involved. Whether you're changing the
settings of one (or more than one) of your already-running sessions,
preparing to start up a new PuTTY connection, or both at once, we stay
in the same top-level instance of gtk_main() and all sessions' top-
level callbacks continue to run sensibly.
This has been logically necessary in principle for ages, but we got
away without it because we just exited the program. But in the multi-
window GtkApplication front ends, we can't get away with that for
ever; we need to be able to free _one_ of our 'struct gui_data'
instances and everything dangling off it (or, at least, everything
that GTK's reference counting system doesn't clean up for us), without
also doing anything global to the process in which that gui_data is
contained.
This is used when you're about to destroy an object that is
(potentially) the context parameter for some still-pending toplevel
callback. It causes callbacks.c to go through its pending list and
delete any callback records referring to that context parameter, so
that when you destroy the object those callbacks aren't still waiting
to cause stale-pointer dereferences.
Apparently I copied that rather too literally from osxlaunch.c, where
the text about OS X and 'launcher' made more sense. The stub main in
gtkapp.c has nothing to do with launchers and OS X, so I've corrected
the wording to say that a completely different thing won't work in
completely different circumstances :-)
People who use a packaging system other than jhbuild still ought to be
able to run the OS X GTK3 build, so now the gtk-mac-bundler command
finds out the locations of things by a more portable method.
(I've had this change lurking around uncommitted in a working tree for
a while, and only just found it in the course of doing other OS X-
related work. Oops.)
Without this, the Conf objects in a session and its duplicate were
aliases of each other, which could lead to confusing semantic effects
if one of the sessions was reconfigured in mid-run, and worse still, a
crash if one session got cleaned up and called conf_free on a Conf
that the other was still using.
None of that was intentional; it was just a matter of forgetting to
clone the Conf for the duplicated session. Now we do.
Detecting that the child process in a pterm has terminated is
important for _any_ kind of pterm, so it's a mistake to put the signal
handler setup _solely_ inside the optional pty_pre_init function which
does the privileged setup and forks off a utmp watchdog process. Now
the signal handler is installed even in the GtkApplication-based
multi-window front end to pterm, meaning it will exist even on OS X.
ignore_sbar is a flag that we set while manually changing the
scrollbar settings, so that when those half-finished changes trigger
GTK event callbacks, we know to ignore them, and wait until we've
finished setting everything up before actually updating the window.
But somehow I had managed to leave the functions that actually _have
the effect_ (at least in GTK1) outside the pair of statements that set
and unset the ignore flag.
The effect was that compiling pterm for GTK1, starting it up, and
issuing a command like 'ls -l' that scrolls off the bottom of the
window would lead to the _top_ half of the ls output being visible,
and the scrollbar at the top of the scrollback rather than the bottom.
Apparently I haven't tested this compile mode in a while: I had a
couple of compile errors due to new code not properly #ifdeffed (the
true-colour mode has to be effectively disabled in the palette-based
GTK1 graphics model) and one for an unused static function
(get_monitor_geometry is only used in GTK2 and above, and with -Werror
that means I mustn't even _define_ it in GTK1).
With these changes, I still didn't get a clean compile unless I also
configured CFLAGS=-std=gnu89, due to the GTK1 headers having an
outdated set of ifdefs to figure out the compiler's semantics of
'inline'. (They seem to expect old-style gcc, which inconveniently
treats 'inline' and 'extern inline' more or less the opposite way
round from the version standardised by C99.)
My custom GTK layout class 'Columns' includes a linked list of
dynamically allocated data, and apparently I forgot to write a
destructor that frees it all when the class is deallocated, and have
never noticed until now.
While debugging some new code, I ran valgrind in leak-checking mode
and it pointed out a handful of existing memory leaks, which got in the
way of spotting any _new_ leaks I might be introducing :-)
This was one: in the case where an asynchronous agent query on Unix is
aborted, the dynamically allocated buffer holding the response was not
freed.
The new AES routines are compiled into the code on any platform where
the compiler can be made to generate the necessary AES-NI and SSE
instructions. But not every CPU will support those instructions, so
the pure-software routines haven't gone away: both sets of functions
sit side by side in the code, and at key setup time we check the CPUID
bitmap to decide which set to select.
(This reintroduces function pointers into AESContext, replacing the
ones that we managed to remove a few commits ago.)
The outer routines are the ones which handle the CBC encrypt, CBC
decrypt and SDCTR cipher modes. Previously each of those had to be
able to dispatch to one of the per-block-size core routines, which
made it worth dividing the system up into two layers. But now there's
only one set of core routines, they may as well be inlined into the
outer ones.
Also as part of this commit, the nasty undef/redef of MAKEWORD and
LASTWORD have been removed, and the different macro definitions now
have different macro _names_, to make it clearer which one is used
where.
They're not really part of AES at all, in that they were part of the
Rijndael design but not part of the subset standardised by NIST. More
relevantly, they're not used by any SSH cipher definition, so they're
just adding complexity to the code which is about to get in the way of
refactoring it.
Removing them means there's only one pair of core encrypt/decrypt
functions, so the 'encrypt' and 'decrypt' function pointer fields can
be completely removed from AESContext.
Apparently I forgot to edit that when I originally imported this AES
implementation into PuTTY's SSH code from the more generically named
source file in which I'd originally developed it.
ATTR_REVERSE was being handled in the front ends, and was causing the
foreground and background colours to be switched. (I'm not completely
sure why I made that design decision; it might be purely historical,
but then again, it might also be because reverse video is one effect
on the fg and bg colours that must still be performed even in unusual
frontend-specific situations like display-driven monochrome mode.)
This affected both explicit reverse video enabled using SGR 7, and
also the transient reverse video arising from mouse selection. Thanks
to Markus Gans for reporting the bug in the latter, which when I
investigated it turned out to affect the former as well.