The upcoming PRNG revamp will want to tell noise sources apart, so
that it can treat them all fairly. So I've added an extra parameter to
noise_ultralight and random_add_noise, which takes values in an
enumeration covering all the vague classes of entropy source I'm
collecting. In this commit, though, it's simply ignored.
This is the commit that f3295e0fb _should_ have been. Yesterday I just
added some typedefs so that I didn't have to wear out my fingers
typing 'struct' in new code, but what I ought to have done is to move
all the typedefs into defs.h with the rest, and then go through
cleaning up the legacy 'struct's all through the existing code.
But I was mostly trying to concentrate on getting the test suite
finished, so I just did the minimum. Now it's time to come back and do
it better.
Taking a leaf out of the LLVM code base: this macro still includes an
assert(false) so that the message will show up in a typical build, but
it follows it up with a call to a function explicitly marked as no-
return.
So this ought to do a better job of convincing compilers that once a
code path hits this function it _really doesn't_ have to still faff
about with making up a bogus return value or filling in a variable
that 'might be used uninitialised' in the following code that won't be
reached anyway.
I've gone through the existing code looking for the assert(false) /
assert(0) idiom and replaced all the ones I found with the new macro,
which also meant I could remove a few pointless return statements and
variable initialisations that I'd already had to put in to placate
compiler front ends.
This fixes a batch of clang-analyzer warnings of the form 'you
declared / assigned this variable and then never use it'. It doesn't
fix _all_ of them - some are there so that when I add code in the
future _it_ can use the variable without me having to remember to
start setting it - but these are the ones I thought it would make the
code better instead of worse to fix.
This is another cleanup I felt a need for while I was doing
boolification. If you define a function or variable in one .c file and
declare it extern in another, then nothing will check you haven't got
the types of the two declarations mismatched - so when you're
_changing_ the type, it's a pain to make sure you've caught all the
copies of it.
It's better to put all those extern declarations in header files, so
that the declaration in the header is also in scope for the
definition. Then the compiler will complain if they don't match, which
is what I want.
My normal habit these days, in new code, is to treat int and bool as
_almost_ completely separate types. I'm still willing to use C's
implicit test for zero on an integer (e.g. 'if (!blob.len)' is fine,
no need to spell it out as blob.len != 0), but generally, if a
variable is going to be conceptually a boolean, I like to declare it
bool and assign to it using 'true' or 'false' rather than 0 or 1.
PuTTY is an exception, because it predates the C99 bool, and I've
stuck to its existing coding style even when adding new code to it.
But it's been annoying me more and more, so now that I've decided C99
bool is an acceptable thing to require from our toolchain in the first
place, here's a quite thorough trawl through the source doing
'boolification'. Many variables and function parameters are now typed
as bool rather than int; many assignments of 0 or 1 to those variables
are now spelled 'true' or 'false'.
I managed this thorough conversion with the help of a custom clang
plugin that I wrote to trawl the AST and apply heuristics to point out
where things might want changing. So I've even managed to do a decent
job on parts of the code I haven't looked at in years!
To make the plugin's work easier, I pushed platform front ends
generally in the direction of using standard 'bool' in preference to
platform-specific boolean types like Windows BOOL or GTK's gboolean;
I've left the platform booleans in places they _have_ to be for the
platform APIs to work right, but variables only used by my own code
have been converted wherever I found them.
In a few places there are int values that look very like booleans in
_most_ of the places they're used, but have a rarely-used third value,
or a distinction between different nonzero values that most users
don't care about. In these cases, I've _removed_ uses of 'true' and
'false' for the return values, to emphasise that there's something
more subtle going on than a simple boolean answer:
- the 'multisel' field in dialog.h's list box structure, for which
the GTK front end in particular recognises a difference between 1
and 2 but nearly everything else treats as boolean
- the 'urgent' parameter to plug_receive, where 1 vs 2 tells you
something about the specific location of the urgent pointer, but
most clients only care about 0 vs 'something nonzero'
- the return value of wc_match, where -1 indicates a syntax error in
the wildcard.
- the return values from SSH-1 RSA-key loading functions, which use
-1 for 'wrong passphrase' and 0 for all other failures (so any
caller which already knows it's not loading an _encrypted private_
key can treat them as boolean)
- term->esc_query, and the 'query' parameter in toggle_mode in
terminal.c, which _usually_ hold 0 for ESC[123h or 1 for ESC[?123h,
but can also hold -1 for some other intervening character that we
don't support.
In a few places there's an integer that I haven't turned into a bool
even though it really _can_ only take values 0 or 1 (and, as above,
tried to make the call sites consistent in not calling those values
true and false), on the grounds that I thought it would make it more
confusing to imply that the 0 value was in some sense 'negative' or
bad and the 1 positive or good:
- the return value of plug_accepting uses the POSIXish convention of
0=success and nonzero=error; I think if I made it bool then I'd
also want to reverse its sense, and that's a job for a separate
piece of work.
- the 'screen' parameter to lineptr() in terminal.c, where 0 and 1
represent the default and alternate screens. There's no obvious
reason why one of those should be considered 'true' or 'positive'
or 'success' - they're just indices - so I've left it as int.
ssh_scp_recv had particularly confusing semantics for its previous int
return value: its call sites used '<= 0' to check for error, but it
never actually returned a negative number, just 0 or 1. Now the
function and its call sites agree that it's a bool.
In a couple of places I've renamed variables called 'ret', because I
don't like that name any more - it's unclear whether it means the
return value (in preparation) for the _containing_ function or the
return value received from a subroutine call, and occasionally I've
accidentally used the same variable for both and introduced a bug. So
where one of those got in my way, I've renamed it to 'toret' or 'retd'
(the latter short for 'returned') in line with my usual modern
practice, but I haven't done a thorough job of finding all of them.
Finally, one amusing side effect of doing this is that I've had to
separate quite a few chained assignments. It used to be perfectly fine
to write 'a = b = c = TRUE' when a,b,c were int and TRUE was just a
the 'true' defined by stdbool.h, that idiom provokes a warning from
gcc: 'suggest parentheses around assignment used as truth value'!
I think this is the full set of things that ought logically to be
boolean.
One annoyance is that quite a few radio-button controls in config.c
address Conf fields that are now bool rather than int, which means
that the shared handler function can't just access them all with
conf_{get,set}_int. Rather than back out the rigorous separation of
int and bool in conf.c itself, I've just added a similar alternative
handler function for the bool-typed ones.
This commit includes <stdbool.h> from defs.h and deletes my
traditional definitions of TRUE and FALSE, but other than that, it's a
100% mechanical search-and-replace transforming all uses of TRUE and
FALSE into the C99-standardised lowercase spellings.
No actual types are changed in this commit; that will come next. This
is just getting the noise out of the way, so that subsequent commits
can have a higher proportion of signal.
ssh2connection.c now knows how to unmarshal the message formats for
all the channel requests we'll need to handle when we're the server
and a client sends them. Each one is translated into a call to a new
method in the Channel vtable, which is implemented by a trivial
'always fail' routine in every channel type we know about so far.
The code in Pageant that sets up the Unix socket and its containing
directory now lives in a separate file, uxagentsock.c, where it will
also be callable from the upcoming new SSH server when it wants to
create a similar socket for agent forwarding.
While I'm at it, I've also added a feature to create a watchdog
subprocess that will try to clean up the socket and directory once
Pageant itself terminates, in the hope of leaving less cruft lying
around /tmp.
Some kinds of channel, even after they've sent EOF in both directions,
still have something to do before they initiate the CLOSE mechanism
and wind up the channel completely. For example, a session channel
with a subprocess running inside it will want to be sure to send the
"exit-status" or "exit-signal" notification, even if that happens
after bidirectional EOF of the data channels.
Previously, the SSH-2 connection layer had the standard policy that
once EOF had been both sent and received, it would start the final
close procedure. There's a method chan_want_close() by which a Channel
could vary this policy in one direction, by indicating that it wanted
the close procedure to commence after EOF was sent in only one
direction. Its parameters are a pair of booleans saying whether EOF
has been sent, and whether it's been received.
Now chan_want_close can vary the policy in the other direction as
well: if it returns FALSE even when _both_ parameters are true, the
connection layer will honour that, and not send CHANNEL_CLOSE. If it
does that, the Channel is responsible for indicating when it _does_
want close later, by calling sshfwd_initiate_close.
Each of the new subroutines corresponds to one of the channel types
for which we know how to parse a CHANNEL_OPEN, and has a collection of
parameters corresponding to the fields of that message structure.
ssh2_connection_filter_queue now confines itself to parsing the
message, calling one of those functions, and constructing an
appropriate reply message if any.
Instead of the central code in ssh2_connection_filter_queue doing both
the job of parsing the channel request and deciding whether it's
acceptable, each Channel vtable now has a method for every channel
request type we recognise.
This is a new vtable-based abstraction which is passed to a backend in
place of Frontend, and it implements only the subset of the Frontend
functions needed by a backend. (Many other Frontend functions still
exist, notably the wide range of things called by terminal.c providing
platform-independent operations on the GUI terminal window.)
The purpose of making it a vtable is that this opens up the
possibility of creating a backend as an internal implementation detail
of some other activity, by providing just that one backend with a
custom Seat that implements the methods differently.
For example, this refactoring should make it feasible to directly
implement an SSH proxy type, aka the 'jump host' feature supported by
OpenSSH, aka 'open a secondary SSH session in MAINCHAN_DIRECT_TCP
mode, and then expose the main channel of that as the Socket for the
primary connection'. (Which of course you can already do by spawning
'plink -nc' as a separate proxy process, but this would permit it in
the _same_ process without anything getting confused.)
I've centralised a full set of stub methods in misc.c for the new
abstraction, which allows me to get rid of several annoying stubs in
the previous code. Also, while I'm here, I've moved a lot of
duplicated modalfatalbox() type functions from application main
program files into wincons.c / uxcons.c, which I think saves
duplication overall. (A minor visible effect is that the prefixes on
those console-based fatal error messages will now be more consistent
between applications.)
LogContext is now the owner of the logevent() function that back ends
and so forth are constantly calling. Previously, logevent was owned by
the Frontend, which would store the message into its list for the GUI
Event Log dialog (or print it to standard error, or whatever) and then
pass it _back_ to LogContext to write to the currently open log file.
Now it's the other way round: LogContext gets the message from the
back end first, writes it to its log file if it feels so inclined, and
communicates it back to the front end.
This means that lots of parts of the back end system no longer need to
have a pointer to a full-on Frontend; the only thing they needed it
for was logging, so now they just have a LogContext (which many of
them had to have anyway, e.g. for logging SSH packets or session
traffic).
LogContext itself also doesn't get a full Frontend pointer any more:
it now talks back to the front end via a little vtable of its own
called LogPolicy, which contains the method that passes Event Log
entries through, the old askappend() function that decides whether to
truncate a pre-existing log file, and an emergency function for
printing an especially prominent message if the log file can't be
created. One minor nice effect of this is that console and GUI apps
can implement that last function subtly differently, so that Unix
console apps can write it with a plain \n instead of the \r\n
(harmless but inelegant) that the old centralised implementation
generated.
One other consequence of this is that the LogContext has to be
provided to backend_init() so that it's available to backends from the
instant of creation, rather than being provided via a separate API
call a couple of function calls later, because backends have typically
started doing things that need logging (like making network
connections) before the call to backend_provide_logctx. Fortunately,
there's no case in the whole code base where we don't already have
logctx by the time we make a backend (so I don't actually remember why
I ever delayed providing one). So that shortens the backend API by one
function, which is always nice.
While I'm tidying up, I've also moved the printf-style logeventf() and
the handy logevent_and_free() into logging.c, instead of having copies
of them scattered around other places. This has also let me remove
some stub functions from a couple of outlying applications like
Pageant. Finally, I've removed the pointless "_tag" at the end of
LogContext's official struct name.
I think that means that _every_ one of my traitoids is now a struct
containing a vtable pointer as one of its fields (albeit sometimes the
only field), and never just a bare pointer.
All the main backend structures - Ssh, Telnet, Pty, Serial etc - now
describe structure types themselves rather than pointers to them. The
same goes for the codebase-wide trait types Socket and Plug, and the
supporting types SockAddr and Pinger.
All those things that were typedefed as pointers are older types; the
newer ones have the explicit * at the point of use, because that's
what I now seem to be preferring. But whichever one of those is
better, inconsistently using a mixture of the two styles is worse, so
let's make everything consistent.
A few types are still implicitly pointers, such as Bignum and some of
the GSSAPI types; generally this is either because they have to be
void *, or because they're typedefed differently on different
platforms and aren't always pointers at all. Can't be helped. But I've
got rid of the main ones, at least.
Clients outside ssh.c - all implementations of Channel - will now not
see the ssh_channel data type itself, but only a subobject of the
interface type SshChannel. All the sshfwd_* functions have become
methods in that interface type's vtable (though, wrapped in the usual
kind of macros, the call sites look identical).
This paves the way for me to split up the SSH-1 and SSH-2 connection
layers and have each one lay out its channel bookkeeping structure as
it sees fit; as long as they each provide an implementation of the
sshfwd_ method family, the types behind that need not look different.
A minor good effect of this is that the sshfwd_ methods are no longer
global symbols, so they don't have to be stubbed in Unix Pageant to
get it to compile.
This was a particularly confusing piece of type-danger, because three
different types were passed outside sshshare.c as 'void *' and only
human vigilance prevented one coming back as the wrong one. Now they
all keep their opaque structure tags when they move through other
parts of the code.
There's now an interface called 'Channel', which handles the local
side of an SSH connection-layer channel, in terms of knowing where to
send incoming channel data to, whether to close the channel, etc.
Channel and the previous 'struct ssh_channel' mutually refer. The
latter contains all the SSH-specific parts, and as much of the common
logic as possible: in particular, Channel doesn't have to know
anything about SSH packet formats, or which SSH protocol version is in
use, or deal with all the fiddly stuff about window sizes - with the
exception that x11fwd.c's implementation of it does have to be able to
ask for a small fixed initial window size for the bodgy system that
distinguishes upstream from downstream X forwardings.
I've taken the opportunity to move the code implementing the detailed
behaviour of agent forwarding out of ssh.c, now that all of it is on
the far side of a uniform interface. (This also means that if I later
implement agent forwarding directly to a Unix socket as an
alternative, it'll be a matter of changing just the one call to
agentf_new() that makes the Channel to plug into a forwarding.)
This is another major source of unexplained 'void *' parameters
throughout the code.
In particular, the currently unused testback.c actually gave the wrong
pointer type to its internal store of the frontend handle - it cast
the input void * to a Terminal *, from which it got implicitly cast
back again when calling from_backend, and nobody noticed. Now it uses
the right type internally as well as externally.
Same principle again - the more of these structures have globally
visible tags (even if the structure contents are still opaque in most
places), the fewer of them I can mistake for each other.
This parameter returned a substring of the input, which was used for
two purposes. Firstly, it was used to hash the host and server keys
during the initial SSH-1 key setup phase; secondly, it was used to
check the keys in Pageant against the public key blob of a key
specified on the command line.
Unfortunately, those two purposes didn't agree! The first one needs
just the bare key modulus bytes (without even the SSH-1 mpint length
header); the second needs the entire key blob. So, actually, it seems
to have never worked in SSH-1 to say 'putty -i keyfile' and have PuTTY
find that key in Pageant and not have to ask for the passphrase to
decrypt the version on disk.
Fixed by removing that parameter completely, which simplifies all the
_other_ call sites, and replacing it by custom code in those two
places that each does the actually right thing.
There are several old functions that the previous commits have removed
all, or nearly all, of the references to. match_ssh_id is superseded
by ptrlen_eq_string; get_ssh_{string,uint32} is yet another replicated
set of decode functions (this time _partly_ centralised into misc.c);
the old APIs for the SSH-1 RSA decode functions are gone (together
with their last couple of holdout clients), as are
ssh{1,2}_{read,write}_bignum and ssh{1,2}_bignum_length.
Particularly odd was the use of ssh1_{read,write}_bignum in the SSH-2
Diffie-Hellman implementation. I'd completely forgotten I did that!
Now replaced with a raw bignum_from_bytes, which is simpler anyway.
Now I've got FROMFIELD, I can rework it so that structures providing
an implementation of the Socket or Plug trait no longer have to have
the vtable pointer as the very first thing in the structure. In
particular, this means that the ProxySocket structure can now directly
implement _both_ the Socket and Plug traits, which is always
_logically_ how it's worked, but previously it had to be implemented
via two separate structs linked to each other.
This is a cleanup I started to notice a need for during the BinarySink
work. It removes a lot of faffing about casting things to char * or
unsigned char * so that some API will accept them, even though lots of
such APIs really take a plain 'block of raw binary data' argument and
don't care what C thinks the signedness of that data might be - they
may well reinterpret it back and forth internally.
So I've tried to arrange for all the function call APIs that ought to
have a void * (or const void *) to have one, and those that need to do
pointer arithmetic on the parameter internally can cast it back at the
top of the function. That saves endless ad-hoc casts at the call
sites.
This affects all the functions that generate public and private key
and signature blobs of all kinds, plus ssh_ecdhkex_getpublic. Instead
of returning a bare block of memory and taking an extra 'int *length'
parameter, all these functions now write to a BinarySink, and it's the
caller's job to have prepared an appropriate one where they want the
output to go (usually a strbuf).
The main value of this change is that those blob-generation functions
were chock full of ad-hoc length-counting and data marshalling. You
have only to look at rsa2_{public,private}_blob, for example, to see
the kind of thing I was keen to get rid of!
Lots of functions had really generic names (like 'makekey'), or names
that missed out an important concept (like 'rsakey_pubblob', which
loads a public blob from a _file_ and doesn't generate it from an
in-memory representation at all). Also, the opaque 'int order' that
distinguishes the two formats of public key blob is now a mnemonic
enumeration, and while I'm at it, rsa_ssh1_public_blob takes one of
those as an extra argument.
NFC: this is a preliminary refactoring, intended to make my life
easier when I start changing around the APIs used to pass user
keyboard input around. The fewer functions even _have_ such an API,
the less I'll have to do at that point.
Since Pageant contains its own passphrase prompt system rather than
delegating it to another process, it's not trivial to use it in other
contexts. But having gone to the effort of coming up with my own
askpass system that (I think) does a better job at not revealing the
length of the password, I _want_ to use it in other contexts where a
GUI passphrase or password prompt is needed. Solution: an --askpass
option.
Mostly for debugging purposes, because I'm tired of having to use
'setsid' to force Pageant to select the GUI passphrase prompt when I'm
trying to fix bugs in gtkask.c. But I can also imagine situations in
which the ability to force a GUI prompt window might be useful to end
users, for example if the process does _technically_ have a
controlling terminal but it's not a user-visible one (say, in the back
end of some automation tool like expect(1)).
For symmetry, I also provide an option to force the tty prompt. That's
less obviously useful, because that's already the preferred prompt
type when both methods are available - so the only use for it would be
if you wanted to ensure that Pageant didn't _accidentally_ try to
launch a GUI prompt, and aborted with an error if it couldn't use a
tty prompt.
It's an incoherent concept! There should not be any such thing as an
error box that terminates the entire program but is not modal. If it's
bad enough to terminate the whole program, i.e. _all_ currently live
connections, then there's no point in permitting progress to continue
in windows other than the affected one, because all windows are
affected anyway.
So all previous uses of fatalbox() have become modalfatalbox(), except
those which looked to me as if they shouldn't have been fatal in the
first place, e.g. lingering pieces of error handling in winnet.c which
ought to have had the severity of 'give up on this particular Socket
and close it' rather than 'give up on the ENTIRE UNIVERSE'.
When called with -V to ask for our version, return 0 rather than 1.
This is the usual behaviour observed by ssh(1) and other Unix commands.
Also use exit() rather than cleanup_exit() in pscp.c and psftp.c ; at
this point we have nothing to cleanup!
If we try to interpret a string argument as the name of a key file,
sometimes we it's in circumstances where we _know_ it's a key file, so
we must print an error message and return failure if the file can't be
loaded. Other times it's not, and we just fall back to interpreting
the argument in some other way (e.g. as a pattern match against the
comment or fingerprint of a key already in the agent).
My code dealing with failure returns from the public-key loading
functions were mishandling the latter case, if they identified a file
as existing and looking more or less like some kind of key file but
then it turned out to have a format error; they would try to copy and
return a public key that they didn't actually have. Even if
pageant_pubkey_copy avoided crashing as a result, this would still
inhibit the fallback to treating the input string as some other kind
of pattern match.
Thanks, Coverity - I must have been lucky that Unix Pageant in client
mode hasn't so far happened to have this field come out non-NULL, or
else pageant_pubkey_copy would have tried to dupstr a garbage pointer.
This shows the build platform (32- vs 64-bit in particular, and also
whether Unix GTK builds were compiled with or without the X11 pieces),
what compiler was used to build the binary, and any interesting build
options that might have been set on the make command line (especially,
but not limited to, the security-damaging ones like NO_SECURITY or
UNPROTECT). This will probably be useful all over the place, but in
particular it should allow the different Windows binaries to be told
apart!
Commits 21101c739 and 2eb952ca3 laid the groundwork for this, by
allowing the various About boxes to contain free text and also
ensuring they could be copied and pasted easily as part of a bug
report.
It has three settings: on, off, and 'only until session starts'. The
idea of the last one is that if you use something like 'ssh -v' as
your proxy command, you probably wanted to see the initial SSH
connection-setup messages while you were waiting to see if the
connection would be set up successfully at all, but probably _didn't_
want a slew of diagnostics from rekeys disrupting your terminal in
mid-emacs once the session had got properly under way.
Default is off, to avoid startling people used to the old behaviour. I
wonder if I should have set it more aggressively, though.
In case a front end needs to store more than an integer id to be
returned to uxsel_input_remove, we now return a pointer to a
frontend-defined structure.
Having found a lot of unfixed constness issues in recent development,
I thought perhaps it was time to get proactive, so I compiled the
whole codebase with -Wwrite-strings. That turned up a huge load of
const problems, which I've fixed in this commit: the Unix build now
goes cleanly through with -Wwrite-strings, and the Windows build is as
close as I could get it (there are some lingering issues due to
occasional Windows API functions like AcquireCredentialsHandle not
having the right constness).
Notable fallout beyond the purely mechanical changing of types:
- the stuff saved by cmdline_save_param() is now explicitly
dupstr()ed, and freed in cmdline_run_saved.
- I couldn't make both string arguments to cmdline_process_param()
const, because it intentionally writes to one of them in the case
where it's the argument to -pw (in the vain hope of being at least
slightly friendly to 'ps'), so elsewhere I had to temporarily
dupstr() something for the sake of passing it to that function
- I had to invent a silly parallel version of const_cmp() so I could
pass const string literals in to lookup functions.
- stripslashes() in pscp.c and psftp.c has the annoying strchr nature
I've written my own analogue of OpenSSH's ssh-askpass. At the moment,
it's contained inside Pageant proper, though it could easily be
compiled into a standalone binary as well or instead.
Unlike OpenSSH's version, I don't use a GTK edit box; instead I just
process key events myself and append them to a buffer. The big
advantage of doing this is that I can arrange for ^W and ^U to
function as they do in terminal line editing, i.e. delete a word or
delete the whole line.
^W in particular is really valuable when typing a multiple-word
passphrase unseen. If you feel yourself making the kind of typo in
which you're not sure if you pressed six keys or just five, you can
hit ^W and restart just that word, without either having to go right
back to the beginning or carry on and see if you feel lucky.
A delete-word function would of course be an information leak in even
an obscured edit box (displaying a blob per character), so instead I
give a visual acknowledgment of keypresses by a more ad-hoc means: I
display three lights in the box, and every meaningful keypress turns
off the currently active one and instead turns on a randomly selected
one of the others. (So the lit light doesn't even indicate _mod 3_ how
many keys have been pressed.)