I've found in the last day or two that the first thing I want to do
after any successful run of testbn is to check whether I was running
it with the right compile settings - so I should have made it easier
to find that out to begin with! Better late than never.
This makes it easier to compile in multiple debugging modes, or on
Windows, without having to constantly paste annoying test-compile
commands out of comments in sshbn.c.
The new binary is compiled into the build directory, but not shipped
by 'make install', just like fuzzterm. Unlike fuzzterm, though, testbn
is also compiled on Windows, for which we didn't already have a
mechanism for building unshipped binaries; I've done the very simplest
thing for the moment, of providing a target in Makefile.vc to delete
them.
In order to comply with the PuTTY makefile system's constraint of
never compiling the same object multiple times with different ifdefs,
I've also moved testbn's main() out into its own source file.
This commit fulfills the promise of the previous one: now one of the
branches of sshbn.h's big ifdef _doesn't_ define a BignumDblInt, and
instead provides implementations of the primitive arithmetic macros in
terms of Visual Studio's x86-64 compiler intrinsics. So now, when this
codebase is compiled with 64-bit VS, it can use a 64-bit BignumInt and
everything still seems to work.
As I mentioned in the previous commit, I'm going to want PuTTY to be
able to run sensibly when compiled with 64-bit Visual Studio,
including handling bignums in 64-bit chunks for speed. Unfortunately,
64-bit VS does not provide any type we can use as BignumDblInt in that
situation (unlike 64-bit gcc and clang, which give us __uint128_t).
The only facilities it provides are compiler intrinsics to access an
add-with-carry operation and a 64x64->128 multiplication (the latter
delivering its product in two separate 64-bit output chunks).
Hence, here's a substantial rework of the bignum code to make it
implement everything in terms of _those_ primitives, rather than
depending throughout on having BignumDblInt available to use ad-hoc.
BignumDblInt does still exist, for the moment, but now it's an
internal implementation detail of sshbn.h, only declared inside a new
set of macros implementing arithmetic primitives, and not accessible
to any code outside sshbn.h (which confirms that I really did catch
all uses of it and remove them).
The resulting code is surprisingly nice-looking, actually. You'd
expect more hassle and roundabout circumlocutions when you drop down
to using a more basic set of primitive operations, but actually, in
many cases it's turned out shorter to write things in terms of the new
BignumADC and BignumMUL macros - because almost all my uses of
BignumDblInt were implementing those operations anyway, taking several
lines at a time, and now they can do each thing in just one line.
The biggest headache was Poly1305: I wasn't able to find any sensible
way to adapt the existing Python script that generates the various
per-int-size implementations of arithmetic mod 2^130-5, and so I had
to rewrite it from scratch instead, with nothing in common with the
old version beyond a handful of comments. But even that seems to have
worked out nicely: the new version has much more legible descriptions
of the high-level algorithms, by virtue of having a 'Multiprecision'
type which wraps up the division into words, and yet Multiprecision's
range analysis allows it to automatically drop out special cases such
as multiplication by 5 being much easier than multiplication by
another multi-word integer.
DIVMOD_WORD is a portability hazard, because implementing it requires
either a way to get direct access to the x86 DIV instruction or
equivalent (be it inline assembler or a compiler intrinsic), or else
an integer type we can use as BignumDblInt. But I'm starting to think
about porting to 64-bit Visual Studio with a 64-bit BignumInt, and in
that situation neither of those options will be available.
I could write a piece of _out_-of-line x86-64 assembler in a separate
source file and put a function call in DIVMOD_WORD, but instead I've
decided to solve the problem in a more futureproof way: remove
DIVMOD_WORD totally and write a division function that doesn't need it
at all, solving not only today's porting headache but all future ones
in this area.
The new implementation works by precomputing (a good enough
approximation to) the leading word of the reciprocal of the modulus,
and then getting each word of quotient by multiplying by that
reciprocal, where we previously used DIVMOD_WORD to divide by the
leading word of the actual modulus. The reciprocal itself is computed
outside internal_mod() and passed in as a parameter, allowing me to
save time by only computing it once when I'm about to do a modpow.
To some extent this complicates the implementation: the advantage of
DIVMOD_WORD was that it yielded a full word q of quotient every time
it was used, so the subtraction of q*m from the input could be done in
a nicely word-aligned way. But the reciprocal multiply approach yields
_almost_ a full word of quotient, because you have to make the
reciprocal a bit short to avoid overflow at multiplication time. For a
start, this means we have to do fractionally more iterations of the
main loop; but more painfully, we can no longer depend on the
subtraction of q*m at every step being word-aligned, and instead we
have to be prepared to do it at any bit shift.
But the flip side is that once we've implemented that, the rest of the
algorithm becomes a lot less full of horrible special cases: in
particular, we can now completely throw away the horribleness at all
the call sites where we shift the modulus up by a fractional word to
set its top bit, and then have to do a little dance to get the last
few bits of quotient involving a second call to internal_mod.
So there are points both for and against the new implementation in
simplicity terms; but I think on balance it's more comprehensible than
the old one, and a quick timing test suggests it also ends up a touch
faster overall - the new testbn gets through the output of
testdata/bignum.py in 4.034s where the old one took 4.392s.
I'm about to rewrite the division code, so it'll be useful to have a
way to test it directly, particularly one which exercises difficult
cases such as extreme values of the leading word and remainders just
above and below zero.
Or, at least, potentially do so. The build script now has a slot into
which code-signing can be dropped by setting a variable in the bob
configuration to specify an appropriate command line.
The variable will typically need to point at a script wrapping the
actual signing tool, since there are lots of fiddly details
(timestamping countersignature, certificate, private key, etc) not
given on the command lines in this build script, on the basis that
they're local configuration questions for whoever is _running_ this
build script.
When we provide an editable text box with a drop-down list of useful
preset values, such as the one full of character sets in the
Translation panel, we implement it on GTK 2.4+ as a GtkComboBox
pointing at a two-column GtkListStore, in which the second column is
the actual text (the first being a numeric id). Therefore, we need to
set the "entry-text-column" property to tell GtkComboBox which of
those columns to look in for the value corresponding to the edit-box
text.
Thanks to Robert de Bath for spotting the problem and tracing it as
far as commit 5fa22495c. That commit replaced a widget construction
call via gtk_combo_box_entry_new_with_model() with one using the newer
gtk_combo_box_new_with_model_and_entry(), overlooking the fact that
the former provided the text column number as a parameter, and the
latter didn't.
logevent() doesn't do printf-style formatting (though the logeventf
wrapper in ssh.c does), so if you need to format a message, it has to
be done separately with dupprintf.
I'd missed out an if statement in the Unix proxy stderr code
introduced by commit 297efff30, causing ret->cmd_err to be passed to
uxsel_set even when it was -1 (which happened in the non-GUI tools).
Unfortunately, putting a negative fd into the uxsel tree has really
bad effects, because the first_fd / next_fd interface returns a
negative number to signal end-of-list - and since the uxsel tree is
sorted by fd, that happens _immediately_.
Added the missing if statement, and also an assertion to make sure we
never pass -1 to uxsel_set by mistake again!
By default Windows processes have wide open ACLs which allow interference
by other processes running as the same user. Adjust our ACL to make this
a bit harder.
Because it's useful to protect PuTTYtel as well, carve winsecur.c into
advapi functions and wincapi.c for crypt32 functions.
gtk_misc_set_alignment was deprecated in GTK 3.14. But my replacement
code using gtk_label_set_xalign doesn't work there, because that
function wasn't introduced until GTK 3.16, so there are two minor
versions in the middle where a third strategy is needed.
Thanks to Colin Harrison for spotting it very quickly. No thanks to
Visual Studio for only giving me a _warning_ when I prototyped a
function with four parameters and called it with five!
It has three settings: on, off, and 'only until session starts'. The
idea of the last one is that if you use something like 'ssh -v' as
your proxy command, you probably wanted to see the initial SSH
connection-setup messages while you were waiting to see if the
connection would be set up successfully at all, but probably _didn't_
want a slew of diagnostics from rekeys disrupting your terminal in
mid-emacs once the session had got properly under way.
Default is off, to avoid startling people used to the old behaviour. I
wonder if I should have set it more aggressively, though.
On both Unix and Windows, we now redirect the local proxy command's
standard error into a third pipe; data received from that pipe is
broken up at newlines and logged in the Event Log. So if the proxy
command emits any error messages in the course of failing to connect
to something, you now have a fighting chance of finding out what went
wrong.
This feature is disabled in command-line tools like PSFTP and Plink,
on the basis that in that situation it seems more likely that the user
would expect standard-error output to go to the ordinary standard
error in the ordinary way. Only GUI PuTTY catches it and logs it like
this, because it either doesn't have a standard error at all (on
Windows) or is likely to be pointing it at some completely unhelpful
session log file (under X).
I've defined a new value for the 'int type' parameter passed to
plug_log(), which proxy sockets will use to pass their backend
information on how the setup of their proxied connections are going.
I've implemented support for the new type code in all _nontrivial_
plug log functions (which, conveniently, are precisely the ones I just
refactored into backend_socket_log); the ones which just throw all
their log data away anyway will do that to the new code as well.
We use the new type code to log the DNS lookup and connection setup
for connecting to a networked proxy, and also to log the exact command
string sent down Telnet proxy connections (so the user can easily
debug mistakes in the configured format string) and the exact command
executed when spawning a local proxy process. (The latter was already
supported on Windows by a bodgy logging call taking advantage of
Windows in particular having no front end pointer; I've converted that
into a sensible use of the new plug_log facility, and done the same
thing on Unix.)
I'm about to want to make a change to all those functions at once, and
since they're almost identical, it seemed easiest to pull them out
into a common helper. The new source file be_misc.c is intended to
contain helper code common to all network back ends (crypto and
non-crypto, in particular), and initially it contains a
backend_socket_log() function which is the common part of ssh_log(),
telnet_log(), rlogin_log() etc.
We set up a pair of bufchains for the standard input and output
exchanged with the proxy process, but forgot to clear them when the
Local_Proxy_Socket is cleaned up.
We've always had the back-end code unconditionally print 'Looking up
host' before calling name_lookup. But name_lookup doesn't always do an
actual lookup - in cases where the connection will be proxied and
we're configured to let the proxy do the DNS for us, it just calls
sk_nonamelookup to return a dummy SockAddr with the unresolved name
still in it. It's better to print a message that varies depending on
whether we're _really_ doing DNS or not, e.g. so that people can tell
the difference between DNS failure and proxy misconfiguration.
Hence, those log messages are now generated inside name_lookup(),
which takes a couple of extra parameters for the purpose - a frontend
pointer to pass to logevent(), and a reason string so that it can say
what the hostname it's (optionally) looking up is going to be used
for. (The latter is intended for possible use in logging subsidiary
lookups for port forwarding, though the moment I haven't changed
the current setup where those connection setups aren't logged in
detail - we just pass NULL in that situation.)
There was a very old plan to flesh this out into an implementation of
SSLified Telnet, back when it looked as if that might be the winning
option for encrypted remote login. But SSH won, so that random junk in
network.h has been sitting around for decades doing nothing useful.
make_private_security_descriptor and a new function protectprocess().
protectprocess() opens the running PuTTY process and adjusts the
Everyone and user access control entries in its ACL to deny a
selection of permissions which malicious processes running as the same
user could use to hijack PuTTY.
Half the release checklist has changed recently, what with me
completely reworking the website and also writing all this release
automation. I think these are all the checklist changes needed now the
dust has settled, though of course when I do the next actual release I
expect there'll turn out to be something I missed...
I've added extra modes to release.pl which should automate the more
tedious parts of the deployment phase: uploading the release build to
all the places it needs to go, checking its integrity once it gets
there, verifying that everything can be downloaded again usefully,
checking content-types etc.
The new version should check more thoroughly (it checks the whole FTP
and HTTP download directories, so it will spot errors like failing to
update the FTP 'latest' symlink), and take fewer commands to run.
The length coming back from ber_read_id_len might have overflowed, so
treat it as potentially negative. Also, while I'm here, accumulate it
inside ber_read_id_len as an unsigned, so as to avoid undefined
behaviour on integer overflow, and toint() it before return.
Thanks to Hanno Böck for spotting this, with the aid of AFL.
The initial test for a line ending with "PRIVATE KEY-----" failed to
take into account the possibility that the line might be shorter than
that. Fixed by introducing a new library function strendswith(), and
strstartswith() for good measure, and using that.
Thanks to Hanno Böck for spotting this, with the aid of AFL.
TOOLTYPE_NONNETWORK (i.e. pterm) already has "-log" (as does Unix
PuTTY), so there's no sense suppressing the synonym "-sessionlog".
Undocumented lacunae that remain:
plink accepts -sessionlog, but does nothing with it. Arguably it should.
puttytel accepts -sshlog/-sshrawlog (and happily logs e.g. Telnet
negotiation, as does PuTTY proper).
When we set ssh->sc{cipher,mac} to s->sc{cipher,mac}_tobe
conditionally, we should be conditionalising on the values we're
_reading_, not the ones we're about to overwrite.
Thanks to Colin Harrison for this patch.
I've added a few sample shell commands in the upload procedure (mostly
so that I don't have to faff about remembering how rsync trailing
slashes work every time), and also written a script called
'release.pl', which automates the updating of the version number in
all the various places it needs to be done and also ensures the PSCP
and Plink transcripts in the docs will match the release itself.
I spotted that I've been checking that old-style Windows Help files
were delivered with content-type "application/octet-stream", but not
also checking the same thing about the marginally newer .CHM ones. (Or
at least not writing it down in the wishlist; I think I did actually
check on at least one occasion.)
It's for regression testing and fuzzing, so there's no use for it if
you're not a developer working on the source.
Leaving it out of the 'make install' target in Makefile.gtk is no
trouble because that's already handled manually in Recipe by inserting
a giant hairy Makefile fragment to do the installation. But
Makefile.am was just setting bin_PROGRAMS to the full set of binaries
built, so for that one, I had to invent a new Recipe program category
[UT] which moves a particular binary into noinst_PROGRAMS.
While I was at it, I've retired the [M] program category, which has
been lying around unused since Ben's old Mac OS pre-X port.
Apparently if you maintain a branch for a long time where you only
compile with a non-default ifdef enabled, it becomes possible to not
notice a typo you left in the default branch :-)
This brings in the rest of the 0.66 branch, including some changes new
on master.
Conflicts:
doc/plink.but
sshrsa.c
(The conflicts were both trivial: in one, the addition of an extra
parameter to rsa2_newkey on master happened on the line next to 0.66's
addition of a check for NULL return value, and in the other, I'd got
the version number in the plink -h transcript messed up on master.)
Everything up to here on the release branch is cherry-picks from
master anyway, and some of their cherry-picked forms conflict with the
current state of master due to further work, so here I'm just
recording an ancestry relation to indicate that there's nothing up to
here on 0.66 that master hasn't got.
Handles managed by winhandl.c have a 'busy' flag, which is used to
mean two things: (a) is a subthread currently blocked on this handle
so various operations in the main thread have to be deferred until it
finishes? And (b) is this handle currently one that should be returned
to the main loop to be waited for?
For HT_INPUT and HT_OUTPUT, those things are either both true or both
false, so a single flag covering both of them is fine. But HT_FOREIGN
handles have the property that they should always be waited for in the
main loop, but no subthread is blocked on them. The latter means that
operations done on them in the main thread should not be deferred; the
only such operation is cleaning them up in handle_free().
handle_free() was failing to spot this, and was deferring freeing
HT_FOREIGN handles until their subthread terminated - which of course
never happened. As a result, when a named pipe server was closed, its
actual Windows event object got destroyed, but winhandl.c still kept
passing it back to the main thread, leading to a tight loop because
MsgWaitForMultipleObjects would return ERROR_INVALID_HANDLE and never
block.
(cherry picked from commit 431f8db862)