A known_hosts line can have multiple comma-separated hostnames on it,
or more usually a hostname and an IP address.
In the RSA and DSA key handlers, I was making a list of the integer
parameters of the public key by using the 'map' function, and then
iterating over it once per hostname on the line. But in Python 3, the
'map' function returns an iterator, not a list, so after you've
iterated to its end once, it's empty, and iterating over it a second
time stops immediately. As a result, the registry line for the second
hostname was coming out empty.
(cherry picked from commit 143f8a2d10)
The comparison functions between an mp_int and an integer worked by
walking along the mp_int, comparing each of its words to the
corresponding word of the integer. When they ran out of mp_int, they'd
stop.
But this overlooks the possibility that they might not have run out of
_integer_ yet! If BIGNUM_INT_BITS is defined to be less than the size
of a uintmax_t, then comparing (say) the uintmax_t 0x8000000000000001
against a one-word mp_int containing 0x0001 would return equality,
because it would never get as far as spotting the high bit of the
integer.
Fixed by iterating up to the max of the number of BignumInts in the
mp_int and the number that cover a uintmax_t. That means we have to
use mp_word() instead of a direct array lookup to get the mp_int words
to compare against, since now the word indices might be out of range.
(cherry picked from commit 289d8873ec)
Functions like mp_copy_integer_into, mp_add_integer_into and
mp_hs_integer all take an ordinary C integer in the form of a
uintmax_t, and perform an operation between that and an mp_int. In
order to do that, they have to break it up into some number of
BignumInt, via bit shifts.
But in C, shifting by an amount equal to or greater than the width of
the type is undefined behaviour, and you risk the compiler generating
nonsense or complaining at compile time. I did various dodges in those
functions to try to avoid that, but didn't manage to use the same
idiom everywhere. Sometimes I'd leave the integer in its original form
and shift it right by increasing multiples of BIGNUM_INT_BITS;
sometimes I'd shift it down in place every time. And mostly I'd do the
conditional shift by checking against sizeof(n), but once I did it by
shifting by half the word and then the other half.
Now refactored so that there's a pair of functions to shift a
uintmax_t left or right by BIGNUM_INT_BITS in what I hope is a UB-safe
manner, and changed all the code I could find to use them.
(cherry picked from commit 3ea69c290e)
While looking over the code for other reasons, I happened to notice
that the internal function mp_add_masked_integer_into was using a
totally wrong condition to check whether it was about to do an
out-of-range right shift: it was comparing a shift count measured in
bits against BIGNUM_INT_BYTES.
The resulting bug hasn't shown up in the code so far, which I assume
is just because no caller is passing any RHS to mp_add_integer_into
bigger than about 1 or 2. And it doesn't show up in the test suite
because I hadn't tested those functions. Now I am testing them, and
the newly added test fails when built for 16-bit BignumInt if you back
out the actual fix in this commit.
(cherry picked from commit 921118dbea)
This file exports several functions defined in sshserver.h, and the
declarations weren't being type-checked against the definitions.
(cherry picked from commit 37d91aabff)
I'm not really sure why that's necessary: by my understanding of the C
standard, it shouldn't be. But my observation is that when compiling
with {Address,Leak} Sanitiser enabled, pageant --askpass can somehow
manage to exit without having actually written the passphrase to its
standard output.
(cherry picked from commit c618d6baac)
On Windows, due to a copy-paste goof, the message that should have
read "Configuring n stop bits" instead ended with "data bits".
While I'm here, I've arranged that the "1 stop bit" case of that
message is in the singular. And then I've done the same thing again on
Unix, because I noticed that message was unconditionally plural too.
(cherry picked from commit bdb7b47a5e)
The sets of poll(2) events that we check in order to return SELECT_R
and SELECT_W overlap: to be precise, they have POLLERR in common. So
if an fd signals POLLERR, then pollwrap_get_fd_rwx will respond by
saying that it has both SELECT_R and SELECT_W available on it - even
if the caller had only asked for one of those.
In other words, you can get a spurious SELECT_W notification on an fd
that you never asked for SELECT_W on in the first place. This
definitely isn't what I'd meant that API to do.
In particular, if a socket in the middle of an asynchronous connect()
signals POLLERR, then Unix Plink will call select_result for it with
SELECT_R and then SELECT_W respectively. The former will notice that
it's got an error condition and call plug_closing - and _then_ the
latter will decide that it's writable and set s->connected! The plan
was to only select it for write until it was connected, but this bug
in pollwrap was defeating that plan.
Now pollwrap_get_fd_rwx should only ever return a set of rwx flags
that's a subset of the one that the client asked for via
pollwrap_add_fd_rwx.
(cherry picked from commit 78974fce89)
Spotted by Leak Sanitiser, while I was investigating the PSFTP /
proftpd issue mentioned in the previous commit (with ASan on as
usual).
The two very similar loops that read PSFTP commands from the
interactive prompt and a batch file differed in one respect: only one
of them remembered to free the command afterwards. Now I've moved the
freeing code out into a subroutine that both loops can use.
(cherry picked from commit bf0f323fb4)
I tried to do an SFTP upload through connection sharing the other day
and found that pscp sent some data and then hung. Now I debug it, what
seems to have happened was that we were looping in sftp_recv() waiting
for an SFTP packet from the remote, but we didn't have any outstanding
SFTP requests that the remote was going to reply to. Checking further,
xfer_upload_ready() reported true, so we _could_ have sent something -
but the logic in the upload loop had a hole through which we managed
to get into 'waiting for a packet' state.
I think what must have happened is that xfer_upload_ready() reported
false so that we entered sftp_recv(), but then the event loop inside
sftp_recv() ran a toplevel callback that made xfer_upload_ready()
return true. So, the fix: sftp_recv() is our last-ditch fallback, and
we always try emptying our callback queue and rechecking upload_ready
before we resort to waiting for a remote packet.
This not only fixes the hang I observed: it also hugely improves the
upload speed. My guess is that the bug must have been preventing us
from filling our outgoing request pipeline a _lot_ - but I didn't
notice it until the one time the queue accidentally ended up empty,
rather than just sparse enough to make transfers slow.
Annoyingly, I actually considered this fix back when I was trying to
fix the proftpd issue mentioned in commit cd97b7e7e. I decided fixing
ssh_sendbuffer() was a better idea. In fact it would have been an even
better idea to do both! Oh well, better late than never.
(cherry picked from commit 3a633bed35)
This reverts commit 4634cd47f7 and
commit 43a63019f5, both of which
introduced checks at ldisc_send call sites to avoid triggering the
assertion that len != 0 inside ldisc_send. Now that assertion is gone,
it's OK to call ldisc_send without checking the buffer size.
A user reported another situation in which that assertion can fail: if
you paste text into the terminal that consists 100% of characters not
available in the CONF_line_codepage character set, then the
translation step generates the empty string as output, and that gets
passed to ldisc_send by term_paste without checking.
Previous bugs of this kind (see commits 4634cd47f7 and 43a63019f5)
were fixed by adding a check before calling ldisc_send. But in commit
4634cd47f7 I said that probably at some point the right fix would be
to remove the assertion in ldisc_send itself, so that passing len==0
becomes legal. (The assertion was there in the first place to catch
cases where len==0 was used with its obsolete special meaning of
signalling 'please update your status'.)
Well, I think it's finally time. The assertion is removed: it's now
legal again to call ldisc_send with an empty buffer, and its meaning
is no longer the archaic special thing, but the trivial one of sending
zero characters through the line discipline.
wcrtomb returns a size_t, so it's silly to immediately assign it into
an int variable. Apparently running gcc with LTO enabled points this
out as an error.
This was benign as far as I can see: the obvious risk of integer
overflow could only happen if the OS wanted to convert a single wide
character into more than 2^31 bytes, and the test of the return value
against (size_t)-1 for an error check seems to work anyway in
practice, although I suspect that's only because of implementation-
defined behaviour in gcc at the point where the size_t is narrowed to
a signed int.
Again, there was a missing #include in that file which meant that the
definition of the function was never being checked against the
declaration visible to other source files.
I'd forgotten to #include "dialog.h" in that file, which meant nothing
was checking the prototypes of the stub implementations of the dlg_*
function family against the real versions. They almost all needed a
'void *dlg' parameter updating to 'dlgparam *dp', which is a change
dating from commit 3aae1f9d76 nearly two years ago. And a handful of
them also still had 'int' that should be now have become 'bool'.
The class for general rth-root finding started off as a cube-root
finder before I generalised it, and in one part of the top-level
explanatory comment, I still referred to a subgroup having index 3
rather than index r.
Also, in a later paragraph, I seem to have said 'index' several times
where I meant the concept of 'rank' I defined in the previous
paragraph.
Mark Wooding points out that when running with the +ut flag, we close
pty_utmp_helper_pipe during pty backend setup, which causes the
previously forked helper process to terminate. If that termination
happens quickly enough, then the code later in pty_backend_create
won't have set up the SIGCHLD handler and its pipe yet, so when we get
to the main event loop, we'll fail to notice that subprocess waiting
to be reaped, and leave it lying around as a zombie.
An easy fix is to move the handler and pipe setup to before the code
that potentially closes pty_utmp_helper_pipe, so that there isn't a
race condition any more.
A user points out that in the current state of PSCP, if you have some
protocol other than SSH configured in Default Settings, then
specifying a non-saved-session hostname on the PSCP command line will
cause it to try to connect with protocol SSH but the port number from
Default Settings.
A better approach is the one used in PSFTP: we use the port number
from the saved session _if_ the protocol is also one that's known to
PSCP (i.e. SSH or bare ssh-connection), and otherwise, we reset both
to sensible values.
Now you can see exactly what pathname the backend tried to open for
the serial port, and what error code it got back from the OS when it
tried. That should help users distinguish between (for example) a
permissions problem and a typo in the filename.
Now, instead of a 'const char *' in the static data segment, error
messages returned from backend setup are dynamically allocated and
freed by the caller.
This will allow me to make the messages much more specific (including
errno values and the like). However, this commit is pure refactoring:
I've _just_ changed the allocation policy, and left all the messages
alone.
If a terminal window closed with a popup (due to a network error,
for instance) while the mouse pointer was hidden by 'Hide mouse
pointer when typing in window', the mouse pointer could remain hidden
while over the terminal window, making it hard to navigate to the
popup.
Colin reports that on betas of Ubuntu 20.04, Pango has switched to
getting its font metrics from HarfBuzz, and a side effect is
apparently that they're being returned in the full precision of
PANGO_SCALE fixed point.
Previously, Pango appears to have been returning values that were
always a whole number of pixels scaled by PANGO_SCALE. Moreover, it
looks as if it was rounding the font ascent and descent _up_ to a
whole number of pixels, rather than rounding to nearest. But our code
rounds to nearest, which means that now the same font gets allocated
fewer vertical pixels, which can be enough to cut off some ascenders
or descenders.
Pango already provides the macro PANGO_PIXELS_CEIL, so it's easy to
switch over to using it. This should arrange that any text that fits
within the font's stated ascent/descent measurements will also fit in
the character cell.
If gdk_event_get_scroll_deltas() return failure for a given
GdkEventScroll, it doesn't follow that that event has no usable
scrolling action in it at all. The fallback is to call
gdk_event_get_scroll_direction() instead, which is less precise but
still gives _something_ you can use. So in that situation, instead of
just returning false, we can fall through to the handling we use for
pre-GTK3 scroll events (which are always imprecise).
In particular, I've noticed recently that if you run GTK 3 PuTTY in
the virtual X display created by vnc4server, and connect to it using
xtightvncviewer, then scroll-wheel actions passed through from the VNC
client will cause scroll_event() to receive low-res GdkEventScroll
structures of exactly this kind. So scroll-wheel activity on the
terminal window wasn't causing a scroll in that environment, and with
this patch, it does.
In the SSH-2 connection layer, an outstanding_channel_request
structure comes with a handler to be called back with the reply
packet, when the other end sends one. But sometimes it doesn't - if
the channel begins to close before the request has been replied to -
in which case the handler function is called with a NULL packet
pointer.
The common ssh2_channel_response function that handles most of the
client-side channel requests was not prepared to cope with that
pointer being null. Fixed by making it handle a null return the same
as CHANNEL_FAILURE.
This fills in the missing piece of Windows Pageant's story on deferred
decryption: we now actually know how to put up a dialog box asking for
the passphrase, when a not-yet-decrypted key is used.
This is quite a rough implementation so far, but it's a start. Known
issues:
- these new non-modal dialog boxes are serialised with respect to
each other by the Pageant core, but they can run in parallel with a
passphrase prompt popping up from the ordinary GUI 'Add Key'
operation. That may be too confusing; perhaps I should fix it.
- I'm not confident that the passphrase dialog box gets the keyboard
focus in all situations where I'd like it to (or what I can do
about it if not).
- the text in the non-modal box has two copies of the instruction
'enter passphrase for key'.
This is unlikely in most situations, but 'psusan' in particular is
intended to be run in a lot of weird environments where things aren't
properly set up yet. I just found out that if you use a Cygwin-built
psusan as the proxy process for Windows PuTTY (to get a local Cygwin
xterm) then it starts up with SHELL unset, and uxpty's forked
subprocess segfaults when it tries to exec a null pointer.
That causes the config dialog to terminate with result -1, which
wasn't handled at all by the result-receiving code. So GTK PuTTY would
continue running its main loop even though it had no windows open and
wasn't ever planning to do anything.
I just happened to notice in a re-read of the code that we were
computing b^2-4a and feeding it to mp_sqrt to check if it was a
perfect square, without having first checked that the subtraction
didn't overflow and deliver some arbitrary large positive number when
the true mathematical value was negative.
Fortunately, if this came up at all, it would have been as a false
_negative_ in Pockle's primality verification: it might have managed
to reject a genuine prime with a valid certificate on rare occasions.
So that's not too serious. But even so, now I've spotted it, fix it.
In commit 1f399bec58 I had the idea of generating the protocol radio
buttons in the GUI configurer by looping over the backends[] array,
which gets the reliably correct list of available backends for a given
binary rather than having to second-guess. That's given me an idea: we
can do the same for the per-backend config panels too.
Now the GUI config panel for every backend is guarded by a check of
backend_vt_from_proto, and we won't display the config for that
backend unless it's present.
In particular, this allows me to move the serial-port configuration
back into config.c from the separate file sercfg.c: we find out
whether to apply it by querying backend_vt_from_proto(PROT_SERIAL),
the same as any other backend.
In _particular_ particular, that also makes it much easier for me to
move the serial config up the pecking order, so that it's now second
only to SSH in the list of per-protocol config panes, which I think is
now where it deserves to be.
(A side effect of that is that I now have to come up with a different
method of having each serial backend specify the subset of parity and
flow control schemes it supports. I've done it by adding an extra pair
of serial-port specific bitmask fields to BackendVtable, taking
advantage of the new vtable definition idiom to avoid having to
boringly declare them as zero in all the other backends.)
This is a sweeping change applied across the whole code base by a spot
of Emacs Lisp. Now, everywhere I declare a vtable filled with function
pointers (and the occasional const data member), all the members of
the vtable structure are initialised by name using the '.fieldname =
value' syntax introduced in C99.
We were already using this syntax for a handful of things in the new
key-generation progress report system, so it's not new to the code
base as a whole.
The advantage is that now, when a vtable only declares a subset of the
available fields, I can initialise the rest to NULL or zero just by
leaving them out. This is most dramatic in a couple of the outlying
vtables in things like psocks (which has a ConnectionLayerVtable
containing only one non-NULL method), but less dramatically, it means
that the new 'flags' field in BackendVtable can be completely left out
of every backend definition except for the SUPDUP one which defines it
to a nonzero value. Similarly, the test_for_upstream method only used
by SSH doesn't have to be mentioned in the rest of the backends;
network Plugs for listening sockets don't have to explicitly null out
'receive' and 'sent', and vice versa for 'accepting', and so on.
While I'm at it, I've normalised the declarations so they don't use
the unnecessarily verbose 'struct' keyword. Also a handful of them
weren't const; now they are.
The contribution of SUPDUP has pushed SSH off the first page in my
default GTK font settings. It probably won't do that the same way for
everyone, but it did make me realise that the [Telnet, Rlogin, SSH]
order was a relic of 1997 when the SSH backend was new and added at
the bottom of the list. These days, SSH ought to be at the top for
easy access!
(And Serial ought to be next, but since it's added in a separate file,
that involves a bit more faffing about which I haven't done yet.)
The motivation is for the SUPDUP protocol. The server may send a
signal for the terminal to reset any input buffers. After this, the
server will not know the state of the terminal, so it is required to
send its cursor position back.
A known_hosts line can have multiple comma-separated hostnames on it,
or more usually a hostname and an IP address.
In the RSA and DSA key handlers, I was making a list of the integer
parameters of the public key by using the 'map' function, and then
iterating over it once per hostname on the line. But in Python 3, the
'map' function returns an iterator, not a list, so after you've
iterated to its end once, it's empty, and iterating over it a second
time stops immediately. As a result, the registry line for the second
hostname was coming out empty.
The testcrypt protocol expects a string literal to be a concatenation
of literal bytes other than '%' and '\n', and %-escaped hex digit
pairs. But testcrypt.py was only ever using the latter format, so even
a legible ASCII string like "123" was being sent to testcrypt as the
unreadable and needlessly long "%31%32%33".
When debugging, I often arrange to save the testcrypt input stream to
a file, and sometimes I use that file as the starting point for
editing. So it is actually useful to have the protocol exchange be
legible to humans. Hence, here's a change to testcrypt.py which makes
it only use the %-escape encoding for byte values that aren't
printable ASCII.
A 'strong' prime, as defined by the Handbook of Applied Cryptography,
is a prime p such that each of p-1 and p+1 has a large prime factor,
and that the large factor q of p-1 is such that q-1 in turn _also_ has
a large prime factor.
HoAC says that making your RSA key using primes of this form defeats
some factoring algorithms - but there are other faster algorithms to
which it makes no difference. So this is probably not a useful
precaution in practice. However, it has been recommended in the past
by some official standards, and it's easy to implement given the new
general facility in PrimeCandidateSource that lets you ask for your
prime to satisfy an arbitrary modular congruence. (And HoAC also says
there's no particular reason _not_ to use strong primes.) So I provide
it as an option, just in case anyone wants to select it.
The change to the key generation algorithm is entirely in sshrsag.c,
and is neatly independent of the prime-generation system in use. If
you're using Maurer provable prime generation, then the known factor q
of p-1 can be used to help certify p, and the one for q-1 to help with
q in turn; if you switch to probabilistic prime generation then you
still get an RSA key with the right structure, except that every time
the definition says 'prime factor' you just append '(probably)'.
(The probabilistic version of this procedure is described as 'Gordon's
algorithm' in HoAC section 4.4.2.)
Since our prime-generation code contains facilities not used by the
main key generators - Sophie Germain primes, user-specified modular
congruences, and MPU certificate output - it's probably going to be
useful sooner or later to have a command-line tool to access those
facilities. So here's a simple script that glues a Python argparse
interface on to the front of it all.
It would be nice to put this in 'contrib' rather than 'test', on the
grounds that it's at least potentially useful for purposes other than
testing PuTTY during development. But it's a client of the testcrypt
system, so it can't live anywhere other than the same directory as
testcrypt.py without me first having to do a lot of faffing about with
Python module organisation. So it can live here for the moment.
If you want to generate a Sophie Germain / safe prime pair with this
code, then after you've made p, you need to test the primality of
2p+1.
The easiest way to do that is to make a PrimeCandidateSource that is
so constrained as to only be able to deliver 2p+1 as a candidate, and
then run the ordinary prime generation system. The problem is that the
prime generation loops forever, so if 2p+1 _isn't_ prime, it will just
keep testing the same number over and over again and failing the test.
To solve this, I add a 'one-shot mode' to the PrimeCandidateSource
itself, which will cause it to return NULL if asked to generate a
second candidate. Then the prime-generation loops all detect that and
return NULL in turn. However, for clients that _don't_ set a pcs into
one-shot mode, the API remains unchanged: pcs_generate will not return
until it's found a prime (by its own criteria).
This feels like a bit of a bodge, API-wise. But the other two obvious
approaches turn out more awkward.
One option is to extract the Pockle from the PrimeGenerationContext
and use that to directly test primality of 2p+1 based on p - but that
way you don't get to _probabilistically_ generate safe primes, because
that kind of PGC has no Pockle in the first place. (And you can't make
one separately, because you can't convince it that an only
probabilistically generated p is prime!)
Another option is to add a test() method to PrimeGenerationPolicy,
that sits alongside generate(). Then, having generated p, you just
_test_ 2p+1. But then in the provable case you have to explain to it
that it should use p as part of the proof, so that API would get
awkward in its own way.
So this is actually the least disruptive way to do it!