I just ran into a bug in which the testcrypt child process was cleanly
terminated, but at least one Python object was left lying around
containing the identifier of a testcrypt object that had never been
freed. On program exit, the Python reference count on that object went
to zero, the __del__ method was invoked, and childprocess.funcall
started a _new_ instance of testcrypt just so it could tell it to free
the object identifier - which, of course, the new testcrypt had never
heard of!
We can already tell the difference between a ChildProcess object which
has no subprocess because it hasn't yet been started, and one which
has no subprocess because it's terminated: the latter has exitstatus
set to something other than None. So now we enforce by assertion that
we don't ever restart the child process, and the __del__ method avoids
doing anything if the child has already finished.
When I'm writing Python using the testcrypt API, I keep finding that I
instinctively try to call vtable methods as if they were actual
methods of the object. For example, calling key.sign(msg, 0) instead
of ssh_key_sign(key, msg, 0).
So this change to the Python side of the testcrypt mechanism panders
to my inappropriate finger-macros by making them work! The idea is
that I define a set of pairs (type, prefix), such that any function
whose name begins with the prefix and whose first argument is of that
type will be automatically translated into a method on the Python
object wrapping a testcrypt value of that type. For example, any
function of the form ssh_key_foo(val_ssh_key, other args) will
automatically be exposed as a method key.foo(other args), simply
because (val_ssh_key, "ssh_key_") appears in the translation table.
This is particularly nice for the Python 3 REPL, which will let me
tab-complete the right set of method names by knowing the type I'm
trying to invoke one on. I haven't decided yet whether I want to
switch to using it throughout cryptsuite.py.
For namespace-cleanness, I've also renamed all the existing attributes
of the Python Value class wrapper so that they start with '_', to
leave the space of sensible names clear for the new OOish methods.
Also spelled '-O text', this takes a public or private key as input,
and produces on standard output a dump of all the actual numbers
involved in the key: the exponent and modulus for RSA, the p,q,g,y
parameters for DSA, the affine x and y coordinates of the public
elliptic curve point for ECC keys, and all the extra bits and pieces
in the private keys too.
Partly I expect this to be useful to me for debugging: I've had to
paste key files a few too many times through base64 decoders and hex
dump tools, then manually decode SSH marshalling and paste the result
into the Python REPL to get an integer object. Now I should be able to
get _straight_ to text I can paste into Python.
But also, it's a way that other applications can use the key
generator: if you need to generate, say, an RSA key in some format I
don't support (I've recently heard of an XML-based one, for example),
then you can run 'puttygen -t rsa --dump' and have it print the
elements of a freshly generated keypair on standard output, and then
all you have to do is understand the output format.
I accepted both 'int' and 'uint' as function argument types, but
hadn't previously noticed that only 'uint' is handled properly as a
return type. Now both are.
Previously, if the testcrypt subprocess suffered any kind of crash or
assertion failure during a run of the Python-based test system, the
effect would be that ChildProcess.read_line() would get EOF, ignore
it, and silently return the empty string. Then it would carry on doing
that for the rest of the program, leading to a long string of error
reports in tests that were nowhere near the code that actually caused
the crash.
Now ChildProcess.read_line() detects EOF and raises an exception, so
that the test suite won't heedlessly carry on trying to do things once
it's noticed that its subprocess has gone away.
This is more fiddly than it sounds, however, because of the wrinkle
that sometimes that function can be called while a Python __del__
method is asking testcrypt to free something. If that happens, the
exception can't be propagated out of the __del__ (analogously to the
rule that it's a really terrible idea for C++ destructors to throw).
So you get an annoying warning message on standard error, and then the
next command sent to testcrypt will be back in the same position.
Worse still, this can also happen if testcrypt has _already_ crashed,
because the __del__ methods will still run.
To protect against _that_, ChildProcess caches the exception after
throwing it, and then each subsequent write_line() will rethrow it.
And __del__ catches and explicitly ignores the exception (to avoid the
annoying warning if Python has to do the same).
The combined result should be that if testcrypt crashes in normal
(non-__del__) context, we should get a single exception that
terminates the run cleanly without cascade failures, and whose
backtrace localises the problem to the actual operation that caused
the crash. If testcrypt crashes in __del__, we can't quite do that
well, but we can still terminate with an exception at the next
opportunity, avoiding multiple cascade failures.
Also in this commit, I've got rid of the try-finally in
cryptsuite.py's (trivial) main program.
Now we only run the final memory-leak check if we didn't already have
some other error to report, or some other exception that terminated
the process.
Also, we wait for the subprocess to terminate before returning control
to the shell, so that any last-minute complaints from Leak Sanitiser
appear before rather than after the shell prompt comes back.
While I'm here, I've also made check_return_status tolerate the case
in which the child process never got started at all. That way, if a
failure manages to occur before even getting _that_ far, there won't
be a cascade failure from check_return_status getting confused
afterwards.
The aim of this reorganisation is to make it easier to test all the
ciphers in PuTTY in a uniform way. It was inconvenient that there were
two separate vtable systems for the ciphers used in SSH-1 and SSH-2
with different functionality.
Now there's only one type, called ssh_cipher. But really it's the old
ssh2_cipher, just renamed: I haven't made any changes to the API on
the SSH-2 side. Instead, I've removed ssh1_cipher completely, and
adapted the SSH-1 BPP to use the SSH-2 style API.
(The relevant differences are that ssh1_cipher encapsulated both the
sending and receiving directions in one object - so now ssh1bpp has to
make a separate cipher instance per direction - and that ssh1_cipher
automatically initialised the IV to all zeroes, which ssh1bpp now has
to do by hand.)
The previous ssh1_cipher vtable for single-DES has been removed
completely, because when converted into the new API it became
identical to the SSH-2 single-DES vtable; so now there's just one
vtable for DES-CBC which works in both protocols. The other two SSH-1
ciphers each had to stay separate, because 3DES is completely
different between SSH-1 and SSH-2 (three layers of CBC structure
versus one), and Blowfish varies in endianness and key length between
the two.
(Actually, while I'm here, I've only just noticed that the SSH-1
Blowfish cipher mis-describes itself in log messages as Blowfish-128.
In fact it passes the whole of the input key buffer, which has length
SSH1_SESSION_KEY_LENGTH == 32 bytes == 256 bits. So it's actually
Blowfish-256, and has been all along!)
No cipher construction function _currently_ returns NULL, but one's
about to start, so the testcrypt system will have to be able to cope.
This is the first time a function in the testcrypt API has had an
'opt' type as its return value rather than an argument. But it works
just the same in reverse: the wire protocol emits the special
identifer "NULL" when the optional return value is absent, and the
Python module catches that and rewrites it as Python 'None'.
The bulk of this commit is the changes necessary to make testcrypt
compile under Visual Studio. Unfortunately, I've had to remove my
fiddly clever uses of C99 variadic macros, because Visual Studio does
something unexpected when a variadic macro's expansion puts
__VA_ARGS__ in the argument list of a further macro invocation: the
commas don't separate further arguments. In other words, if you write
#define INNER(x,y,z) some expansion involving x, y and z
#define OUTER(...) INNER(__VA_ARGS__)
OUTER(1,2,3)
then gcc and clang will translate OUTER(1,2,3) into INNER(1,2,3) in
the obvious way, and the inner macro will be expanded with x=1, y=2
and z=3. But try this in Visual Studio, and you'll get the macro
parameter x expanding to the entire string 1,2,3 and the other two
empty (with warnings complaining that INNER didn't get the number of
arguments it expected).
It's hard to cite chapter and verse of the standard to say which of
those is _definitely_ right, though my reading leans towards the
gcc/clang behaviour. But I do know I can't depend on it in code that
has to compile under both!
So I've removed the system that allowed me to declare everything in
testcrypt.h as FUNC(ret,fn,arg,arg,arg), and now I have to use a
different macro for each arity (FUNC0, FUNC1, FUNC2 etc). Also, the
WRAPPED_NAME system is gone (because that too depended on the use of a
comma to shift macro arguments along by one), and now I put a custom C
wrapper around a function by simply re-#defining that function's own
name (and therefore the subsequent code has to be a little more
careful to _not_ pass functions' names between several macros before
stringifying them).
That's all a bit tedious, and commits me to a small amount of ongoing
annoyance because now I'll have to add an explicit argument count
every time I add something to testcrypt.h. But then again, perhaps it
will make the code less incomprehensible to someone trying to
understand it!
When testcrypt.h lists a function argument as 'opt_val_foo', it means
that the argument is optional in the sense that the C function can
take a null pointer in place of a valid foo, and so the Python wrapper
module should accept None in the corresponding argument slot from the
client code and translate it into the special string "NULL" in the
wire protocol.
This works fine at argument translation time, but the code that reads
testcrypt.h wasn't looking at it, so if you said 'opt_val_foo_suffix'
in place of 'opt_val_foo' (indicating that that argument is optional
_and_ the C function expects it in a translated form), then the
initial pass over testcrypt.h wouldn't strip the _suffix, and would
set up data structures with mismatched type names.
I've written a new standalone test program which incorporates all of
PuTTY's crypto code, including the mp_int and low-level elliptic curve
layers but also going all the way up to the implementations of the
MAC, hash, cipher, public key and kex abstractions.
The test program itself, 'testcrypt', speaks a simple line-oriented
protocol on standard I/O in which you write the name of a function
call followed by some inputs, and it gives you back a list of outputs
preceded by a line telling you how many there are. Dynamically
allocated objects are assigned string ids in the protocol, and there's
a 'free' function that tells testcrypt when it can dispose of one.
It's possible to speak that protocol by hand, but cumbersome. I've
also provided a Python module that wraps it, by running testcrypt as a
persistent subprocess and gatewaying all the function calls into
things that look reasonably natural to call from Python. The Python
module and testcrypt.c both read a carefully formatted header file
testcrypt.h which contains the name and signature of every exported
function, so it costs minimal effort to expose a given function
through this test API. In a few cases it's necessary to write a
wrapper in testcrypt.c that makes the function look more friendly, but
mostly you don't even need that. (Though that is one of the
motivations between a lot of API cleanups I've done recently!)
I considered doing Python integration in the more obvious way, by
linking parts of the PuTTY code directly into a native-code .so Python
module. I decided against it because this way is more flexible: I can
run the testcrypt program on its own, or compile it in a way that
Python wouldn't play nicely with (I bet compiling just that .so with
Leak Sanitiser wouldn't do what you wanted when Python loaded it!), or
attach a debugger to it. I can even recompile testcrypt for a
different CPU architecture (32- vs 64-bit, or even running it on a
different machine over ssh or under emulation) and still layer the
nice API on top of that via the local Python interpreter. All I need
is a bidirectional data channel.