2000-10-18 15:00:36 +00:00
/*
 * RSA key generation.
 */

2013-08-04 19:34:07 +00:00
#include <assert.h>

2000-10-18 15:00:36 +00:00
#include "ssh.h"
2020-02-24 19:09:08 +00:00
#include "sshkeygen.h"
Complete rewrite of PuTTY's bignum library.

The old 'Bignum' data type is gone completely, and so is sshbn.c. In
its place is a new thing called 'mp_int', handled by an entirely new
library module mpint.c, with API differences both large and small.

The main aim of this change is that the new library should be free of
timing- and cache-related side channels. I've written the code so that
it _should_ - assuming I haven't made any mistakes - do all of its
work without either control flow or memory addressing depending on the
data words of the input numbers. (Though, being an _arbitrary_
precision library, it does have to at least depend on the sizes of the
numbers - but there's a 'formal' size that can vary separately from
the actual magnitude of the represented integer, so if you want to
keep it secret that your number is actually small, it should work fine
to have a very long mp_int and just happen to store 23 in it.) So I've
done all my conditionalisation by means of computing both answers and
doing bit-masking to swap the right one into place, and all loops over
the words of an mp_int go up to the formal size rather than the actual
size.

I haven't actually tested the constant-time property in any rigorous
way yet (I'm still considering the best way to do it). But this code
is surely at the very least a big improvement on the old version, even
if I later find a few more things to fix.

I've also completely rewritten the low-level elliptic curve arithmetic
from sshecc.c; the new ecc.c is closer to being an adjunct of mpint.c
than it is to the SSH end of the code. The new elliptic curve code
keeps all coordinates in Montgomery-multiplication transformed form to
speed up all the multiplications mod the same prime, and only converts
them back when you ask for the affine coordinates. Also, I adopted
extended coordinates for the Edwards curve implementation.

sshecc.c has also had a near-total rewrite in the course of switching
it over to the new system. While I was there, I've separated ECDSA and
EdDSA more completely - they now have separate vtables, instead of a
single vtable in which nearly every function had a big if statement in
it - and also made the externally exposed types for an ECDSA key and
an ECDH context different.

A minor new feature: since the new arithmetic code includes a modular
square root function, we can now support the compressed point
representation for the NIST curves. We seem to have been getting along
fine without that so far, but it seemed a shame not to put it in,
since it was suddenly easy.

In sshrsa.c, one major change is that I've removed the RSA blinding
step in rsa_privkey_op, in which we randomise the ciphertext before
doing the decryption. The purpose of that was to avoid timing leaks
giving away the plaintext - but the new arithmetic code should take
that in its stride in the course of also being careful enough to avoid
leaking the _private key_, which RSA blinding had no way to do
anything about in any case.

Apart from those specific points, most of the rest of the changes are
more or less mechanical, just changing type names and translating code
into the new API.
2018-12-31 13:53:41 +00:00
#include "mpint.h"
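The bignum rewrite message above describes doing conditional work by "computing both answers and doing bit-masking to swap the right one into place". A minimal sketch of that idea on plain 32-bit words follows. It is not taken from mpint.c: the helper names `ct_select_u32` and `ct_cond_swap_words` are hypothetical, and whether a given compiler preserves the branch-free shape is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns a if choose_a is 1, b if it is 0,
 * without branching on choose_a. */
static uint32_t ct_select_u32(uint32_t a, uint32_t b, unsigned choose_a)
{
    /* mask is all ones when choose_a is 1, all zeros when it is 0 */
    uint32_t mask = (uint32_t)0 - (uint32_t)(choose_a & 1);
    return (a & mask) | (b & ~mask);
}

/* Hypothetical helper: swaps two word arrays of the same length if
 * swap is 1, and leaves them alone if it is 0. Every word is read and
 * written either way, so the memory access pattern does not depend on
 * the flag. */
static void ct_cond_swap_words(uint32_t *x, uint32_t *y, size_t nwords,
                               unsigned swap)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(swap & 1);
    for (size_t i = 0; i < nwords; i++) {
        uint32_t diff = (x[i] ^ y[i]) & mask;
        x[i] ^= diff;
        y[i] ^= diff;
    }
}
```

The loop always runs over the full array length, which mirrors the message's point about looping up to the formal size rather than the actual size. It also shows why both operands of a conditional swap need the same number of words.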
2000-10-18 15:00:36 +00:00
2020-02-17 19:53:19 +00:00
#define RSA_EXPONENT 65537
2000-10-18 15:00:36 +00:00
2020-02-24 19:09:08 +00:00
#define NFIRSTBITS 13
2020-02-23 15:31:05 +00:00
static void invent_firstbits(unsigned *one, unsigned *two,
                             unsigned min_separation);

RSA generation: option to generate strong primes.

A 'strong' prime, as defined by the Handbook of Applied Cryptography,
is a prime p such that each of p-1 and p+1 has a large prime factor,
and that the large factor q of p-1 is such that q-1 in turn _also_ has
a large prime factor.

HoAC says that making your RSA key using primes of this form defeats
some factoring algorithms - but there are other faster algorithms to
which it makes no difference. So this is probably not a useful
precaution in practice. However, it has been recommended in the past
by some official standards, and it's easy to implement given the new
general facility in PrimeCandidateSource that lets you ask for your
prime to satisfy an arbitrary modular congruence. (And HoAC also says
there's no particular reason _not_ to use strong primes.) So I provide
it as an option, just in case anyone wants to select it.

The change to the key generation algorithm is entirely in sshrsag.c,
and is neatly independent of the prime-generation system in use. If
you're using Maurer provable prime generation, then the known factor q
of p-1 can be used to help certify p, and the one for q-1 to help with
q in turn; if you switch to probabilistic prime generation then you
still get an RSA key with the right structure, except that every time
the definition says 'prime factor' you just append '(probably)'.

(The probabilistic version of this procedure is described as 'Gordon's
algorithm' in HoAC section 4.4.2.)
2020-03-02 06:52:09 +00:00
typedef struct RSAPrimeDetails RSAPrimeDetails;
struct RSAPrimeDetails {
    bool strong;
    int bits, bitsm1m1, bitsm1, bitsp1;
    unsigned firstbits;
    ProgressPhase phase_main, phase_m1m1, phase_m1, phase_p1;
};

#define STRONG_MARGIN (20 + NFIRSTBITS)

static RSAPrimeDetails setup_rsa_prime(
    int bits, bool strong, PrimeGenerationContext *pgc, ProgressReceiver *prog)
{
    RSAPrimeDetails pd;
    pd.bits = bits;
    if (strong) {
        pd.bitsm1 = (bits - STRONG_MARGIN) / 2;
        pd.bitsp1 = (bits - STRONG_MARGIN) - pd.bitsm1;
        pd.bitsm1m1 = (pd.bitsm1 - STRONG_MARGIN) / 2;
        if (pd.bitsm1m1 < STRONG_MARGIN) {
            /* Absurdly small prime, but we should at least not crash. */
            strong = false;
        }
    }
    pd.strong = strong;

    if (pd.strong) {
        pd.phase_m1m1 = primegen_add_progress_phase(pgc, prog, pd.bitsm1m1);
        pd.phase_m1 = primegen_add_progress_phase(pgc, prog, pd.bitsm1);
        pd.phase_p1 = primegen_add_progress_phase(pgc, prog, pd.bitsp1);
    }
    pd.phase_main = primegen_add_progress_phase(pgc, prog, pd.bits);

    return pd;
}

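As a worked example of the size arithmetic in setup_rsa_prime() above, the sketch below repeats just the bit-length calculations, with none of the progress reporting. The struct `prime_sizes` and the function `plan_prime_sizes` are hypothetical stand-ins for illustration; the two macros are copied from the listing.

```c
#include <stdbool.h>

#define NFIRSTBITS 13
#define STRONG_MARGIN (20 + NFIRSTBITS)

/* Hypothetical stand-in for the size-related fields of RSAPrimeDetails. */
struct prime_sizes {
    bool strong;
    int bits, bitsm1m1, bitsm1, bitsp1;
};

/* Repeats the bit-length arithmetic of setup_rsa_prime(): split the
 * budget left after STRONG_MARGIN between the factor of p-1 and the
 * factor of p+1, then size the factor of (factor of p-1) - 1. */
static struct prime_sizes plan_prime_sizes(int bits, bool strong)
{
    struct prime_sizes ps = { false, bits, 0, 0, 0 };
    if (strong) {
        ps.bitsm1 = (bits - STRONG_MARGIN) / 2;
        ps.bitsp1 = (bits - STRONG_MARGIN) - ps.bitsm1;
        ps.bitsm1m1 = (ps.bitsm1 - STRONG_MARGIN) / 2;
        if (ps.bitsm1m1 < STRONG_MARGIN)
            strong = false;   /* too small: fall back to an ordinary prime */
    }
    ps.strong = strong;
    return ps;
}
```

For a 1024-bit prime this gives STRONG_MARGIN = 33, bitsm1 = 495, bitsp1 = 496 and bitsm1m1 = 231. For a 128-bit prime bitsm1m1 comes out as 7, which is below the margin, so the strong option is quietly dropped.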
static mp_int *generate_rsa_prime(
    RSAPrimeDetails pd, PrimeGenerationContext *pgc, ProgressReceiver *prog)
{
    mp_int *m1m1 = NULL, *m1 = NULL, *p1 = NULL, *p = NULL;
    PrimeCandidateSource *pcs;

    if (pd.strong) {
        progress_start_phase(prog, pd.phase_m1m1);
        pcs = pcs_new_with_firstbits(pd.bitsm1m1, pd.firstbits, NFIRSTBITS);
        m1m1 = primegen_generate(pgc, pcs, prog);
        progress_report_phase_complete(prog);

        progress_start_phase(prog, pd.phase_m1);
        pcs = pcs_new_with_firstbits(pd.bitsm1, pd.firstbits, NFIRSTBITS);
        pcs_require_residue_1_mod_prime(pcs, m1m1);
        m1 = primegen_generate(pgc, pcs, prog);
        progress_report_phase_complete(prog);

        progress_start_phase(prog, pd.phase_p1);
        pcs = pcs_new_with_firstbits(pd.bitsp1, pd.firstbits, NFIRSTBITS);
        p1 = primegen_generate(pgc, pcs, prog);
        progress_report_phase_complete(prog);
    }

    progress_start_phase(prog, pd.phase_main);
    pcs = pcs_new_with_firstbits(pd.bits, pd.firstbits, NFIRSTBITS);
    pcs_avoid_residue_small(pcs, RSA_EXPONENT, 1);
    if (pd.strong) {
        pcs_require_residue_1_mod_prime(pcs, m1);
        mp_int *p1_minus_1 = mp_copy(p1);
        mp_sub_integer_into(p1_minus_1, p1, 1);
        pcs_require_residue(pcs, p1, p1_minus_1);
        mp_free(p1_minus_1);
    }
    p = primegen_generate(pgc, pcs, prog);
    progress_report_phase_complete(prog);

    if (m1m1)
        mp_free(m1m1);
    if (m1)
        mp_free(m1);
    if (p1)
        mp_free(p1);

    return p;
}

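The congruences that generate_rsa_prime() imposes in the strong case can be tried out on toy numbers. The sketch below is a brute-force search over small unsigned integers, not the PrimeCandidateSource mechanism. It looks for a prime p that is 1 modulo r (so r divides p-1), is s-1 modulo s (so s divides p+1), and is not 1 modulo e. The function names are hypothetical.

```c
#include <stdbool.h>

/* Trial-division primality test, adequate for toy-sized inputs only. */
static bool is_prime_small(unsigned n)
{
    if (n < 2)
        return false;
    for (unsigned d = 2; d * d <= n; d++)
        if (n % d == 0)
            return false;
    return true;
}

/* Hypothetical helper: smallest prime p below limit such that
 *   p mod r == 1      (r divides p-1)
 *   p mod s == s - 1  (s divides p+1)
 *   p mod e != 1      (e does not divide p-1)
 * Returns 0 if there is none. */
static unsigned find_toy_strong_prime(unsigned r, unsigned s, unsigned e,
                                      unsigned limit)
{
    for (unsigned p = 2; p < limit; p++) {
        if (p % r != 1)
            continue;
        if (p % s != s - 1)
            continue;
        if (p % e == 1)
            continue;
        if (is_prime_small(p))
            return p;
    }
    return 0;
}
```

With r = 11, s = 7 and e = 3 the search finds 419: 418 = 2 * 11 * 19 and 420 = 7 * 60. Here 11 plays the part of m1, 7 the part of p1, and since 11 - 1 = 2 * 5, the prime 5 plays the part of m1m1. Choosing e = 19 instead rules out 419, because 19 divides 418, and the search moves on to 727.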
int rsa_generate(RSAKey *key, int bits, bool strong,
                 PrimeGenerationContext *pgc, ProgressReceiver *prog)
2001-05-06 14:35:20 +00:00
{
2018-11-26 21:02:28 +00:00
    key->sshk.vt = &ssh_rsa;
2018-06-03 11:58:05 +00:00
2000-10-18 15:00:36 +00:00
    /*
     * We don't generate e; we just use a standard one always.
     */
2018-12-31 13:53:41 +00:00
    mp_int *exponent = mp_from_integer(RSA_EXPONENT);
2000-10-18 15:00:36 +00:00

    /*
     * Generate p and q: primes with combined length `bits', not
     * congruent to 1 modulo e. (Strictly speaking, we wanted (p-1)
     * and e to be coprime, and (q-1) and e to be coprime, but in
     * general that's slightly more fiddly to arrange. By choosing
     * a prime e, we can simplify the criterion.)
Rewrite invent_firstbits().

Instead of repeatedly looping on the random number generator until it
comes up with two values that have a large enough product, the new
version guarantees only one use of random numbers, by first counting
up all the possible pairs of values that would work, and then
inventing a single random number that's used as an index into that
list.

I've done the selection from the list using constant-time techniques,
not particularly because I think key generation can be made CT in
general, but out of sheer habit after the last few months, and who
knows, it _might_ be useful.

While I'm at it, I've also added an option to make sure the two
firstbits values differ by at least a given value. For RSA, I set that
value to 2, guaranteeing that even if the smaller prime has a very
long string of 1 bits after the firstbits value and the larger has a
long string of 0, they'll still have a relative difference of at least
2^{-12}. Not that there was any serious chance of the primes having
randomly ended up so close together as to make the key in danger of
factoring, but it seems like a silly thing to leave out if I'm
rewriting the function anyway.
2019-02-26 07:06:57 +00:00
     *
     * We give a min_separation of 2 to invent_firstbits(), ensuring
     * that the two primes won't be very close to each other. (The
     * chance of them being _dangerously_ close is negligible - even
     * more so than an attacker guessing a whole 256-bit session key -
     * but it doesn't cost much to make sure.)
2000-10-18 15:00:36 +00:00
     */
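The "Rewrite invent_firstbits()" message above describes counting every acceptable pair of leading-bit values and then using one random number as an index into that list. The body of invent_firstbits() is not shown in this listing, so the sketch below is only an illustration under stated assumptions: values are TOY_BITS wide with the top bit set, a pair is acceptable when its product has the top bit of a 2*TOY_BITS-wide result set and the two values differ by at least min_sep, and the caller supplies the index. All names here are hypothetical.

```c
#define TOY_BITS 4   /* assumed toy width; the listing uses NFIRSTBITS 13 */

/* Assumed acceptance rule: product fills the top bit of the doubled
 * width, and the values are at least min_sep apart. */
static int pair_ok(unsigned a, unsigned b, unsigned min_sep)
{
    unsigned diff = a > b ? a - b : b - a;
    return a * b >= (1u << (2 * TOY_BITS - 1)) && diff >= min_sep;
}

/* Count the acceptable ordered pairs. */
static unsigned count_pairs(unsigned min_sep)
{
    unsigned lo = 1u << (TOY_BITS - 1), hi = 1u << TOY_BITS, n = 0;
    for (unsigned a = lo; a < hi; a++)
        for (unsigned b = lo; b < hi; b++)
            n += (unsigned)pair_ok(a, b, min_sep);
    return n;
}

/* Pick acceptable pair number index (0-based, in scan order). The scan
 * always visits every pair and copies the chosen one out through a
 * mask rather than stopping early. pair_ok() itself branches, so this
 * makes no real constant-time claim. */
static void select_pair(unsigned index, unsigned min_sep,
                        unsigned *one, unsigned *two)
{
    unsigned lo = 1u << (TOY_BITS - 1), hi = 1u << TOY_BITS, seen = 0;
    *one = *two = 0;
    for (unsigned a = lo; a < hi; a++) {
        for (unsigned b = lo; b < hi; b++) {
            unsigned ok = (unsigned)pair_ok(a, b, min_sep);
            unsigned hit = ok & (unsigned)(seen == index);
            unsigned mask = 0u - hit;
            *one |= a & mask;
            *two |= b & mask;
            seen += ok;
        }
    }
}
```

A caller would draw one random index in the range [0, count_pairs(min_sep)) and pass it to select_pair(), so the random number generator is consulted once instead of in a retry loop.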
Fix RSA key gen at awkward sizes mod BIGNUM_INT_BITS.

If you try to generate (say) a 2049-bit RSA key, then primegen will
try to generate a 1025-bit prime. It will do it by making a random
1024-bit mp_int (that is, one strictly _less_ than 2^1024), and then
trying to set bit 1024. But that will fail an assertion in mp_set_bit,
because the number of random bits is a multiple of BIGNUM_INT_BITS, so
an mp_int of the minimum size that can hold the random bits is not
quite big enough to hold the extra bit at the top.

Fix: change the strategy in primegen so that we allocate the mp_int
large enough to hold even the top bit, and copy in the random numbers
via mp_or_into.

There's a second bug hiding behind that one. If the key has odd size,
then the two primes are generated with different bit lengths. If the
overall key size is congruent to 1 mod (2*BIGNUM_INT_BITS), then the
two primes will be allocated as mp_ints with different numbers of
words, leading to another assertion failure in the mp_cond_swap that
sorts the primes into a consistent order.

Fix for that one: if the primes are being generated with different bit
lengths, then we arrange those lengths to be already in the right
order, and replace the mp_cond_swap with an assert() that checks the
ordering is already correct.

Combined effect: now you should be able to successfully generate a
2049-bit key without assertion failures.
2019-04-17 17:15:23 +00:00
    int qbits = bits / 2;
    int pbits = bits - qbits;
    assert(pbits >= qbits);
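The split into pbits and qbits, and the word-count mismatch that the "awkward sizes" message describes, can be checked with a few lines of arithmetic. This sketch assumes a 64-bit word (WORD_BITS stands in for BIGNUM_INT_BITS, whose real value depends on the build) and assumes each prime is stored in the minimum number of words that holds its bits. The names `split_key_bits` and `words_for_bits` are hypothetical.

```c
#define WORD_BITS 64   /* assumed stand-in for BIGNUM_INT_BITS */

/* Minimum number of words needed to hold a value of the given bit length. */
static int words_for_bits(int bits)
{
    return (bits + WORD_BITS - 1) / WORD_BITS;
}

struct key_split {
    int pbits, qbits;     /* bit lengths of the two primes */
    int pwords, qwords;   /* words needed to hold each prime */
};

/* Same split as in rsa_generate(): q gets the rounded-down half, p the
 * rest, so pbits >= qbits always holds. */
static struct key_split split_key_bits(int bits)
{
    struct key_split s;
    s.qbits = bits / 2;
    s.pbits = bits - s.qbits;
    s.pwords = words_for_bits(s.pbits);
    s.qwords = words_for_bits(s.qbits);
    return s;
}
```

For bits = 2049, which is 1 modulo 128, the primes are 1025 and 1024 bits long and need 17 and 16 words, so a word-by-word conditional swap could not treat them alike. For bits = 2051 both primes need 17 words, and for bits = 2048 both need 16.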
2020-02-24 19:09:08 +00:00
2020-03-02 06:52:09 +00:00
    RSAPrimeDetails pd = setup_rsa_prime(pbits, strong, pgc, prog);
    RSAPrimeDetails qd = setup_rsa_prime(qbits, strong, pgc, prog);
2020-02-23 15:29:40 +00:00
    progress_ready(prog);

2020-03-02 06:52:09 +00:00
    invent_firstbits(&pd.firstbits, &qd.firstbits, 2);
2020-02-29 09:10:47 +00:00
2020-03-02 06:52:09 +00:00
    mp_int *p = generate_rsa_prime(pd, pgc, prog);
    mp_int *q = generate_rsa_prime(qd, pgc, prog);
2000-10-18 15:00:36 +00:00

    /*
     * Ensure p > q, by swapping them if not.
2019-04-17 17:15:23 +00:00
     *
     * We only need to do this if the two primes were generated with
     * the same number of bits (i.e. if the requested key size is
     * even) - otherwise it's already guaranteed!
2000-10-18 15:00:36 +00:00
     */
2019-04-17 17:15:23 +00:00
    if (pbits == qbits) {
        mp_cond_swap(p, q, mp_cmp_hs(q, p));
    } else {
        assert(mp_cmp_hs(p, q));
    }
2000-10-18 15:00:36 +00:00

    /*
     * Now we have p, q and e. All we need to do now is work out
     * the other helpful quantities: n=pq, d=e^-1 mod (p-1)(q-1),
     * and (q^-1 mod p).
     */
2018-12-31 13:53:41 +00:00
    mp_int *modulus = mp_mul(p, q);
    mp_int *pm1 = mp_copy(p);
    mp_sub_integer_into(pm1, pm1, 1);
    mp_int *qm1 = mp_copy(q);
    mp_sub_integer_into(qm1, qm1, 1);
    mp_int *phi_n = mp_mul(pm1, qm1);
    mp_free(pm1);
    mp_free(qm1);
    mp_int *private_exponent = mp_invert(exponent, phi_n);
    mp_free(phi_n);
    mp_int *iqmp = mp_invert(q, p);
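The derived quantities computed above (n = pq, d = e^-1 mod (p-1)(q-1), and q^-1 mod p) can be reproduced on toy-sized numbers with ordinary 64-bit integers. This sketch uses a textbook extended-Euclid inverse rather than mp_invert, and the names `mod_inverse`, `toy_rsa` and `toy_rsa_from_primes` are hypothetical. The primes 61 and 53 and the exponent 17 are illustrative only and far too small for real use.

```c
#include <stdint.h>

/* Modular inverse of a modulo m via the extended Euclidean algorithm.
 * Returns 0 if a and m are not coprime. Assumes a > 0 and m > 1. */
static int64_t mod_inverse(int64_t a, int64_t m)
{
    int64_t r0 = m, r1 = a % m;
    int64_t t0 = 0, t1 = 1;
    while (r1 != 0) {
        int64_t quot = r0 / r1;
        int64_t r2 = r0 - quot * r1;
        int64_t t2 = t0 - quot * t1;
        r0 = r1; r1 = r2;
        t0 = t1; t1 = t2;
    }
    if (r0 != 1)
        return 0;                 /* no inverse exists */
    return t0 < 0 ? t0 + m : t0;  /* normalise into [0, m) */
}

/* Hypothetical toy counterpart of the fields filled in by rsa_generate(). */
struct toy_rsa {
    int64_t n;      /* modulus p*q */
    int64_t e;      /* public exponent */
    int64_t d;      /* private exponent, e^-1 mod (p-1)(q-1) */
    int64_t iqmp;   /* q^-1 mod p */
};

static struct toy_rsa toy_rsa_from_primes(int64_t p, int64_t q, int64_t e)
{
    struct toy_rsa k;
    k.n = p * q;
    k.e = e;
    k.d = mod_inverse(e, (p - 1) * (q - 1));
    k.iqmp = mod_inverse(q, p);
    return k;
}
```

With p = 61, q = 53 and e = 17 this gives n = 3233, d = 2753 (17 * 2753 = 46801 = 15 * 3120 + 1) and q^-1 mod p = 38 (53 * 38 = 2014 = 33 * 61 + 1).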
2000-10-18 15:00:36 +00:00

    /*
2018-12-31 13:53:41 +00:00
     * Populate the returned structure.
2000-10-18 15:00:36 +00:00
     */
2018-12-31 13:53:41 +00:00
    key->modulus = modulus;
    key->exponent = exponent;
    key->private_exponent = private_exponent;
    key->p = p;
    key->q = q;
    key->iqmp = iqmp;
2000-10-18 15:00:36 +00:00
2020-01-09 07:21:30 +00:00
    key->bits = mp_get_nbits(modulus);
    key->bytes = (key->bits + 7) / 8;

2000-10-18 15:00:36 +00:00
    return 1;
}
2020-02-23 15:31:05 +00:00

/*
 * Invent a pair of values suitable for use as the 'firstbits' values
 * for the two RSA primes, such that their product is at least 2
 * (reading each value as a fixed-point number in [1,2)), and such
 * that their difference is also at least min_separation.
 *
 * This is used for generating RSA keys which have exactly the
 * specified number of bits rather than one fewer - if you generate an
 * a-bit and a b-bit number completely at random and multiply them
 * together, you could end up with either an (a+b-1)-bit number or an
 * (a+b)-bit number. The former happens log(2)*2-1 of the time (about
 * 39%, taking log as the natural logarithm) and, though actually
 * harmless, every time it occurs it has a non-zero probability of
 * sparking a user email along the lines of 'Hey, I asked PuTTYgen for
 * a 2048-bit key and I only got 2047 bits! Bug!'
 */
static inline unsigned firstbits_b_min(
    unsigned a, unsigned lo, unsigned hi, unsigned min_separation)
{
    /* To get a large enough product, b must be at least this much */
    unsigned b_min = (2*lo*lo + a - 1) / a;
    /* Now enforce b >= a + min_separation (so a<b if min_separation > 0) */
    if (b_min < a + min_separation)
        b_min = a + min_separation;
    /* And cap at the upper limit */
    if (b_min > hi)
        b_min = hi;
    return b_min;
}
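
The lower-bound computation above can be checked exhaustively over the 13-bit range. The following is a minimal standalone sketch, not part of the source file: `sketch_b_min` is a hypothetical copy of `firstbits_b_min`, and `sketch_check_b_min` is an illustrative helper that confirms, for every `a`, that the smallest permitted `b` satisfies both constraints and that `b - 1` would violate at least one of them. It assumes `unsigned` is at least 32 bits wide, so that products below 2^26 do not overflow.

```c
#include <assert.h>

/* Hypothetical standalone copy of the lower-bound computation. */
static unsigned sketch_b_min(unsigned a, unsigned lo, unsigned hi,
                             unsigned min_separation)
{
    unsigned b_min = (2*lo*lo + a - 1) / a;   /* ceil(2*lo*lo / a) */
    if (b_min < a + min_separation)
        b_min = a + min_separation;
    if (b_min > hi)
        b_min = hi;                           /* hi means "no valid b" */
    return b_min;
}

/*
 * Illustrative check: returns 1 if, for every a in [lo,hi), the bound
 * is tight, i.e. b_min itself is acceptable and b_min - 1 is not.
 */
static int sketch_check_b_min(unsigned lo, unsigned hi,
                              unsigned min_separation)
{
    assert(0 < lo && lo < hi);
    for (unsigned a = lo; a < hi; a++) {
        unsigned b_min = sketch_b_min(a, lo, hi, min_separation);
        if (b_min > hi)
            return 0;
        if (b_min == hi)
            continue;                         /* no valid b for this a */
        /* The smallest permitted b meets both constraints... */
        if (a * b_min < 2*lo*lo || b_min < a + min_separation)
            return 0;
        /* ...and anything smaller breaks at least one of them. */
        unsigned prev = b_min - 1;
        if (a * prev >= 2*lo*lo && prev >= a + min_separation)
            return 0;
    }
    return 1;
}
```

Calling `sketch_check_b_min(1u << 12, 1u << 13, 2)` exercises the same range that `invent_firstbits` uses; the separation value 2 here is only an example.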

static void invent_firstbits(unsigned *one, unsigned *two,
                             unsigned min_separation)
{
    /*
     * We'll pick 12 initial bits (number selected at random) for each
     * prime, not counting the leading 1. So we want to return two
     * values in the range [2^12,2^13) whose product is at least 2^25.
     *
     * Strategy: count up all the viable pairs, then select a random
     * number in that range and use it to pick a pair.
     *
     * To keep things simple, we'll ensure a < b, and randomly swap
     * them at the end.
     */
    const unsigned lo = 1<<12, hi = 1<<13, minproduct = 2*lo*lo;
    unsigned a, b;

    /*
     * Count up the number of prefixes of b that would be valid for
     * each prefix of a.
     */
    mp_int *total = mp_new(32);
    for (a = lo; a < hi; a++) {
        unsigned b_min = firstbits_b_min(a, lo, hi, min_separation);
        mp_add_integer_into(total, total, hi - b_min);
    }

    /*
     * Make up a random number in the range [0,2*total).
     */
    mp_int *mlo = mp_from_integer(0), *mhi = mp_new(32);
    mp_lshift_fixed_into(mhi, total, 1);
    mp_int *randval = mp_random_in_range(mlo, mhi);
    mp_free(mlo);
    mp_free(mhi);

    /*
     * Use the low bit of randval as our swap indicator, leaving the
     * rest of it in the range [0,total).
     */
    unsigned swap = mp_get_bit(randval, 0);
    mp_rshift_fixed_into(randval, randval, 1);

    /*
     * Now do the same counting loop again to make the actual choice.
     */
    a = b = 0;
    for (unsigned a_candidate = lo; a_candidate < hi; a_candidate++) {
        unsigned b_min = firstbits_b_min(a_candidate, lo, hi, min_separation);
        unsigned limit = hi - b_min;

        unsigned b_candidate = b_min + mp_get_integer(randval);
        /*
         * use_it is 1 exactly when randval < limit, i.e. the choice
         * falls in this row; -use_it is then an all-ones mask, and
         * zero otherwise.
         */
        unsigned use_it = 1 ^ mp_hs_integer(randval, limit);
        a ^= (a ^ a_candidate) & -use_it;
        b ^= (b ^ b_candidate) & -use_it;

        mp_sub_integer_into(randval, randval, limit);
    }

    mp_free(randval);
    mp_free(total);

    /*
     * Check everything came out right.
     */
    assert(lo <= a);
    assert(a < hi);
    assert(lo <= b);
    assert(b < hi);
    assert(a * b >= minproduct);
    assert(b >= a + min_separation);

    /*
     * Last-minute optional swap of a and b.
     */
    unsigned diff = (a ^ b) & (-swap);
    a ^= diff;
    b ^= diff;

    *one = a;
    *two = b;
}
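
The two masking idioms used in `invent_firstbits` (select-by-mask inside the loop, and the conditional swap at the end) can be tried in isolation with plain integers. The sketch below is illustrative only: `sketch_select` and `sketch_cond_swap` are hypothetical names, `unsigned` stands in for `mp_int`, and the comparison `index < sizes[i]` stands in for `mp_hs_integer`. It assumes `unsigned` is at least 32 bits and that the sizes sum to far less than 2^31, so that once `index` wraps below zero it stays larger than any bucket size. It shows the shape of the technique and makes no claim of being verified constant-time, since a compiler is free to turn the comparison into a branch.

```c
#include <assert.h>

/*
 * Hypothetical helper: map a flat index in [0, sum of sizes) to a
 * (bucket, offset) pair. Every bucket is visited, and the match is
 * recorded with a mask rather than an early exit. If index is out of
 * range, the result is left as (0, 0).
 */
static void sketch_select(const unsigned *sizes, unsigned nbuckets,
                          unsigned index,
                          unsigned *bucket, unsigned *offset)
{
    unsigned chosen_bucket = 0, chosen_offset = 0;
    for (unsigned i = 0; i < nbuckets; i++) {
        unsigned use_it = (index < sizes[i]);  /* 1 if index is in bucket i */
        unsigned mask = -use_it;               /* all ones, or zero */
        chosen_bucket ^= (chosen_bucket ^ i) & mask;
        chosen_offset ^= (chosen_offset ^ index) & mask;
        index -= sizes[i];                     /* wraps after the match */
    }
    *bucket = chosen_bucket;
    *offset = chosen_offset;
}

/* Hypothetical helper: swap *a and *b if swap is 1, leave them if 0. */
static void sketch_cond_swap(unsigned *a, unsigned *b, unsigned swap)
{
    assert(swap <= 1);
    unsigned diff = (*a ^ *b) & -swap;
    *a ^= diff;
    *b ^= diff;
}
```

With bucket sizes `{3, 0, 2}`, index 3 lands in bucket 2 at offset 0, mirroring how `randval` walks past rows of `a_candidate` until it falls below `limit`.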