A Sophie Germain prime is a prime p such that 2p+1 is also prime. The
larger prime of the pair 2p+1 is also known as a 'safe prime', and is
the preferred kind of modulus for conventional Diffie-Hellman.
Generating these is harder work than normal prime generation. There's
not really much of a technique except to just keep generating
candidate primes p and then testing 2p+1. But what you _can_ do to
speed things up is to get the prime-candidate generator to help a bit:
it's already enforcing that no small prime divides p, and it's easy to
get it to also enforce that no small prime divides 2p+1. That check
can filter out a lot of bad candidates early, before you waste time on
the more expensive checks, so you have a better chance of success with
each number that gets as far as Miller-Rabin.
Here I add an extra setup function for PrimeCandidateSource which
enables those extra checks. After you call pcs_try_sophie_germain(),
the PCS will only deliver you numbers for which both p and 2p+1 are
free of small factors.
pcs_get_upper_bound lets the holder of a PrimeCandidateSource ask what
is the largest value it might ever generate. pcs_get_bits_remaining
lets it ask how much extra entropy it's going to generate on top of
the requirements that have already been input into it.
Both of these will be needed by the upcoming provable-prime work to
decide what sizes of subsidiary prime to generate.
We already had a function pcs_require_residue_1() which lets you ask
PrimeCandidateSource to ensure it only returns numbers congruent to 1
mod a given value. pcs_require_residue_1_mod_prime() is the same, but
it also records the number in a list of prime factors of n-1, which
can be queried later.
The idea is that if you're generating a DSA key, in which the small
prime q must divide p-1, the upcoming provable generation algorithm
will be able to recover q from the PrimeCandidateSource and use it as
part of the primality certificate, which reduces the number of bits of
extra prime factors it also has to make up.
Now we don't even bother with picking an mp_int base value and a small
adjustment; we just generate a random mp_int, and if it's congruent to
anything we want to avoid, throw it away and try again.
This should cause us to select completely uniformly from the candidate
values in the available range. Previously, the delta system was
introducing small skews at the start and end of the range (values very
near there were less likely to turn up because they fell within the
delta radius of a smaller set of base values).
I was worried about doing this because I thought it would be slower,
because of having to do a big pile of 'reduce mp_int mod small thing'
every time round the loop: the virtue of the delta system is that you
can set up the residues of your base value once and then try several
deltas using only normal-sized integer operations. But now I look more
closely, we were computing _all_ the residues of the base point every
time round the loop (several thousand of them), whereas now we're very
likely to be able to throw a candidate away after only two or three if
it's divisible by one of the smallest primes, which are also the ones
most likely to get in the way. So probably it's actually _faster_ than
the old system (although, since uniformity was my main aim, I haven't
timed it, only noticed that it seems to be fast _enough_).
The more features and options I add to PrimeCandidateSource, the more
cumbersome it will be to replicate each one in a command-line option
to the ultimate primegen() function. So I'm moving to an API in which
the client of primegen() constructs a PrimeCandidateSource themself,
and passes it in to primegen().
Also, changed the API for pcs_new() so that you don't have to pass
'firstbits' unless you really want to. The net effect is that even
though we've added flexibility, we've also simplified the call sites
of primegen() in the simple case: if you want a 1234-bit prime, you
just need to pass pcs_new(1234) as the argument to primegen, and
you're done.
The new declaration of primegen() lives in ssh_keygen.h, along with
all the types it depends on. So I've had to #include that header in a
few new files.
I've replaced the random number generation and small delta-finding
loop in primegen() with a much more elaborate system in its own source
file, with unit tests and everything.
Immediate benefits:
- fixes a theoretical possibility of overflowing the target number of
bits, if the random number was so close to the top of the range
that the addition of delta * factor pushed it over. However, this
only happened with negligible probability.
- fixes a directional bias in delta-finding. The previous code
incremented the number repeatedly until it found a value coprime to
all the right things, which meant that a prime preceded by a
particularly long sequence of numbers with tiny factors was more
likely to be chosen. Now we select candidate delta values at
random, that bias should be eliminated.
- changes the semantics of the outermost primegen() function to make
them easier to use, because now the caller specifies the 'bits' and
'firstbits' values for the actual returned prime, rather than
having to account for the factor you're multiplying it by in DSA.
DSA client code is correspondingly adjusted.
Future benefits:
- having the candidate generation in a separate function makes it
easy to reuse in alternative prime generation strategies
- the available constraints support applications such as Maurer's
algorithm for generating provable primes, or strong primes for RSA
in which both p-1 and p+1 have a large factor. So those become
things we could experiment with in future.