11/03 15-859(D) Randomized Algorithms
Randomized algorithms for number-theoretic problems.
Today: Randomized algorithms for testing primality.
==============================================================================
The basic idea of how we will use randomness is that we'll have some
"bad" subset of Z_n^*, and we want to pick a number outside it. Even
if we don't understand how to do that algorithmically, if we can show
that this "bad" subset is actually a SUBGROUP of Z_n^*, and
furthermore if we can show there EXISTS even just one element outside
of it, then we know at least HALF of Z_n^* is outside the set (since
the size of a subgroup divides the size of the group). So, we can get
elements outside of the set by picking randomly.
==============================================================================
First, in all the algorithms we will assume that n is odd, not a prime
power, and not a perfect square. (These are fine to assume since we
can always begin by checking if the square-root, cube-root, etc. of n
is an integer, and if so saying "composite".)
Here's a BPP algorithm for primality testing. This one is probably the
easiest to analyze, and I think is due to Lehmer.
Repeat k times:
* Pick a in {1,...,n-1} at random.
* If gcd(a,n) != 1, then output COMPOSITE.
[this is actually unnecessary but conceptually helps]
* If a^{(n-1)/2} is not congruent to +1 or -1 (mod n),
then output COMPOSITE.
Now, if we ever got a "-1" above output "PROBABLY PRIME" else output
"PROBABLY COMPOSITE".
Theorem: this procedure has error probability at most 1/2^k.
Proof: if n is really prime, then in each iteration we get
a^{(n-1)/2} = -1 (mod n) with probability 1/2, so we get a -1 at least
once with probability 1 - 1/2^k. If n is composite, the proof of
success will follow from the lemma below with t = (n-1)/2.
Key Lemma: Let n be an odd composite, not a prime power, and let t <
n. If there exists a in Z_n^* such that a^t = -1 (mod n), then at
most half of the x's in Z_n^* have x^t = {-1,+1} (mod n). In fact, we
can weaken the condition to simply that there exists a in Z_n^* such
that a^t is not congruent to +1 (mod n).
Proof: Let S = {x in Z_n^* : x^t = +1 or -1 (mod n)}. S is a subgroup
of Z_n^* since it's closed under multiplication (xy)^t = (x^t)(y^t)
and inverse (x^{-1})^t = (x^t)^{-1}. So we just need to find some b
in Z_n^* not in S. (So, the "weakened condition" is equivalent to the
stronger condition because if a^t is not congruent to -1 (mod n), then
we're done.) Let n = q * r, where q and r are relatively prime
and greater than 1. Let b = (a,1), where we're using CRT notation:
b = a (mod q) and b = 1 (mod r). Note that b^t = (a^t, 1^t) = (-1,1).
Therefore, b is not in S since 1 = (1,1) and -1 = (-1, -1). QED.
========================================================================
An aside: it's clear that primality is in co-NP: if a number is
composite, there is a short witness: namely, a factor. Is primality
in NP? Yes. Here's an idea due to Pratt:
If n is prime, then there is a generator g such that g^{n-1} = 1 (mod n)
and g^{(n-1)/k} is NOT congruent to 1 (mod n) for any prime divisor k
of n-1. If n is not prime, then no such g exists, in particular
because for any g, order(g) divides phi(n) which is less than n-1.
So, the certificate that n is prime is a generator g, and a prime
factorization of n-1. And, (recursively) a proof that each of those
prime factors really is prime.
We can see that this certificate is not too large since if you draw
out the recursive "tree", the width is at most log(n) (since the
product of all primes on any given level is at most n) and the depth
is at most log(n) since the primes are dropping by at least a factor
or 2.
===========================================================================
We now turn to an RP algorithm for compositeness, the Solovay-Strassen
algorithm. First, we need to define a generalization of the Legendre
symbol called the Jacobi symbol:
If n = p1 * p2 * ... * pt, where the pi are primes not nec. distinct,
[a/n] = [a/p1]*[a/p2]*...*[a/pt].
Note: the Jacobi symbol is NOT necessarily equivalent to "being a
quadratic residue mod n".
It turns out that the Jacobi symbol can be calculated efficiently
(essentially using a Euclidean GCD-like algorithm) via a number of
identities that are not trivial to prove. A couple easy-to-prove
identities are: [ab/n] = [a/n]*[b/n], and if a=b (mod n) then [a/n]=[b/n].
A harder one to prove is: (for a and n relatively prime)
[a/n] = [n/a] (-1)^{(a-1)(n-1)/4}.
Here is the Solovay-Strassen algorithm:
* Pick random a in {1,...,n-1}
* if gcd(a,n) != 1, then COMPOSITE.
* if [a/n] != a^{(n-1)/2} then COMPOSITE.
* else say POSSIBLY PRIME.
Thm: if n is composite, then this algorithm says "composite" with
probability at least 1/2.
Proof: the proof is much like that of our "key lemma".
Let J = {a in Z_n^* : [a/n] = a^{(n-1)/2} (mod n)}. This is the "bad"
set.
J is a group (e.g., use fact that [ab/n] = [a/n]*[b/n]) so it suffices
to show there exists b not in J. By the assumption that n is not a
prime power and is not a perfect square, we can write n = p1^e1 * r,
where p1 and r are relatively prime, and e1 is odd. Let g be an
arbitrary non-residue mod p1, and let b = (g,1), in CRT notation. We
can see that b^{(n-1)/2} = (g^{(n-1)/2}, 1) is *not* congruent to -1
(mod n), since -1 = (-1, -1). So, it suffices to show that [b/n] = -1.
This in turn follows from the basic Jacobi symbol identities:
[b/n] = ([b/p1])^e1 * [b/r] = ([g/p1])^e1 * [1/r] = (-1)^e1 * 1 = -1.
QED
========================================================================
We now describe one final RP alg for compositeness. This is along the
lines of using a^{n-1} =? 1 (mod n) as a test.
Miller-Rabin Algorithm:
* Let n-1 = 2^r * B, where B is odd.
* pick a in {1,...,n-1} at random.
* Compute a^B, a^{2B}, ..., a^{n-1} (mod n).
* If a^{n-1} != 1 (mod n), then output COMPOSITE
* If we found a non {-1,+1} root of 1 in the above list, then
output COMPOSITE.
* else output POSSIBLY PRIME.
Note that we only have to worry about Carmichael numbers (composite n
such that all a in Z_n^* satisfy a^{n-1} = 1 (mod n)). For all other
composite numbers, at least half of the a's don't have this property
(since {a : a^{n-1} = 1 (mod n)} is a group).
Here's how we can argue for Carmichael numbers:
Suppose there exists a in Z_n^* satisfying a^{(n-1)/2} != 1 (mod n).
Then our "Key Lemma" implies that with probability at least 1/2, we
will find a non {-1,+1} root of 1 in computing a^{(n-1)/2} and output
COMPOSITE. If no such a exists, but there exists b in Z_n^* such that
b^{(n-1)/4} != 1 (mod n) then similarly we're OK (assuming n-1 is
divisible by 4). Going down the line from right-to-left, we can now
assume w.l.o.g. that all a in z_n^* satisfy a^B = 1 (mod n).
We now show a contradiction: let n=p1^e1 * r, where e1 is odd and p1
and r are relatively prime. Let g be a generator mod p1^e1 (we didn't
prove it, but the prime power groups are cyclic too; actually,
Carmichael numbers must be products of distinct primes, but we didn't
prove that either). Let a = (g,1) using CRT notation. If a^B = 1 (mod n)
then g^B = 1 (mod p1^e1), which means that B must be a multiple of
phi(p1^e1) since g is a generator. But B is odd and phi(p1^e1) is
even, a contradiction. QED