15-451 Algorithms 11/21/00 * Number-theoretic algs, contd - RSA - Chinese Remainder Theorem - Primality testing ============================================================================== Continue with discussion of number-theoretic algorithms. Motivate with RSA public-key cryptosystem: RSA is an answer to the question of "how can two people communicate securely if they have never met to exchange secret keys before?". Answer is to somehow separate encryption and decryption. Each person has a public encryption key that's published in the phone book, and then their own private decryption key that's kept secret. To send a msg to A, you look them up in the phone book and use their public key. The point about this public-key cryptography is that just because you can encrypt a message to A doesn't mean you can decrypt anyone else's msgs sent to A. RSA: Person A (Alice) subscribes by (1) picking two large primes p and q (say 100 decimal digits long) and computing n = p*q. (2) Picking a small odd number e relatively prime to (p-1)*(q-1). E.g., e=3. (3) Computing d = e^{-1} mod phi(n), where phi(n) = (p-1)*(q-1). (4) Publishing the pair (e,n) as her PUBLIC KEY in a global phone book, and keeping the pair (d,n) secret as her SECRET KEY. The primes p and q can now be thrown away. If person B (Bob) wants to send a message M to Alice, what he does is compute x = M^e mod n, and sends x along. (5) To decrypt, Alice computes x^d mod n, which is M. [Various issues I'm ignoring. E.g., might want to prepend some garbage onto M so that evesdropper can't guess-and-check] Let's now look at details of how/why all this works. Work backwards: STEP (5): Why do we get back M? Answer is that this comes from Euler's theorem. EULER'S THEOREM: for any a in Z_n^*, a^{phi(n)} = 1 (mod n). So, x^d = M^{de} mod n. By definition, de = 1 + k*phi(n). So, M^{1 + k*phi(n)} = M*M^{phi(n)^k} = M*1^k = M (mod n). How about the question of computing x^d mod n quickly. x,d,n are all 200-digit numbers. Answer: use fast exponentiation. Step(3): just requires computing inverse which we know how to do. Take extended gcd(e, phi(n)) and get numbers d,k such that d*e + k*phi(n) = 1. So, d is the inverse of e mod phi(n). Step (1): pick random large number and then walk up, testing for primality. So,... how can we tell if a number is prime? Isn't this just as hard as factoring? Answer is no: we have fast ways of proving that a number is composite without actually having any idea how to factor it. Idea for why primality testing might be easy is Fermat's little theorem. Say you are given some number n and you pick a random a < n and compute a^{n-1} mod n and you *dont* get 1. They you *know* n is not prime. Can do for a bunch of random a's. Could it be that n is composite and yet all or most of a's have a^{n-1} = 1 mod n. It turns out unfortunately the answer is YES, but the good news is that those special numbers (called Carmichael numbers) are actually easy to factor. Before doing this, some issues to address first: ====================================================================== On the security of RSA: Suppose that given n we could figure out phi(n). Then, could use to factor. This is true for general n, but especially easy for case that n=p*q. Basically, we get two equations in two unknowns: (1) n = pq (2) phi(n) = (p-1)(q-1) (2) gives us: phi(n) = n - p - q + 1, so q = n - p - phi(n) + 1. Then plug into (1) and solve quadratic for p. Now, claim is that given our public key (e,n) figuring out our decryption key d is as hard as factoring n. I'll just prove for case that e=3. Reason is that we know e*d = 1 mod phi(n). So, e*d - 1 is a multiple of phi(n). So, not quite as good as having phi(n) but close. In particular, if e=3, then the multiple is either 1, 2, or 3, so we can just try e*d, e*d/2 and e*d/3. So, that's one reason RSA is believed to be secure. Now, it is possible someone might be able to decrypt or get info about M without actually figuring out the decryption key, so it is conceivable one could break RSA without actually having to factor. ======================================================================== Chinese Remainder Theorem ------------------------- Story: want to figure out how many troops. We know less than 1000, but how many? ask to get into groups of 9, how many left over? 10, how many left over? 13, how many left over? Claim is: this uniquely determines the answer mod 9*10*13. Specifically, say n1, n2, ..., nk are (pairwise) relatively prime. n=n1*n2*...*nk. e.g., n1 = 9, n2 = 10, n=90 Consider the mapping from Z_n to Z_n1 x Z_n2 x ... x Z_nk r -> (r mod n1, r mod n2, ..., r mod nk). e.g., 27 -> (0, 7) 28 -> (1, 8) 30 -> (3, 0) 35 -> (8, 5) Chinese remainder Theorem: this map is 1-1, and furthermore there's an efficient inverse. Won't give full proof but idea of why it's 1-1 is that every time we add 1 to r, we're adding 1 to each of tuple modulo its modulus, and we don't get back to (0,0) until we've gone a number of steps equal to the least common multiple, which is n since these are all relatively prime. How to go backwards: if we know that (1,0) goes back to x, and (0,1) goes back to y, then (8,5) goes back to 8x + 5y. E.g., in this case above, (1,0) comes from 10 and (0,1) comes from 81, so (8,5) goes back to 80 + 5(81) = 80 +5(-9) = 35. Interesting fact: if n = p*q, then the number 1 has 4 square roots mod n. They are: (1,1) = 1 (-1,-1) = -1 (1, -1) = x (-1,1) = -x If we could find x, then we could factor, since x-1 = (0,-2) is a multiple of p but is not 0 mod n, so GCD(x-1,n) = p. In fact, every square has 4 square roots. This means that if you had a black box that could find square roots mod n, then could factor. Idea: take random y and square it. Put that into black box. It might return y or -y, but just as likely to return x or -x where x^2 = y^2 mod n. So, (x-y)*(x+y) = 0 mod n, but x-y and x+y are not 0 mod n. So, take GCD(x-y,n) to get a factor. ====================================================================== Primality Testing ----------------- How can we tell if n is prime or composite without actually factoring it? As mentioned above, idea is: pick some a and take a^{n-1} mod n. If you don't get 1, then you know that n is not prime. This is a nice easy certificate of compositeness. Of course if a^{n-1} = 1 mod n, that doesn't guarantee that n is prime. But, here's an interesting thing. The set S = {a in Z_n^*: a^{n-1} = 1} is a group. Closed under multiplication: If a in S, b in S, ab in S because (ab)^{n-1} = a^{n-1}*b^{n-1} = 1 and closed under inverses: If a in S, ab = 1 mod n, then (ab)^{n-1} = 1, so b^{n-1} = 1. This means that if there is even a single number a in z_n^* such that a^{n-1} != 1 mod n, then AT LEAST HALF of the a's have this property since the size of a subgroup divides the size of the group -- so if it's not the whole thing, it can be at most half. So, are there composites where a^{n-1} = 1 mod n for *all* a? Unfortunately, yes, and they're called Carmichael numbers. Smallest is 561. But, if somehow we know that n is not a Carmichael number (and there are very few of them) here is a randomized alg for testing primality: Repeat k times: Pick a in {1,...,n-1} at random if GCD(a,n) != 1, return COMPOSITE if a^{n-1} != 1 mod n return COMPOSITE Return "PRIME (I THINK)" If n is composite and not Carmichael, what's the chance we don't catch it? Ans: 1/2^k. If n is prime, what's the chance we output the wrong thing? Ans: 0. Problem: what to do about those Carmichael numbers? Luckily, it turns out that they are easy to factor. Probably don't have time for this, but here's how you do it. If you then combine this with the algorithm above, you get the Miller-Rabin primality-testing algorithm. How to factor Carmichael numbers: Pull out multiples of 2 from n-1. Let n-1 = 2^r * B. Repeat k times: Pick a in {1,...,n-1} at random. Assume GCD(a,n)!= 1. [else we're done] Compute a^B, a^2B, a^4B, ..., a^{n-1} mod n. If any t in {B,2B,...} have a^t != {1,-1} mod n, but a^{2t} = 1 mod n, then GCD(a^t+1,n) is a factor. Else, continue; The proof that this works uses the following key lemma: KEY LEMMA: Suppose n is odd, not a prime power or perfect square, and composite. Let t < n. If there exists a in Z_n^* such that a^t != 1 (mod n), then at least half of x in Z_n^* have x^t != {-1,+1} mod n. Furthermore, if t is ODD, then such an a exists. Idea of proof of key lemma: show that set of x such that x^t = 1 or -1 mod n is a group. So, if a^t != -1 then we're done. But even if a^t = -1, can use Chinese Remainder theorem to get what we want: write n = q*r, where these are rel prime. Then, to get first part, look at number that is a mod q and 1 mod r (a,1). This thing to the power t is (-1,1), which isn't 1 or -1. Last part is a little trickier but same rough idea. Proof of factoring Carmichael given our key lemma: If we look at the different exponents t used in the algorithm (B, 2B, ...) the lemma says that we can put them into two categories: (1) all a have a^t = 1 mod n (2) at least half of a have a^t != 1 or -1 (mod n) Also the lemma tells us that t=B is in category (2), so the world looks something like this: t = B 2B 4B 8B ... n-1 --------------------------------------------------- category: 2 2 1 1 ... 1 ^ call this point t_critical Since n is charmichael, n-1 is in category 1. Now, define t_critical as largest exponent in category (2). This means that each iteration, there is at least 1/2 chance that a^{t_critical} != +1, -1.