15-451 Algorithms                                                 11/12/09

* Number-theoretic algorithms
   - fast modular exponentiation
   - GCD
   - a^{-1} mod N

==============================================================================

Before going into today's topic, just a quick "retrospective" on
NP-completeness. One interesting thing is that often many similar-looking
problems can be quite different in terms of hardness. E.g., 3-coloring is
NP-complete but 2-coloring is easy. 3-SAT is NP-complete, but 2-SAT is easy
(can anyone see how to do it?). Hamiltonian cycle is hard (a tour that visits
each vertex exactly once) but Euler tour is easy (a tour that visits each
edge exactly once). To show a problem is NP-hard, show how you could use it
to solve 3-SAT.

Another pair of problems that seem similar but have very different hardness
is (a) factoring vs (b) primality testing. We have efficient algorithms for
(b) but nothing for (a). In fact, many cryptosystems are based on the
assumption that factoring is hard (though it is not believed to be NP-hard).

NUMBER-THEORETIC ALGORITHMS
===========================

In the next few classes we are going to talk about algorithms for number
problems. Assume inputs are given in binary, so if our input is some number
N, then its "size" is log(N).

Some basic things we can do in polynomial time: add two numbers, multiply
two numbers, take A mod N (by this I mean the remainder when A is divided by
N: the smallest non-negative integer of the form A - kN).

A basic fact: if we want to compute, say, X*Y*Z mod N, we can mod out by N
as we go, since (X*Y - kN)*Z - k'N = X*Y*Z - k''N for some integer k''.

Let's look at our first nontrivial problem: modular exponentiation.

MODULAR EXPONENTIATION
======================

Given A, B, N, all n-bit numbers. Goal: compute A^B mod N.

If B were small, like 3, we could do this easily by just multiplying A by
itself B times. But what if B is a large n-bit number? E.g., you need to do
this when decrypting under RSA.
Note: our goal is at least plausible since the output is at most n bits long
(if we didn't have the "mod N", we couldn't even write down the output in
polynomial time...).

So, what's a faster way than multiplying A by itself B times? Answer: use
repeated squaring. Let X = 1. Walk left-to-right down the bits of B. Each
time we see a 1, do X = (X^2)*A mod N. Each time we see a 0, just do
X = X^2 mod N. Notice that at each step we have A^B' mod N, where B' is the
number corresponding to the part of B we've read so far.

GREATEST-COMMON-DIVISOR
=======================

GCD(A,B) is the largest integer d such that A and B are both multiples of d.
GCD(A,0) = A.

Can we compute GCD(A,B) quickly? Notice that the number of bits in the input
is log(A) + log(B), so we want to have a good dependence on that.

There is a classic algorithm, over 2000 years old, called Euclid's
algorithm. It is based on the observation that GCD(A,B) = GCD(B, A mod B).

[Proof: if A and B are both multiples of d, so A = A'*d and B = B'*d, then
A - kB = A'd - kB'd is a multiple of d too. Conversely, if B is a multiple
of d and "A mod B" is a multiple of d, then A = kB + (A mod B) has to be a
multiple of d.]

So, this is the algorithm:

    GCD(A,B)   // assume A >= B (will be true after 1st iteration anyway)
       if (B == 0) return A
       else return GCD(B, A mod B)

E.g., GCD(51,9) = GCD(9,6) = GCD(6,3) = GCD(3,0) = 3.

Can anyone see a quick argument that the number of iterations is linear in
the number of bits in the input? One way to see this is that "A mod B" is
guaranteed to have at least one fewer bit than A. In particular, if B has
fewer bits than A then this is easy, and if A and B have the same number of
bits, then doing A - B gets rid of the leading 1, so again it is true. So,
each iteration reduces the total number of bits in the inputs by at least 1.

- EXTENDED GCD: also compute integers x and y such that d = Ax + By. For
  example, A=7, B=9: d=1, and 1 = 4*7 - 3*9, so x=4, y=-3.

How to do it: we can compute x and y with the same algorithm.
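Before going on, here is a sketch in Python of the two routines described so
far: repeated squaring for A^B mod N, and Euclid's algorithm for the GCD
(the function names are mine, not part of the lecture):

```python
def mod_exp(a, b, n):
    """Compute a^b mod n by repeated squaring, walking left-to-right
    down the bits of b, exactly as described above."""
    x = 1
    for bit in bin(b)[2:]:        # bits of b, most significant first
        x = (x * x) % n           # every bit: square
        if bit == '1':            # a 1 bit: also multiply in a
            x = (x * a) % n
    return x                      # invariant: x = a^(bits read so far) mod n

def gcd(a, b):
    """Euclid's algorithm, iteratively: gcd(a, b) = gcd(b, a mod b)."""
    while b != 0:
        a, b = b, a % b
    return a
```

Note that each intermediate value in mod_exp stays below n^2, which is what
makes the whole thing polynomial time; e.g., mod_exp(5, 117, 19) matches
Python's built-in pow(5, 117, 19).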
Just when we compute A mod B = A - kB, remember the k we used. Now,
recursively running on B and A - kB, say we get back x', y', d such that
d = Bx' + (A - kB)y'. This means that d = Ay' + B(x' - ky'). So we can
output x = y', y = x' - ky' (and of course the same d). This seems like a
curiosity, but it turns out to be really useful.

More on Modular Arithmetic
==========================

Z_N   = {0, 1, 2, ..., N-1}
Z_N^* = {A in Z_N : gcd(A,N) = 1}. If N is prime, then Z_N^* = {1,2,...,N-1}.

Z_N^* is a group under multiplication mod N: if you multiply two numbers
relatively prime to N, you get another number relatively prime to N. (If N
doesn't share any factors with either one, then it doesn't share any factors
with their product.) Z_N is a group under addition mod N.

[At this point we will use "(mod N)" to mean we are doing everything
modulo N.]

A^{-1} (mod N): the inverse of A modulo N is defined to be an integer B in
Z_N^* such that AB = 1 (mod N). Each A in Z_N^* has an inverse modulo N.
What is 5^{-1} mod 17? How about 2^{-1} mod 17? 16^{-1} mod 17?

Question: why do inverses exist, and how can we compute them quickly?
Here's how: compute the extended GCD of N and A. We get x, y such that
Nx + Ay = 1. So Ay = 1 (mod N): y is the inverse of A. E.g., EGCD(17,5)
calls EGCD(5,2), where 2 = 17 - 3*5. This returns x' = 1, y' = -2. So our
answer is x = -2, y = 1 - 3*(-2) = 7. So 5^{-1} = 7 mod 17.

Euler's Theorem, Fermat's little theorem
========================================

Fermat's little theorem: if N is prime and A is in Z_N^*, then
A^{N-1} = 1 (mod N).

So, if we pick some A (like A=2) and compute A^{N-1} mod N (which we now
know how to do efficiently), and find the result is not equal to 1, this is
a PROOF that N is not prime, even though it gives us no information about
how to factor N. Of course it could be that composites also have the same
property, but it turns out that composites actually come in two types.
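Returning for a moment to the extended GCD and the inverse computation
worked through above, both fit in a few lines of Python (a sketch with my
own function names; it reproduces the EGCD(17,5) example from the lecture):

```python
def ext_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and d = a*x + b*y,
    following the recursion above: remember the k where a mod b = a - k*b."""
    if b == 0:
        return (a, 1, 0)                 # gcd(a, 0) = a = a*1 + 0*0
    k = a // b
    d, x1, y1 = ext_gcd(b, a - k * b)    # d = b*x1 + (a - k*b)*y1
    return (d, y1, x1 - k * y1)          # so d = a*y1 + b*(x1 - k*y1)

def mod_inverse(a, n):
    """Inverse of a modulo n, when gcd(a, n) = 1: from n*x + a*y = 1
    we get a*y = 1 (mod n), so y (taken mod n) is the inverse."""
    d, x, y = ext_gcd(n, a)
    if d != 1:
        raise ValueError("a has no inverse mod n")
    return y % n
```

E.g., ext_gcd(17, 5) returns (1, -2, 7), matching 17*(-2) + 5*7 = 1 above,
and so mod_inverse(5, 17) is 7.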
A rare type, called Carmichael numbers, have the same property as above, but
they turn out to be easy to factor and in any case they are very sparse. The
rest of the composites N have the property that at least half of the A in
Z_N^* satisfy A^{N-1} != 1 (mod N). So if your number is not Carmichael, and
you pick 100 random A's and they all give you 1, you can be pretty confident
that the number is prime. This gives a fast randomized primality test.
Recently, a deterministic no-error polynomial-time primality test was
developed too (but it is slower).

Let's end by proving Fermat's little theorem, and a generalization called
Euler's theorem. In the next class we'll extend this further to get a
randomized polynomial-time algorithm for testing if a number is prime or
composite.

DEFN: Euler phi function: phi(N) = |Z_N^*|. E.g., if N is prime, then
phi(N) = N-1.

DEFN: for A in Z_N^*, order(A) = smallest t > 0 such that A^t = 1 (mod N).

THEOREM: for all A in Z_N^*, order(A) divides phi(N).

COROLLARY 1 (Euler's Theorem): for any A in Z_N^*, A^{phi(N)} = 1 (mod N).

Proof: if t is the order of A, then phi(N) = k*t for some k, and
A^{phi(N)} = (A^t)^k = 1 (mod N).

COROLLARY 2 (Fermat's little theorem): if N is prime, then for any A in
Z_N^*, A^{N-1} = 1 (mod N). (When N is prime, phi(N) = N-1.)

Proof of THEOREM:

Part 1: Note that {1, A, A^2, ..., A^{t-1}} is a subgroup of Z_N^*: it is
closed under multiplication mod N and under inverses.

Part 2: Now we can use (and prove) a basic fact from group theory: THE SIZE
OF ANY SUBGROUP DIVIDES THE SIZE OF THE GROUP. Proof: say H is a subgroup of
G and y is in G but not in H. Then the coset yH = {yh : h in H} is a set of
size |H| (if y*h1 = y*h2 then h1 = h2) and is disjoint from H (if y*h1 = h2
then y = h2*h1^{-1}, which would be in H by H's closure properties, a
contradiction). Furthermore, any two cosets are either identical or disjoint
(if z*h1 = y*h2 then z = y*h3 for some h3 in H, so zH = yH). So the cosets
partition G into blocks of size |H|, which means |H| divides |G|.

So, that's it. One last thing: the basic group theory fact we proved above
is very powerful for randomized algorithms.
In particular, we'll show that if N is composite, the set of A's that are
*bad* (have A^{N-1} = 1 mod N) forms a subgroup. So, if there is at least
one *good* A, this means at least half of the A's are good. This doesn't
tell us *where* they are, but it means that if we pick A's at random and
try them, pretty soon we'll get one where A^{N-1} != 1 (mod N).
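The randomized primality test sketched in this lecture (pick random A's and
check Fermat's little theorem) can be written directly in Python. This is
only the plain Fermat test from today's notes, not the refined test of the
next class, so a Carmichael number can fool it:

```python
import random

def fermat_test(n, trials=100):
    """Randomized primality test: if some random a has a^(n-1) != 1 (mod n),
    that is a proof that n is composite; if all trials give 1, n is probably
    prime (Carmichael numbers can fool this test)."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:    # found a Fermat witness: composite
            return False
    return True                      # probably prime
```

Note that if the random a happens to share a factor with n, then a^(n-1)
mod n cannot be 1 either, so the test still correctly answers "composite"
in that case.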