15-451 Algorithms 11/21/06 * Number-theoretic algorithms - fast modular exponentiation - GCD - a^{-1} mod N ============================================================================== In the next few classes we are going to talk about algorithms for number problems. Assume inputs given in binary, so if our input is some number N, then its "size" n is log(N). Some basic things we can do in polynomial time: add two numbers, multiply two numbers, take A mod N (by this I mean the remainder when A is divided by N: the smallest non-negative integer of the form A - kN). Some basic facts: if we want to compute, say, X*Y*Z mod N, we can mod out by N as we go, since (X*Y - kN)*Z = X*Y*Z - k'N for some k'. So, can keep the numbers at size O(n) = O(log N). Let's look at our first nontrivial problem: modular exponentiation. MODULAR EXPONENTIATION ====================== Given A,B,N, all n-bit numbers. Goal: compute A^B mod N. If B was small, like 3, we could do this easily by just multiplying A by itself B times. But what if B is a large n-bit number? E.g., you need to do this when decrypting under RSA. Note: our goal is at least plausible since the output is at most n bits long (if you didn't have the "mod N" then we couldn't even write down the output in polynomial time...). So, what's a faster way than multiplying A by itself B times? Ans: Can use repeated squaring. Let X=1. Walk left-to-right down the bits of B. Each time we see a 1, do X = (X^2)*A mod N. Each time you see a 0, just do X = X^2 mod N. Notice that at each step we have A^B' mod N where B' is number corresponding to the part of B we've read so far. GREATEST-COMMON-DIVISOR ======================= GCD(A,B): GCD(A,B) is the largest integer d such that A and B are both multiples of d. gcd(A,0)=A. Can we compute GCD(A,B) quickly? Notice that the number of bits in the input is log(A)+log(B) so we want to have a good dependence on that. Classic algorithm over 2000 years old called Euclid's alg. Based on observation that GCD(A,B) = GCD(B, A mod B). [Proof: if A and B are multiples of d, so A = A'*d and B = B'*d, then A-kB = A'd - kB'd is a multiple of d too. Similarly, if B is a multiple of d and "A mod B" is a multiple of d then A has to be a multiple of d.] So, this is the algorithm: GCD(A,B) // assume A >= B (will be true after 1st iteration anyway) if (B==0) return A else return GCD (B, A mod B) E.g., GCD(51, 9) = GCD(9,6) = GCD(6,3) = GCD(3,0) = 3. Can anyone see quick argument that the number of iterations is linear in the number of bits in the input? One way to see this is that "A mod B" is guaranteed to have at least one fewer bit than A. In particular, if B has fewer bits than A then this is easy, and if A and B have the same number of bits, then doing A-B gets rid of the leading 1, so again it is true. So, each iteration reduces the total number of bits in the inputs by at least 1. - EXTENDED GCD: also compute integers x and y such that d = Ax + By. For example, A=7, B=9. d=1. 1=4*7-3*9, so x=4, y=-3. How to do it: can compute with same algorithm. Recursively, running on B and A-kB, we compute x', y',d such that d = Bx' + (A-kB)y'. This means that d = Ay' + B(x'-ky'). This seems like a curiosity but it turns out to be really useful. More on Modular Arithmetic ========================== Z_N = {0,1,2,...,N-1} Z_N^* = {A in Z_N : gcd(A,N) = 1}. If N is prime, then Z_N^* = {1,2,...,N-1} Z_N^* is a group under multiplication mod N: if you multiply two numbers relatively prime to N, you get another number relatively prime to N. (If N doesn't share any factors with either one, then it doesn't share any factors with their product). Z_N is a group under addition mod N. [At this point we will use "(mod N)" to mean we are doing everything modulo N]. A^{-1} (mod N): the inverse of A modulo N is defined to be an integer B in Z_N^* such that AB = 1 (mod N). Each A in Z_N^* has an inverse modulo N. Question: why do inverses exist, and how can we compute them quickly? E.g., what is 5^{-1} mod 17? Here's how: compute extended GCD of N and A. Get x,y such that Nx+Ay=1. So, Ay = 1 mod N: y is the inverse of A. E.g., EGCD(17,5) calls EGCD(5,2) where 2 = 17-3*5. This returns x'= 1, y' = -2. So, our answer is x = -2, y = 1 - 3*(-2) = 7. So, 5^{-1} = 7 mod 17. Euler's Theorem, Fermat's little theorem ======================================== Fermat's little theorem: if N is prime and A is in Z_N^*, then A^{N-1} = 1 mod N. So, if we pick some A (like A=2) and compute A^{N-1} mod N (which we now know how to do efficiently), and find the result is not equal to 1, this is a PROOF that N is not prime, even though it gives us no information about how to factor N. Of course it could be that composites also have the same property, but it turns out that composites actually come in two types. A rare type, called Carmichael numbers have the same property as above, but they turn out to be easy to factor and in any case they are very sparse. The rest of the composites N have the property that at least half of the A in Z_N^* satisfy A^{N-1} != 1 mod N. So if your number is not Carmichael, and you pick 100 random A's and they all give you 1, you can be pretty confident that the number is prime. This gives a a fast randomized primality test. Recently, a deterministic no-error polynomial-time primality test was devloped too (but it is slower). In the next class we will look at all this in more detail.