15-451 Algorithms       11/21/06
* Number-theoretic algorithms
  - fast modular exponentiation
  - GCD
  - a^{-1} mod N
==============================================================================
In the next few classes we are going to talk about algorithms for
number problems.  Assume inputs given in binary, so if our input is
some number N, then its "size" n is log(N).

Some basic things we can do in polynomial time: add two numbers,
multiply two numbers, take A mod N (by this I mean the remainder when
A is divided by N: the smallest non-negative integer of the form A - kN).

Some basic facts: if we want to compute, say, X*Y*Z mod N, we can mod
out by N as we go, since (X*Y - kN)*Z = X*Y*Z - k'N for some k'.
So, can keep the numbers at size O(n) = O(log N).

Let's look at our first nontrivial problem: modular exponentiation.

MODULAR EXPONENTIATION
======================
Given A,B,N, all n-bit numbers.  Goal: compute A^B mod N.  

If B was small, like 3, we could do this easily by just multiplying A
by itself B times.  But what if B is a large n-bit number?  E.g., you
need to do this when decrypting under RSA.  Note: our goal is at least
plausible since the output is at most n bits long (if you didn't have
the "mod N" then we couldn't even write down the output in polynomial
time...).  So, what's a faster way than multiplying A by itself B times?

Ans: Can use repeated squaring.  Let X=1.  Walk left-to-right down the
bits of B.  Each time we see a 1, do X = (X^2)*A mod N.  Each time you
see a 0, just do X = X^2 mod N.  Notice that at each step we have A^B'
mod N where B' is number corresponding to the part of B we've read so far.

GREATEST-COMMON-DIVISOR
=======================
GCD(A,B):  GCD(A,B) is the largest integer d such that A and B are both
multiples of d.  gcd(A,0)=A.    

Can we compute GCD(A,B) quickly?  Notice that the number of bits in
the input is log(A)+log(B) so we want to have a good dependence on
that.  Classic algorithm over 2000 years old called Euclid's alg.
Based on observation that GCD(A,B) = GCD(B, A mod B).  [Proof: if A
and B are multiples of d, so A = A'*d and B = B'*d, then A-kB = A'd -
kB'd is a multiple of d too.  Similarly, if B is a multiple of d and
"A mod B" is a multiple of d then A has to be a multiple of d.]

So, this is the algorithm:
      GCD(A,B)  // assume A >= B (will be true after 1st iteration anyway)
	if (B==0) return A
	else return GCD (B, A mod B)

  E.g., GCD(51, 9) = GCD(9,6) = GCD(6,3) = GCD(3,0) = 3.

Can anyone see quick argument that the number of iterations is linear
in the number of bits in the input?  One way to see this is that "A
mod B" is guaranteed to have at least one fewer bit than A.  In
particular, if B has fewer bits than A then this is easy, and if A and
B have the same number of bits, then doing A-B gets rid of the leading
1, so again it is true.  So, each iteration reduces the total number
of bits in the inputs by at least 1.

- EXTENDED GCD: also compute integers x and y such that d = Ax + By.
	        For example, A=7, B=9.  d=1.  1=4*7-3*9, so x=4, y=-3.

How to do it: can compute with same algorithm.  Recursively, running
on B and A-kB, we compute x', y',d such that d = Bx' + (A-kB)y'.  This
means that d = Ay' + B(x'-ky').  This seems like a curiosity but it
turns out to be really useful.

More on Modular Arithmetic
==========================
Z_N = {0,1,2,...,N-1}
Z_N^* = {A in Z_N : gcd(A,N) = 1}.  If N is prime, then Z_N^* = {1,2,...,N-1}

Z_N^* is a group under multiplication mod N: if you multiply two numbers
relatively prime to N, you get another number relatively prime to N.
(If N doesn't share any factors with either one, then it doesn't share
any factors with their product).  Z_N is a group under addition mod N.

[At this point we will use "(mod N)" to mean we are doing everything modulo N].

A^{-1} (mod N): the inverse of A modulo N is defined to be an integer B
in Z_N^* such that AB = 1 (mod N).  Each A in Z_N^* has an inverse modulo N.

Question: why do inverses exist, and how can we compute them quickly?
E.g., what is 5^{-1} mod 17?

Here's how: compute extended GCD of N and A.  Get x,y such that Nx+Ay=1.
So, Ay = 1 mod N: y is the inverse of A.

E.g., EGCD(17,5) calls EGCD(5,2) where 2 = 17-3*5.  This returns x'= 1, y'
= -2. So, our answer is x = -2, y = 1 - 3*(-2) = 7.  So, 5^{-1} = 7 mod 17. 

Euler's Theorem, Fermat's little theorem
========================================
Fermat's little theorem: if N is prime and A is in Z_N^*, then
A^{N-1} = 1 mod N.

So, if we pick some A (like A=2) and compute A^{N-1} mod N (which we
now know how to do efficiently), and find the result is not equal to
1, this is a PROOF that N is not prime, even though it gives us no
information about how to factor N. 

Of course it could be that composites also have the same property, but
it turns out that composites actually come in two types.  A rare type, called
Carmichael numbers have the same property as above, but they turn out
to be easy to factor and in any case they are very sparse.  The rest
of the composites N have the property that at least half of the A in
Z_N^* satisfy A^{N-1} != 1 mod N.  So if your number is not
Carmichael, and you pick 100 random A's and
they all give you 1, you can be pretty confident that the number is
prime.   This gives a a fast randomized primality test.  Recently, a
deterministic no-error polynomial-time primality test was devloped too
(but it is slower).  

In the next class we will look at all this in more detail.