Notes for 1/13/97 15-852 Randomized Algorithms * Intro/Admin * Why randomized algorithms? * Plan for the course * Two interesting algorithms ============================================================================= INTRO/ADMIN ----------- Grades based on homeworks + class participation + final. Each student will do one class presentation. Also, help with hwk grading. Web page: /afs/cs/usr/avrim/www/Randalgs97/home.html (http://www.cs.cmu.edu/~avrim/Randalgs97/home.html) Page will have rough notes for each class, plus handouts. Book is Motwani-Raghavan. Should be in bookstore by now. WHY RANDOMIZED ALGORITHMS? -------------------------- Randomness is a useful resource to take advantage of in designing algorithms. Ways randomness helps: 1. Use simple randomized algorithm to achieve same worst-case performance as complicated deterministic algorithm. Or, sometimes have simple deterministic heuristic with good performance "most of the time" that can transform into an algorithm with good worst-case performance by adding randomness. E.g., median-finding, many standard algorithmic problems. 2. Sometimes can do things we don't know how to do deterministically. E.g., primality testing (deterministic alg requires unproven assumptions), linear-time min-spanning tree. 3. Sometimes can do things provably impossible deterministically (protocol or oracle problems) 4. Sometimes can do things that just seem impossible... This course will talk about: * Techniques for using this resource in algorithms. * Techniques for analyzing randomized algorithms. * Some neat randomized algorithms. Book is structured in this way. First half discusses important techniques and 2nd half is a sampling of different areas where randomized algorithms are useful. What I envision for class presentations is that in 2nd half, there are a lot more subjects than we can really go through. So, I'll pick some that I like, and you will each have a chance to learn one in depth and talk about it. Today: Some neat randomized algs. Categories 1 and (3,4). SPLITTING A 3-COLORABLE GRAPH INTO TWO PIECES SO NEITHER HAS A TRIANGLE ----------------------------------------------------------------------- Definition of problem: You have a 3-colorable graph. Want to split into 3 independent sets. NP-hard. Easier problem: split into two pieces so that neither has a triangle. (If could do this for diamonds or pentagons, then this would improve current approx ratio) A solution: split arbitrarily. Find some bad triangle. Then pick one of its 3 nodes AT RANDOM, and move it to the other side. Continue until there are no more bad triangles. * How to see as random walk. In this walk, the expected time to hit boundary 0 or n is O(n^2) [*]. This means that expect to end in at most that much time. Proof of [*]: First, let's consider 50/50 random walk --- it's easy to see proving for this case is enough. Why? Now, in a random walk with 2m steps, prob that make A steps to right is 2^{-2m} * (2m choose A). What is the maximum value of this? It's maximized at A=m. Using Stirling's formula: m! is approximately (m/e)^m * sqrt{2 pi m}, so for A=m we have about 1/sqrt{pi m}. So, prob that after 2m steps you are between 0 and n is at most n/sqrt{pi m}, which is < 1/2 for m = 2n^2. So, with probability at least 1/2, we end after 4n^2 steps. With prob > 1 - 1/2^k, we end after 4kn^2 steps. So, expectation is O(n^2). More exact calculation: Let E_x be the expected time to reach 0 or n given that you start at x. So, E_0 = E_n = 0. What is E_x in terms of its neighbors? E_x = 1 + 1/3(E_x + E_{x+1} + E_{x-1}). Rewrite as: E_{x+1} - E_x = E_x - E_{x-1} - 3. So ``second derivative'' of E w.r.t. x is -3. Suggests to solve E_x = - 3/2 x^2 + cx + c' at boundary conditions. Get: 3/2 x(n-x). Note: this is using linearity of expectation, though of indep events. FINDING THE AVERAGE GRADE, WITHOUT GIVING AWAY YOUR OWN SCORE ------------------------------------------------------------- * The problem: A group of N friends wants to determine their average score on a test, but nobody wants to reveal their own score. * Even if two people collaborate, they shouldn't be able to find your score. * More generally, if K of the N "friends" are dishonest, then they should still learn no more than the sum of the remaining N-K honest ones. * Possible protocols? Well see one that works next time... [Notes: Coloring algorithm is from: Colin McDiarmid, ``A random recouloring method for graphs and hypergraphs.'' Combinatorics, Probability and Computing, 2:363-365, 1993.]