15-451 Algorithms 11/13/13 recitation notes

* Hand back quiz
* A few problems from old hwks/minis/tests related to approx algs and
  online algs
=================================================================
- Go over quiz.

- Here are some problems from prior years' minis and hwks related to
  approximation algorithms and online algorithms.

Problem 1. Strange SAT

We know the problem "Given a CNF formula, does it have a satisfying
assignment?" is NP-complete.  In fact, the problem "Given a CNF
formula, does there exist an assignment that satisfies exactly one
literal per clause?" is also NP-complete.  However, the problem "Given
a CNF formula, does there exist an assignment that satisfies an odd
number of literals in each clause?" *can* be solved in polynomial
time.  Why/how?

Answer: convert each clause into a linear equation mod 2: a clause
with literals l_1, ..., l_k becomes the equation l_1 + ... + l_k = 1
(mod 2), and a negated literal NOT(x) is just 1 + x (mod 2), so
negations only flip the constant on the right-hand side.  Then solve
the resulting system using Gaussian elimination.  Notice that Gaussian
elimination is fine mod 2 (unlike linear programming).  Here is an
example if you want to get a feel for it:

    x + y = 1          (mod 2)
    y + z = 0          (mod 2)
    y + w = 1          (mod 2)
    x + z + w = 0      (mod 2)

Comment: Notice that you've shown the following strange fact.  Say A
is the set of CNF formulas that have a satisfying assignment, B is the
set of CNF formulas that have an assignment satisfying an odd number
of literals per clause, and C is the set of CNF formulas that have an
assignment satisfying exactly one literal per clause.  Notice that C
is a subset of B, and B is a subset of A.  This implies that even
though it is NP-complete to determine whether a given formula belongs
to A, and NP-complete to determine whether a given formula belongs to
C, membership in B can be decided in polynomial time.  This is
interesting because given some formula F, if your algorithm says NO,
then we know that F is not in C, and if your algorithm says YES, then
we know that F *is* in A.
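For concreteness, here is a small sketch (not from the original notes)
of how the mod-2 Gaussian elimination might be coded up; the function
name solve_gf2 and the bitmask row representation are just
illustrative choices.  On the example system above it finds x=1, y=0,
z=0, w=1.

    def solve_gf2(equations, num_vars):
        """Solve a system of linear equations over GF(2), i.e., mod 2.

        Each equation is a pair (coeffs, rhs): coeffs is a 0/1 list of
        length num_vars, rhs is 0 or 1.  Returns one satisfying 0/1
        assignment, or None if the system is inconsistent.
        """
        pivots = {}  # pivot variable index -> (coefficient bitmask, rhs bit)
        for coeffs, rhs in equations:
            mask = sum(c << j for j, c in enumerate(coeffs))
            # Eliminate every existing pivot variable from the new row.
            for var, (pmask, prhs) in pivots.items():
                if mask >> var & 1:
                    mask ^= pmask
                    rhs ^= prhs
            if mask == 0:
                if rhs == 1:
                    return None   # reduced to 0 = 1: no assignment exists
                continue          # reduced to 0 = 0: redundant equation
            var = mask.bit_length() - 1   # highest variable becomes the pivot
            # Eliminate the new pivot from all earlier pivot rows.
            for v, (pmask, prhs) in list(pivots.items()):
                if pmask >> var & 1:
                    pivots[v] = (pmask ^ mask, prhs ^ rhs)
            pivots[var] = (mask, rhs)

        # Free variables default to 0; each remaining pivot row involves
        # only its own pivot variable and free variables.
        assignment = [0] * num_vars
        for var, (mask, rhs) in pivots.items():
            val = rhs
            for j in range(num_vars):
                if j != var and (mask >> j & 1):
                    val ^= assignment[j]
            assignment[var] = val
        return assignment

    # The example system, with variables x, y, z, w as indices 0..3:
    eqs = [([1, 1, 0, 0], 1),   # x + y         = 1
           ([0, 1, 1, 0], 0),   #     y + z     = 0
           ([0, 1, 0, 1], 1),   #     y     + w = 1
           ([1, 0, 1, 1], 0)]   # x     + z + w = 0
    print(solve_gf2(eqs, 4))    # -> [1, 0, 0, 1]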
Problem 2: The List-Update Problem.

Suppose we have n data items x_1, x_2, ..., x_n that we wish to store
in a linked list in some order.  Let's say the cost for performing a
lookup(x) operation is $1 if x is at the head of the list, $2 if x is
the second element in the list, and so on.  For instance, say there
are 4 items and it turns out that we end up accessing x_1 3 times, x_2
5 times, x_3 once, and x_4 twice.  In this case, in hindsight, the
best ordering for a linked list would have been (x_2, x_1, x_4, x_3),
with a total cost of $21.

The {\em Move-to-Front} (MTF) strategy is the following algorithm for
organizing the list if we don't know in advance how many times we will
access each element.  We begin with the elements in their initial
order (x_1, x_2, \ldots, x_n).  Then, whenever we perform a lookup(x)
operation, we move the item accessed to the front of the list.  Let us
say that performing the movement is free.  For instance, if the first
operation was lookup(x_3), then we pay $3, and afterwards the list
will look like (x_3, x_1, x_2, x_4, ...).

Your job is to prove that the total cost of the MTF algorithm on a
sequence of m operations (think of m as much larger than n) is at most
2C_{static} + n^2, where C_{static} is the cost of the best static
list in hindsight for those m operations (like in our first example).
We will prove this in two steps.

a. First prove the somewhat easier statement that the cost of
   Move-to-Front is at most 2C_{initial}, where C_{initial} is the
   cost of the original ordering (x_1, x_2, \ldots, x_n).

Answer: Use an amortized (potential function / piggy bank) argument.
Say that x_j is "cutting in line" in front of x_i if x_j is ahead of
x_i in MTF's current list even though j > i, i.e., x_j is behind x_i
in the original ordering.  Let the potential be the number of such
pairs; initially it is 0.  Now consider a lookup(x_i) that costs d,
i.e., x_i is currently in position d.  If d > i, then there must be at
least (d-i) elements cutting in line in front of it (of the d-1
elements ahead of x_i, at most i-1 can come before it in the original
ordering).  Notice also that by moving x_i to the front, these are no
longer cutting in front of x_i, and x_i is now cutting in line in
front of at most i-1 new elements (because by definition, there are at
most i-1 elements that x_i can possibly cut in line in front of).
Finally, no other pairs (x_j, x_k) have their ordering changed by our
movement of x_i.  So the cost plus the change in potential is at most
d - max(0, d-i) + (i-1), which is at most 2i-1.  Since the same lookup
costs i in the original ordering, the amortized cost is less than
twice what the original ordering pays; summing over all m operations
(the potential starts at 0 and never goes negative) gives a total MTF
cost of at most 2C_{initial}.

b. Now prove the 2C_{static} + n^2 bound.

Answer: Do exactly as in part (a), except define "cutting in line"
with respect to the optimal static list.  Of course the algorithm
doesn't know what that optimal list is, but that's fine since this is
just for the analysis.  Note also that we need to "seed" our piggy
banks with some initial cash, because the initial ordering may already
have pairs out of order relative to the optimal static list; there are
at most n^2 such pairs, so this costs us at most n^2.  (A small
simulation of this bound appears at the end of these notes.)

Note: one nice use of this is for data compression.  You store each
ascii character in a list in this way, and then when reading a string
of text, for each character you output its index i in the list before
moving the character to the front.  This requires only O(log i) bits,
which will be small if the item was close to the front of the list.
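To make the compression idea concrete, here is a tiny sketch of such a
move-to-front encoder (again not from the original notes; mtf_encode
and the toy alphabet are just for illustration).  Characters that
repeat soon after their last use get small indices, which a
variable-length code could then store in few bits.

    def mtf_encode(text, alphabet):
        """Move-to-front transform: output each character's current
        1-based position in the list, then move it to the front."""
        lst = list(alphabet)
        out = []
        for ch in text:
            i = lst.index(ch)
            out.append(i + 1)           # small index if ch was used recently
            lst.insert(0, lst.pop(i))   # the move itself is free
        return out

    print(mtf_encode("aabbbbc", "abcdefgh"))   # -> [1, 1, 2, 1, 1, 1, 3]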
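Finally, here is the small simulation sketch promised in part (b): it
runs MTF on the frequency profile from the opening example and checks
the total cost against 2C_{static} + n^2 (again just an illustrative
sketch; mtf_cost and best_static_cost are made-up names).

    from collections import Counter
    import random

    def mtf_cost(access_sequence, initial_order):
        """Total lookup cost paid by Move-to-Front."""
        lst = list(initial_order)
        total = 0
        for x in access_sequence:
            pos = lst.index(x)            # 0-based position of x
            total += pos + 1              # $1 for the head, $2 for second, ...
            lst.insert(0, lst.pop(pos))   # move the accessed item to the front
        return total

    def best_static_cost(access_sequence, items):
        """Cost of the best static list in hindsight: most frequent first."""
        counts = Counter(access_sequence)
        order = sorted(items, key=lambda x: -counts[x])
        return sum(counts[x] * (i + 1) for i, x in enumerate(order))

    items = ['x1', 'x2', 'x3', 'x4']
    # Frequencies from the example: x1 3 times, x2 5 times, x3 once, x4 twice.
    seq = ['x1'] * 3 + ['x2'] * 5 + ['x3'] + ['x4'] * 2
    random.shuffle(seq)                      # any access order will do
    c_static = best_static_cost(seq, items)  # = 21 for this profile
    print(mtf_cost(seq, items), "<=", 2 * c_static + len(items) ** 2)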