RECITATION NOTES  15-451 Algorithms                          10/11/06

 * Bellman-Ford vs Dijkstra
 * Floyd-Warshall
 * Knapsack problem

[We'll do more on Kruskal and Prim on Thursday]
=========================================================================

Shortest paths: Bellman-Ford versus Dijkstra
--------------------------------------------
We've seen two algorithms for finding shortest-path trees.
Bellman-Ford worked by finding the shortest path to the goal that
uses at most 1 edge, then the shortest that uses at most 2 edges,
etc.  Key idea: the shortest path from v to the goal that uses i or
fewer edges will first go to some neighbor x, and then take the
shortest path from x to the goal that uses i-1 or fewer edges, which
we've already solved for.  So just take the min over all neighbors x.
Running time O(mn).

Dijkstra works a bit differently.  Here you build out a tree from the
goal, and with appropriate data structures you get time O(m log n).
(Note: in a directed graph you can either run it with the starting
node as a source or with the starting node as a goal.)

Why would you ever want to use Bellman-Ford?  One answer is that it
can also solve more general problems.  One such case is when you have
directed edges of negative length (we'll see places where this comes
up later), so long as there are no negative-cost cycles.  E.g.,

           -10
        a<-----b
        |      |
      12|      |5
        |      |
     goal-----c
           6

Here Dijkstra will first put on c with 6, then b with 11, then a with
12.  Think about: where did the proof of correctness fail???

==================================================================
ALL-PAIRS SHORTEST PATHS
------------------------
Say we want to compute the length of the shortest path between
*every* pair of vertices in a graph.  Here is a really cute algorithm
called Floyd-Warshall that can do this in time O(n^3).  The idea is
that instead of increasing the number of edges in the path like
Bellman-Ford, we'll increase the set of vertices we allow as
intermediate nodes in the path.
In other words, starting from the same base case (the shortest path
that uses no intermediate nodes), we'll then go on to the shortest
path that's allowed to use node 1 as an intermediate node.  Then the
shortest path that's allowed to use {1,2} as intermediate nodes, and
so on.

Let A be the adjacency matrix for the graph: A[i][j] = length of edge
(i,j) if the edge exists, infinity if it doesn't exist, and 0 on the
diagonal.

  /* after each iteration of the outside loop, A[i][j] = length of
     shortest i->j path that's allowed to use intermediate vertices
     in the set 1..k */
  for k = 1 to n do:
     for each i,j do:
        A[i][j] = min( A[i][j], A[i][k] + A[k][j] );

i.e., you either go through k or you don't.  Total time is O(n^3).
What's amazing here is how compact and simple the code is!

===================================================================
KNAPSACK PROBLEM
----------------
Imagine you have a homework assignment with different parts
A,B,C,D,E,F,G.  Each part has a "value" (in points) and a "size"
(time in hours) to do.  E.g.:

                 A   B   C   D   E   F   G
       value:    7   9   5  12  14   6  12
       time:     3   4   2   6   7   3   5

You have: 15 hours.  Which parts should you do?  What's the best
total value possible?  (Assume no partial credit on parts.)

(A: 34 -- ABFG)

This is called the "knapsack problem": Given n items, each with size
s_i and value v_i, and a knapsack of size S.  Goal: find a subset of
items of max possible total value such that the sum of their sizes is
<= S.

Can solve in time O(2^n) by trying all possible subsets.  Want
something better.  We'll use DP to solve it in time O(n*S).

Let's do this top down, by starting with a simple recursive solution
and trying to memoize.  Start by just computing the best possible
value -- then we can see how to actually extract the items needed.

Recursive algorithm: either we use the last element or we don't.
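A quick way to sanity-check this recursion is to code it up directly.
Here's a sketch in Python (the list names `values` and `sizes` are my
own; they hold the homework table from above, with item i stored at
index i-1):

```python
values = [7, 9, 5, 12, 14, 6, 12]   # points for parts A..G
sizes  = [3, 4, 2, 6, 7, 3, 5]      # hours for parts A..G

def value(n, S):
    """Best total value achievable using items 1..n with S hours left."""
    if S <= 0 or n == 0:
        return 0
    if sizes[n-1] > S:                      # can't use the nth item
        return value(n-1, S)
    return max(values[n-1] + value(n-1, S - sizes[n-1]),   # use it
               value(n-1, S))                              # skip it

print(value(7, 15))   # -> 34, matching the answer above (ABFG)
```

Exponential in n, of course -- memoizing into a 2-d array is what
brings it down to O(nS).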
Value(n,S)   // S = space left, n = # items still to choose from
{
   if (S <= 0 || n == 0) return 0;
   if (s_n > S) result = Value(n-1,S);   // can't use nth item
   else result = max{v_n + Value(n-1, S-s_n), Value(n-1, S)};
   return result;
}

Right now, the time is O(2^n).  But how can we speed this up?  Use a
2-d array to store results so we don't end up computing the same
thing over and over again.  Initialize arr[i][j] to "unknown".

Value(n,S)
{
   if (S <= 0 || n == 0) return 0;
   if (arr[n][S] != unknown) return arr[n][S];   // <- added this
   if (s_n > S) result = Value(n-1,S);
   else result = max{v_n + Value(n-1, S-s_n), Value(n-1, S)};
   arr[n][S] = result;                           // <- and this
   return result;
}

Running time: same analysis as for LCS: O(nS).

How to get the items?  Work backwards: if arr[n][S] = arr[n-1][S]
then we *didn't* use the nth item, so recursively work backwards from
arr[n-1][S].  Otherwise, we *did*, so output the nth item and
recursively work backwards from arr[n-1][S-s_n].

[if you want to do the other algorithm for Knapsack, which is time
O(nV), look at the solutions to hwk3 for Fall05]
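To make the whole pipeline concrete, here's a Python sketch of the
memoized version plus the backwards walk that recovers the items, run
on the homework example from above.  (Names like `values`, `sizes`,
`N`, and `CAP` are my own; `None` plays the role of "unknown".)

```python
values = [7, 9, 5, 12, 14, 6, 12]   # points for parts A..G
sizes  = [3, 4, 2, 6, 7, 3, 5]      # hours for parts A..G
N, CAP = len(values), 15

# arr[n][S] caches Value(n,S); None means "unknown"
arr = [[None] * (CAP + 1) for _ in range(N + 1)]

def value(n, S):
    if S <= 0 or n == 0:
        return 0
    if arr[n][S] is not None:               # <- memo check
        return arr[n][S]
    if sizes[n-1] > S:                      # can't use the nth item
        result = value(n-1, S)
    else:
        result = max(values[n-1] + value(n-1, S - sizes[n-1]),
                     value(n-1, S))
    arr[n][S] = result                      # <- store the answer
    return result

best = value(N, CAP)                        # fills in the table

# Work backwards to find which items were used (1-indexed as in
# the notes, so item n corresponds to index n-1 in the lists).
items, S = [], CAP
for n in range(N, 0, -1):
    if value(n, S) == value(n-1, S):        # didn't use item n
        continue
    items.append(n)                         # did use item n
    S -= sizes[n-1]
items.reverse()

print(best, items)   # -> 34 [1, 2, 6, 7], i.e., parts A,B,F,G
```

Note the recovery loop only reads values the memoized function can
compute on demand, so it costs at most O(nS) extra work.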