15-451 Algorithms 10/12/04 * Topological sorting * Shortest-paths * Maximum bottleneck path Note: Midterm in 1 week. * Bellman-Ford 1 sheet of notes no book ========================================================================= Topological sorting and DFS, cont. ================================= B----->A--->E |\ ^ | \ | | \ | | | / V V/ C<--D---->F Recall the Topological sorting problem: Given a DAG, we want an ordering of the vertices such that all edges go forward in the ordering. Say there are n vertices and m edges. Claim is can use Depth-first search in a tricky way to solve this in O(n+m) time. (linear time in the size of the problem). Graph stored as adjacency list. DFS(v): mark v as ``in progress''. for each unmarked out-neighbor w of v: do DFS(w). mark v as done. DFS_main(G): start with all nodes unmarked. For v=1 to n: if v is not yet marked, do DFS(v). Total time is O(n+m) because each directed edge is examined just once, plus constant time per node for marking. How to solve Topological Sorting: 1. Do depth-first search of G, and output the nodes as you *exit* them (when they are marked as done). 2. reverse the order. E.g., here, we get: B D F C A E Claim: If there is an edge from u to v, then v is exited first. (This implies that the edge will be pointing forward when we reverse the order.) Proof of claim: Easy to see if our DFS started at u before starting at v (e.g., u=B,v=D). What if we started at v first (like u=D,v=A)? In this case, we would finish v before even starting u since there can't be a path from v to u because that would create a cycle. ===================================================================== Shortest paths ============== Suppose we are given a graph G and a start node S, and we want to find the shortest path from S to everywhere in G. (called the "shortest path tree" from S). In 15-211, you saw Dijkstra's algorithm for this problem. Idea is to build up shortest-path tree in greedy way. Today: review Dijkstra, see how modifications can be used to solve some related problems (maximum bottleneck path), and then look at algorithms for some generalizations. E.g., 5 8 A-----B-----C | | | 1| 1| |3 start = A. | 2 | 4 | D-----E-----F DIJKSTRA's ALG -------------- initialize: - tree = {start}. No edges. label start as having distance 0. invariant: nodes in tree have correct distance to start. repeat: - for each nbr x of tree compute an (over)-estimate of distance to start: distance(x) = min [ distance(v) + length(e) ] edges e=(v,x) v in tree In other words, by our invariant, this is the length of the shortest path to x whose only edge not in the tree is the very last edge. - put x of minimum distance into tree, connecting it via the argmin (the edge e used to get distance(x)) [do above example] Claim: even if some of distances are wrong, the *minimum* one is correct. Proof: say x is neighbor of tree of smallest distance(x). What does the *true* shortest path to x look like? The key point is that the last edge this path takes must come directly from the tree, since otherwise the path would have length at least distance(y) from some neighbor of tree y!=x, and by definition, distance(x) is smaller. Running time: if you use a heap and code this up right, you can get time of O(m log n). Start with all nodes on heap, everyone having distance of infinity except start having distance of 0. Repeatedly pull off the minimum, then update its neighbors, tentatively assigning parents any time the distance of some node is lowered. Takes linear time to initialize, but then we do m updates, at O(log n) cost each. If you use something called a "fibonacci heap" (that we're not going to talk about) can get time down to O(m + n log n) Maximum-bottleneck path ======================= Here is another problem you can solve with this type of algorithm, called the "maximum bottleneck path" problem. Imagine the edge weights represent capacities of the edges ("widths" rather than "lengths") and you want the path between two nodes whose minimum width is largest. How could you modify Dijkstra to solve this?? Now, we'd update: bottleneck(x) = max [min(bottleneck(v),width(e))] edges e=(v,x) v in tree - put x of maximum bottleneck into tree, connecting it via the argmax We'll actually use this later in the course... =========================================================================== Bellman-Ford alg for shortest paths. (Note: Bellman is the person credited for inventing Dynamic Programming). What if we want to compute shortest paths, but some of the edge costs are negative? (For this to make sense, we need to assume the graph is *directed*, and there are no negative-cost cycles. Otherwise you can have a distance of negative-infinity by going around such a cycle infinitely-many times...) Will Dijkstra's algorithm still work? NO. Can anyone give an example?? It turns out these kinds of problems do come up --- in fact, we will need to solve for shortest paths when some edge costs are negative as a subroutine in some later algorithms in this course. So, here is a different algorithm that works in that case. The BELLMAN-FORD Algorithm ========================== 30 15 Say we have a graph 5-------------------3-----------------4 like this: | ^ | 10 | -40| 40 | | | | | | | 2-------------------1-----------------0-- 30 60 start Will use DP to solve the shortest-path problem. Alg idea: find shortest path that uses 1 or fewer edges (if it exists) use to find shortest path that uses 2 or fewer edges (if it exists). use to find shortest path that uses 3 or fewer edges (if it exists). ... [how far do we need to go?] use to find shortest path that uses n-1 or fewer edges. BELLMAN-FORD pseudocode: as usual for DP, let's just compute *distances* first, and then can reconstruct the paths. d[v][i] will mean the length of the shortest path to v that uses i or fewer edges (if it exists). initialize d[v][0] = infinity for v != start, 0 for start. for i=1 to n-1: for each v not equal to start: d[v][i] = MIN(d[x][i-1] + length(xv edge) )) x->v (try it on the above graph!) Inner MIN operation takes time proportional to in-degree of v. So, inner for-loop takes time O(sum of in-degrees of all nodes) = O(m). --> total time is O(mn). How to recosntruct the paths? Work backwards: if you're at vertex v at distance d[v], move to node x such that d[v] = d[x] + len(xv). =========================================================================== [Do this in rec?] ALL-PAIRS SHORTEST PATHS via MATRIX PRODUCTS Say we want to compute length of shortest path between *every* pair of vertices. If all edges are non-negative, we can run Dijkstra'a algorithm n times, one for each start (remember, it finds the shortest path tree from the start), for a total time of O(nm log n). If we have negative-cost edges, we can use Bellman-Ford, for a total time of O(mn^2). Here is a nice algorithm that uses the matrix representation of a graph, and runs in time O(n^3 log n). Given graph G, define matrix A(G) as follows: - A[i,i] = 0 for all i. - if there is an edge from i to j, then A[i,j] = length of that edge. - otherwise, A[i,j] = infinity. I.e., A[i,j] = length of shortest path from i to j using 1 or fewer edges. Now, following the basic Dynamic Programming idea, can we use this to produce a new matrix B where B[i,j] = length of the shortest path from i to j using 2 or fewer edges? Answer: yes. B[i,j] = min_k (A[i,k] + A[k,j]) [think about why this is true] I.e., what we want to do is compute a matrix product B = A*A except we change "*" to "+" and we change "+" to "min". In other words, instead of computing the sum of products, we compute the min of sums. What if we now want to get shortest paths using 4 or fewer edges? Just need to compute C = B*B. I.e., to get from i to j using 4 or fewer edges, we need to go from i to some k using 2 or fewer edges, and then from k to j using 2 or fewer edges. Just need to keep squaring O(log n) times. Running time: O(n^3 log(n)). Need to do log(n) matrix multiplies to get all paths of length n or less and standard matrix multiply takes n^3 operations. ============================================================================== All-pairs shortest paths via Floyd-Warshall. Here is an algorithm that shaves off the O(log n) and runs in time O(n^3). The idea is that instead of increasing the number of edges in the path, we'll increase the set of vertices we allow as intermediate nodes in the path. In other words, starting from the same base case (the shortest path that uses no intermediate nodes), we'll then go on to the shortest path that's allowed to use node 1 as an intermediate node. Then the shortest path that's allowed to use {1,2} as intermediate nodes, and so on. /* after each iteration of the outside loop, A[i][j] = len of i->j path that's allowed to use vertices in the set 1..k */ for k = 1 to n do: for each i,j do: A[i][j] = min( A[i][j], (A[i][k] + A[k][j]); i.e., you either go through k or you don't. Total time is O(n^3). ===================================================================