15-451 Algorithms: Lecture 2/24/00 Announcements: - Homework #2 scores being sent by email - Homework #3 due Tues - New office hours (see web page / Bboard) - Part of Tues class: Hints for designing algorithms - Midterm next Thurs 3/2 - Material cumulative through today's class - Review session 8-10 pm Wed 3/1 Room TBD ********** Topics: I. Graphs II. DFS, Topological sort III. Single-source shortest paths - BFS - Bellman-Ford ********** I. Graphs - A graph is a set of NODES, or VERTICES, with edges between some of the nodes. E.g., 1 4 |\ | | \| 2--3 V = set of vertices, E = set of edges. If there's an edge between two vertices, we call them ADJACENT or NEIGHBORS. DEGREE of a vertex = number of neighbors. standard notation: n = |V|, m = |E|. Unless otherwise specified: - edges are undirected - there is at most one edge between any two vertices - no edges from a vertex to itself (i.e., no self-loops) In an n-node graph, max degree is n-1. What is max number of total edges??? [{n \choose 2}] DIRECTED graph: each edge has a direction. For each node, can talk about out-edges (out-neighbors, out-degree) and in-edges (in-neighbors, in-degree). Unless otherwise specified: - for any two vertices u and v, there is at most one edge directed from u to v, and at most one from v to u. - no edges from a vertex to itself (i.e., no self-loops) In an n-node graph, max degree is ??? [n-1] max number of total edges??? [n(n-1), twice as many as in undirected case] Directed acyclic graph (DAG): no directed cycles ********** Graphs used to model many problems - E.g., V = airports, E = flights - E.g., V = web pages, E = hyperlinks - E.g., V = tasks, E = timing constraints - E.g., V = basic blocks in program, E = branches Getting dressed (a DAG): undershorts socks watch | \ | v v v pants --> shoes | v belt<--shirt \ | \ v \ tie \ | v v jacket ********** II. DFS, Topological sort Start with undirected graphs A------D----E | \ / | \ / C---B----F DFS(v): mark v as visited. for each neighbor w of v: if w is unvisited, then DFS(w) At end, if some nodes not yet visited then pick one and do DFS from it. Resulting tree is called DFS tree (DFS forest if the original graph is not connected). - Not unique. - If draw tree, all non-tree edges go between ancestors and descendants. Why??? Called "back" edges. ********** Directed graphs: - These are more interesting since possible for w to be reachable from v but not vice versa. E.g., jacket reachable from pants but not vice-versa DFS algorithm is the same, except consider only out-neighbors. 1 undershorts socks 6 watch 7 |* \ | v v v 2 pants --> shoes 5 |* * Numbered in order visited v i.e., depth-first numbering (DFN) 3 belt<--shirt 8 \ |* * = edge in DF forest \ v *\ tie 9 \ | v v 4 jacket Edges not in DF forest can be - "back" edges: decendant to ancestor wrt DF forest - "forward" edges: ancestor to descendant - "cross" edges: the rest - What are examples of cross edges??? [socks->shoes, shirt->belt, tie->jacket] - forward edges??? [undershorts->shoes] - back edges??? [None since DAG] - DAG iff DFS has no back edges. why??? [back edge => cycle (easy). cycle => back edge (not much harder)] ********** Topological sorting: Given a DAG, find an ordering of the vertices such that for all edges (u,v), u appears before v in the ordering. Useful for producing a schedule of the tasks that preserves the edge constraints How to solve it: - When doing DFS, when exit node, prepend to output list 1,X5 undershorts socks 6,X6 watch 7,X7 |* \ | v v v 2,X4 pants --> shoes 5,X3 |* * Numbered in order visited v i.e., depth-first numbering (DFN) 3,X2 belt<--shirt 8,X9 \ |* * = edge in DF forest \ v *\ tie 9,X8 \ | v v 4,X1 jacket E.g., shirt, tie, watch, socks, undershorts, pants, shoes, belt, jacket Claim: Output is a topological sorting Why? If there is an edge from u to v, then claim we exit v first. When (u,v) is considered by the DFS, either (1) add (u,v) as a tree edge, and eventually exit v by backtracking to u, or (2) v has already been visited, and (u,v) is a cross or forward edge (can't be back edge since acyclic), and already exited v by backtracking via another parent Note: topological sorting in linear time: O(m) (unlike regular sorting) ********** III. Single-source shortest paths Breadth-first search: - sweeping across vertices. - Do by storing current wavefront (fringe) in a FIFO queue. - when we extract from the queue that corresponds to advancing the wave front. - BFS is nice for finding shortest paths when graph is unweighted - O(m) time ********** What about finding shortest paths in *weighted* graphs? "Single source shortest paths" problem: find shortest path from some start vertex to all other vertices in the graph. This will look like a tree -- called the shortest-path tree. 40 10 f<------->d<-------->e A A A 10 | -20 | | | | 15 | V | | c<--------b<---------a 5 20 Start First: An O(n*m) time algorithm based on Dynamic Programming. Called BELLMAN-FORD alg. Works even if have negative-weight edges. Can extend to report if graph has a cycle of net negative weight. - Why are these problematic??? Then will discuss faster alg, but requires nonnegative costs. Alg idea: find shortest path that uses 1 or fewer edges (if it exists) use to find shortest path that uses 2 or fewer edges (if it exists). use to find shortest path that uses 3 or fewer edges (if it exists). ... [how far do we need to go?] use to find shortest path that uses n-1 or fewer edges. General idea: start is always 0. For others, to get to v using i or fewer edges, need to get to an in-neighbor of v using i-1 or fewer edges. So, min cost way to get to v using i edges or less: cost(v,i) = MIN ( cost(x,i-1) + weight(x->v edge) ) edges x->v inedge(v,i) = edge x->v that minimizes above BELLMAN-FORD pseudocode: initialize c[v][0] = infinity for v != start, 0 for start. for i=1 to n-1: for each v not equal to start: c[v,i] = MIN(c[x,i-1] + w(x,v)), inedge[v] = x x->v ********** E.g., on the above graph node= a b c d e f i=0 0 inf inf inf inf inf 1 0 20(a) inf inf 15(a) inf 2 0 20(a) 25(b) 0(b) 15(a) inf 3 0 20(a) 25(b) 0(b) 10(d) 35(c) 4 0 20(a) 25(b) 0(b) 10(d) 35(c) 5 0 20(a) 25(b) 0(b) 10(d) 35(c) Shortest path tree: 10 f d--------->e A A 10 | | | -20 | | | c<--------b<---------a 5 20 Start ********** Inner MIN operation takes time proportional to in-degree of v. So, inner for-loop takes time O(sum of in-degrees of all nodes) = O(m). --> total time is O(m*n). Why do the set of shortest paths form a tree??? [optimal substructure property: if the shortest path from s to v goes through node x, then it uses the shortest path from s to x to get there.]