15-451 Algorithms: Lecture 3/14/00 Topics: I. Max Flow (cont) II. Bipartite Matching III. Min-Cost Flow ********** I. Max Flow (cont) Flow network: Directed graph with distinguished source s and sink t. Every node is reachable from s, and t is reachable from every node. Each edge has a nonnegative capacity. Max Flow problem: Send as much flow as possible from s to t subject to the following flow constraints: - Capacity constraint: on any edge, f(u,v) <= c(u,v) - Skew-symmetry: on any edge, f(u,v) = - f(v,u) - Flow conservation: for all nodes u except s and t, sum_v f(u,v) = 0 Residual network: after push a flow f along an s-t path. - has same nodes, but the edge capacities are adjusted: c_f(u,v) = c(u,v) - f(u,v) Augmenting path: s-t path of all positive residual capacity edges. Ford-Fulkerson method: 1. Initialize flow f to 0 2. While there exists an augmenting path p: a. push maximum possible flow along p b. compute residual capacities ********** Claimed last time that running time of Ford-Fulkerson method depends on which augmenting path we choose. E.g., 1000000 1000000 s-------->a-------->t | | A | | 1 | | V | +-------->b---------+ 1000000 1000000 If select s->a->b->t for augmenting path, residual network is: 1 +------+ | \ V 999999 \ 1000000 s-------->a-------->t | A A\ | | 1 | | | | | | +-------->b---------+ | 1 1000000 A 999999 | | | +-----------+ If now select s->b->a->t for augmenting path, residual network is: 1 1 +------+ +------+ | \ | \ V 999999 \V 999999 \ s-------->a-------->t A\ | A\ | \ | 1 | | | \ V | | 1 | +---->b---------+ | 1 | 999999 /A 999999 | | / | | +------+ +-----------+ Can repeated alternate s->a->b->t and s->b->a->t 1000000 times each. Running time is Theta(|E| * |flow|). On the other hand, if select first s->a->t then s->b->t, then two iterations suffice. ********** Edmonds-Karp Algorithm Idea: Use an augmenting path with the fewest number of edges. (Would solve example network above.) How can we find an augmenting path with the fewest number of edges??? [Use breadth-first search] Theorem: At most O(|V| * |E|) iterations, for a total of O(|V| * |E|^2) worst case time. Note: Not obvious that this is the right approach: E.g., s---------->a-->c-->d-->t | | A All other edges | | 1 | have capacity 1000000 | V | +-->e-->f-->b-----------+ Proof (highlight): - d(v) = fewest number of edges in an s->v path in the current residual network. - Claim 1: d(v) never decreases. (Assume true. Proof in the book.) - Each iteration we saturate at least one edge on the augmenting path. Thus >= 1 edge that dropped out of the residual network this iteration. - It's possible that later this edge will come back (i.e., if we later push flow in the other direction on it) - But we'll show that any given edge can be removed at most |V|/2 times. - This gives us the desired bound. Why??? [There are |E| edges in the original graph, and each iteration removes or re-adds at least one.] To show: each edge u->v can be removed at most |V|/2 times. - We'll use fact that it's a shortest (i.e., fewest edges) path. - Suppose we remove u->v. That means our path went on that edge, so d_old(v) = 1 + d_old(u) since it was a shortest path. - If we later put the edge back, we must have gone v->u, so in the new residual network d_new(u) = 1 + d_new(v). - But, Claim 1 tells us that d_new(v) >= d_old(v) - So d_new(u) >= 2 + d_old(u). - Claim: d(u) < |V| or u has become unreachable from s. Why??? [Path with fewest edges can visit each node at most once.] - If becomes unreachable, then remains unreachable. Why??? [If u is unreachable, then any node that can reach u is also unreachable. Only way to make u reachable again from s is to push flow from u or some node that can reach u back to some node reachable from s. But we can't reach any of these nodes, so can't push flow to any of them.] - Thus can remove and add at most |V|/2 times. QED ********** Another natural heuristic: - Choose the path that you can push the most flow on. (The maximum bottleneck capacity path). - Takes at most O(|E| * lg(|flow|)) iterations. Other faster algorithms too. Book describes a couple of them. ********** II. Bipartite matching Say we are in charge of assigning dorm rooms to incoming freshmen. Let's say freshmen are allowed to scout out rooms and list as acceptable or unacceptable. E.g., small school with 4 freshmen and 4 available rooms: student room ------- ---- Bob A Dan B Sue C Pat D Edges: (Bob A), (Bob B), (Dan B), (Dan C), (Dan D), (Sue A), (Sue B), (Pat B), (Pat D) Edge if acceptable. - BIPARTITE GRAPH: a graph with two sides L and R, and all edges go between L and R. - A MATCHING is a set of edges with no endpoints in common. - What we want here is a PERFECT MATCHING - a matching that connects every point on the left hand side with some point on the right hand side. - What's a perfect matching in our example??? [There are exactly two: (Bob A), (Dan C) (Sue B) (Pat D) and (Bob B), (Dan C) (Sue A) (Pat D)] - Maximum Bipartite Matching: a matching with maximum possible number of edges. ********** Algorithm for finding a maximum bipartite matching: - Can anyone see how to convert to a flow problem??? [1. Orient original edges from L to R, with capacity 1. (Anything >=1 suffices) 2. Add a dummy source s with capacity 1 edges to all in L. Add a dummy sink t with capacity 1 edges from all in R. 3. Find a max s-t flow. 4. Output the matching corresponding to the flow.] E.g., On above graph, can select augmenting paths: s -> Bob -> A -> t s -> Dan -> B -> t s -> Sue -> B -> Dan -> C -> t s -> Pat -> D -> t each augmenting path increases the size of the matching by 1 (even if it ends up "undoing" some of our previous choices). Claim 1: integer flow gives a matching of same value. Why??? [Edges out of s and into t all have capacity 1.] Claim 2: If there *exists* a matching of size k then there exists a flow of k units too. Why??? [Just hook up edges in matching.] So, we get the maximum matching. We want to ensure integer flow: - If capacities are all integers, then the Ford-Fulkerson method produces an integer flow. (not hard to show; exercise in the book) What about the number of iterations of path-finding? - at most number of edges in the matching since each new augmenting path gives us 1 new edge. - O(|V| * |E|) worst case time. ********** Matchings come up as subroutines in many problems. E.g., cell-phone service: - Callers on LHS, cells X slots on RHS (each RHS node is a pair: (cell i, slot j)) where # slots is the number of connections a cell can maintain. - Edge if person is within the cell. - Person can be in several cells at the same time (which is what makes it not just trivial). ********** III. Min-Cost Max Flows - In above example, each student had a list of acceptable versus unacceptable rooms. - A natural generalization is to consider what about more general preferences. E.g., Bob prefers room B so it costs $0 to match him there, his second choice is room A so it costs us $1 to match him there, and he hates rooms C and D so it costs $infinity to match him there. And, so on with Dan, Sue and Pat. - Then we could ask for the MINIMUM-COST PERFECT MATCHING. This is a perfect matching that out of all perfect matchings has the least total cost. ********** Generalization to flows is called the MIN-COST MAX FLOW problem: - Each edge has a *cost* as well as a capacity. - Cost of a flow is the sum over all edges of the flow on that edge times the cost of the edge. - Can have negative costs (or benefits) on edges too. - These are more general than plain max flow so can model more things. - Turns out can solve by similar method as we were using for max flow, but where each time you find the least-cost path to the sink: the shortest path where view costs as edge lengths. (Assuming there are no negative-cost cycles.) - Tricky thing is that even if originally all costs were positive, in the residual network will get negative costs - if it costs $1 to push a unit of flow on edge e, then unpushing it gives you your $1 back - So, need to use the Bellman-Ford algorithm for finding shortest paths, since Dijkstra's is not guaranteed to work with negative cost edges.