15-750 Graduate Algorithms 3/6/01
- recap
- applications
- Edmonds-Karp 2
- speedups Reading: Chapters 17 & 18
===========================================================================
Last time: described max flow problem. Ford-Fulkerson algorithm, and
proof of maxflow-mincut theorem, also integral flow theorem. Ended
with Edmonds-Karp version 1 (always augment along path of maximum
capacity). # iterations at most m*log(F), where F is the max flow
value (assuming all capacities are integers). Each iteration involves
running a Prim-like algorithm.
Recap of Edmonds-Karp argument:
First, we can *find* the desired path by building a tree. Starting
from U = {s}, always put in the widest (highest capacity) edge leaving
the set U. Produces a tree. The path from s to t is the one we want.
Claim: this is guaranteed to find a path of capacity at least a 1/m
fraction of the max flow f in the current residual graph. Proof: if we
ever got to a point where all edges leaving U had capacity < f/m, then
the (U,V-U) cut would have capacity < (f/m)*m = f, contradicting
flow <= cut. So, each augmentation reduces the remaining flow to at
most a (1-1/m) fraction of its previous value, which gives the
m*log(F) bound on the number of iterations.
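The widest-path search above can be sketched as a Prim/Dijkstra-style
tree-growing procedure. A minimal sketch in Python, assuming the graph is
given as a dict mapping each node to a dict of out-neighbor capacities
(the names and representation are illustrative, not from the notes):

```python
import heapq

def widest_path(cap, s, t):
    """Find an s->t path maximizing the bottleneck (minimum-capacity
    edge), Prim/Dijkstra-style: repeatedly take the widest edge leaving
    the grown tree. cap: dict mapping u -> {v: capacity}."""
    best = {s: float('inf')}   # best[v] = widest bottleneck to v so far
    parent = {s: None}
    heap = [(-float('inf'), s)]   # max-heap via negated widths
    visited = set()
    while heap:
        w, u = heapq.heappop(heap)
        w = -w
        if u in visited:
            continue
        visited.add(u)
        if u == t:
            break
        for v, c in cap.get(u, {}).items():
            width = min(w, c)   # bottleneck if we extend the tree via u->v
            if c > 0 and width > best.get(v, 0):
                best[v] = width
                parent[v] = u
                heapq.heappush(heap, (-width, v))
    if t not in parent:
        return None, 0
    path = []                   # reconstruct s->t path from parent links
    v = t
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1], best[t]
```

By the claim above, the returned bottleneck is at least a 1/m fraction of
the remaining max flow.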
AN INTERESTING APPLICATION:
Application: 3-D image processing. Given a pixel image, where each
pixel is 0 or 1 (indicating foreground or background), want to clean
it up using the rule that neighboring pixels are usually supposed to
have the same value. [In the actual application there are usually many
depth levels, but we will just consider 2 levels]. Say that
given image I, the "cost" or "energy" of some modification I' is
c*(#locations in which I' differs from I) + (number of neighboring
pixels in I' that have different values). Think of as: costs $c for
flipping a bit, and costs $1 for each pair of neighbors that are
different. Want the image of lowest energy (cost). Usually solved by
simulated annealing but this is *slow*. Faster way is to turn into a
min cut problem. Here's how we do it: set up 0-source and 1-sink.
Connect the source to all 0s, and the sink to all 1s, by edges of capacity c.
Connect neighboring pixels by edge of capacity 1. Claim: any solution
corresponds to a cut, with energy = value of cut. So, we just want
the min S-T cut. The problem with multiple depths corresponds to the
"multiway cut" problem: Given a collection of 'terminal' nodes S1, S2,
..., Sk, find the partition into regions U1, U2, ..., Uk with Si in
Ui, of least total cost. Unfortunately, that problem is NP-hard.
But, it turns out that even for this problem, you can use the min cut
problem to give you a heuristic that produces good solutions in
practice.
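The 2-level construction (and the claim that energy = cut value) can be
checked directly on a tiny image. A sketch, assuming a 4-neighbor grid
and the convention that pixels labeled 0 in the modified image sit on the
source side of the cut (the function names are illustrative):

```python
def energy(I, J, c):
    """Energy of modified image J relative to original I (2-D lists of
    0/1): c per flipped pixel plus 1 per disagreeing 4-neighbor pair."""
    rows, cols = len(I), len(I[0])
    e = c * sum(I[r][k] != J[r][k] for r in range(rows) for k in range(cols))
    for r in range(rows):
        for k in range(cols):
            if r + 1 < rows and J[r][k] != J[r + 1][k]:
                e += 1
            if k + 1 < cols and J[r][k] != J[r][k + 1]:
                e += 1
    return e

def cut_value(I, J, c):
    """Value of the s-t cut corresponding to labeling J: source->pixel
    edges of capacity c for original 0-pixels, pixel->sink edges of
    capacity c for original 1-pixels, capacity-1 edges between
    4-neighbors. Source side = pixels labeled 0 in J."""
    rows, cols = len(I), len(I[0])
    val = 0
    for r in range(rows):
        for k in range(cols):
            # source edge cut iff an original 0-pixel ends up on the
            # sink side (label 1); symmetrically for sink edges
            if I[r][k] == 0 and J[r][k] == 1:
                val += c
            if I[r][k] == 1 and J[r][k] == 0:
                val += c
            # neighbor edge cut iff the two endpoints get different labels
            if r + 1 < rows and J[r][k] != J[r + 1][k]:
                val += 1
            if k + 1 < cols and J[r][k] != J[r][k + 1]:
                val += 1
    return val
```

Enumerating all labelings of a small image confirms energy(I,J,c) ==
cut_value(I,J,c) for every J, so minimizing energy is exactly finding the
min S-T cut.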
EDMONDS-KARP #2
---------------
Edmonds-Karp #2: always pick the *shortest* path.
[Do example on standard cut-diamond graph, but where add extra nodes
to bottom-right and top-left edges to lengthen them.]
Claim: this makes at most O(mn) iterations. So, run time is O(nm^2)
since we use BFS in each iteration.
Will give two proofs (they both have the same 1st half).
Proof #1: Let d(v) be the shortest path distance from s to v in the
current residual graph. Also, for visualization, let's lay out the
graph in levels according to a BFS from s: so we have levels for
d=0, d=1, d=2, etc. Now, each time we perform an iteration, we may
add new edges into the residual graph, but the d(v)'s NEVER GO DOWN.
Why? [Because we only add edges that go backwards in the BFS levels].
Each time we find an augmenting path, we saturate at least
one edge on it. Let's say we saturate edge u->v, where u is at level i
and v is at level i+1. Claim: if we ever re-add edge u->v, then u must
be at level i+2 or more. Why? [Because we can only add this edge if
our shortest path used the edge v->u. Since the d()'s never go down,
v is at level at least i+1, so u is at level at least i+2.]
So, any given edge can be saturated (removed) at most n/2 times, since
u's level increases by at least 2 between consecutive removals and
levels never exceed n. Since each iteration removes at least one edge
from the residual graph, there can be at most mn/2 = O(mn)
iterations. QED
Proof #2: Another way to argue the second half is to just look at
distance from s to t. Start with BFS levels for original graph, and
don't recompute until d(t) changes. Notice that so long as d(t)
doesn't change, all paths will use forward edges. Each iteration
removes at least one forward edge, so we can do this at most m times
before we need to recompute. d(t) can increase at most n times, so
again we get at most mn iterations.
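Edmonds-Karp #2 itself is short to write down. A sketch in Python, using
BFS to find a shortest augmenting path each iteration (the dict-of-dicts
graph representation is an assumption of this sketch, and the input dict
is mutated into the residual graph):

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly augment along a shortest (fewest-edges)
    s->t path in the residual graph, found by BFS.
    cap: dict of dicts, cap[u][v] = capacity; mutated to residuals."""
    # add zero-capacity reverse edges so residuals have somewhere to go
    for u in list(cap):
        for v in list(cap[u]):
            cap.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        # bottleneck capacity along the path
        b = float('inf')
        v = t
        while parent[v] is not None:
            b = min(b, cap[parent[v]][v])
            v = parent[v]
        # augment: push b forward, add b of residual on reverse edges
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= b
            cap[v][u] += b
            v = u
        flow += b
```

Each BFS is O(m), and by the argument above there are O(mn) iterations,
giving the O(nm^2) bound.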
DINIC & MPM
===========
Can we do this a little faster? [This is chapter 18 in the book]
Previous alg: O(mn) iterations at O(m) each for O(m^2 n) total. Let's
try to reduce to O(n^3).
Idea: Let d = d(t). We will try to find lots of paths of length d all
at once. To do this, let's look at "level graph" which is what you
get by computing BFS levels and then deleting all edges that go
backward or have both endpoints in the same level (since we know we
won't need to use them in paths of length d). We now will try to push as
much flow as possible in this graph. At that point, we then recompute
residual and the new leveled graph and repeat.
Terminology: a flow that saturates this level graph is called a
"blocking flow".
So, our goal is to find a blocking flow in time O(n^2).
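The level graph construction can be sketched as follows (assuming
residual capacities in a dict-of-dicts; keeping only edges from level i
to level i+1 automatically discards backward and same-level edges):

```python
from collections import deque

def level_graph(cap, s):
    """Compute BFS levels from s in the residual graph, then keep only
    edges going from level i to level i+1.
    cap: dict of dicts of residual capacities (edges with capacity 0
    are treated as absent)."""
    level = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v, c in cap.get(u, {}).items():
            if c > 0 and v not in level:
                level[v] = level[u] + 1
                q.append(v)
    lg = {}
    for u, lv in level.items():
        for v, c in cap.get(u, {}).items():
            if c > 0 and level.get(v) == lv + 1:
                lg.setdefault(u, {})[v] = c
    return lg, level
```

On the diamond example, the cross edge a->b connects two nodes on the
same level, so it is dropped from the level graph.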
Define c(v) = capacity of vertex v to be: min[c_in(v), c_out(v)], where
c_in(v) is the sum of capacities of v's in-edges and c_out(v) is the
sum of capacities of its out-edges (for s use just c_out(s), and for t
just c_in(t)).
The Algorithm:
- Find vertex v of minimum capacity c. If it's zero, that's great -- we
can remove it and incident edges, updating capacities of neighboring
nodes.
- if c is not zero, then we greedily pull c units of flow from s to v,
and then push it along to t. This seems like just another version of
the original problem. But, there are two points here: (1) because v
had *minimum* capacity, we can do the pulling and pushing in any
greedy way and we won't get stuck. This observation means we can
saturate node v in O(m) time [we can do BFS, pushing the flow
level-by-level, so we never examine any edge twice], which gives us an
O(mn) algorithm for saturating the level graph (O(m) per node, n
nodes).
(2) To get the time bound we want, we need one more thing: the way we
do the pulling and pushing is to examine the edges one at a time, fully
saturating each edge before going on to the next one. This means we
can allocate our time into two parts: (a) time spent pushing through
edges that get saturated, and (b) time spent on edges that we didn't
quite saturate [at most one of these per node]. We only take time
O(n) on type-b operations. Type-a operations result in deleting the
edge, so over *all* vertices, the *total* time of these is just O(m).
So, our running time is dominated by the type (b) operations, which is
O(n) per vertex saturated, or O(n^2) total. Once we've saturated the
level graph, we recompute it (in O(m) time, which is absorbed into the
O(n^2)) and re-solve. Since d(t) increases with each phase, there are
at most n phases, for a total time of O(n^3).
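Putting the phases together: here is a compact sketch of the overall
algorithm, but using a simple DFS-with-edge-pointers blocking flow
(Dinic's original approach, O(nm) per phase and O(n^2 m) overall)
instead of the O(n^2)-per-phase MPM procedure above -- the phase
structure (BFS levels, blocking flow, repeat) is the same:

```python
from collections import deque

def dinic_max_flow(cap, s, t):
    """Phases of: (1) BFS to compute levels, (2) a blocking flow found
    by DFS with a per-node edge pointer so dead edges are never
    re-examined within a phase. cap: dict of dicts; mutated."""
    for u in list(cap):
        for v in list(cap[u]):
            cap.setdefault(v, {}).setdefault(u, 0)
    adj = {u: list(cap[u]) for u in cap}

    def bfs_levels():
        level = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if cap[u][v] > 0 and v not in level:
                    level[v] = level[u] + 1
                    q.append(v)
        return level if t in level else None

    def dfs(u, pushed, level, ptr):
        if u == t:
            return pushed
        while ptr[u] < len(adj[u]):
            v = adj[u][ptr[u]]
            # only follow level-graph edges (level i -> level i+1)
            if cap[u][v] > 0 and level.get(v) == level[u] + 1:
                got = dfs(v, min(pushed, cap[u][v]), level, ptr)
                if got:
                    cap[u][v] -= got
                    cap[v][u] += got
                    return got
            ptr[u] += 1   # edge is dead for this phase; skip it forever
        return 0

    flow = 0
    while True:
        level = bfs_levels()
        if level is None:
            return flow
        ptr = {u: 0 for u in adj}
        while True:
            pushed = dfs(s, float('inf'), level, ptr)
            if not pushed:
                break   # level graph saturated: blocking flow found
            flow += pushed
```

Swapping the inner loop for the MPM min-capacity-vertex procedure is
what brings the per-phase cost down to O(n^2), for O(n^3) total.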
?? Book talks about storing vertices in a fib-heap so we can do the
remove-min and decrease-key operations quickly. Do we really need
this? Why not just an array, since it's OK to spend O(n) time to do
remove-min, so long as decrease-key is O(1)?