15-451 Algorithms 10/10/00
* Graph Algorithms I Midterm next week.
- coloring&BFS (1 sheet of notes)
- topological sorting & DFS
- Bellman-Ford for shortest paths
- Dijkstra for shortest paths
=======================================================================
A lot of algorithmic problems can be modeled as problems on graphs.
Today we'll talk about a few basic ones, and we'll continue talking
about graph algorithms for a good portion of the rest of the course.
Basic terminology: A graph is a set of NODES, or VERTICES, with edges
between some of the nodes. E.g.,
1 4
|\ |
| \ |
2----3
V = set of vertices, E = set of edges.
If there's an edge between two vertices, we call them NEIGHBORS.
Degree of a vertex is the number of neighbors it has.
standard notation: n = |V|, m = |E|.
If we don't allow self-loops, what is max number of total edges? {n \choose 2}
In a directed graph, each edge has a direction. Can have up to n(n-1)
edges now. For each node, can talk about out-neighbors (and
out-degree) and in-neighbors (and in-degree).
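In code, these algorithms usually store the graph as an adjacency list: a map from each node to its list of (out-)neighbors. A minimal Python sketch (the edge list here is a made-up example, not one of the graphs in these notes):

```python
from collections import defaultdict

# made-up directed edge list: edges point from u to v
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]

adj = defaultdict(list)
for u, v in edges:
    adj[u].append(v)            # adj[u] = out-neighbors of u

# out-degree of u is just the length of its neighbor list
out_degree = {u: len(nbrs) for u, nbrs in adj.items()}

# in-degree needs a separate pass over the edges
in_degree = defaultdict(int)
for u, v in edges:
    in_degree[v] += 1

print(adj[1])          # → [2, 3]
print(out_degree[1])   # → 2
print(in_degree[3])    # → 2
```

For an undirected graph, we'd append each edge in both directions.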
PROBLEM 1: GRAPH COLORING.
Given a graph, want to give each node a color such that no
two neighbors have the same color. Goal is to do this with the fewest
total number of different colors (easy to do with n colors).
E.g., pentagon.
Think of problem of scheduling exams. Each node is a class, and there
is an edge between two if lots of students are taking both classes.
Goal is to use fewest number of different time slots such that no two
adjacent nodes are given the same time slot. In compilers, graph
coloring comes up when assigning variables to registers.
Unfortunately, graph-coloring problem is NP-hard. In fact, even
telling if possible to color with <= 3 colors is NP-hard in general.
But, 2-coloring problem (color with 2 colors if possible else say
"impossible") is easy. How to do it?
2-coloring: given graph G, find a 2-coloring if one exists, else
output "not 2-colorable".
One Answer: use breadth-first search. Pick some start node and color
it red. Color all neighbors blue. Color their neighbors red, etc.
(If some nodes weren't reached, they're in a separate component, so we
can handle each component separately.)
Claim: if this coloring isn't legal, then the graph wasn't
2-colorable. Why?
-> If the coloring is illegal
==> there's an edge between two nodes at the same BFS distance from the start
==> following tree paths from those two nodes back to a common ancestor
    gives an odd-length cycle in the graph
==> the graph isn't 2-colorable (colors must alternate around a cycle,
    which is impossible if the cycle has odd length).
Total time = time for BFS = O(m + n)
(We could have used depth-first-search too)
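The BFS 2-coloring above can be sketched in Python as follows. (Nodes are numbered 0..n-1 and colors are 0/1 instead of red/blue; the two small test graphs at the end are made-up examples.)

```python
from collections import deque

def two_color(n, adj):
    """Try to 2-color nodes 0..n-1; adj[v] = list of neighbors.
    Returns a list of colors (0/1), or None if not 2-colorable."""
    color = [None] * n
    for start in range(n):             # handle each component separately
        if color[start] is not None:
            continue
        color[start] = 0
        q = deque([start])
        while q:                       # standard BFS
            v = q.popleft()
            for w in adj[v]:
                if color[w] is None:
                    color[w] = 1 - color[v]   # neighbors get the other color
                    q.append(w)
                elif color[w] == color[v]:
                    return None        # same-color neighbors: odd cycle
    return color

# a 4-cycle is 2-colorable; a triangle is not
print(two_color(4, [[1, 3], [0, 2], [1, 3], [0, 2]]))  # → [0, 1, 0, 1]
print(two_color(3, [[1, 2], [0, 2], [0, 1]]))          # → None
```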
Here's a neat problem if you like to doodle. Draw a box (or circle,
it doesn't matter). Then slice it with straight lines. Now, color
the regions so no two adjacent (sharing an edge) have the same color.
How many colors do you need? Answer: 2. Can anyone see a proof? (So,
good for doodling in pencil).
========================================================================
PROBLEM 2: Topological sorting.
A "DAG" or directed-acyclic-graph is a directed graph without any
cycles. E.g.,
A----->D--->E
|\ ^
| \ |
| \ |
| | /
V V/
C<--B<----F
Given a DAG, the "topological sorting" problem is: find an ordering of
the vertices such that all edges go from low numbers to high numbers.
Typically comes up when given a set of tasks to do with precedence
constraints (need to do A and F before you can do B), and want to find
a legal ordering of performing the jobs.
How to solve it #1:
1. find a node v with no in-edges.
2. remove v and its outedges
3. goto 1.
This takes time O(n^2). How about a linear-time way? Here's a neat
algorithm based on depth-first-search.
DFS-main:
For v=1 to n: if v is not yet visited, do DFS(v).
DFS(v):
mark v as visited.
   for each out-neighbor w of v:
if w is unvisited, then DFS(w)
How to solve it #2:
1. Use depth-first search, and output the nodes as they are exited.
[output v after for-loop]
2. reverse the order.
Claim: This is a topological sorting.
Why? If there is an edge from u to v, then v has to finish
first. Easy to see if our DFS started at u before starting at v. What
if we started at v first (like u=D,v=B above)? In this case, we would
finish v before even starting u since there can't be a path from v to
u (else wouldn't be acyclic).
Note: topological sorting in linear time (unlike regular sorting)
Note: can use same alg to tell if graph has any cycles.
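Solution #2 can be sketched in Python like this (nodes are numbered 0..n-1; the 3-node DAG at the end is a made-up example):

```python
def topological_sort(n, adj):
    """Order nodes 0..n-1 so every edge goes left-to-right in the output.
    adj[v] = list of out-neighbors. Assumes the graph is a DAG."""
    visited = [False] * n
    order = []                    # nodes in order of DFS *exit*

    def dfs(v):
        visited[v] = True
        for w in adj[v]:
            if not visited[w]:
                dfs(w)
        order.append(v)           # output v after the for-loop

    for v in range(n):            # DFS-main
        if not visited[v]:
            dfs(v)
    order.reverse()               # step 2: reverse the exit order
    return order

# made-up DAG: 0->1, 0->2, 1->2
print(topological_sort(3, [[1, 2], [2], []]))  # → [0, 1, 2]
```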
------------
PROBLEM #3: SHORTEST PATHS IN WEIGHTED GRAPHS
Given a (possibly directed) graph with costs/lengths on edges, and one
node specified as a "goal" node, find the shortest path from every
node in the graph to the goal. Will give two algs: one that handles
edges of negative cost but is slower and uses Dynamic Programming
(Bellman-Ford), and one that is faster but requires all edges to have
nonnegative weight (Dijkstra's alg).
30 30
-- picture -- 5-------------------3-----------------4
| ^ |
10 | -20| 15 |
| ^ |
| | |
2-------------------1-----------------0--
30 50 Goal
BELLMAN-FORD algorithm. Bellman is the guy who invented DP. Works even
if we have negative-cost edges (but these should be directed -- otherwise
the problem doesn't make sense, since we could go back and forth racking
up bonus points forever. Similarly, assume no negative-cost cycles).
Alg idea: find the shortest path that uses 1 or fewer edges (if it exists);
use it to find the shortest path that uses 2 or fewer edges (if it exists);
use that to find the shortest path that uses 3 or fewer edges (if it exists);
... [how far do we need to go? with no negative-cost cycles, a shortest
    path never repeats a vertex, so |V|-1 edges suffice]
use it to find the shortest path that uses |V|-1 or fewer edges.
BELLMAN-FORD pseudocode: as usual for DP, let's just compute
*distances* first, and then can reconstruct the paths.
initialize d[v][0] = infinity for v != goal, 0 for goal.
for i=1 to |V|-1:
for each v not equal to goal:
      d[v][i] = MIN over edges v->x of ( d[x][i-1] + len(v,x) )
Inner MIN operation takes time proportional to out-degree of v. So,
inner for-loop takes time O(sum of out-degrees of all nodes) = O(E).
--> total time is O(|E|*|V|).
How to reconstruct the paths? If you're at vertex v at distance d[v],
move to a neighbor x such that d[v] = d[x] + len(v,x).
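The pseudocode above can be sketched in Python. One standard simplification (an assumption of this sketch, not part of the pseudocode above): since column i only depends on column i-1, we can collapse the d[v][i] table into a single array d[v] and relax in place. The 4-node test graph at the end is made up:

```python
INF = float('inf')

def bellman_ford_to_goal(n, edges, goal):
    """edges: list of (v, x, length) for a directed edge v->x.
    Returns d[v] = length of a shortest path from v to the goal.
    Assumes no negative-cost cycles."""
    d = [INF] * n
    d[goal] = 0
    for _ in range(n - 1):               # paths of <= n-1 edges suffice
        for v, x, length in edges:
            if d[x] + length < d[v]:     # shorter path: step to x first
                d[v] = d[x] + length
    return d

# made-up example with one negative edge, goal = node 0
edges = [(1, 0, 4), (2, 1, 3), (2, 0, 10), (3, 2, -2)]
print(bellman_ford_to_goal(4, edges, 0))  # → [0, 4, 7, 5]
```

Note node 2 goes through node 1 (cost 3+4=7) rather than directly to the goal (cost 10).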
===============================================================================
Faster alg: Dijkstra's algorithm. Idea is to build up shortest-path
tree in greedy way. Only guaranteed to work if no negative-weight edges.
E.g.,
5 10
A-----B-----C
| | |
1| 1| |5 goal = A.
| 2 | 4 |
D-----E-----F
DIJKSTRA's ALG
--------------
initialize:
- tree = {goal}. No edges. goal.dist = 0.
invariant: nodes in tree have correct distance to goal.
repeat:
- for each node x that's a neighbor of tree (x is a FRINGE node) compute:
      x.dist = MIN over edges e=(x,v) with v in tree of [ v.dist + length(e) ]
- put x of minimum x.dist into tree, connecting it via the argmin
(the edge that caused x.dist to be what it is)
[do above example]
Claim: even if some of dists are wrong, the *minimum* one
is correct.
Proof: say x is the fringe node of smallest x.dist. What does the
*true* shortest path to x look like? The point is that the first node
this path hits once it exits the tree has to be x, since if it was
some other node y, then its length would have to be at *least* y.dist,
which is greater than x.dist.
Notice that this argument would break down if there was a
negative-cost way of getting from y to x. That's why Dijkstra's
algorithm doesn't allow negative-cost edges.
------
To implement this efficiently, want to store fringe in some kind of
data structure. Need to be able to insert pairs (node, dist) and
need to be able to modify the dist field (actually, only decrease it, which
simplifies things). Need to be able to remove the minimum. We can
use a heap to do this in which case each operation takes worst-case O(log V)
time. There are O(E) operations (each node x on the fringe gets
updated at most degree(x) times), so total is O(E log V).
Actually, there's a more complicated data structure called a Fibonacci
heap where remove-min is still O(log V) amortized, but insert and
decrease-value take only O(1) *amortized* time. This is nice since there
are only O(V) inserts and remove-mins, but potentially O(E)
decrease-values. So, total time is now O(E + V log V). This is better
if there are lots of edges.
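The binary-heap version can be sketched in Python with heapq. Since heapq has no decrease-value operation, this sketch uses the standard "lazy deletion" variant instead (an implementation choice of this sketch): push a fresh (dist, node) pair each time a distance improves, and skip stale entries when popped. The asymptotics are still O(E log V). The example encodes the A..F grid above as nodes 0..5:

```python
import heapq

def dijkstra_to_goal(n, adj, goal):
    """adj[v] = list of (neighbor, length) pairs; all lengths nonnegative.
    Returns dist[v] = shortest distance from v to the goal."""
    INF = float('inf')
    dist = [INF] * n
    dist[goal] = 0
    heap = [(0, goal)]                    # (distance, node) pairs
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                      # stale entry; skip it
        for w, length in adj[v]:
            if d + length < dist[w]:      # found a shorter route via v
                dist[w] = d + length
                heapq.heappush(heap, (dist[w], w))
    return dist

# the A..F example above, with A,B,C,D,E,F = 0..5 and goal = A
adj = [
    [(1, 5), (3, 1)],           # A: B(5), D(1)
    [(0, 5), (2, 10), (4, 1)],  # B: A(5), C(10), E(1)
    [(1, 10), (5, 5)],          # C: B(10), F(5)
    [(0, 1), (4, 2)],           # D: A(1), E(2)
    [(1, 1), (3, 2), (5, 4)],   # E: B(1), D(2), F(4)
    [(2, 5), (4, 4)],           # F: C(5), E(4)
]
print(dijkstra_to_goal(6, adj, 0))  # → [0, 4, 12, 1, 3, 7]
```

Note B's shortest route to A is the roundabout B-E-D-A of length 4, not the direct edge of length 5.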