15-451 Algorithms		10/12/04
* Topological sorting
* Shortest-paths
* Maximum bottleneck path			Note: Midterm in 1 week.
* Bellman-Ford          			      1 sheet of notes
                                                      no book
=========================================================================

Topological sorting and DFS, cont.
=================================

      B----->A--->E
      |\     ^
      | \    |
      |  \   |
      |   | /  
      V   V/         
      C<--D---->F

Recall the Topological sorting problem: Given a DAG, we want an
ordering of the vertices such that all edges go forward in the ordering.

Say there are n vertices and m edges.  Claim is can use Depth-first
search in a tricky way to solve this in O(n+m) time.  (linear time in
the size of the problem).  Graph stored as adjacency list.

DFS(v):
  mark v as ``in progress''.
  for each unmarked out-neighbor w of v: do DFS(w).
  mark v as done. 

DFS_main(G):
 start with all nodes unmarked.
 For v=1 to n: if v is not yet marked, do DFS(v).

Total time is O(n+m) because each directed edge is examined just once,
plus constant time per node for marking. 


How to solve Topological Sorting:
1. Do depth-first search of G, and output the nodes as you *exit* them
   (when they are marked as done).
2. reverse the order.

E.g., here, we get: B D F C A E

Claim: If there is an edge from u to v, then v is exited first.  (This
implies that the edge will be pointing forward when we reverse the order.)

Proof of claim: Easy to see if our DFS started at u before starting at
v (e.g., u=B,v=D).  What if we started at v first (like u=D,v=A)?  In
this case, we would finish v before even starting u since there can't
be a path from v to u because that would create a cycle.
=====================================================================

Shortest paths
==============
Suppose we are given a graph G and a start node S, and we want to find
the shortest path from S to everywhere in G. (called the "shortest
path tree" from S).

In 15-211, you saw Dijkstra's algorithm for this problem.  Idea is to
build up shortest-path tree in greedy way.   Today: review Dijkstra,
see how modifications can be used to solve some related problems
(maximum bottleneck path), and then look at algorithms for some
generalizations.  

E.g.,
                 5     8
              A-----B-----C
              |     |     |
             1|    1|     |3     start = A.
              |  2  |  4  |
              D-----E-----F

DIJKSTRA's ALG
--------------
initialize:
 - tree = {start}.  No edges.  label start as having distance 0.

invariant: nodes in tree have correct distance to start.

repeat:
 - for each nbr x of tree compute an (over)-estimate of distance to start:
           distance(x) =    min   [  distance(v) + length(e)  ]
                       edges e=(v,x)
                         v in tree

   In other words, by our invariant, this is the length of the
   shortest path to x whose only edge not in the tree is the very last
   edge.

 - put x of minimum distance into tree, connecting it via the argmin
        (the edge e used to get distance(x))

[do above example]

Claim: even if some of distances are wrong, the *minimum* one is correct.
          
Proof: say x is neighbor of tree of smallest distance(x).  What does
the *true* shortest path to x look like?  The key point is that the
last edge this path takes must come directly from the tree, since
otherwise the path would have length at least distance(y) from some
neighbor of tree y!=x, and by definition, distance(x) is smaller.

Running time: if you use a heap and code this up right, you can get
time of O(m log n).  Start with all nodes on heap, everyone having
distance of infinity except start having distance of 0.  Repeatedly pull
off the minimum, then update its neighbors, tentatively assigning
parents any time the distance of some node is lowered.  Takes linear
time to initialize, but then we do m updates, at O(log n) cost each. 
If you use something called a "fibonacci heap" (that we're not going
to talk about) can get time down to O(m + n log n)

Maximum-bottleneck path
=======================
Here is another problem you can solve with this type of algorithm, called
the "maximum bottleneck path" problem.  Imagine the edge weights
represent capacities of the edges ("widths" rather than "lengths") and
you want the path between two nodes whose minimum width is largest.  How
could you modify Dijkstra to solve this??

Now, we'd update:
           bottleneck(x) =    max   [min(bottleneck(v),width(e))]
                         edges e=(v,x)
                           v in tree

 - put x of maximum bottleneck into tree, connecting it via the argmax

We'll actually use this later in the course...

===========================================================================

Bellman-Ford alg for shortest paths.

(Note: Bellman is the person credited for inventing Dynamic Programming).

What if we want to compute shortest paths, but some of the edge costs
are negative?  (For this to make sense, we need to assume the graph is
*directed*, and there are no negative-cost cycles.  Otherwise you can
have a distance of negative-infinity by going around such a cycle
infinitely-many times...)

Will Dijkstra's algorithm still work?  NO.  Can anyone give an example??

It turns out these kinds of problems do come up --- in fact, we will
need to solve for shortest paths when some edge costs are negative as
a subroutine in some later algorithms in this course.  So, here is a
different algorithm that works in that case.

The BELLMAN-FORD Algorithm
==========================
                               30                    15
Say we have a graph     5-------------------3-----------------4
like this:              |                   ^                 |
                     10 |                -40|              40 |
                        |                   |                 |
                        |                   |                 |
                        2-------------------1-----------------0-- 
                               30                    60      start

Will use DP to solve the shortest-path problem.

Alg idea: find shortest path that uses 1 or fewer edges (if it exists)
   use to find shortest path that uses 2 or fewer edges (if it exists).
   use to find shortest path that uses 3 or fewer edges (if it exists).
        ... [how far do we need to go?]
   use to find shortest path that uses n-1 or fewer edges.

BELLMAN-FORD pseudocode:  as usual for DP, let's just compute
*distances* first, and then can reconstruct the paths.

d[v][i] will mean the length of the shortest path to v that uses i or
fewer edges (if it exists).

initialize d[v][0] = infinity for v != start, 0 for start.
for i=1 to n-1:
  for each v not equal to start:
    d[v][i] = MIN(d[x][i-1] + length(xv edge) ))
              x->v

(try it on the above graph!)

Inner MIN operation takes time proportional to in-degree of v.  So,
inner for-loop takes time O(sum of in-degrees of all nodes) = O(m).
--> total time is O(mn).

How to recosntruct the paths?  Work backwards: if you're at vertex v
at distance d[v], move to node x such that d[v] = d[x] + len(xv).


===========================================================================


[Do this in rec?]

ALL-PAIRS SHORTEST PATHS via MATRIX PRODUCTS

Say we want to compute length of shortest path between *every* pair of
vertices.  If all edges are non-negative, we can run Dijkstra'a
algorithm n times, one for each start (remember, it finds the shortest
path tree from the start), for a total time of O(nm log n).  If we
have negative-cost edges, we can use Bellman-Ford, for a total time of
O(mn^2).  Here is a nice algorithm that uses the matrix representation
of a graph, and runs in time O(n^3 log n).

Given graph G, define matrix A(G) as follows:
	- A[i,i] = 0 for all i.
        - if there is an edge from i to j, then A[i,j] = length of that edge.
	- otherwise, A[i,j] = infinity.

I.e., A[i,j] = length of shortest path from i to j using 1 or fewer edges.

Now, following the basic Dynamic Programming idea, can we use this to
produce a new matrix B where B[i,j] = length of the shortest path from
i to j using 2 or fewer edges? 

Answer: yes.  B[i,j] = min_k (A[i,k] + A[k,j])
[think about why this is true]

I.e., what we want to do is compute a matrix product B = A*A except we
change "*" to "+" and we change "+" to "min". 

In other words, instead of computing the sum of products, we compute
the min of sums.  What if we now want to get shortest paths using 4 or
fewer edges?  Just need to compute C = B*B.  I.e., to get from i to j
using 4 or fewer edges, we need to go from i to some k using 2 or
fewer edges, and then from k to j using 2 or fewer edges.

Just need to keep squaring O(log n) times.

Running time: O(n^3 log(n)).  Need to do log(n) matrix multiplies to
get all paths of length n or less and standard matrix multiply takes n^3
operations.

==============================================================================
All-pairs shortest paths via Floyd-Warshall.

Here is an algorithm that shaves off the O(log n) and runs in time
O(n^3).  The idea is that instead of increasing the number of edges in
the path, we'll increase the set of vertices we allow as intermediate
nodes in the path.  In other words, starting from the same base case
(the shortest path that uses no intermediate nodes), we'll then go on
to the shortest path that's allowed to use node 1 as an intermediate
node.  Then the shortest path that's allowed to use {1,2} as
intermediate nodes, and so on.

/*  after each iteration of the outside loop, A[i][j] = len of i->j 
    path that's allowed to use vertices in the set 1..k  */

   for k = 1 to n do:
     for each i,j do:
       A[i][j] = min( A[i][j], (A[i][k] + A[k][j]);

i.e., you either go through k or you don't.  Total time is O(n^3).
===================================================================