recitation 11/17 * priority-first search for shortest paths ------------------------------------------------------------------------- Go over an example of priority-first search, analysis of why it works, and why the running time is what it is. Also, perhaps some variations that may be easier to think about. Recall the defn: shortest path tree from START 1. spanning tree (has all vertices reachable from START). 2. for any vertex V, shortest path from START to V is path in tree. Here is a good example to do (starting at A) 5 4 3 A------D----E---H 7| \ 2/ | | 4\ / |1 C---B---G---F 2 4 3 Let's use the "army of ants" scenario (described in lecture) and make a formal connection between that and our pseudocode. The scenario: have an army of ants (or whatever) all starting at the START vertex (say, A), moving at unit speed away from it in all directions. Whenever a group of ants hits a vertex, they split apart going in all directions. Clearly, the first ant to hit a node has gone along the shortest path. Some things to notice about these ants: 1. If an ant gets to a vertex that's already been visited, it can quit since it can't possibly be the first to anywhere. (this will map to the "if V is visited..." line below) 2. In fact, there's no reason to even start down a path to a vertex if you know it's been visited before. (this will map to the "if X is visited..." line below) Say we wanted to implement this idea in a computer program. What would we need to keep track of? In terms of the vertices, we could keep track of: visited[X] = 1 if X has already been visited by the ants dist[X] = the time the ants first got there (the length of the shortest path to X) parent[X] = where those ants came from. Then we also need to keep track of the ants: for each group of ants currently active we want to know where they are (what edge are they on) and when are they going to get to the endpoint they're heading towards. Then, in each step of the algorithm, we just need to find out what the next event is ("event" == ants hitting a vertex) and update our data structures. E.g., at the start, we have the following (edge, time-to-reach-endpoint) pairs (A->C, 7) (A->B, 4) (A->D, 5) Then (pulling off the minimum) we see that at time 4 the ants reach node B. (So, visited[B] = 1, parent[B] = A, dist[B] = 4). These ants head towards C, D, and G (but not A by rule #2 above) and we can figure out their time-of-arrival by adding the edge lengths to the current time. I.e., our active set now looks like (A->C, 7) (A->D, 5) (B->C, 6) (B->D, 6) (B->G, 8) Then (pulling off the minimum we se that at time 5 the ants reach node D. (So, visited[D] = 1, parent[D] = A, dist[D] = 5). These ants head towards E (but not B and A by rule #2 above). Then, at time 6, the ants from B reach node C (So, visited[C] = 1, parent[C] = B, dist[C] = 6). Also, at time 6, the ants from B reach D. But, since D is already visited, we don't need to do anything (by rule #1 above). So, our active set now looks like: (D->E, 9) (A->C, 7) (B->G, 8) and so on. This algorithm is slightly different from the PFS algorithm given in lecture in that: (A) we're storing (edge, time) pairs instead of (vertex, time) pairs (B) we set parent[X] and dist[X] only when X enters "visited" as opposed to setting them when X enters "fringe" and then possibly having to update them when a better route comes in. I.e., the code from lecture is more like saying: if the ants at vertex V are considering heading towards vertex X, they first see if there are any other ants heading towards X, and if so, they only go ahead if they can see that they're going to get there first. pseudocode: init: insert (start, start, 0) into Pqueue While queue not empty: (U, V, priority) = pqueue.removeMin(); if V already visited, continue; // corresponds to (1) above mark V as visited. set parent[V] = U. set dist[V] = priority for each unvisted neighbor X of V: insert (V, X, dist[V] + length(VX edge)) into pqueue. In the code from lecture, we would have a simpler pqueue (each entry just stores one vertex and its priority) and we would set dist[X] and parent[X] when X enters the pqueue (and later update these if we find a better path). How long does the above algorithm take? Answer: there are at most |E| "inserts" (because once you send ants down an edge, you will never send them down that edge again) and so also at most |E| removeMins. So, the time is O(|E|*(time for insert) + |E|*(time for removeMin)) Max number of items in pqueue at any point in time is |E|. So, if we can get the inserts and removeMins to O(log n) time we will have time = O(|E| * log(|E|)) We can do this by using a min-heap. To removemin we take the root off (that's the minimum) and then re-make the heap by replacing it with the rightmost leaf and doing a "downheap"/"siftUp" procedure. To do an insert, we put the new value as the rightmost leaf and do swaps up the tree. Let's do an example. In the above algorithm on the graph above, we do the following sequence of operations: operations: heap (as array): insert(A->C, 7) 7 insert(A->B, 4) 7 4 --> 4 7 insert(A->D, 5) 4 7 5 removeMin() 5 7 insert(B->C, 6) 5 7 6 insert(B->G, 8) 5 7 6 8 insert(B->D, 6) 5 7 6 8 6 --> 5 6 6 8 7 (we wouldn't have done the above line using the method from lecture) removeMin() 7 6 6 8 --> 6 7 6 8 insert(D->E, 9) 6 7 6 8 9 removeMin() 9 7 6 8 --> 6 7 9 8 removeMin() 8 7 9 --> 7 8 9 removeMin() 9 8 --> 8 9 etc.