Recitations 9/29/97 Today: o a dynamic programming example o a memoizing example Dynamic Programming: Shortest Paths ------------------------------------ Finding shortest paths in Pittsburgh. Say we wanted to do what Avrim talked about in first class: find shortest path from Skibo (imagine at intersection 0) to every other intersection in Pittsburgh. Paths from intersection to intersection like maze in Hwk 2, except different streets have different lengths. Graph/maze, but DISTANCES on edges. Also, might have 1-way streets (1-way edges). Here, there's a DP approach that works pretty well. Later talk about faster way using some more sophisticated data structures. 30 50 -- picture -- 5-------------------3-----------------4- Fifth ^ | | 15 | S. Dithridge 20 | S. Craig 15 | Moorewood ^ (one way) | | | | | 6-------------2-------------------1-----------------0-- Forbes 30 30 45 Skibo Say N nodes, E edges. DP approach: need to organize problem into sequence of subproblems of larger and larger sizes so can solve next larger based on previous solutions. For this problem, break it up in this way. Ask "what's the shortest path to point X that uses at most i edges?" (If none, write "infinity"). table: node #: 0 1 2 3 4 5 6 ------------------------------------- i = 0 | i = 1 | i = 2 | Idea: shortest to X that uses at most i edges is smallest of: (A) shortest to X that uses at most (i-1) edges, or (B) shortest to a neighbor V of X that uses at most (i-1) edges followed by the edge V -> X. Finally, shortest uses at most N-1 edges. WHY?? Alg: First, initialize all distances to 0 or infinity. Say have distance array, where distance[X][i] is shortest to X using i or fewer edges. Start: initialize distance[X][0] for each X. In C, represent infinity by some number larger than all distances. Then, pseudo-C: for(i=0; i < n; ++i) { /* number of edges */ for each node X { for each edge e from some V to X, distance[X][i] = min( distance[X][i-1], distance[V][i-1] + length[e]) } } Running time: Big-Oh? N*N*E, ok. -- if had to look at every edge here. But, turns out actually takes less time: If you have adgacency list, only need (#edges in to X) edges here, not total number of edges. So, if you look at these two loops together, total time is SUM_X (# edges in to X). Counting each edge twice: V -- X. Once at V, once at X. --> inner two loops only O(E). --> total time O(N*E). Memoizing and recursion ------------------------ Remember knapsack from class: have a bunch of tasks, each has a time to complete, and a value. You have some fixed amount of time to spend and you want to know the maximum value you can get. Idea of solution: say you have time t, and tasks 1,...,i. Then the best solution either DOESN'T use item i, in which case your value is the best you can do with time t and items 1,...,i-1, or else it DOES use item i, in which case your value is value[i] + (best you can do with time t - time[i] and items 1,...,i-1). Consider this Recursive version: say given time "t", items 1,...,n -> return max value achievable. int knapsack(int t, int n) { if (t <= 0 || n == 0) return 0; if (time[n] > t) return knapsack(t, n-1); // not enough time to do item n. return max(knapsack(t, n-1), knapsack(t - time[n], n-1) + value[n]); } (say you've defined a function (NOT A MACRO) "max") Why does this return the correct value? How long does this take? Lets consider case where time[i] is small and n is big. -> to solve on n, call twice on n-1. -> exponential in n. Now, lets use memoizing -->start with array varr[][] initialized to UNKNOWN = -1. int knapsack2(int t, int n) { if (t < 0) return 0; if (varr[t][n] != UNKNOWN) return varr[t][n]; if (t == 0 || n == 0) varr[t][n] = 0; else if (time[n] > t) varr[t][n] = knapsack2(t, n-1); else varr[t][n] = max(knapsack2(t, n-1), knapsack2(t - time[n], n-1) + value[n]); return varr[t][n]; } How much time does this take? A: O(n*t) Why: this is just like the dynamic programming solution, but top-down. (Maybe do an example. like the one from lecture.) Or, look at it this way: - only go past the (... != UNKNOWN) line once per pair . (t>0 and n>0) --> we only execute the "else" lines at most n*t times. --> at most 2*n*t + 1 calls to knapsack2, since except for the very first call (the "+ 1"), every call results from executing an "else" line, and executing one of those lines causes <= 2 calls to be made.