15-451 Algorithms 10/05/00
* Dynamic programming
  - LCS  [similar examples in Sections 6.8 and 6.11]
- Matrix product parens
- knapsack [discussed in Section 5.10]
(will probably just get to 2 of these 3)
======================================================================
Dynamic Programming: technique used a lot in AI and other areas.
Often allows you to solve a problem in time O(n^2) or O(n^3) where a
naive approach would take exponential time. To get speed down below
that (if possible), probably need other ideas. Today: 2 or 3 examples.
Basic Idea: Break down problem into subproblems. Solve the smallest
subproblems first and then use those to solve the next-smallest, and
so on, until finally you get a solution to the whole thing. As long
as there are not too many subproblems, the running time won't be too bad.
Example #1: Longest Common Subsequence
2 strings: S of length n, T of length m. E.g.,
S = BACBAD
T = ABAZDC
LCS is longest sequence of characters that appear left-to-right (but
not necessarily in a contiguous block) in both strings.
E.g., here, LCS has length 4: ABAD.
For instance, use in genetics: given two DNA fragments, the LCS gives
info about what they have in common and the best way to line them up.
Often want to do with k fragments at a time.
Let's solve using DP. Subproblems: look at LCS of prefix of S and
prefix of T. For simplicity, we'll worry first about finding *length* of
longest and then modify to produce the sequence.
Algorithm is sort of like the mine-sweeper game.... We build an nxm
matrix with S across the top and T down the side, and fill in the
entries for "how long is the longest common subsequence using only the first i
characters of S and only the first j characters of T?"  (Implicitly,
there is also a row and column of zeros for the empty prefixes.)
* | B A C B A D
--+------------
A | 0 1 1 1 1 1
B | 1 1 1 2 2 2
A | 1 2 2 2 3 3
Z | 1 2 2 2 3 3
D | 1 2 2 2 3 4
C | 1 2 3 3 3 4
Rule: if (S[i]==T[j]) then M[i][j] = 1 + M[i-1][j-1]
else M[i][j] = max{ M[i-1][j], M[i][j-1] }
To find the sequence itself: walk backwards through the matrix, moving
to the larger of the entries to the left or above.  If both are < the
current entry, the characters match: move diagonally and output that
character.
(Or, could modify the alg to fill the matrix with sequences instead of numbers)
Total time: O(mn)
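A sketch of this bottom-up table fill in C (the function name and the
fixed 64-character bound are ours, not part of the notes):

```c
#include <assert.h>
#include <string.h>

/* Bottom-up LCS length, following the rule above.
   M has an extra row/column 0 of zeros for the empty-prefix base cases.
   Sketch only: assumes both strings have length < 64. */
int lcs_len(const char *S, const char *T)
{
    int n = (int)strlen(S), m = (int)strlen(T);
    int M[64][64] = {0};
    for (int i = 1; i <= m; i++)          /* i = prefix of T */
        for (int j = 1; j <= n; j++)      /* j = prefix of S */
            if (T[i-1] == S[j-1])         /* C strings are 0-indexed */
                M[i][j] = 1 + M[i-1][j-1];
            else
                M[i][j] = M[i-1][j] > M[i][j-1] ? M[i-1][j] : M[i][j-1];
    return M[m][n];
}
```

On the example above, lcs_len("BACBAD", "ABAZDC") fills in exactly the
matrix shown and returns 4.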
We've been looking at "bottom-up Dynamic Programming". Can also view
in other direction: take exponential-time recursive solution, and then
make poly-time by storing (memoizing) partial results in a table.
Sometimes this is called "top-down Dynamic Programming".
E.g., here is an exponential-time recursive program for LCS (arrays start at 1):
LCS(S,n,T,m)
{
if (n==0 || m==0) return 0;
if (S[n] == T[m]) return 1 + LCS(S,n-1,T,m-1); // no harm in matching up
else return max( LCS(S,n-1,T,m), LCS(S,n,T,m-1) );
}
Memoized version: start by initializing arr[i][j]=unknown for all i,j.
LCS(S,n,T,m)
{
if (n==0 || m==0) return 0;
if (arr[n][m] == unknown) {
if (S[n] == T[m]) arr[n][m] = 1 + LCS(S,n-1,T,m-1);
else arr[n][m] = max( LCS(S,n-1,T,m), LCS(S,n,T,m-1) );
}
return arr[n][m];
}
Running time is O(nm). Why?
-> {...} region is executed at most n*m times
=> at most 2nm recursive calls.
=> O(nm) running time.
Comparing bottom-up and top-down: top-down (memoized) pays a penalty
in recursion overhead, but can be faster for problems where a
reasonable fraction of the subproblems never get examined at all, so
we can avoid computing them.
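As a concrete sketch of the memoized pseudocode above in C (the
globals, names, and the 64-character bound are our additions):

```c
#include <assert.h>
#include <string.h>

/* Top-down (memoized) LCS.  arr[n][m] caches LCS of the first n chars
   of S and the first m chars of T; -1 marks "unknown".
   Sketch only: assumes both strings have length < 64. */
static const char *S, *T;
static int arr[64][64];

static int lcs(int n, int m)
{
    if (n == 0 || m == 0) return 0;
    if (arr[n][m] == -1) {
        if (S[n-1] == T[m-1])             /* C strings are 0-indexed */
            arr[n][m] = 1 + lcs(n-1, m-1);
        else {
            int a = lcs(n-1, m), b = lcs(n, m-1);
            arr[n][m] = a > b ? a : b;
        }
    }
    return arr[n][m];
}

int lcs_memo(const char *s, const char *t)
{
    S = s; T = t;
    memset(arr, -1, sizeof arr);          /* all-ones bytes == -1 ints */
    return lcs((int)strlen(s), (int)strlen(t));
}
```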
=============================================================================
Example #2: Matrix product parenthesization
Say we want to multiply 3 matrices X, Y, and Z.  We could do it like
this: (XY)Z, or like this: X(YZ).  The choice doesn't affect the final
outcome but *does* affect the running time to compute it.
E.g., say X is 100x20, Y is 20x100, and Z is 100x20. So, end result
will be a 100x20 matrix. If we multiply using usual algorithm, then
to multiply lxm matrix by an mxn matrix takes time O(lmn). So in this
case, which is better, doing (XY)Z or X(YZ)?
Answer: X(YZ) is better because computing (YZ) takes 20x100x20 steps
and produces a 20x20 matrix, then multiplying X by this takes another
100x20x20 steps (the same 40,000), so the total is just 80,000.  But,
doing it the other way, (XY) takes 100x20x100 steps, so already right
there you've spent more, and then multiplying this (a 100x100 matrix)
with Z takes another 100x100x20 steps.
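The arithmetic above can be checked directly (function names here are
ours, just for illustration):

```c
#include <assert.h>

/* Cost of multiplying an lxm matrix by an mxn matrix
   with the usual algorithm. */
long mult_cost(long l, long m, long n) { return l * m * n; }

/* X is 100x20, Y is 20x100, Z is 100x20. */
long x_yz(void)   /* X(YZ): YZ costs 20*100*20 and yields a 20x20 */
{ return mult_cost(20, 100, 20) + mult_cost(100, 20, 20); }

long xy_z(void)   /* (XY)Z: XY costs 100*20*100 and yields a 100x100 */
{ return mult_cost(100, 20, 100) + mult_cost(100, 100, 20); }
```

x_yz() comes to 80,000 steps versus 400,000 for xy_z(), a factor of 5.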
Question: suppose we need to multiply a series of matrices:
A_1 x A_2 x A_3 x ... x A_n.
What is the best way to parenthesize them?
There are an exponential number of different possible
parenthesizations: {2(n-1) choose n-1}/n (the Catalan number C_{n-1}).
So we don't want to search through all of them.  DP gives us a better way.
Define:
d_ij = dimensions of A_i x ... A_j. d_ii = dimensions of A_i.
time(d,d') = time to multiply matrix of dimensions d with matrix of
dimensions d'.
cost[i,j] = time to multiply A_i x ... x A_j using best possible
parenthesization (we're trying to figure out cost[1,n]).
(easy case: cost[i,i+1] = time(d_ii, d_{i+1,i+1}))
DP idea: Use cost[i,i+1] for all i to figure out cost[i,i+2] for all
i, then use this to figure out cost[i,i+3] for all i, and so on.
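This diagonal-by-diagonal fill can be sketched in C using the common
encoding where A_i is a p[i-1] x p[i] matrix (the p[] array and names
are ours; also, instead of initializing cost[i,i+1] separately, this
version sets cost[i][i] = 0 and lets the diff=1 case fall out of the
min, which is equivalent since time(d_ii, d_{i+1,i+1}) = p[i-1]*p[i]*p[i+1]):

```c
#include <assert.h>
#include <limits.h>

/* Minimum cost to multiply A_1 x ... x A_n, where A_i is p[i-1] x p[i].
   cost[i][j] follows the recurrence in the notes, filled in order of
   increasing diff = j - i.  Sketch only: assumes n < 64. */
long chain_cost(const long p[], int n)
{
    static long cost[64][64];
    for (int i = 1; i <= n; i++) cost[i][i] = 0;
    for (int diff = 1; diff <= n - 1; diff++)
        for (int i = 1; i <= n - diff; i++) {
            int j = i + diff;
            cost[i][j] = LONG_MAX;
            for (int k = i; k < j; k++) {     /* split point: i <= k < j */
                long c = cost[i][k] + cost[k+1][j] + p[i-1] * p[k] * p[j];
                if (c < cost[i][j]) cost[i][j] = c;
            }
        }
    return cost[1][n];
}
```

On the X, Y, Z example, p[] = {100, 20, 100, 20} and chain_cost(p, 3)
returns 80,000, matching the hand calculation above.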
Initialization: for i=1 to n-1: cost[i,i+1] = time(d_ii, d_{i+1,i+1})
for diff=2 to n-1:
for i=1 to n-diff:
j=i+diff
      cost[i,j] =  min  ( cost[i,k] + cost[k+1,j] + time(d_{i,k},d_{k+1,j}) )
                  i<=k<j
Running time: there are O(n^2) entries, and each takes O(n) time to
compute (the min over k), so the total time is O(n^3).
=============================================================================
Example #3: Knapsack [discussed in Section 5.10]
Given n items with sizes s_1,...,s_n and values v_1,...,v_n, and a
knapsack of capacity S, pick the subset of items of maximum total
value whose total size is at most S.
Exponential-time recursive solution: let Value(n,S) = best value
achievable using only the first n items with capacity S.
Value(n,S)
{
   if (S <= 0 || n == 0) return 0;
   if (s_n > S) return Value(n-1,S); // can't use nth item
   return max{v_n + Value(n-1, S-s_n), Value(n-1, S)};
}
Right now, the time is O(2^n).  But how can we speed it up?
Use a 2-d array to store results so we don't end up computing the same
thing over and over again.  Initialize arr[i][j] to "unknown" for all i,j.
Value(n,S)
{
   if (S <= 0 || n == 0) return 0;
if (arr[n][S] == unknown) {
if (s_n > S) arr[n][S] = Value(n-1,S);
else arr[n][S] = max{v_n + Value(n-1, S-s_n), Value(n-1, S)};
}
return arr[n][S];
}
Running time:
-> we reach {...} at most n*S times.
=> at most 2*n*S recursive calls.
=> total time is O(n*S)
How to get items? Work backwards: if arr[n][S] = arr[n-1][S] then we
*didn't* use the nth item so recursively work backwards from
arr[n-1][S]. Otherwise, we *did*, so output nth item and recursively
work backwards from arr[n-1][S-s_n].
Can also do bottom up.
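A bottom-up sketch in C (names and the fixed array bounds are ours;
arrays are 1-indexed to match the notes, so s[0]/v[0] are unused):

```c
#include <assert.h>

/* Bottom-up 0/1 knapsack.  arr[i][j] = best value using items 1..i
   with capacity j.  Sketch only: assumes n < 64 and S < 1024. */
int knapsack(int n, int S, const int s[], const int v[])
{
    static int arr[64][1024];
    for (int j = 0; j <= S; j++) arr[0][j] = 0;   /* no items: value 0 */
    for (int i = 1; i <= n; i++)
        for (int j = 0; j <= S; j++) {
            arr[i][j] = arr[i-1][j];              /* skip item i */
            if (s[i] <= j && v[i] + arr[i-1][j - s[i]] > arr[i][j])
                arr[i][j] = v[i] + arr[i-1][j - s[i]];  /* take item i */
        }
    return arr[n][S];
}
```

The same backwards walk described above recovers the items: if
arr[i][j] == arr[i-1][j], item i was skipped; otherwise it was taken.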