03-511/711, 15-495/856 Course Notes - Sept 2, 2010
Pairwise alignment continued
Alignment algorithms
The dynamic programs for sequence alignment compute a matrix a[i,j],
which gives the scores of the optimal alignments of all prefixes. These algorithms have four components:
- Initialization of the first row and column of a[i,j].
- A recurrence relation for a[i,j], i,j ≥ 1.
- Determination of the score of the optimal alignment from the matrix a[i,j] in O(mn) time.
- Trace back through the alignment matrix to obtain the optimal alignment in O(m+n) time.
The details of each of these steps are what differentiate global,
semi-global and local alignment.
Local Alignment
- Initialize the first row and column to zero: a[i,0] = 0 and a[0,j] = 0 for all i and j
- Recurrence
  a[i,j] = max { a[i-1,j] + g,
                 a[i-1,j-1] + p(s[i], t[j]),
                 a[i,j-1] + g,
                 0 }
- The score of the optimal alignment is max{a[i,j]}, where the maximum is taken over all i and all j.
- Trace back starting at a[i*,j*], the cell corresponding to the maximum score. End the trace back at the first cell with value zero.
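The four local alignment steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name local_align, the simple match/mismatch form of p, and the default scores (match = 1, mismatch = -1, gap = -2) are assumptions chosen for the example. When several optimal alignments exist, the sketch returns only one of them.

```python
def local_align(s, t, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_s, aligned_t) for one optimal local alignment.

    The scores are illustrative assumptions; p(x, y) below stands in for
    the similarity function p(s[i], t[j]) of the recurrence.
    """
    def p(x, y):
        return match if x == y else mismatch

    m, n = len(s), len(t)

    # Initialization: a[i][0] = 0 and a[0][j] = 0 for all i and j.
    a = [[0] * (n + 1) for _ in range(m + 1)]

    # Recurrence, tracking the maximum cell (i*, j*) while filling.
    best, bi, bj = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a[i][j] = max(
                a[i - 1][j] + gap,
                a[i - 1][j - 1] + p(s[i - 1], t[j - 1]),  # strings are 0-indexed
                a[i][j - 1] + gap,
                0,
            )
            if a[i][j] > best:
                best, bi, bj = a[i][j], i, j

    # Trace back from (i*, j*) until the first cell with value zero.
    i, j = bi, bj
    out_s, out_t = [], []
    while i > 0 and j > 0 and a[i][j] > 0:
        if a[i][j] == a[i - 1][j - 1] + p(s[i - 1], t[j - 1]):
            out_s.append(s[i - 1])
            out_t.append(t[j - 1])
            i, j = i - 1, j - 1
        elif a[i][j] == a[i - 1][j] + gap:
            out_s.append(s[i - 1])
            out_t.append("-")
            i -= 1
        else:
            out_s.append("-")
            out_t.append(t[j - 1])
            j -= 1

    return best, "".join(reversed(out_s)), "".join(reversed(out_t))
```

For example, local_align("TTACGTAA", "GGACGTCC") returns (4, "ACGT", "ACGT"): the shared substring is aligned and the dissimilar flanking regions are ignored. Two sequences with no matching characters, such as "AAAA" and "CCCC", give a score of 0 and an empty alignment.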
Note that:
- There can be more than one optimal alignment
- Suboptimal alignments may be of interest
- A scoring function for local pairwise alignment must satisfy the following requirements:
  - M > m > 2g, where M is the match score, m the mismatch score, and g the gap score.
  - The scoring function must be a similarity function.
  - The similarity matrix, p[i,j], must contain at least one positive value.
  - The expected random alignment score must be negative.
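The numeric requirements above can be checked mechanically for a simple match/mismatch scoring scheme. The sketch below rests on two assumptions that are not stated in these notes: the function name check_local_scoring is hypothetical, and the "expected random alignment score" is taken to mean the expected score of one aligned pair of letters drawn independently from background frequencies.

```python
def check_local_scoring(M, m, g, freqs):
    """Check a match/mismatch/gap scheme against the local alignment requirements.

    M, m, g: match, mismatch, and gap scores.
    freqs:   dict mapping each letter to its background probability
             (assumed to sum to 1).
    """
    # Probability that two independently drawn letters are identical.
    p_match = sum(f * f for f in freqs.values())

    # Expected score of one randomly aligned pair of letters (assumed
    # interpretation of "expected random alignment score").
    expected = p_match * M + (1 - p_match) * m

    return {
        "ordering_ok": M > m > 2 * g,          # M > m > 2g
        "has_positive": M > 0 or m > 0,        # at least one positive entry
        "expected": expected,
        "expected_negative": expected < 0,
    }


# Uniform DNA background frequencies, used here only as an example.
uniform_dna = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
result = check_local_scoring(M=1, m=-1, g=-2, freqs=uniform_dna)
```

With M = 1, m = -1, g = -2 and uniform DNA frequencies, the expected score per aligned pair is 0.25(1) + 0.75(-1) = -0.5, so all checks pass. A scheme such as M = 2, m = 1 has a positive expected score, and the local alignment would tend to extend across the entire sequences regardless of similarity.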
Last modified: September 2, 2010.
Maintained by Dannie Durand (durand@cs.cmu.edu) and Annette McLeod.