Pairwise alignment continued

Alignment algorithms

The dynamic programming algorithms for sequence alignment compute a matrix a[i,j], whose entries give the scores of the optimal alignments of prefixes of the two sequences. These algorithms have four components:

• Initialization of the first row and column of a[i,j].
• A recurrence relation for a[i,j], i,j ≥ 1.
• Determination of the score of the optimal alignment from the matrix a[i,j] in O(mn) time, where m and n are the lengths of the two sequences.
• Trace back through the alignment matrix to obtain the optimal alignment in O(m+n) time.

The details of each of these steps are what differentiate global, semi-global and local alignment.

Local Alignment

• Initialize the first row and column to zero: a[i,0] = 0 and a[0,j] = 0 for all i and j
• Recurrence

$$
a[i,j] = \max
\begin{cases}
a[i-1,j] + g \\
a[i-1,j-1] + p(s[i], t[j]) \\
a[i,j-1] + g \\
0
\end{cases}
$$
• The score of the optimal alignment is max{ a[i,j]}, where the maximum is taken over all i and all j.
• Trace back starting at a[i*,j*], the cell corresponding to the maximum score. End the trace back at the first cell with value zero.
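
The four steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name local_align and the default scores (match = 1, mismatch = -1, gap = -2) are assumptions chosen for the example, and p is taken to be a simple match/mismatch function. When several optimal alignments exist, the sketch returns only one of them.

```python
def local_align(s, t, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_s, aligned_t) for one optimal local alignment.

    The scoring parameters are illustrative assumptions, not values from the notes.
    """
    m, n = len(s), len(t)

    def p(x, y):
        # Hypothetical similarity function: one score for a match, one for a mismatch.
        return match if x == y else mismatch

    # Initialization: row 0 and column 0 are zero.
    a = [[0] * (n + 1) for _ in range(m + 1)]

    # Recurrence, tracking the maximum cell (i*, j*) as the matrix is filled.
    best, bi, bj = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a[i][j] = max(
                a[i - 1][j] + gap,                        # s[i] aligned with a gap
                a[i - 1][j - 1] + p(s[i - 1], t[j - 1]),  # s[i] aligned with t[j]
                a[i][j - 1] + gap,                        # t[j] aligned with a gap
                0,                                        # start a new alignment here
            )
            if a[i][j] > best:
                best, bi, bj = a[i][j], i, j

    # Trace back from the maximum cell until a cell with value zero is reached.
    i, j = bi, bj
    out_s, out_t = [], []
    while a[i][j] > 0:
        if a[i][j] == a[i - 1][j - 1] + p(s[i - 1], t[j - 1]):
            out_s.append(s[i - 1])
            out_t.append(t[j - 1])
            i, j = i - 1, j - 1
        elif a[i][j] == a[i - 1][j] + gap:
            out_s.append(s[i - 1])
            out_t.append("-")
            i -= 1
        else:
            out_s.append("-")
            out_t.append(t[j - 1])
            j -= 1

    return best, "".join(reversed(out_s)), "".join(reversed(out_t))


print(local_align("TTACGTAA", "GGACGTCC"))  # → (4, 'ACGT', 'ACGT')
```

Python strings are 0-based, so s[i] in the recurrence corresponds to s[i - 1] in the code. Two sequences with no matching characters give a score of 0 and an empty alignment, because every cell of the matrix is zero.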

Note that:

• There can be more than one optimal alignment.
• Suboptimal alignments may be of interest.
• A scoring function for local pairwise alignment must satisfy the following requirements:
  • M > m > 2g, where M is the match score, m is the mismatch score, and g is the gap score.
  • The scoring function must be a similarity function.
  • The similarity matrix, p[i,j], must contain at least one positive value.
  • The expected random alignment score must be negative.
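
The last two requirements can be checked numerically. The sketch below assumes that "expected random alignment score" means the expected score of a single aligned pair of residues drawn independently from background frequencies, that is, the sum over all x and y of f[x] · f[y] · p[x,y]. The helper names match_mismatch_matrix and check_local_scoring, and the uniform DNA frequencies in the example, are hypothetical and chosen only for illustration.

```python
from itertools import product


def match_mismatch_matrix(alphabet, M, m):
    """Build a simple similarity matrix: M on the diagonal, m elsewhere."""
    return {(x, y): (M if x == y else m) for x, y in product(alphabet, repeat=2)}


def check_local_scoring(p, freqs):
    """Check two requirements on a similarity matrix p for local alignment.

    p     : dict mapping (x, y) -> score
    freqs : dict mapping residue -> background frequency (assumed to sum to 1)
    """
    # Requirement: at least one positive entry.
    has_positive = any(score > 0 for score in p.values())
    # Requirement: negative expected score for a random aligned pair.
    expected = sum(
        freqs[x] * freqs[y] * p[(x, y)] for x, y in product(freqs, repeat=2)
    )
    return {
        "has_positive": has_positive,
        "expected": expected,
        "ok": has_positive and expected < 0,
    }


freqs = {c: 0.25 for c in "ACGT"}  # assumed uniform background
print(check_local_scoring(match_mismatch_matrix("ACGT", 1, -1), freqs))
# → {'has_positive': True, 'expected': -0.5, 'ok': True}
```

With M = 3 and m = -1 under the same uniform frequencies, the expected score is exactly 0, so that scheme would fail the last requirement.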