Visualization and Exploration of Large Multiple Sequence Alignments
We wish to set protein sequence alignments free from the usual grid-of-letters visualization. As the number of sequences in an alignment grows, that representation becomes less useful for capturing all of the information in the data. Information is also lost when the amino acids are grouped by one property, or by a particular combination of properties, to create a single color scheme for the letter grid. We prefer to consider individual properties separately and simultaneously.
Consider this display of an alignment. Each column in the
alignment is represented as a vertical histogram of amino acid
property values, in this case hydrophobicity. The height of
each bar represents the proportion of sequences with a given
value. The color is scaled with the property value: red for
hydrophobic, blue for hydrophilic.
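The per-column histogram described above can be sketched in a few lines of Python. This is only an illustration, not VELMA's own code: the Kyte-Doolittle hydropathy scale, the function names `column_property_histogram` and `value_to_rgb`, the choice to skip gap characters, and the linear blue-to-red color ramp are all assumptions made for the example.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values. Assumed scale: the page does not say
# which hydrophobicity scale the display actually uses.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_property_histogram(alignment, column, scale=KYTE_DOOLITTLE):
    """Map each property value in one column to the proportion of sequences
    that have it. Gaps and unknown residues are skipped (an assumption)."""
    values = [scale[seq[column]] for seq in alignment if seq[column] in scale]
    if not values:
        return {}
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in sorted(counts.items())}


def value_to_rgb(value, lo=-4.5, hi=4.5):
    """Hypothetical color ramp: blue at the hydrophilic end, red at the
    hydrophobic end, linearly interpolated in between."""
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))
    return (round(255 * t), 0, round(255 * (1 - t)))


# Toy alignment of four sequences, four columns each ("-" is a gap).
alignment = ["MKVL", "MRIL", "MKV-", "LKAL"]
first_column = column_property_histogram(alignment, 0)
last_column = column_property_histogram(alignment, 3)
```

Each key of the returned dictionary corresponds to one bar in the column's histogram, each value to that bar's height, and `value_to_rgb` gives the bar's color.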
We also promised multiple simultaneous views. The screen
below demonstrates this with a possible sample session. There
are two separate property distribution displays, a display of
the 3D structure of the protein, a scattergram plotting two
features of each position, and the grid-of-letters view for
traditionalists. All of these views are linked:
selecting a range of positions in one display updates the
selection in all of the other displays.
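One common way to keep several views in sync is a shared selection model that notifies every registered view when the selection changes. The sketch below shows that general pattern only. It is not taken from VELMA's source, and the class names `SelectionModel` and `View`, along with their methods, are hypothetical.

```python
class SelectionModel:
    """Holds the currently selected range of alignment positions
    (hypothetical class, for illustration only)."""

    def __init__(self):
        self.selected = None
        self._listeners = []

    def subscribe(self, callback):
        # Register a function to call whenever the selection changes.
        self._listeners.append(callback)

    def select(self, start, end):
        # Normalize so the range is always (low, high), then notify everyone.
        self.selected = (min(start, end), max(start, end))
        for callback in self._listeners:
            callback(self.selected)


class View:
    """Stand-in for any display: histogram, 3D structure, scattergram, grid."""

    def __init__(self, name, model):
        self.name = name
        self.model = model
        self.highlighted = None
        model.subscribe(self._on_selection)

    def _on_selection(self, position_range):
        # A real view would redraw here; this one just records the range.
        self.highlighted = position_range

    def user_drag(self, start, end):
        # A selection made in this view goes through the shared model,
        # so every other view receives the same update.
        self.model.select(start, end)


model = SelectionModel()
names = ["hydrophobicity", "structure", "scattergram", "letters"]
views = [View(name, model) for name in names]
views[2].user_drag(25, 10)  # select positions 10 through 25 in the scattergram
```

Because no view talks to another directly, a new kind of display can be added by subscribing it to the same model.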
Have we piqued your interest? Click here to find out how to get the software up and running on your machine.
VELMA makes use of functions provided by the following open source libraries (no need to download; just giving credit where credit is due):