03-511/711 Reading Materials - Fall 2002


No single textbook covers all the material in this course and is appropriate for students from all backgrounds, so there is no required text. Course notes on sequence alignment and phylogeny reconstruction will be posted on this page.

In addition, there are two recommended texts:

These books will be on reserve in both the Mellon and Engineering libraries. Students may buy one or both books or use the copies on reserve. Note that the lack of a required text does not relieve you of the responsibility of studying the material.

For further reading, additional texts include:




   
Suggested reading
Additional topics - optional
1.   Aug. 27    
2.   Aug. 29    
3.  Sept. 3 Pairwise sequence comparison
  • Global sequence alignment notes,   courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 3.1, 3.2.1, 3.6.1, 3.6.2
  • Durbin, pp. 17 - 22
  • Mount, pp 64-76, 92 - 95
  • Repeated Matches, Durbin, pp. 24 - 28
  • Biological context, Mount, Chapter 3
    4.   Sept. 5
    5.   Sept. 10 Pairwise sequence comparison
  • Local sequence alignment notes,   courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 3.2.2, 3.2.3, 3.3.3
  • Durbin, pp. 22 - 24, 29 - 30
  • Mount, pp 64-76, 92 - 95
  • Saving space, Setubal and Meidanis, 3.3.1
  • General gap penalty functions, Setubal and Meidanis, 3.3.2
  • Biological context, Mount, Chapter 3
    6.   Sept. 12 Global multiple sequence alignment
  • Multiple sequence alignment notes, I,   courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 3.4
  • Mount, pp 145-156
  • On the Design of Optimization Criteria for MSA, Durand and Farach-Colton, In Biological Evolution and Statistical Physics, M. Laessig and A. Valleriani, Eds, Springer Verlag, 2002
    7.   Sept. 17 Global multiple sequence alignment
  • Multiple sequence alignment notes, II,   courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 3.4
  • Mount, pp 145-156
  • Strategies for multiple sequence alignment, Nicholas HB Jr, Ropelewski AJ, Deerfield DW 2nd, Biotechniques 2002 Mar;32(3):572-4 - handed out in class.
    8.   Sept. 19
    9.   Sept. 24   Local multiple sequence alignment
  • Motifs and Profile Analysis,   courtesy Dr. M. Singh, Princeton University
  • Durbin et al, 3.1 - 3.4
  • Mount, pp 161-198
  • Databases of patterns in protein families, Mount, p. 430
    10. Sept. 26 Hidden Markov Models
  • Durbin et al, 3.1 - 3.4
  • An Introduction to Hidden Markov Models, Rabiner and Juang, IEEE ASSP Magazine, 3(1):4-16, Jan, 1986 - handed out in class.
    11. Oct. 1    
    12. Oct. 3 Applications of HMMs to molecular biology
  • Profile Hidden Markov Models,   courtesy Dr. M. Singh, Princeton University
  • Durbin et al, 5.1 - 5.4 and pp 149-154, 158.
  • Hidden Markov Models in Computational Biology, Krogh et al., JMB, 235, 1501-1531, 1994. (You will need to be in the cmu.edu domain to access CMU's online subscription.)
  • Estimation of probabilities from counts: Durbin et al, 11.5
  • Expectation maximization: Durbin et al, 11.6.
    13. Oct. 8 Substitution matrices
  • Setubal and Meidanis, 3.5.1.
  • Mount, pp 76-89.
  • Durbin et al, pp 14-16.
    14. Oct. 10 Substitution matrices
  • Amino acid substitution matrices from protein blocks, Henikoff S, Henikoff JG, PNAS 89(22):10915-9, 1992.
    15. Oct. 15    
    16. Oct. 17    
    17. Oct. 22 Database Searching
    General principles:
  • Mount, pp. 282-291
    BLAST:
  • Setubal and Meidanis, 3.5.2.
  • Mount, pp. 300-307
  • Basic local alignment search tool, Altschul et al., J. Mol. Biol., 1990
  • Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Altschul et al., Nucleic Acids Research, 1997, pp. 3389 - 3394
    FASTA:
  • Setubal and Meidanis, 3.5.3.
  • Mount, pp. 291-299
    BLAST extensions:
  • Mount, pp. 308-314
  • Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Altschul et al., Nucleic Acids Research, 1997, pp. 3394 - 3402
    18. Oct. 24 Database Searching
    BLAST statistics:
  • The statistics of sequence similarity scores, S. F. Altschul

  •  
    BLAST statistics:
  • Amino acid substitution matrices from an information theoretic perspective, S. F. Altschul, J. Mol. Biol., 219:555-565, 1991
  • A protein alignment scoring system sensitive at all evolutionary distances, S. F. Altschul, J. Mol. Evol., 36:290-300, 1993
  • Other BLAST references
  • W. Ewens and G. Grant, Statistical Methods in Bioinformatics, Springer-Verlag, NY
    19. Oct. 29 Using BLAST in practice:
  • Blast tutorial
  • Strategies for searching sequence databases, Nicholas HB Jr, Ropelewski AJ, Deerfield DW 2nd, Biotechniques 2000 Jun;28(6):1174-8 - handed out in class.
    20. Oct. 31 Phylogeny reconstruction
      Background on trees
  • Phylogeny,   courtesy Dr. M. Singh, Princeton University, pp. 1 - 3, 7.
  • Durbin et al, pp 160-164.
  • Mount, pp 238-248.  
    21. Nov. 5 Phylogeny reconstruction
      Parsimony
  • Phylogeny notes, pp. 17 - 20.
  • Durbin et al, 7.4
  • Mount, pp 248-254.
    22. Nov. 7 Phylogeny reconstruction
      Distance
  • Phylogeny notes, pp. 4-13.
  • Durbin et al, 7.3
  •  
    Computing rate corrected distances:
    Jukes Cantor and Kimura 2 parameter models
  • Phylogeny notes, pp. 13-17.
  • Durbin et al, 8.2

    Complexity results:
  • On the Approximability of Numerical Taxonomy (Fitting Distances by Tree Metrics), Agarwala et al., (SODA '96)
    23. Nov. 12 Phylogeny reconstruction
     UPGMA:
  • Phylogeny notes, pp. 8 - 10.
  • Durbin, pp 166 - 169.
     Neighbor Joining:
  • Durbin, pp 169 - 173.
    24. Nov. 14 No class  
    25. Nov. 19   Phylogeny reconstruction: Maximum Likelihood
  • Durbin, pp 197 - 207, 224 - 231
  •  
  • Durbin, chapter 8.
  • Efficient Algorithms for Inverting Evolution, Farach and Kannan, (STOC '96)
    26. Nov. 21 Gene Finding:
  • Mount, pp 338 - 351
  • Gene Discovery in DNA Sequences
    S. Salzberg, IEEE 1999
  • A hidden Markov model that finds genes in E. coli DNA.
    A. Krogh et al., NAR 1994
  •  
  • Assessment of protein coding measures
    J.W. Fickett and C.S. Tung, NAR 1992
  • Distinctive sequence features in protein coding, genic non-coding, and intergenic human DNA.
    R. Guigo and J.W. Fickett, JMB 1995
    27. Nov. 26 Eukaryotic Gene Finding
  • Prediction of Complete Gene Structures in Human Genomic DNA
    C. Burge and S. Karlin, JMB 1997
  •  
  • Evaluation of Gene Structure Prediction Programs
    M. Burset and R. Guigo, Genomics 1996
    Nov. 28 Thanksgiving - no class
    28. Dec. 3    
    29. Dec. 5    


    Last modified: November 21st, 2002.
    Maintained by Dannie Durand and Annette Welsch (durand@cs.cmu.edu).