Computational Molecular Biology and Genomics Syllabus and Reading Assignments - Fall 2003


CLASS | DATE | TOPICS | ASSIGNED READING | ADDITIONAL TOPICS
1.  Aug. 26 Course overview
Introduction to computational biology and genomics I
Review biology and algorithms background  
2.  Aug. 28 Introduction to computational biology and genomics II
  PS0 handed out. Due 9/4.
   
3.  Sept. 2 Global pairwise sequence alignment
Lecture outline
Alignment examples
  • Global sequence alignment notes,
      courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 47-55, 89-92, 96-98 (electronic reserve)
  • Durbin, pp. 17-22 (course text)
  • Saving space: Setubal and Meidanis, 58-60 (physical reserve)
  • General gap penalty functions: Setubal and Meidanis, 60-64 (physical reserve)
    4.   Sept. 4 Online lectures in QuickTime format:
      Introduction to sequencing,
      D. Durand
      Genome Assemblies and Interval Graphs
      M. Farach-Colton, Rutgers Univ.

      PS0 due in class.
      PS1 handed out. Due 9/9.
       
    5.   Sept. 9 Local pairwise sequence alignment.
    Semiglobal alignment.
    Affine gap penalties.
    Lecture outline

      PS1 due in class.
  • Local sequence alignment notes,
      courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 55-57, 64-66 (electronic reserve)
  • Durbin, pp. 23-24, 29-30 (course text)
    6.   Sept. 11 Global Multiple Sequence Alignment
    Lecture outline

  • Setubal and Meidanis, 69-72 (electronic reserve)
  • Multiple sequence alignment notes, I,
  • Multiple sequence alignment notes, II,
      courtesy Dr. M. Singh, Princeton University
  • Durbin, 6.1-6.4 (course text)
  • On the Design of Optimization Criteria for MSA, Durand and Farach-Colton, in Biological Evolution and Statistical Physics, M. Laessig and A. Valleriani, Eds., Springer Verlag, 2002
  • Strategies for multiple sequence alignment, Nicholas HB Jr, Ropelewski AJ, Deerfield DW 2nd, Biotechniques 2002 Mar;32(3):572-4 (electronic reserve)
    7.   Sept. 16 Global MSA summary; introduction to phylogeny reconstruction.

      PS2 handed out. Due 9/23.
      Alignment template
       
    8.   Sept. 18 Phylogeny Reconstruction
    Lecture outline
    Newick tree format
    Durbin, et al: (course text)
    7.1, 7.2:  Background on trees
    7.4:  Parsimony
    Parsimony, nice examples
  • Mount, pp. 248-254 (physical reserve)

    9.   Sept. 23 Phylogeny Reconstruction
     Distance-based methods.
    Introduction to class projects
      PS2 due in class.
    Distance-based methods
  • Durbin, et al: 7.3 (course text)
  • Phylogeny notes,
      courtesy Dr. M. Singh, Princeton University
    10. Sept. 25 Phylogeny Reconstruction
     Distance-based methods.
    Lecture outline
     UPGMA algorithm
     NJ algorithm

      PS3 handed out. Due 10/2
       
    11. Sept. 30 Phylogeny Reconstruction
     Minimum Evolution
    Lecture outline

      Project preference email due.
      PS2 solutions
  • Phylogeny notes,
      courtesy Dr. M. Singh, Princeton University
  • Theoretical foundation of the minimum-evolution method of phylogenetic inference, Rzhetsky and Nei, MBE 1993;10(5)
  • On the optimization principle in phylogenetic analysis and the minimum-evolution criterion, Gascuel, MBE 2000;17(3)
    12. Oct. 2 Phylogeny Reconstruction
     Probabilistic models of evolution (Jukes-Cantor)
    Lecture outline

      PS3 due in class.
    Markov Chain background
    Ewens and Grant, 4.4-4.8
    Durbin et al., 3.1
    Probabilistic models of evolution
    Durbin, et al: 8.1, 8.2 (course text)
    Phylogeny notes,  courtesy Dr. M. Singh, Princeton University
    13. Oct. 7 Phylogeny Reconstruction
     Maximum Likelihood
     Comparison of methods, Evaluation of results

    Lecture outline

      PS3 solutions
    Durbin, et al: (course text)
    8.3, 8.4:  Maximum Likelihood
    Complexity results:
  • On the Approximability of Numerical Taxonomy (Fitting Distances by Tree Metrics), Agarwala et al. (SODA '96) (electronic reserve)
  • Efficient Algorithms for Inverting Evolution, Farach and Kannan (STOC '96)
    14. Oct. 9 Midterm Exam
    This exam is closed book. You may bring two pages (or one page, front and back) of your own notes. The midterm covers all material up to October 7th.
       
    15. Oct. 14 Hypothesis testing lecture notes
    Local multiple alignment
     PSSM example

    Online protein domain databases:
      CDD: Conserved Domain Database
       CDART: Conserved Domain Architecture Retrieval Tool,

  • Motifs and Profile Analysis,
      courtesy Dr. M. Singh, Princeton University
  • Durbin, et al: p. 102 (course text)
  • Pseudocounts:
  • Durbin, et al: 5.6 (course text)
    16. Oct. 16 Local multiple alignment
     Gibbs sampler lecture notes

       Project proposal/outline due. Hand it in during class or email it to me in ASCII, PDF, or PostScript format. Do not email a Word document.
    Gibbs sampler
    Ewens and Grant, pp. 211-215.   Handed out in class. Also on physical reserve.
    Theoretical framework, convergence proofs
    Ewens and Grant, 10.5.2, Physical reserves.
    Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment, Lawrence et al., Science. 1993 262(5131):208-14.

    Other motif discovery methods
    17. Oct. 21  Hidden Markov Models I
    Guest lecturer: Rose Hoberman.
    Lecture notes
    Introduction to Markov models
    Durbin, pp 46-55.
    Ewens and Grant, pp. 327-329 Electronic reserves.
    Viterbi, Forward, Backward algorithms
    Durbin, pp 55 - 61.
    Ewens and Grant, pp. 329-332 Electronic reserves.
    Hidden Markov Models in Computational Biology: Applications to Protein Modeling,
    Krogh et al., JMB 235, pp 1501--1531,(1994).
    Available through electronic reserves.
    18. Oct. 23  Hidden Markov Models II
    Guest lecturer: Rose Hoberman.
    Lecture notes
    Profile HMMs
    Durbin, pp 100 - 113.
    Ewens and Grant, pp. 335-337 Electronic reserves.
     
    19. Oct. 28  Hidden Markov Models III
     Lecture notes

    Discussion topic: Genome Sequences from the Sea
    HMM topology, parameter estimation, Baum-Welch algorithm
    Durbin, pp 61-71
    Ewens and Grant, pp. 329-332 Electronic reserves.
    Multiple alignment using HMMs
    Ewens and Grant, pp. 337 - 339 Electronic reserves.
     
    20. Oct. 30 Substitution Matrices
      PAM matrices
     Lecture notes
  • Setubal and Meidanis, 80-84 (electronic reserve)
  • Mount, pp. 76-89 (electronic reserve)
  • Durbin et al, pp 14-16 (course text)
    21. Nov. 4 Substitution Matrices
      PAM matrices, BLOSUM matrices
     Lecture notes

      Revised proposal/outline due.
      PS4 handed out. Due 11/13
    BLOSUM Matrices:
    Ewens and Grant, 6.5.2.

    Amino acid substitution matrices from protein blocks, Henikoff S, Henikoff JG., PNAS 89(22):10915-9, 1992 (electronic reserve)

    Scoring systems
     
    22. Nov. 6 Database searching; BLAST
     Lecture notes

      BLAST home page

      BLAST Tutorial page (recommended for students unfamiliar with BLAST)
    Data Base Searching
    Mount, pp. 282-291 (electronic reserve)

    BLAST
    Setubal and Meidanis, 84-87 (electronic reserve)
    Basic local alignment search tool, Altschul et al., J. Mol. Bio., 1990 (electronic reserve)
     
    23. Nov. 11 BLAST; statistics of local, ungapped alignments.
     Lecture notes

     
    The statistics of sequence similarity scores, S. F. Altschul
    Blast statistics:
    Amino acid substitution matrices from an information theoretic perspective, S. F. Altschul, J. Mol. Bio., 219:555-565, 1991
    A protein alignment scoring system sensitive at all evolutionary distances, S. F. Altschul, J. Mol. Evol., 36:290-300 , 1993
    24. Nov. 13 BLAST; statistics of local, ungapped alignments.
     Lecture notes



       PS4 due in class.
       PS4 solutions
    Strategies for searching sequence databases, Nicholas HB Jr, Ropelewski AJ, Deerfield DW 2nd, Biotechniques 2000 Jun;28(6):1174-8 (electronic reserve)
    Blast statistics:
    Statistical Methods in Bioinformatics, W. Ewens and G. Grant (Physical reserves)

    Other BLAST references:
    25. Nov. 18 Gapped BLAST
     Lecture notes

    Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Altschul et al., Nucleic Acids Research, 1997, pp. 3389-3402 (electronic reserve)

     
    26. Nov. 20 Prokaryotic Gene Finding
     Lecture notes

     Discussion topic: Defining genes in the genomics era.
    Snyder and Gerstein, Science (2003) 300(5617):258-60.

      PS5 handed out. Due 12/3
  • Gene Discovery in DNA Sequences
    S. Salzberg, IEEE 1999 (electronic reserve)
  • A hidden Markov model that finds genes in E. coli DNA, A. Krogh et al., NAR 1994 (electronic reserve)
  • Assessment of protein coding measures
    J.W. Fickett and C.S. Tung, NAR 1992 (electronic reserve)
  • Distinctive sequence features in protein coding genic non-coding, and intergenic human DNA, R. Guigo and J.W. Fickett, JMB 1995 (electronic reserve)
    27. Nov. 25 Eukaryotic Gene Finding
     Lecture notes

     Discussion topic: Yeast rises again.
    S. Salzberg, Nature (2003) 423, 233-234
  • Prediction of Complete Gene Structures in Human Genomic DNA, C. Burge and S. Karlin, JMB 1997 (electronic reserve)

  • Ewens and Grant, pp. 340-346.

  • Evaluation of Gene Structure Prediction Programs, M. Burset and R. Guigo, Genomics 1996 (electronic reserve)
    Nov. 27 No class (Thanksgiving holiday)
    28. Dec. 2 Project presentations
       
    Wednesday, Dec. 3
       PS5 due at noon in Rose's office WH5119.
       
    29. Dec. 4 Project presentations
      10:30 - 10:40 Course evaluation
      Project final papers due.
       
    30. Monday
    Dec. 8th
    Final Exam:
      17:30 - 20:30 Porter Hall A18C
       
    To view the online lectures in QuickTime format, install the QuickTime plug-in in your browser and select it as the player for all media files. The QuickTime player for PC or Mac can be downloaded free of charge at: http://www.apple.com/quicktime/download/index.html.



    Last modified: December 6, 2003.
    Maintained by Dannie Durand (durand@cs.cmu.edu).