Computational Molecular Biology and Genomics Syllabus - Fall 2010

The materials in the "Assigned Reading" column are directly related to the topics covered in class. Readings under "Additional Topics" are strictly optional and will not be covered on the exams.
      In some cases, the same material is covered in more than one textbook. You have the choice of selecting the text that presents a treatment of the material most to your liking. It is your responsibility to make sure that you understand the material covered in class and you may read as many or as few of these texts needed to achieve that goal.

CLASS
DATE
TOPICS
ASSIGNED READING
ADDITIONAL TOPICS
1.  Aug. 24   Introduction to computational biology and genomics Introduction to computational biology and genomics
PS0 (due Aug. 31).
  Review biology and algorithms background  
2.  Aug. 26   Global pairwise sequence alignment
  • Lecture outline
  • Notation for alignment algorithms
  • Alignment examples
  • Global sequence alignment notes,
      courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 47-55, 89-92, 96-98; (electronic reserve)
  • Durbin, pp. 17-22 (electronic reserves)
  •  
  • Saving space: Setubal and Meidanis, 58-60; (physical reserve)
  • General gap penalty functions: Setubal and Meidanis, 60-64 (physical reserve)
  • 3.  Aug. 31   Global and semi-global pairwise sequence alignment
    Lecture outline
    Alignment examples
    PS0 due.
     
  • Setubal and Meidanis, 56-57; (electronic reserve)
  •  
  • Saving space: Setubal and Meidanis, 58-60; (physical reserve)
  • General gap penalty functions: Setubal and Meidanis, 60-64 (physical reserve)
  • 4.  Sept. 2   Local alignment.

    Lecture outline
    Alignment examples
  • Local sequence alignment notes,
      courtesy Dr. M. Singh, Princeton University
  • Setubal and Meidanis, 55; (electronic reserve)
  • Durbin, pp. 23-24 (electronic reserves)
  •  
    5.   Sept. 7 Affine gap functions
    Lecture outline

    PS1, (due Sept. 16).
    Alignment template
    Affine gap functions:
  • Setubal and Meidanis, 64-66; (electronic reserve)
  •  
    6.   Sept. 9 Global Multiple Sequence Alignment Lecture notes
  • Setubal and Meidanis, 69-72 (electronic reserve)
  • Multiple sequence alignment Notes: I and II,  courtesy Dr. M. Singh, Princeton University
  • Durbin, 6.1 -- 6.4(electronic reserves)
  •  
    7.  Sept. 14 Global MSA continued Lecture notes
  • Protein multiple sequence alignment , Do and Katoh, 2008.
  • On the Design of Optimization Criteria for MSA, Durand & Farach-Colton, In Biological Evolution and Statistical Physics, Laessig & Valleriani
  • Strategies for multiple sequence alignment, Nicholas HB Jr, Ropelewski AJ, Deerfield DW 2nd, Biotechniques 2002 Mar;32(3):572-4 (electronic reserve)
  • 8.   Sept. 16 Evolutionary trees Lecture notes
    PS1 due.
    Durbin, et al: (electronic reserves)
    7.1, 7.2:  Background on trees
    7.4:  Parsimony
    Phylogeny notes,pp 1-3 & 17-20,
      Dr. M. Singh, Princeton University
    Sankoff's algorithm for inferring ancestral sequences; in Inferring Phylogenies, J. Felsenstein, Sinauer, pp 13-16.
    9.   Sept. 21 Evolutionary trees - parsimony methods Lecture notes
    711/856 only:
    Literature assignment 1
  • Pop & Salzberg, 07
  • Flicek & Birney, 09
  •    
    10. Sept. 23 Evolutionary trees - distance methods Lecture notes
    Distance-based methods
  • Durbin, et al: 7.3(electronic reserves)
  • Phylogeny notes, pp.4-13,
      courtesy Dr. M. Singh, Princeton University
  •  
    11. Sept. 28 Distance-based phylogeny reconstruction.
    UPGMA
    Lecture notes

    Literature assignment 1 due.
    PS2 (due Oct. 5th).
       
    12. Sept. 30 Distance-based phylogeny reconstruction.
    Neighbor Joining
    Lecture notes

    Felsenstein*,   Ch. 11 UPGMA and NJ, pp. 161-169
  • Durbin, et al: 7.8, NJ proof. (physical reserves)
  • 13. Oct. 5 Distance-based phylogeny reconstruction: optimization criteria,
    Lecture notes

    Intro to Markov chains, notation

    Distance matrix methods: Felsenstein*,  
    Optimization criteria pp. 147-150.
    Navigating tree space, pp. 37-44

    Markov Chain background
    Ewens and Grant, 4.4-4.8
    Durbin et al., 3.1 (electronic reserves)
    Felsenstein*,   Ch. 11 Statistical interpretations of distance matrix methods, pp. 151-155

    Bryant & Waddell Rapid Evaluation of Least-Squares ... Phylogenetic Trees, MBE, 98.
    14. Oct. 7 Models of sequence substitution, Jukes Cantor model.

    Lecture outline
    Jukes Cantor R program
    Distance-based methods
  • Durbin, et al: 8.1-8.2(electronic reserves)
  • Phylogeny notes, pp.14-17,
      courtesy Dr. M. Singh, Princeton University
  •  
    15. Oct. 12 Maximum likelihood estimation;     Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood., pp. 1-8
    J. P. Huelsenbeck and K. A. Crandall, Ann. Rev. Ecol. Syst. 1997, 28:437-66
    Complexity results:
  • On the Approximability of Numerical Taxonomy: (Fitting Distances by Tree Metrics), Agarwala et al. , (SODA '96) (electronic reserve)
  • Efficient Algorithms for Inverting Evolution, Farach and Kannan, (STOC '96)
  • 16. Oct. 14   Midterm Exam
    This exam is closed book. You may bring two pages (or one page, front and back) of your own notes.
       
    17. Oct. 19 Class is cancelled.    
    18. Oct. 21 Phylogeny reconstruction summary Lecture notes    
    19. Oct. 26 Introduction to local MSA Position Specific Scoring Matrices
    A PSSM for the WEIRD motif
    A PSSM with pseudocounts

    711/856 only:
    Literature assignment 2
  • Delsuc, et al 2005
  •    
    20. Oct. 28 Local MSA, discovery
    Lecture notes
  • The Gibbs Sampler
  • Gibbs sampler
    Ewens and Grant, pp. 211-215.    Electronic reserve.
    Theoretical framework, convergence proofs
    Ewens and Grant, 10.5.2, Physical reserves.
    Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment, Lawrence et al., Science. 1993 262(5131):208-14.
    Explaining the Gibbs sampler, G. Casella & E. I. George, The American Statistician, 46:167-174, 1992

    Other motif discovery methods
    21. Nov. 2   Introduction to Hidden Markov Models Lecture notes    
    22. Nov. 4   Hidden Markov Models II
       The Viterbi algorithm Lecture notes
    Viterbi algorithm example
    Forward algorithm example

    Literature assignment 2 due.
    Viterbi, Forward, Backward algorithms
    Durbin, pp 55 - 61.
    Ewens and Grant, pp. 329-332 Electronic reserves.
    Hidden Markov Models in Computational Biology: Applications to Protein Modeling,
    Krogh et al., JMB 235, pp 1501--1531,(1994).
    Available through electronic reserves.
    23. Nov. 9 Hidden Markov Models II
       Forward & Backward algorithms, posterior decoding.
    Lecture notes

    PS3 (due Nov. 18th).
       
    24. Nov. 11 Hidden Markov Models III
       Discovery:
    Parameter estimation,
    topology.

    Parameter estimation, Baum-Welsh algorithm
    Durbin, pp 61-71
    Ewens and Grant, pp. 329-332 Electronic reserves.
    HMM topology:
    Durbin, pp 61-71 Electronic reserves.
     
    25. Nov. 16 Hidden Markov Models IV
       Profile HMMs
    Lecture notes
    Profile HMMs
    Durbin, pp 100 - 113.
    Ewens andGrant, pp. 335-337 Electronic reserves.
    Multiple alignment using HMMs
    Ewens and Grant, pp. 337 - 339 Electronic reserves.
     
    26. Nov. 18 Substitution Matrices
      PAM matrices
     Lecture notes

      PAM250,   PAM30

    PS3 due.
    Substitution matrices:
    Setubal and Meidanis, 80-84; (electronic reserve)
    Mount, pp 76-89; (electronic reserve)
    Durbin et al, pp 14-16 (electronic reserves)
     
    27. Nov. 23 Substitution Matrices
      BLOSUM matrices
     Lecture notes

    PS4 (due Dec. 2).
      BLOSUM45,   BLOSUM62,   BLOSUM80,

      BLAST home page
      BLAST Tutorial page  Recommended for students unfamiliar with BLAST
    BLOSUM Matrices:
    Ewens and Grant, 6.5.2.
    Amino acid substitution matrices from protein blocks, Henikoff S, Henikoff JG., PNAS 89(22):10915-9, 1992 (electronic reserve)
     
      Nov. 25 No class (Thanksgiving Holiday)

       
    28. Nov. 30 BLAST, the heuristic.
      Statistics of local, ungapped alignments.   Lecture notes
    BLAST
    Setubal and Meidanis, 84-87 (electronic reserve)
    Basic local alignment search tool, Altschul et al. , J. Mol. Bio., 1990 (electronic reserve)
    Blast statistics:
    The statistics of sequence similarity scores S. F. Altschul  
    Data base searching:
    Strategies for searching sequence databases, Nicholas HB Jr, Ropelewski AJ, Deerfield DW 2nd, Biotechniques 2002 Jun;28(6):1174-8 (electronic reserve)
    Blast statistics:
    Amino acid substitution matrices from an information theoretic perspective S. F. Altschul, J. Mol. Bio., 219:555-565, 1991 (electronic reserve)
    A protein alignment scoring system sensitive at all evolutionary distances, S. F. Altschul, J. Mol. Evol., 36:290-300 , 1993 (electronic reserve)
    Statistical Methods in Bioinformatics, W. Ewens and G. Grant (Physical reserves)

    Other BLAST references
    29. Dec. 2 Gapped BLAST
    Lecture notes

    PS4 due in class.  
       
      Dec. 7   Final Exam:   1pm - 4pm,   Location DH1112
    This exam is closed book. You may bring two pages (or one page, front and back) of your own notes. Study questions  
    To view online lectures in Quicktime format, you will need to have within your browser the QuickTime plug-in, and select it as the player for all media files. You can download the QuickTime Movie player for a PC or Mac free of charge at: http://www.apple.com/quicktime/download/index.html.



    Return to course homepage
    Last modified: August 22, 2010.
    Maintained by Dannie Durand (durand@cs.cmu.edu).