Advanced Algorithms and Models for Computational Biology

10-810, Spring 2006

School of Computer Science, Carnegie Mellon University






Class Assistant:

Course Description


Dramatic advances in experimental technology and computational analysis are fundamentally transforming the nature and goals of biological research. The emergence of new frontiers in biology, such as systems biology and evolutionary genomics, demands new methodologies that can confront quantitative issues of substantial computational and mathematical sophistication. Machine learning and probabilistic modeling are the methods of choice for designing systems that can integrate, comprehend, and query vast bodies of heterogeneous biological data based on well-founded statistical principles. They provide a systematic computational framework for large-scale statistical analysis of dynamic, noisy, and dependent experimental data; a convenient vehicle for adopting the Bayesian philosophy, whereby one can formally incorporate prior biological knowledge into the models; and a firm foundation on which to design composite predictive and simulation models for biological data from heterogeneous sources.

This course focuses on modern machine learning methodologies for computational problems in molecular biology and genetics, including probabilistic modeling, inference and learning algorithms, pattern recognition, data integration, time series analysis, and active learning. We will discuss classical approaches and the latest methodological advances in the context of the following biological problems: 1) analysis of high-throughput biological data, such as gene expression data, focusing on issues ranging from data acquisition to pattern recognition and classification; 2) computational genomics, focusing on gene finding, motif detection, and sequence evolution; 3) medical and population genetics, focusing on polymorphism analysis, linkage analysis, pedigree analysis, and genetic demography; 4) molecular and regulatory evolution, focusing on phylogenetic inference and regulatory network evolution; and 5) systems biology, concerning how to combine sequence, expression, and other biological data sources to infer the structure and function of different systems in the cell.


Students are expected to have successfully completed 10-701 (Machine Learning) or an equivalent class.

