Algorithms and Models for
Deonier, Tavare and Waterman, Computational Genome Analysis.
, Wean Hall 4616, x8-5527
Dramatic advances in experimental
technology and computational analysis are fundamentally transforming
the basic nature and goal of biological research. The emergence of new
frontiers in biology, such as systems biology and evolutionary
genomics, is demanding new methodologies that can confront quantitative
issues of substantial computational and mathematical sophistication.
Machine learning and probabilistic modeling represent the methods of
choice for designing systems that can integrate, comprehend, and query vast
bodies of heterogeneous biological data based on well-founded statistical
principles. They provide a systematic computational framework for large-scale
analysis of dynamic, noisy and dependent experimental data, a
convenient vehicle for adopting
the Bayesian philosophy, whereby one can formally incorporate biological
knowledge into the models, and a firm foundation on which to design
and simulation models for biological data from heterogeneous sources.
This course focuses on modern machine learning methodologies for computational problems in molecular biology and genetics, including probabilistic modeling, inference and learning algorithms, pattern recognition, data integration, time series analysis, and active learning. We will discuss classical approaches and the latest methodological advances in the context of the following biological problems:
1) Analysis of high-throughput biological data, such as gene expression data, focusing on issues ranging from data acquisition to pattern recognition and classification.
2) Computational genomics, focusing on gene finding, motif detection and sequence evolution.
3) Medical and population genetics, focusing on polymorphism analysis, linkage analysis, pedigree and genetic demography.
4) Molecular and regulatory evolution, focusing on phylogenetic inference and regulatory network evolution.
5) Systems biology, concerning how to combine sequence, expression and other biological data sources to infer the structure and function of different systems in the cell.
Students are expected to have successfully completed
10701 (Machine Learning), or an equivalent class.