Computational Molecular Biology in the School of Computer Science

Computational molecular biology is an active area of research at Carnegie Mellon, carried out both by faculty members who make it their primary research area and through collaborations between computational and biological scientists. Areas of emphasis include the application of machine learning and data mining techniques to large biological knowledge bases; biological modeling; computational aspects of high-throughput laboratory methodologies for large-scale, systematic studies of protein structure and function; analysis of biological image data; and computational genomics. Carnegie Mellon recently won a $3.2M multi-investigator grant for research on bioinformatics and cancer.

Carnegie Mellon University offers educational programs in computational biology through the undergraduate degree in Biological Sciences with a Computational Option or the B.S. Degree in Biological Sciences/Computational Biology, the Professional Masters Program in Computational Biology, and the Merck Graduate Program in Computational Biology and Chemistry. Graduate student applicants wishing to participate in the Merck Computational Biology and Chemistry graduate program at Carnegie Mellon must apply to and be accepted into one of the graduate programs of the participating departments: Biological Sciences, Chemistry, and Computer Science.

Pittsburgh is a fertile environment for research combining biology with other disciplines, including activities at centers such as the Center for ALgorithm ADaptation Dissemination and INtegration (ALADDIN), the Center for Biological Language Modeling, the Center for Light Microscope Imaging and Biotechnology, the Center for the Neural Basis of Cognition, Faculty of Biomedical Engineering, The Pittsburgh NMR Center for Biomedical Research, and the Pittsburgh Supercomputing Center.


Ziv Bar-Joseph analyzes high-throughput biological datasets (especially gene expression and time series gene expression data, and protein-DNA binding data). His work addresses issues ranging from the experimental design level to the systems biology level.
Guy Blelloch develops scalable, parallel high-accuracy algorithms for simulating the flow of blood in the Sangria Project.
Jaime Carbonell is interested in the application of artificial intelligence to problems in computational molecular biology.
Dannie Durand studies the evolution of vertebrate genome organization and functional diversity through gene duplication, using combinatorial methods to analyze data obtained from diverse biological data sets via web-based information retrieval techniques.
Mike Erdmann studies problems in protein structure comparison in collaboration with Gordon Rule (Biological Sciences).
Chris Langmead develops algorithms for high-throughput structural biology, accelerating critical steps in determining and modeling protein structures and their dynamics, thereby facilitating faster and better techniques for drug design. Langmead also develops tools for the analysis of large-scale protein and gene expression data. This work will lead to early and better disease diagnosis, as well as the design of targeted therapies against illnesses like cancer.
Tom Mitchell applies machine learning and information retrieval techniques to the automatic construction of protein knowledge bases.
Gary Miller participates in the Sangria Project, a research project to design and apply advanced parallel geometric and numerical algorithms and software for simulating complex flows with dynamic interfaces, such as blood flow.
Andrew W. Moore applies clustering and data mining techniques to the analysis of multidimensional spaces of complex, biological entities.
R. Ravi collaborates with Jon Minden (Biological Sciences) and Alan Frieze (Mathematical Sciences) on combinatorial and statistical problems that arise in developing proteomics assays.
Roni Rosenfeld collaborates with Judith Klein-Seetharaman on the application of techniques from natural language modeling to biological sequences, using information theory, statistics and AI. These techniques include N-gram analysis, prediction in sparse domains, and automatic classification.
Russell Schwartz develops mathematical models and algorithms for studying genome variations between members of a species. He also develops methods for modeling self-assembling biological systems.
Raul Valdes-Perez is using recently invented data mining methods to pose new questions about genome-wide datasets on gene expression, function, location, and other properties. The new question is: how is a specific, given gene interestingly unique, or at least highly distinctive, compared to all other genes in the same genome?
Eric Xing develops probabilistic inference and learning algorithms for computational biology and statistical genetics, and for generic intelligent systems with a wide range of applications. His work addresses problems ranging from network modeling in systems biology to genetic polymorphisms associated with diseases.

Faculty in allied departments at Carnegie Mellon who are using computational and mathematical approaches to study related biological problems include:
Bill Eddy (Statistics)
Alan Frieze (Mathematical Sciences)
Chris Genovese (Statistics)
Rob Kass (Statistics)
Bob Murphy (Biological Sciences)
John Nagle (Physics)
Kathryn Roeder (Statistics)
Larry Wasserman (Statistics)

Last modified: Nov 4 2004. Maintained by Dannie Durand (