Advanced Algorithms and Models for Computational Biology
Spring 2007

School of Computer Science, Carnegie Mellon University


Class participation and reading

For the first part of the course we will be using Biological Sequence Analysis by Durbin, Eddy, Krogh and Mitchison (Cambridge University Press). However, since computational genomics is a rapidly evolving field, there is currently no textbook that covers all the material in this course. Most lectures will have assigned reading, and students are required to read the assigned papers before class. Most reading assignments will be recent papers in scientific journals or conferences. Initially, you may find some of these papers hard to read (either because of the biological terms or because of the computational methods, depending on your background). However, as the course progresses our hope is that these reading assignments will become easier, so that by the end of the term you will be able to read (and understand) papers from Science, Nature and PNAS (or RECOMB, depending on your background).

Homework resources and collaboration policy

Homeworks and the exam may contain material that has been covered by papers and webpages. Since this is a graduate class, we expect students to want to learn and to work out the answers themselves rather than google for them.

Homeworks will be done individually: each student must hand in their own answers. It is acceptable, however, for students to collaborate in figuring out answers and helping each other solve the problems. We will be assuming that, as participants in a graduate course, you will be taking the responsibility to make sure you personally understand the solution to any work arising from such collaboration. You also must indicate on each homework with whom you collaborated.

The final project may be completed by small teams.

Late homework policy

Homework regrades policy

If you feel that we have made an error in grading your homework, please turn in your homework with a written explanation to Monica Hopes, and we will consider your request. Please note that regrading of a homework may cause your grade to go up or down.

Homework assignments

We will have four problem sets. Problem sets will consist of both theoretical and programming problems. This is not a computer systems class, and so the programming load will be small. Still, we think that it is essential to work with real data since computational biology is an applied field. We will use MATLAB for the programming part, and we will have an ‘intro to MATLAB’ class for those who have not worked with MATLAB before.

Final project

Computational biology faces many challenges, and it is possible to arrive at very interesting results in a (relatively) short time. Projects will either try to extend one of the methods discussed in class, apply a method to a new dataset, or apply a new / revised algorithm to one of the problems discussed in class (for example, a new clustering or classification algorithm). If you are working on a new machine learning / graph theoretical or biological problem, and you think it can be applied to one of the problems we discussed, you can base your project on this problem. Projects will be done in groups of up to three people. I am hoping to have a good mix of students, so that heterogeneous teams of students from different departments can form. The projects will include a writeup (of 6-8 pages) and a class presentation. We will hold preliminary class presentations towards the last third of the class. I encourage you to form teams as early as possible, and start working on a problem as soon as you can. Once you have formed a team, you are welcome to schedule a meeting with me to discuss possible projects.

For the project milestone, roughly half of the project work should be completed. A short, graded write-up will be required, and we will provide feedback.


Note to people outside CMU

Feel free to use the slides and materials available online here. Please email the instructors with any corrections or improvements.