Statistical Software Debugging

Abstract

Traditional software debugging is an arduous task that requires time, effort, and a good understanding of the source code. Given the scale and complexity of the task, the development of methods for automatically debugging software seems both essential and very difficult. However, several trends make such an endeavor increasingly realistic: (1) the wide-scale deployment of software, (2) the establishment of distributed crash report feedback systems, and (3) the development of statistical machine learning algorithms that can take advantage of aggregate data over multiple users.
In this talk, I present a statistical software debugging framework that applies machine learning techniques to run-time reports of instrumented programs. The problem has a relatively simple solution under the single-bug assumption. However, in the more realistic case of multiple bugs, simple feature selection and classification techniques are no longer sufficient. I describe the challenges and present a solution inspired by bi-clustering algorithms.
This is joint work with Ben Liblit (U. Wisconsin, Madison), Michael Jordan (U.C. Berkeley), Alex Aiken and Mayur Naik (Stanford).

Pradeep Ravikumar

Last modified: Fri Oct 13 11:22:28 EDT 2006