Carlos Guestrin's Research

This page mentions a few recent projects in our group. More details can be found in the Select Lab website.

GraphLab: Abstraction and System for Parallel Machine Learning

Many efficient machine learning algorithms cannot be naturally decomposed as computations over independent subproblem, and are thus inefficient when implemented under a Map-Reduce abstraction. GraphLab addresses this challenge by providing a natural abstraction for asynchronous, iterative, parallel algorithms over sparse (graph-represented) data. The approach has been applied to a wide range of machine learning tasks, including inference and learning in graphical models, collaborative filtering and matrix factorization, and clustering.

The first GraphLab paper was presented at UAI 2010: GraphLab: A New Framework for Parallel Machine Learning.
Code for GraphLab is available at www.graphlab.org.

Taming Information Overload

We are overwhelmed with the amounts of information available on the web, with the volumes of opinions expressed in our political process, and with the numbers of scientific papers published every year. Our goal is to discover new ways to connect seemly dispare pieces of information, and to quickly learn a user's preference, thus providing personalized recommendations. Recent results include:

Parallel Machine Learning Algorithms

As our computer architectures have transitioned to multicore and the cloud, we must develop parallel machine learning algorithms that can exploit such hardware. Our recent results include:

Submodular Function Optimization, Optimizing Information Gathering, Sensor Placement, and Experiment Design

We have showed that applications in Civil Engineering, environmental monitoring, and information gathering on the web require us to find the most relevant information at the minimal cost. Although these problems are often solved by heuristics without theoretical guarantees, we demonstrated that they can often be formulated as submodular function optimization problems that are amenable to efficient algorithms with theoretical guarantees. Some of our results in this are include:

Representing Probability Distributions over Rankings and Permutations

Permutation problems are ubiquitous is ranking, voting, and data association. However, representing probability distributions over such problems is very challenging, because there are n! possibilities, and standard techniques, such as graphical models, are inadequate to represent structure in these domains. We have developed the notion of riffle independence in rankings, and algorithms using Fourier representations over permutations to tackle these challenging tasks. Our results include:

Probabilistic Graphical Models

We have also developed algorithms for a number of core machine learning problems, especially in the are of graphical models. A core theme in this work is the development of methods with theoretical guarantees that address significant, real-world challenges. Some results include:

Distributed Algorithms for Inference in Sensor Networks

Data in a sensor network is highly distributed throughout the environment. We thus need to develop techniques that can fuse the data and make predictions in a distributed fashion, while being robust to communication and node failures. Our results here include:

Multiagent Reinforcement Learning and Factored MDPs

My thesis work demonstrated how to exploit problem-specific structure to scale up MDPs and enable the efficient coordination of multiple agents. These results include: