Geoffrey J. Gordon

I'm a professor in the Machine Learning Department at Carnegie Mellon.  I am also affiliated with the Robotics Institute.  I'm interested in multi-agent planning, reinforcement learning, decision-theoretic planning, statistical models of difficult data (e.g. maps, video, text), computational learning theory, and game theory.  My group is called the SELECT lab (for SEnse, LEarn, and aCT).  Here is its mailing list.

I spent AY 2003-4 as a visiting professor at the Stanford Robotics Lab.  Before joining CMU I worked for Burning Glass Technologies, a company that provided intelligent searching and matching software for resumes and job postings.  The company was headquartered in San Diego, but I worked at their Pittsburgh office.

Before that, I was a postdoctoral researcher at the AUTON lab in the Robotics Institute.  Before that, I was a Computer Science PhD student, with advisor Tom Mitchell.  For my RI page, click here.  Here are the CMU machine learning lunch page, CS local page, facilities help page, budgeting system, and facilities costing system.  Here is CMU's finger gateway.


Office: GHC 8213, x8-7399
(To reach this number from outside CMU, first dial 1-412-26.)

Shipping address:
6105 Gates Hillman Complex
Machine Learning Department
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

You can email me at user_ID, where user_ID is part of the URL for this page.


Spring 2023 I am teaching 10-405 and 10-605, Machine Learning with Large Datasets, with Barnabas Poczos.

Spring 2022 I taught 10-606 and 10-607, Mathematical Background for ML and Computational Background for ML. I taught these courses before in Fall 2017. Together they cover the same material as 10-600 from Fall 2016, renumbered since CMU's registration system prefers distinct numbers for the two minis.

Spring 2021 I taught 10-701, Intro to ML, with Aarti Singh. I also taught this course in Fall 2014 with Aarti Singh, and in Fall 2013 with Alex Smola.

Fall 2015 I taught 10-601, the masters-level machine learning course, with Aarti Singh.  I also taught this course in Fall 2009 with Miro Dudík.

Spring 2013 I taught the MLD Journal Club.  I also taught this course in Spring 2012, in Fall 2011, with Tom Mitchell in Fall 2010, with Aarti Singh in Spring 2010, with Ann Lee several times before that, and with Steve Fienberg several times before that.

Fall 2012 I taught 10-725, Optimization, with Ryan Tibshirani.  I also taught this course in Spring 2010 and Spring 2008 with Carlos Guestrin.

Spring 2011 I taught 15-780, the graduate AI Star course, with Tuomas Sandholm.  I also taught this course in Spring 2009 with Tuomas, and in Fall 2007 and Fall 2006 with Ziv Bar-Joseph.

Spring 2004 I taught CS23N, Robotics and Machine Learning, with Andrew Ng at Stanford.

Summer 2003 I organized the CALD summer school with Tom Mitchell.

Fall 2002 I taught 16-899C, Statistical Techniques in Robotics, with Sebastian Thrun.


Here is a current list of the students I am supervising.

Notes, examples, and tutorials

These are informal notes rather than polished presentations, so let me know if you find any errors.


  • The New Artist: art created by robots, for robots.  (A collaboration led by Axel Straschnoy, of which I am a small part.)

Playing games

  • Play some one-card poker.
  • Compute some correlated equilibria.
  • Some slides on what it means to be a reasonable learning algorithm in a repeated game.  I presented these as an invited talk at the AAAI workshop on multiagent learning in 2005.
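The correlated-equilibria applet above solves a small linear program, but the incentive constraints themselves are easy to check directly.  Here is a sketch in Python/numpy of such a checker — the example game (chicken) is the standard textbook one, and the function and variable names are my own illustration, not the code behind the applet:

```python
import numpy as np

def is_correlated_eq(A, B, p, tol=1e-9):
    """Check the correlated-equilibrium incentive constraints for a
    two-player game: no player should gain by deviating from a
    recommendation drawn from the joint distribution p."""
    n, m = A.shape
    for a in range(n):              # row player: following recommendation a
        for a2 in range(n):         # must be at least as good as playing a2
            if np.dot(p[a, :], A[a, :] - A[a2, :]) < -tol:
                return False
    for b in range(m):              # column player, symmetrically
        for b2 in range(m):
            if np.dot(p[:, b], B[:, b] - B[:, b2]) < -tol:
                return False
    return True

# The textbook game of chicken: rows/columns are (dare, swerve).
A = np.array([[0, 7], [2, 6]])      # row player's payoffs
B = np.array([[0, 2], [7, 6]])      # column player's payoffs

# A well-known correlated equilibrium that mixes over three outcomes:
p_ce = np.array([[0.0, 0.25], [0.25, 0.5]])
```

Here `is_correlated_eq(A, B, p_ce)` returns True, while a distribution putting all its mass on (dare, dare) fails the row player's constraint.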

Algorithms for statistical inference

Linear programming and optimization

  • A very simple implementation of an infeasible interior-point method for linear and convex quadratic programs, as a Matlab M-file, and an example of its use.  I also have a slightly more sophisticated implementation (also in Matlab).  If you have access to Matlab's quadprog, I'd recommend using that instead; when I wrote this, I did not have access to quadprog.
  • A tutorial on some geometry behind linear programming, in PostScript (780k, 30 slides) (or try PDF).
  • For comparison, here's another short interior point linear programming solver.  This one is due to Yin Zhang and was presented at SIAM 2000; I have basically only reformatted the code so that it's slightly easier to use and read.
  • Support vector machines are an interesting use of optimization, and there is some interior point code for learning SVMs on my SVM page.  This is not really a very good way to optimize SVMs, and perhaps not the best interior-point implementation, but it may be an interesting example.
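To give the flavor of the interior-point code above without Matlab, here is a minimal Python/numpy sketch of a log-barrier method for a nonnegativity-constrained quadratic program.  It is a simplification, not a port of the Matlab code; the function name, tolerances, and test problem are my own:

```python
import numpy as np

def qp_barrier(Q, c, x0, t=1.0, mu=10.0, gap_tol=1e-6):
    """Minimize 0.5 x'Qx + c'x subject to x >= 0 via a log-barrier method.
    Assumes Q is positive definite and x0 is strictly positive."""
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    f = lambda y, t: t * (0.5 * y @ Q @ y + c @ y) - np.log(y).sum()
    while n / t > gap_tol:                  # n/t bounds the duality gap
        for _ in range(100):                # center with damped Newton steps
            g = t * (Q @ x + c) - 1.0 / x
            H = t * Q + np.diag(1.0 / x**2)
            dx = np.linalg.solve(H, -g)
            if -g @ dx / 2 < 1e-10:         # small Newton decrement: centered
                break
            s = 1.0                         # backtrack: stay feasible, descend
            while (np.any(x + s * dx <= 0)
                   or f(x + s * dx, t) > f(x, t) + 0.25 * s * (g @ dx)):
                s *= 0.5
            x = x + s * dx
        t *= mu                             # tighten the barrier
    return x
```

On the toy problem Q = I, c = (-1, 1), the minimizer is x = (1, 0), and the barrier method returns a strictly interior point very close to it.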

Reinforcement learning

  • A (very partial) annotated bibliography on robot learning via MDPs and related methods.  I made this as an initial cut at readings our multirobot planning group might want to go over.
  • Notes on conditioning (the dogs and bells kind), in PostScript (50k, 20 slides) or PDF (80k).
  • Lecture notes for an intro to reinforcement learning, in PostScript or PDF (215k, 43 slides).
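To make the lecture notes concrete, here is a small value-iteration sketch in Python/numpy on a made-up two-state MDP; the MDP, names, and constants are my own illustration, not taken from the notes:

```python
import numpy as np

# Toy MDP: P[a, s, s'] is the transition probability under action a,
# R[a, s] the expected immediate reward.  State 1 is absorbing with
# zero reward; action 1 in state 0 earns 1 and moves to state 1.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay put
    [[0.0, 1.0], [0.0, 1.0]],   # action 1: go to state 1
])
R = np.array([
    [0.0, 0.0],                 # action 0 rewards in states 0, 1
    [1.0, 0.0],                 # action 1 rewards in states 0, 1
])
gamma = 0.9

def value_iteration(P, R, gamma, tol=1e-10):
    """Repeat the Bellman backup V <- max_a (R_a + gamma * P_a V)."""
    V = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ V)         # Q[a, s] for all actions at once
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```

For this MDP the fixed point is V = (1, 0): the best the agent can do from state 0 is collect the reward of 1 immediately.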


  • A tutorial on machine learning for educational data that Emma Brunskill and I gave at NIPS 2012 (or, direct link to the video).
  • Advice for technical speaking, written for our Journal Club course at CMU.
  • Code for path planning via Dijkstra's algorithm and A* search, in Java with a Matlab interface.
  • A tutorial on synthetic division and partial fraction expansion, which are useful in working with the rational functions which arise when analyzing a linear, time-invariant system of differential equations.
  • Software for tracking dots in images.  This is a useful primitive for some types of computational biology experiments: fluorescently tag something, take pictures of it, and track how it moves.  This software isn't very polished, but we couldn't find anything out there for the purpose; so we wrote this, and some friends of mine used it to help with the data for one of the papers below.
  • Notes on edge and corner detectors.
  • The iterated prisoner's dilemma (text, 10k).
  • Slides on rank-based nonparametric statistical tests, as PostScript (115k) or PDF (253k).
  • Matlab code for computing log(exp(a)+exp(b)).
  • A simple tutorial on the Common LISP language, written as class material for the AI core course at CMU.
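The log(exp(a)+exp(b)) trick above translates directly to other languages; here is an equivalent Python sketch (the Matlab file is the original, this port and its name are mine).  The point is to factor out the max so that neither exponential overflows:

```python
import math

def log_add_exp(a, b):
    """Compute log(exp(a) + exp(b)) stably by factoring out max(a, b)."""
    if a == -math.inf and b == -math.inf:
        return -math.inf                 # log(0 + 0)
    m = max(a, b)
    return m + math.log1p(math.exp(-abs(a - b)))
```

For example, log_add_exp(1000, 1000) returns 1000 + log 2, where a naive implementation would overflow.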

Some publications

This list is approximately in reverse chronological order.  Some of my publications are also available from the CMU SCS tech reports archive or from arXiv.


  • Ahmed Hefny, Carlton Downey, Geoffrey Gordon. An Efficient, Expressive and Local Minima-free Method for Learning Controlled Dynamical Systems. In AAAI-18. (sorry, no link yet) (an earlier version of this work was presented at CoRL 2017)
  • Siddarth Srinivasan, Geoff Gordon, and Byron Boots. Learning Hidden Quantum Markov Models. In AISTATS 2018. (sorry, no link yet)

2002 and earlier

  • My NIPS-02 paper, Generalized² Linear² Models.  It combines principal components analysis, independent components analysis, and generalized linear models.  The result is a nonlinear component analysis model which can be optimized quickly and which can express a variety of useful relationships between hidden and visible variables.  (222k gzipped postscript, 8 pages) (or try 144k PDF)
  • My NIPS-02 paper with Nick Roy, Exponential Family PCA for Belief Compression in POMDPs. It describes a way to find structure in robot belief states and take advantage of that structure for planning.  Belief states are probability distributions over physical states, and therefore high-dimensional.  In order to plan, we must reduce the high-dimensional representation to a lower-dimensional one; so, we applied a nonlinear component analysis algorithm to find the low-dimensional features which allow us to reconstruct our belief most accurately in KL-divergence.  (66k gzipped postscript, 8 pages; or try PDF)
  • My UAI-02 paper, Distributed planning in hierarchical factored MDPs (with Carlos Guestrin).  It describes a way to decompose a large MDP with factored dynamics into several smaller MDPs that run in parallel and are coupled by constraints, and provides a principled distributed planning algorithm based on this intuition.  (384k gzipped postscript, 10 pages) (or try 458k PDF)
  • My NIPS-00 paper, Reinforcement Learning with Function Approximation Converges to a Region.  It proves that two related algorithms, SARSA(0) and V(0), cannot diverge.  (The latter algorithm was used in the TD-Gammon program, albeit with a nonlinear function approximator that doesn't fit my assumptions.)  (55k gzipped postscript, 7 pages.)  You can also download some slides (246k gzipped postscript, 18 pages).
  • My ML-00 paper, Learning Filaments (with Andrew Moore).  It describes a modification to k-means that allows cluster centers to be shaped like line segments instead of points.  (500k gzipped postscript, 8 pages)
  • Hierarchical Linear Models and Cell Data, a Robotics Institute tech report (82k gzipped postscript, 14 pages). You can also download it from the RI TR archive as gzipped postscript or PDF.
  • My thesis, Approximate Solutions to Markov Decision Processes.  Contains results about fitted value iteration, worst case learning, and the relationship between MDPs and convex programming (680k gzipped postscript, 150 pages).  Also available from the CMU CS tech reports archive as postscript (3208k) or PDF (1286k).  Here is the abstract.
  • My COLT-99 paper, Regret bounds for prediction problems, which proves worst-case performance bounds for some widely-used learning algorithms (379k, 12 pages).  You can also download some slides (493k, 37 pages).  Chapter 3 of my thesis is a slightly longer and more recent presentation of the material in this paper.
  • The online proceedings of the workshop on modelling in reinforcement learning, held at ICML-97, co-organized with Chris Atkeson.
  • Stable Function Approximation in Dynamic Programming from ML-95: convergence guarantees for offline Markov decision problem solvers based on function approximators like k-nearest-neighbor.  (84k, 8 pages) (or try PDF)
  • Stable Function Approximation in Dynamic Programming, tech report CMU-CS-95-103: an earlier, longer version of the above paper.  Contains proofs and discussion which were left out of the ML-95 paper due to limited space.  Also available from the tech reports archive.  (188k, 23 pages) (or try PDF)
  • Online Fitted Reinforcement Learning from the Value Function Approximation workshop at ML-95: an addendum to the above two papers which extends some of their techniques to online Markov decision problems.  (81k, 3 pages) (or try PDF)
  • An example of SARSA failing to converge.  (50k, 6 pages)
  • My NIPS-95 paper.  The previous two papers (the one from the ML-95 VFA workshop and the SARSA example) are more recent and cover the same topics.  So, this paper is mostly obsolete.  If you want it anyway, you can click here.  (56k, 7 pages)