Geoffrey J. Gordon

[mug shot] I'm an associate research professor in the Machine Learning Department (which used to be the Center for Automated Learning and Discovery) at Carnegie Mellon.  I am also affiliated with the Robotics Institute.  I'm interested in multi-agent planning, reinforcement learning, decision-theoretic planning, statistical models of difficult data (e.g. maps, video, text), computational learning theory, and game theory.  Here is the page for the SELECT Lab, which Carlos Guestrin and I run together (as well as its mailing list).

I spent AY 2003-4 as a visiting professor at the Stanford Robotics Lab.  Before joining CMU I used to work for Burning Glass Technologies, a company that provided intelligent searching and matching software for resumes and job postings.  The company was headquartered in San Diego, but I worked at their Pittsburgh office.

Before that, I was a postdoctoral researcher at the AUTON lab in the Robotics Institute.  Before that, I was a Computer Science PhD student, with advisor Tom Mitchell.  For my RI page, click here.  Here are the CMU machine learning lunch page, CS local page, facilities help page, budgeting system, and facilities costing system.  Here is CMU's finger gateway.

Have a look at some family pictures!


Fall 2013 I am teaching 10-701, Machine Learning, with Alex Smola.

Spring 2013 I taught the MLD Journal Club.  I also taught this course in Spring 2012, in Fall 2011, with Tom Mitchell in Fall 2010, with Aarti Singh in Spring 2010, with Ann Lee several times before that, and with Steve Fienberg several times before that.

Fall 2012 I taught 10-725, Optimization, with Ryan Tibshirani.  I also taught this course in Spring 2010 and Spring 2008 with Carlos Guestrin.

Spring 2011 I taught 15-780, the graduate AI Star course, with Tuomas Sandholm.  I also taught this course in spring 2009 with Tuomas, and in fall 2007 and Fall 2006 with Ziv Bar-Joseph.

Fall 2009 I taught 10-601, the masters-level machine learning course, with Miro Dudík.

Spring 2004 I taught CS23N, Robotics and Machine Learning, with Andrew Ng at Stanford.

Summer 2003 I organized the CALD summer school with Tom Mitchell.

Fall 2002 I taught 16-899C, Statistical Techniques in Robotics, with Sebastian Thrun.


Here is a current list of the students I am supervising.

Notes, examples, and tutorials

These are informal notes rather than polished presentations, so let me know if you find any errors.


  • The New Artist: art created by robots, for robots.  (A collaboration led by Axel Straschnoy, of which I am a small part.)

Playing games

  • Play some one-card poker.
  • Compute some correlated equilibria.
  • Some slides on what it means to be a reasonable learning algorithm in a repeated game.  I presented these as an invited talk at the AAAI workshop on multiagent learning in 2005.

Algorithms for statistical inference

Linear programming and optimization

  • A very simple implementation of an infeasible interior-point method for linear and convex quadratic programs, as a Matlab M-file, and an example of its use.  I also have a slightly more sophisticated implementation (also in Matlab).  If you have access to Matlab's quadprog, I'd recommend using that instead; when I wrote this, I did not have access to quadprog.
  • A tutorial on some geometry behind linear programming, in PostScript (780k, 30 slides) (or try PDF).
  • For comparison, here's another short interior point linear programming solver.  This one is due to Yin Zhang and was presented at SIAM 2000; I have basically only reformatted the code so that it's slightly easier to use and read.
  • Support vector machines are an interesting use of optimization, and there is some interior point code for learning SVMs on my SVM page.  This is not really a very good way to optimize SVMs, and perhaps not the best interior-point implementation, but it may be an interesting example.

Reinforcement learning

  • A (very partial) annotated bibliography on robot learning via MDPs and related methods.  I made this as an initial cut at readings our multirobot planning group might want to go over.
  • Notes on conditioning (the dogs and bells kind), in PostScript (50k, 20 slides) or PDF (80k).
  • Lecture notes for an intro to reinforcement learning, in PostScript or PDF (215k, 43 slides).


  • A tutorial on machine learning for educational data that Emma Brunskill and I gave at NIPS 2012 (or, direct link to the video).
  • Advice for technical speaking, written for our Journal Club course at CMU.
  • Code for path planning via Dijkstra's algorithm and A* search, in Java with a Matlab interface.
  • A tutorial on synthetic division and partial fraction expansion, which are useful in working with the rational functions which arise when analyzing a linear, time-invariant system of differential equations.
  • Software for tracking dots in images.  This is a useful primitive for some types of computational biology experiments: fluorescently tag something, take pictures of it, and track how it moves.  This software isn't very polished, but we couldn't find anything out there for the purpose; so we wrote this, and some friends of mine used it to help with the data for one of the papers below.
  • Notes on edge and corner detectors.
  • The iterated prisoner's dilemma (text, 10k).
  • Slides on rank-based nonparametric statistical tests, as PostScript (115k) or PDF (253k).
  • Matlab code for computing log(exp(a)+exp(b)).
  • A simple tutorial on the Common LISP language, written as class material for the AI core course at CMU.

Some publications

This list is approximately in reverse chronological order.  Some of my publications are also available from the CMU SCS tech reports archive or from arXiv.


  • A. Hefny, G. Gordon, and K. Sycara. Random walk features for network-aware topic models. In NIPS Workshop on Frontiers of Network Analysis: Methods, Models, and Applications, 2013. (sorry, no link yet)
  • E. Zawadzki, G. J. Gordon, and A. Platzer. A projection algorithm for strictly monotone linear complementarity problems. In Proc. NIPS OPT Workshop, 2013.
  • Geoffrey J. Gordon.  Galerkin Methods for Complementarity Problems and Variational Inequalities.  Technical report arXiv:1306.4753, 2013.
  • Byron Boots, Arthur Gretton, Geoffrey Gordon.  Hilbert Space Embeddings of Predictive State Representations.  ICML workshop on Machine Learning and System Identification (MLSYSID), 2013.  (sorry, no link yet)
  • B. Boots, A. Gretton, and G. J. Gordon.  Hilbert space embeddings of predictive state representations.  In 29th Intl. Conf. on Uncertainty in Artificial Intelligence (UAI), 2013.  (sorry, no link yet)
  • M. H. Falakmasir, Z. A. Pardos, G. J. Gordon, and P. Brusilovsky.  A spectral learning approach to knowledge tracing.  In 6th Intl. Conf. on Educational Data Mining (EDM), 2013.
  • M. V. Yudelson, K. R. Koedinger, and G. J. Gordon.  Individualized Bayesian knowledge tracing models.  In Proc. 16th Intl. Conf. on Artificial Intelligence in Education (AIED), 2013.
  • A. Gupta, K. Sycara, G. Gordon, and A. Hefny.  Differences in social influence across cultures in Twitter.  In Proc. IEEE/ACM Intl. Conf. on Advances in Social Networks Analysis and Mining (ASONAM), 2013.  (sorry, no link yet)
  • E. Zawadzki, A. Platzer, and G. J. Gordon.  A generalization of SAT and #SAT for policy evaluation.  In Proc. Intl. Joint Conf. on Artificial Intelligence (IJCAI), 2013.
  • B. Boots and G. J. Gordon. A spectral learning approach to range-only SLAM.  In 30th Intl. Conf. on Machine Learning (ICML), 2013.  (sorry, no link yet; but see also arXiv version below)











2002 and earlier

  • My NIPS-02 paper, Generalized2 Linear2 Models.  It combines principal components analysis, independent components analysis, and generalized linear models.  The result is a nonlinear component analysis model which can be optimized quickly and which can express a variety of useful relationships between hidden and visible variables.  (222k gzipped postscript, 8 pages) (or try 144k PDF)
  • My NIPS-02 paper with Nick Roy, Exponential Family PCA for Belief Compression in POMDPs. It describes a way to find structure in robot belief states and take advantage of that structure for planning.  Belief states are probability distributions over physical states, and therefore high-dimensional.  In order to plan, we must reduce the high-dimensional representation to a lower-dimensional one; so, we applied a nonlinear component analysis algorithm to find the low-dimensional features which allow us to reconstruct our belief most accurately in KL-divergence.  (66k gzipped postscript, 8 pages; or try PDF)
  • My UAI-02 paper, Distributed planning in hierarchical factored MDPs (with Carlos Guestrin).  It describes a way to decompose a large MDP with factored dynamics into several smaller MDPs that run in parallel and are coupled by constraints, and provides a principled distributed planning algorithm based on this intuition.  (384k gzipped postscript, 10 pages) (or try 458k PDF)
  • My NIPS-00 paper, Reinforcement Learning with Function Approximation Converges to a Region.  It proves that two related algorithms, SARSA(0) and V(0), cannot diverge.  (The latter algorithm was used in the TD-Gammon program, albeit with a nonlinear function approximator that doesn't fit my assumptions.)  (55k gzipped postscript, 7 pages.)  You can also dowload some slides (246k gzipped postscript, 18 pages).
  • My ML-00 paper, Learning Filaments (with Andrew Moore).  It describes a modification to k-means that allows cluster centers to be shaped like line segments instead of points.  (500k gzipped postscript, 8 pages)
  • Hierarchical Linear Models and Cell Data, a Robotics Institute tech report (82k gzipped postscript, 14 pages). You can also download it from the RI TR archive as gzipped postscript or PDF.
  • My thesis, Approximate Solutions to Markov Decision Processes.  Contains results about fitted value iteration, worst case learning, and the relationship between MDPs and convex programming (680k gzipped postscript, 150 pages).  Also available from the CMU CS tech reports archive as postscript (3208k) or PDF (1286k).  Here is the abstract.
  • My COLT-99 paper, Regret bounds for prediction problems, which proves worst-case performance bounds for some widely-used learning algorithms (379k, 12 pages).  You can also download some slides (493k, 37 pages).  Chapter 3 of my thesis is a slightly longer and more recent presentation of the material in this paper.
  • The online proceedings of the workshop on modelling in reinforcement learning, held at ICML-97, co-organized with Chris Atkeson.
  • Stable Function Approximation in Dynamic Programming from ML-95: convergence guarantees for offline Markov decision problem solvers based on function approximators like k-nearest-neighbor.  (84k, 8 pages) (or try PDF)
  • Stable Function Approximation in Dynamic Programming, tech report CMU-CS-95-103: an earlier, longer version of the above paper.  Contains proofs and discussion which were left out of the ML-95 paper due to limited space.  Also available from the tech reports archive.  (188k, 23 pages) (or try PDF)
  • Online Fitted Reinforcement Learning from the Value Function Approximation workshop at ML-95: an addendum to the above two papers which extends some of their techniques to online Markov decision problems.  (81k, 3 pages) (or try PDF)
  • An example of SARSA failing to converge.  (50k, 6 pages)
  • My NIPS-95 paper.  The previous two papers (the one from the ML-95 VFA workshop and the SARSA example) are more recent and cover the same topics.  So, this paper is mostly obsolete.  If you want it anyway, you can click here.  (56k, 7 pages)

Office: GHC 8105, x8-7399
Lab: GHC 8208, x8-3837
Assistant: Michelle Martin, GHC 8001, x8-5527, userid michelle324
Fax: x8-3431

Shipping address:
6105 Gates Hillman Complex
Machine Learning Department
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

To reach the above numbers from outside CMU, first dial 1-412-26.  You can email any CMU SCS member with or  (My userid is part of the URL for this page.)