Teaching
Fall 2015 I taught 10601, the masters-level machine learning course, with Aarti Singh. I also taught this course in Fall 2009 with Miro Dudík.
Fall 2014 I taught 10701, Introduction to Machine Learning, with Aarti Singh. I also taught this course in Fall 2013 with Alex Smola.
Spring 2013 I taught the MLD Journal Club. I also taught this course in Spring 2012, in Fall 2011, with Tom Mitchell in Fall 2010, with Aarti Singh in Spring 2010, with Ann Lee several times before that, and with Steve Fienberg several times before that.
Fall 2012 I taught 10725, Optimization, with Ryan Tibshirani. I also taught this course in Spring 2010 and Spring 2008 with Carlos Guestrin.
Spring 2011 I taught 15780, the graduate AI Star course, with Tuomas Sandholm. I also taught this course in Spring 2009 with Tuomas, and in Fall 2007 and Fall 2006 with Ziv Bar-Joseph.
Spring 2004 I taught CS23N, Robotics and Machine Learning, with Andrew Ng at Stanford.
Summer 2003 I organized the CALD summer school with Tom Mitchell.
Fall 2002 I taught 16899C, Statistical Techniques in Robotics, with Sebastian Thrun.
Students
Here is a current list of the students I
am supervising.
Notes, examples, and tutorials
These are informal notes rather than polished presentations, so let me
know if you find any errors.
Art
 The New Artist: art
created by robots, for robots. (A collaboration led by Axel
Straschnoy, of which I am a small part.)
Playing games
 Play some one-card poker.
 Compute some correlated equilibria.
 Some slides on what
it means to be a reasonable learning algorithm in a repeated
game. I presented these as an invited talk at the AAAI workshop
on multiagent learning in 2005.
Algorithms for statistical inference
 A tutorial on spectral learning that we gave at ICML 2012.
 Some code for spectral learning of dynamical
systems.
 Some code for generalized linear PCA using a
Poisson error model (and its matching exponential link function)
 Some lecture notes on Monte Carlo algorithms, including Matlab demos.
 Some lecture notes on support vector machines, including a simple Java applet.
 Some lecture notes on variational algorithms, including k-means clustering and mean-field image segmentation.
 Notes on Gaussian distributions as they are used in the Kalman filter.
 An example of how to fit a logistic
model using iteratively reweighted least squares.

An example of using gradient descent to fit a
discrete exponential family. Matlab code, 2k.

Notes on the concave-convex procedure (CCCP) and its relationship to
variational bounding algorithms, in PostScript (44k, 20 slides).

Notes on Fisher scoring, in PostScript (42k, 8 slides).

Notes on boosting, in PostScript (90k,
20 slides).

A very simple implementation of an infeasible interior-point method
for linear and convex quadratic programs, as a Matlab
M-file, and an example of its use.
I also have a slightly more sophisticated
implementation (also in Matlab). If you have access to Matlab's
quadprog, I'd recommend using that instead; when I wrote this, I did
not have access to quadprog.

A tutorial on some geometry behind linear programming,
in PostScript (780k, 30 slides) (or
try PDF).

For comparison, here's another short interior
point linear programming solver. This one is due to Yin
Zhang and was presented at SIAM 2000;
I have basically only reformatted the code so that it's slightly
easier to use and read.

Support vector machines are an interesting use of optimization, and
there is some interior point code for learning SVMs on
my SVM page. This is not really a very good
way to optimize SVMs, and perhaps not the best interior-point
implementation, but it may be an interesting example.
Reinforcement learning

A (very partial) annotated bibliography on
robot learning via MDPs and related methods. I made this as an
initial cut at readings our multirobot planning group might want to go
over.

Notes on conditioning (the dogs and bells kind), in PostScript (50k, 20 slides) or PDF (80k).

Lecture notes for an intro to reinforcement learning, in PostScript or PDF (215k, 43 slides).
Others

A tutorial
on machine learning for educational data that Emma Brunskill and I
gave at NIPS 2012 (or, direct link to the video).
 Advice for technical speaking, written for our Journal Club course at CMU.
 Code for path planning via Dijkstra's
algorithm and A* search, in Java with a Matlab interface.
 A tutorial on synthetic division and
partial fraction expansion, which are useful in working with the
rational functions which arise when analyzing a linear, time-invariant
system of differential equations.
 Software for tracking
dots in images. This is a useful primitive for some types of
computational biology experiments: fluorescently tag something, take
pictures of it, and track how it moves. This software isn't very
polished, but we couldn't find anything out there for the purpose; so
we wrote this, and some friends of mine used it to help with the data
for one of the papers
below.
 Notes on edge and corner detectors.

The iterated prisoner's dilemma (text,
10k).

Slides on rank-based nonparametric statistical tests,
as PostScript (115k)
or PDF (253k).
 Matlab code for computing log(exp(a)+exp(b)).

A simple tutorial on the Common LISP
language, written as class material for the AI core course at CMU.
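
As a rough illustration of the log(exp(a)+exp(b)) item in the list above: the Matlab code itself is not reproduced here, so the following is only a minimal Python sketch of the standard trick, not the author's implementation. The function name `log_add_exp` is a hypothetical choice. The idea is to factor out the larger argument so that the exponential is never taken of a positive number and cannot overflow.

```python
import math


def log_add_exp(a: float, b: float) -> float:
    """Return log(exp(a) + exp(b)) without overflow for large |a| or |b|."""
    # Both terms are exp(-inf) = 0, so the sum is 0 and its log is -inf.
    if a == -math.inf and b == -math.inf:
        return -math.inf
    hi, lo = (a, b) if a >= b else (b, a)
    # log(e^hi + e^lo) = hi + log(1 + e^(lo - hi)), with lo - hi <= 0,
    # so exp() stays in [0, 1] and log1p keeps precision when it is tiny.
    return hi + math.log1p(math.exp(lo - hi))
```

A naive `math.log(math.exp(a) + math.exp(b))` raises an overflow error for inputs such as `a = b = 1000` and fails on `a = b = -1000` because both exponentials underflow to zero; the rearranged form handles both cases.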
Some publications
This list is approximately in reverse chronological order.
Some of my publications are also available from the CMU SCS tech reports
archive or from arXiv.
2016
 H. Zhao, T. Adel, G. Gordon, and B. Amos. Collapsed variational inference for sum-product networks. In Proc. Intl. Conf. on Machine Learning (ICML), 2016.
 Z. Marinho, B. Boots, A. Dragan, A. Byravan, G. J. Gordon, and S. Srinivasa. Functional gradient motion planning in reproducing kernel Hilbert spaces. In Proc. Robotics: Science and Systems XII (RSS), 2016.
 W. Sun, R. Capobianco, J. A. Bagnell, B. Boots, and G. J. Gordon. Learning to smooth with bidirectional predictive state inference machines. In Proc. Conf. on Uncertainty in Artificial Intelligence (UAI), 2016.
 M. Falakmasir, G. J. Gordon, J. P. González-Brenes, and K. E. DiCerbo. A data-driven approach for inferring student proficiency from game activity logs. In Proc. Learning at Scale (L@S), 2016.
2015
 Ahmed Hefny, Carlton Downey, and Geoffrey J. Gordon. Supervised Learning for Dynamical System Learning. In Advances in Neural Information Processing Systems (NIPS), 2015.
 G. Xia, Y. Wang, R. Dannenberg, and G. Gordon.
Spectral
learning for expressive interactive ensemble music performance. In
Proc. Intl. Soc. Music Information Retrieval (ISMIR), 2015.
 Matteus Tanha, Tse-Han Huang, Geoffrey J. Gordon and David J. Yaron.
Imitation Learning for Accelerating Iterative Computation of Fixed Points.
European Workshop on Reinforcement Learning (EWRL), 2015. (sorry, no link yet)
2014
 A. Hefny, R. Kass, S. Khanna, M. Smith, and G. Gordon. Fast and Improved SLEX Analysis of High-dimensional Time Series. NIPS Workshop on Machine Learning and Interpretation in Neuroimaging (MLINI), 2014.
 Khalil Ghorbal, Jean-Baptiste Jeannin, Erik Zawadzki, André Platzer, Geoffrey J. Gordon, and Peter Capell. Hybrid Theorem Proving of Aerospace Systems: Applications and Challenges. Journal of Aerospace Information Systems, 2014.
 John Kowalski, Yanhui Zhang, and Geoffrey J. Gordon. Statistical Modeling of Student Performance to Improve Chinese Dictation Skills with an Intelligent Tutor. Journal of Educational Data Mining 6:1, 2014.
2013
 Matteus Tanha, Shiva Kaul, Alex Cappiello, Geoffrey J. Gordon, David
J. Yaron. Embedding
parameters in ab initio theory to develop well-controlled
approximations based on molecular similarity. Technical report
arXiv:1311.3440.
 A. Hefny, G. Gordon, and K. Sycara. Random walk features for network-aware topic models. In NIPS Workshop on Frontiers of Network Analysis: Methods, Models, and Applications, 2013.
 E. Zawadzki, G. J. Gordon, and
A. Platzer. A
projection algorithm for strictly monotone linear complementarity
problems. In Proc. NIPS OPT Workshop, 2013.
 Geoffrey
J. Gordon. Galerkin
Methods for Complementarity Problems and Variational
Inequalities. Technical report arXiv:1306.4753, 2013.
 Byron Boots, Arthur Gretton, Geoffrey Gordon. Hilbert Space
Embeddings of Predictive State Representations. ICML workshop on
Machine Learning and System Identification (MLSYSID), 2013.
(sorry, no link yet)
 B. Boots, A. Gretton, and
G. J. Gordon. Hilbert
space embeddings of predictive state representations. In
29th Intl. Conf. on Uncertainty in Artificial Intelligence (UAI),
2013.
 M. H. Falakmasir, Z. A. Pardos, G. J. Gordon, and
P. Brusilovsky. A
spectral learning approach to knowledge tracing. In 6th
Intl. Conf. on Educational Data Mining (EDM), 2013.
 M. V. Yudelson, K. R. Koedinger, and G. J. Gordon. Individualized Bayesian knowledge tracing models. In Proc. 16th Intl. Conf. on Artificial Intelligence in Education (AIED), 2013.
 A. Gupta, K. Sycara, G. Gordon, and A. Hefny. Differences
in social influence across cultures in Twitter. In
Proc. IEEE/ACM Intl. Conf. on Advances in Social Networks Analysis and
Mining (ASONAM), 2013. (sorry, no link yet)
 E. Zawadzki, A. Platzer, and
G. J. Gordon. A
generalization of SAT and #SAT for policy evaluation. In
Proc. Intl. Joint Conf. on Artificial Intelligence (IJCAI), 2013.
 B. Boots and
G. J. Gordon. A spectral
learning approach to range-only SLAM. In 30th Intl. Conf. on
Machine Learning (ICML), 2013. (see also arXiv version below)
2012
 B. Boots and
G. J. Gordon. A
spectral learning approach to range-only SLAM. In NIPS
Workshop on Spectral Algorithms for Latent Variable Models, 2012.
 B. Boots, A. Gretton, and
G. J. Gordon. Hilbert
space embeddings of PSRs. In NIPS Workshop on Spectral
Algorithms for Latent Variable Models, 2012.
 G. J. Gordon. Fast
solutions to projective monotone linear complementarity
problems. Technical Report arXiv:1212.6958, 2012.
 G. J. Gordon, P. Varakantham, W. Yeoh, H. C. Lau,
A. S. Aravamudhan, and
S.-F. Cheng. Lagrangian
relaxation for large-scale multiagent planning. In
Proc. IEEE/WIC/ACM Intl. Conf. on Intelligent Agent Technology (IAT),
2012.
 B. Boots and
G. J. Gordon. A spectral
learning approach to range-only SLAM. Technical report
arXiv:1207.2491, 2012.
 P. Varakantham, S.-F. Cheng, G. Gordon, and A. Ahmed. Decision support for agent populations in uncertain and congested environments. In Proc. 26th Conf. on Artificial Intelligence (AAAI), 2012.
 B. Boots and G. J. Gordon. Two manifold problems with applications to nonlinear system identification. In Proc. Intl. Conf. on Machine Learning (ICML), 2012.
 Geoffrey J. Gordon, Pradeep Varakantham, William Yeoh, Hoong
Chuin Lau, Ajay S. Aravamudhan, and Shih-Fen
Cheng. Lagrangian
Relaxation for Large-Scale Multi-Agent Planning (Extended
Abstract). 11th Intl. Conf. on Autonomous Agents and
Multiagent Systems (AAMAS), 2012.

Brian D. Ziebart, Miro Dudik, Geoff Gordon, Katia Sycara, Wendi Adair, and Jeanne Brett.
Identifying Culture and Leveraging Cultural Differences in Negotiations for Negotiation Agents.
Hawaii International Conference on System Science (HICSS), 2012.
2011
 Byron Boots and Geoffrey
J. Gordon. Two-Manifold
Problems. Technical report arXiv:1112.6399, 2011.
 Byron Boots and Geoff
Gordon. Online Spectral
Identification of Dynamical Systems. NIPS workshop on Sparse
Representation and Low-rank Approximation, 2011.
 Shiva Kaul and Geoffrey Gordon. Anticoncentration
regularizers for stochastic combinatorial problems. NIPS
workshop on Computational Tradeoffs in Statistical Learning (COST),
2011. (sorry, no link yet)
 Sue Ann Hong and Geoffrey
Gordon. An
Accelerated Gradient Method for Distributed Multi-Agent Planning with
Factored MDPs. NIPS workshop on Optimization for Machine
Learning (OPT), 2011.
 A. D. Dragan, G. J. Gordon, and
S. S. Srinivasa. Learning
from experience in manipulation planning: Setting the right
goals. In Proc. Intl. Symp. on Robotics Research (ISRR),
2011.
 G. Gordon, D. Dunson, and
M. Dudík. Proceedings
of the Fourteenth International Conference on Artificial Intelligence
and Statistics (AISTATS). JMLR W&CP, vol. 15, 2011.
 M. Chi, K. Koedinger, G. Gordon, P. Jordan, and
K. VanLehn. Instructional factors analysis:
A cognitive model for multiple instructional interventions.
In Proc. 4th Intl. Conf. on Educational Data Mining (EDM), 2011.
 Byron Boots and Geoffrey J. Gordon.
An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems. AAAI, 2011.
 Stephane Ross, Geoffrey Gordon, and Drew Bagnell. Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. AISTATS, 2011. (see also the arXiv version, next bullet)
 Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning. Technical report
arXiv:1011.0686, arXiv, 2011.
 Erik Zawadzki, Geoffrey Gordon, Andre Platzer. An Instantiation-Based Theorem Prover for First-Order Programming. AISTATS, 2011.
 Sue Ann Hong and Geoffrey Gordon. Optimal Distributed Market-Based Planning for Multi-Agent Systems with Shared Resources. AISTATS, 2011.
 Byron Boots, Sajid Siddiqi, and Geoff Gordon. Closing the Learning-Planning Loop with Predictive State Representations. International Journal of Robotics Research (IJRR), 30(7):954-966, 2011.
 N. Turan, M. Dudík, G. Gordon, and L. Weingart.
Modeling
group negotiation: Three computational approaches that can inform
behavioral sciences. In E. A. Mannix, M. A. Neale, and
J. R. Overbeck, eds., Negotiation and Groups (Research on Managing
Groups and Teams), volume 14, pages 189-205, 2011.
2010
 B. Boots and
G. J. Gordon. Predictive state
temporal difference learning. In Advances in Neural
Information Processing Systems (NIPS), vol. 23, 2010. (see also
the arXiv version, next bullet)
 B. Boots and
G. J. Gordon. Predictive
state temporal difference learning. Technical report
arXiv:1011.0041, arXiv, 2010.
 A. Singh and
G. Gordon. A Bayesian
matrix factorization model for relational data. In
Proc. Intl. Conf. on Uncertainty in Artificial Intelligence (UAI),
2010.
 B. Boots, S. M. Siddiqi, and
G. J. Gordon. Closing
the learning-planning loop with predictive state
representations. In Proc. Robotics: Science and Systems VI
(RSS),
2010. (see also the extended abstract, tech report, and workshop
paper of the same name below)
 L. Song, B. Boots, S. M. Siddiqi, G. J. Gordon, and
A. J. Smola. Hilbert
space embeddings of hidden Markov models. In Proc. 27th
Intl. Conf. on Machine Learning (ICML), 2010. (best paper award)
 S. M. Siddiqi, B. Boots, and
G. J. Gordon. Reduced-rank
hidden Markov models. In Proc. 13th Intl. Conf. on Artificial
Intelligence and Statistics (AISTATS), 2010. (see also the tech
report version below, and the abstract)
 J. Ramos, S. Siddiqi, A. Dubrawski, G. Gordon, and
A. Sharma. Automatic
state discovery for unstructured audio scene classification.
In Proc. 35th Intl. Conf. on Acoustics, Speech, and Signal Processing
(ICASSP), 2010.
 B. Boots, S. Siddiqi, and G. Gordon. Closing the
learning-planning loop with predictive state representations (extended
abstract). In 9th Intl. Conf. on Autonomous Agents and
Multiagent Systems (AAMAS), 2010. (see also the next two items)
2009
 B. Boots, S. M. Siddiqi, and G. J. Gordon.
Closing the learning-planning
loop with predictive state representations. Technical Report
arXiv:0912.2385, arXiv, 2009.
 B. Boots, S. M. Siddiqi, and
G. J. Gordon. Closing the
learning-planning loop with PSRs. In M. Deisenroth,
B. Kappen, E. Todorov, D. Nguyen-Tuong, C. E. Rasmussen, and
J. Peters, eds., Proc. NIPS Workshop on Probabilistic Approaches for
Robotics and Control, Whistler, BC, 2009.
 S. M. Siddiqi, B. Boots, and
G. J. Gordon. Reduced-rank
hidden Markov models. Technical Report arXiv:0910.0902,
arXiv, 2009.
 Ajit P. Singh and Geoffrey
J. Gordon. A Bayesian
Matrix Model for Relational Data. NIPS workshop on Transfer
Learning for Structured Data
(TLSD09).
 R. Freeman, P. Yang, G. Gordon, K. M. Lynch, S. Srinivasa, and
R. Sukthankar.
Decentralized
estimation and control of graph connectivity for mobile sensor
networks. Automatica, 46(2):390-396, 2010.
 Miroslav Dudík and Geoffrey J. Gordon. A
Game-Theoretic Approach to Modeling Cross-Cultural Negotiation.
Proceedings of the
2009 MICON
workshop at IJCAI.
(proceedings)
 Praveen Paruchuri, Nilanjan Chakraborty, Roie Zivan, Katia
Sycara, Miroslav Dudík and Geoff Gordon. POMDP Based
Negotiation Modeling. Proceedings of the
2009 MICON
workshop at IJCAI.
(proceedings)
 T. Stepleton, Z. Ghahramani, G. Gordon, and
T. S. Lee. The
block diagonal infinite hidden Markov model. In Proc. 12th
Intl. Conf. on Artificial Intelligence and Statistics (AISTATS), 2009.
 Geoffrey J. Gordon, Sue Ann Hong, and Miroslav
Dudík. First-order
mixed integer linear programming. In Proc. 25th Conf. on
Uncertainty in Artificial Intelligence (UAI), 2009.
 Miroslav Dudík and Geoffrey Gordon. A sampling-based approach to
computing equilibria in succinct extensive-form games. In
Proc. 25th Conf. on Uncertainty in Artificial Intelligence (UAI),
2009.
2008
 Geoff Gordon. Think globally,
act locally. In J. Trinkle and B. H. Krogh, editors,
Proc. IROS Special Session on Robotics and Cyber-Physical Systems,
2008.
 Sajid M. Siddiqi, Byron Boots, Geoffrey J. Gordon. A Constraint Generation Approach to Learning
Stable Linear Dynamical Systems. Tech report CMU-ML-08-101.
 A. P. Singh and
G. J. Gordon. A
unified view of matrix factorization models. In R. Goebel,
J. Siekmann, and W. Wahlster, editors, Machine Learning and Knowledge
Discovery in Databases (Proc. ECML PKDD), volume 5212/2008 of Lecture
Notes in Computer Science, pages 358-373. Springer Berlin /
Heidelberg, 2008. (or,
a local
link, in case the above link is down)
 Ajit P. Singh and Geoffrey J. Gordon. Relational learning via
collective matrix factorization. In Proc. 14th Intl. Conf. on
Knowledge Discovery and Data Mining (KDD), 2008. (see also
the code)
 Ajit P. Singh and Geoffrey J. Gordon. Relational Learning via Collective Matrix
Factorization. Tech report CMU-ML-08-109.
 N. Armstrong-Crews, G. Gordon, and
M. Veloso. Solving
POMDPs from both sides: Growing dual parsimonious bounds. In
G. Shani, J. Pineau, P. Poupart, and T. Smith, editors, AAAI workshop
for Advancement in POMDP Solvers, 2008.
 Geoffrey J. Gordon, Amy Greenwald, Casey Marks. No-regret learning
in convex games. ICML, 2008. There are two related
tech reports, one of which is available below (or here).
 I. Rish, G. Grabarnik, G. Cecchi, F. Pereira, and
G. Gordon. Closed-form Supervised
Dimensionality Reduction with Generalized Linear Models, in
Proceedings of ICML 2008, Helsinki, Finland. Also available from
the IBM tech report archive as
number RC24834.
 Michael Freed, Jaime Carbonell, Geoff Gordon, Jordan Hayes, Brad
Myers, Daniel Siewiorek, Stephen Smith, Aaron Steinfeld and Anthony
Tomasic. RADAR: A Personal
Assistant that Learns to Reduce Email Overload. AAAI,
2008.
 Jan-Peter Calliess and Geoffrey Gordon. No-Regret Learning and a Mechanism
for Distributed Multi-Agent Planning. AAMAS08.
 Shann-Ching Chen, Geoffrey Gordon, and Robert Murphy. Graphical
Models for Structured Classification, with an Application to
Interpreting Images of Protein Subcellular Location
Patterns. Journal of Machine Learning Research, v9,
2008.
 Peng Yang, Randy Freeman, Geoffrey J. Gordon, Kevin M. Lynch,
Siddhartha Srinivasa, and Rahul
Sukthankar. Decentralized
Estimation and Control of Graph Connectivity in Mobile Sensor
Networks. ACC08.
2007
 S. Siddiqi, B. Boots, and G. Gordon. A Constraint
Generation Approach to Learning Stable Linear Dynamical
Systems. NIPS, 2007.
 Geoff Gordon, Amy Greenwald, Casey Marks and Martin
Zinkevich. No-Regret
Learning in Convex Games. Brown tech report CS-07-10.
 Ramprasad Ravichandran, Geoffrey J. Gordon, and Seth Goldstein. A Scalable
Distributed Algorithm for Shape Transformation in Multirobot
Systems. IROS07.
 Chris Murray and Geoff Gordon. Finding correlated equilibria in general sum
stochastic games. Technical report CMU-ML-07-113.
 H. Brendan McMahan and Geoffrey Gordon. A Unification of
Extensive-form Games and Markov Decision Processes. AAAI
2007.
 Automated Image Analysis of Protein Localization in Budding
Yeast. ISMB/ECCB 2007. With Sam Chen, Ting Zhao, and Bob
Murphy. This paper will also appear in the journal Bioinformatics.
(the code from this paper is available here)
 M. Likhachev, D. Ferguson, G. Gordon, A. Stentz, and
S. Thrun.
Anytime
search in dynamic graphs. Artificial Intelligence,
172(14):1613-1643, 2008.
 H. Brendan McMahan and Geoffrey J. Gordon. A Fast Bundle-based Anytime
Algorithm for Poker and other Convex Games.
AISTATS07. (8 pages, PDF; see also the AISTATS online
proceedings)
 Sajid M. Siddiqi, Geoffrey J. Gordon, and Andrew W. Moore.
Fast State Discovery for
HMM Model Selection and Learning. AISTATS07. (8
pages, PDF; see also the AISTATS online
proceedings)
 Purnamrita Sarkar, Sajid M. Siddiqi, and Geoffrey
J. Gordon. A Latent
Space Approach to Dynamic Embedding of Co-occurrence Data.
AISTATS07. (8 pages, PDF; see also the AISTATS online
proceedings)
 Purnamrita Sarkar, Sajid M. Siddiqi, and Geoffrey
J. Gordon. Approximate Kalman
Filters for Embedding Author-Word Co-occurrence Data over
Time. Workshop on Statistical Network Analysis at the 23rd
International Conference on Machine Learning (2006), Pittsburgh,
PA. (This is the workshop version of the "Latent Space Approach"
paper above; it also appears in
the Springer LNCS series.)
 Geoffrey J. Gordon. Agendas for
Multi-Agent Learning. Artificial Intelligence, special issue
on Foundations of Multiagent Learning, 2007. You may also be
interested in the longer tech
report version.
 Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan, and Pradeep
Khosla. Adaptive
Sampling for Multi-Robot Wide Area Exploration.
ICRA07. (6 pages, PDF)
 Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan, and Pradeep
Khosla. Adaptive Sampling for
Multi-Robot Wide Area Prospecting. Technical Report
CMU-RI-TR-05-51.
 S. Siddiqi, B. Boots, G. J. Gordon, and A. W. Dubrawski.
Learning stable
multivariate baseline models for outbreak detection (extended
abstract). Advances in Disease Surveillance, 4:266,
2007.
2006
 G. J. Gordon. Game-theoretic learning. In S. Haykin, J. Principe,
T. Sejnowski, and J. McWhirter,
editors, New
Directions in Statistical Signal Processing: From Systems to
Brain. MIT Press, 2005.
 Geoffrey Gordon.
No-regret algorithms for
Online Convex Programs. NIPS 2006. You may also be
interested in the earlier tech report version
listed below. (The TR contains more complete proofs.)
 Chris Murray and Geoffrey Gordon. Multi-Robot Negotiation:
Approximating the Set of Subgame Perfect Equilibria in General-Sum
Games. NIPS 2006. You may also be interested in the
earlier Snowbird
abstract, or the earlier and longer tech
report.
 Nikos Vlassis, Geoff Gordon and Joelle Pineau, eds. Special
issue on Planning
Under Uncertainty in Robotics, Robotics and Autonomous Systems,
Volume 54, Issue 11, pp. 885-944, November 2006.
 Jared Glover, Daniela Rus, Nicholas Roy, and Geoff Gordon.
Robust Models of Object
Geometry. IROS06. (6 pages, PDF)
 Thakar, R., G. J. Gordon and A. K. Csink. Movement and
anchoring of heterochromatin during cell cycle and developmental
progression. Journal of Cell Science, vol 119, 2006.
You may also be interested in the software mentioned in the paper.
 Joelle Pineau, Geoff Gordon, and Sebastian
Thrun. Anytime
Point-Based Approximations for Large POMDPs. JAIR, vol 27,
pages 335-380, 2006. This is the journal version of our paper
about the PBVI algorithm. See also
the arXiv version.
 Shann-Ching Chen, Ting Zhao, Geoffrey Gordon, and Robert
F. Murphy. A
Novel Graphical Model Approach to Segmenting Cell Images.
2006 BMES Annual Fall Meeting. (8 pages, PDF)
 Shann-Ching Chen, Ting Zhao, Geoffrey J. Gordon and Robert
F. Murphy. A Novel
Graphical Model Approach to Segmenting Cell Images. IEEE
Symposium on Computational Intelligence in Bioinformatics and
Computational Biology, 2006.
 S.-C. Chen, G. Gordon and R.F. Murphy. A Novel
Approximate Inference Approach To Automated Classification Of Protein
Subcellular Location Patterns In Multi-Cell Images.
Proceedings of the 2006 International Symposium on Biomedical Imaging (ISBI),
pp. 558-561. See also the code and
data used in this paper.
 Francisco Pereira
and Geoff Gordon. The
Support Vector Decomposition Machine. To appear in ICML,
2006. (PDF, 8 pages) (there is also an extension of the SVDM
algorithm to be more SVM-like; it was published and presented at the
2006 workshop on Bioimage Informatics at UCSB, and was also the
subject of a talk at the 2006 NIPS workshop on Novel Applications of
Dimensionality Reduction, but I don't have a link up yet)
 Brian Gerkey, Sebastian Thrun, and Geoff Gordon. Visibility-based pursuit-evasion
with limited field of view. IJRR, v25, n4, p299, 2006. (23
pages, PDF)
 Geoffrey J. Gordon. No-regret algorithms for
structured prediction problems. Tech report
CMU-CALD-05-112. (45 pages, PDF; or try gzipped
postscript.) This is the tech report version of my paper on
Lagrangian Hedging algorithms, which are for online learning in
problems with structured hypothesis and/or output spaces. This
file replaces an earlier draft which had been available on this
website.
 C. Murray and G. Gordon. Multi-Robot Negotiation:
Approximating the Set of Subgame Perfect Equilibria in General-Sum
Stochastic Games. Snowbird Learning Workshop, 2006.
 Matthijs T.J. Spaan, Nikos Vlassis, and Geoffrey J. Gordon.
Decentralized planning under uncertainty
for teams of communicating agents. AAMAS06. (8 pages
PDF)
2005
 Joelle Pineau and Geoff Gordon. POMDP Planning for Robust
Robot Control. ISRR05. (10 pages, PDF,
or try a local copy.) Also appears
in Robotics
Research (part of the Springer Tracts in Advanced Robotics (STAR)
series).
 H. Brendan McMahan, Maxim Likhachev, and Geoffrey Gordon.
Bounded Real-Time Dynamic
Programming: RTDP with monotone upper bounds and performance
guarantees. ICML05. (8 pages, PDF.)
 H. Brendan McMahan and Geoffrey J. Gordon. Fast Exact Planning in Markov
Decision Processes. ICAPS05. (10 pages, PDF.)
See also the tech report version, CMU-CS-05-127. (22 pages, PDF.)
 Dave Ferguson, Maxim Likhachev, Geoff Gordon, Anthony Stentz, and
Sebastian Thrun. Anytime Dynamic A*: An
Anytime, Replanning Algorithm. ICAPS05. (10 pages,
PDF)
 Rosemary Emery-Montemerlo, Geoff Gordon, Jeff Schneider, and
Sebastian Thrun. Game
Theoretic Control for Robot Teams. ICRA 2005. (7
pages, PDF)
 Brian Gerkey, Sebastian Thrun, and Geoff Gordon. Parallel Stochastic
Hill-Climbing with Small Teams. 3rd International NRL Workshop
on Multi-Robot Systems, 2005. (12 pages, PDF)
 Nicholas Roy, Geoffrey Gordon, and Sebastian
Thrun. Finding
Approximate POMDP Solutions Through Belief Compression. JAIR
vol 23, pp. 1-40 (2005). Here is
a local
link, in PDF, in case the above link is down. Also available
through arXiv.
2004
 Maxim Likhachev, Geoff Gordon, and Sebastian Thrun. Planning for Markov
Decision Processes with Sparse Stochasticity. NIPS
2004. Describes a search-based algorithm, MCP, for extracting a
purely stochastic MDP from a larger problem with lots of deterministic
transitions.

Matthew Rosencrantz, Geoff Gordon, and Sebastian Thrun. Learning Low Dimensional
Predictive Representations. ICML 2004. (8 pages, PDF)

Rosemary Emery-Montemerlo, Geoff Gordon, Jeff Schneider, and Sebastian
Thrun. Approximate
Solutions For Partially Observable Stochastic Games with Common
Payoffs. AAMAS 2004. (8 pages postscript, or try PDF)

Brian P. Gerkey, Sebastian Thrun, and Geoff Gordon. Visibility-based pursuit-evasion
with limited field of view. AAAI 2004. (8 pages
postscript; or try PDF)

Allison Bruce and Geoffrey Gordon. Better Motion Prediction for
People-tracking. ICRA04. pdf, 6 pages.

Vandi Verma, Geoff Gordon, Reid Simmons, and Sebastian Thrun. Particle Filters for Rover Fault
Diagnosis. IEEE Robotics & Automation Magazine special issue
on Human Centered Robotics and Dependability, June 2004. (Or try
PDF.)
2003

Aaron Courville, Nathaniel Daw, Geoff Gordon, and Dave
Touretzky. Model Uncertainty in Classical Conditioning.
NIPS 2003.
pdf,
postscript

Joelle Pineau, Geoff Gordon, and Sebastian Thrun.
Applying Metric-Trees to Belief-Point POMDPs.
NIPS 2003.
pdf,
postscript

Maxim Likhachev, Geoff Gordon, and Sebastian Thrun.
ARA*: Anytime A* with Provable Bounds on Sub-Optimality.
NIPS 2003.
pdf,
postscript

Curt Bererton, Geoff Gordon, Sebastian Thrun, and Pradeep
Khosla. Auction Mechanism Design for Multi-Robot
Coordination. NIPS 2003.
pdf,
postscript

The tech report
version of our paper on ARA* (Anytime Repairing A*), with Maxim Likhachev and Sebastian Thrun.
Describes an anytime modification of A* search which produces a
suboptimal solution quickly, then repeatedly repairs the plan until it
runs out of search time or proves optimality. This version
contains the full proofs of correctness for the algorithm. (2.5M
PDF, 26 pages, CMU-CS-03-148)

My ICML03 paper with Brendan
McMahan and Avrim Blum,
Planning in the presence
of cost functions controlled by an adversary. Shows how to
efficiently solve a class of zero-sum games where one player tries to
plan a path through an MDP while the other player tries to block the
path (349k gzipped postscript, 8 pages) (or try 301k PDF). See also the
related abstract in the NIPS03
games workshop.

My IJCAI03 paper with Joelle
Pineau and Sebastian
Thrun, Point-based
value iteration: an anytime algorithm for POMDPs. Describes
an approximation to POMDP value iteration which is both fast and
provably low-error (142k gzipped postscript, 6 pages) (or try 293k PDF). See also some
slides for a talk about this paper (PDF,
37 pages).

My IJCAI03 extended abstract with a host of other authors, A learning algorithm for localizing people
based on wireless signal strength that uses labeled and unlabeled
data. (57k gzipped postscript, 2 pages)

My UAI03 paper with Joelle
Pineau and Sebastian
Thrun, Policy-contingent abstraction
for robust robot control. Describes the PolCA algorithm for
hierarchical solution of MDPs and POMDPs. (454k gzipped
postscript, 8 pages) (or try 261k PDF)

My UAI03 paper with Matt
Rosencrantz and Sebastian
Thrun, Decentralized Sensor
Fusion with Distributed Particle Filters. Describes a
query-response algorithm for deciding which sensor data is interesting
enough to send to your neighbors. (390k gzipped postscript, 8
pages) (or try 223k PDF)

My FSR03 paper with Nick
Roy and Sebastian
Thrun, Planning under uncertainty for
reliable health care robotics. Describes how we used the
math from the two NIPS02 papers below to build a planner for the
NurseBot. (447k gzipped postscript, 6 pages, or try PDF) (also appears in proceedings in Springer Tracts in
Advanced Robotics)

My iSAIRAS03 paper with Vandi
Verma and Reid Simmons,
Efficient monitoring for planetary
rovers. (141k gzipped postscript, 8 pages) (You may
also be interested in the following related extended abstract: Vandi Verma
and Geoff Gordon. "Bayesian Methods for Identifying Faults on
Robots for Planetary Exploration." Proceedings of ISBA,
Viña del Mar, Chile, May 2004.)

My AAMAS03 paper with Matt
Rosencrantz and Sebastian
Thrun, Locating
Moving Entities in Dynamic Indoor Environments with Teams of Mobile
Robots. (Matt was awarded the "Best Student Paper" prize for
this paper.) It describes a technique for factoring multiagent
tracking problems so that the interactions we need to consider are
simpler, as well as an implemented system which tracks teams of robots
as they play laser tag. (793k gzipped postscript, 8 pages, or try
1.1M PDF)
2002 and earlier

My NIPS02 paper, Generalized^{2}
Linear^{2} Models. It combines principal components
analysis, independent components analysis, and generalized linear
models. The result is a nonlinear component analysis model which
can be optimized quickly and which can express a variety of useful
relationships between hidden and visible variables. (222k
gzipped postscript, 8 pages) (or try 144k PDF)

My NIPS02 paper with Nick
Roy, Exponential Family PCA
for Belief Compression in POMDPs. It describes a way to find
structure in robot belief states and take advantage of that structure
for planning. Belief states are probability distributions over
physical states, and therefore high-dimensional. In order to
plan, we must reduce the high-dimensional representation to a
lower-dimensional one; so, we applied a nonlinear component analysis
algorithm to find the low-dimensional features which allow us to
reconstruct our belief most accurately in KL-divergence. (66k
gzipped postscript, 8 pages; or try PDF)

My UAI02 paper, Distributed
planning in hierarchical factored MDPs (with Carlos
Guestrin). It describes a way to decompose a large MDP with
factored dynamics into several smaller MDPs that run in parallel and
are coupled by constraints, and provides a principled distributed
planning algorithm based on this intuition. (384k gzipped
postscript, 10 pages) (or try 458k PDF)

My NIPS00 paper, Reinforcement Learning
with Function Approximation Converges to a Region. It proves
that two related algorithms, SARSA(0) and V(0), cannot diverge.
(The latter algorithm was used in the TD-Gammon program, albeit with a
nonlinear function approximator that doesn't fit my
assumptions.) (55k gzipped postscript, 7 pages.) You can
also download some slides (246k
gzipped postscript, 18 pages).

My ML00 paper, Learning
Filaments (with Andrew
Moore). It
describes a modification to k-means that allows cluster centers to be
shaped like line segments instead of points. (500k gzipped
postscript, 8 pages)

Hierarchical Linear Models and Cell Data, a
Robotics Institute tech report (82k gzipped postscript, 14 pages).
You can also download it from the RI TR archive as gzipped
postscript or PDF.

My thesis, Approximate Solutions to Markov
Decision Processes. Contains results about fitted value
iteration, worst case learning, and the relationship between MDPs and
convex programming (680k gzipped postscript, 150 pages). Also
available from the CMU
CS tech reports archive as postscript (3208k) or PDF
(1286k). Here is the abstract.

My COLT99 paper, Regret bounds for prediction
problems, which proves worst-case performance bounds for some
widely used learning algorithms (379k, 12 pages). You can also
download some slides (493k, 37
pages). Chapter 3 of my thesis is a slightly longer and more
recent presentation of the material in this paper.

The online proceedings
of the workshop on modelling in reinforcement learning, held at ICML97,
co-organized with Chris Atkeson.

Stable Function Approximation in
Dynamic Programming from ML95: convergence guarantees for offline
Markov decision problem solvers based on function approximators like
k-nearest-neighbor. (84k, 8 pages) (or try PDF)

Stable Function Approximation in Dynamic
Programming, tech report CMU-CS-95-103: an earlier, longer version
of the above paper. Contains proofs and discussion which were
left out of the ML95 paper due to limited space. Also available
from the tech
reports archive. (188k, 23 pages) (or try PDF)

Online
Fitted Reinforcement Learning from the Value Function Approximation
workshop at ML95: an addendum to the above two papers which extends
some of their techniques to online Markov decision problems.
(81k, 3 pages) (or try PDF)

An example
of SARSA failing to converge. (50k, 6 pages)

My NIPS95 paper. The previous two papers (the one from the ML95
VFA workshop and the SARSA example) are more recent and cover the same
topics. So, this paper is mostly obsolete. If you want it
anyway, you can click here. (56k, 7
pages)
