Teaching
Spring 2008 I am teaching 10-725,
Optimization, with Carlos
Guestrin.
Fall 2007 I taught 15-780, the graduate AI Star
course, with Ziv
Bar-Joseph. I also taught this course in Fall 2006.
Spring 2004 I taught CS23N, Robotics and
Machine Learning, with Andrew
Ng at Stanford.
Summer 2003 I organized the CALD summer
school with Tom
Mitchell.
Fall 2002 I taught 16-899C, Statistical
Techniques in Robotics, with Sebastian Thrun.
Students
Here is a current list of the students I
am supervising.
Notes, examples, and tutorials
These are informal notes rather than polished presentations, so let me
know if you find any errors.
Playing games
- Play some one-card poker.
- Some slides on what
it means to be a reasonable learning algorithm in a repeated
game. I presented these as an invited talk at the AAAI workshop
on multiagent learning in 2005.
Algorithms for statistical inference
- Some code for generalized linear PCA using a
Poisson error model (and its matching exponential link function)
- Some lecture notes on Monte Carlo algorithms, including Matlab demos.
- Some lecture notes on support vector machines, including a simple Java applet.
- Some lecture notes on variational algorithms, including k-means clustering and mean-field image segmentation.
- Notes on Gaussian distributions as they are used in the Kalman filter.
- An example of how to fit a logistic
model using iteratively-reweighted least squares.
-
An example of using gradient descent to fit a
discrete exponential family. Matlab code, 2k.
-
Notes on the concave-convex procedure (CCCP) and its relationship to
variational bounding algorithms, in PostScript (44k, 20 slides).
-
Notes on Fisher scoring, in PostScript (42k, 8 slides).
-
Notes on boosting, in PostScript (90k,
20 slides).
Linear programming and optimization
-
A very simple implementation of an infeasible interior-point method
for linear and convex quadratic programs, as a Matlab
M-file, and an example of its use.
I also have a slightly more sophisticated
implementation (also in Matlab). If you have access to Matlab's
quadprog, I'd recommend using that instead; when I wrote this, I did
not have access to quadprog.
-
A tutorial on some geometry behind linear programming, in PostScript (780k, 30 slides).
-
For comparison, here's another short interior
point linear programming solver. This one is due to Yin
Zhang and was presented at SIAM 2000;
I have basically only reformatted the code so that it's slightly
easier to use and read.
-
Support vector machines are an interesting use of optimization, and
there is some interior point code for learning SVMs on my SVM page.
Reinforcement learning
-
A (very partial) annotated bibliography on
robot learning via MDPs and related methods. I made this as an
initial cut at readings our multirobot planning group might want to go
over.
-
Notes on conditioning (the dogs and bells kind), in PostScript (50k, 20 slides) or PDF (80k).
-
Lecture notes for an intro to reinforcement learning, in PostScript or PDF (215k, 43 slides).
Others
- Code for path planning via Dijkstra's
algorithm and A* search, in Java with a Matlab interface.
- A tutorial on synthetic division and
partial fraction expansion, which are useful in working with the
rational functions which arise when analyzing a linear, time-invariant
system of differential equations.
- Software for tracking
dots in images. This is a useful primitive for some types of
computational biology experiments: fluorescently tag something, take
pictures of it, and track how it moves. This software isn't very
polished, but we couldn't find anything out there for the purpose; so
we wrote this, and some friends of mine used it to help with the data
for one of the papers
below.
- Notes on edge and corner detectors.
-
The iterated prisoner's dilemma (text,
10k).
-
Slides on rank-based nonparametric statistical tests, as PostScript (115k).
-
A simple tutorial on the Common LISP
language, written as class material for the AI core course at CMU.
Some publications
This list is approximately in reverse chronological order.
Some of my publications are also available from the CMU CS tech reports
archive or RI tech reports
archive.
- Geoffrey J. Gordon, Amy Greenwald, Casey Marks. No-regret learning
in convex games. ICML, 2008. There are two related
tech reports, one of which is available below (or here).
- Michael Freed, Jaime Carbonell, Geoff Gordon, Jordan Hayes, Brad
Myers, Daniel Siewiorek, Stephen Smith, Aaron Steinfeld and Anthony
Tomasic. RADAR: A Personal Assistant that Learns to Reduce Email
Overload. AAAI, 2008. (sorry, no link yet)
- Jan-Peter Calliess and Geoffrey Gordon. No-Regret Learning and a Mechanism
for Distributed Multi-Agent Planning. AAMAS-08.
- Shann-Ching Chen, Geoffrey Gordon, and Robert Murphy. Graphical
Models for Structured Classification, with an Application to
Interpreting Images of Protein Subcellular Location
Patterns. Journal of Machine Learning Research, v9,
2008.
- Peng Yang, Randy Freeman, Geoffrey J. Gordon, Kevin M. Lynch,
Siddhartha Srinivasa, and Rahul Sukthankar. Decentralized
Estimation and Control of Graph Connectivity in Mobile Sensor
Networks. ACC-08. (sorry, no link yet)
- S. Siddiqi, B. Boots, and G. Gordon. A Constraint
Generation Approach to Learning Stable Linear Dynamical
Systems. NIPS, 2007.
- Geoff Gordon, Amy Greenwald, Casey Marks and Martin
Zinkevich. No-Regret
Learning in Convex Games. Brown tech report CS-07-10.
- Ramprasad Ravichandran, Geoffrey J. Gordon, and Seth Goldstein. A Scalable
Distributed Algorithm for Shape Transformation in Multi-robot
Systems. IROS-07.
- Chris Murray and Geoff Gordon. Finding correlated equilibria in general sum
stochastic games. Technical report CMU-ML-07-113.
- H. Brendan McMahan and Geoffrey Gordon. A Unification of
Extensive-form Games and Markov Decision Processes. AAAI
2007.
- Automated Image Analysis of Protein Localization in Budding
Yeast. ISMB/ECCB 2007. With Sam Chen, Ting Zhao, and Bob
Murphy. This paper will also appear in the journal Bioinformatics.
(the code from this paper is available here)
- M. Likhachev, D. Ferguson, G. Gordon, A. Stentz, and
S. Thrun. Anytime search in large, dynamic graphs.
Artificial Intelligence, to appear. (sorry, no link yet)
- H. Brendan McMahan and Geoffrey J. Gordon. A Fast Bundle-based Anytime
Algorithm for Poker and other Convex Games.
AISTATS-07. (8 pages, PDF; see also the AISTATS online
proceedings)
- Sajid M. Siddiqi, Geoffrey J. Gordon, and Andrew W. Moore.
Fast State Discovery for
HMM Model Selection and Learning. AISTATS-07. (8
pages, PDF; see also the AISTATS online
proceedings)
- Purnamrita Sarkar, Sajid M. Siddiqi, and Geoffrey
J. Gordon. A Latent
Space Approach to Dynamic Embedding of Co-occurrence Data.
AISTATS-07. (8 pages, PDF; see also the AISTATS online
proceedings)
- Purnamrita Sarkar, Sajid M. Siddiqi, and Geoffrey
J. Gordon. Approximate Kalman
Filters for Embedding Author-Word Co-occurrence Data over
Time. Workshop on Statistical Network Analysis at the 23rd
International Conference on Machine Learning (2006), Pittsburgh,
PA. (This is the workshop version of the "Latent Space Approach"
paper above; it also appears in
the Springer LNCS series.)
- Geoffrey J. Gordon. Agendas for
Multi-Agent Learning. Artificial Intelligence, special issue
on Foundations of Multiagent Learning, 2007. You may also be
interested in the longer tech
report version.
- Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan, and Pradeep
Khosla. Adaptive Sampling for Multi-Robot Wide Area
Exploration. ICRA-07. (sorry, no link yet; but the TR
version is the next bullet below)
- Kian Hsiang Low, Geoffrey J. Gordon, John M. Dolan, and Pradeep
Khosla. Adaptive Sampling for
Multi-Robot Wide Area Prospecting. Technical Report
CMU-RI-TR-05-51.
- Geoffrey Gordon.
No-regret algorithms for
Online Convex Programs. NIPS 2006. You may also be
interested in the earlier tech report version
listed below. (The TR contains more complete proofs.)
- Chris Murray and Geoffrey Gordon. Multi-Robot Negotiation:
Approximating the Set of Subgame Perfect Equilibria in General-Sum
Games. NIPS 2006. You may also be interested in the
earlier Snowbird
abstract, or the earlier and longer tech
report.
- Nikos Vlassis, Geoff Gordon and Joelle Pineau, eds. Special
issue on Planning
Under Uncertainty in Robotics, Robotics and Autonomous Systems,
Volume 54, Issue 11, pp. 885-944, November 2006.
- Jared Glover, Daniela Rus, Nicholas Roy, and Geoff Gordon.
Robust Models of Object
Geometry. IROS-06. (6 pages, PDF)
- Thakar, R., G. J. Gordon and A. K. Csink. Movement and
anchoring of heterochromatin during cell cycle and developmental
progression. Journal of Cell Science. In press.
You may also be interested in the software mentioned in the paper.
- Joelle Pineau, Geoff Gordon, and Sebastian Thrun. Anytime Point-Based
Approximations for Large POMDPs. JAIR, vol 27, pages
335-380, 2006. This is the journal version of our paper about
the PBVI algorithm.
- Shann-Ching Chen, Ting Zhao, Geoffrey Gordon, and Robert
F. Murphy. A
Novel Graphical Model Approach to Segmenting Cell Images.
2006 BMES Annual Fall Meeting. (8 pages, PDF)
- Shann-Ching Chen, Ting Zhao, Geoffrey J. Gordon and Robert
F. Murphy. A Novel
Graphical Model Approach to Segmenting Cell Images. IEEE
Symposium on Computational Intelligence in Bioinformatics and
Computational Biology, 2006.
- S.-C. Chen, G. Gordon and R.F. Murphy. A Novel
Approximate Inference Approach To Automated Classification Of Protein
Subcellular Location Patterns In Multi-Cell Images.
Proceedings of the 2006 International Symposium on Biomedical Imaging (ISBI),
pp. 558-561. See also the code and
data used in this paper.
- Francisco Pereira
and Geoff Gordon. The
Support Vector Decomposition Machine. To appear in ICML,
2006. (PDF, 8 pages) (there is also an extension of the SVDM
algorithm to be more SVM-like; it was published and presented at the
2006 workshop on Bioimage Informatics at UCSB, and was also the
subject of a talk at the 2006 NIPS workshop on Novel Applications of
Dimensionality Reduction, but I don't have a link up yet)
- Brian Gerkey, Sebastian Thrun, and Geoff Gordon. Visibility-based pursuit-evasion
with limited field of view. IJRR, v25, n4, p299, 2006. (23
pages, PDF)
- Geoffrey J. Gordon. No-regret algorithms for
structured prediction problems. Tech report
CMU-CALD-05-112.  (45 pages, PDF; or try gzipped
postscript.) This is the tech report version of my paper on
Lagrangian Hedging algorithms, which are for online learning in
problems with structured hypothesis and/or output spaces. This
file replaces an earlier draft which had been available on this
website.
- C. Murray and G. Gordon. Multi-Robot Negotiation:
Approximating the Set of Subgame Perfect Equilibria in General-Sum
Stochastic Games. Snowbird Learning Workshop, 2006.
- Matthijs T.J. Spaan, Nikos Vlassis, and Geoffrey J. Gordon.
Decentralized planning under uncertainty
for teams of communicating agents. AAMAS-06. (8 pages
PDF)
- Joelle Pineau and Geoff Gordon. POMDP Planning for Robust
Robot Control. ISRR-05. (10 pages, PDF,
or try a local copy.) Also appears
in Robotics
Research (part of the Springer Tracts in Advanced Robotics (STAR)
series).
- H. Brendan McMahan, Maxim Likhachev, and Geoffrey Gordon.
Bounded Real-Time Dynamic
Programming: RTDP with monotone upper bounds and performance
guarantees. ICML-05.  (8 pages, PDF.)
- H. Brendan McMahan and Geoffrey J. Gordon. Fast Exact Planning in Markov
Decision Processes. ICAPS-05. (10 pages, PDF.)
See also the tech report version, CMU-CS-05-127. (22 pages, PDF.)
- Dave Ferguson, Maxim Likhachev, Geoff Gordon, Anthony Stentz, and
Sebastian Thrun. Anytime Dynamic A*: An
Anytime, Replanning Algorithm. ICAPS-05. (10 pages,
PDF)
- Rosemary Emery-Montemerlo, Geoff Gordon, Jeff Schneider, and
Sebastian Thrun. Game
Theoretic Control for Robot Teams. ICRA 2005. (7
pages, PDF)
- Brian Gerkey, Sebastian Thrun, and Geoff Gordon. Parallel Stochastic
Hill-Climbing with Small Teams. 3rd International NRL Workshop
on Multi-Robot Systems, 2005. (12 pages, PDF)
- Nicholas Roy, Geoffrey Gordon, and Sebastian Thrun. Finding Approximate
POMDP Solutions Through Belief Compression. JAIR vol 23 p
1-40 (2005). Here is a local link, in
PDF, in case the above link is down.
- Maxim Likhachev, Geoff Gordon, and Sebastian Thrun. Planning for Markov
Decision Processes with Sparse Stochasticity. NIPS
2004. Describes a search-based algorithm, MCP, for extracting a
purely-stochastic MDP from a larger problem with lots of deterministic
transitions.
-
Matthew Rosencrantz, Geoff Gordon, and Sebastian Thrun. Learning Low Dimensional
Predictive Representations. ICML 2004. (8 pages, PDF)
-
Rosemary Emery-Montemerlo, Geoff Gordon, Jeff Schneider, and Sebastian
Thrun. Approximate
Solutions For Partially Observable Stochastic Games with Common
Payoffs. AAMAS 2004. (8 pages postscript, or try PDF)
-
Brian P. Gerkey, Sebastian Thrun, and Geoff Gordon. Visibility-based pursuit-evasion
with limited field of view. AAAI 2004. (8 pages
postscript; or try PDF)
-
Aaron Courville, Nathaniel Daw, Geoff Gordon, and Dave
Touretzky. Model Uncertainty in Classical Conditioning.
NIPS 2003.
pdf,
postscript
-
Joelle Pineau, Geoff Gordon, and Sebastian Thrun.
Applying Metric-Trees to Belief-Point POMDPs.
NIPS 2003.
pdf,
postscript
-
Maxim Likhachev, Geoff Gordon, and Sebastian Thrun.
ARA*: Anytime A* with Provable Bounds on Sub-Optimality.
NIPS 2003.
pdf,
postscript
-
Curt Bererton, Geoff Gordon, Sebastian Thrun, and Pradeep
Khosla. Auction Mechanism Design for Multi-Robot
Coordination. NIPS 2003.
pdf,
postscript
-
Allison Bruce and Geoffrey Gordon. Better Motion Prediction for
People-tracking. ICRA-04. pdf, 6 pages.
-
Vandi Verma, Geoff Gordon, Reid Simmons, and Sebastian Thrun. Particle Filters for Rover Fault
Diagnosis. IEEE Robotics & Automation Magazine special issue
on Human Centered Robotics and Dependability, June 2004. (Or try
PDF.)
-
The tech report
version of our paper on ARA* (Anytime Repairing A*), with Maxim Likhachev and Sebastian Thrun.
Describes an anytime modification of A* search which produces a
suboptimal solution quickly, then repeatedly repairs the plan until it
runs out of search time or proves optimality. This version
contains the full proofs of correctness for the algorithm. (2.5M
PDF, 26 pages, CMU-CS-03-148)
-
My ICML-03 paper with Brendan
McMahan and Avrim Blum,
Planning in the presence
of cost functions controlled by an adversary. Shows how to
efficiently solve a class of zero-sum games where one player tries to
plan a path through an MDP while the other player tries to block the
path (349k gzipped postscript, 8 pages) (or try 301k PDF). See also the
related abstract in the NIPS-03
games workshop.
-
My IJCAI-03 paper with Joelle
Pineau and Sebastian
Thrun, Point-based
value iteration: an anytime algorithm for POMDPs. Describes
an approximation to POMDP value iteration which is both fast and
provably low-error (142k gzipped postscript, 6 pages) (or try 293k PDF). See also some
slides for a talk about this paper (PDF,
37 pages).
-
My IJCAI-03 extended abstract with a host of other authors, A learning algorithm for localizing people
based on wireless signal strength that uses labeled and unlabeled
data. (57k gzipped postscript, 2 pages)
-
My UAI-03 paper with Joelle
Pineau and Sebastian
Thrun, Policy-contingent abstraction
for robust robot control. Describes the PolCA algorithm for
hierarchical solution of MDPs and POMDPs. (454k gzipped
postscript, 8 pages) (or try 261k PDF)
-
My UAI-03 paper with Matt
Rosencrantz and Sebastian
Thrun, Decentralized Sensor
Fusion with Distributed Particle Filters. Describes a
query-response algorithm for deciding which sensor data is interesting
enough to send to your neighbors. (390k gzipped postscript, 8
pages) (or try 223k PDF)
-
My FSR-03 paper with Nick
Roy and Sebastian
Thrun, Planning under uncertainty for
reliable health care robotics. Describes how we used the
math from the two NIPS-02 papers below to build a planner for the
NurseBot. (447k gzipped postscript, 6 pages, or try PDF) (also appears in proceedings in Springer Tracts in
Advanced Robotics)
-
My iSAIRAS-03 paper with Vandi
Verma and Reid Simmons,
Efficient monitoring for planetary
rovers. (141k gzipped postscript, 8 pages) (You may
also be interested in the following related extended abstract: Vandi Verma
and Geoff Gordon. "Bayesian Methods for Identifying Faults on
Robots for Planetary Exploration." Proceedings of ISBA,
Viña del Mar, Chile, May 2004.)
-
My AAMAS-03 paper with Matt
Rosencrantz and Sebastian
Thrun, Locating
Moving Entities in Dynamic Indoor Environments with Teams of Mobile
Robots. (Matt was awarded the "Best Student Paper" prize for
this paper.) It describes a technique for factoring multiagent
tracking problems so that the interactions we need to consider are
simpler, as well as an implemented system which tracks teams of robots
as they play laser tag. (793k gzipped postscript, 8 pages, or try
1.1M PDF)
-
My NIPS-02 paper, Generalized2
Linear2 Models. It combines principal components
analysis, independent components analysis, and generalized linear
models. The result is a nonlinear component analysis model which
can be optimized quickly and which can express a variety of useful
relationships between hidden and visible variables. (222k
gzipped postscript, 8 pages) (or try 144k PDF)
-
My NIPS-02 paper with Nick
Roy, Exponential Family PCA
for Belief Compression in POMDPs. It describes a way to find
structure in robot belief states and take advantage of that structure
for planning. Belief states are probability distributions over
physical states, and therefore high-dimensional. In order to
plan, we must reduce the high-dimensional representation to a
lower-dimensional one; so, we applied a nonlinear component analysis
algorithm to find the low-dimensional features which allow us to
reconstruct our belief most accurately in KL-divergence. (66k
gzipped postscript, 8 pages; or try PDF)
-
My UAI-02 paper, Distributed
planning in hierarchical factored MDPs (with Carlos
Guestrin). It describes a way to decompose a large MDP with
factored dynamics into several smaller MDPs that run in parallel and
are coupled by constraints, and provides a principled distributed
planning algorithm based on this intuition. (384k gzipped
postscript, 10 pages) (or try 458k PDF)
-
My NIPS-00 paper, Reinforcement Learning
with Function Approximation Converges to a Region. It proves
that two related algorithms, SARSA(0) and V(0), cannot diverge.
(The latter algorithm was used in the TD-Gammon program, albeit with a
nonlinear function approximator that doesn't fit my
assumptions.) (55k gzipped postscript, 7 pages.) You can
also dowload some slides (246k
gzipped postscript, 18 pages).
-
My ML-00 paper, Learning
Filaments (with Andrew
Moore). It
describes a modification to k-means that allows cluster centers to be
shaped like line segments instead of points. (500k gzipped
postscript, 8 pages)
-
Hierarchical Linear Models and Cell Data, a
Robotics Institute tech report (82k gzipped postscript, 14 pages).
You can also download it from the RI TR archive as gzipped
postscript or PDF.
-
My thesis, Approximate Solutions to Markov
Decision Processes. Contains results about fitted value
iteration, worst case learning, and the relationship between MDPs and
convex programming (680k gzipped postscript, 150 pages). Also
available from the CMU
CS tech reports archive as postscript (3208k) or PDF
(1286k). Here is the abstract.
-
My COLT-99 paper, Regret bounds for prediction
problems, which proves worst-case performance bounds for some
widely-used learning algorithms (379k, 12 pages). You can also
download some slides (493k, 37
pages). Chapter 3 of my thesis is a slightly longer and more
recent presentation of the material in this paper.
-
The online proceedings
of the workshop on modelling in reinforcement learning, held at ICML-97,
co-organized with Chris Atkeson.
-
Stable Function Approximation in
Dynamic Programming from ML-95: convergence guarantees for offline
Markov decision problem solvers based on function approximators like
k-nearest-neighbor. (84k, 8 pages) (or try PDF)
-
Stable Function Approximation in Dynamic
Programming, tech report CMU-CS-95-103: an earlier, longer version
of the above paper. Contains proofs and discussion which were
left out of the ML-95 paper due to limited space. Also available
from the tech
reports archive. (188k, 23 pages) (or try PDF)
-
Online
Fitted Reinforcement Learning from the Value Function Approximation
workshop at ML-95: an addendum to the above two papers which extends
some of their techniques to online Markov decision problems.
(81k, 3 pages) (or try PDF)
-
An example
of SARSA failing to converge. (50k, 6 pages)
-
My NIPS-95 paper. The previous two papers (the one from the ML-95
VFA workshop and the SARSA example) are more recent and cover the same
topics. So, this paper is mostly obsolete. If you want it
anyway, you can click here. (56k, 7
pages)
|