email: (click to reveal)
office: College of Computing Building (CCB) 260
telephone: 4043858556
I am an Assistent Professor in the School of Interactive Computing and affiliated with the Institute for Robotics and Intelligent Machines at the Georgia Insititute of Technology.
My research is focused on statistical machine learning, artificial intelligence, and robotics. I am particularly interested in designing learning algorithms with strong theoretical guarantees that also achieve stateoftheart performance in applied domains. The algorithms that I develop make use of and extend theory from kernel methods, spectral methods, and predictive state representations (See publications below).
Before coming to Georgia Tech, I was a postdoctoral Research Associate in the Robotics and State Estimation Lab in the Computer Science and Engineering Department at the University of Washington. I received my Ph.D. from the Machine Learning Department in the School of Computer Science at Carnegie Mellon University where I was a member of the SELECT Lab run by Carlos Guestrin and my advisor Geoff Gordon.
I have coorganized the following workshops and tutorials:
Authors  Title  Year  Journal/Proceedings 

B. Boots, A. Byravan, & D. Fox  Learning Predictive Models of a Depth Camera & Manipulator from Raw Execution Traces. (48% Acceptance Rate) 
2014  Proceedings of the 2014 IEEE Conference on Robotics and Automation (ICRA2014) 
Abstract: We attack the problem of learning a predictive model of a depth camera and manipulator directly from raw execution traces. While the problem of learning manipulator models from visual and proprioceptive data has been addressed before, existing techniques often rely on assumptions about the structure of the robot or tracked features in observation space. We make no such assumptions. Instead, we formulate the problem as that of learning a highdimensional controlled stochastic process. We leverage recent work on nonparametric predictive state representations to learn a generative model of the depth camera and robotic arm from sequences of uninterpreted actions and observations. We perform several experiments in which we demonstrate that our learned model can accurately predict future observations in response to sequences of motor commands. 

BibTeX:
@inproceedings{BootsDepthcamManipulator, 

A. Byravan, B. Boots, S. Srinivasa, & D. Fox.  SpaceTime Functional Gradient Optimization for Motion Planning. (48% Acceptance Rate)  2014  Proceedings of the 2014 IEEE Conference on Robotics and Automation (ICRA2014) 
Abstract: Functional gradient algorithms (e.g. CHOMP) have recently shown great promise for producing optimal motion for complex many degreeoffreedom robots. A key limitation of such algorithms, is the difficulty in incorporating constraints and cost functions that explicitly depend on time. We present TCHOMP, a functional gradient algorithm that overcomes this limitation by directly optimizing in spacetime. We outline a framework for joint spacetime optimization, derive an efficient trajectorywide update for maintaining time monotonicity, and demonstrate the significance of TCHOMP over CHOMP in several scenarios. By manipulating time, TCHOMP produces lowercost trajectories leading to behavior that is meaningfully different from CHOMP. 

BibTeX:
@inproceedings{ByravanSpaceTime, 

B. Boots, A. Gretton, & G. J. Gordon.  Hilbert Space Embeddings of Predictive State Representations. (Selected for plenary presentation: 11% Acceptance Rate) [Abstract] [BibTeX] [PDF] 
2013  Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI2013) 
Abstract: Predictive State Representations (PSRs) are an expressive class of models for controlled stochastic processes. PSRs represent state as a set of predictions of future observable events. Because PSRs are defined entirely in terms of observable data, statistically consistent estimates of PSR parameters can be learned efficiently by manipulating moments of observed training data. Most learning algorithms for PSRs have assumed that actions and observations are finite with low cardinality. In this paper, we generalize PSRs to infinite sets of observations and actions, using the recent concept of Hilbert space embeddings of distributions. The essence is to represent the state as a nonparametric conditional embedding operator in a Reproducing Kernel Hilbert Space (RKHS) and leverage recent work in kernel methods to estimate, predict, and update the representation. We show that these Hilbert space embeddings of PSRs are able to gracefully handle continuous actions and observations, and that our learned models outperform competing system identification algorithms on several prediction benchmarks. 

BibTeX:
@inproceedings{BootsHSEPSRs, 

B. Boots & G. J. Gordon.  A Spectral Learning Approach to RangeOnly SLAM. (24% Acceptance Rate) 
2013  Proceedings of the 30th International Conference on Machine Learning (ICML2013) 
Abstract: We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiplehypothesis tracking (MHT) methods for rangeonly SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance. Compared with popular extended Kalman filter (EKF) or extended information filter (EIF) approaches, and many MHT ones, our approach does not need to linearize a transition or measurement model; such linearizations can cause severe errors in EKFs and EIFs, and to a lesser extent MHT, particularly for the highly nonGaussian posteriors encountered in rangeonly SLAM. We provide a theoretical analysis of our method, including finitesample error bounds. Finally, we demonstrate on a realworld robotic SLAM problem that our algorithm is not only theoretically justified, but works well in practice: in a comparison of multiple methods, the lowest errors come from a combination of our algorithm with batch optimization, but our method alone produces nearly as good a result at far lower computational cost. 

BibTeX:
@inproceedings{BootsspectralROSLAM, Author = "Byron Boots and Geoffrey Gordon ", Booktitle = "Proc.\ 30th Intl.\ Conf.\ on Machine Learning (ICML)", Title = "A Spectral Learning Approach to RangeOnly SLAM", Year = {2013} } 

B. Boots & G. J. Gordon.  TwoManifold Problems with Applications to Nonlinear System Identification. (27% Acceptance Rate)  2012  Proceedings of the 29th International Conference on Machine Learning (ICML2012) 
Abstract: Recently, there has been much interest in spectral approaches to learning manifoldssocalled kernel eigenmap methods. These methods have had some successes, but their applicability is limited because they are not robust to noise. To address this limitation, we look at twomanifold problems, in which we simultaneously reconstruct two related manifolds, each representing a different view of the same data. By solving these interconnected learning problems together, twomanifold algorithms are able to succeed where a nonintegrated approach would fail: each view allows us to suppress noise in the other, reducing bias. We propose a class of algorithms for twomanifold problems, based on spectral decomposition of crosscovariance operators in Hilbert space, and discuss when twomanifold problems are useful. Finally, we demonstrate that solving a twomanifold problem can aid in learning a nonlinear dynamical system from limited data. 

BibTeX:
@inproceedings{Boots2Manifold, Author = "Byron Boots and Geoffrey Gordon ", Booktitle = "Proc.\ 29th Intl.\ Conf.\ on Machine Learning (ICML)", Title = "TwoManifold Problems with Applications to Nonlinear System Identification", Year = {2012} } 

B. Boots & G. J. Gordon.  An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems. (25% Acceptance Rate) 
2011  Proceedings of the 25th National Conference on Artificial Intelligence (AAAI2011) 
Abstract: Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systemsfor example, Hidden Markov Models (HMMs), Partially Observable Markov Decision Processes (POMDPs), and Transformed Predictive State Representations (TPSRs). These algorithms are attractive since they are statistically consistent and not subject to local optima. However,
they are batch methods: they need to store their entire training data set in memory at once and operate on it as a large matrix, and so they cannot scale to extremely large data sets (either many examples or many features per example). In turn, this restriction limits their ability to learn accurate models of complex systems. To overcome these limitations, we propose a new online spectral algorithm, which uses tricks such as incremental SVD updates and random projections to scale to much larger data sets and more complex systems than previous methods. We demonstrate the new method on a highbandwidth video mapping task, and illustrate desirable behaviors such as "closing the loop,'' where the latent state representation changes suddenly as the learner recognizes that it has returned to a previously known place. 

BibTeX:
@inproceedings{Bootsonlinepsr, Author = "Byron Boots and Geoffrey Gordon ", Booktitle = "Proceedings of the 25th National Conference on Artificial Intelligence (AAAI2011)", Title = "An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems ", Year = "2011" } 

B. Boots, S. M. Siddiqi & G. J. Gordon.  Closing the LearningPlanning Loop with Predictive State Representations. (Invited Journal Paper)  2011  The International Journal of Robotics Research 
Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of our environment, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model which is too inaccurate to support planning or too large and complex for planning to be feasible; or, they require large amounts of prior domain knowledge or fail to provide important guarantees such as statistical consistency. To begin to fill this gap, we propose a novel algorithm which provably learns a compact, accurate model directly from sequences of actionobservation pairs. To evaluate the learned model, we then close the loop from observations to actions: we plan in the learned model and recover a policy which is nearoptimal in the original environment (not the model). In more detail, we present a spectral algorithm for learning a Predictive State Representation (PSR). We demonstrate the algorithm by learning a model of a simulated highdimensional, visionbased mobile robot planning task, and then performing approximate pointbased planning in the learned PSR. This experiment shows that the algorithm learns a state space which captures the essential features of the environment, allows accurate prediction with a small number of parameters, and enables successful and efficient planning. Our algorithm has several benefits which have not appeared together in any previous PSR learner: it is computationally efficient and statistically consistent; it handles highdimensional observations and long time horizons by working from realvalued features of observation sequences; and finally, our closetheloop experiments provide an endtoend practical test. 

BibTeX:
@inproceedings{Boots2011b, Author = "Byron Boots and Sajid Siddiqi and Geoffrey Gordon ", Title = "Closing the Learning Planning Loop with Predictive State Representations", Journal = "I. J. Robotic Research ", Volume = "30", Year = "2011", Pages = "954956" } 

B. Boots & G. J. Gordon  Predictive State Temporal Difference Learning. (24% Acceptance Rate) 
2011  Proceedings of Advances in Neural Information Processing Systems 24 (NIPS2010) 
Abstract: We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications, reinforcement learning (RL) is complicated by the fact that state is either highdimensional or partially observable. Therefore, RL methods are designed to work with features of state rather than state itself, and the success or failure of learning is often determined by the suitability of the selected features. By comparison, subspace identification (SSID) methods are designed to select a feature set which preserves as much information as possible about state. In this paper we connect the two approaches, looking at the problem of reinforcement learning with a large set of features, each of which may only be marginally useful for value function approximation. We introduce a new algorithm for this situation, called Predictive State Temporal Difference (PSTD) learning. As in SSID for predictive state representations, PSTD finds a linear compression operator that projects a large set of features down to a small set that preserves the maximum amount of predictive information. As in RL, PSTD then uses a Bellman recursion to estimate a value function. We discuss the connection between PSTD and prior approaches in RL and SSID. We prove that PSTD is statistically consistent, perform several experiments that illustrate its properties, and demonstrate its potential on a difficult optimal stopping problem. 

BibTeX:
@inproceedings{Boots11a, Author = "Byron Boots and Geoffrey J. Gordon ", Booktitle = "Proceedings of Advances in Neural Information Processing Systems 24", Title = "Predictive State Temporal Difference Learning", Year = {2011} } 

L. Song, B. Boots, S. M. Siddiqi, G. J. Gordon & A. J. Smola  Hilbert Space Embeddings of Hidden Markov Models. (Best Paper Award) 
2010 
Proceedings of the 27th International Conference on Machine Learning (ICML2010) 
Abstract: Hidden Markov Models (HMMs) are important tools for modeling sequence
data. However, they are restricted to discrete latent states, and are
largely restricted to Gaussian and discrete observations. And,
learning algorithms for HMMs have predominantly relied on local search
heuristics, with the exception of spectral methods such as those
described below. We propose a Hilbert space embedding of HMMs that
extends traditional HMMs to structured and nonGaussian continuous distributions. Furthermore, we derive a
localminimumfree kernel spectral algorithm for learning these
HMMs. We apply our method to robot vision data, slot car inertial
sensor data and audio event classification data, and show that in
these applications, embedded HMMs exceed the previous stateoftheart
performance. 

BibTeX:
@inproceedings{Song:2010fk, Author = "L. Song and B. Boots and S. M. Siddiqi and G. J. Gordon and A. J. Smola", Booktitle = "Proc.\ 27th Intl.\ Conf.\ on Machine Learning (ICML)", Title = "Hilbert Space Embeddings of Hidden {M}arkov Models", Year = {2010} } 

B. Boots, S. M. Siddiqi & G. J. Gordon.  Closing the LearningPlanning Loop with Predictive State Representations. (Selected for plenary presentation: 8% acceptance rate) 
2010  Proceedings of Robotics: Science and Systems
VI (RSS2010) 
Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of our environment, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model which is too inaccurate to support planning or too large and complex for planning to be feasible; or, they require large amounts of prior domain knowledge or fail to provide important guarantees such as statistical consistency. To begin to fill this gap, we propose a novel algorithm which provably learns a compact, accurate model directly from sequences of actionobservation pairs. To evaluate the learned model, we then close the loop from observations to actions: we plan in the learned model and recover a policy which is nearoptimal in the original environment (not the model). In more detail, we present a spectral algorithm for learning a Predictive State Representation (PSR). We demonstrate the algorithm by learning a model of a simulated highdimensional, visionbased mobile robot planning task, and then performing approximate pointbased planning in the learned PSR. This experiment shows that the algorithm learns a state space which captures the essential features of the environment, allows accurate prediction with a small number of parameters, and enables successful and efficient planning. Our algorithm has several benefits which have not appeared together in any previous PSR learner: it is computationally efficient and statistically consistent; it handles highdimensional observations and long time horizons by working from realvalued features of observation sequences; and finally, our closetheloop experiments provide an endtoend practical test. 

BibTeX:
@inproceedings{BootsRSS10, Author = "B. Boots AND S. Siddiqi and G. Gordon", Title = "Closing the LearningPlanning Loop with Predictive State Representations", Booktitle = "Proceedings of Robotics: Science and Systems", Year = "2010", Address = "Zaragoza, Spain", Month = "June" } 

S. M. Siddiqi, B. Boots & G. J. Gordon.  ReducedRank Hidden Markov Models. (Selected for plenary presentation: 8% acceptance rate) 
2010  Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS2010) 
Abstract: Hsu et al. (2009) recenly proposed an efficient, accurate spectral learning algorirhm for Hidden Markov Models (HMMs). In this paper we relax their assumptions and prove a tighter finitesample error bound for the case of ReducedRank HMMs, i.e., HMMs with lowrank transition matrices. Since rankk RRHMMs are a larger class of models than kstate HMMs while being equally efficient to work with, this relaxation greatly increases the learning algorithm's scope. In addition, we generalize the algorithm and bounds to models where multiple observations are needed to disambiguate state, and to models that emit multivariate realvalued observations. Finally we prove consistency for learning Predictive State Representations, an even larger class of models. Experiments on synthetic data and a toy video, as well as on difficult robot vision data, yeild accurate models that compare favorably with alternatives in simulation quality and prediction accuracy. 

BibTeX:
@inproceedings{Siddiqi10a, author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon", title = "ReducedRank Hidden {Markov} Models", booktitle = "Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS2010)", year = "2010" } 

B. Boots, S. M. Siddiqi & G. J. Gordon.  Closing the LearningPlanning Loop with Predictive State Representations. (Short paper: for longer version see paper accepted at RSS above)  2010 
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS2010) 
Abstract: A central problem in artificial intelligence is to plan to maximize future reward under uncertainty in a partially observable environment. Models of such environments include Partially Observable Markov Decision Processes (POMDPs) as well as their generalizations, Predictive State Representations (PSRs) and Observable Operator Models (OOMs). POMDPs model the state of the world as a latent variable; in contrast, PSRs and OOMs
represent state by tracking occurrence probabilities of a set of future events (called tests or characteristic events) conditioned on past events (called histories or indicative events). Unfortunately, exact planning algorithms such as value iteration are intractable for most realistic POMDPs due to the curse of history and the curse of dimensionality. However, PSRs and OOMs hold the promise of mitigating both of these curses: first, many successful approximate planning techniques designed to address these problems in POMDPs can easily be adapted to PSRs and OOMs. Second, PSRs and OOMs are often more compact than their corresponding POMDPs (i.e., need fewer state dimensions), mitigating the curse of dimensionality. Finally, since tests and histories are observable quantities, it has been suggested that PSRs and OOMs should be easier to learn than POMDPs; with a successful learning algorithm, we can look for a model which ignores all but the most important components of state, reducing dimensionality still further. 

BibTeX:
@inproceedings{Boots10a, author = "Byron Boots and Sajid Siddiqi and Geoffrey J. Gordon", title = "Closing the LearningPlanning Loop with Predictive State Representations", booktitle = "Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS2010)", year = "2010" } 

S. M. Siddiqi, B. Boots & G. J. Gordon.  A Constraint Generation Approach to Learning Stable Linear Dynamical Systems.
(Selected for plenary presentation: 2.5% acceptance rate) 
2008  Proceedings of Advances in Neural Information Processing Systems 21 (NIPS2007) 
Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higherquality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drugsales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein, with positive results in terms of accuracy, quality of simulated sequences, and efﬁciency. 

BibTeX:
@inproceedings{Siddiqi07b, author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon", title = "A Constraint Generation Approach to Learning Stable Linear Dynamical Systems", booktitle = "Proceedings of Advances in Neural Information Processing Systems 20 (NIPS07)", year = "2007" } 

S. M. Siddiqi, B. Boots G. J. Gordon, & A. W. Dubrawski  Learning Stable Multivariate Baseline Models for Outbreak Detection.  2007  Advances in Disease Surveillance 
Abstract: We propose a novel technique for building generative models of realvalued multivariate time series data streams. Such models are of considerable utility as baseline simulators in anomaly detection systems. The proposed algorithm, based on Linear Dynamical Systems (LDS), learns stable parameters efficiently while yeilding more accurate results than previously known methods. The resulting model can be used to generate infinitely long sequences of realistic baselines using small samples of training data. 

BibTeX:
@article{Siddiqi07c, author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon and Artur W. Dubrawski", title = "Learning Stable Multivariate Baseline Models for Outbreak Detection", journal = "Advances in Disease Surveillance", year = "2007", volume = {4}, pages = {266} } 

B. Boots, S. Nundy & D. Purves.  Evolution of Visually Guided Behavior in Artificial Agents.  2007  Network: Computation in Neural Systems 
Abstract: Recent work on brightness, color, and form has suggested that human visual percepts represent the probable sources of retinal images rather than stimulus features as such. Here we investigate the plausibility of this empirical concept of vision by allowing autonomous agents to evolve in virtual environments based solely on the relative success of their behavior. The responses of evolved agents to visual stimuli indicate that fitness improves as the neural network control systems gradually incorporate the statistical relationship between projected images and behavior appropriate to the sources of the inherently ambiguous images. These results: (1) demonstrate the merits of a wholly empirical strategy of animal vision as a means of contending with the inverse optics problem; (2) argue that the information incorporated into biological visual processing circuitry is the relationship between images and their probable sources; and (3) suggest why human percepts do not map neatly onto physical reality. 

BibTeX:
@article{Boots2007a, author = "Byron Boots and Surajit Nundy and Dale Purves", title = "Evolution of Visually Guided Behavior in Artificial Agents", journal = "Network: Computation in Neural Systems", year = "2007", volume = {18(1)}, pages = {1134} } 

S. Majercik & B. Boots  DCSSAT: A DivideandConquer Approach to Solving Stochastic Satisfiability Problems Efficiently. (18% acceptance rate) 
2005 
Proceedings of the 20th National Conference on Artificial Intelligence (AAAI2005) 
Abstract: We present DCSSAT, a sound and complete divideandconquer algorithm for solving stochastic satisfiability (SSAT) problems that outperforms the best existing algorithm for solving such problems (ZANDER) by several orders of magnitude with respect to both time and space. DCSSAT achieves this performance by dividing the SSAT problem into subproblems based on the structure of the original instance, caching the viable partial assignments (VPAs) generated by solving these subproblems, and using these VPAs to construct the solution to the original problem. DCSSAT does not save redundant VPAs and each VPA saved is necessary to construct the solution. Furthermore, DCSSAT builds a solution that is already humancomprehensible, allowing it to avoid the costly solution rebuilding phase in ZANDER. As a result, DCSSAT is able to solve problems using, typically, 12 orders of magnitude less space than ZANDER, allowing DCSSAT to solve problems ZANDER cannot solve due to space constraints. And, in spite of its more parsimonious use of space, DCSSAT is typically 12 orders of magnitude faster than ZANDER. We describe the DCSSAT algorithm and present empirical results comparing its performance to that of ZANDER on a set of SSAT problems. 

BibTeX:
@inproceedings{Majercik2005, author = "Stephen M. Majercik and Byron Boots", title = "DCSSAT: A DivideandConquer Approach to Solving Stochastic Satisfiability Problems Efficiently", booktitle = "Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI05)", year = "2005", note = "AAAI05" } 
Refereed Abstracts & Workshop Publications
Authors  Title  Year  Venue 

B. Boots, A. Byravan, & D. Fox. 
Learning Predictive Models of a Depth Camera & Manipulator from Raw Execution Traces.  2013  NIPS Workshop on Advances in Machine Learning for Sensorimotor Control 
Abstract: We attack the problem of learning a predictive model of a depth camera and manipulator directly from raw execution traces. While the problem of learning manipulator models from visual and proprioceptive data has been addressed before, existing techniques often rely on assumptions about the structure of the robot or tracked features in observation space. We make no such assumptions. Instead, we formulate the problem as that of learning a highdimensional controlled stochastic process. We leverage recent work on nonparametric predictive state representations to learn a generative model of the depth camera and robotic arm from sequences of uninterpreted actions and observations. We perform several experiments in which we demonstrate that our learned model can accurately predict future observations in response to sequences of motor commands. 

BibTeX:
@article{Boots13NIPSa, author = "Byron Boots and Arunkumar Byravan and Dieter Fox", title = "Learning Predictive Models of a Depth Camera & Manipulator from Raw Execution Traces", journal = "NIPS Workshop on Advances in Machine Learning for Sensorimotor Control", year = "2013" } 

B. Boots & D. Fox.  Learning Dynamic Policies from Demonstration.  2013  NIPS Workshop on Advances in Machine Learning for Sensorimotor Control 
Abstract: We address the problem of learning a policy directly from expert demonstrations. Typically, this problem is solved with a supervised learning approach such as regression or classification resulting in a reactive policy. Unfortunately, reactive policies cannot model longrange dependancies, and this omission can result in suboptimal performance. We take a different approach. We observe that policies and dynamical systems are mathematical duals, and we use this fact to leverage the rich literature on system identification to learn policies from demonstration. System identification algorithms often have desirable properties like the ability to model longrange dependancies, statistical consistency, and efficient offtheshelf implementations. We show that by employing system identification to learning from demonstration problems, all of these properties can be carried over to the learning from demonstration domain resulting in improved practical performance. 

BibTeX:
@article{Boots13NIPSb, author = "Byron Boots and Dieter Fox", title = "Learning Dynamic Policies from Demonstration", journal = "NIPS Workshop on Advances in Machine Learning for Sensorimotor Control", year = "2013" } 

B. Boots, A. Gretton, & G. J. Gordon. 
Hilbert Space Embeddings of PSRs.  2013  ICML Workshop on Machine Learning and System Identification 
Abstract: We fully generalize PSRs to continuous observations and actions using a recent concept called Hilbert space embeddings of distributions. The essence of our method is to represent distributions of tests, histories, observations, and actions, as points in (possibly) infinitedimensional Reproducing Kernel Hilbert Spaces (RKHSs). During filtering we update these distributions using a kernel version of Bayes' rule. To improve computational tractability, we develop a spectral system identification method to learn a succinct parameterization of the target system. 

BibTeX:
@article{Boots13ICML, author = "Byron Boots, Arthur Gretton, and Geoffrey J. Gordon", title = "Hilbert Space Embeddings of PSRs", journal = "ICML workshop on Machine Learning and System Identification (MLSYSID)", year = "2013" } 

B. Boots, A. Gretton, & G. J. Gordon. 
Hilbert Space Embeddings of PSRs.  2012  NIPS Workshop on Spectral Algorithms for Latent Variable Models 
Abstract: We fully generalize PSRs to continuous observations and actions using a recent concept called Hilbert space embeddings of distributions. The essence of our method is to represent distributions of tests, histories, observations, and actions, as points in (possibly) infinitedimensional Reproducing Kernel Hilbert Spaces (RKHSs). During filtering we update these distributions using a kernel version of Bayes' rule. To improve computational tractability, we develop a spectral system identification method to learn a succinct parameterization of the target system. 

BibTeX:
@article{Boots12NIPSb, author = "Byron Boots, Arthur Gretton, and Geoffrey J. Gordon", title = "Hilbert Space Embeddings of PSRs", journal = "NIPS Workshop on Spectral Algorithms for Latent Variable Models", year = "2012" } 

B. Boots, & G. J. Gordon.  A Spectral Learning Approach to RangeOnly SLAM.  2012  NIPS Workshop on Spectral Algorithms for Latent Variable Models 
Abstract: We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiplehypothesis tracking methods for rangeonly SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance. 

BibTeX:
@article{Boots12NIPSa, author = "Byron Boots and Geoffrey J. Gordon", title = "A Spectral Learning Approach to RangeOnly SLAM", journal = "NIPS Workshop on Spectral Algorithms for Latent Variable Models", year = "2012" } 

B. Boots & G. J. Gordon.  Online Spectral Identification of Dynamical Systems.  2011  NIPS Workshop on Sparse Representation and Lowrank Approximation 
Abstract: Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systemsfor example, Hidden Markov Models (HMMs), Partially Observable Markov Decision Processes (POMDPs), and Transformed Predictive State Representations (TPSRs). These algorithms are attractive since they are statistically consistent and not subject to local optima. However,
they are batch methods: they need to store their entire training data set in memory at once and operate on it as a large matrix, and so they cannot scale to extremely large data sets (either many examples or many features per example). In turn, this restriction limits their ability to learn accurate models of complex systems. To overcome these limitations, we propose a new online spectral algorithm, which uses tricks such as incremental SVD updates and random projections to scale to much larger data sets and more complex systems than previous methods. We demonstrate the new method on a highbandwidth video mapping task, and illustrate desirable behaviors such as "closing the loop,'' where the latent state representation changes suddenly as the learner recognizes that it has returned to a previously known place. 

BibTeX:
@inproceedings{Boots11nips, author = "Byron Boots and Geoffrey J. Gordon", title = "Online Spectral Identification of Dynamical Systems", journal = "NIPS Workshop on Sparse Representation and Lowrank Approximation", year = "2011" } 

B. Boots, S. M. Siddiqi & G. J. Gordon.  Closing the LearningPlanning Loop with Predictive State. Representations.  2009  NIPS Workshop on Probabilistic Approaches for Robotics and Control 
Abstract: We propose a principled and provably statistically consistent modellearning algorithm, and demonstrate positive results on a challenging highdimensional problem with continuous observations. In particular, we propose a novel, consistent spectral algorithm for learning a variant of PSRs called Transformed PSRs (TPSRs) directly from execution traces. 

BibTeX:
@article{Boots09nips, author = "Byron Boots and Sajid M. Siddiqi and Geoffrey J. Gordon", title = "Closing the LearningPlanning Loop with Predictive State Representations", journal = "NIPS Workshop on Probabilistic Approaches for Robotics and Control", year = "2009" } 

D. Purves, & B. Boots  Evolution of Visually Guided Behavior in Artificial Agents.  2006  Vision Science Society Annual Meeting/Journal of Vision 
Abstract: Recent work on brightness, color and form has suggested that human visual percepts represent the probable sources of retinal images rather than stimulus features as such. We have investigated this empirical concept of vision by asking whether agents using neural network control systems evolve successful visually guided behavior based solely on the statistical relationship of images on their sensor arrays and the probable sources of the images in a simulated environment. A virtual environment was created with OpenGL consisting of an arena with a central obstacle, similar to arenas used in evolutionary robotics experiments. The neural control system for each agent comprised a singlelayer, feedforward network that connected all 256 inputs from a sensor array to two output nodes that encoded rotation and translation responses. Each agent's behavioral actions in the environment were evaluated, and the fittest individuals selected to produce a new population according to a standard genetic algorithm. This process was repeated until the average fitness of subsequent generations reached a plateau. Analysis of the actions of evolved agents in response to visual input showed their neural network control systems had incorporated the statistical relationship between projected images and their possible sources, and that this information was used to produce increasingly successful visually guided behavior. The simplicity of this paradigm notwithstanding, these results support the idea that biological vision has evolved to solve the inverse problem on a wholly empirical basis, and provide a novel way of exploring visual processing. 

BibTeX:
@article{Purvesn2006, author = "Dale Purves and Byron Boots", title = "Evolution of Visually Guided Behavior in Artificial Agents", journal = "Journal of Vision", year = {2006}, volume = {6(6)}, pages = {356a} } 
Technical Reports & Book Chapters
Authors  Title  Year  Tech. Report/Book 

B. Boots & G. J. Gordon.  A Spectral Learning Approach to RangeOnly SLAM.  2012  Technical Report arXiv:1207.2491 
Abstract: We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiplehypothesis tracking (MHT) methods for rangeonly SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance. Compared with popular extended Kalman filter (EKF) or extended information filter (EIF) approaches, and many MHT ones, our approach does not need to linearize a transition or measurement model; such linearizations can cause severe errors in EKFs and EIFs, and to a lesser extent MHT, particularly for the highly nonGaussian posteriors encountered in rangeonly SLAM. We provide a theoretical analysis of our method, including finitesample error bounds. Finally, we demonstrate on a realworld robotic SLAM problem that our algorithm is not only theoretically justified, but works well in practice: in a comparison of multiple methods, the lowest errors come from a combination of our algorithm with batch optimization, but our method alone produces nearly as good a result at far lower computational cost. 

BibTeX:
@article{Boots11arxiv, author = "Byron Boot and Geoffrey J. Gordon", year = {2012}, title = "A Spectral Learning Approach to RangeOnly SLAM", journal = {http://arxiv.org/abs/1207.2491} } 

B. Boots & G. J. Gordon.  TwoManifold Problems.  2011  Technical Report arXiv:1112.6399 
Abstract: Recently, there has been much interest in spectral approaches to learning manifoldssocalled kernel eigenmap methods. These methods have had some successes, but their applicability is limited because they are not robust to noise. To address this limitation, we look at twomanifold problems, in which we simultaneously reconstruct two related manifolds, each representing a different view of the same data. By solving these interconnected learning problems together and allowing information to flow between them, twomanifold algorithms are able to succeed where a nonintegrated approach would fail: each view allows us to suppress noise in the other, reducing bias in the same way that an instrumental variable allows us to remove bias in a {linear} dimensionality reduction problem. We propose a class of algorithms for twomanifold problems, based on spectral decomposition of crosscovariance operators in Hilbert space. Finally, we discuss situations where twomanifold problems are useful, and demonstrate that solving a twomanifold problem can aid in learning a nonlinear dynamical system from limited data. 

BibTeX:
@article{Boots11arxiv, author = "Byron Boot and Geoffrey J. Gordon", year = {2011}, title = "TwoManifold Problems", journal = {http://arxiv.org/abs/1112.6399} } 

B. Boots & G. J. Gordon.  Predictive State Temporal Difference Learning.  2010  Technical Report arXiv:1011.0041 
Abstract: We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications, reinforcement learning (RL) is complicated by the fact that state is either highdimensional or partially observable. Therefore, RL methods are designed to work with features of state rather than state itself, and the success or failure of learning is often determined by the suitability of the selected features. By comparison, subspace identification (SSID) methods are designed to select a feature set which preserves as much information as possible about state. In this paper we connect the two approaches, looking at the problem of reinforcement learning with a large set of features, each of which may only be marginally useful for value function approximation. We introduce a new algorithm for this situation, called Predictive State Temporal Difference (PSTD) learning. As in SSID for predictive state representations, PSTD finds a linear compression operator that projects a large set of features down to a small set that preserves the maximum amount of predictive information. As in RL, PSTD then uses a Bellman recursion to estimate a value function. We discuss the connection between PSTD and prior approaches in RL and SSID. We prove that PSTD is statistically consistent, perform several experiments that illustrate its properties, and demonstrate its potential on a difficult optimal stopping problem.


BibTeX:
@article{Boots10arxiv, author = "Byron Boot and Geoffrey J. Gordon", year = {2010}, title = "Predictive State Temporal Difference Learning", journal = {http://arxiv.org/abs/1011.0041} } 

B. Boots, S. M. Siddiqi & G. J. Gordon.  Closing the LearningPlanning Loop with Predictive State Representations. (Updated: 2010)  2009  Technical Report arXiv:0912.2385 
Abstract: A central problem in artificial intelligence is that of planning to maximize
future reward under uncertainty in a partially observable
environment. In this paper we propose and demonstrate a novel
algorithm which accurately learns a model of such an environment
directly from sequences of actionobservation pairs. We then
close the loop from observations to actions by planning in
the learned model and recovering a policy which is nearoptimal in the original environment.
Specifically, we present an efficient and statistically consistent
spectral algorithm for learning the parameters of a Predictive State
Representation (PSR). We demonstrate the algorithm by learning a
model of a simulated highdimensional, visionbased mobile robot
planning task, and then perform approximate pointbased planning in
the learned PSR. Analysis of our results shows that the algorithm
learns a state space which efficiently captures the essential
features of the environment. This representation allows accurate
prediction with a small number of parameters, and enables successful
and efficient planning.


BibTeX:
@article{Boots09arxiv, author = "Byron Boots and Sajid M. Siddiqi and Geoffrey J. Gordon", year = {2009}, title = "Closing the LearningPlanning Loop with Predictive State Representations", journal = {http://arxiv.org/abs/0912.2385} } 

S. M. Siddiqi, B. Boots & G. J. Gordon.  ReducedRank Hidden Markov Models.  2009  Technical Report arXiv:0910.0902 
Abstract: We introduce the ReducedRank Hidden Markov Model (RRHMM), a generalization of HMMs that can model smooth state evolution as in Linear Dynamical Systems (LDSs) as well as nonlogconcave predictive distributions as in continuousobservation HMMs. RRHMMs assume an mdimensional latent state and n discrete observations, with a transition matrix of rank k <= m. This implies the dynamics evolve in a kdimensional subspace, while the shape of the set of predictive distributions is determined by m. Latent state belief is represented with a kdimensional state vector and inference is carried out entirely in R^k, making RRHMMs as computationally efficient as kstate HMMs yet more expressive. To learn RRHMMs, we relax the assumptions of a recently proposed spectral learning algorithm for HMMs and apply it to learn kdimensional observable representations of rankk RRHMMs. The algorithm is consistent and free of local optima, and we extend its performance guarantees to cover the RRHMM case. We show how this algorithm can be used in conjunction with a kernel density estimator to efficiently model highdimensional multivariate continuous data. We also relax the assumption that single observations are sufficient to disambiguate state, and extend the algorithm accordingly. Experiments on synthetic data and a toy video, as well as on a difficult robot vision modeling problem, yield accurate models that compare favorably with standard alternatives in simulation quality and prediction capability. 

BibTeX:
@article{siddiqi09arxiv, author = "Sajid M. Siddiqi and Byron Boots and Geoffrey J. Gordon", year = "2009", title = "ReducedRank Hidden {M}arkov Models", journal = "http://arxiv.org/abs/0910.0902" } 

S. M. Siddiqi, B. Boots & G. J. Gordon.  A Constraint Generation Approach to Learning Stable Linear Dynamical Systems.  2008 
Technical Report CMUML08101 
Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higherquality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drugsales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein, with positive results in terms of accuracy, quality of simulated sequences, and efﬁciency. 

BibTeX:
@techreport{Siddiqi08, author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon", title = "A Constraint Generation Approach to Learning Stable Linear Dynamical Systems", institution = {Carnegie Mellon University}, number = {CMUML08101}, year = {2008} } 

E. Chown & B. Boots  Learning in Cognitive Maps: Finding Useful Structure in an Uncertain World.  2008  Robot and Cognitive Approaches to Spatial Mapping 
Abstract: In this chapter we will describe the central mechanisms that influence how people learn about largescale space. We will focus particularly on how these mechanisms enable people to effectively cope with both the uncertainty inherent in a constantly changing world and also with the high information content of natural environments. The major lessons are that humans get by with a “less is more” approach to building structure, and that they are able to quickly adapt to environmental changes thanks to a range of general purpose mechanisms. By looking at abstract principles, instead of concrete implementation details, it is shown that the study of human learning can provide valuable lessons for robotics. Finally, these issues are discussed in the context of an implementation on a mobile robot. 

BibTeX:
@incollection{Chown2008, author = "Eric Chown and Byron Boots", title = "Learning Cognitive Maps: Finding Useful Structure in an Uncertain World", booktitle = "Robot and Cognitive Approaches to Spatial Mapping", editor = "Margaret E. Jefferies and WaiKiang Yeap", publisher = {Springer Verlag}, year = {2008}, chapter = {10}, pages ={215236} } 
Theses
Author  Title  Year  Degree & Institution 

B. Boots  Spectral Approaches to Learning Predictive Representations.  2012 
Doctoral Thesis Carnegie Mellon University 
Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must obtain an accurate environment model, and then plan to maximize reward. However, for complex domains, specifying a model by hand can be a time consuming process. This motivates an alternative approach: learning a model directly from observations. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning or too large and complex for planning to succeed; or, they require excessive prior domain knowledge or fail to provide guarantees such as statistical consistency. To address this gap, we propose spectral subspace identification algorithms which provably learn compact, accurate, predictive models of partially observable dynamical systems directly from sequences of actionobservation pairs. Our research agenda includes several variations of this general approach: spectral methods for classical models like Kalman filters and hidden Markov models, batch algorithms and online algorithms, and kernelbased algorithms for learning models in high and infinitedimensional feature spaces. All of these approaches share a common framework: the model's belief space is represented as predictions of observable quantities and spectral algorithms are applied to learn the model parameters. Unlike the popular EM algorithm, spectral learning algorithms are statistically consistent, computationally efficient, and easy to implement using established matrixalgebra techniques. We evaluate our learning algorithms on a series of prediction and planning tasks involving simulated data and real robotic systems. 

BibTeX:
@phdthesis{BootsThesis2012, author = "Byron Boots", title = "Spectral Approaches to Learning Predictive Representations", school = "Carnegie Mellon University", year = {2012}, month = {December}, } 

B. Boots  Spectral Approaches to Learning Predictive Representations: Thesis Proposal
[Abstract]  2011 
Thesis Proposal Carnegie Mellon University 
Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must obtain an accurate environment model, and then plan to maximize reward. However, for complex domains, specifying a model by hand can be a time consuming process. This motivates an alternative approach: learning a model directly from observations. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning or too large and complex for planning to succeed; or, they require excessive prior domain knowledge or fail to provide guarantees such as statistical consistency. To address this gap, we propose spectral subspace identification algorithms which provably learn compact, accurate, predictive models of partially observable dynamical systems directly from sequences of actionobservation pairs. Our research agenda includes several variations of this general approach: batch algorithms and online algorithms, kernelbased algorithms for learning models in high and infinitedimensional feature spaces, and manifoldbased identification algorithms. All of these approaches share a common framework: they are statistically consistent, computationally efficient, and easy to implement using established matrixalgebra techniques. Additionally, we show that our framework generalizes a variety of successful spectral learning algorithms in diverse areas, including the identification of Hidden Markov Models, recovering structure from motion, and discovering manifold embeddings. We will evaluate our learning algorithms on a series of prediction and planning tasks involving simulated data and real robotic systems. We anticipate several difficulties while moving from smaller problems and synthetic problems to larger practical applications. The first is the challenge of scaling learning algorithms up to the higherdimensional state spaces that more complex tasks require. The second is the problem of integrating expert knowledge into the learning procedure. The third is the problem of properly accounting for actions and exploration in controlled systems. We believe that overcoming these remaining difficulties will allow our models to capture the essential features of an environment, predict future observations well, and enable successful planning. 

B. Boots  Learning Stable Linear Dynamical Systems.  2009 
M.S. Machine Learning Carnegie Mellon University 
Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored
by algorithms that learn these systems from data. We propose a novel method for learning
stable linear dynamical systems: we formulate an approximation of the problem as a convex
program, start with a solution to a relaxed version of the program, and incrementally add
constraints to improve stability. Rather than continuing to generate constraints until we
reach a feasible solution, we test stability at each step; because the convex program is only
an approximation of the desired problem, this early stopping rule can yield a higherquality
solution. We employ both maximum likelihood and subspace ID methods to the problem of
learning dynamical systems with exogenous inputs directly from data. Our algorithm is applied to a variety of problems including the tasks of learning dynamic textures from image
sequences, learning a model of laser and vision sensor data from a mobile robot, learning
stable baseline models for drugsales data in the biosurveillance domain, and learning a
model to predict sunspot data over time. We compare the constraint generation approach
to learning stable dynamical systems to the best previous stable algorithms (Lacy and Bernstein, 2002, 2003), with positive results in terms of prediction accuracy, quality of simulated
sequences, and computational eﬃciency. 

BibTeX:
@MastersThesis{Boots:Thesis:2009, author = "Byron Boots", title = "Learning Stable Linear Dynamical Systems", school = "Carnegie Mellon University", year = {2009}, month = {May}, } 

B. Boots  Robot Localization and Abstract Mapping in Dynamic Environments.  2003 
B.A. Computer Science Bowdoin College 
Abstract: In this chapter we will describe the central mechanisms that influence how people learn about largescale space. We will focus particularly on how these mechanisms enable people to effectively cope with both the uncertainty inherent in a constantly changing world and also with the high information content of natural environments. The major lessons are that humans get by with a “less is more” approach to building structure, and that they are able to quickly adapt to environmental changes thanks to a range of general purpose mechanisms. By looking at abstract principles, instead of concrete implementation details, it is shown that the study of human learning can provide valuable lessons for robotics. Finally, these issues are discussed in the context of an implementation on a mobile robot. 

BibTeX:
techreport{Boots2003b, author = "Byron Boots", title = "Robot Localization and Abstract Mapping in Dynamic Environments", institution = {Bowdoin College}, year = {2003} } 

B. Boots  Chunking: A Modified Dynamic Programming Approach to Solving Stochastic Satisfiability Problems Efficiently.  2003 
B.A. Computer Science Bowdoin College 
Abstract: The best general stochastic satisfiability solvers systematically evaluate all possible variable assignments, using heuristics to prune assignments whenever possible. The idea of chunking differs in that it breaks the solution to stochastic satisfiability problems into pieces, amounting to a modified dynamic programming approach. The benefit of this approach, as compared with the straightforward application of dynamic programming, is that the saved solutions to the problem pieces are partial solutions and thus reusable in multiple situations. 

BibTeX:
techreport{Boots2003a, author = "Byron Boots", title = "Chunking: A Modified Dynamic Programming Approach to Solving Stochastic Satisfiability Problems Efficiently", institution = {Bowdoin College}, year = {2003} } 