16-832 Integrated Planning and Learning

Planning has always been one of the central components in the autonomy stacks of robotic systems, ranging from self-driving vehicles and autonomous drones to mobile manipulation platforms and quadrupeds to multi-agent robotic systems. Learning-based control, such as Deep RL and Imitation Learning, has also emerged as a powerful, and potentially alternative, approach to controlling various robotic systems, especially those that need to interact with their environment. These two approaches to robot control have their own strengths and weaknesses but are often complementary. Planning provides strong guarantees on performance and safety but can be computationally intensive and depends heavily on the quality of a model, which is often hard to attain when interaction with the environment is involved. Learning-based control is often faster and removes the dependency on a model but depends heavily on the adequacy of the training data. This complementarity has led to strong interest in combining the two approaches, both within academia and industry.

This class studies the latest algorithmic approaches to integrated planning and learning. In particular, the class first examines different reasons for combining the two methodologies, from computational reasons to model dependency to the duration of the development cycle of a fielded robotic system. It then studies state-of-the-art research within each group of algorithmic approaches to integrating planning and learning, each group potentially targeting its own reasons for doing so.

In several classes, the instructor teaches the material. In the remaining classes, students present papers from a list compiled by the instructor. In addition, students have to come up with and conduct a semester-long research project in the area of integrated planning and learning. They will be presented with a set of potential domains in manipulation, multi-agent coordination, and other areas, with the corresponding codebases and test benchmarks. Students are also free to choose their own domains based on their interests. The project is expected to lead to a research paper worthy of publication at a top-tier robotics venue.

To take the class, students should have good knowledge of classical and NN-based machine learning, learning-based control (Deep RL, Imitation Learning), and planning and decision-making, as well as strong programming skills to complete the research project.

Spring 2026 Course Information

Announcements

1/25: The class on Monday 1/26 will be held over Zoom. The link was posted on Piazza.

Dates/times

Class meetings:

Mondays, Wednesdays, 12:30-1:50PM, NSH 3002

Instructor

Who	Email
Maxim Likhachev

Teaching Assistants

Who	Email
Gopal Venkitachalam

Office Hours

Who	Location	Hours
Maxim	NSH 3211	By appointment
Gopal	NSH 1612	Tue 2:30-3:30PM and Thu 5-6PM

Grading

The criteria used to compute the final grade include the quality of the research project and participation in the class including the presentation of papers:

Research project	70%
In-class participation and paper presentations	30%

Class lectures/notes:

Tentative schedule posted here (PDF)

Date	Topic	Papers	Slides	Additional Info
1/12 (Mon)	Introduction; Why Integrate Planning and Learning	-	slides	-
1/14 (Wed)	Integrating learning into planning: speeding up planning Part I	-	slides	Test domains: MAPF, Manipulation
1/19 (Mon)	MLK DAY: NO CLASS	-	-	-
1/21 (Wed)	Integrating learning into planning: speeding up planning Part I (cont'd)	-	-	-
1/26 (Mon)	Integrating learning into planning: learning cost function Part I	-	slides	-
1/28 (Wed)	Integrating learning into planning: learning goal conditions Part I	see schedule above	Open-World Task and Motion Planning, Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches	-
2/2 (Mon)	Integrating learning into planning: planning with imperfect world dynamics model Part I	see schedule above	CMAX++, SACHA: Soft Actor-Critic with Heuristic-Based Attention for Partially Observable Multi-Agent Path Finding	-
2/4 (Wed)	Integrating planning into learning: planning for learning long-horizon tasks Part I	see schedule above	Deep Skill Graphs, SPIN: distilling Skill-RRT for long-horizon prehensile and non-prehensile manipulation	-
2/9 (Mon)	Integrating planning into learning: safe task achievement in learning-based control Part I	see schedule above	Primer on Diffusion, Motion Planning Diffusion	-
2/11 (Wed)	Integrating planning into learning: learning from planning Part I	see schedule above	Planning-Guided Diffusion Policy Learning for Bimanual Manipulation, Offline Imitation Learning Through Graph Search and Retrieval	-
2/16 (Mon)	Integrating planning into learning: improving inference process Part I	see schedule above	Stream of Search (SoS): Learning to Search in Language	-
2/18 (Wed)	Project Proposal Presentations	-	-	-
2/23 (Mon)	Project Proposal Presentations (cont'd)	-	-	-
2/25 (Wed)	Integrating planning into learning: improving inference process Part I (cont'd)	see schedule above	Monte Carlo Tree Diffusion for System 2 Planning	-
3/2 (Mon)	SPRING BREAK: NO CLASSES	-	-	-
3/4 (Wed)	SPRING BREAK: NO CLASSES	-	-	-
3/9 (Mon)	Learning to plan Part I	see schedule above	Path Planning using Neural A* Search, Value Iteration Networks	-
3/11 (Wed)	Integrating learning into planning: speeding up planning Part II	see schedule above	Using VLM Reasoning to Constrain Task and Motion Planning, MOSAIC: A Skill-Centric Algorithmic Framework for Long-Horizon Manipulation Planning	-
3/16 (Mon)	Integrating learning into planning: learning cost function Part II	see schedule above	Learning Navigation Costs from Demonstration with Semantic Observations, Learning Navigation Costs from Demonstration via Differentiable Planning	-
3/18 (Wed)	Integrating learning into planning: learning goal conditions Part II	see schedule above	Fast and Accurate Task Planning using Neuro-Symbolic Language Models and Multi-level Goal Decomposition, Large Language Models as Commonsense Knowledge for Large-Scale Task Planning	-
3/23 (Mon)	Project Progress Presentations	-	-	-
3/25 (Wed)	Project Progress Presentations (cont'd)	-	-	-
3/30 (Mon)	Integrating learning into planning: planning with imperfect world dynamics model Part II	see schedule above	Learning Skills to Patch Plans Based on Inaccurate Models, Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning	-
4/1 (Wed)	Integrating planning into learning: scaling up learning-based control to long-horizon tasks Part II	see schedule above	Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments, Search on the Replay Buffer: Bridging Planning and Reinforcement Learning, Planning-Augmented Hierarchical Reinforcement Learning	-
4/6 (Mon)	Integrating planning into learning: safe task achievement in learning-based control Part II	see schedule above	Do what you say: Steering vision-language-action models via runtime reasoning-action alignment verification, Planning with Diffusion for Flexible Behavior Synthesis	-
4/8 (Wed)	Integrating planning into learning: learning from planning Part II	see schedule above	Imitating Task and Motion Planning with Visuomotor Transformers, PLAN-SEQ-LEARN: Language model guided RL for solving long horizon robotics tasks	-
4/13 (Mon)	Integrating planning into learning: improving inference process Part II	see schedule above	Parallel Heuristic Search as Inference for Actor-Critic Reinforcement Learning Models, Tree of thoughts: Deliberate problem solving with large language models	-
4/15 (Wed)	Learning to plan Part II	see schedule above	Beyond A*: Better planning with transformers via search dynamics bootstrapping	-
4/20 (Mon)	Final Project Presentations	-	-	-
4/22 (Wed)	Final Project Presentations (cont'd)	-	-	-