16-785: Integrated Intelligence in Robotics: Vision, Language, and Planning

Special Theme in Spring 2023:

Film Making using AI and Robotics

A celebrated motion picture, Loving Vincent (2017), was fully hand painted by 152 artists from around the world; its production took 7 years and cost 5.5 million dollars. In this class, we will develop and experiment with AI and robotics technologies for making films. Course topics include painted movies (robot painting), stop motion movies (manipulation), animation (text to image/video generation), aerial/ground cinematography, and more. The final projects will be premiered at the end of the semester.

Instructor: Jean Oh (jeanoh@cmu.edu)
(Please prefix the subject line with [16-785].)

TA: Peter Schaldenbrand (pschalde@andrew.cmu.edu)
(Please prefix the subject line with [16-785].)

Location: NSH 3002
Dates/Times: Monday & Wednesday, 11:00 - 12:20 (Eastern Time)
Office hours: By appointment
Spring 2023 Canvas

NOTE: This course will be in person. Since the course is discussion-intensive, synchronous attendance is required.

Course Description

This course covers topics in building cognitive intelligence for robotic systems. Cognitive capabilities constitute high-level, humanlike intelligence that exhibits reasoning or problem-solving skills. Capabilities such as semantic perception, language understanding, and task planning can be built on top of low-level robot autonomy that enables autonomous control of physical platforms. The topics generally bridge multiple technical areas, for example, the vision-language intersection and language-action/plan grounding.

This course is composed of 50% lectures and 50% seminar classes. Since this is a project-oriented course, we will put a special emphasis on learning research skills, e.g., problem formulation, literature review, ideation, evaluation planning, results analysis, and hypothesis verification.

Prerequisites: There are no explicit prerequisites for this class, but general background knowledge in AI and machine learning is assumed.

Course Goals

In this course, we will strive to answer the following research questions, among others, toward the goal of developing cognitive capabilities on robots.
  • How can we make robots perform tasks following natural language instructions?
  • How can we develop robots that can describe in natural language what they perceive through vision or explain what they are doing and why?
  • How can we fuse information coming in multiple modalities, e.g., language and images, to understand context-aware, semantic meanings of sensory data?
  • How do we measure the quality of information translated between different modalities, e.g., how do we measure the quality of language description given an image? What are the limitations and shortcomings of existing metrics?
  • How can we make use of semantic information digested from raw sensory inputs in the process of planning to solve a problem/task?
  • How do we measure the performance of computer vision algorithms outside benchmark datasets, e.g., on robots?
  • How should learned knowledge be stored? Do we need a universal representation for knowledge?
  • How can we make robots learn to improve over time, e.g., by learning new skills?
Using these research questions, we will learn to follow basic steps of conducting research through class projects.

More information on Canvas/Slack

Please check Canvas or Slack (cmu16785) for more information and updates.


Academic Integrity

We formally follow the guidelines in CMU's academic integrity policy.

Reasonable Person Principle (RPP)

We informally follow the Reasonable Person Principle (RPP), a base culture of CMU's School of Computer Science, where everyone gives and gets the benefit of the doubt for trying to be reasonable. The four rules of RPP are the following:

  • Everyone will be reasonable.
  • Everyone expects everyone else to be reasonable.
  • No one is special.
  • Do not be offended if someone suggests you are not being reasonable.

Extensions and Late Assignments

Each student will have up to 5 grace days that can be used on any homework, in any combination, without penalty. (Note that there will NOT be any extension for the final project presentation and report.) For example, you can use all 5 days on the first homework assignment, or split them into 2 and 3 days for the first and second assignments, respectively. After the 5 grace days have been used up, there will be no additional extensions; 50% will be deducted 1 day after the due date, and no points will be given after 2 days.
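
The arithmetic of the grace-day policy can be sketched in a few lines of Python. This is only an illustration, not an official grading tool: the function names `score_multiplier` and `apply_policy` are hypothetical, lateness is counted in whole days, and the sketch assumes that a homework 2 or more days late (after grace days are applied) receives no points, which is one reading of "after 2 days".

```python
GRACE_DAYS_TOTAL = 5  # total grace days per student, as stated in the policy


def score_multiplier(days_late: int, grace_used: int) -> float:
    """Return the fraction of points kept for one homework.

    days_late:  whole days past the due date (0 if on time).
    grace_used: grace days the student chooses to spend on this homework.
    """
    if days_late < 0 or grace_used < 0:
        raise ValueError("days_late and grace_used must be non-negative")
    if grace_used > days_late:
        raise ValueError("cannot spend more grace days than days late")

    penalized_days = days_late - grace_used
    if penalized_days == 0:
        return 1.0  # on time, or fully covered by grace days
    if penalized_days == 1:
        return 0.5  # 50% deducted 1 day after the due date
    return 0.0      # ASSUMPTION: 2 or more penalized days earn no points


def apply_policy(submissions: list[tuple[int, int]]) -> list[float]:
    """Apply the policy to a list of (days_late, grace_used) pairs."""
    total_grace = sum(grace for _, grace in submissions)
    if total_grace > GRACE_DAYS_TOTAL:
        raise ValueError("more than 5 grace days requested in total")
    return [score_multiplier(late, grace) for late, grace in submissions]


# The 2-and-3 split from the example above: both homeworks keep full credit.
print(apply_policy([(2, 2), (3, 3)]))
# All 5 days on the first homework, then 1 day late on the second.
print(apply_policy([(5, 5), (1, 0)]))
```

Under these assumptions, the first call keeps full credit on both assignments, while the second keeps full credit on the first homework and half credit on the second. The final project presentation and report are outside this calculation, since no extension applies to them.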