16-721: Advanced Machine Perception

Spring 2006



16-721 is a graduate seminar devoted to recent research in computer vision. We will read an eclectic mix of vision papers on topics such as perception, object and scene recognition, segmentation, and tracking, as well as some of the "best papers of all time".

We will meet on Mondays and Wednesdays from 10:30am to 11:50am in NSH 3002. The first meeting will be on Monday, January 16, and the final meeting will be on Wednesday, May 3, 2006.

Instructor: Alexei (Alyosha) Efros, Assistant Professor, 4207 Newell-Simon Hall.

Office Hours: Monday 12:-12:30 p.m.

Friday 2:30-3:30 p.m.

TA: David Bradley, 2216 Newell-Simon Hall.

Office Hours: Tuesday 1:00-2:00 p.m. or by appointment.

Feel free to send email to efros (at) cs or dbradley (at) cs with any questions.


Check out this list of data sources for some ideas on where to get images to work with.

*NEW* 20-minute project meetings will be held with each group every other week: at Craig Street Coffee on Mondays and on campus on Wednesdays.


Monday (A)

Wednesday (A)

Monday (B)

Wednesday (B)

12:10 - 12:30





12:30 - 12:50

Thompson & Dunlop


Batra & Kim


12:50 - 1:10


Chan & Barnum

1:10 - 1:30



1:30 - 1:50





A list of suggested papers to present is available here.

For some journal-length papers, shorter conference versions have been posted. Feel free to read either paper.

The discussion board for signing up for papers is now available here.

Sign up for at least 2 papers, demo 1 and oppose 1.

If you want to change your presentation date, please arrange a swap with another student and notify the instructor at least two weeks in advance.



paper title




Jan. 16

Alyosha Efros

Introduction, Vision: Measurement vs. Perception

Administrative stuff, overview of the course, datasets




Jan. 18

Alyosha Efros

Overview lecture on the physiology of vision

Suggested reading: The Plenoptic Function and the Elements of Early Vision (1991)

Adelson & Bergen



Jan. 23

Alyosha Efros

Dave Thompson

Overview lecture on theories of Visual Perception

Vision is getting easier every day (1995)

What's up in top-down processing? (1991)

Pictorial art and vision (1991)

Patrick Cavanagh


Perception ppt

Cavanagh ppt

Part I: Low-level Vision (images as texture)

Jan. 25

Peter Barnum


Heather Dunlop

Presenting: The Earth Mover's Distance as a Metric for Image Retrieval. (conference version)

Optional Reading: Empirical Evaluation of Dissimilarity Measures for Color and Texture


Presenting: Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues (conference version)

Rubner, Tomasi, & Guibas

Rubner, Puzicha, Tomasi, & Buhmann


Martin, Fowlkes, Malik



Rubner ppt

Martin ppt

Jan. 30

Jonathan Huang

Statistics of Natural Image Categories

Optional Reading: Depth estimation from image structure

Optional Reading: Modeling the shape of the scene: a holistic representation of the spatial envelope

Torralba & Oliva

Torralba & Oliva

Oliva & Torralba



Feb. 1

Alyosha Efros



Presenting an overview of bag-of-words approaches:

Optional: When is scene recognition just texture recognition?

Optional: Visual categorization with bags of keypoints

Optional: Object Categorization by Learned Universal Visual Dictionary



Renninger, L.W. & Malik, J.

G. Csurka, C. Bray, C. Dance, and L. Fan

J. Winn, A. Criminisi and T. Minka



Feb. 6

David Bradley

Presenting: Object Recognition with Informative Features and Linear Classification

Optional Reading: Visual features of intermediate complexity and their use in classification

Ullman, S., Vidal-Naquet, M., and Sali, E.

Michel Vidal-Naquet, Shimon Ullman



Feb. 8

Tomasz Malisiewicz (P)

Alyosha Efros (O)

A Bayesian hierarchical model for learning natural scene categories.

Discovering Objects and their Location in Images

Fei-Fei and P. Perona

Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, Bill Freeman



Part II: Mid-level Vision (Image Segmentation)

Feb. 13-15

Carlos Vallespi (P)

Joseph Djugash (D)

Gunhee Kim (O)

Normalized cuts and image segmentation

Segmentation using eigenvectors: a unifying view

Jianbo Shi; Malik, J.

Weiss, Y.



Feb. 20

Mohit Gupta (P)

Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in ND Images

Optional: Lazy Snapping


Optional: Video Object Cut and Paste (cool SIGGRAPH video)

Boykov & Jolly

Yin Li, Jian Sun, Chi-Keung Tang, Heung-Yeung Shum

Yin Li, Jian Sun, Heung-Yeung Shum



Feb. 22



Project Proposals





Feb. 27

Derek Hoiem

Geometric Context from a Single Image

Derek Hoiem, Alexei Efros, Martial Hebert



Mar. 1

Tomasz Malisiewicz (P)

Image Segmentation by Data-Driven Markov Chain Monte Carlo

Tu and Zhu



Part III: 2D Recognition

Mar. 6 (A)

Nicolas Chan (P)

Tomasz Malisiewicz (O)

Pete Barnum (D)

Object Detection Using the Statistics of Parts

Robust Real-time Object Detection

H. Schneiderman and T. Kanade

Viola, Jones


Main ppt

Opp. ppt

Demo ppt

Mar. 8 (A)

Pete Barnum (P)

Histograms of Oriented Gradients for Human Detection

Dalal, Triggs



Mar. 13


Spring Break




Mar. 15


Spring Break




Mar. 20 (B)

David Lee (P)

Heather Dunlop (D)

David Thompson (O)

Object Recognition from Local Scale-Invariant Features

David G. Lowe


Main ppt

Demo ppt

Mar. 22 (B)

Stefan Zickler (P)

Real-time Object Detection for Smart Vehicles

Optional: Automatic Target Recognition by Matching Oriented Edge Pixels

Gavrila & Philomin

Olson & Huttenlocher


Main ppt

Mar. 27 (A)

Gunhee Kim (P)

Joseph Djugash (O)

? (D)

Shape Matching and Object Recognition Using Shape Contexts

Shape Matching and Object Recognition using Low Distortion Correspondences

Belongie, Malik, and Puzicha

A Berg, T Berg, J Malik


Main ppt

Mar. 29 (A)

Dhruv Batra (P)

Krishnan Ramnath (D)

*NEW: paper changed to a more readable version* Active Appearance Models

Optional: Active Appearance Models Revisited

Optional: Manipulating Facial Appearance Through Shape and Color

Optional: A Morphable Model for the Synthesis of 3D Faces

T. F. Cootes, G. J. Edwards, C. J. Taylor

Matthews & Baker

Rowland & Perrett

Blanz & Vetter


Main ppt

Demo (quicktime)

Recognition with Segmentation

Apr. 3 (B)

Joseph Djugash (P)

Heather Dunlop (O)

Class-Specific, Top-Down Segmentation

Learning to Segment

Combining Top-Down and Bottom-Up Segmentation

Eran Borenstein, Shimon Ullman

Eran Borenstein, Shimon Ullman

E. Borenstein, E. Sharon, S. Ullman


Opp ppt

Apr. 5 (B)

Dhruv Batra (P)

Pedestrian Detection in Crowded Scenes

B Leibe, E Seemann, B Schiele


Main ppt

Apr. 10 (A)

Nik Melchior (P)

David Lee (O)

Stefan Zickler (D)

LOCUS: Learning Object Classes with Unsupervised Segmentation

J. Winn and N. Jojic


Main (openoffice)

Opp ppt

Demo ppt

Apr. 12 (A)

David Lee (P)

Carlos Vallespi (O)

Gunhee Kim (D)

Context-based vision system for place and object recognition

Optional: Contextual Models for Object Detection using Boosted Random Fields

A. Torralba, K. P. Murphy, W. T. Freeman and M. A. Rubin


Main ppt

Opp ppt

Machine Translation Approaches

Apr. 17 (B)

Heather Dunlop (P)

Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary


Matching Words and Pictures

Pinar Duygulu, Kobus Barnard, Nando de Freitas, and David Forsyth

Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan


Main ppt

Apr. 19 (B)

Krishnan Ramnath (P)

Nicolas Chan (D)

Nicolas Chan (O)

Names and Faces

Tamara L. Berg, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White, Yee Whye Teh, Erik Learned-Miller, David A. Forsyth


Main ppt

Intrinsic Images


Apr. 21 (A)

NSH 4201

Mohit Gupta (P)

Mohit Gupta (D)


Malola Prasath (P)

David Lee (D)

Nik Melchior (O)

Deriving intrinsic images from image sequences



Recovering Intrinsic Images from a Single Image

Yair Weiss



Marshall F Tappen, William T Freeman, Edward H Adelson




Main and Demo ppt

Demo ppt

Opp (pdf)

Apr. 26 (A)

Stefan Zickler (P)

The Perception of Shading and Reflectance

Adelson & Pentland


Main ppt

Manifold Learning

May 1

Dave Thompson (P)

Jonathan Huang (O)

Nik Melchior (D)

A global geometric framework for nonlinear dimensionality reduction


Nonlinear dimensionality reduction by locally linear embedding

J. B. Tenenbaum, V. De Silva, and J. C. Langford

Sam Roweis & Lawrence Saul


Main pdf

May 3

Ramnath & Gupta

Batra & Kim

Huang & Malisiewicz

Melchior & Lee

Stefan Zickler


Project Presentations Part I





May 8

Chan & Barnum


Thompson & Dunlop

Carlos Vallespi

Project Presentations Part II






Vision Science: Photons to Phenomenology by Stephen E. Palmer
Computer Vision: A Modern Approach by Forsyth and Ponce
Introductory Techniques for 3-D Computer Vision by Trucco and Verri
An Invitation to 3D Vision: From Images to Geometric Models by Y. Ma, S. Soatto, J. Kosecka, S. Sastry
Multiple View Geometry in Computer Vision by Hartley & Zisserman
The Geometry of Multiple Images by Faugeras, Luong, and Papadopoulo
Neural Networks for Pattern Recognition by Bishop

Most recently updated on January 27, 2006 by David Bradley

Site design courtesy of Serge Belongie.