CMU Advanced Perception Seminar: Description and Syllabus

Advanced Perception Seminar Description

Bruce Maxwell, TA 1994-5

Class Format:

For 1994 and 1995, the Advanced Perception course has been taught as a graduate reading seminar, meeting once a week for 3-4 hours to discuss a set of papers covering a specific topic. The students are divided into small groups (2-4), and for a given week one of the groups is responsible for writing a set of critiques which briefly summarize and analyze the papers. All members of the class are expected to have read all the papers and all the critiques prior to arriving at class. During fall of 1995, we had 4 groups of 3 students each, and each group ended up doing 3 sessions.

On a typical day, either the instructor or another faculty member begins with a brief (~5 minutes) introduction of the topic, specifically addressing why the field is important, why they chose the particular set of papers, and some specific points to address in the discussion. For all but the first few weeks (on Physics-Based Vision), we always had another faculty member besides the instructor sit in on the class to provide experience and direction to the discussion. Getting visiting faculty from outside CMU is also an option and worked well with Rick Szeliski and Matthew Turk from Microsoft Research.

After the faculty introduction, the student reviewers for the week begin going through the papers. Typical practice for the past 2 years has been that the reviewer or reviewers for a given paper expand upon their critiques for 5-10 minutes with the purpose of leading off a discussion. The ensuing discussion can take anywhere from 20 minutes to an hour per paper. The instructor is responsible for keeping the discussion in a fruitful vein and making sure all students get a chance to participate. The instructor is also responsible for making sure that the important points are touched upon during the discussion, which will sometimes mean asking questions of the class. The instructor's role is especially important in the first few weeks when students are not used to the format. The instructor is also responsible for making sure that all the papers are covered (which sometimes means cutting off discussion and moving on). It also helps the discussion to have a TA who has taken the course before and can provide an additional voice during a lull.

One practice that has worked well is to go around the room a few times during the class and get answers from each person to a specific question. A common question we used was, "How do you rate this paper on a scale of 1-5?" These questions would often spark other discussions and helped to involve everyone in the class. For example, we would often ask the people who rated a paper either low or high to defend their position.

What Should be in a Critique?:

After two years doing this class, my comments on what makes a good critique are as follows:

1) Put the complete and correct citation for the paper on the critique.

2) Try to summarize the paper in a sentence and put this at the beginning. This can be contrasted with a sentence taken from the paper that attempts to do the same thing.

3) Give a brief summary of the paper, highlighting what is new, what is old, and what is important. Sometimes definitions or brief explanations of difficult or technical aspects of the paper are appropriate.

4) Do an analysis of the paper: is the paper important, is it written well, do the experiments back up the claims, are the results interpreted correctly, are the experiments representative of where such a method would be used, is this ground-breaking research or just a modification of something else, does the paper integrate knowledge from other fields, what background did the authors come from, what are the weaknesses of the method, what are the strengths, do the authors discuss the limitations, are all of the assumptions specified, is there baloney or laborious math, how does the paper relate to other people's work, how general is the paper, is there adequate justification for the models they use, are there any obvious extensions to the work, why didn't this paper solve the vision problem, is the method feasible, what is the new idea, are there significant typos, how does the paper compare or contrast with other papers that the class has read or is reading for that week? (Note: the last has not been emphasized much the past two years and in my opinion should be a more integral part of the course in the future.)

5) Provide a set of issues or questions to lead off a discussion. Some students have done this by asking a series of questions about the paper, whereas some have advocated very strong opinions for or against a given method. The latter, in my experience, seems to generate the more interesting discussions, but is not always possible for every paper.

Exams and Grading Policy:

This has been the one problem for this class. Class participation and critique writing should account for at least half of a student's grade. Otherwise, in the past two years there has only been a take-home final. In my opinion, there should also be a take-home mid-term. My feeling is that the exams should present the students with a task to solve and ask them to compare and contrast different solutions using methods that have been covered in class. It would also be an appropriate part of the mid-term to ask all of the students to do a critique of one paper so they can get feedback on their critique-writing skills.

Timeline:

T-2 weeks - reviewers meet with appropriate faculty member to get the set of papers and words of wisdom

T-1 week - reviewers give papers to the class

T-3 days - critiques distributed to the class

T - class - all students are expected to have read all of the papers and critiques
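
The offsets in the timeline above can be turned into calendar dates for any given class meeting. The following is a minimal sketch in Python; the helper name `seminar_deadlines`, the milestone labels, and the example class date are assumptions made for illustration and are not part of the original course materials.

```python
from datetime import date, timedelta

# Offsets before class day T, taken from the timeline above.
OFFSETS = {
    "reviewers meet faculty": timedelta(weeks=2),
    "papers given to class": timedelta(weeks=1),
    "critiques distributed": timedelta(days=3),
    "class meets": timedelta(0),
}


def seminar_deadlines(class_day: date) -> dict[str, date]:
    """Return the calendar date of each milestone for a class held on class_day."""
    return {name: class_day - offset for name, offset in OFFSETS.items()}


if __name__ == "__main__":
    # Hypothetical class date, chosen only for illustration.
    for name, day in seminar_deadlines(date(1995, 10, 16)).items():
        print(f"{day.isoformat()}  {name}")
```

Because each milestone is defined relative to T, the same table works for every week of the term; only the class date changes.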

Papers Used for Fall 1995:

Week 1: Introduction and Explanation

Introduction, explanation of class format and timeline, division into review groups, distribution of papers for week 2 to class and to reviewers. Following the business aspect of the class, we would generally spend 30-40 minutes discussing how to write a good critique.

Week 2: Reflection Models and Shape from Shading (good set of papers)

B. K. P. Horn, "Understanding Image Intensities," (Blue Books)
R. Cook and K. Torrance, "A Reflectance Model for Computer Graphics," (Blue Books)
M. Oren and S. Nayar, "A Theory of Specular Surface Geometry," in Proceedings of Int'l Conference on Computer Vision, 1995, pp. 740-747.
R. Zhang et al., "Analysis of Shape from Shading Techniques," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1994, pp. 377-383.
T. Wada, H. Ukida, and T. Matsuyama, "Shape from Shading with Interreflections under Proximal Light Source," in Proceedings of Int'l Conference on Computer Vision, 1995, pp. 66-71.
K. Ikeuchi and B. K. P. Horn, "Numerical Shape from Shading and Occluding Boundaries," Shape from Shading, ed. B. K. P. Horn and M. Brooks, MIT Press, Cambridge, 1989.
D. Forsyth and A. Zisserman, "Reflections on Shading," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 13, no. 7, July 1991, pp. 671-9.

Week 3: Color Constancy and Segmentation (good set of papers, use new Maxwell & Shafer)

H. C. Lee, "Method for computing the scene-illuminant chromaticity from specular highlights," Journal of the Optical Society of America, vol. 3, no. 10, October 1986, pp. 1694-9.
G. Healey, "Estimating spectral reflectance using highlights," Image and Vision Computing, vol. 9, no. 5, October 1991, pp. 335-9.
B. A. Maxwell and S. A. Shafer, "A Framework for Segmentation using Physical Models of Image Formation," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1994, pp. 361-8.
G. Funka-Lea and R. Bajcsy, "Combining color and geometry for the active, visual recognition of shadows," in Proceedings of Int'l Conference on Computer Vision, 1995, pp. 203-9.
R. Bajcsy, S. W. Lee, and A. Leonardis, "Color Image Segmentation with Detection of Highlights and Local Illumination Induced by Inter-reflection," in Proceedings of 10th Int'l Conference on Pattern Recognition, 1990, pp. 785-790.
Y. Ohta, T. Kanade, and T. Sakai, "Color Information for Region Segmentation," Computer Graphics and Image Processing, vol. 13, 1980, pp. 222-241.
G. Healey, "Segmenting Images using Normalized Color," IEEE Trans. on Systems, Man and Cybernetics, vol. 22, no. 1, 1992, pp. 64-73.
G. J. Klinker, S. A. Shafer, and T. Kanade, "A Physical Approach to Color Image Understanding," Int'l Journal of Computer Vision, vol. 4, no. 1, Jan. 1990, pp. 7-38.

Week 4: Photometric Stereo (good set of papers)

S. Nayar, K. Ikeuchi, and T. Kanade, "Shape from Interreflections," Int'l Journal of Computer Vision, vol. 6, no. 3, 1991, pp. 173-195.
L. B. Wolff, "Spectral and Polarization Stereo Methods using a Single Light Source," in Proceedings of Int'l Conference on Computer Vision, 1987, pp. 708-715.
R. J. Woodham, "Surface Curvature from Photometric Stereo," Technical Report, University of British Columbia, Computer Science TR 90-29, October 1990.
P. H. Christensen and L. G. Shapiro, "Three-Dimensional Shape from Color Photometric Stereo," Int'l Journal of Computer Vision, vol. 13, no. 2, 1994, pp. 213-227.
Y. Sato and K. Ikeuchi, "Temporal-color space analysis of reflection," Journal of the Optical Society of America, vol. 11, no. 11, Nov. 1994, pp. 2990-3002.

Week 5: Object, Face, and Gesture Recognition (good set of papers)

M. Swain and D. Ballard, "Color Indexing," Int'l Journal of Computer Vision, vol. 7, no. 1, 1991, pp. 11-32.
H. Murase and S. K. Nayar, "Visual Learning and Recognition of 3-D Objects from Appearance," Int'l Journal of Computer Vision, vol. 14, 1995, pp. 5-24.
D. Lee, R. Barber, and W. Niblack, "Indexing for Complex Queries on a Query-By-Content Image Database," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1994, pp. 142-146.
J. Matas, R. Marik, and J. Kittler, "On Representation and Matching of Multi-colored Objects," in Proceedings of Int'l Conference on Computer Vision, 1995, pp. 726-732.
M. Turk and A. Pentland, "Face Recognition Using Eigenfaces," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1991, pp. 586-591.
A. Bobick and A. Wilson, "A State-based Technique for the Summarization and Recognition of Gesture," in Proceedings of Int'l Conference on Computer Vision, 1995, pp. 382-388.

Week 6: Projective Geometry and Camera Calibration (these papers were too dry, choose others)

C. Rothwell et al., "Extracting Projective Structure from Single Perspective Views of 3D Point Sets," in Proceedings of Int'l Conference on Computer Vision, 1993, pp. 573-582.
*J. L. Mundy and A. Zisserman, "Towards a New Framework for Vision," Geometrical Invariance in Computer Vision, ed. J. Mundy and A. Zisserman, MIT Press, Cambridge, 1992, chapter 1.
C. Coelho et al., "An Experimental Evaluation of Projective Invariants," Geometrical Invariance in Computer Vision, ed. J. Mundy and A. Zisserman, MIT Press, Cambridge, 1992, chapter 4.
J. Ponce and D. Kriegman, "Toward 3D Curved Object Recognition from Image Contours," Geometrical Invariance in Computer Vision, ed. J. Mundy and A. Zisserman, MIT Press, Cambridge, 1992, chapter 21.
R. Mohr, L. Morin, and E. Grosso, "Relative Positioning with Uncalibrated Cameras," Geometrical Invariance in Computer Vision, ed. J. Mundy and A. Zisserman, MIT Press, Cambridge, 1992, chapter 22.
*J. L. Mundy and A. Zisserman, "Projective Geometry for Machine Vision," Geometrical Invariance in Computer Vision, ed. J. Mundy and A. Zisserman, MIT Press, Cambridge, 1992, chapter 23.
* These papers were provided as optional background reading.

Week 7: Stereo and High Performance Computing for Vision (ok, but more stereo, less computing)

P. P. Jonker, "Why Linear Arrays are Better Image Processors," Int'l Conference on Pattern Recognition, D: Parallel Computing, 1994, pp. 334-338.
R. Haralick et al., "Proteus: a reconfigurable computational network for computer vision," Machine Vision and Applications, vol. 8, no. 2, 1995, pp. 85-100.
J. A. Webb, "Latency and Bandwidth Considerations in Parallel Robotics Image Processing," in Proceedings of SUPERCOMPUTING '93, 1993, pp. 230-9.
G. Ahearn, "MaxVideo 200: A pipeline image processing architecture for performance-demanding applications," SPIE Vol. 2368, Image and Information Systems, 1994, pp. 225-8.
B. Ross, "A Practical Stereo Vision System," in Proceedings of Int'l Conference on Computer Vision, 1993, pp. 148-153.

Week 8: Model-Based Object Recognition (pretty good set, may want to switch some)

F. Stein and G. Medioni, "Structural Indexing: Efficient 3-D Object Recognition," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, Feb. 1992, pp. 125-145.
M. D. Wheeler and K. Ikeuchi, "Sensor Modeling, Probabilistic Hypothesis Generation, and Robust Localization for Object Recognition," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 17, no. 3, March 1995, pp. 252-265.
R. J. Vayda and A. C. Kak, "A Robot Vision System for Recognition of Generic Shaped Objects," CVGIP: Image Understanding, vol. 54, no. 1, pp. 1-46.
D. P. Huttenlocher and S. Ullman, "Recognizing Solid Objects by Alignment with an Image," Int'l Journal of Computer Vision, vol. 5, no. 2, 1990, pp. 195-212.
R. Basri and S. Ullman, "The Alignment of Objects with Smooth Surfaces," CVGIP: Image Understanding, vol. 57, no. 3, May 1993, pp. 331-345.
P. Flynn and A. K. Jain, "3D Object Recognition Using Invariant Feature Indexing of Interpretation Tables," CVGIP: Image Understanding, vol. 55, no. 2, March 1992, pp. 119-129.

Week 9: HCI, Gesture and Face Recognition (good set of papers, get some newer stuff as well)

J. M. Rehg and T. Kanade, "Model-Based Tracking of Self-Occluding Articulated Objects," in Proceedings of Int'l Conference on Computer Vision, 1995, pp. 612-617.
F. K. H. Quek, T. Mysliwiec, and M. Zhao, "FingerMouse: A Freehand Pointing Interface," in Proceedings of Int'l Workshop on Automatic Face and Gesture Recognition, 1995, pp. 372-7.
M. J. Black and Y. Yacoob, "Tracking and Recognizing Facial Expressions in Image Sequences using Local Parameterized Models of Image Motion," Technical Report, University of Maryland Computer Science Dept., CS-TR-3401, January 1995.
T. Darrell and A. Pentland, "Space-Time Gestures," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1993, pp. 335-340.
J. Davis and M. Shah, "Gesture Recognition," Technical Report, Univ. of Central Florida, Computer Science Dept., CS-TR-93-11.
W. T. Freeman and M. Roth, "Orientation Histograms for Hand Gesture Recognition," in IEEE Workshop on Automatic Face and Gesture Recognition, Zurich, 1995.
F. K. H. Quek, "Eyes in the Interface," Image and Vision Computing, vol. 13, no. 6, August 1995, pp. 511-525.
W. T. Freeman and C. D. Weissman, "Television control by hand gestures," in IEEE Workshop on Automatic Face and Gesture Recognition, Zurich, 1995. (optional paper)

Week 10: Physically-based Modeling and Reasoning (somewhat dry, but good topic)

D. Terzopoulos, A. Witkin, and M. Kass, "Constraints on Deformable Models: Recovering 3D Shape and Nonrigid Motion," Artificial Intelligence, 36, 1988, pp. 91-123.
L. D. Cohen, "On Active Contour Models and Balloons," CVGIP: Image Understanding, vol. 52, no. 2, March 1991, pp. 211-218.
D. Metaxas and D. Terzopoulos, "Shape and Nonrigid Motion Estimation through Physics-Based Synthesis," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 6, June 1993, pp. 580-591.
A. Pentland and S. Sclaroff, "Closed-Form Solutions for Physically Based Shape Modeling and Recognition," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 13, no. 7, July 1991, pp. 715-729.
W. Neuenschwander et al., "Deformable Velcro™ Surfaces," in Proceedings of Int'l Conference on Computer Vision, 1995, pp. 828-833.
H. Delingette, "Simplex Meshes: A General Representation for 3D Shape Reconstruction," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1994, pp. 856-860.

Week 11: Optical Flow (very good week, good format for this topic (see special note below))

(Special Note: For this week, critiques were written for the first 3 papers, and all members of the class read them. Each person in the class was also responsible for one of the remaining papers. After discussing the first 3 general papers, each student gave a 5-minute presentation of the method used in their paper and a brief summary of its performance. The class spent the remainder of the session discussing the relative strengths and weaknesses of the various methods. This class was particularly good in that it gave the students a broad understanding of research in optical flow.)

J. Bergen et al., "Hierarchical Model-Based Motion Estimation," in Proceedings of European Conference on Computer Vision, 1992, pp. 237-252.
J. Y. A. Wang and E. H. Adelson, "Layered Representation for Motion Analysis," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1993, pp. 361-366.
J. L. Barron, D. J. Fleet, and S. S. Beauchemin, "Performance of Optical Flow Techniques," Int'l Journal of Computer Vision, vol. 12, no. 1, Jan. 1994, pp. 43-77.
B. K. P. Horn and B. G. Schunck, "Determining optical flow," Artificial Intelligence, 17, 1981, pp. 185-203.
B. D. Lucas and T. Kanade, "An iterative image registration technique with an application in stereo vision," in Proceedings of Seventh International Joint Conference on Artificial Intelligence (IJCAI-81), 1981, pp. 674-679.
S. Uras, F. Girosi, A. Verri, and V. Torre, "A computational approach to motion perception," Biological Cybernetics, 60, 1988, pp. 79-87.
H. H. Nagel, "On the estimation of optical flow: Relations between different approaches and some new results," Artificial Intelligence, 33, 1987, pp. 299-324.
M. Otte and H. H. Nagel, "Optical flow estimation: advances and comparisons," in Proceedings of Third European Conference on Computer Vision (ECCV'94), vol. 1, 1994, pp. 51-60.
P. Anandan, "A computational framework and an algorithm for the measurement of visual motion," International Journal of Computer Vision, vol. 2, no. 3, January 1989, pp. 283-310.
A. Singh, "An estimation-theoretic framework for image-flow computation," in Proceedings of Third International Conference on Computer Vision (ICCV'90), 1990, pp. 168-177.
D. J. Heeger, "Optical flow using spatiotemporal filters," International Journal of Computer Vision, vol. 1, 1988, pp. 279-302.
A. M. Waxman, J. Wu, and M. Bergholm, "Convected activation profiles and receptive fields for real-time measurement of short range visual motion," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR'88), 1988, pp. 717-723.
D. Fleet and A. Jepson, "Computation of component image velocity from local phase information," International Journal of Computer Vision, vol. 5, 1990, pp. 77-104.
J. Weber and J. Malik, "Robust computation of optical flow in a multi-scale differential framework," in Proceedings of Fourth International Conference on Computer Vision (ICCV'93), 1993, pp. 12-20.
R. Szeliski and J. Coughlan, "Hierarchical spline-based image registration," in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR'94), 1994, pp. 194-201. A longer version is available as Technical Report 94/1, Digital Equipment Corporation, Cambridge Research Lab, April 1994. http://www.research.digital.com/CRL/abstracts/94.1.html
Y. Xiong and S. Shafer, "Hypergeometric filters for optical flow and affine matching," in Proceedings of Fifth International Conference on Computer Vision (ICCV'95), 1995, pp. 771-776.

Week 12: Structure from Motion (good set of papers)

H. C. Longuet-Higgins, "A computer algorithm for reconstructing a scene from two projections," Nature, vol. 293, 1981, pp. 133-135.
C. Tomasi and T. Kanade, "Shape and Motion from Image Streams under Orthography: a Factorization Method," Int'l Journal of Computer Vision, vol. 9, no. 2, 1992, pp. 137-154.
A. Azarbayejani and A. Pentland, "Recursive Estimation of Motion, Structure, and Focal Length," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 17, no. 6, June 1995, pp. 562-575.
R. I. Hartley, "Projective Reconstruction and Invariants from Multiple Images," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 16, no. 10, October 1994, pp. 1036-1041.
B. K. P. Horn and E. J. Weldon, Jr., "Direct Methods for Recovering Motion," Int'l Journal of Computer Vision, vol. 2, 1988, pp. 51-76.

CMU Computer Vision Home Pages
USC annotated vision bibliography
