CMU Advanced Perception Seminar, Spring 1999
 Table of Contents
- Class Format 
- What Should be in a Critique? 
- Grading Policy   
- Computer Vision Resources
- Overview of topics, by week     
- Week 1. Introduction and Explanation     
- Week 2. Edge Extraction    
- Week 3. Region/Volume Segmentation     
- Week 4. Active Contours    
- Week 5. Object Recognition    
- Week 6. Volumetric Registration    
- Week 7. Projective Geometry     
- Week 8. Symmetry and Perception     
- Week 9. Stabilization And Mosaicing     
- Week 10. Egomotion and Structure from Motion   
- Week 11. New View Synthesis
- Week 12. Range Imaging   
- Week 13. Auditory Sensing    
 
 Class Format
The Advanced Perception course is a graduate reading seminar, meeting once a week to discuss a 
set of papers covering a specific topic in computer vision and perception.  We will look at historically important papers in the field, as well as current papers from recent conferences and journals.  By 
reading a mixture of both types of papers, we will be able to trace the development of some of the 
fundamental ideas that make up current-day research.
Each week, two papers on a particular topic will be assigned. After reading them, you must find a 
third paper on your own that is relevant to the topic (for example, in Week 2 you will find a paper 
on edge extraction, published in a conference proceedings or archival journal).  Finally, you will 
write a short critique/essay (3-4 pages) on the topic area based on the three papers you have read.  
This essay will be handed in for grading.  During class, each of the two assigned papers will be 
presented by one of the students (one student per paper, assigned the week before).  This is 
expected to be a formal 20-minute presentation in front of the class, using transparencies.  The 
presentation will then evolve into a class discussion on the topic covered in the paper. The instructors are responsible for keeping the discussion in a fruitful vein and making sure all students get a 
chance to participate. The instructors are also responsible for making sure that the important 
points are touched upon during the discussion, which will sometimes mean asking questions of 
the class, and for making sure that each paper is covered (which sometimes means cutting off discussion and moving on).   
At the end of the class,  we will go around the room asking each of you to cite the third paper you 
have personally chosen for that week, very briefly describe it (1 minute),  tell us why you picked 
it (i.e. how does it relate to the topic area and the two assigned papers), and finally whether or not  
you would recommend that paper for others to read.
 What Should be in a Critique?
The critiques you write will provide a short summary and analysis of the technical papers you 
have read each week.  Critique writing is an important component of the class, and serves several 
goals: to give you practice in technical writing, to concretely organize your ideas in preparation 
for class discussion, and to develop the skills necessary to become a good conference/journal 
paper referee.  Furthermore, getting in the habit of writing critiques of the papers you have read 
will help you do better research - a good critique provides a concise summary that you can refer to 
later without having to dig out and read the original work, and can provide a written starting point 
for the obligatory literature review section of your own papers/thesis. To get a 
sense of what goes into a critique, see the handout `The Task of the Referee,' by Alan Jay Smith, 
particularly the section entitled `Evaluating a Research Paper.' 
We have found that it is helpful to us, when grading critiques, to have them all follow a consistent 
format.  We ask you to hand in critiques with roughly the following sections (in this order):
1. Reviewer: your name and the date
2. Citation: the title, authors, year, and publication citation of each of the three papers you are reviewing
3. A one-paragraph summary (abstract) of the topic area.  Why is it important?  
4. A short overview of each paper including a) key ideas, b) technical approaches and c) results.
5. Comparison of the papers, including strong points and weak points of each.  How would you 
rank each paper relative to the others?
6. Questions and issues
We will grade critiques on a three-level scale: check-minus, check, check-plus. Above-average 
resourcefulness, initiative, creativity and depth of analysis will get a check-plus. Missing any 
required section (1-6), or an obvious lack of effort on any of them, results in a check-minus. 
Pay attention to your speling and grammar of English.      :-)
 Grading Policy
You will be graded on the following items:
| 1. Written Critiques | (40%) | 
| 2. Oral Presentations | (20%) | 
| 3. Class Participation | (20%) | 
| 4. Take-Home Final | (20%) | 
| 5. Extra Credit | (10%) | 
|  | -------- | 
|  | 110% total | 
Written critiques form the highest-weighted category, as they represent the bulk of the work that 
you will be performing (aside from reading the papers themselves). Each critique will be graded 
based on your demonstration that you know what that week's papers are about and have carefully 
considered their technical approaches and reported results.  We are  particularly interested in how 
well you compare and contrast the three papers that you read that week.
Oral presentation refers to the formal presentation of a paper in front of the class.   Depending on 
class size, you will give roughly two or three oral paper presentations during the semester.  To 
make it more like a real conference presentation, your talk will be strictly timed to be 20 minutes 
long.  We suggest you carefully organize and prioritize what you want to say, and maybe even 
practice it once with a watch.
Class participation is rather hard to judge objectively (but we are going to try).  We highly encourage you to participate in class discussion, and indeed, this type of class will be a complete failure 
if people don't speak up with their opinions.  On the other hand, we don't wish to penalize folks 
who aren't naturally talkative.  We will try to ensure that even soft-spoken people get a chance to 
air their opinions, and will attempt to grade based on the insightfulness of your comments, rather 
than the frequency or volume.
There will be a take-home final exam.  It will involve writing!
The extra credit category will reflect both objective evidence and subjective impressions we 
receive that indicate you are genuinely putting in a lot of effort. Anything you do (of a professional nature, related to this class) that makes us like you better will increase your extra credit 
score.
 Computer Vision Resources
There are many places to go to look for computer vision papers, ranging from archival journals to 
on-line web sites.  Here is a list of our favorite sources of material:
Archival Journals
- International Journal of Computer Vision (IJCV)
- Computer Vision and Image Understanding (CVIU), formerly Computer Vision, Graphics and Image Processing (CVGIP)
- IEEE Trans on Pattern Analysis and Machine Intelligence (PAMI)
- Image and Vision Computing (IVC)
- Pattern Recognition (PR)
Conference Proceedings
- International Conference on Computer Vision (ICCV)
- Computer Vision and Pattern Recognition (CVPR)
- European Conference on Computer Vision (ECCV)
- DARPA Image Understanding Workshop (IUW)
WWW Resources
 Overview of Topics by Week   (Selections subject to change)
Week 1: Introduction and Explanation
Introduction; explanation of class format and logistics.  Instructors talk about computer vision 
resources, and why particular papers were selected for this course.  Discussion of how to write a 
critique, give a presentation, and find relevant research papers.
Week 2. Feature Extraction I: Edge Extraction 
(Reminder: read these two and also find a third related paper on your own.)
- E.C.Hildreth, `The Detection of Intensity Changes by Computer and Biological Vision Systems,' 
Computer Vision, Graphics and Image Processing, Vol. 27, 1983, pp.1-27. 
- J.F.Canny, `A Computational Approach to Edge Detection,' IEEE Trans. on Pattern Analysis 
and Machine Intelligence, Vol.8(6), November 1986, pp.679-698. 
Week 2 Third Papers (selected by the students):
- C. Harris and B. Buxton, `Low-level Edge Detection Using Genetic Programming: Performance, 
Specificity, and Application to Real-World Signals,' University College London 
Tech Report RN/97/34, June 1997.  -- This paper describes how genetic programming can be used 
to evolve a set of edge detectors specific to a training dataset.  These detectors are shown 
to outperform both theoretically optimal detectors and other evolved detectors.
- P. Perona and J. Malik, `Scale-Space and Edge Detection Using Anisotropic Diffusion,' IEEE 
Trans. on PAMI, Vol.12(7), July 1990. -- This paper presents a global approach to edge 
detection which formulates edge detection as a diffusion process and attempts to find 
edges via global deformation of the image rather than local sliding-window operations.  
Second summary: Instead of detecting edges locally, this paper approaches the problem 
globally. It views convolution with a Gaussian as the solution of a heat conduction/diffusion equation. The approach fixes many of the shortcomings of convolution-based and 
Canny edge detectors; however, the computational cost is higher on a sequential machine.
- Asada et al., `Edge and Depth from Focus,' IJCV, 26(2), 1998, pp. 153-163.  -- Edges are 
extracted by observing the blurring in an image when a series of defocusing operations 
is deliberately introduced.
- Y. Lu and R. C. Jain, `Reasoning about Edges in Scale Space,' IEEE Trans. on Pattern Analysis 
and Machine Intelligence, Vol.14(4), April 1992.  --  RESS is a method of integrating 
edges from multiple scales of the LoG edge operator using a knowledge base of the behavior of edges at different scales.
- D. Demigny and T. Kamle, `A Discrete Expression of Canny's Criteria for Step Edge Detector 
Performances Evaluation,' IEEE Transactions on Pattern Analysis and Machine Intelligence, 
Vol.19(11), November 1997, pp. 1199-1211. -- Since all filters are implemented in 
the discrete domain, this paper proposes three criteria (similar to Canny's three criteria) to 
directly optimize filters in the discrete domain; the paper also shows that optimizing the 
three discrete-domain criteria yields better results than those obtained by sampling the 
optimized Canny filter.
- MIT AI Lab memo: AI Memo 773, April 1984.
- J.B. Burns, A.R.Hanson and E.M.Riseman, `Extracting Straight Lines,' IEEE Trans. on Pattern 
Analysis and Machine Intelligence, Vol.8(4), July 1986, pp.425-455.  --  This paper presents an approach for the extraction of straight lines in intensity images. It starts at the 
level of lines directly without going through the intermediate stage of first detecting local 
edges.  The authors argue that this overcomes the difficulties encountered in aggregation when 
using local operators.
- A. A. Farag and E. J. Delp, `Edge Linking By Sequential Search,' Pattern Recognition, Vol. 
28(5), 1995, pp. 611-633.  -- Considering edge detection as a two-stage process (edge 
enhancement followed by edge linking), more focus should be given to the edge linking 
process than Canny gave it in his detector. The paper by Farag and Delp uses a Laplacian 
of Gaussian operator for edge enhancement and A* (or Stack) search with a mathematically sound heuristic for edge linking.
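As a rough illustration of the diffusion idea in the Perona and Malik summaries above, the following Python sketch smooths a 1-D signal with an edge-stopping conductance. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `perona_malik_1d`, the parameters `kappa` and `step`, the 1-D setting, and the zero-flux boundary handling are all choices made here for illustration. The exponential conductance is one of the forms proposed in the paper.

```python
import math


def perona_malik_1d(signal, iterations=20, kappa=10.0, step=0.2):
    """Edge-preserving smoothing of a 1-D signal (illustrative sketch).

    kappa (hypothetical name) sets the gradient magnitude above which
    diffusion is suppressed; step is the explicit time step (kept below
    0.5 so the 1-D update stays stable).
    """
    u = [float(v) for v in signal]
    for _ in range(iterations):
        nxt = u[:]
        for i in range(len(u)):
            # Differences to the two neighbours; zero flux at the ends.
            left = u[i - 1] - u[i] if i > 0 else 0.0
            right = u[i + 1] - u[i] if i < len(u) - 1 else 0.0
            # Conductance is close to 1 for small differences and
            # close to 0 across large jumps (edges).
            c_left = math.exp(-((left / kappa) ** 2))
            c_right = math.exp(-((right / kappa) ** 2))
            nxt[i] = u[i] + step * (c_left * left + c_right * right)
        u = nxt
    return u
```

With a noisy step such as `[0, 1, 0, 1, 0, 100, 101, 100, 101, 100]`, the small fluctuations on each side are smoothed out while the large jump in the middle is left almost untouched, because the conductance is near zero there.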
 Week 3. Feature Extraction II:  Region/Volume Segmentation
- T.Kapur, W.E.Grimson, W.Wells and R.Kikinis, `Segmentation of Brain Tissue from Magnetic Resonance Images,' 
Medical Image Analysis, Vol.1(2), 1996, pp. 109-127.
- B.Maxwell and S.Shafer, `Physics-Based Segmentation of Complex Objects using Multiple 
Hypotheses of Image Formation,' Computer Vision and Image Understanding, Vol.65(2), 
Feb 1997, pp.269-295.
Week 3 Third Papers (selected by the students):
- V. Rehrmann and L. Priese, `Fast and Robust Segmentation of Natural Color Scenes,' 
Proceedings of the Third Asian Conference on Computer Vision, Hong Kong, Jan 1998. -- This 
paper describes the CSC (Color Structure Code) algorithm for performing real-time segmentation of color images. Images are represented with hexagonal connectivity using a 
hierarchical tree structure.  Regions are created by color similarity comparisons of local 
elements, with provision for later splitting regions that prove to be dissimilar at a global 
level of analysis.
- M.A. Gonzalez Ballester, A. Zisserman, and J.M. Brady, `Measurement of Brain Structures 
based on Statistical and Geometrical 3D Segmentation,' MICCAI'98. To appear.  -- This 
paper presents a method for three-dimensional segmentation and measurement of volumetric data based on the combination of statistical and geometrical information. The shape 
of complex three-dimensional structures, such as the cortex, is represented by combining a 
discrete 3D simplex mesh with the construction of a smooth surface using triangular Gregory-Bezier patches. Confidence bounds are produced for all the measurements, thus 
obtaining bounds on the position of the surface segmenting the image. 
- T. Uchiyama and M. A. Arbib, `Color Image Segmentation Using Competitive Learning,' 
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.16(12), Dec.1994, 
pp.1197-1206. -- This paper deals with the problem of colour image segmentation; clusters of the same colour are identified using competitive learning, thereby producing the 
least sum of squares solution.   
- T. Leung and J. Malik, `Contour Continuity in Region-Based Image Segmentation,' Fifth 
Euro. Conf. on Computer Vision, Freiburg, Germany, June 1998.  -- The paper takes into 
account contour continuity, in addition to intensity, color and texture, to determine the partitioning of an image. The `soft' contours of the image are first detected using elongated filters 
and the Hilbert transform, yielding an `orientation energy' measure. The orientation 
energy is used as a basis for propagating contours. Afterward, the regions are segmented 
using the normalized cut approach.
- B. Leroy, I.L. Herlin, L.D. Cohen, `Multi-Resolution Algorithms for Active Contour Models,' 
Proceedings of the 12th International Conference on Analysis and Optimization of 
Systems, Images, Wavelets and PDE's, Rocquencourt (France), 1996.  -- 
The paper attempts to speed up active contour models (the balloons) by going multi-resolution using two separate methods. The first uses multi-resolution data; the second 
incorporates multi-resolution into the model itself (by using elliptic Fourier harmonics).
 Week 4. Feature Extraction III: Active Contours
- M.Kass, A.Witkin, D.Terzopoulos, `Snakes: Active Contour Models,' International Journal 
of Computer Vision, Vol.1(4), January 1988, pp. 321-331. 
- A.Pentland and S.Sclaroff, `Closed-Form Solutions for Physically Based Shape Modeling 
and Recognition,' IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 13, no. 
7, July 1991, pp. 715-729.
Week 4 Third Papers (selected by the students):
- F. Leymarie and M. Levine, `Tracking Deformable Objects in the Plane Using an Active Contour Model,' 
IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 6, 
June 1993, pp. 617--634.  -- This paper suggests improvements on the original snake active 
contour model (Kass, Witkin, Terzopoulos): 1) a different terminating criterion to improve 
convergence, 2) selection of bounds on parameters to prevent oscillation, and 3) initialization 
using a sequence of hierarchical discrete correlations (Burt and Adelson Laplacian pyramid).  Active contours, along with the proposed modifications, are used to track the movement of cells on microscope slides.
- S.C. Zhu, T.S. Lee, A.L. Yuille, `Region Competition: Unifying Snakes, Region Growing, Energy/Bayes/MDL for Multi-band Image Segmentation,' 
Proceedings of the Fifth ICCV, pp. 416-425, 
1995. -- The new Region Competition algorithm combines the best features of 
snakes/balloons, region growing, and Bayes/MDL. It allows regions to compete for 
pixels along their boundaries. The likelihood of membership in a region is determined using statistical properties.
- Michael Isard and Andrew Blake, `Contour Tracking by Stochastic Propagation of Conditional 
Density,' Proc. European Conference on Computer Vision, vol. 1, pp. 343--356, Cambridge UK, (1996).  -- The paper proposes a stochastic algorithm (the Condensation algorithm) for tracking curves in dense visual clutter. It uses `factored sampling', a 
method previously applied to interpretation of static images, in which the distribution of 
possible interpretations is represented by a randomly generated set of representatives. The 
algorithm combines factored sampling with learned dynamical models to propagate an 
entire probability distribution for object position and shape over time. The result is highly 
robust real-time tracking of agile motion in clutter. This is a clearly written paper with a good 
explanation of the proposed technique; it contains experimental results and a complexity 
analysis. Not surprisingly it won the best paper award.
- A. Hoover, D. Goldgof, K. W. Bowyer, `Extracting a Valid Boundary Representation from a 
Segmented Range Image,' IEEE Trans. on Pattern Analysis and Machine Intelligence, 
vol.17 no.9, September 1995, pp. 920-925.  -- This paper addresses the problem of creating boundary representations (b-reps) of polyhedral shapes by using topological and geometric information, including a hypothetical representation of the non-visible 
section of the object. 
Week 5. Object Recognition
- D.P.Huttenlocher and S.Ullman, `Recognizing Solid Objects by Alignment with an Image,' 
Int'l Journal of Computer Vision, vol. 5(2), 1990, pp. 195-212.
- H.Murase, and S.K.Nayar, `Visual Learning and Recognition of 3-D Objects from Appearance,' 
Int'l Journal of Computer Vision, vol. 14, 1995, pp. 5-24.
Week 5 Third Papers (selected by the students):
- C.S. Chua and R. Jarvis.   `Point Signatures:  A New Representation for 3D Object Recognition.'  
International Journal of Computer Vision, 
25(1), 63-85 (1997).  -- A point signature is a 1D feature curve that describes the undulation of the 3D object surface local to a 
point of interest, a collection of which facilitates the recognition of 3D free form objects.
- P. Viola, `Complex Feature Recognition: A Bayesian Approach for Learning to Recognize 
Objects,' MIT AI Lab Tech Report 1591.  -- This paper describes a Bayesian approach 
for extracting complex object features that are less affected by illumination and pose 
changes.  Since each feature captures a greater area of a scene, the correspondence problem between model and image is reduced as well.
- C.F. Olson and D.P.Huttenlocher, `Automatic Target Recognition by Matching Oriented Edge 
Pixels,' IEEE Trans. on Image Processing, 6(1):103-113, January 1997. -- The paper 
defines oriented edge pixels by taking x, y, and delta (which is either the direction of the 
gradient, edge normal or tangent). A modified Hausdorff measure, which measures the 
maximum distance and orientation of nearest points, is used to provide a closeness 
measure. Only K of the pixels (not all) are matched, to account for occlusion and noise. 
The 3-D models (and multiple models) are organized in a hierarchical way based on similarity (so if you have two similar models, you will create a parent having the intersection of 
the models).  Recognition is done by computing the Hausdorff distance between the 
image and the models. Additionally, the probability of a false alarm is computed using a Markov process, for both predicted and observed false alarms.
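The matching measure in the Olson and Huttenlocher summary above can be illustrated with a small Python sketch of a K-th ranked (partial) directed Hausdorff distance over oriented points. This is a sketch under assumptions, not the paper's implementation: the function names, the way position and orientation differences are combined (a maximum, with a hypothetical weight `alpha`), the treatment of orientation as an angle modulo pi, and the brute-force nearest-neighbor search are all chosen here for illustration and may differ from the paper.

```python
import math


def oriented_distance(p, q, alpha=1.0):
    """Distance between two oriented points (x, y, theta in radians).

    Assumption: combine position and orientation by taking the larger of
    the Euclidean distance and the alpha-weighted orientation difference.
    """
    (px, py, pt), (qx, qy, qt) = p, q
    d_pos = math.hypot(px - qx, py - qy)
    # Orientation is treated as undirected, so angles wrap at pi.
    d_ang = abs(pt - qt) % math.pi
    d_ang = min(d_ang, math.pi - d_ang)
    return max(d_pos, alpha * d_ang)


def partial_hausdorff(model, image, k, alpha=1.0):
    """K-th smallest nearest-neighbor distance from model points to image.

    Ranking instead of taking the maximum lets up to len(model) - k model
    points go unmatched, which tolerates occlusion and noise.
    """
    if not image or not 1 <= k <= len(model):
        raise ValueError("need a non-empty image set and 1 <= k <= len(model)")
    nearest = sorted(
        min(oriented_distance(m, i, alpha) for i in image) for m in model
    )
    return nearest[k - 1]
```

For example, if three of four model points coincide with image points and the fourth is far from all of them, `partial_hausdorff(model, image, 3)` is zero while `partial_hausdorff(model, image, 4)` is large, which is the behavior that lets a partially occluded target still match.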
Week 6. Volumetric Registration
- R.Bajcsy and S.Kovacic, `Multiresolution Elastic Matching,' Computer Vision, Graphics and 
Image Processing, Vol.46, 1989, pp.1-21.
- P.A.Viola and W.Wells, `Alignment by Maximization of Mutual Information,' International 
Journal of Computer Vision, Vol.24(2), September 1997, pp. 137-154.
Week 7. Projective Geometry
- J.B.Burns, R.S.Weiss and E.M.Riseman, `The Non-Existence of General-Case View-Invariants,' 
Geometrical Invariance in Computer Vision, ed. J. Mundy and A.Zisserman, MIT 
Press, Cambridge, 1992, pp.120-131.
The following two papers will be treated as one, for the purposes of critiquing/presenting:
- H.C.Longuet-Higgins, `A Computer Algorithm for Reconstructing a Scene from Two Projections,' 
Nature, vol 293, 1981, pp. 133-135.
- R.Hartley, `In Defense of the 8-point Algorithm,' 
IEEE Trans on Pattern Analysis and 
Machine Intelligence, 19(6),  June 1997, pp. 580-593.
Week 8. Symmetry and Perception
- F.Ulupinar and R.Nevatia, `Constraints for Interpretation of Line Drawings under Perspective 
Projection,' 
CVGIP: Image Understanding, 
Vol. 53(1), 1991, pp.88-96.
The following two papers will be treated as one, for the purposes of critiquing/presenting:
- H.Zabrodsky, S.Peleg and D.Avnir, `Symmetry as a Continuous Feature,' 
IEEE Transactions 
on Pattern Analysis and Machine Intelligence, 
vol 17(12), 1995, pp.1154-1165.
- K.Kanatani, `Comments on `Symmetry as a Continuous Feature',' 
IEEE Transactions on Pattern Analysis and Machine Intelligence, 
vol 19(3), 1997, pp. 246-247.
Week 9. Stabilization And Mosaicing
- J.Bergen et al., `Hierarchical Model-Based Motion Estimation,' in Proceedings of European 
Conference on Computer Vision, 1992, pp. 237-252.
- H.Shum and R.Szeliski, `Construction and Refinement of Panoramic Mosaics with Global 
and Local Alignment,' International Conference on Computer Vision, Bombay, India, 
Jan.1998, pp. 953-958. 
Week 10. Egomotion and Structure from Motion
- C.Tomasi and T.Kanade, `Shape and Motion from Image Streams under Orthography: a Factorization Method,' 
Int'l Journal of Computer Vision, Vol. 9(2), 1992, pp. 137-154.
- J.L.Barron, D.J.Fleet, and S.S.Beauchemin, `Performance of Optical Flow Techniques,' 
Int'l 
Journal of Computer Vision, vol. 12, no. 1, Jan. 1994, pp. 43-77.
Week 11. New View Synthesis
- L.McMillan and G.Bishop, `Plenoptic Modeling: An Image-Based Rendering System,' 
Proc. 
SIGGRAPH, 
1995, pp.39-46.
- S.Gortler, R.Grzeszczuk, R.Szeliski and M.Cohen, `The Lumigraph,' 
Proc. SIGGRAPH, 
1996, pp.43-54.
Week 12. Range Imaging
- A.Johnson and M.Hebert, `Surface Matching for Object Recognition in Complex Three-Dimensional Scenes,' 
Image and Vision Computing, 
Vol.16, 1998, pp.635-651.
- P.Besl and N.McKay, `A Method for Registration of 3-D Shapes,' 
IEEE Trans on Pattern 
Analysis and Machine Intelligence (PAMI), Vol. 14(2), 1992, pp.239-256.
Week 13. Auditory Sensing