Suggested Papers for 16-721: Advanced Machine Perception

Spring 2006

*NEW*  The discussion board for signing up for papers is now available here *NEW*




  1. Adelson & Bergen, The Plenoptic Function and the Elements of Early Vision


  1. Cavanagh, P. (1995). Vision is getting easier every day. Perception, 24, 1227-1232.


  1. Cavanagh, P. (1991). What's up in top-down processing? In A. Gorea (ed.) Representations of Vision: Trends and Tacit Assumptions in Vision Research, Cambridge, UK: Cambridge University Press, 295-304.


  1. Cavanagh, P. (1999). Pictorial art and vision. In Robert A. Wilson and Frank C. Keil (Eds.), MIT Encyclopedia of Cognitive Science, (pp. 648-651) Cambridge, MA: MIT Press.






Part I: Low-level Vision (images as texture)


  1. Olshausen & Field, Wavelet-like receptive fields emerge from a network that learns sparse codes for natural images. (1996) Nature, 381: 607-609. (code available)


  1. Y. Rubner and C. Tomasi and L. J. Guibas. The Earth Mover's Distance as a Metric for Image Retrieval. International Journal of Computer Vision, 40(2) November 2000, pages 99--121. (code available)
  2. Y. Rubner, J. Puzicha, C. Tomasi, and J. M. Buhmann. Empirical Evaluation of Dissimilarity Measures for Color and Texture. Computer Vision and Image Understanding Journal, 84(1):25-43, October 2001.
    • Advocate: Peter Barnum


  1. Martin, Fowlkes, Malik, Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues.  IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5):530-549, May 2004. (short version) (code and data available)
    • Advocate: Heather Dunlop


Scene Models


  1. A. Torralba and A. Oliva. Statistics of Natural Image Categories (2003) Network: Computation in Neural Systems. Vol. 14, 391-412.
  2.  A. Torralba, A. Oliva.  Depth estimation from image structure (2002) IEEE Transactions on Pattern Analysis and Machine Intelligence. 24(9): 1226-1238.
  3. A. Oliva, A. Torralba.  Modeling the shape of the scene: a holistic representation of the spatial envelope. (2001) International Journal of Computer Vision, Vol. 42(3): 145-175.
    • Advocate: Jonathan Huang



"Bag of Words" Models


  1. Renninger, L.W. & Malik, J. (2004).  When is scene recognition just texture recognition?  Vision Research, 44, 2301-2311 (data available)
  2. G. Csurka, C. Bray, C. Dance, and L. Fan. Visual categorization with bags of keypoints. In Workshop on Statistical Learning in Computer Vision, ECCV, pages 1-22, 2004.
  3. J. Winn, A. Criminisi and T. Minka.  Object Categorization by Learned Universal Visual Dictionary.  Proc. IEEE Intl. Conf. on Computer Vision (ICCV), Beijing 2005.
    • To be briefly covered by Alyosha Efros


  1. Ullman, S., Vidal-Naquet, M., and Sali, E.  Visual features of intermediate complexity and their use in classification. (2002) Nature Neuroscience, 5(7), 1-6
  2. Michel Vidal-Naquet, Shimon Ullman. Object Recognition with Informative Features and Linear Classification. ICCV 2003
    • Advocate: David Bradley


  1. L. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, volume 2, pages 524-531, June 2005. (code available)
    • Advocate: Tomasz Malisiewicz
    • Demo: Ellie Lin
  1. Josef Sivic, Bryan Russell, Alexei A. Efros, Andrew Zisserman, Bill Freeman, Discovering Objects and their Location in Images, ICCV 2005 (code available)
    • Advocate: Tomasz Malisiewicz


Part II: Mid-level Vision (Image Segmentation)


  1. Max Wertheimer, Laws of Organization in Perceptual Forms (1923)


  1. Jianbo Shi; Malik, J. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, Aug. 2000, vol.22, (no.8):888-905. (code available)
    • Advocate: Carlos Vallespi
    • Demo: Joseph Djugash


  1. Meila, M. and Shi, J. Learning Segmentation with Random Walks. Advances in Neural Information Processing Systems 13 (NIPS 2000).


  1. Weiss, Y. Segmentation using eigenvectors: a unifying view.  Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20-27 Sept. 1999.
    • Advocate: Carlos Vallespi


  1. Andrew Y. Ng, Michael I. Jordan, Yair Weiss, On Spectral Clustering: Analysis and an algorithm (2001) NIPS


  1. Xiaofeng Ren and Jitendra Malik, Learning a Classification Model for Segmentation. in ICCV '03 (superpixel code available)


  1. Tu and Zhu, Image Segmentation by Data-Driven Markov Chain Monte Carlo, PAMI (2002)
    • Advocate: Tomasz Malisiewicz


  1. D. Comaniciu, P. Meer.  Mean Shift: A Robust Approach toward Feature Space Analysis, IEEE Trans. Pattern Analysis Machine Intell., Vol. 24, No. 5, 603-619, 2002


  1. Boykov & Jolly, Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images. ICCV 01
  2. Yin Li; Jian Sun; Chi-Keung Tang; Heung-Yeung Shum, Lazy Snapping, SIGGRAPH 04
    • Advocate: Mohit Gupta


Part III: 2D Recognition


Window Scanning Approaches


  1. H. Schneiderman and T. Kanade.  Object Detection Using the Statistics of Parts.  International Journal of Computer Vision, 2004 (demo available)
  2. Viola, Jones, Robust Real-time Object Detection (2001) Second International Workshop on Statistical and Computational Theories of Vision (short version)
    • Advocate: Nicolas Chan
    • Opposition: Tomasz Malisiewicz
    • Demo: Pete Barnum


  1. Dalal, Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005 (data available)
    • Advocate: Pete Barnum


Correspondence Matching Approaches


  1. Gavrila & Philomin, Real-time Object Detection for Smart Vehicles, ICCV 1999
    • Advocate: Stefan Zickler
    • Oppose: ?


  1. Olson & Huttenlocher. Automatic Target Recognition by Matching Oriented Edge Pixels, IEEE Transactions on Image Processing 1997


  1. Erik Learned-Miller, (2005) Data Driven Image Models through Continuous Joint Alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).  (code available)


  1. Belongie, Malik, and Puzicha. Shape Matching and Object Recognition Using Shape Contexts (2002)
    • Advocate: Carlos Vallespi
    • Demo: Carlos Vallespi


  1. A Berg, T Berg, J Malik, Shape Matching and Object Recognition using Low Distortion Correspondences, CVPR 2005
    • Advocate: Gunhee Kim
    • Opposition: Joseph Djugash


  1. M. Leordeanu and M. Hebert, A Spectral Technique for Correspondence Problems using Pairwise Constraints, ICCV 2005
    • Demo: Dhruv Batra


  1. David G. Lowe, Object Recognition from Local Scale-Invariant Features, ICCV 1999
    • Advocate: David Lee
    • Demo: Heather Dunlop


  1. Fitzgibbon, A. W. and Zisserman, A. On Affine Invariant Clustering and Automatic Cast Listing in Movies, ECCV 2002


  1. T. F. Cootes, G. J. Edwards, C. J. Taylor, Active Appearance Models, PAMI 2001


Recognition with Segmentation


  1. Eran Borenstein, Shimon Ullman. Class-Specific, Top-Down Segmentation. ECCV 2002
  2. Eran Borenstein, Shimon Ullman: Learning to Segment. ECCV 2004
  3. E. Borenstein, E. Sharon, S. Ullman, Combining Top-Down and Bottom-Up Segmentation, Proceedings IEEE workshop on Perceptual Organization in Computer Vision, IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, June 2004.
    • Advocate: Joseph Djugash
    • Opposition: Heather Dunlop


  1. Xiaofeng Ren, Charless Fowlkes and Jitendra Malik, Cue Integration for Figure/Ground Labeling. (2005) NIPS


  1. Stella X. Yu and Jianbo Shi, Object-Specific Figure-Ground Segregation, CVPR 2003


  1. B Leibe, E Seemann, B Schiele, Pedestrian Detection in Crowded Scenes, CVPR 2005


  1. J. Winn and  N. Jojic. LOCUS: Learning Object Classes with Unsupervised Segmentation, Proc. IEEE Intl. Conf. on Computer Vision (ICCV), Beijing 2005.
    • Advocate: Nik Melchior
    • Opponent: David Lee


  1. Z Tu, X Chen, AL Yuille, SC Zhu.  Image Parsing: Unifying Segmentation, Detection, and Recognition.  International Journal of Computer Vision, 2005


  1. A. Torralba,  K. P. Murphy, W. T. Freeman and M. A. Rubin, Context-based vision system for place and object recognition, ICCV 2003
    • Advocate: David Lee
    • Demo: Gunhee Kim


  1.  A. Torralba, K. P. Murphy and W. T. Freeman (2004), Contextual Models for Object Detection using Boosted Random Fields. To appear in Adv. in Neural Information Processing Systems (NIPS)


Words and Pictures


  1. Tamara L. Berg, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White, Yee Whye Teh, Erik Learned-Miller, David A. Forsyth. Names and Faces. in submission
    • Advocate: Krishnan Ramnath


  1. Pinar Duygulu, Kobus Barnard, Nando de Freitas, and David Forsyth.  Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. ECCV 2002.
    • Advocate: Heather Dunlop


  1. Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan, Matching Words and Pictures. Journal of Machine Learning Research, 2003.


Part IV: Intrinsic Images


  1. HG Barrow, JM Tenenbaum, Recovering Intrinsic Scene Characteristics from Images, 1978 (classic paper!)


  1. Adelson & Pentland, The Perception of Shading and Reflectance, 1996
    • Advocate: Seth Koterba
    • Opponent: Stefan Zickler


  1. Sinha & Adelson: Recovering Reflectance in a World of Painted Polyhedra, ICCV 1993


  1. Yair Weiss,  Deriving intrinsic images from image sequences, ICCV 2001 (code available)
    • Advocate: Mohit Gupta
    • Demo: Mohit Gupta


  1. GD Finlayson, MS Drew, C Lu, Intrinsic Images by Entropy Minimization, ECCV 04


  1. Marshall F Tappen, William T Freeman, Edward H Adelson, Recovering Intrinsic Images from a Single Image.  NIPS 2002. (there is also a longer version that was published in the September 2005 issue of IEEE Transactions on Pattern Analysis and Machine Intelligence)
    • Advocate: Malola Prasath


  1. Hoiem, Efros, Hebert, Geometric Context from a Single Image, ICCV 2005 (code available)
    • Advocate: Stefan Zickler + demo too?


  1. Ashutosh Saxena, Sung Chung, and Andrew Y. Ng. Learning Depth from Single Monocular Images. NIPS 2005.
    • Advocate: Malola Prasath


  1. Tenenbaum & Freeman.  Separating Style and Content with Bilinear Models. Neural Computation, 2000.



Part V: Dealing with Data


  1. J. B. Tenenbaum, V. De Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science 290 (5500): 22 December 2000. (code available)
  2. Sam Roweis & Lawrence Saul.  Nonlinear dimensionality reduction by locally linear embedding. Science v.290 no.5500, Dec.22, 2000. (code available)
    • Advocate: Dave Thompson
    • Opponent: Jonathan Huang


  1. D. D. Lee and H. S. Seung.  Learning the parts of objects by non-negative matrix factorization.  Nature 401, 788-791 (1999).  (code available)


Part VI: Tracking & Motion Segmentation


  1. Isard & Blake, CONDENSATION: conditional density propagation for visual tracking.  IJCV, 1998


  1. Toyama & Blake, Probabilistic Tracking with Exemplars in a Metric Space.  IJCV, 2002


  1. C. L. Zitnick, N. Jojic, S. B. Kang.  Consistent segmentation for optical flow estimation.  IEEE Int'l Conf. on Computer Vision, 2005.


  1. Ramanan, Forsyth, Zisserman.  Strike a Pose: Tracking People by Finding Stylized Poses, CVPR 2005 (video examples)


  1. MP Kumar, PHS Torr, A Zisserman, Learning Layered Motion Segmentations of Video, ICCV '05 

Most recently updated on January 16, 2006 by Alyosha Efros