Computer Vision Misc Reading Group
2006 Archived Schedule

Date Presenter Description
1/4/2006 Winter Break No Meeting
1/11/2006 Yan Ke NIPS Overview
1/18/2006 Marius Leordeanu The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features
by Kristen Grauman & Trevor Darrell.
1/25/2006 Caroline Pantofaru I'll spend a few minutes discussing the paper that a few of us were mumbling about last week that uses the pyramid histogram kernel to model spatial relationships.

I'll spend the bulk of the time discussing the paper:

Hyperfeatures -- Multilevel Local Coding for Visual Recognition
by Ankur Agarwal and Bill Triggs.

It's been accepted as an oral for ECCV 2006. The tech report relating to the paper is here. The conference submission will be made available via email, but should not be distributed, as requested by the author.

2/1/2006 Ranjith Unnikrishnan This week, I'll present two recent papers related to "Dynamic Graph Cuts" by Pushmeet Kohli and Phil Torr from across the pond:

  1. Efficiently Solving Dynamic Markov Random Fields Using Graph Cuts, ICCV 2005 (oral)
  2. Measuring Uncertainty in Graph Cut Solutions - Efficiently Computing Min-marginal Energies using Dynamic Graph Cuts, ECCV '06 (accepted oral)
    [PREPRINT - please do not distribute, link will be distributed by email]

In (1) they exploit a simple idea to quickly compute optimal graph cuts in slowly changing energy functions for figure-ground segmentation in video. In (2) they show how to compute min-marginals associated with the label assignments for any latent variable in an MRF, and subsequently compute a useful confidence measure for label assignments in image segmentation.
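To make the notion of a min-marginal in (2) concrete, here is a minimal brute-force sketch in Python. The three-node binary chain, its unary costs, and the Potts penalty are all made up for illustration, and exhaustive enumeration stands in for the dynamic graph cuts used in the paper. The min-marginal for a node and a label is the lowest energy over all labelings that fix that node to that label.

```python
from itertools import product

# Hypothetical toy problem: 3 binary variables on a chain.
UNARY = [[0.0, 2.0], [1.0, 0.5], [2.0, 0.0]]  # UNARY[node][label]
PAIRWISE_WEIGHT = 1.0  # Potts penalty for chain neighbours that disagree


def energy(labels):
    """Sum of unary costs plus a Potts penalty on each chain edge."""
    e = sum(UNARY[i][lab] for i, lab in enumerate(labels))
    e += sum(PAIRWISE_WEIGHT for a, b in zip(labels, labels[1:]) if a != b)
    return e


def min_marginal(node, label):
    """Lowest energy over all labelings with labels[node] == label."""
    return min(
        energy(labels)
        for labels in product((0, 1), repeat=len(UNARY))
        if labels[node] == label
    )


def min_marginals():
    """Table of min-marginals, indexed as [node][label]."""
    return [[min_marginal(i, lab) for lab in (0, 1)] for i in range(len(UNARY))]
```

The gap between a node's two min-marginals gives a rough sense of how strongly the model prefers one label there. The confidence measure defined in the paper may differ from this simple gap.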

2/8/2006 Andrew Stein I will present Kumar, Torr, and Zisserman's ICCV 2005 paper,
Learning Layered Motion Segmentations of Video

Abstract: We present an unsupervised approach for learning a generative layered representation of a scene from a video for motion segmentation. The learnt model is a composition of layers, which consist of one or more segments. Included in the model are the effects of image projection, lighting, and motion blur. The two main contributions of our method are: (i) A novel algorithm for obtaining the initial estimate of the model using efficient loopy belief propagation; (ii) Using alpha-swap and alpha-beta-expansion algorithms, which guarantee a strong local minima, for refining the initial estimate. Results are presented on several classes of objects with different types of camera motion. We compare our method with the state of the art and demonstrate significant improvements.

Derek Hoiem TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation
J. Shotton, J. Winn, C. Rother, and A. Criminisi

A link for the paper will be distributed via the email list. Please do not distribute.

2/22/2006 Goksel Dedeoglu "Learning Depth from Single Monocular Images", Ashutosh Saxena, Sung H. Chung, Andrew Y. Ng, NIPS 18

Abstract, paper and results.

3/1/2006 Qifa Ke NOTE: Meeting in NSH 3001 this week!

I will present the following paper:

Parameter-Free Radial Distortion Correction with Centre of Distortion Estimation
Hartley, R.I.; Sing Bing Kang;
Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on
Volume 2, 17-20 Oct. 2005 Page(s):1834 - 1841

We propose a method of simultaneously calibrating the radial distortion function of a camera along with the other internal calibration parameters. The method relies on the use of a planar (or alternatively non-planar) calibration grid, which is captured in several images. In this way, the determination of the radial distortion is an easy add-on to the popular calibration method proposed by Zhang [17]. The method is entirely non-iterative, and hence is extremely rapid and immune from the problem of local minima. Our method determines the radial distortion in a parameter-free way, not relying on any particular radial distortion model. This makes it applicable to a large range of cameras from narrow-angle to fish-eye lenses. The method also computes the centre of radial distortion, which we argue is important in obtaining optimal results. Experiments show that this point may be significantly displaced from the centre of the image, or the principal point of the camera.

3/8/2006 Simon Lucey In my presentation I am going to discuss a recent (CVPR'05) paper
"Hallucinating Faces: TensorPatch Super-Resolution and Coupled Residue Compensation".
If I have time I will also discuss some of my more recent work involving patches and faces from this year's upcoming CVPR.

In this paper, we propose a new face hallucination framework based on image patches, which integrates two novel statistical super-resolution models. Considering that image patches reflect the combined effect of personal characteristics and patch-location, we first formulate a TensorPatch model based on multilinear analysis to explicitly model the interaction between multiple constituent factors. Motivated by Locally Linear Embedding, we develop an enhanced multilinear patch hallucination algorithm, which efficiently exploits the local distribution structure in the sample space. To better preserve face subtle details, we derive the Coupled PCA algorithm to learn the relation between high-resolution residue and low-resolution residue, which is utilized to compensate for the error residue in hallucinated images. Experiments demonstrate that our framework on one hand well maintains the global facial structures, on the other hand recovers the detailed facial traits in high quality.

3/15/2006 Spring Break Srinivas will be out of town, and it's spring break, so this meeting is cancelled.
3/22/2006 Ankur Datta Weakly Supervised Learning of Part-Based Spatial Models for Visual Object Recognition
David Crandall and Daniel Huttenlocher (Cornell University)
ECCV 2006 Oral

Abstract: In this paper we investigate a new method of learning part-based models for visual object recognition, from training data that only provides information about class membership (and not object location or configuration). This method learns both a model of local part appearance and a model of the spatial relations between those parts. In contrast, other work using such a weakly supervised learning paradigm has not considered the problem of simultaneously learning appearance and spatial models. Some of these methods use a “bag” model where only part appearance is considered whereas other methods learn spatial models but only given the output of a particular feature detector. Previous techniques for learning both part appearance and spatial relations have instead used a highly supervised learning process that provides substantial information about object part location. We show that our weakly supervised technique produces better results than these previous highly supervised methods. Moreover, we investigate the degree to which both richer spatial models and richer appearance models are helpful in improving recognition performance. Our results show that while both spatial and appearance information can be useful, the effect on performance depends substantially on the particular object class and on the difficulty of the test dataset.

3/29/2006 Philipp Michel Stable Real-Time 3D Tracking Using Online and Offline Information
L. Vacchetti, V. Lepetit and P. Fua
IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 26, Nr. 10, pp. 1391-1391, 2004.

Abstract: We propose an efficient real-time solution for tracking rigid objects in 3D using a single camera that can handle large camera displacements, drastic aspect changes, and partial occlusions. While commercial products are already available for offline camera registration, robust online tracking remains an open issue because many real-time algorithms described in the literature still lack robustness and are prone to drift and jitter.

To address these problems, we have formulated the tracking problem in terms of local bundle adjustment and have developed a method for establishing image correspondences that can equally well handle short and wide-baseline matching. We then can merge the information from preceding frames with that provided by a very limited number of keyframes created during a training stage, which results in a real-time tracker that does not jitter or drift and can deal with significant aspect changes.

4/5/2006 Cancelled No Meeting
4/12/2006 Stano Funiak Distance Metric Learning for Large Margin Nearest Neighbor Classification
by Kilian Q. Weinberger, John Blitzer and Lawrence K. Saul

Abstract: We show how to learn a Mahalanobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in kNN classification, for example achieving a test error rate of 1.3% on the MNIST handwritten digits. As in support vector machines (SVMs), the learning problem reduces to a convex optimization based on the hinge loss. Unlike learning in SVMs, however, our framework requires no modification or extension for problems in multiway (as opposed to binary) classification.
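As a rough illustration of why the choice of metric matters for kNN, the following Python sketch classifies a query under two Mahalanobis metrics of the form M = L^T L. The data, the query, and both matrices are hypothetical, and the second matrix is picked by hand rather than learned by semidefinite programming as in the paper.

```python
from collections import Counter


def mahalanobis_sq(x, y, L):
    """Squared distance ||L(x - y)||^2, i.e. (x-y)^T M (x-y) with M = L^T L."""
    diff = [a - b for a, b in zip(x, y)]
    proj = [sum(row[j] * diff[j] for j in range(len(diff))) for row in L]
    return sum(p * p for p in proj)


def knn_predict(query, points, labels, L, k=3):
    """Majority vote among the k training points nearest to the query."""
    order = sorted(range(len(points)),
                   key=lambda i: mahalanobis_sq(query, points[i], L))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data: the class depends on the first coordinate only;
# the second coordinate is large-scale noise.
POINTS = [(0.0, 0.0), (0.1, 9.0), (0.2, -8.0),
          (1.0, 0.5), (1.1, 8.0), (0.9, -9.0)]
LABELS = ["a", "a", "a", "b", "b", "b"]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]      # plain Euclidean distance
HAND_PICKED = [[1.0, 0.0], [0.0, 0.01]]  # down-weights the noisy axis
```

For the query (0.05, 8.5), the Euclidean metric is dominated by the noisy second coordinate and votes "b", while the hand-picked metric recovers "a". A learned metric aims to find such a transformation automatically from labeled data.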

4/19/2006 Jonathan Huang I'll be talking about the NIPS '05 paper:

Describing Visual Scenes using Transformed Dirichlet Processes
by Erik B. Sudderth, Antonio Torralba, William T. Freeman, and Alan S. Willsky.

Abstract: Motivated by the problem of learning to detect and recognize objects with minimal supervision, we develop a hierarchical probabilistic model for the spatial structure of visual scenes. In contrast with most existing models, our approach explicitly captures uncertainty in the number of object instances depicted in a given image. Our scene model is based on the transformed Dirichlet process (TDP), a novel extension of the hierarchical DP in which a set of stochastically transformed mixture components are shared between multiple groups of data. For visual scenes, mixture components describe the spatial structure of visual features in an object–centered coordinate frame, while transformations model the object positions in a particular image. Learning and inference in the TDP, which has many potential applications beyond computer vision, is based on an empirically effective Gibbs sampler. Applied to a dataset of partially labeled street scenes, we show that the TDP’s inclusion of spatial structure improves detection performance, flexibly exploiting partially labeled training images.

4/26/2006 Tomasz Malisiewicz I'll be presenting the CVPR 2006 paper titled

The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects, by John Winn and Jamie Shotton.

Abstract: This paper addresses the problem of detecting and segmenting partially occluded objects of a known category. We first define a part labelling which densely covers the object. Our Layout Consistent Random Field (LayoutCRF) model then imposes asymmetric local spatial constraints on these labels to ensure the consistent layout of parts whilst allowing for object deformation. Arbitrary occlusions of the object are handled by avoiding the assumption that the whole object is visible. The resulting system is both efficient to train and to apply to novel images, due to a novel annealed layout-consistent expansion move algorithm paired with a randomised decision tree classifier. We apply our technique to images of cars and faces and demonstrate state-of-the-art detection and segmentation performance even in the presence of partial occlusion.

Slides (PDF)
Jean-Francois Lalonde
This meeting will start early, at 12:00pm!!!

Data-driven scale selection and data structure for mobile robot perception

This talk presents research work I have been involved in during my master's, and serves as the speaking requirement for the degree.

Autonomous robot navigation in terrain containing vegetation remains a considerable challenge because of the difficulty in capturing the variability of such complex environments. Usual perception techniques that rely on a 2-D map of the terrain fail to capture three-dimensional details, such as overhanging obstacles for example. In this talk, we will present an approach that enables robotic navigation in complex, 3-D environments.

This presentation will be divided into three sections. First, we present an overview of our approach, which generates a detailed 3-D semantic representation of the environment using only 3-D data from a laser range sensor. The approach relies on point-wise classification based on the extraction of local geometric features taken over a region of interest around each point. This is subject to two main problems: the approach is computationally expensive, and the size of the region of interest is determined manually.

In the two following sections, we propose solutions to each of these problems. First, we present an efficient data structure and algorithm that allows a 4x speedup of range search, a critical operation that lies at the core of our approach, but can also be used in other applications. Second, we introduce an automatic scale selection technique that improves classification accuracy for point-sampled surfaces.
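For readers unfamiliar with range search (finding every point within a fixed radius of a query), here is a minimal Python sketch. The talk announcement does not describe the actual data structure, so a plain uniform voxel grid is used as a stand-in. The class and function names are hypothetical, and no speedup figure is implied.

```python
from collections import defaultdict
from itertools import product
import math


class VoxelGrid:
    """Uniform grid over 3-D points; cell edge length equals the search radius."""

    def __init__(self, points, radius):
        self.points = points
        self.radius = radius
        self.cells = defaultdict(list)
        for idx, p in enumerate(points):
            self.cells[self._key(p)].append(idx)

    def _key(self, p):
        """Integer cell coordinates of a point."""
        return tuple(math.floor(c / self.radius) for c in p)

    def range_search(self, query):
        """Sorted indices of points within `radius` of `query`."""
        kx, ky, kz = self._key(query)
        found = []
        # Any point within one radius lies in the query cell or a neighbour.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for idx in self.cells.get((kx + dx, ky + dy, kz + dz), ()):
                if math.dist(self.points[idx], query) <= self.radius:
                    found.append(idx)
        return sorted(found)


def brute_force(points, radius, query):
    """Reference implementation that checks every point."""
    return [i for i, p in enumerate(points) if math.dist(p, query) <= radius]
```

The grid only inspects the 27 cells around the query instead of every point, which is the basic reason spatial data structures help when range search is the inner loop of per-point feature extraction.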

Here are some related papers:

  • Data structure for efficient processing in 3-D, for details on the data structure
  • Scale selection for classification of point-sampled 3-D surfaces, for details on 3-D scale selection
    5/10/2006 Mohit Gupta I will be presenting the following CVPR'06 paper:

    Dimensionality Reduction by Learning an Invariant Mapping

    Abstract: Dimensionality reduction involves mapping a set of high dimensional input points onto a low dimensional manifold so that “similar” points in input space are mapped to nearby points on the manifold. Most existing techniques for solving the problem suffer from two drawbacks. First, most of them depend on a meaningful and computable distance metric in input space. Second, they do not compute a “function” that can accurately map new input samples whose relationship to the training data is unknown. We present a method - called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) - for learning a globally coherent non-linear function that maps the data evenly to the output manifold. The learning relies solely on neighborhood relationships and does not require any distance measure in the input space. The method can learn mappings that are invariant to certain transformations of the inputs, as is demonstrated with a number of experiments. Comparisons are made to other techniques, in particular LLE.
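The abstract does not state the training objective. As a hedged sketch, the Python snippet below shows the pairwise contrastive loss usually associated with this method, where similar pairs are pulled together and dissimilar pairs are pushed apart until they are at least a margin away. The function name, the `similar` flag, and the default margin are assumptions made for illustration; the learned mapping itself is omitted and the inputs are already points in the output space.

```python
import math


def contrastive_loss(z1, z2, similar, margin=1.0):
    """Pairwise loss on two output-space points z1 and z2 (assumed form).

    Similar pairs are penalised by half their squared distance.
    Dissimilar pairs are penalised only while closer than `margin`.
    """
    d = math.dist(z1, z2)
    if similar:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2
```

Because the loss needs only a similar/dissimilar flag per pair, it relies on neighborhood relationships and not on a distance measure in the input space, which matches the claim in the abstract.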

    5/17/2006 David Bradley I'm going to talk primarily about the CVPR 2006 paper:
    Incremental learning of object detectors using a visual shape alphabet, by Opelt, Pinz, and Zisserman.

    I will also be describing the boundary fragment model that they use which is outlined in the ECCV 2006 paper:
    A Boundary-Fragment-Model for Object Detection

    Here is the abstract from the CVPR paper:
    We address the problem of multiclass object detection. Our aims are to enable models for new categories to benefit from the detectors built previously for other categories, and for the complexity of the multiclass system to grow sublinearly with the number of categories. To this end we introduce a visual alphabet representation which can be learnt incrementally, and explicitly shares boundary fragments (contours) and spatial configurations (relation to centroid) across object categories. We develop a learning algorithm with the following novel contributions: (i) AdaBoost is adapted to learn jointly, based on shape features; (ii) a new learning schedule enables incremental additions of new categories; and (iii) the algorithm learns to detect objects (instead of categorizing images). Furthermore, we show that category similarities can be predicted from the alphabet. We obtain excellent experimental results on a variety of complex categories over several visual aspects. We show that the sharing of shape features not only reduces the number of features required per category, but also often improves recognition performance, as compared to individual detectors which are trained on a per-class basis.

    5/24/2006 Black Friday No meeting.
    5/31/2006 Sanjeev Koppal I'm presenting the CVPR 2006 paper, A planar light probe
    6/7/2006 Yanxi Liu Computational Symmetry

    I am in the process of preparing an invited survey paper on the topic of "Computational Symmetry" for the new journal "Foundations and Trends in Computer Graphics and Vision". I am going to give a summary talk on the formal, mathematical definition of types of symmetry, symmetry groups and computational symmetry, their relevance to CV and CG, a sample of previous work, current challenges and future directions. I am looking for your feedback on clarity and completeness. Here's a quote from an incoming SIGGRAPH paper on the relevance of symmetry to get you started:

    Symmetry is an essential and ubiquitous concept in nature, science and art. For example, in geometry, the Erlanger program of Felix Klein has fueled for over a century mathematicians' interest in invariance under certain group actions as a key principle for understanding geometric spaces. Numerous biological, physical, or man-made structures exhibit symmetries as a fundamental design principle or as an essential aspect of their function. Whether by evolution or by design, symmetry implies certain economies and efficiencies of structure that make it universally appealing. Symmetry also plays an important role in human visual perception and aesthetics. Arguably much of the understanding of the world around us is based on the perception and recognition of shared or repeated structures, and so is our sense of beauty. [Mitra, Guibas, Pauly]

    6/14/2006 Alyosha Efros We haven't been reading many mid-level vision papers lately. So, I will expose my West Coast bias and present:

    Figure/Ground Assignment in Natural Images.
    Xiaofeng Ren, Charless Fowlkes and Jitendra Malik, in ECCV '06, Graz 2006.

    6/21/2006 CVPR 2006 No meeting.
    6/28/2006 Srinivas Narasimhan CVPR Overview, Part I

    Srinivas will cover a few papers that caught his eye at CVPR 2006 and we will organize other overview presenters for the following week.

    7/5/2006 CVPR Overview CVPR Overview, Part II

    Various presenters will give short overviews of papers from CVPR 2006:


  • Discriminative Object Class Models of Appearance and Shape by Correlatons by Savarese, Winn and Criminisi
  • Shape Guided Object Segmentation by Borenstein and Malik

    Dave Tolliver:

  • Particle Video by Peter Sand and Seth Teller.
  • "Other segmentation papers" not covered by other people


  • Reciprocal Image Features for Uncalibrated Helmholtz Stereopsis by Todd Zickler
  • A Geodesic Active Contour Framework for Finding Glass by Kenton McHenry and Jean Ponce


  • Learning Object Shape: From Drawings to Images by G. Elidan, G. Heitz, D. Koller
  • SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition by Zhang, Berg, et al.
  • Multiclass object recognition with sparse localized features by Mutch and Lowe
  • Multiple object class detection with a generative model
  • Unsupervised learning of categories from sets of partially matching image features by Kristen Grauman


  • Spectral Methods for Automatic Multiscale Data Clustering by Arik Azran and Zoubin Ghahramani
  • Depth from Familiar Objects: A Hierarchical Model for 3D Scenes by Erik Sudderth, Antonio Torralba, William Freeman and Alan Willsky


  • Example Based 3D Reconstruction from Single 2D Images by T. Hassner and R. Basri (Beyond Patches Workshop)
  • Spatial Boosting for Bag-of-Features by Marcin Marszalek and Cordelia Schmid
  • Supervised Learning of Edges and Object Boundaries by Piotr Dollar, Zhuowen Tu, and Serge Belongie
  • Scale Variant Image Pyramids by Joshua Gluckman


  • Segmentation by level sets and symmetry
  • Globally optimal grouping for symmetric boundaries


  • Unsupervised discovery of action classes by Wang, Jiang, Drew, Li and Mori
  • Automatic Discovery of Action Taxonomies from Multiple Views by Weinland, Ronfard, Boyer

    Leon Gu:

  • Local Features, All Grown Up by Andrea Vedaldi and Stefano Soatto
  • Nonparametric Priors on the Space of Joint Intensity Distributions for Non-Rigid Multi-Modal Image Registration by Daniel Cremers, Christoph Guetter and Chenyang Xu
  • Accurate Face Alignment Using Shape Constrained Markov Network by Lin Liang, Fang Wen, Ying-Qing Xu, Xiaoou Tang and Heung-Yeung Shum


  • Extracting subimages of unknown category from a set of images
  • Tappen+Freeman+Adelson intrinsic paper
  • Animals on the web
    7/12/2006 Caroline Pantofaru For my part of the CVPR'06 overview, I'll cover:

    Discriminative Object Class Models of Appearance and Shape by Correlatons
    Savarese, S.; Winn, J.; Criminisi, A.

    7/19/2006 Caroline Pantofaru
    Dave Tolliver
    Caroline will cover the paper she didn't get to last week:

    Shape Guided Object Segmentation
    Borenstein, E.; Malik, J.

    and Dave will briefly go over the multiscale aggregation paper of Galun et al. referenced as [4] in the Borenstein paper. It provides the initial low-level segmentation and the multiscale segmentation prior.

    7/26/2006 Ranjith Unnikrishnan I'll present Noise Estimation from a Single Image by Ce Liu, William T. Freeman, Richard Szeliski and Sing Bing Kang, since it's a good paper and it seems that not too many misc-readers were able to attend its oral presentation at CVPR.

    Abstract: In order to work consistently across images, many computer vision algorithms require that their parameters be adjusted according to the image noise level, making it an important quantity to estimate. We show how to estimate an upper bound on the noise level from a single image based on a piecewise smooth image prior model and measured CCD camera response functions. We illustrate the utility of this noise estimation for two algorithms: edge detection and feature preserving smoothing through bilateral filtering. For a variety of different noise levels, we obtain good results for both these algorithms with no user-specified inputs.

    8/2/2006 Jiang Ni I'll be presenting the following CVPR'06 paper, which uses a Bayes Net to label identities in multi-target tracking problems, where targets may interact or occlude one another.

    Multi-Target Tracking -- Linking Identities using Bayesian Network Inference
    Nillius, P., Sullivan, J. and Carlsson, S.
    In Proc. IEEE Computer Vision and Pattern Recognition (CVPR06), New York City, June 2006

    8/9/2006 Fernando de la Torre The papers we will enjoy on Wednesday will be:

    Combining Discriminative Features to infer Complex Trajectories

    Optimal Multi-frame correspondence with Assignment Tensors

    Happy reading and understanding.

    8/16/2006 Marius Leordeanu Keypoint Recognition Using Randomized Trees
    Lepetit, V. and Fua, P.
    PAMI, Sept. 2006

    They formulate the matching task between keypoints in the training and testing images as a classification problem, using randomized trees.

    Abstract: In many 3D object-detection and pose-estimation problems, runtime performance is of critical importance. However, there usually is time to train the system, which we will show to be very useful. Assuming that several registered images of the target object are available, we developed a keypoint-based approach that is effective in this context by formulating wide-baseline matching of keypoints extracted from the input images to those found in the model images as a classification problem. This shifts much of the computational burden to a training phase, without sacrificing recognition performance. As a result, the resulting algorithm is robust, accurate, and fast-enough for frame-rate performance. This reduction in runtime computational complexity is our first contribution. Our second contribution is to show that, in this context, a simple and fast keypoint detector suffices to support detection and tracking even under large perspective and scale variations. While earlier methods require a detector that can be expected to produce very repeatable results, in general, which usually is very time-consuming, we simply find the most repeatable object keypoints for the specific target object during the training phase. We have incorporated these ideas into a real-time system that detects planar, nonplanar, and deformable objects. It then estimates the pose of the rigid ones and the deformations of the others.

    8/23/2006 Ankur Datta I am presenting the following two BMVC 2006 papers, time permitting:

    1. Finding people in repeated shots of the same scene
    Sivic, Zitnick, Szeliski

    The goal of this work is to find all occurrences of a particular person in a sequence of photographs taken over a short period of time. For identification, we assume each individual’s hair and clothing stays the same throughout the sequence. Even with these assumptions, the task remains challenging as people can move around, change their pose and scale, and partially occlude each other.

    We propose a two stage method. First, individuals are identified by clustering frontal face detections using color clothing information. Second, a color based pictorial structure model is used to find occurrences of each person in images where their frontal face detection was missed. Two extensions improving the pictorial structure detections are also described. In the first extension, we obtain a better clothing segmentation to improve the accuracy of the clothing color model. In the second extension, we simultaneously consider multiple detection hypotheses of all people potentially present in the shot.

    Our results show that people can be re-detected in images where they do not face the camera. Results are presented on several sequences from a personal photo collection.

    2. Patch-based Object Recognition Using Discriminatively Trained Gaussian Mixtures
    Hegerath, Deselaers, Ney

    We present an approach using Gaussian mixture models for part-based object recognition where spatial relationships of the parts are explicitly modeled and parameters of the generative model are tuned discriminatively. These extensions lead to great improvements of the classification accuracy. Furthermore we evaluate several improvements over our baseline system which incrementally improve the obtained results which compare favorably to other published results for the three Caltech tasks and the PASCAL evaluation 05 tasks.

    8/30/2006 James Hays I'll be presenting Object Categorization by Learned Universal Visual Dictionary from ICCV 2005. I'll try to relate this work to others from the same authors, such as

  • TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object Recognition and Segmentation by J. Shotton, J. Winn, C. Rother, and A. Criminisi, and
  • Discriminative Object Class Models of Appearance and Shape by Correlatons by Savarese, Winn and Criminisi

    which have been previously presented in the MISC reading group by Derek and Caroline respectively.

    9/6/2006 Yan Ke Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words
    Juan Carlos Niebles, Hongcheng Wang and Li Fei-Fei.
    9/13/2006 Goksel Dedeoglu Rethinking the Prior Model for Stereo
    Hiroshi Ishikawa and Davi Geiger, ECCV 2006.
    9/20/2006 Andrew Stein BMVC Overview
    9/27/2006 Cancelled No meeting.
    Derek Hoiem I have decided to change the presented subject to what is apparently a very hot paper in segmentation:

    Boundary Extraction in Natural Images Using Ultrametric Contour Maps
    by Pablo Arbelaez

    with additional results here.

    10/11/2006 Shuntaro Yamazaki I will be presenting a method for fully automated calibration of lens distortion and camera intrinsics. We use structured-light patterns using an LCD to generate a dense map between the display and the image coordinate systems. This approach allows us to easily correct the distortion even around the edge of a camera image with sub-pixel accuracy, without assuming any model of lens distortion.

    I haven't published this work at any conference or in any journal, and I am still wondering whether I can claim it as novel. Recently I found several works that are closely related (or fundamentally equivalent) to ours. One of them is here.

    10/18/2006 Jake Sprouse ***Note: we'll be in the Clemente room at Intel this week!***

    I'll go into detail on Particle Video: Long-Range Motion Estimation using Point Trajectories, by Peter Sand and Seth Teller from CVPR'06. Dave Tolliver presented a quick overview in the CVPR review meeting.

    This paper describes a new approach to motion estimation in video. We represent video motion using a set of particles. Each particle is an image point sample with a long-duration trajectory and other properties. To optimize these particles, we measure point-based matching along the particle trajectories and distortion between the particles. The resulting motion representation is useful for a variety of applications and cannot be directly obtained using existing methods such as optical flow or feature tracking. We demonstrate the algorithm on challenging real-world videos that include complex scene geometry, multiple types of occlusion, regions with low texture, and non-rigid deformations.

    10/25/2006 Sanjeev Koppal *** NOTE: This week's meeting will be in NSH 3001! ***

    Here is the paper I'm talking about: Photometric Stereo with Nearby Planar Distributed Illuminants

    Slides (PDF)
    Jean-Francois Lalonde *** NOTE: This week's meeting will be in NSH 3001! ***

    "About natural color statistics"

    Alyosha and I recently became interested in natural color statistics, that is: can we find a distribution of the colors we expect to see in natural images? If there is such a distribution, can we then parameterize it to obtain a compact representation? We could not find a paper on that exact topic, but instead found a lot of papers covering a wide range of color-related topics. In the upcoming misc-read meeting, I will present a high-level overview of the literature on natural color statistics. Topics and papers covered will be:

  • Color constancy:
    - D.A. Forsyth's gamut mapping
    - Kobus Barnard's tutorial
    - Some of Graham Finlayson's recent work at CVPR 2005
    - Erik Miller's color flow
  • Color Harmony
    - Cohen-Or, Siggraph 2006
  • As well as some psycho-physics papers such as
    - Aude Oliva's color for scene recognition

    You can read whichever's closest to your interests!

    11/8/2006 Stano Funiak *** NOTE: This week's meeting will be in NSH 3001! ***

    Discriminative Learning of Markov Random Fields for Segmentation of 3D Scan Data. D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng. International Conference on Computer Vision and Pattern Recognition (CVPR05), San Diego, CA, June 2005.

    Paper abstract:
    We address the problem of segmenting 3D scan data into objects or object classes. Our segmentation framework is based on a subclass of Markov Random Fields (MRFs) which support efficient graph-cut inference. The MRF models incorporate a large set of diverse features and enforce the preference that adjacent scan points have the same classification label. We use a recently proposed maximum-margin framework to discriminatively train the model from a set of labeled scans; as a result we automatically learn the relative importance of the features for the segmentation task. Performing graph-cut inference in the trained MRF can then be used to segment new scenes very efficiently. We test our approach on three large-scale datasets produced by different kinds of 3D sensors, showing its applicability to both outdoor and indoor environments containing diverse objects.

    Datasets & results from the paper

    Background reading with proofs:
    Learning Associative Markov Networks, B. Taskar, V. Chatalbashev and D. Koller. Twenty First International Conference on Machine Learning (ICML04), Banff, Canada, July 2004.

    11/15/2006 Goksel Dedeoglu *** NOTE: This week's meeting will be in NSH 4201! ***

    Details sent via email.

    11/22/2006 Thanksgiving No Meeting
    11/29/2006 Jonathan Huang I'll provide an overview of several recent manifold learning papers which emphasize the role of topology rather than geometry. They are listed below.

  • Simultaneous Inference of View and Body Pose Using Torus Manifolds
    Chan-Su Lee and Ahmed Elgammal, The 18th International Conference on Pattern Recognition (ICPR), Hong Kong, August 21-24, 2006
  • Finding the Homology of Submanifolds with High Confidence from Random Samples.
    P. Niyogi, S. Smale, and S. Weinberger, to appear, Discrete and Computational Geometry, 2006.
  • On the local behavior of spaces of natural images
    G. Carlsson, T. Ishkhanov, V. de Silva, and A. Zomorodian, preprint, May 31, 2006.
  • Computing persistent homology
    A. Zomorodian and G. Carlsson, Discrete and Computational Geometry, 33 (2), pp. 247–274
    12/6/2006 All CVPR Submitters "CVPR De-Briefing"

    Anyone who submitted to CVPR, please send a link to the PDF of your submission to Andrew. We'll put them up on the projector to have a look at the figures, and everyone will have a chance to give the group a low-key, quick talk on their work. No slides necessary.

    12/13/2006 Christopher Geyer I will talk about some work that I did while at Berkeley in doing structure-from-motion without reliable correspondences, or correspondence inlier rates which may be smaller than 1%. With Ameesh Makadia and Kostas Daniilidis, we proposed a method which uses a Radon transform to compute cost functions in the full five-dimensional space of relative motions between two cameras (up to scale). There was a beautiful underlying theory relating the idea that the manifold of essential matrices is a so-called homogeneous space, which admits a Fourier transform, thereby allowing for efficient computation relative to a brute force implementation of a Radon transform.
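As background for the "five-dimensional space of relative motions (up to scale)", the Python sketch below builds an essential matrix E = [t]_x R from a rotation R (three degrees of freedom) and a translation t whose length does not matter (two degrees of freedom), then checks the epipolar constraint on one synthetic correspondence. This is only the standard two-view constraint under the assumed convention X2 = R X1 + t. It does not implement the Radon-transform method from the talk, and all helper names and numbers are hypothetical.

```python
import math


def cross_matrix(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v equals t cross v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation_z(angle):
    """Rotation by `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def project(X):
    """Normalized homogeneous image coordinates of a 3-D point."""
    return (X[0] / X[2], X[1] / X[2], 1.0)


def essential_matrix(R, t):
    """E = [t]_x R for the assumed convention X2 = R X1 + t."""
    return matmul(cross_matrix(t), R)


def epipolar_residual(E, x1, x2):
    """x2^T E x1, which is zero for a noise-free correspondence."""
    return sum(a * b for a, b in zip(x2, matvec(E, x1)))
```

Scaling t by any nonzero factor scales E but leaves the zero set of the residual unchanged, which is why only the direction of the translation can be recovered from correspondences.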

    For more information see:

  • Euclid meets Fourier: Applying harmonic analysis to essential matrix estimation in omnidirectional cameras, Geyer, Sastry, and Bajcsy
  • Geometric Models of Rolling-Shutter Cameras, Geyer, Meingast, and Sastry
    12/20/2006 Black "Friday" No Meeting
    12/27/2006 Winter Break No Meeting