Computer Vision Misc Reading Group

Year 2004    (in chronological order)

Date Presenter Description

Andrew Stein I've selected the following two papers to discuss at next week's group meeting. They are both from the most recent ICCV. They are somewhat related in that each incorporates temporal information to extract information (find features / do classification) from image sequences. Both are also essentially extensions to the temporal domain of formerly spatial-only methodologies, so I thought that was another connection.

Space-time Interest Points by Ivan Laptev and Tony Lindeberg. Paper here.

Detecting Pedestrians Using Patterns of Motion and Appearance by Paul Viola, Michael Jones, and Daniel Snow. 2003 Marr Prize Winner. Paper here.


Tal Blum I will be presenting the papers:
  1. Spectral Histogram Based Face Detection by Christopher Waring and Xiuwen Liu. Proceedings of the International Joint Conference on Neural Networks, Volume: 2, July 20-24, 2003. Paper here.

  2. Rotation Invariant Face Detection Using Spectral Histogram and SVMs. CVPR submission.


Yan Ke I'll be presenting my recent work on local image descriptors. We extend David Lowe's SIFT by applying PCA to build the descriptor instead of using windows of histograms.

A technical report can be found here.


Fernando De La Torre The papers I will discuss on Wednesday are:

Extreme Components Analysis
Max Welling, Felix Agakov, Christopher K. I. Williams. Paper here.

and if I have time:

Non-linear CCA and PCA by Alignment of Local Models, Paper here.

Enjoy the reading.


Sanjiv Kumar I will talk about the paper from Teh and Welling about generalized inference:

The Unified Propagation and Scaling Algorithm, Y. W. Teh and M. Welling, NIPS 2001. Paper here.


Ranjith Unnikrishnan I'll be presenting Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps and Spectral clustering by Y. Bengio et al. (NIPS 2003). Paper here.

The paper develops a unified framework for the above manifold learning algorithms and extensions for out-of-sample data. A more detailed tech report is Spectral Clustering and Kernel PCA are Learning Eigenfunctions. Paper here.


Caroline Pantofaru

I'll be talking about Unsupervised Improvement of Visual Detectors using Co-Training by A. Levin, P. Viola, and Y. Freund, from ICCV 2003. You can find the paper here or here.


No Meeting
(Spring Break)


Tom Stepleton At this week's misc reading group meeting, I'll be presenting Tony Jebara's ICCV 2003 paper, Images as Bags of Pixels. You can download the paper here.

Bags of Pixels is a technique for representing images as unordered sets of tuples. Finding optimal orderings for these tuples across sets of images automatically estimates pixel correspondences over the images and compacts the image representations into a subspace conducive to PCA. Furthermore, resultant PCA basis vectors have visual characteristics that give rise to compelling intuitive interpretations. This technique can also be applied to other sampled data, like audio recordings.

For more background, you may also wish to take a look at this paper, which describes the convex optimization technique of the Bags of Pixels paper in slightly greater detail.


Jiang Ni I will be presenting Portilla & Simoncelli's IJCV 2000 paper on texture synthesis using a parametric statistical model:
A parametric texture model based on joint statistics of complex wavelet coefficients Paper here.

And if I have time, I may also talk about Wei & Levoy's Siggraph 2000 paper on texture synthesis using a non-parametric model:
Fast Texture Synthesis using Tree-structured Vector Quantization. Paper here.


Derek Hoiem I'm presenting an overview/tutorial on AdaBoost (the vote was 3-2-1 AdaBoost-Bayes nets-unlabeled data). I'll give some background, cover the basic AdaBoost algorithms, present some theoretical results, discuss practical issues such as the complexity of the weak learner and noisy data, and talk about some of the AdaBoost variants. I'll be focusing on confidence-weighted AdaBoost in the two-class case. Recommended reading is any one of the following, all of which can be found here.

Robert E. Schapire.
The boosting approach to machine learning: An overview.
In MSRI Workshop on Nonlinear Estimation and Classification, 2002.

Jerome Friedman, Trevor Hastie and Robert Tibshirani.
Additive logistic regression: a statistical view of boosting.
The Annals of Statistics, 28(2):337-374, April, 2000.

Robert E. Schapire and Yoram Singer.
Improved boosting algorithms using confidence-rated predictions.
Machine Learning, 37(3):297-336, 1999.
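
As a rough companion to the tutorial topic, here is a minimal sketch of discrete two-class AdaBoost with threshold "stump" weak learners. It is not the confidence-rated variant the talk focuses on, and the function names (stump_predict, best_stump, adaboost, predict) and the exhaustive stump search are illustrative assumptions, not anything taken from the listed papers' code.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """Threshold stump: polarity if x[feature] >= threshold, else -polarity."""
    return polarity if x[feature] >= threshold else -polarity


def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best


def adaboost(X, y, rounds):
    """Train on labels in {-1, +1}; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, polarity = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, feature, threshold, polarity))
        # Up-weight mistakes, down-weight correct examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feature, threshold, polarity))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, f, t, p) for alpha, f, t, p in ensemble)
    return 1 if score >= 0 else -1
```

A one-dimensional toy set whose labels form a "plus, minus, plus" pattern is a handy check: no single stump can fit it, while a few boosting rounds can.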


Owen Carmichael This MISC reading group meeting will be an overview of the medical image analysis work I have been involved with since getting my PhD in September. There are no papers to read and little in the way of experimental results since I have only been at it for 6 months. The general problem is, "In what way can images of the brain help us to detect Alzheimer's Disease earlier than we are detecting it now?" The specific computer vision and statistics issues are:

1. Automatically generating 3D models of low-contrast brain structures from volumetric images.
2. Using MCMC-type sampling techniques to evaluate sources of error in registration/segmentation algorithms.
3. Selecting shape features from 3D anatomical structures for predictive statistical models. In other words, building statistical models that tell you things like, "probability that you will get Alzheimer's at time T, given that your hippocampus looks like X"

I will also touch on a couple of feature-based registration projects that I am helping to advise.


David Tolliver As has become my habit, I'll present another paper by Lior Wolf and Amnon Shashua. I'll describe their Q-alpha algorithm, and present results from their recent ECCV 2004 paper.


Bart Nabbe From Alexei's web page:

Texture Synthesis

In our 1999 work ([Efros and Leung,'99]) we tried to address this shortcoming by proposing a very different and extremely simple way of synthesizing textures locally, one-pixel-at-a-time. In recent years this algorithm has been used for synthesizing a large spectrum of textures as well as filling holes in textured regions. In our latest work ([Efros and Freeman,'01]) we show that similarly good results can be obtained by an even simpler (and much faster) procedure of quilting together patches of the input texture. Moreover, we demonstrate a method of transferring texture from one object onto another (e.g. rendering a man's face with rice).

The two papers are available here and here.


Martial Hebert I'll discuss the Brendan Frey paper on Epitomic Analysis (paper here) at the Wed. meeting. I can't remember exactly why I liked it.


James Hays I will discuss the recent texture synthesis work of Yanxi Liu, Steve Lin, and myself. Our work concerns near-regular textures -- textures made up of repeated but not necessarily identical texture elements. This is a class of textures that general texture synthesis algorithms, from Efros and Leung to Graph Cuts, do not handle well. We model the different ways textures depart from regular tilings with Geometry, Lighting, and Color deformation fields. We use these deformation fields to manipulate the textures in interesting ways. A pre-print of the paper, to appear in SIGGRAPH 2004, can be found here.


Black Friday No Meeting Today (postponed until May 26).


Dennis Strelow I'll give a high-level overview of my thesis research since the last time I presented my work to the MISC group, which includes work on 6 DOF motion estimation from image measurements and measurements from inexpensive inertial sensors; and the "smalls" image feature tracker for shape-from-motion and similar applications. I'll include motion estimation results from data taken during the Hyperion rover's first field test in 2003 and from CMU's wide area crane.

I'll try to present so that no reading is required beforehand. If you can't wait, there is a brief discussion of 6 DOF motion from image and inertial measurements here and a much more detailed discussion, which includes the Hyperion results, is available here.

There's no paper describing the smalls tracker at this point.


Marius Leordeanu The paper I will present for this week is:

Selective Sampling With Redundant Views
Ion Muslea, Steven Minton, and Craig A. Knoblock

link to paper


Daniel Huber Practice job talk.


Caroline Pantofaru I'll be talking about: "Sharing features: efficient boosting procedures for multiclass object detection" by Torralba, Murphy, and Freeman from CVPR 2004 Paper here


"Mutual Boosting for Contextual Inference" by Michael Fink and Pietro Perona from NIPS 2003 Paper here

The main links between the papers are the obvious: boosting and multiclass detection. The main reason that I've listed both of them is because I don't think either one will fill the entire meeting. But then again, you never know.


Ranjith Unnikrishnan This week I'll try to cover two papers that revisit conventional clustering techniques:

(1) "Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based Approach". In ICML, 2002. Link here, and
(2) "Multiobjective Data Clustering", M. Law, A. Topchy, A.K. Jain, CVPR 2004. Link here

The first shows how common heuristic clustering algorithms are each equivalent to a hierarchical model-based method, and offers implementers intuition on how to modify and choose between algorithms / distance measures.

The second presents an approach that uses multiple clustering objective functions simultaneously, and tries to find a robust, least conflicting partition of the dataset to be clustered.


Tom Stepleton For the next Misc Reading Group, I'll be presenting

Weakly Supervised Learning of Visual Models and Its Application to Content-Based Retrieval
by Cordelia Schmid, IJCV 56(1/2), 7-16, Jan/Feb 2004.

In this paper, Schmid trains visual models for image retrieval by building up libraries of significant rotation-invariant descriptors from positive and negative class examples. Aside from this binary labeling, no other handholding is performed.

A local link to the paper can be found here.


CVPR - No Meeting TBA


Owen Carmichael I will discuss a couple of interesting papers I saw at CVPR on low-dimensional embeddings of high-dimensional manifolds of data. I will start on this one, which I thought was neat because it casts embedding as a classification problem:

BoostMap: A Method for Efficient Approximate Similarity Rankings
Vassilis Athitsos, Jonathan Alon, Stan Sclaroff, and George Kollios
Link here

Depending on the time and demand, I might go on to talk about this one by one of the Locally Linear Embedding guys. It won best paper:

Unsupervised Learning of Image Manifolds by Semidefinite Programming
Kilian Q. Weinberger and Lawrence K. Saul
Link here


Jiang Ni The paper I will be talking about is:
Combining Top-down and Bottom-up Segmentation, by Borenstein, Sharon and Ullman. Link here

This paper was presented at the POCV workshop of CVPR 2004, on the morning of June 28. The top-down approach uses prior knowledge about an object to guide the segmentation, whereas the bottom-up approach first segments the image into regions and then groups the object-related regions together. This paper shows how to combine the two approaches.


Fernando de la Torre On Wednesday I will present,
A Rao-Blackwellized Particle Filter for EigenTracking
Paper here

Enjoy reading!!


Yan Ke I'll be presenting my recent work on using locality sensitive hashing (LSH) to index and search local descriptors for image retrieval. See this link.

I'll also talk about on-going work in applying similar techniques to music retrieval.
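
For readers new to LSH, the general idea can be sketched in a few lines. The sketch below uses random-hyperplane signatures as the hash family, which is an assumption made for illustration and not necessarily the scheme used in the work being presented; the names make_hyperplanes, signature, build_index, and query are likewise hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
                 for plane in planes)


def build_index(descriptors, planes):
    """Hash table mapping each signature to the indices of descriptors in that bucket."""
    table = defaultdict(list)
    for i, desc in enumerate(descriptors):
        table[signature(desc, planes)].append(i)
    return table


def query(table, planes, vec):
    """Candidate neighbors: descriptors sharing the query's bucket (may be empty)."""
    return table.get(signature(vec, planes), [])
```

Descriptors pointing in similar directions tend to share a bucket, so a query only needs to be compared against the candidates returned for its own signature. A practical system would typically use several tables to trade recall against speed.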


CANCELLED Conflict with sponsor presentation.


Jake Sprouse I'll present Multiscale Conditional Random Fields for Image Labeling by Xuming He, Richard Zemel, and Miguel Carreira-Perpinan from CVPR '04. Paper here.

The authors encode contextual information using learnt label pattern fields and combine it with low-level classification using CRFs.


James Hays TBA


Cris Dima I will present Estimating Replicability of Classifier Learning Experiments by Remco Bouckaert (ICML 2004). An electronic copy can be downloaded here.

Here is the abstract:

Replicability of machine learning experiments measures how likely it is that the outcome of one experiment is repeated when performed with a different randomization of the data. In this paper, we present an estimator of replicability of an experiment that is efficient. More precisely, the estimator is unbiased and has lowest variance in the class of estimators formed by a linear combination of outcomes of experiments on a given data set. We gathered empirical data for comparing experiments consisting of different sampling schemes and hypothesis tests. Both factors are shown to have an impact on replicability of experiments. The data suggests that sign tests should not be used due to low replicability. Ranked sum tests show better performance, but the combination of a sorted runs sampling scheme with a t-test gives the most desirable performance judged on Type I and II error and replicability.


Jonas August The paper is Descour & Dereniak's Computed-tomography imaging spectrometer in Applied Optics '95.

This paper describes a new way of making hyperspectral images free of motion distortion by combining diffraction gratings with a CT-style inverse problem formulation. I'll describe this new sensor and outline the computations involved.

The paper was scanned in, so the PDF file is quite large (14MB).
As an alternative, it is also available in OpenOffice format (.SXW) or in MS Word format (.DOC), each about 2MB.


Daniel Huber Survey of ECCV papers.


Derek Hoiem I will give an overview talk on using context in object detection. The bulk of work in context can be placed into one of three categories: saliency (what are likely locations/scales of the object), local context (neighboring labels are used), and object-based context (likelihoods of other objects' locations are used). I will attempt to organize and summarize the important ideas from about a dozen papers, but I recommend getting the gist (at least read intro, headings, figures and conclusions) of the three papers below in preparation for Wednesday. Incidentally, this presentation will be used to fulfill my speaking qualifier requirement.

Statistical Context Priming for Object Detection, Torralba et al., 2001
Multiscale Conditional Random Fields for Image Labeling, He et al., 2004
Mutual Boosting for Contextual Inference, Fink et al., 2003


James Hays Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images
Yuri Boykov & Marie-Pierre Jolly
ICCV 2001
Paper available here.

This paper describes an exact, low-polynomial time solution to finding a MAP labeling for two-label, N-dimensional Markov Random Fields. This is used for interactive foreground/background extraction, where the labeling of image pixels simply determines whether an object is foreground or background. It has similarly been used for texture synthesis. After discussing this paper we'll talk about its extensions to multi-class problems, and then we'll look at various direct extensions to this paper at SIGGRAPH this year.


Goksel Dedeoglu Transformation-Invariant Embedding for Image Analysis
A. Ghodsi, J. Huang, and D. Schuurmans
ECCV 2004

Available here, and also here.


Sanjiv Kumar I will talk about the following work on learning Markov networks from the Stanford learning group:

Max-Margin Markov Networks, B. Taskar, C. Guestrin and D. Koller. Neural Information Processing Systems Conference (NIPS03), Vancouver, Canada, December 2003.

The paper can be found here.

This work is an extension of SVM to structured or relational data (e.g. chains or MRFs).


Cancelled 25th Anniversary Robotics Symposium (Schedule here)


Bart Nabbe Epipolar Geometry from Three Correspondences
Chum, Matas, Obdrzalek

The paper describes LO-RANSAC 3-LAF, a new algorithm for the correspondence problem. Exploiting processes proposed for computation of affine-invariant local frames, three point-to-point correspondences are found for each region-to-region correspondence. Consequently, it is sufficient to select only triplets of region correspondences in the hypothesis stage of epipolar geometry estimation by RANSAC.

The paper can be found here.


Ting Liu This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We also introduce new approximate k-NN search algorithms on this structure. We show why these structures should be able to exploit the same random-projection-based approximations that LSH enjoys, but with a simpler algorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high dimensional datasets which show accelerations one to three orders of magnitude over LSH. This result holds true throughout the spectrum of approximation levels.
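
To make the "overlap" idea in the abstract concrete, here is a minimal sketch of a single tree-node split in which points close to the decision boundary are copied into both children. The pivot heuristic, the overlap width parameter tau, and the function name overlapping_split are assumptions made for this illustration; the paper's actual construction and search procedure may differ.

```python
import math


def overlapping_split(points, tau):
    """Split points into two children; points within tau of the boundary go to both."""
    def dist(p, q):
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

    # Pivot heuristic (an assumption): farthest point from an arbitrary
    # start point, then the point farthest from that one.
    a = max(points, key=lambda p: dist(p, points[0]))
    b = max(points, key=lambda p: dist(p, a))
    span = dist(a, b)
    if span == 0:
        raise ValueError("need at least two distinct points")

    # Unit vector from pivot a toward pivot b.
    direction = [(bi - ai) / span for ai, bi in zip(a, b)]

    def project(p):
        """Signed distance of p along the a-to-b axis, measured from a."""
        return sum((pi - ai) * di for pi, ai, di in zip(p, a, direction))

    boundary = span / 2.0  # midpoint between the two pivots
    near_a = [p for p in points if project(p) <= boundary + tau]
    near_b = [p for p in points if project(p) >= boundary - tau]
    return near_a, near_b
```

With tau set to 0 this reduces to an ordinary non-overlapping split. A larger tau duplicates more points, which costs memory but lets a search that descends into only one child still see neighbors lying just across the boundary.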

The following papers are all related to this talk:

  • T. Liu, A. W. Moore, K. Yang, and A. Gray. An Investigation of Practical Approximate Nearest Neighbor Algorithms. Accepted to NIPS 2004. (I put the technical report version on-line)
  • A. Gionis, P. Indyk, and R. Motwani. Similarity Search in High Dimensions via Hashing. In Proc 25th VLDB conference, 1999
  • P. Indyk and R. Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. In STOC, pages 604-613, 1998
  • P. Indyk and N. Thaper. Fast image retrieval via embeddings. In the 3rd International Workshop on Statistical and Computational Theories of Vision (SCTV 2003)

  • 11/3/2004

    Alyosha Efros Seeing Through Water

    We consider the problem of recovering an underwater image distorted by surface waves. A large amount of video data of the distorted image is acquired and the problem is posed in terms of understanding the statistics of local patches in the image plane. This challenging reconstruction task can be formulated as a manifold learning problem, such that the center of the manifold is the image of the undistorted patch. To compute the center, we present a new technique to estimate global distances on the manifold. Our technique achieves robustness through convex flow computations and solves the "leakage" problem inherent in recent manifold embedding techniques.

    Joint work with Volkan Isler, Jianbo Shi and Mirko Visontai.

    Accepted to NIPS'04, draft can be found here.


    Srinivasa Narasimhan I will be presenting the recent CVPR paper which won an award:

    Programmable Imaging using a Digital Micromirror Array
    S. K. Nayar, V. Branzoi, and T. Boult
    Proceedings of IEEE Conference on Computer Vision and Pattern Recognition
    Washington DC, June 2004.

    The paper can be found here.

    I will also try to discuss the different sensors built at Columbia on dynamic range.


    Andrew Stein Scale-invariant shape features for recognition of object categories
    Frederic Jurie and Cordelia Schmid
    CVPR 2004

    We introduce a new class of distinguished regions based on detecting the most salient convex local arrangements of contours in the image. The regions are used in a similar way to the local interest points extracted from gray-level images, but they capture shape rather than texture. Local convexity is characterized by measuring the extent to which the detected image contours support circle or arc-like local structures at each position and scale in the image. Our saliency measure combines two cost functions defined on the tangential edges near the circle: a tangential-gradient energy term, and an entropy term that ensures local support from a wide range of angular positions around the circle. The detected regions are invariant to scale changes and rotations, and robust against clutter, occlusions and spurious edge detections. Experimental results show very good performance for both shape matching and recognition of object categories.

    The paper can be found here. (If that link doesn't work for some reason, you should be able to search for the paper on IEEE Xplore.)

    If there's time -- and if my slides are ready -- I may also take 15-20 minutes to do a practice talk for WACV. This would cover my own work on background-invariant features.


    Thanksgiving No Meeting


    Sanjeev Koppal Appearance Sampling for Obtaining A Set of Basis Images for Variable Illumination
    Imari Sato, Takahiro Okabe, Yoichi Sato, and Katsushi Ikeuchi

    Paper here.


    Qifa Ke Learning a kernel matrix for nonlinear dimensionality reduction
    K. Q. Weinberger, F. Sha, and L. K. Saul
    ICML 2004

    This paper is about how to use semidefinite programming to learn the kernel matrix that can "unfold" the manifold when it maps the input into the feature space. (The input high-dimensional data is assumed to lie in some low-dimensional manifold).


    Black "Friday" No meeting.


    Winter Break No meeting.


    Winter Break No meeting.

    Last modified: 12/13/2004