Yair Movshovitz-Attias

Ph.D. Student (Graduated)
Computer Science Department
Carnegie Mellon University

About Me

I just received my Ph.D. from the Computer Science Department at Carnegie Mellon University, where I worked on computer vision and machine learning problems. The next stop for me is Google Research!
At CMU I was part of the CMU computer vision group and was co-advised by Yaser Sheikh and Takeo Kanade.

I am interested in computer vision and machine learning. Specifically, I work on fine pose estimation, prediction from rendered data, deep learning, and fine-grained classification. My research focuses on automatically generating large-scale datasets for training computer vision algorithms. It is common knowledge by now that we are living in an age of "Big Data". However, while data is cheap, labeled data is not: building large-scale labeled datasets is time-consuming and expensive. In my research I leverage the large quantities of highly accurate, detailed 3D models that are readily available on sites such as turbosquid.com to create rendered images as training instances for computer vision tasks. This provides a complete solution to the problem of curating labeled data. Please see my Publications section for my thesis.

For my M.Sc. research, I explored the use of statistical state estimation techniques for object tracking and foreground layer extraction.

I have industry experience as a software engineer at companies such as Google and Optimet Metrology. I've had many computing-related jobs over the years, including computer and network technician and even video game tester. My full experience is detailed in my CV.

In March 2009 I honeymooned in New Zealand, which got me thinking about my next adventure - building a bridge between the two peaks of Mt. Kilimanjaro :)
I have the world's smartest cat, and the world's best wife :-)

List of Publications

Google Scholar profile

Dataset Curation through Renders and Ontology Matching
Yair Movshovitz-Attias, Ph.D. Thesis, 2015.
[pdf] [bibtex]

Ontological Supervision for Fine Grained Classification of Street View Storefronts
Yair Movshovitz-Attias, Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, and Liron Yatziv, CVPR, 2015.
[pdf] [bibtex]

Classification of blue whale D calls and fin whale 40-Hz calls using deep learning
Jeremy Karnowski and Yair Movshovitz-Attias, DCLDE, 2015.

3D Pose-by-Detection of Vehicles via Discriminatively Reduced Ensembles of Correlation Filters
Yair Movshovitz-Attias, Vishnu Naresh Boddeti, Zijun Wei, and Yaser Sheikh, BMVC, 2014.
[pdf] [Supplementary Material] [Data] [bibtex]

Physical Querying with Multi-Modal Sensing - MARVIN: An Intelligent Retailing Assistant
Iljoo Baek, Taylor Stine, Denver Dash, Fanyi Xiao, Yair Movshovitz-Attias, Mei Chen, Yaser Sheikh, Martial Hebert, and Takeo Kanade, WACV, 2014.
[pdf] [bibtex]

Analysis of the Reputation System and User Contributions on a Question Answering Website: StackOverflow
Dana Movshovitz-Attias*, Yair Movshovitz-Attias*, Peter Steenkiste, and Christos Faloutsos, ASONAM, 2013.
* Equal Contribution
[pdf] [bibtex] [slides]

Bacteria Filters - Persistent Particle Filters for Background Subtraction
Yair Movshovitz-Attias and Shmuel Peleg, ICIP, 2010.
[pdf] [bibtex]

Persistent Particle Filters for Background Subtraction
Yair Movshovitz-Attias, Thesis, Master of Science, 2009.
[pdf] [bibtex]

Research Projects

Ontological Supervision for Fine Grained Classification of Street View Storefronts

Joint work with Qian Yu, Martin C. Stumpe, Vinay Shet, Sacha Arnoud, and Liron Yatziv

Modern search engines receive large numbers of business-related, local-aware queries. Such queries are best answered using accurate, up-to-date business listings that contain representations of business categories. Creating such listings is a challenging task, as businesses often change hands or close down. For businesses with street-side locations, one can leverage the abundance of street-level imagery, such as Google Street View, to automate the process. However, while data is abundant, labeled data is not; the limiting factor is the creation of large-scale labeled training data. In this work, we utilize an ontology of geographical concepts to automatically propagate business category information and create a large, multi-label training dataset for fine-grained storefront classification. Our learner, which is based on the GoogLeNet/Inception deep convolutional network architecture and classifies 208 categories, achieves human-level accuracy.
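
As a rough illustration of what propagating category information through an ontology can look like, the sketch below expands each specific label with all of its ancestors to produce multi-label targets. The toy ontology in `PARENT` and the function names `ancestors` and `expand_labels` are assumptions made for this example; they are not the ontology or the code used in the paper.

```python
# Hypothetical toy ontology: child category -> parent category (None = root).
PARENT = {
    "italian restaurant": "restaurant",
    "restaurant": "food and drink",
    "cafe": "food and drink",
    "food and drink": None,
}


def ancestors(category, parent=PARENT):
    """Return the category followed by its ancestors, most specific first."""
    chain = []
    while category is not None:
        chain.append(category)
        category = parent.get(category)
    return chain


def expand_labels(labels, parent=PARENT):
    """Expand specific labels into a multi-label set that includes ancestors."""
    expanded = set()
    for label in labels:
        expanded.update(ancestors(label, parent))
    return expanded
```

With this expansion, an image tagged only as "italian restaurant" also counts as a positive example for "restaurant" and "food and drink".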


Ontological Supervision for Fine Grained Classification of Street View Storefronts, CVPR, 2015

Exemplar Based Methods for Viewpoint Estimation

Joint work with Vishnu Naresh Boddeti, Zijun Wei, and Yaser Sheikh

Estimating the precise pose of a 3D model in an image is challenging; explicitly identifying correspondences is difficult, particularly at smaller scales and in the presence of occlusion.
Exemplar classifiers have demonstrated the potential of detection-based approaches to problems where precision is required. In particular, correlation filters explicitly suppress classifier response caused by slight shifts in the bounding box. This property makes them ideal exemplar classifiers for viewpoint discrimination, as small translational shifts can often be confounded with small rotational shifts.

However, exemplar based pose-by-detection is not scalable because, as the desired precision of viewpoint estimation increases, the number of exemplars needed increases as well. We present a training framework to reduce an ensemble of exemplar correlation filters for viewpoint estimation by directly optimizing a discriminative objective. We show that the discriminatively reduced ensemble outperforms the state-of-the-art on three publicly available datasets and we introduce a new dataset for continuous car pose estimation in street scene images.


CMUcar Dataset


Exemplar Based Methods for Viewpoint Estimation, BMVC, 2014
Supplementary Material

Visual Accessibility Through Computer Vision

Joint work with Karl Hellstern, Zijun Wei, Yaser Sheikh, and Takeo Kanade

We have built algorithms that improve the BrainPort vision device, which is manufactured by Wicab.
With this device, blind users can perceive the approximate size, shape, and location of objects in their surroundings. Visual information is gathered by a camera that is mounted on a pair of sunglasses. It is then translated into electric pulses that are delivered to the surface of the tongue.
We have designed a system that allows users of the BrainPort device to recognize faces.
The system detects faces in the image captured by the camera, compares them with a "prototypical" or "average" face, and produces a difference map which allows the user to literally feel what is unique about the face of the person in front of them. A prototype of our system was depicted in an episode of the BBC show Frontline Medicine.
Additionally, we have created an Android app that can detect various signs (such as restroom, EXIT, etc.) and guide the visually impaired user to the location of the sign using vibrating signals.


QoLT NSF Site Visit


Segment from the BBC's "Frontline Medicine"

More on the project at CMU's website

Persistent Particle-Filters For Background Subtraction

Joint work with Shmuel Peleg

Moving objects are usually detected by measuring the appearance change from a background model. The background model should adapt to slow changes such as illumination, but detect faster changes caused by moving objects. Particle filters do an excellent job of modeling non-parametric distributions, as needed for a background model, but may adapt too quickly to the foreground objects.
A persistent particle filter is proposed, inspired by bacterial persistence. Bacterial persistence is linked to the random switching of bacteria between two states: a normal growing cell and a dormant but persistent cell. The dormant cells can survive stress such as antibiotics. When a dormant cell switches back to the normal state after the stress is over, bacterial growth continues.
Similar to bacteria, particles switch between dormant and active states, where dormant particles do not adapt to the changing environment. A further modification of particle filters allows discontinuous jumps to new parameters, enabling foreground objects to join the background when they stop moving. This can also quickly build multi-modal distributions.
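
The dormant/active switching described above can be sketched for a single pixel as follows. This is only a minimal illustration under assumed parameters: the `Particle` class, the `step` function, the switching probabilities, and the update rule are all hypothetical, and the sketch omits the weighting, resampling, and discontinuous-jump steps of a full particle filter.

```python
import random
from dataclasses import dataclass


@dataclass
class Particle:
    value: float           # hypothesised background intensity for one pixel
    dormant: bool = False  # dormant particles ignore new observations


def step(particles, observation, rng,
         p_sleep=0.05, p_wake=0.05, noise=2.0, pull=0.5):
    """Advance every particle by one frame (illustrative update rule)."""
    for p in particles:
        # Random state switch, mimicking bacterial persistence.
        if p.dormant:
            if rng.random() < p_wake:
                p.dormant = False
        elif rng.random() < p_sleep:
            p.dormant = True
        # Only active particles adapt toward the current observation.
        if not p.dormant:
            p.value += pull * (observation - p.value) + rng.gauss(0.0, noise)
    return particles
```

Because dormant particles keep their old values, the particle set retains a memory of the background even while a foreground object temporarily covers the pixel.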

M.Sc. Thesis:

Persistent Particle-Filters For Background Subtraction


Bacteria Filters: Persistent Particle-Filters For Background Subtraction, ICIP 2010

HANS - HUJI's Autonomous Navigation System

Joint work with Keren Haas, Dror Shalev, Nir Pochter, Zinovi Rabinovich, and Jeff Rosenschein

I worked on HANS alongside Keren Haas and Dror Shalev as a final project for the Computer Engineering program. Our advisors were Prof. Jeff Rosenschein, Nir Pochter, and Zinovi Rabinovich.
HANS won the Computer Engineering School's 2008 "Best Computer Engineering Project" award.

Building autonomous robots that can operate in various scenarios has long been a goal of research in Artificial Intelligence. Recent progress has been made on space exploration systems, and systems for autonomous driving. In line with this work, we present HANS, an autonomous mobile robot that navigates the Givat Ram campus of the Hebrew University. We have constructed a wheel-based platform that holds various sensors. Our system's hardware includes a GPS, compass, and digital wheel encoders for localizing the robot within the campus area. Sonar is used for fast obstacle avoidance. It also employs a video camera for vision-based path detection. HANS' software uses a wide variety of probabilistic methods and machine learning techniques, ranging from particle filters for robot localization to Gaussian Mixture Models and Expectation Maximization for its vision system. Global path planning is performed on a GPS coordinate database donated by MAPA Ltd., and local path planning is implemented using the A* algorithm on a local grid map constructed by the robot's sensors.
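
For readers unfamiliar with A*, here is a minimal sketch of the algorithm on a small occupancy grid. It is a generic textbook version, not the HANS implementation: the grid encoding (0 for free, 1 for obstacle), the 4-connected neighbourhood, the unit step cost, and the Manhattan-distance heuristic are all assumptions made for this example.

```python
import heapq


def astar(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None.

    grid is a list of rows, where 0 marks a free cell and 1 an obstacle.
    start and goal are (row, col) tuples.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
```

On a robot, the grid would be rebuilt from sensor readings as new obstacles appear, and the search rerun from the current cell.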


HANS - HUJI's Autonomous Navigation System, Project Book


This blog was used to record the development process

Video segmentation of unknown, static background using min-cut

Joint work with Shmuel Peleg

The algorithm has been developed as part of 'guided work' that took place during the spring semester of 2007 under the supervision of Prof. Shmuel Peleg.

In this report we describe an algorithm for foreground layer extraction based on EM learning and min-cut. The background is unknown but assumed to be static, and the foreground is therefore defined as the dynamic part of the frame. From a single video stream our algorithm uses color cues as well as information from image contrast, that is, the color differences between adjacent pixels, to cut out the foreground layer. Experimental results show that the accuracy is good enough for most practical uses.


Video segmentation of unknown, static background using min-cut, Project Book


Demo website

Teaching Experience

Courses I TAed and students I've mentored

TA or Grading

Fall 2013

  • 15-463: Computational Photography, given by Kris Kitani (CMU)

Fall 2012

  • 15-381: Artificial Intelligence: Representation and Problem Solving, given by Ariel Procaccia and Emma Brunskill (CMU)

Spring 2009

  • 67808: Operating Systems, given by Dror Feitelson (Hebrew University)

Fall 2008

  • 61701: Introduction to Computer Science, given by Jeffrey S. Rosenschein (Hebrew University)

Students Mentored

Zijun Wei

  • Master, Robotics Institute, CMU
  • I advised Zijun on two projects. In the first we built an Android App for helping visually impaired individuals locate restroom signs using the built in video camera. In the second we built an Android App for collecting a large dataset of car images, and annotating them with the car make and model.

Fanyi Xiao

  • Master, Robotics Institute, CMU
  • I worked with Fanyi on a system for querying the Internet for consumer objects using a wearable camera. I served on Fanyi’s M.Sc. committee.

Esha Uboweja

  • Undergraduate, Computer Science Department, CMU
  • Esha worked with me on creating a large annotated dataset of car images. The images were annotated with their make and model. Her task was to build the backend part of our system. The images were captured on users’ Android devices using an app built by Zijun Wei. The app communicated with the backend server and sent the image data with the annotations. Esha also worked on a website that updates in real time when new data arrives.

Karl Hellstern

  • Undergraduate, Computer Science Department, CMU
  • I advised Karl on a system that helps visually impaired people recognize faces. The system used a wearable video camera mounted on a pair of sunglasses, detected faces in the video stream, and created a unique, highly distinguishable pattern that was applied as an electrical signal to the user’s tongue using the BrainPort device.


Object Annotation Tool

A simple Python annotation tool I wrote to collect ground truth for part-based models.
This code is designed to be a simple tool for annotating objects in images. An object is defined as a set of parts, and a part is defined as a rectangle in the image.
Each annotated object must have the same number of parts. There can be many objects in a single image, e.g. when annotating people in images, the tool allows you to annotate multiple people in an image.
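
The data model described above (an object is a set of parts, a part is a rectangle, and every object has the same number of parts) could be represented roughly as follows. The class and field names here are hypothetical and chosen for illustration; they are not taken from the tool's actual source.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Part:
    """One rectangular part, in pixel coordinates (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class ImageAnnotation:
    """All annotated objects in one image."""
    image_path: str
    parts_per_object: int
    objects: list = field(default_factory=list)

    def add_object(self, parts):
        """Add one object, enforcing the fixed number of parts per object."""
        parts = list(parts)
        if len(parts) != self.parts_per_object:
            raise ValueError(
                f"expected {self.parts_per_object} parts, got {len(parts)}")
        self.objects.append(parts)
```

An image with several people would then hold one entry in `objects` per person, each with the same number of `Part` rectangles.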


yair [at] cs.cmu.edu
GHC 7513
Office Phone
Computer Science Department,
Carnegie Mellon University,
5000 Forbes Avenue,
Pittsburgh, PA 15213