16-311 Introduction to Robotics

  16-311 Lab 2: Vision


Challenge Statement

Create a MATLAB pipeline to take in images of known objects and determine the distance to these objects.

Lab Goals

  1. Gain experience using MATLAB's image processing tools.
  2. Threshold an image reliably.
  3. Segment images to reveal separate bodies.
  4. Determine distance from an object using known camera geometry and trigonometry.

Background

Principles
  1. Thresholding:
    One of the first techniques we discussed was thresholding. It lets us take a color or grayscale image and reduce it to a black-and-white version in which each pixel is either 0 or 1. To do this, we choose a threshold value (or one value for each color channel of a pixel), set every pixel below the threshold to 0 (black, assuming lower numbers are closer to black), and set every pixel at or above the threshold to 1.

    To pick this threshold value, we can use a histogram to visualize the distribution of pixel values. Here is a simple example of a 9-pixel image and its histogram.

    [Figure: Example image and corresponding histogram.]

    Based on this histogram, we may decide to threshold at 2, so any pixels greater than or equal to 2 are set to the high value and any pixels less than 2 are set to 0.

    [Figure: Example image thresholded at 2.]

    Alternatively, we may see the lighter pixels as still valuable portions of the colored space and decide to threshold at 1, instead of 2.

    [Figure: Example image thresholded at 1.]
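
    As a minimal sketch of the thresholding steps described in this section, the MATLAB snippet below builds a tiny histogram and thresholds at 2 and at 1. The 3x3 image is a hypothetical stand-in with values 0 to 3, not the image from the figures, and the variable names are illustrative only.

    ```matlab
    % Hypothetical 3x3 grayscale image with values 0..3
    % (an assumption for illustration, not the image from the figures).
    img = [0 1 3;
           2 3 1;
           0 2 3];

    % Tiny histogram: count how many pixels take each value 0, 1, 2, 3.
    % The bin edges are centered on the integer pixel values.
    counts = histcounts(img, -0.5:1:3.5);

    % Pixels at or above the threshold become 1; all others become 0.
    bw2 = img >= 2;   % thresholded at 2
    bw1 = img >= 1;   % thresholded at 1
    ```

    Lowering the threshold from 2 to 1 keeps the lighter pixels, so bw1 contains more 1s than bw2. For real images, the same comparison can be applied to each color channel separately.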
  2. Segmentation:
    In this class, we perform segmentation using the flood fill algorithm. Here are two examples of the algorithm in action, one using 4-point connectivity and one using 8-point connectivity.

    [Animation: Flood Fill with 4-point Connectivity. André Karwath aka Aka [CC BY-SA (https://creativecommons.org/licenses/by-sa/2.5)].]
    [Animation: Flood Fill with 8-point Connectivity. André Karwath aka Aka [CC BY-SA (https://creativecommons.org/licenses/by-sa/2.5)].]

    The above animations use a recursive structure. You select a pixel. If this pixel is of interest (for explanatory purposes, say pixels of interest have value 1), it is set to the segment number, and the same function is then applied to its neighbor in each direction (4 or 8 directions, depending on connectivity). If the pixel is not of interest, it is left unchanged.

    We recommend implementing this function with a queue instead of recursion. Rather than calling the function on each neighboring pixel, check each neighbor: if it is of interest, change it to the segment number and add it to the queue. Then take the next pixel from the queue and repeat until the queue is empty.
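
    The queue-based approach could be sketched in MATLAB roughly as follows. This is one possible implementation, not the required lab solution. The function name floodFillQueue, its arguments, and the preallocated-array queue are all assumptions made for illustration. It assumes pixels of interest have value 1 and that the segment number is 2 or greater, so filled pixels are never mistaken for unfilled ones.

    ```matlab
    function labels = floodFillQueue(bw, seedRow, seedCol, segNum, conn)
    % floodFillQueue  Queue-based flood fill sketch (hypothetical helper).
    %   bw               2-D matrix; pixels of interest have value 1
    %   seedRow, seedCol starting pixel
    %   segNum           segment number to write (assumed >= 2)
    %   conn             4 or 8 (connectivity)
        labels = double(bw);
        if labels(seedRow, seedCol) ~= 1
            return;                       % seed is not of interest: no change
        end
        if conn == 4
            offsets = [-1 0; 1 0; 0 -1; 0 1];
        else
            offsets = [-1 0; 1 0; 0 -1; 0 1; -1 -1; -1 1; 1 -1; 1 1];
        end
        [nRows, nCols] = size(labels);
        % Each pixel is enqueued at most once, so this preallocation suffices.
        queue = zeros(nRows * nCols, 2);
        head = 1;                         % next queue entry to process
        tail = 1;                         % last filled queue entry
        labels(seedRow, seedCol) = segNum;
        queue(tail, :) = [seedRow, seedCol];
        while head <= tail                % queue is empty once head passes tail
            r = queue(head, 1);
            c = queue(head, 2);
            head = head + 1;
            for k = 1:size(offsets, 1)
                nr = r + offsets(k, 1);
                nc = c + offsets(k, 2);
                inBounds = nr >= 1 && nr <= nRows && nc >= 1 && nc <= nCols;
                if inBounds && labels(nr, nc) == 1
                    labels(nr, nc) = segNum;   % mark before enqueueing
                    tail = tail + 1;
                    queue(tail, :) = [nr, nc];
                end
            end
        end
    end
    ```

    Marking a pixel before it enters the queue keeps any pixel from being added twice. To segment a whole image, one option is to scan every pixel and call the function with a new segment number each time a pixel still equal to 1 is found.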

  3. Camera Geometry:
    Check out the information here: Homework 2: Camera Geometry

Lab Requirements

The specifications for Lab 2 are presented in the following document. This will be the most up-to-date resource for lab requirements.

Lab 2 Writeup

Lab 2 Materials (0.5 GB Zip folder)

Extensions

Last updated 01/24/2024 by Ananya Rao
(c) 1999-2024: Howie Choset, Carnegie Mellon