CMU 15-494 Final Project

Kinect Depth Map In Mirage (Mark Perez)


Goal:

To produce a working simulation of the Kinect depth map in Mirage.


Approach:

Ray Tracing vs. Graphics

There are two basic approaches to modeling the Kinect depth map in a simulated environment. The first is ray tracing: for each pixel, cast a ray from the camera and measure how far it travels before it hits an object. The second is to calculate depth using data already available in the rendering system.

                     Ray tracing              Graphics based
Performance          < 1 FPS                  30+ FPS
Accuracy             Perfect                  Close to perfect; may be inaccurate
                                              on objects with low vertex counts
Need to implement    Ray tracing algorithm    Vertex shader

In the end, I decided to go with the rendering system, mainly because of the difference in performance.
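
For context, here is a rough C++ sketch (not code from this project) of what the ray-tracing approach has to do for every frame. It casts one ray per pixel against a list of spheres, the simplest possible primitive; with a 640x480 image that is roughly 300,000 ray/object tests per frame on the CPU, which is where the performance gap comes from.

// Illustrative only: per-pixel ray casting against spheres.
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };
struct Sphere { Vec3 center; float radius; };

static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Distance along a unit-length ray to the nearest sphere hit, or infinity.
static float castRay(Vec3 origin, Vec3 dir, const std::vector<Sphere>& scene) {
    float nearest = std::numeric_limits<float>::infinity();
    for (const Sphere& s : scene) {
        Vec3  oc   = sub(origin, s.center);
        float b    = 2.0f * dot(oc, dir);
        float c    = dot(oc, oc) - s.radius * s.radius;
        float disc = b * b - 4.0f * c;
        if (disc < 0.0f) continue;
        float t = (-b - std::sqrt(disc)) / 2.0f;
        if (t > 0.0f && t < nearest) nearest = t;
    }
    return nearest;
}

// One depth frame: one ray per pixel through a pinhole camera with roughly
// the Kinect's 57 x 43 degree field of view.
void renderDepthFrame(const std::vector<Sphere>& scene,
                      std::vector<float>& depth, int w, int h) {
    depth.resize(size_t(w) * h);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float px  = (2.0f * x / w - 1.0f) * 0.54f;   // ~tan(57/2 deg)
            float py  = (1.0f - 2.0f * y / h) * 0.39f;   // ~tan(43/2 deg)
            float len = std::sqrt(px * px + py * py + 1.0f);
            Vec3  dir = {px / len, py / len, 1.0f / len};
            depth[size_t(y) * w + x] = castRay({0.0f, 0.0f, 0.0f}, dir, scene);
        }
    }
}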


Implementation:

Shaders

Shaders are what allow the rendering system to do most of the work when calculating depth. Objects in 3D graphics are typically represented as triangulated polyhedra (see image below).

[Image: a triangulated sphere]

Shaders allow you to perform operations on the vertices of these objects. Specifically, when a 3D object is rendered, its position relative to the rendering camera is known, which gives the depth of each vertex. That depth is then encoded across the red, green, and blue channels of a 24-bit color so that it can be recovered with the following formula:

depth = (R << 16) + (G << 8) + B
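
Written out in C++ for clarity (the actual encoding happens in the shader on the GPU), and assuming depth is measured in millimeters as in the demo below, the packing and unpacking look like this:

// 24-bit depth packed into the R, G, B channels of a pixel.
#include <cstdint>

inline void encodeDepth(uint32_t depthMM, uint8_t& r, uint8_t& g, uint8_t& b) {
    r = (depthMM >> 16) & 0xFF;   // most significant byte
    g = (depthMM >>  8) & 0xFF;
    b =  depthMM        & 0xFF;   // least significant byte
}

inline uint32_t decodeDepth(uint8_t r, uint8_t g, uint8_t b) {
    return (uint32_t(r) << 16) + (uint32_t(g) << 8) + uint32_t(b);
}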

Mirage

Cameras are implemented as dictionaries inside Mirage. The only change needed there was to add a flag indicating whether the camera should render with the normal materials present in the scene or with the "depth material" defined by the shader above.
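
As a sketch of the idea (the names below are hypothetical, not the actual Mirage source), the camera entry just gains one boolean that picks which material the render target uses:

// Hypothetical sketch of the per-camera flag.
#include <string>

struct CameraEntry {
    int  width       = 640;
    int  height      = 480;
    bool depthRender = false;   // new flag: render with the depth material?
};

// Used when setting up the camera's render target.
std::string materialFor(const CameraEntry& cam) {
    return cam.depthRender ? "DepthMap" : "Default";
}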

A separate image stream also needs to be created and directed to where Tekkotsu stores the depth buffer.


Progress / Future Work:

The vertex program is finished and partially integrated into Mirage. More work needs to be done on the Mirage driver to make the depth camera a separate stream from the raw cam.

In the future, it would be interesting to explore modeling the error of the physical Kinect. The depth frames returned right now do not accurately reflect how the real Kinect performs.
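
As a possible starting point, the physical Kinect's random error is known to grow roughly quadratically with distance, so a first pass might perturb each ideal depth value along the lines of the hypothetical sketch below (the constant is illustrative, not calibrated against real hardware):

// Hypothetical noise model: Gaussian error whose standard deviation
// scales with depth squared.
#include <cstdint>
#include <random>

uint32_t addKinectLikeNoise(uint32_t depthMM, std::mt19937& rng) {
    if (depthMM == 0) return 0;                 // no return (e.g. out of range)
    double d       = depthMM / 1000.0;          // depth in meters
    double sigmaMM = 1.5 * d * d;               // ~1.5 mm at 1 m, ~13 mm at 3 m
    std::normal_distribution<double> noise(0.0, sigmaMM);
    double noisy = double(depthMM) + noise(rng);
    return noisy > 0.0 ? uint32_t(noisy + 0.5) : 0u;
}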


Demo:

YouTube clip:

demo

The demo shows a video of the vertex shader being used to detect depth. The raw cam channel is used to carry the depth frames (see above), so the blue/green image displayed is a visualization of the depth values. The demo state machine then looks at a pixel in the center of this image and converts the YUV pixel into an RGB pixel to retrieve the depth. You can see in the demo that the robot starts out approximately 3000 mm from the target, moves to 2000 mm, and then moves to 1000 mm.
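
That conversion step looks roughly like the following sketch, which assumes full-range BT.601 YUV; the exact constants depend on the color space Tekkotsu uses, so treat this as an illustration rather than the project's actual code.

// Recover depth from a YUV pixel: convert back to RGB, then recombine
// the channels with the formula above.
#include <algorithm>
#include <cstdint>

static uint8_t clampByte(double v) {
    return uint8_t(std::min(255.0, std::max(0.0, v + 0.5)));
}

uint32_t depthFromYUV(uint8_t y, uint8_t u, uint8_t v) {
    double Y = y, U = double(u) - 128.0, V = double(v) - 128.0;
    uint8_t r = clampByte(Y + 1.402 * V);
    uint8_t g = clampByte(Y - 0.344 * U - 0.714 * V);
    uint8_t b = clampByte(Y + 1.772 * U);
    return (uint32_t(r) << 16) + (uint32_t(g) << 8) + uint32_t(b);   // depth in mm
}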