Our work combines the technology behind television, VR, and Computer Vision (also see the Computer Vision Handbook) to create virtual models of real-world events -- what we call Virtualized Reality™ dynamic event models. These models can then be used to construct views of the real events from nearly any viewpoint, without interfering with the events! Like VR, Virtualized Reality dynamic event models allow viewers to see whatever they want to, but unlike VR, this "other world" is actually a real event, and the views of this event are photorealistic.
An alternative approach to modeling is to mimic the way humans observe the world: with pictures! Many times we are allowed to look at a scene or event, but we are not allowed to touch it. Even so, we can still figure out the basic shape and color of the scene, and of course it all looks real because it is real. For a computer to do the same thing, it must have eyes, and getting a computer to understand what its eyes see is a task called computer vision. These eyes usually come in one of two forms: those that capture still images, like a 35mm camera, and those that capture motion, like video cameras. (Surprisingly, a video camera is just a still-image camera that takes pictures so quickly that our eyes think that they see motion.) In our case we want to capture time-varying events, so we use video cameras.
Having lots of videos, by itself, is not enough to model real scenes, though. (Actually, if you had tons of videos, you could make it look like you have a model, but that's left for someone else to explain.) The images directly show the color of the world, so we can use them directly as models of scene appearance. The images also include information about the shape of the objects in the world, but not directly, so we must somehow recover the shape from the images.
The shape model is constructed in two stages. First, we apply a computer vision technique called stereo to determine the shapes of the objects visible in each image. Stereo tries to find corresponding features in a set of images, and then triangulates these correspondences to determine how far away the 3D feature is. For example, suppose I take two pictures of your face from slightly different viewpoints. Because both pictures contain an image of your face, I can find your left eye in both pictures. Trying to find these matching points is called the correspondence problem. If I have found the corresponding points, and if I know where the cameras were when they took the pictures, I can compute how far away your eye was from each camera. This computation is called triangulation. (Incidentally, the process of determining the positions of the cameras is called calibration. Before we actually capture all our images, we perform camera calibration to determine precisely where the cameras are.)
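To make the triangulation step concrete, here is a minimal sketch in Python. It assumes the correspondence problem is already solved and that calibration has given us each camera's position plus the direction of the ray from that camera through the matched image point. The function names (triangulate_midpoint, distance) and the numbers are made up for illustration, and the midpoint method shown is just one simple way to triangulate, not necessarily the one used in our system.

```python
import math


def dot(u, v):
    """Dot product of two 3D vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(center1, dir1, center2, dir2):
    """Estimate a 3D point from two viewing rays (hypothetical helper).

    Each ray starts at a camera center and points toward the matched
    feature. Because of noise the rays rarely intersect exactly, so we
    return the midpoint of the shortest segment connecting them.
    """
    w = tuple(p - q for p, q in zip(center1, center2))
    a = dot(dir1, dir1)
    b = dot(dir1, dir2)
    c = dot(dir2, dir2)
    d = dot(dir1, w)
    e = dot(dir2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undefined")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = tuple(o + s * k for o, k in zip(center1, dir1))
    p2 = tuple(o + t * k for o, k in zip(center2, dir2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Assumed example values: two cameras one unit apart, both seeing the same eye.
left_camera = (0.0, 0.0, 0.0)
right_camera = (1.0, 0.0, 0.0)
ray_from_left = (0.5, 0.0, 2.0)
ray_from_right = (-0.5, 0.0, 2.0)

eye = triangulate_midpoint(left_camera, ray_from_left,
                           right_camera, ray_from_right)
print("estimated eye position:", eye)
print("distance from left camera:", distance(left_camera, eye))
print("distance from right camera:", distance(right_camera, eye))
```

Notice that the function refuses to answer when the two rays are parallel. That is the geometric reason the two pictures must come from different viewpoints: with no separation between the cameras, there is nothing to triangulate.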
Stereo only estimates the shapes of objects visible in each image. The second stage is to integrate these image-based shape models into a single, complete shape model of the entire scene. Continuing our example from before, the images of your face can be used to determine the shape of your face, but not of the rest of your head. If we have more cameras behind your head, we can use their images to tell us the shape of the back of your head, but not your face. By integrating, or merging, the shape of your face and the shape of the back of your head, we can create a complete shape model of your head. In a similar way, we integrate the shape information from many cameras to create a complete shape model of real events.
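As a rough illustration of what merging means, the sketch below combines several partial sets of 3D points into one model. It assumes every point has already been expressed in a single shared world coordinate frame (which calibration makes possible), and it uses a simple voxel grid to average points that land in the same small cube of space. The function name merge_point_sets and the voxel size are hypothetical, and this is a simplified stand-in for shape integration in general, not a description of our actual merging algorithm.

```python
import math


def merge_point_sets(point_sets, voxel_size=1.0):
    """Merge partial 3D point sets into one set (illustrative sketch).

    point_sets: iterable of lists of (x, y, z) tuples, all in one world frame.
    Points falling into the same voxel are averaged, so surface regions
    seen by several cameras are not duplicated in the merged model.
    """
    cells = {}
    for points in point_sets:
        for x, y, z in points:
            key = (math.floor(x / voxel_size),
                   math.floor(y / voxel_size),
                   math.floor(z / voxel_size))
            acc = cells.setdefault(key, [0.0, 0.0, 0.0, 0])
            acc[0] += x
            acc[1] += y
            acc[2] += z
            acc[3] += 1
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in cells.values()]


# Assumed example values: two cameras in front of the head both see one
# point on the face, and a camera behind the head sees a point on the back.
face_from_camera_a = [(0.2, 0.2, 0.2)]
face_from_camera_b = [(0.4, 0.2, 0.2)]
back_from_camera_c = [(0.2, 0.2, -0.5)]

head = merge_point_sets([face_from_camera_a,
                         face_from_camera_b,
                         back_from_camera_c])
print(len(head), "points in the merged model")
```

The voxel size sets how aggressively overlapping observations are fused: a larger value smooths away more noise but also more detail.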
Virtualized Reality dynamic event modeling has great significance in entertainment of many types, not just sporting events. In the movie industry, Virtualized Reality models can be used to create many special effects that currently are very difficult or totally impossible to simulate. Movies themselves could be replaced by an entirely different medium that allows the viewers to enter into the scene and, unlike the "3D" movies we occasionally see now, viewers could independently move around and see whatever they want to.
Another major use of Virtualized Reality modeling is in training. Rather than getting a typical instructional video, you could get a complete model of the same event, so that you could study the lesson from any viewpoint, even from the instructor's. In medical school, for example, a student might be trying to learn a surgical operation. By virtually reproducing a real operation, students could review the procedure as it was actually performed, not just the way it is described in a textbook. They could also walk around the modeled operating room and see the procedure from any viewpoint, even the real surgeon's, without interfering with the real operation.