Virtual Reality has been a subject of great interest. Less attention has been paid to the related field of Augmented Reality, despite its similar potential. The difference between Virtual Reality and Augmented Reality lies in their treatment of the real world. Virtual Reality immerses a user inside a virtual world that completely replaces the real world outside. In contrast, Augmented Reality lets the user see the real world around them and augments that view by overlaying or compositing three-dimensional virtual objects with their real-world counterparts. Ideally, it would seem to the user that the virtual and real objects coexist.
The key issue in realizing Augmented Reality is the registration problem: registering the object on which virtual information is overlaid, so that the overlay stays aligned with it. In typical augmented reality systems developed so far, head trackers are used to track the user's head position and orientation, and a rangefinder or sonar sensor is used to detect or track the object's pose in the world. The problems with these systems are limited accuracy and latency. Most commercially available head trackers do not provide sufficient accuracy and range. Rangefinders and sonar sensors are not sufficient in speed or accuracy.
We are applying computer vision to the registration problem in Augmented Reality. From a computer vision point of view, the task is real-time visual tracking of a known 3D object using intensity images.
The developed system realizes real-time object registration and image overlay by computer vision. It uses 2D intensity patterns and detects feature points by template matching. Changes in the intensity patterns caused by viewpoint changes are compensated for by skewing the reference images according to the computed object pose parameters.
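The feature detection step can be illustrated with a minimal sketch. It assumes zero-mean normalized cross-correlation as the matching score and a small search window around the position predicted from the previous frame; neither choice is stated above, and the names `ncc` and `match_template` are hypothetical. The pose-dependent skewing of the reference image is left out.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_t) ** 2 for b in t))
    # A flat patch has no structure to correlate with; score it as 0.
    return num / den if den > 0 else 0.0


def match_template(image, template, predicted, radius):
    """Search around `predicted` (row, col of the patch's top-left corner).

    Returns the best top-left position and its score. The window is
    clipped so that the template always lies fully inside the image.
    """
    th, tw = len(template), len(template[0])
    r0, c0 = predicted
    best_pos, best_score = None, -2.0  # NCC is always >= -1
    for r in range(max(0, r0 - radius), min(len(image) - th, r0 + radius) + 1):
        for c in range(max(0, c0 - radius), min(len(image[0]) - tw, c0 + radius) + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Synthetic example: a 3x3 feature pasted into an empty 8x8 image at (4, 3).
template = [[10, 50, 10],
            [50, 200, 50],
            [10, 50, 10]]
image = [[0] * 8 for _ in range(8)]
for dr in range(3):
    for dc in range(3):
        image[4 + dr][3 + dc] = template[dr][dc]

# The feature was at (3, 3) in the previous frame; search 2 pixels around it.
best_pos, best_score = match_template(image, template, (3, 3), 2)
print(best_pos, best_score)
```

Restricting the search to a small window is what keeps the per-frame cost low in this sketch; a larger `radius` tolerates faster motion at a quadratic increase in the number of correlations.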
Tracking verification using geometric invariants contributes to the system's high processing rate and reliability. Once only the successfully tracked features have been selected, the object's position and orientation are computed from the positions of those features in the image. The prestored model is then projected using the computed object pose, and the composite image is displayed on the monitor.
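The text does not say which geometric invariants are used. As one illustration only, the sketch below assumes five coplanar feature points and uses a standard projective invariant formed from ratios of 3x3 determinants of their homogeneous coordinates. Its value is unchanged by any perspective view of the plane, so a tracked feature set whose invariant departs from the model value can be rejected. The function names, the tolerance, and the homography `H` standing in for a camera view are all hypothetical.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def five_point_invariant(pts):
    """Projective invariant of five coplanar points (no three collinear)."""
    p1, p2, p3, p4, p5 = [(x, y, 1.0) for x, y in pts]
    num = det3(p4, p3, p1) * det3(p5, p2, p1)
    den = det3(p4, p2, p1) * det3(p5, p3, p1)
    return num / den


def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography (a perspective view of a plane)."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def is_consistent(model_pts, image_pts, rel_tol=0.05):
    """Accept the tracked points only if their invariant matches the model's."""
    ref = five_point_invariant(model_pts)
    return abs(five_point_invariant(image_pts) - ref) <= rel_tol * abs(ref)


# Five model points on a planar face of the object (arbitrary units).
model = [(0, 0), (4, 0), (4, 3), (0, 3), (1, 2)]

# Hypothetical view of that plane.
H = [[1.1, 0.2, 5.0],
     [-0.1, 0.9, 3.0],
     [0.001, 0.002, 1.0]]

tracked = [apply_homography(H, p) for p in model]
# Simulate a tracking failure: the fifth feature locks onto the wrong spot.
mistracked = tracked[:4] + [apply_homography(H, (3, 1))]

print(is_consistent(model, tracked))     # True
print(is_consistent(model, mistracked))  # False
```

A check of this kind needs no pose estimate, which is why it can run before the pose computation and filter its input.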
At the start of operation, the system displays a wire frame of the object and requires the user to align the object with the wire frame. The object is then detected by template matching against multiple templates captured in advance under various illumination conditions, which increases robustness. The computational complexity of template matching is reduced by approximating the reference images as linear combinations of a few major eigenvectors. The Karhunen-Loeve expansion gives the major eigenvectors that are optimal in the least-squares sense. This compression greatly reduces the computational cost of template matching.
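The saving from the eigenvector approximation can be sketched as follows. The sketch assumes NumPy, random synthetic templates, and a plain inner product as the matching score; `build_eigenbasis` and `approx_correlations` are hypothetical names. Each reference image is written as the mean image plus a combination of `k` eigenvectors, so the scores against all `N` templates follow from only `k + 1` inner products with the image patch.

```python
import numpy as np


def build_eigenbasis(templates, k):
    """Karhunen-Loeve (PCA) basis for flattened templates of shape (N, D).

    Returns the mean image (D,), the k leading eigenvectors (k, D) and
    each template's coefficients in that basis (N, k).
    """
    mean = templates.mean(axis=0)
    centered = templates - mean
    # Rows of vt are the eigenvectors of the sample covariance matrix,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]
    coeffs = centered @ basis.T
    return mean, basis, coeffs


def approx_correlations(patch, mean, basis, coeffs):
    """Approximate <patch, template_i> for every template i at once."""
    proj = basis @ patch                 # k inner products with the patch
    return patch @ mean + coeffs @ proj  # one more for the mean image


rng = np.random.default_rng(0)
templates = rng.random((6, 64))  # six synthetic 8x8 templates, flattened
patch = rng.random(64)           # one image patch to be matched
exact = templates @ patch        # N = 6 inner products

# With k = N - 1 the expansion reproduces the templates exactly.
mean, basis, coeffs = build_eigenbasis(templates, k=5)
scores_full = approx_correlations(patch, mean, basis, coeffs)

# With fewer eigenvectors the scores are approximate but cheaper.
mean2, basis2, coeffs2 = build_eigenbasis(templates, k=2)
scores_k2 = approx_correlations(patch, mean2, basis2, coeffs2)
print(np.max(np.abs(scores_k2 - exact)))
```

Random templates compress poorly, so the error at `k = 2` here is not representative; images of one object under different illumination are strongly correlated, which is the case where a few eigenvectors are expected to suffice.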
The system, implemented on multiple TMS320C40 DSPs, achieves real-time tracking at the video frame rate (30 Hz). It tracks templates within a relatively wide range. The images below show how it works. Publications are here.
Overlay of an image on a PC.
Overlay of a bone on a leg.