One way to simplify the problem is to make some assumptions about the environment. One such assumption is that the road is planar, which allows a planar parallax model of the road to be used to predict the position of objects in the environment.
This means that for a given scene, the predicted position of every object is computed, given the planar road model, current velocity, and camera parameters.
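As a rough illustration of this kind of prediction, here is a minimal sketch rather than the system's actual code. It assumes a pinhole camera whose optical axis is parallel to a flat road, pure forward translation between frames, and image coordinates measured from the principal point with y increasing downward, so the horizon lies at y = 0. The function name and all parameter names are hypothetical.

```python
def predict_road_point(x, y, focal_px, cam_height_m, speed_mps, dt_s):
    """Predict where a road-plane pixel (x, y) appears after dt_s seconds.

    Assumptions (not from the original text): pinhole camera, optical axis
    parallel to a flat road, pure forward motion, y measured downward from
    the principal point.
    """
    if y <= 0:
        raise ValueError("pixel is at or above the horizon, so it is not on the road plane")
    # Flat-road geometry: a road point at depth Z projects to row y = f * h / Z.
    depth = focal_px * cam_height_m / y
    # Moving forward brings the point closer by speed * dt.
    new_depth = depth - speed_mps * dt_s
    if new_depth <= 0:
        raise ValueError("point passes behind the camera within dt_s")
    # Both image coordinates scale by the ratio of old depth to new depth.
    scale = depth / new_depth
    return x * scale, y * scale


if __name__ == "__main__":
    # Hypothetical numbers: 500 px focal length, camera 1.5 m above the road,
    # 10 m/s forward speed, 0.5 s between frames.
    print(predict_road_point(40.0, 50.0, 500.0, 1.5, 10.0, 0.5))
```

A pixel that really lies on the road should land near the predicted location in the next frame; anything standing above the road plane, or moving on its own, will not.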
Currently, the system runs at about 15 frames per second on a 166 MHz Pentium processor. Rather than processing the entire image, only selected windows are processed, which keeps the computational cost down. A prediction is generated for each window, and the prediction is compared with the actual image motion (within a tolerance). If a pixel in the window does not match its prediction, it is tagged as an obstacle. I look for horizontal lines of obstacle pixels and assume that they are vehicles. From these detections, very rough range estimates can also be calculated.
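The tagging and line-finding steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the system's implementation: each window is a small 2-D grid of scalar values, the functions `tag_obstacles`, `horizontal_runs`, and `rough_range` are hypothetical names, and the flat-road range formula is only one plausible way to get a crude range from a detection's image row.

```python
def tag_obstacles(predicted, observed, tolerance):
    """Return a mask that is True where a pixel misses its prediction.

    predicted and observed are equally sized 2-D lists of scalars.
    """
    return [
        [abs(o - p) > tolerance for p, o in zip(pred_row, obs_row)]
        for pred_row, obs_row in zip(predicted, observed)
    ]


def horizontal_runs(mask, min_length):
    """Return (row, start_col, end_col) for each horizontal run of obstacle
    pixels that is at least min_length pixels long."""
    runs = []
    for r, row in enumerate(mask):
        start = None
        # A trailing False closes any run that reaches the end of the row.
        for c, flag in enumerate(list(row) + [False]):
            if flag and start is None:
                start = c
            elif not flag and start is not None:
                if c - start >= min_length:
                    runs.append((r, start, c - 1))
                start = None
    return runs


def rough_range(row_y, focal_px, cam_height_m):
    """Crude flat-road range for a detection whose base is at image row row_y
    (measured downward from the horizon). An assumption for illustration."""
    if row_y <= 0:
        raise ValueError("row must be below the horizon")
    return focal_px * cam_height_m / row_y


if __name__ == "__main__":
    predicted = [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
    observed = [[0, 5, 5, 5, 0], [0, 5, 0, 5, 0]]
    mask = tag_obstacles(predicted, observed, tolerance=1)
    print(horizontal_runs(mask, min_length=3))
```

Requiring a minimum run length is one simple way to ignore isolated mismatches; the threshold value would need to be tuned and is not given in the text.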
The following two .mpg clips show the system running in real time. The first covers a stretch of road with numerous vehicles passing us. The second demonstrates the system's tolerance of lane markers.
Batavia, P. H., Pomerleau, D. A., Thorpe, C. E., Overtaking Vehicle Detection Using Implicit Optical Flow, to appear in Proceedings of the IEEE Intelligent Transportation Systems Conference, Boston, MA, 1997.
Batavia, P. H., Pomerleau, D. A., Thorpe, C. E., Detecting Overtaking Vehicles With Implicit Optical Flow, CMU Tech Report CMU-RI-97-28, 1997.