PITTSBURGH—Carnegie Mellon researchers have developed a system that increases the accuracy of face recognition by computer.
After a slow start in the 1970s, interest and progress in face recognition technology have accelerated, first as multimedia applications began to emerge in the 1990s and then as its role in security applications began to attract international attention after Sept. 11, 2001.
The basis of the new technology is Carnegie Mellon's PIE (which stands for Pose, Illumination and Expression) Database, developed under the direction of university professor and internationally renowned vision expert Takeo Kanade.
Between October and December 2000 we collected a database of 41,368 images of 68 people. By extending the CMU 3D Room we were able to image each person under 13 different poses, 43 different illumination conditions, and with 4 different expressions. We call this database the CMU Pose, Illumination, and Expression (PIE) database.
People have the ability to recognize the identity of a human face from pictures taken in various poses, under different lighting conditions, and even when they haven't seen the person for a long time. Computers don't have this expertise.
Attempts to give computers the ability to recognize a human face began more than 30 years ago. My Ph.D. thesis detailed one of the earliest computer programs that tried to automate the process of face recognition, including digitizing a face, finding its location, localizing its features, computing various attribute values and recognizing its identity.
Automating human face recognition is a very difficult task, especially if one wishes to deal with a variety of poses and different kinds of illumination. In fact, the Face Recognition Vendor Test 2000, sponsored by the Department of Defense and the National Institute of Justice, reports that the recognition rate of representative face recognition programs drops by 20 percent under different illumination conditions, and by as much as 75 percent for different poses.
The first figure below illustrates the face recognition problem. We have to deal with at least three axes of variables: person, pose and illumination. There are a very large number of possible images of each person (shown in each plane) due to different poses and lighting conditions. A typical face recognition problem is as follows. We are given a gallery of facial images of many people taken in a particular pose and under varying lighting conditions (that is, one image from the whole set of possible images of each person). The task is to tell which plane (i.e., person) a face image at hand, called a probe image, belongs to, even though the probe image is likely to be very different from the gallery image of the same person and, of course, from those of other people as well. To cope with this difficulty, one needs to nullify the effect of illumination and consider how the facial features appear to change due to variations in pose.
To study this, we have developed the PIE Image Database. A subject sits in a room with 13 cameras and 17 flashes, each positioned to look at him/her from various angles. Images of all the combinations of poses and illumination angles were collected for 68 people. After three months, another set of images of the same subjects was collected.
Using the PIE Database, we have been developing an automated face recognition system that can recognize people in different poses and under different types of illumination. The structure of the system is illustrated in the second figure. After finding the location of the face in the image, the first step is to deal with the effect of illumination. In general, the intensity of an image is formed as a product of reflectance and illuminance. The reason that people seem to be able to cope with various lighting conditions in a real physical environment is that they "perceive" reflectance without noting its intensity.
Obviously, "computing" reflectance given only intensity is an ill-posed problem; we cannot know the components given only their product. However, it has been shown that it is possible to estimate reflectance from intensity as the solution of a large partial differential equation by imposing anisotropic smoothness, which simulates the function of people's retinal horizontal and amacrine cells.
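As a rough illustration of the idea only, and not the system's actual implementation, the following Python sketch works on a one-dimensional "image." It assumes intensity = reflectance x illumination, estimates a slowly varying illumination by edge-aware smoothing, and divides it out. The function name, the parameters `smoothness`, `edge_gain` and `sweeps`, and the contrast-based weighting are all assumptions made for this example.

```python
def estimate_reflectance(intensity, smoothness=4.0, edge_gain=50.0, sweeps=200):
    """Split a 1-D intensity signal into (reflectance, illumination).

    Assumes intensity = reflectance * illumination and that illumination
    is smooth except across strong edges. All parameters are illustrative.
    """
    n = len(intensity)

    # Edge-aware ("anisotropic") coupling between neighboring pixels:
    # the higher the local contrast, the weaker the smoothing across it.
    link = []
    for i in range(n - 1):
        a, b = intensity[i], intensity[i + 1]
        contrast = abs(a - b) / max(a + b, 1e-9)
        link.append(smoothness / (1.0 + edge_gain * contrast))

    # Gauss-Seidel sweeps that minimize
    #   sum_i (L_i - I_i)^2 + sum_i link_i * (L_{i+1} - L_i)^2
    # where I is the intensity and L the illumination estimate.
    illum = list(intensity)
    for _ in range(sweeps):
        for i in range(n):
            num, den = intensity[i], 1.0
            if i > 0:
                num += link[i - 1] * illum[i - 1]
                den += link[i - 1]
            if i < n - 1:
                num += link[i] * illum[i + 1]
                den += link[i]
            illum[i] = num / den

    # Dividing out the illumination leaves the reflectance estimate.
    reflectance = [v / max(l, 1e-9) for v, l in zip(intensity, illum)]
    return reflectance, illum


# The same made-up surface texture seen under dim and bright lighting.
texture = [0.5, 0.6, 0.5, 0.6, 0.5, 0.6, 0.5, 0.6]
dim = [t * 1.0 for t in texture]
bright = [t * 3.0 for t in texture]
r_dim, _ = estimate_reflectance(dim)
r_bright, _ = estimate_reflectance(bright)
```

In this toy setting the two reflectance estimates agree to within rounding error even though the second signal is three times brighter, which is the sense in which the result is insensitive to lighting. A real image would need the two-dimensional version of the same minimization.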
This "normalized" image is the input to the remaining process. Next, facial landmarks, such as the eyes and nose, are located, and a set of small areas is defined with respect to those landmark positions. Various attribute values of those areas are computed, such as intensity distribution, edge distribution and edge orientations. Those attribute values are compared with those of the gallery images to decide to whom the input probe image is "closest."
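The comparison step can be pictured with a minimal Python sketch. It assumes each face has already been reduced to one attribute vector per facial area and uses a plain Euclidean distance summed over areas. The function names, the distance measure and the sample numbers are hypothetical and chosen only for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two attribute vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_identity(probe, gallery):
    """Return (name, score) of the gallery entry closest to the probe.

    probe:   list of attribute vectors, one per facial area
    gallery: dict mapping a person's name to a list of the same shape
    """
    best_name, best_score = None, float("inf")
    for name, areas in gallery.items():
        # Sum the per-area distances; a smaller total means "closer."
        score = sum(distance(p, g) for p, g in zip(probe, areas))
        if score < best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Made-up attribute values for two gallery people and one probe image.
gallery = {
    "person_a": [[0.2, 0.8], [0.5, 0.5]],
    "person_b": [[0.9, 0.1], [0.3, 0.7]],
}
probe = [[0.25, 0.75], [0.45, 0.55]]
name, score = closest_identity(probe, gallery)
```

Here every area counts equally, which is the simplest possible choice.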
However, the key technique of our system is that we model and take into account how those attributes change as the pose changes. We have examined, analyzed and modeled such changes beforehand by using the PIE Database, since it consists of images of known pose and illumination conditions. The decision making is done by properly weighting the attributes based on the model. Naturally, the system does not know the pose of the input probe face image, but a hidden-variable technique in probabilistic modeling can still take advantage of the attribute change model.
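One way to picture the hidden-variable idea is the following Python sketch, which treats the unknown pose as a variable to be summed out. It assumes that, for each pose and facial area, the similarity score between probe and gallery follows a Gaussian whose parameters were learned beforehand, separately for "same person" and "different people." The poses, areas, numbers and function names below are invented for illustration and are not taken from the actual system.

```python
import math


def gaussian(x, mean, std):
    """Gaussian probability density at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical model: (mean, std) of the similarity score per pose and area.
# Areas that change a lot with pose are less informative in profile view.
MODEL = {
    "frontal": {
        "eye": {"same": (0.9, 0.05), "diff": (0.5, 0.15)},
        "cheek": {"same": (0.8, 0.10), "diff": (0.5, 0.15)},
    },
    "profile": {
        "eye": {"same": (0.7, 0.15), "diff": (0.5, 0.15)},
        "cheek": {"same": (0.55, 0.20), "diff": (0.5, 0.20)},
    },
}
POSE_PRIOR = {"frontal": 0.5, "profile": 0.5}  # assumed prior over poses


def prob_same(similarities, prior_same=0.5):
    """Posterior probability that probe and gallery show the same person.

    similarities: dict mapping area name to its similarity score.
    The probe pose is unknown, so the likelihood is summed over all poses.
    """
    likelihood = {"same": 0.0, "diff": 0.0}
    for pose, p_pose in POSE_PRIOR.items():
        for label in likelihood:
            term = p_pose
            for area, s in similarities.items():
                mean, std = MODEL[pose][area][label]
                term *= gaussian(s, mean, std)
            likelihood[label] += term
    num = prior_same * likelihood["same"]
    return num / (num + (1.0 - prior_same) * likelihood["diff"])


high = prob_same({"eye": 0.9, "cheek": 0.8})
low = prob_same({"eye": 0.5, "cheek": 0.5})
```

Because the spread of each Gaussian depends on the pose, areas that are unreliable in a given pose automatically contribute less to the decision, which is one simple reading of "properly weighting the attributes based on the model."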
We have shown that the system can handle up to plus or minus 35 degrees of pose and illumination variation without reducing the recognition rate by more than five percent. I will show various examples during my presentation.
What is the PIE Database? You never say. In graph 7, after finding the location of the face in the image (how?). Are we the only people who have made this discovery and developed a system? Is this patentable technology? You said it's based on Simon's system. Can that be described to me? Can you give me a quote about why this is important? How does it represent a breakthrough?