A Brief Introduction to the paper
A Face Annotation Framework with Partial Clustering and Interactive Labeling
What does the paper do?
The overall goal is to design a photo annotation system that reduces the human labor of labeling large collections of digital photos, in particular grouping them by the identity of the people (faces) they contain.
Recent advances in face detection and recognition make this possible, but only to a limited extent. Face detection now approaches commercial-grade performance, whereas face recognition is still far from being a useful and stable component. In practice, facial features are sensitive to changes in illumination, viewpoint, and pose, and are therefore unreliable.
This makes fully automatic grouping by faces an intractable problem. We instead take a semi-automatic approach that aims to keep human labor to a minimum.
The framework
First, we apply standard face detection techniques to extract faces from the images, compute multiple features for each face, and fuse them into a similarity matrix. This similarity matrix is not necessarily precise and may contain a lot of noise.
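As a rough illustration of the fusion step, the sketch below combines several per-feature similarity matrices into one by a weighted average. This page does not say how the features are fused, so the weighted average, the function name `fuse_similarities`, the feature names, and the example weights are all assumptions made for illustration.

```python
def fuse_similarities(matrices, weights):
    """Fuse per-feature similarity matrices by a weighted average.

    The weighted average is an assumed fusion rule, not the paper's.
    """
    if not matrices or len(matrices) != len(weights):
        raise ValueError("need one weight per similarity matrix")
    total = float(sum(weights))
    n = len(matrices[0])
    fused = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                fused[i][j] += (weight / total) * matrix[i][j]
    return fused


# Two hypothetical feature channels for three detected faces.
feature_a = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.4], [0.2, 0.4, 1.0]]
feature_b = [[1.0, 0.4, 0.6], [0.4, 1.0, 0.0], [0.6, 0.0, 1.0]]
fused = fuse_similarities([feature_a, feature_b], weights=[3, 1])
```

Because each input matrix is symmetric, the fused matrix is symmetric as well. Any noise in the individual features carries over into it, which is why the later steps do not trust every entry.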
We then divide the labeling process into two parts: an unsupervised part and an interactive part. The overall framework consists of the two parts described below.
Unsupervised part
In the unsupervised part, we cluster faces via Partial Clustering. As one contribution of the paper, Partial Clustering groups only the evident (good) clusters, in which similarity is consistent, and leaves all other faces in a "litterbin", where the similarity measure is contaminated by noise and unreliable.
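The idea of keeping only evident clusters and discarding the rest can be sketched as follows. This is a simplified stand-in, not the paper's Partial Clustering algorithm: it links faces whose similarity reaches a threshold, keeps a group only if every pair inside it also reaches that threshold, and sends everything else to the litterbin. The function name, `threshold`, and `min_size` are hypothetical.

```python
def partial_clustering(similarity, threshold=0.7, min_size=2):
    """Return (clusters, litterbin) for a symmetric similarity matrix.

    Stand-in rule, not the paper's algorithm: faces linked by
    similarity >= threshold form a candidate group; the group is kept
    only if it has at least min_size faces and every pair inside it
    is also >= threshold.
    """
    unvisited = set(range(len(similarity)))
    clusters, litterbin = [], []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        group, stack = [start], [start]
        while stack:  # depth-first search over strong links
            i = stack.pop()
            linked = [j for j in unvisited if similarity[i][j] >= threshold]
            for j in linked:
                unvisited.remove(j)
            group.extend(linked)
            stack.extend(linked)
        group.sort()
        consistent = all(
            similarity[a][b] >= threshold
            for k, a in enumerate(group)
            for b in group[k + 1:]
        )
        if len(group) >= min_size and consistent:
            clusters.append(group)
        else:
            litterbin.extend(group)
    return clusters, sorted(litterbin)


# Faces 0-2 are mutually similar; faces 3 and 4 are weakly related to all.
sim = [
    [1.0, 0.9, 0.8, 0.3, 0.1],
    [0.9, 1.0, 0.9, 0.2, 0.4],
    [0.8, 0.9, 1.0, 0.5, 0.2],
    [0.3, 0.2, 0.5, 1.0, 0.6],
    [0.1, 0.4, 0.2, 0.6, 1.0],
]
clusters, litterbin = partial_clustering(sim)
```

Here faces 0, 1, and 2 form one evident cluster, while faces 3 and 4 end up in the litterbin. A chain of faces that are only pairwise linked (A similar to B, B similar to C, but A not similar to C) is rejected as a whole, which mirrors the requirement that similarity inside a good cluster be consistent.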
Interactive part
In the interactive part, users are involved. They are asked to label the evident (good) clusters, which yields many labeled faces as "seeds" without making the experience tedious for them.
We then apply Efficient Labeling, the second contribution of the paper, to the resulting partially labeled face set.

This algorithm picks a list of faces Q, one list at a time, for the user to label. Treating the labeling process as an influx of information that resolves the ambiguity of the partially labeled set, it chooses the list deliberately so that the ratio of information gain to the estimated number of user interactions, or "information efficiency", is maximized. The expectation is that the user can label many faces with few interactions.
The number of user interactions is estimated via Subset Saliency, which indicates how cohesive Q is with respect to the other unlabeled faces.
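The selection criterion can be illustrated with a small sketch. It assumes each unlabeled face has a predicted distribution over names and takes the information gain of a list Q to be the summed entropy of its faces. That is an illustrative simplification, not necessarily the paper's definition. The interaction estimates are passed in as plain numbers, so Subset Saliency itself is not implemented here, and all names in the code are hypothetical.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits of one face's predicted name distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def information_efficiency(candidate, name_probs, estimated_interactions):
    """Ratio of information gain to estimated user interactions for a list Q.

    Assumption: the gain of labeling Q is the summed entropy of its faces.
    """
    gain = sum(entropy(name_probs[face]) for face in candidate)
    return gain / estimated_interactions


def pick_most_efficient(candidates, name_probs, interaction_estimates):
    """Return the candidate list with the highest information efficiency."""
    return max(
        candidates,
        key=lambda q: information_efficiency(
            q, name_probs, interaction_estimates[q]
        ),
    )


# Hypothetical predictions over two names for four unlabeled faces.
name_probs = {
    "f1": [0.5, 0.5],
    "f2": [0.5, 0.5],
    "f3": [0.9, 0.1],
    "f4": [0.5, 0.5],
}
candidates = [("f1", "f2"), ("f3", "f4")]
# Assumed estimates: a cohesive list needs fewer interactions.
interaction_estimates = {("f1", "f2"): 1, ("f3", "f4"): 2}
best = pick_most_efficient(candidates, name_probs, interaction_estimates)
```

In this toy setting the list `("f1", "f2")` wins: both faces are maximally ambiguous, so labeling them yields the most information, and the list is assumed to need only one interaction.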
Experiment
We conduct several experiments; detailed descriptions and results can be found in the paper.
Here we compare the overall performance of our framework with that of Riya (see http://www.riya.com).
Our framework outperforms Riya by about 46%.
Download PDF: A Face Annotation Framework with Partial Clustering and Interactive Labeling