Improving the HoG Descriptor
Carl Doersch and Alexei Efros
Abstract: The HoG descriptor has become one of the most popular low-level image
representations in computer vision: even a small improvement in its ability
to represent images would be useful. In this project, we explore several
ways to enhance HoG at minimal performance cost.
One approach is to
separate high-frequency gradients ('step edges'), which tend to mark object boundaries, from
low-frequency gradients ('diffuse gradients'), which tend to represent shading. We hypothesize
that these two types of gradients give different and complementary
information: the first indicates boundaries, whereas the second gives clues
about smooth 3D shape. As it stands, however, HoG represents only the
orientation of an edge, not its spatial extent, and so the distinction is lost. We propose several
algorithms to separate these two types of gradients. For example, the figure above
shows the result of a convex optimization at the image level, whose goal
is to find a sparse representation that explains the gradients of the
image at one orientation using both low-frequency (middle) and
high-frequency (right) components.
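
To make the idea of a sparse two-component explanation concrete, here is a minimal one-dimensional sketch. It is not the optimization used in this project, whose exact formulation is not given here. It assumes an L1-regularized least-squares model solved with ISTA (iterative soft thresholding), in which the gradient along one scanline is explained by isolated spikes (the high-frequency part) plus wide Gaussian atoms (the low-frequency part). The names `split_gradients`, `sigma_lo`, and `lam`, and all parameter values, are hypothetical.

```python
import numpy as np

def gaussian_atoms(n, sigma):
    """n x n matrix whose columns are unit-norm Gaussians, one centred at each position."""
    idx = np.arange(n)
    atoms = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / sigma) ** 2)
    return atoms / np.linalg.norm(atoms, axis=0, keepdims=True)

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def objective(D, x, g, lam):
    """0.5 * ||D x - g||^2 + lam * ||x||_1 (convex in x)."""
    r = D @ x - g
    return 0.5 * float(r @ r) + lam * float(np.abs(x).sum())

def split_gradients(g, sigma_lo=6.0, lam=0.02, n_iter=500):
    """Split a 1-D gradient signal into spike-like and diffuse parts (assumed model)."""
    n = g.size
    # Dictionary: identity columns model step edges, wide Gaussians model shading.
    D = np.hstack([np.eye(n), gaussian_atoms(n, sigma_lo)])
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the smooth term
    x = np.zeros(2 * n)
    for _ in range(n_iter):
        x = soft_threshold(x - step * (D.T @ (D @ x - g)), lam * step)
    hi = x[:n]            # high-frequency component (spike coefficients)
    lo = D[:, n:] @ x[n:]  # low-frequency component (sum of wide Gaussians)
    return hi, lo, objective(D, x, g, lam)

# Toy scanline gradient: a unit step edge (a single spike) plus a smooth ramp
# of the same total height (a broad bump with unit area).
n = 128
idx = np.arange(n)
g = np.zeros(n)
g[30] = 1.0
bump = np.exp(-0.5 * ((idx - 90) / 6.0) ** 2)
g += bump / bump.sum()

hi, lo, final_obj = split_gradients(g)
```

Both structures in the toy signal carry the same total gradient, so a magnitude-weighted orientation bin alone could not tell them apart. In a decomposition of this kind one would expect the spike to be explained mostly by `hi` and the broad bump mostly by `lo`, although how cleanly they separate depends on the assumed `lam` and `sigma_lo`.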
We also attempt to separate texture from contours more strongly. In
the current implementation of HoG, an SVM cannot become sensitive to a
black object on a white background and also to a white object on a black
background without also becoming somewhat sensitive to ordinary texture.
We argue that a very simple modification to HoG can help with this problem.
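
The polarity problem can be illustrated with a toy example. The sketch below does not implement the modification proposed here, which is not described on this page. It assumes a single HoG-like cell with signed orientation bins, no block normalization, and hand-picked linear weights standing in for a learned SVM. The names `signed_hog_cell`, `w_both`, and `w_signed` are hypothetical.

```python
import numpy as np

def signed_hog_cell(patch, n_bins=8):
    """Magnitude-weighted histogram of signed gradient orientations for one cell."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    # Assign each pixel to the nearest bin centre; bin 0 is centred on angle 0.
    bins = np.rint(ang / (2 * np.pi / n_bins)).astype(int) % n_bins
    return np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)

# Two contours of opposite polarity and one stripe texture, all 8 x 8 pixels.
dark_to_light = np.hstack([np.zeros((8, 4)), np.ones((8, 4))])
light_to_dark = 1.0 - dark_to_light
texture = np.tile(np.array([0.0, 1.0, 1.0, 0.0]), (8, 2))

h_dl = signed_hog_cell(dark_to_light)
h_ld = signed_hog_cell(light_to_dark)
h_tex = signed_hog_cell(texture)

# Weights that respond to both polarities: positive on the two opposite bins.
w_both = np.zeros(8)
w_both[0] = 1.0
w_both[4] = 1.0

# Weights that respond to one polarity only and penalize the opposite one.
w_signed = np.zeros(8)
w_signed[0] = 1.0
w_signed[4] = -1.0

scores_both = {name: float(w_both @ h) for name, h in
               [("dark_to_light", h_dl), ("light_to_dark", h_ld), ("texture", h_tex)]}
scores_signed = {name: float(w_signed @ h) for name, h in
                 [("dark_to_light", h_dl), ("light_to_dark", h_ld), ("texture", h_tex)]}
```

In this toy setting, the polarity-specific weights ignore the stripe texture but respond positively to only one of the two contours. The weights that accept both polarities score the two contours equally, yet score the texture higher still, because texture contains gradients of both signs.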
[pdf] (under construction)