Non-Rigid Structure from Motion (NRSfM) refers to the problem of reconstructing cameras and the 3D point cloud of a non-rigid object from a sequence of images with 2D correspondences. Current NRSfM algorithms are limited in two main respects: (i) the number of images they can process, and (ii) the type of shape variability they can handle. These difficulties stem largely from the conflict between the conditioning of the system and the degrees of freedom needed to model non-rigid objects, which has hampered the practical utility of NRSfM for many applications within vision.
In this thesis we propose a novel hierarchical sparse coding model which can exploit an exceedingly over-complete shape dictionary while still maintaining a sufficiently constrained system to solve. Further, we propose a novel deep neural network to solve the resulting hierarchical block-sparse dictionary learning problem, capable of handling unprecedented scale and shape complexity. Moreover, the proposed hierarchical sparse model admits a simple extension to handle invisible points without resorting to matrix completion. Extensive experiments demonstrate the impressive performance of our approach, which exhibits superior precision and robustness against all available state-of-the-art works, in some instances by an order of magnitude. We further propose a new quality measure, based on the network weights, which circumvents the need for 3D ground truth to ascertain the confidence we have in the reconstruction.
Simon Lucey (Chair)
Hongdong Li (Australian National University)