Eric P. Xing
Recent Advances in Learning Sparse Structured Input/Output Models: Models, Algorithms, and Applications

In many high-dimensional structured input/output problems, such as genome-phenome association analysis, both input and output can contain tens of thousands, and sometimes even millions, of interrelated features. In such settings, learning a sparse and consistent structured predictive function can be of paramount importance for both the robustness and the interpretability of the model. Despite its importance, this problem has not been extensively explored in the machine learning and statistics literature. In this talk, I will present some recent results along this line. I will first present a family of sparse structured regression models in two contexts: uncovering true associations between linked genetic variations (inputs) and networked phenotypes (outputs), and reverse engineering time-varying networks from time series data. These models can be cast as efficiently solvable convex optimization problems and yield parsimonious and possibly consistent maximum likelihood estimates. Then I will present another class of new models, known as maximum entropy discrimination Markov networks, which address the same problem in the maximum margin paradigm but use an entropic regularizer that leads to a distribution of structured prediction functions that are simultaneously primal and dual sparse (i.e., with few support vectors and of low effective feature dimension). These models can be efficiently solved via a novel algorithm that builds on variational inference and existing solvers for the maximum margin Markov network, which is a special case of our proposed model.