Machine Learning Thesis Proposal

  • Remote Access - Zoom
  • Virtual Presentation
  • Ph.D. Student
  • Machine Learning Department
  • Carnegie Mellon University

Learning and Reasoning with Fast Semidefinite Programming and Mixing Methods

Semidefinite programming has long been a theoretically powerful tool for solving relaxations of challenging, often NP-hard optimization problems. However, it has typically not been practical for large-scale tasks, owing to the high memory and computational cost of typical SDP solvers. In this thesis, we aim to break this barrier and bring the power of SDPs to large-scale machine learning problems. To achieve this, we introduce a series of optimization solvers that operate on the low-rank or low-cardinality manifolds of the semidefinite variables. We find that in many domains, these methods allow SDP relaxations to exceed the state of the art in terms of both computational cost and the relevant performance metrics.
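As a concrete illustration of the low-rank approach described above (notation ours, not taken from the proposal), the classical MAXCUT-style SDP and its low-rank factorized reformulation can be sketched as:

```latex
\begin{align*}
\text{(SDP)} \quad
  & \min_{X \succeq 0} \ \langle C, X \rangle
  \quad \text{s.t.} \quad X_{ii} = 1, \ i = 1, \dots, n, \\
\text{(low-rank form)} \quad
  & \min_{V \in \mathbb{R}^{k \times n}} \ \langle C, V^{\top} V \rangle
  \quad \text{s.t.} \quad \|v_i\| = 1, \ i = 1, \dots, n,
\end{align*}
```

where the factorization X = VᵀV (with columns v_i and rank at most k ≪ n) replaces the O(n²) matrix variable with an O(nk) one, which is the source of the memory and computational savings.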

First, we propose the Mixing method, a low-rank SDP solver aimed at the classical MAXCUT SDP relaxation. We also show that the Mixing method accurately estimates the mode and partition function of pairwise Markov random fields, scaling to millions of variables. Further, we show how to learn the parameters inside SDPs by differentiating analytically through the optimization problem with implicit differentiation and the Mixing method, which leads to a differentiable SAT solver that can be integrated within the loop of larger deep learning systems. Finally, we propose a separate variant aimed at low-cardinality SDPs, and demonstrate how to apply it to community detection, finding clusters within large-scale networks.
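To make the core idea concrete, the Mixing method performs cyclic coordinate updates over the unit-norm columns of the low-rank factor V, setting each column to the (negated, normalized) weighted sum of its neighbors. The following is a minimal illustrative sketch, not the authors' implementation; the function name and interface are ours:

```python
import numpy as np

def mixing_method(C, k, iters=300, seed=0):
    """Minimal sketch of a Mixing-method-style coordinate descent for
    min <C, V^T V> subject to each column of V having unit norm
    (the low-rank form of the MAXCUT-style SDP with diag(X) = 1)."""
    n = C.shape[0]
    rng = np.random.default_rng(seed)
    # Random unit-norm initialization of the k x n factor V.
    V = rng.standard_normal((k, n))
    V /= np.linalg.norm(V, axis=0, keepdims=True)
    for _ in range(iters):
        for i in range(n):
            # Gradient of the objective w.r.t. v_i, excluding the
            # (constant) diagonal term C_ii * v_i . v_i.
            g = V @ C[:, i] - C[i, i] * V[:, i]
            norm = np.linalg.norm(g)
            if norm > 0:
                # Exact minimizer of the coordinate subproblem.
                V[:, i] = -g / norm
    return V
```

For MAXCUT on a weighted graph, taking C to be the adjacency matrix and minimizing ⟨C, VᵀV⟩ is equivalent (up to constants) to maximizing the relaxed cut value; each inner update solves its coordinate subproblem exactly, so the objective decreases monotonically.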

Moving forward, we propose two new works to continue in this direction. In the first, we develop a transformer-like equilibrium layer for language modeling, which has a constant memory requirement and convergence guaranteed by the Mixing method. In the second, we develop a graph convolutional network that learns representations in an unsupervised manner based on the SDP relaxation of the Weisfeiler-Lehman algorithm for graph isomorphism.

Thesis Committee: 
Zico Kolter (Chair)
Gary Miller
Ryan Tibshirani
Fatma Kilinc-Karzan
Max Welling (University of Amsterdam)

Zoom Participation. See announcement.
