Machine Learning Thesis Proposal
- Gates Hillman Centers
- Reddy Conference Room 4405
- HONGYANG ZHANG
- Ph.D. Student
- Machine Learning Department
- Carnegie Mellon University
Non-Convex Learning with Few Parameters
Non-convex learning has received significant attention in recent years due to its wide range of applications in big data problems such as computer vision, natural language processing, statistics, and theoretical computer science. Modern non-convex learning includes learning with sparsity, learning with low-rank approximations, and learning with deep neural networks, corresponding to the assumptions that data have only a few non-zero coordinates, lie on low-rank subspaces, and lie on low-dimensional manifolds, respectively.
Despite a large amount of work on non-convex learning, many fundamental questions remain unresolved. From the computational aspect, direct formulations of non-convex learning are NP-hard to optimize in general. Therefore, one long-standing question is how to design computationally efficient algorithms by leveraging the structure of the data. In this thesis, we develop new paradigms for reaching the global optimality of non-convex optimization in polynomial time. In particular, we design algorithms and analyze the optimization landscape (e.g., the duality gap) for (1-bit) compressed sensing, deep neural networks, generative adversarial networks (GANs), and matrix factorization.
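As a toy illustration of this kind of benign non-convex landscape (a sketch of the general phenomenon, not an algorithm from the thesis), plain gradient descent on the non-convex matrix factorization objective f(U, V) = ||UV^T - M||_F^2 / 2 typically reaches a global minimum from a small random initialization:

```python
import numpy as np

# Illustrative sketch only: minimize the non-convex objective
#   f(U, V) = ||U V^T - M||_F^2 / 2
# by plain gradient descent. All sizes and step sizes below are
# arbitrary choices for the demo, not values from the thesis.
rng = np.random.default_rng(0)
n, r = 20, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r target

U = 0.1 * rng.standard_normal((n, r))  # small random initialization
V = 0.1 * rng.standard_normal((n, r))
eta = 0.01                             # step size
for _ in range(2000):
    R = U @ V.T - M                    # residual
    # grad_U f = R V, grad_V f = R^T U; update both simultaneously
    U, V = U - eta * R @ V, V - eta * R.T @ U

print(np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
```

Despite the objective being non-convex in (U, V) jointly, the final relative error is essentially zero in this toy run, which is the kind of "no bad local minima" behavior the landscape analysis seeks to explain.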
From the statistical aspect, understanding the tight sample complexity of big data problems is an important research question as well. Intuitively, the intrinsic dimension of structured data should be much smaller than their ambient dimension. Because the tight sample complexity should be comparable to the intrinsic dimension rather than the ambient dimension, sample complexity that is sub-linear in the ambient dimension may be achievable. In this thesis, we design principled, practical, and scalable algorithms with near-optimal sample complexity for big data problems, including matrix completion, robust PCA, margin-based active learning, property testing, and compressed sensing.
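To make the intuition concrete (an illustrative sketch under standard low-rank assumptions, not the thesis's method): a rank-r matrix has n^2 entries but only O(nr) degrees of freedom, so far fewer than n^2 observed entries can suffice for exact recovery. A simple iterate-and-project heuristic for matrix completion, observing only a fraction of the entries, demonstrates this:

```python
import numpy as np

# Illustrative sketch only: recover a rank-r matrix from a subset of its
# entries by alternating between (a) enforcing the observed entries and
# (b) projecting back onto the set of rank-r matrices via a truncated SVD.
# The sampling rate and sizes are arbitrary demo choices.
rng = np.random.default_rng(1)
n, r = 60, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r target

p = 0.3                                  # observe ~30% of the n^2 entries
mask = rng.random((n, n)) < p

X = np.zeros((n, n))
for _ in range(200):
    X[mask] = M[mask]                    # enforce observed entries
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X = (U[:, :r] * s[:r]) @ Vt[:r]      # project back to rank r

print(np.linalg.norm(X - M) / np.linalg.norm(M))
```

Here roughly 0.3 * n^2 entries are observed, yet the rank-2 structure pins down the remaining entries, in line with the idea that sample complexity should scale with the intrinsic dimension (about nr) rather than the ambient one (n^2).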
Thesis Committee:
Maria-Florina Balcan (Co-Chair)
David P. Woodruff (Co-Chair)
Avrim Blum (Toyota Technological Institute at Chicago)