Language Technologies Thesis Proposal

  • Gates Hillman Centers
  • Reddy Conference Room 4405
  • Zhilin Yang
  • Ph.D. Student
  • Language Technologies Institute
  • Carnegie Mellon University

Deep Generative Modeling with Applications in Semi-Supervised Learning

The ultimate goal of generative modeling is to model the probability of the world, either implicitly or explicitly. In practice, researchers devise models to estimate the probability of data that is often unlabeled in its natural form, such as text corpora and images. Generative modeling not only serves as a bridge towards characterizing and understanding the world from a probabilistic perspective, but also has the benefit of learning transferable features from unlabeled data. This thesis proposes novel deep learning architectures for generative modeling, along with semi-supervised learning algorithms that leverage generative modeling on unlabeled data to improve performance on downstream tasks.

Specifically, the thesis consists of two parts: better architectures to improve generative modeling, and applications of generative modeling in semi-supervised learning. In the first part, we identify an expressiveness bottleneck of prior neural language models and propose a high-rank language model, the Mixture of Softmaxes (MoS), to break through this bottleneck. We then propose a second high-rank language model that trains faster than MoS while maintaining the capacity to break the bottleneck. In the second part, we present four semi-supervised learning algorithms based on generative approaches: generating low-density adversarial samples, generating natural language questions given the context, generating random walk paths on a graph, and language modeling.
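To make the bottleneck and the MoS remedy concrete, the following is a minimal PyTorch-style sketch of the idea: a single softmax over one projected hidden state bounds the rank of the log-probability matrix by the hidden size, whereas MoS mixes K softmaxes computed from K context-dependent projections, lifting that rank constraint. The class name, the number of components, and the tanh projection here are illustrative assumptions for this sketch, not details taken from the proposal document.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfSoftmaxes(nn.Module):
    """Illustrative MoS output layer (hypothetical naming).

    A plain softmax layer computes softmax(W h), whose log-probability
    matrix over all contexts has rank at most the hidden size. MoS
    instead outputs a convex combination of K softmaxes, each over a
    different context-dependent projection of the hidden state.
    """

    def __init__(self, hidden_size: int, vocab_size: int, n_components: int = 5):
        super().__init__()
        self.n_components = n_components
        # K context-dependent projections of the hidden state.
        self.projection = nn.Linear(hidden_size, n_components * hidden_size)
        # Mixture weights over the K softmax components.
        self.prior = nn.Linear(hidden_size, n_components)
        # Output projection to the vocabulary, shared across components.
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, hidden_size)
        batch = hidden.size(0)
        # K projected context vectors: (batch, K, hidden_size).
        h = torch.tanh(self.projection(hidden)).view(batch, self.n_components, -1)
        # Per-component distribution over the vocabulary: (batch, K, vocab).
        component_probs = F.softmax(self.decoder(h), dim=-1)
        # Mixture weights: (batch, K, 1).
        pi = F.softmax(self.prior(hidden), dim=-1).unsqueeze(-1)
        # Final next-word distribution: convex combination of K softmaxes.
        return (pi * component_probs).sum(dim=1)  # (batch, vocab_size)
```

The extra cost relative to a single softmax is the K-way projection and the prior network; this overhead is what the second, faster high-rank model proposed in the thesis aims to reduce while preserving the ability to break the bottleneck.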

Thesis Committee:
Ruslan Salakhutdinov (Chair)
William W. Cohen (Co-Chair)
Graham Neubig
Jason Weston (Facebook AI Research)

