Language Technologies Ph.D. Thesis Defense

  • Gates Hillman Centers, Traffic21 Classroom 6501
  • Ph.D. Student, Language Technologies Institute, Carnegie Mellon University

Learning Cross-language and Cross-style Mappings with Limited Supervision

Recent natural language processing (NLP) research has increasingly focused on deep learning methods, producing superior results on a wide range of NLP tasks. Deep NLP models typically operate on dense vector representations of the input and can automatically extract multi-scale features from human-annotated data. However, human annotations are expensive and are often unevenly distributed across languages, domains, genres, and styles.

This thesis addresses multiple aspects of cross-language and cross-style mapping in text, overcoming limitations of existing methods and improving state-of-the-art results when sufficient labeled data are unavailable. By developing both task-oriented transfer learning models (e.g., for cross-language classification) and generic methods for mapping among embedded words or sentences, the thesis contributes a set of novel approaches that leverage unlabeled text data for effective and efficient mapping across languages and styles.

Thesis Committee:
Yiming Yang (Chair)
Jaime Carbonell
Graham Neubig
Ming Zhou (Microsoft Research Asia)
