Dongyeop Kang

I am a final-year Ph.D. student in the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University, working with my fantastic advisor, Eduard Hovy. I have interned at Facebook AI, the Allen Institute for AI (AI2), and Microsoft Research. My Ph.D. study has been supported by the Allen Institute for AI (AI2) Fellowship, the CMU Presidential Fellowship, and the ILJU Graduate Fellowship. During my studies, I completed my alternative military service in South Korea at Naver Labs and the KAIST Institute. Before joining CMU, I obtained my BS and MS in Computer Science at KAIST, Korea.

I'm interested in building human-like language generation systems. Natural language generation (NLG) is a key component of many NLP applications, such as dialogue systems (e.g., Alexa, Google Assistant), automatic email replies, news summarization, and more. NLG is the process of converting a computer-internal semantic representation of content into a correct surface form of English that accurately conveys that semantics.

One might think that the only information an NLG system needs is what is contained explicitly in the utterance. However, there is a multitude of implicit information that is NOT obvious on the surface. For instance, many different surface sentences can express the same meaning (i.e., denotation) while still carrying slightly different nuances (i.e., connotations). A human-like NLG system requires this richer information in order to accurately produce a single, specific output. What parameters are needed beyond what is reflected explicitly on the surface? These are the kinds of parameters that seem to be reflected in variations of language: external knowledge, intents, interpersonal information, speaker-internal information, and more. M. Halliday's theory of Systemic Functional Linguistics (SFL) (1978) suggests that such information can be categorized into three metafunctions: ideational, textual, and interpersonal, each of which in turn consists of various types of information.

My work focuses on repackaging some of the information in each metafunction into three facets: (1) knowledge augmentation for the ideational function, (2) structure imposition for the textual function, and (3) style variation for the interpersonal function, and on presenting effective computational methods for handling each facet across a wide range of generation tasks.






Before my Ph.D. study





Last updated in January 2020