NeuLab Presentations at ACL 2018

NeuLab members have seven main conference paper presentations and are organizing a workshop at ACL 2018, the flagship conference in natural language processing and computational linguistics! Come check them out if you’re in Melbourne for the conference.

Main Conference Papers

StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing

  • Authors: Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig.
  • Time: Monday, July 16, 14:50-15:15. Plenary, MCEC.

Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures. Annotating NL utterances with their corresponding MRs is expensive and time-consuming, so the limited availability of labeled data often becomes the bottleneck for data-driven, supervised models. We introduce StructVAE, a variational auto-encoding model for semi-supervised semantic parsing, which learns from both limited amounts of parallel data and readily available unlabeled NL utterances. StructVAE models MRs not observed in the unlabeled data as tree-structured latent variables. Experiments on semantic parsing in the ATIS domain and on Python code generation show that with extra unlabeled data, StructVAE outperforms strong supervised models. Code is available here.
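
To make the training objective concrete, here is a minimal sketch of the two loss terms, with toy log-probabilities standing in for the outputs of the neural parser q(z|x), the reconstruction model p(x|z), and the prior p(z); the real system parameterizes these with networks over tree-structured MRs and uses a score-function gradient estimator for the discrete latent tree.

```python
import numpy as np

def supervised_loss(log_q_z_given_x):
    # Labeled (utterance, MR) pairs train the parser q(z|x) directly.
    return -log_q_z_given_x

def unsupervised_loss(log_q, log_p_x_given_z, log_p_z):
    # Negative ELBO for an unlabeled utterance, with the latent MR z
    # sampled from q(z|x); since z is a discrete tree, the paper uses a
    # score-function (REINFORCE) estimator for gradients through q.
    return -(log_p_x_given_z + log_p_z - log_q)

# Hypothetical values standing in for network outputs on one batch:
loss = supervised_loss(np.log(0.4)) + unsupervised_loss(
    np.log(0.3), np.log(0.2), np.log(0.1))
print(loss)
```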

Stack-Pointer Networks for Dependency Parsing

  • Authors: Xuezhe Ma, Zecong Hu, Jingzhou Liu, Nanyun Peng, Graham Neubig, and Eduard Hovy.
  • Time: Tuesday, July 17, 10:55-11:20. Room 220, MCEC.

We introduce a novel architecture for dependency parsing: stack-pointer networks (StackPtr). Combining pointer networks (Vinyals et al., 2015) with an internal stack, the proposed model first reads and encodes the whole sentence, then builds the dependency tree top-down (from root to leaf) in a depth-first fashion. The stack tracks the status of the depth-first search, and at each step the pointer network selects one child for the word at the top of the stack. The StackPtr parser benefits from the information of the whole sentence and all previously derived subtree structures, and removes the left-to-right restriction of classical transition-based parsers. Yet the number of steps for building any parse tree (including non-projective ones) is linear in the length of the sentence, just as in other transition-based parsers; since each step scores every word as a candidate child, this yields an efficient decoding algorithm with O(n^2) time complexity overall. We evaluate our model on 29 treebanks spanning 20 languages and different dependency annotation schemas, and achieve state-of-the-art performance on 21 of them. Code is available here.
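
The control flow of the top-down decoder is easy to see in isolation. The sketch below substitutes a fixed gold tree for the trained pointer network (our stub returns None where the real model points back at the top-of-stack word to signal a pop), so only the stack discipline is shown, not the learned scoring.

```python
def decode(select_child):
    heads = {}                # child index -> head index
    stack = [0]               # start from the virtual ROOT at index 0
    while stack:
        head = stack[-1]
        child = select_child(head, heads)   # one pointer-network step
        if child is None:     # head has no remaining children: pop
            stack.pop()
        else:
            heads[child] = head
            stack.append(child)             # depth-first descent
    return heads

# Stub standing in for the pointer network: gold arcs of a toy
# three-word sentence (ROOT -> w2, w2 -> w1, w2 -> w3).
GOLD = {0: [2], 2: [1, 3]}

def select_child(head, heads):
    for c in GOLD.get(head, []):
        if c not in heads:
            return c
    return None

print(decode(select_child))   # {2: 0, 1: 2, 3: 2}
```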

Learning to Generate Move-by-Move Commentary for Chess Games from Large-Scale Social Forum Data

  • Authors: Harsh Jhamtani, Varun Gangal, Eduard Hovy, Graham Neubig, Taylor Berg-Kirkpatrick.
  • Time: Tuesday, July 17, 12:30-14:00. Poster Session.

This paper examines the problem of generating natural language descriptions of chess games. We introduce a new large-scale chess commentary dataset and propose methods to generate commentary for individual moves in a chess game. The introduced dataset consists of more than 298K chess move-commentary pairs across 11K chess games. We highlight how this task poses unique research challenges in natural language generation: the data contain a large variety of commentary styles and frequently depend on pragmatic context. We benchmark various baselines and propose an end-to-end trainable neural model that takes into account multiple pragmatic aspects of the game state that may be commented upon when describing a given chess move. Through a human study on predictions for a subset of the data that deals with direct move descriptions, we observe that outputs from our models are rated as similar to ground-truth commentary texts in terms of correctness and fluency. Code and data are both available here.

Extreme Adaptation for Personalized Neural Machine Translation

  • Authors: Paul Michel and Graham Neubig.
  • Time: Tuesday, July 17, 12:30-14:00. Poster Session.

Every person speaks or writes their own flavor of their native language, influenced by a number of factors: the content they tend to talk about, their gender, their social status, or their geographical origin. When attempting to perform Machine Translation (MT), these variations have a significant effect on how the system should perform translation, but this is not captured well by standard one-size-fits-all models. In this paper, we propose a simple and parameter-efficient adaptation technique that only requires adapting the bias of the output softmax to each particular user of the MT system, either directly or through a factored approximation. Experiments on TED talks in three languages demonstrate improvements in translation accuracy, and better reflection of speaker traits in the target text. Code and data are both available.
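
As a sketch of what "adapting only the output bias" means in practice (the variable names are ours, not the released code's), a per-user bias is added to the vocabulary logits, either as one full learned vector per user or through a low-rank factored approximation via a small user embedding and a shared basis:

```python
import numpy as np

rng = np.random.default_rng(0)
V, H, U, K = 1000, 64, 50, 10   # vocab, hidden, users, factorization rank

W = rng.normal(size=(V, H))     # shared output projection of the base NMT model
b = np.zeros(V)                 # shared output bias
h = rng.normal(size=H)          # decoder state at one time step

B_full = np.zeros((U, V))            # "full" variant: one bias vector per user
S = rng.normal(size=(U, K)) * 0.01   # "factored" variant: user embeddings...
B_basis = rng.normal(size=(K, V))    # ...times a shared basis over the vocab

def logits(user, factored=True):
    user_bias = S[user] @ B_basis if factored else B_full[user]
    return W @ h + b + user_bias     # only user_bias is adapted per speaker

print(logits(user=3).shape)   # (1000,)
```

The appeal is parameter efficiency: the factored variant adds only K numbers per user on top of a shared basis, rather than a full vocabulary-sized vector.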

Sparse and Constrained Attention for Neural Machine Translation

  • Authors: Chaitanya Malaviya, Pedro Ferreira, Andre F. T. Martins.
  • Time: Tuesday, July 17, 14:00-14:15. Room 203/204, MCEC.

In NMT, words are sometimes dropped from the source or generated repeatedly in the translation. We explore novel strategies to address this coverage problem that change only the attention transformation. Our approach allocates fertilities to source words, which are used to bound the attention each word can receive. We experiment with various sparse and constrained attention transformations and propose a new one, constrained sparsemax, which is shown to be differentiable and sparse. Empirical evaluation is provided on three language pairs. Code is available here.
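
For intuition, below is the unconstrained sparsemax transformation (Martins and Astudillo, 2016) that the paper builds on: a Euclidean projection of attention scores onto the probability simplex that assigns exactly zero weight to low-scoring source words. The paper's constrained sparsemax additionally imposes per-word upper bounds derived from the predicted fertilities, which requires a different projection; this sketch shows only the unconstrained case.

```python
import numpy as np

def sparsemax(z):
    # Project scores z onto the simplex: find the threshold tau such that
    # p_i = max(z_i - tau, 0) sums to one; entries below tau become zero.
    z_sorted = np.sort(z)[::-1]
    cssv = np.cumsum(z_sorted)
    k = np.arange(1, len(z) + 1)
    support = 1 + k * z_sorted > cssv      # prefix of positions kept
    k_max = k[support][-1]
    tau = (cssv[k_max - 1] - 1) / k_max
    return np.maximum(z - tau, 0.0)

scores = np.array([1.5, 0.3, 1.4, -0.8])
print(sparsemax(scores))        # [0.55 0.   0.45 0.  ] -- sparse attention
print(sparsemax(scores).sum())  # 1.0
```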

Neural Factor Graph Models for Cross-lingual Morphological Tagging

  • Authors: Chaitanya Malaviya, Matthew R. Gormley, Graham Neubig.
  • Time: Wednesday, July 18, 12:30-14:00. Poster Session.

Morphological analysis involves predicting the syntactic traits of a word (e.g. {POS: Noun, Case: Acc, Gender: Fem}). Previous work in morphological tagging improves performance for low-resource languages (LRLs) through cross-lingual training with a high-resource language (HRL) from the same family, but is limited by the strict—often false—assumption that tag sets exactly overlap between the HRL and LRL. In this paper we propose a method for cross-lingual morphological tagging that aims to improve information sharing between languages by relaxing this assumption. The proposed model uses factorial conditional random fields with neural network potentials, making it possible to (1) utilize the expressive power of neural network representations to smooth over superficial differences in the surface forms, (2) model pairwise and transitive relationships between tags, and (3) accurately generate tag sets that are unseen or rare in the training data. Experiments on four languages from the Universal Dependencies Treebank (Nivre et al., 2017) demonstrate superior tagging accuracies over existing cross-lingual approaches. Code is available here.
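
To make the factor structure concrete, here is a toy scoring function for one joint tag assignment, with random arrays standing in for the learned neural potentials; the actual model runs approximate inference over these emission, transition, and pairwise factors rather than scoring assignments one by one.

```python
import numpy as np

FEATURES = ["POS", "Gender"]
VALUES = {"POS": ["Noun", "Verb"], "Gender": ["Fem", "Masc"]}
T = 3  # sentence length

rng = np.random.default_rng(0)
# emission[f][t, v]: neural potential for feature f taking value v at word t
emission = {f: rng.normal(size=(T, len(VALUES[f]))) for f in FEATURES}
# transition[f][v, v']: same feature across adjacent words
transition = {f: rng.normal(size=(len(VALUES[f]),) * 2) for f in FEATURES}
# pairwise[(f, g)][v, w]: different features at the same word
pairwise = {("POS", "Gender"): rng.normal(size=(2, 2))}

def score(assignment):
    # assignment[f] is a list of value indices, one per word
    s = 0.0
    for f in FEATURES:
        s += sum(emission[f][t, assignment[f][t]] for t in range(T))
        s += sum(transition[f][assignment[f][t], assignment[f][t + 1]]
                 for t in range(T - 1))
    for (f, g), pot in pairwise.items():
        s += sum(pot[assignment[f][t], assignment[g][t]] for t in range(T))
    return s

print(score({"POS": [0, 1, 0], "Gender": [0, 0, 1]}))
```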

Automatic Estimation of Simultaneous Interpreter Performance

  • Authors: Craig Stewart, Nikolai Vogler, Junjie Hu, Jordan Boyd-Graber, Graham Neubig.
  • Time: Wednesday, July 18, 14:15-14:30. Room 203/204, MCEC.

Simultaneous interpretation, translation of the spoken word in real-time, is both highly challenging and physically demanding. Methods to predict interpreter confidence and the adequacy of the interpreted message have a number of potential applications, such as in computer-assisted interpretation interfaces or pedagogical tools. We propose the task of predicting simultaneous interpreter performance by building on existing methodology for quality estimation (QE) of machine translation output. In experiments over five settings in three language pairs, we extend a QE pipeline to estimate interpreter performance (as approximated by the METEOR evaluation metric) and propose novel features reflecting interpretation strategy and evaluation measures that further improve prediction accuracy. Code is available here.
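
At its core the setup is sentence-level regression: featurize each interpreted segment, then fit a regressor whose target is that segment's METEOR score against a reference translation. Below is a minimal sketch with synthetic data and closed-form ridge regression; the feature names and the regressor are illustrative stand-ins, not the paper's exact QE pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
# Hypothetical per-segment features (e.g. source length, pause counts,
# filler-word ratio); the paper adds interpretation-specific features on
# top of standard MT quality-estimation features.
X = rng.normal(size=(n, d))
true_w = np.array([0.2, -0.1, 0.05, 0.0, 0.3])
y = X @ true_w + rng.normal(scale=0.1, size=n)   # stand-in METEOR targets

lam = 1.0  # ridge regularizer, standing in for the learned QE regressor
w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
print("predicted score for a new segment:", float(X[0] @ w))
```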

Workshop

The Second Workshop on Neural Machine Translation and Generation

Neural sequence-to-sequence models are now a workhorse behind a wide variety of natural language processing tasks such as machine translation, generation, summarization, and simplification. This workshop aims to provide a forum for research on applications of neural models to machine translation and other language generation tasks (including summarization, NLG from structured data, and dialog response generation, among others). The workshop features a number of papers on hot topics in neural machine translation, including incorporating linguistic structure, domain adaptation, data augmentation, handling inadequate resources, and analysis of models. It also features a shared task on efficient neural machine translation (NMT), in which participants were tasked with creating NMT systems that are both accurate and efficient.