Unsupervised neural learning of contextual representations has been a central focus in recent machine learning research, with great success in natural language processing, question answering, sentiment analysis, information retrieval, and more. ELMo, BERT, and Transformer-XL are well-known methods in this area, with complementary strengths and limitations. We combine the strengths of those methods and address their limitations with a novel approach, namely XLNet, which outperforms BERT on benchmark datasets for 20 challenging tasks, yielding new state-of-the-art results at the time (late 2019). This talk presents the key ideas and our main findings with XLNet in comparison with other representative approaches.
Yiming Yang is a professor at Carnegie Mellon University with a joint appointment in the Language Technologies Institute and the Machine Learning Department. She received her Ph.D. in Computer Science from Kyoto University (Japan), and has been a faculty member at CMU since 1996. Her research has centered on machine learning, scalable algorithms, and a broad range of applications, including extreme-scale text categorization, semi-supervised learning over graphs, time series forecasting and anomaly detection, cross-language and cross-domain transfer learning, analogical multi-relational embedding, deep neural learning over social and epidemiological networks, semi-supervised clustering, and optimization for online advertising.
The LTI Colloquium is generously sponsored by Abridge.
Zoom participation. See announcement.