Hints of linguistic structure in neural models of language and translation
The advent of efficiently trainable neural networks has led to striking improvements in the accuracy of next-word prediction, machine translation, and many other NLP tasks. It has also produced models that are much less interpretable. In particular, the role played by linguistic structure in sequence prediction and sequence-to-sequence models remains hard to gauge. What makes recurrent neural networks work so well for next-word prediction? Do neural translation models learn to extract linguistic features from raw data and exploit them in any explicable way? In this talk I will give an overview of recent work, including my own, that aims at answering these questions. I will also present ongoing experiments on the importance of recurrence for capturing hierarchical structure with sequential models. Answering these questions is not only interesting per se, but is also important for establishing whether injecting linguistic knowledge (e.g. via supervised annotation) into neural models is a promising research direction, and for understanding how close we are to building intelligent systems that can truly understand and process human language.
Arianna Bisazza is an Assistant Professor in natural language processing at Leiden University, Netherlands. Her research focuses on the statistical modeling of natural language, with the main goal of improving the quality of machine translation for challenging language pairs. She previously worked as a postdoc at the University of Amsterdam and as a research assistant at Fondazione Bruno Kessler. She obtained her PhD from the University of Trento in 2013 and was awarded a Veni grant (NWO starting grant) in 2016.
23 February 2018 - Arianna Bisazza: Seminar
Venue: Informatics Forum 4.31/4.33