25 May 2018 - Wilker Ferreira Aziz: Seminar
Probabilistic modelling for NLP powered by deep learning
Deep generative models (DGMs) are probabilistic models parametrised by neural networks (NNs). DGMs combine the power of NNs with the generality of the probabilistic learning framework, allowing a modeller to be more explicit about her statistical assumptions. To unlock this power, however, one must consider efficient ways to approach probabilistic inference. Amortised variational inference (Kingma and Welling, 2013; Mnih and Gregor, 2014) is a black-box technique in which we see inference as a reverse modelling problem (from data to latent space) and parametrise approximate posteriors with NNs. Parameter estimation works by back-propagation through stochastic computation graphs, which can be made efficient in circumstances where a certain reparametrisation of latent variables is available.
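As a minimal illustration of the reparametrisation idea (the toy objective and function names below are illustrative, not from the talk): writing a Gaussian sample as z = mu + sigma * eps with parameter-free noise eps lets gradients flow through the sample, so a Monte Carlo average of pathwise derivatives estimates the true gradient.

```python
import random

# Illustrative sketch (not the speaker's code): estimate the gradient of
# E_{z ~ N(mu, sigma^2)}[z^2] with respect to mu via reparametrisation.
# With z = mu + sigma * eps, eps ~ N(0, 1), the chain rule gives
# d(z^2)/dmu = 2 * z, and the analytic answer is 2 * mu.

def reparam_grad_estimate(mu, sigma, n_samples=100_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise drawn independently of the parameters
        z = mu + sigma * eps        # sample is now differentiable in mu and sigma
        total += 2.0 * z            # pathwise derivative d(z^2)/dmu
    return total / n_samples

estimate = reparam_grad_estimate(mu=1.5, sigma=1.0)
# The estimate converges to the analytic gradient 2 * mu = 3.0.
```

The same pattern is what makes back-propagation through stochastic computation graphs tractable in frameworks with automatic differentiation.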
I will start this talk by presenting amortised VI in order to set a common background for the rest of the talk. I will then present a number of DGMs I have developed with my collaborators at UvA to improve neural network models for natural language problems such as word representation and machine translation. For machine translation in particular, I will talk about making every component of an encoder-decoder architecture stochastic (encoder, attention mechanism, and decoder) and how that helps, for example, in low-resource scenarios.