12 October 2018 - James Henderson: Seminar
Learning Vector Representations of Abstraction with Entailment-Based Distributional Semantics
Representation learning for natural language has made great progress using the rich notion of similarity provided by a vector space. In particular, models of the distribution of contexts where a word occurs (distributional semantics) have learned vector representations of words (word embeddings) which capture a widely useful notion of semantic similarity. But for many tasks we want abstraction, not similarity. This talk presents distributional semantic models using a vector space for abstraction instead of similarity. These entailment vectors represent how much is known in each dimension, thereby representing information inclusion between vectors, known as entailment or abstraction. A variational approximation leads to operators for measuring entailment between vectors and methods for inferring vectors from entailment relations. These are used to define an entailment-based model of the semantic relationship between a word and its context, which forms the basis of distributional semantic models for learning entailment-based representations of words. These representations give state-of-the-art results on unsupervised lexical entailment (hyponymy) detection. We argue that the entailment vectors framework has wider applicability both in natural language semantics and deep learning architectures.
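The core idea, vectors whose dimensions encode how much is known, with entailment as information inclusion, can be illustrated with a toy score. This is a hypothetical sketch, not the talk's actual operator: each vector holds per-dimension probabilities in [0, 1] that a feature is "known", and "y entails x" is scored as the log-probability that no feature is known in x but unknown in y.

```python
import math

def entails(y, x):
    """Illustrative soft-inclusion score for "y entails x" (a hypothetical
    toy, not the operator from the talk).  Each dimension i fails the
    inclusion test with probability x[i] * (1 - y[i]): the feature is
    known in x but not in y.  The score sums the log-probabilities that
    every dimension passes, so less-abstract vectors entail more-abstract
    ones."""
    return sum(math.log(1.0 - xi * (1.0 - yi)) for yi, xi in zip(y, x))

# A specific term ("dog"-like) has many features known; an abstract term
# ("animal"-like) has fewer.  The specific vector should entail the
# abstract one more strongly than the reverse.
dog    = [0.9, 0.9, 0.8]   # specific: most features known
animal = [0.9, 0.1, 0.1]   # abstract: few features known
```

Under this toy score, `entails(dog, animal)` exceeds `entails(animal, dog)`, matching the intuition that hyponyms entail their hypernyms.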
James Henderson has been head of the Natural Language Understanding group at the Idiap Research Institute since joining Idiap in September 2017. He previously worked at the University of Geneva, XRCE (now Naver Labs Europe), the University of Edinburgh, and the University of Exeter, and received his PhD from the University of Pennsylvania. He is an action editor for TACL and was previously on the editorial board of CL.