13 September 2019 - Sharid Loáiciga: Seminar
Title: Understanding non-nominal anaphora
Pronominal reference is hard because it requires deep language understanding. The problem of non-nominal or event anaphora is particularly hard, and resources annotated with this phenomenon are scarce. In this talk, I will present this topic from two points of view: machine translation and psycholinguistics, the first as a method to tackle the lack of resources and the second to improve our understanding of non-nominal anaphora. In machine translation, systems often struggle with the functional ambiguity of pronouns. Some pronouns have the same surface form but different functions, and each of these functions has different translations depending on the languages in question. I have addressed this problem in the form of a prediction task over three functions of the English pronoun 'it': nominal anaphoric (e.g., 'The party ended late. It was fun.'), non-nominal or event reference (e.g., 'He can't speak Finnish. It annoys me.') and pleonastic (e.g., 'It's been raining all day.'). On this topic, I will present results based on gold-standard data and self-training experiments using silver-standard data. I will also present recent experiments on exploiting large parallel data as an unsupervised signal for detecting the different functions of 'it'. It turns out that distinguishing between the nominal anaphoric and event reference readings is the hardest part of the three-way classification task mentioned above. This raises the question of when and to what degree event instances serve as antecedents when a competing entity is also available. To answer this question, I will present a series of ongoing multilingual psycholinguistic studies using Amazon Mechanical Turk in which participants are presented with a context sentence and prompted with either 'it' or 'this' (e.g., 'The cake for the guests cooked poorly. This/It ___').
The objective is to establish a baseline rate at which comprehenders assign the 'entity' or 'event' label to a set of pronouns, and then to measure whether the same rate is found in naturally occurring contexts where we know whether an event or an entity is referred to.
Sharid Loáiciga is a postdoctoral researcher at the Centre for Linguistic Theory and Studies in Probability (CLASP), University of Gothenburg. Her research focuses on discourse and machine translation in general and on pronominal reference in particular. On this subject she has done work concerning the annotation and interpretation of different referential expressions from a multilingual point of view. She received her PhD from the University of Geneva in 2017.