8 June 2017 - Tal Linzen: Seminar
Structure-sensitive dependency learning in recurrent neural networks
Neural networks have recently become ubiquitous in natural language processing systems. Yet we typically have little understanding of the cognitive capabilities of these networks beyond their overall accuracy in an applied task. The present work investigates the ability of recurrent neural networks (RNNs), which are not equipped with explicit syntactic representations, to learn structure-sensitive dependencies from a natural corpus; we use English subject-verb number agreement as our test case.
We examine the success of the RNNs (in particular LSTMs) in predicting whether an upcoming English verb should be plural or singular. We focus on specific sentence types that are indicative of the network's syntactic abilities; our tests use both naturally occurring sentences and constructed sentences from the experimental psycholinguistics literature. We analyze the internal representations of the network to explore the sources of its ability (or inability) to approximate sentence structure. Finally, we compare the errors made by the RNNs to agreement attraction errors made by humans.
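As a rough illustration of the number-prediction task described above, the following Python sketch turns a tagged sentence into a single training example: the words preceding the verb, the number of the verb to be predicted, and a count of intervening nouns whose number differs from the subject's (the configuration associated with agreement attraction). This is not the speaker's code. The function name `make_agreement_example`, the use of Penn Treebank-style tags, and the assumption that the subject and verb positions are already known are all choices made for illustration only.

```python
from typing import List, Tuple

# Penn Treebank-style tags (an assumption made for this sketch).
NOUN_NUMBER = {"NN": "singular", "NNS": "plural"}
VERB_NUMBER = {"VBZ": "singular", "VBP": "plural"}


def make_agreement_example(
    tagged: List[Tuple[str, str]], subject_idx: int, verb_idx: int
) -> Tuple[List[str], str, int]:
    """Build one number-prediction example from a tagged sentence.

    Returns the words before the verb (the model's input), the number
    of the verb (the label to predict), and the count of nouns between
    subject and verb whose number differs from the subject's.
    """
    if not 0 <= subject_idx < verb_idx < len(tagged):
        raise ValueError("expected 0 <= subject_idx < verb_idx < len(tagged)")

    subject_number = NOUN_NUMBER[tagged[subject_idx][1]]
    label = VERB_NUMBER[tagged[verb_idx][1]]

    # The model sees only the words up to, but not including, the verb.
    prefix = [word for word, _ in tagged[:verb_idx]]

    # Count intervening nouns that disagree in number with the subject.
    attractors = sum(
        1
        for _, tag in tagged[subject_idx + 1 : verb_idx]
        if tag in NOUN_NUMBER and NOUN_NUMBER[tag] != subject_number
    )
    return prefix, label, attractors


sentence = [
    ("The", "DT"), ("keys", "NNS"), ("to", "TO"), ("the", "DT"),
    ("cabinet", "NN"), ("are", "VBP"), ("on", "IN"), ("the", "DT"),
    ("table", "NN"),
]
prefix, label, attractors = make_agreement_example(sentence, subject_idx=1, verb_idx=5)
print(prefix, label, attractors)
# ['The', 'keys', 'to', 'the', 'cabinet'] plural 1
```

In this example the singular noun "cabinet" sits between the plural subject "keys" and the verb, so a model relying on the most recent noun rather than on sentence structure would wrongly predict a singular verb.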
RNNs were able to approximate certain aspects of syntactic structure very well, but only in common sentence types and only when trained specifically to predict the number of a verb (as opposed to a standard language modeling objective). In complex sentences their performance degraded substantially; they made many more errors than human participants. These results suggest that stronger inductive biases are likely to be necessary to eliminate errors altogether; we begin to investigate to what extent these biases can arise from multi-task learning. More broadly, our work suggests that methods from linguistics and psycholinguistics may help us understand the abilities and limitations of "black-box" neural network models.
Tal Linzen is currently a postdoctoral researcher at the École Normale Supérieure in Paris, affiliated with the Laboratoire de Sciences Cognitives et Psycholinguistique and Institut Jean Nicod; he will join Johns Hopkins University as an Assistant Professor of Cognitive Science in July 2017. Tal obtained his PhD from New York University in 2015 under the supervision of Alec Marantz. His interests are in developing and testing cognitive models of human language; particular problems he has worked on are probabilistic prediction in language comprehension, generalization in language learning and the linguistic capacities of artificial neural networks.