On the syntactic abilities of recurrent neural networks
Tal Linzen - Johns Hopkins University
2130 HJ Patterson
Recent technological advances have made it possible to train recurrent neural networks (RNNs) on a much larger scale than before. These networks have proved effective in applications such as machine translation and speech recognition. These engineering advances are surprising from a cognitive point of view: RNNs do not have the kind of explicit structural representations that are typically thought to be necessary for syntactic processing. In this talk, I will discuss studies that go beyond standard engineering benchmarks and examine the syntactic capabilities of contemporary RNNs using established cognitive and linguistic diagnostics. These studies show that RNNs are able to compute agreement relations with considerable success across languages, although their error rate increases in complex sentences. A comparison of the detailed pattern of agreement errors made by RNNs with the pattern made by humans in a behavioral experiment reveals some similarities (attraction errors, number asymmetry) but also some differences (relative clause modifiers increase the probability of attraction errors in RNNs but decrease it in humans). Overall, RNNs can learn to exhibit sophisticated syntactic behavior despite the lack of an explicit hierarchical bias, but their behavior differs from that of humans in important ways.