Systematic Generalization in Connectionist Models
Dean's Office - Faculty of Informatics
Date: 18 September 2023 / 16:30 - 19:00
USI East Campus, Room D0.03, and online
You are cordially invited to attend the PhD Dissertation Defence of Robert Csordas on Monday 18 September 2023 at 16:30 in room D0.03 (USI East Campus) and online.
In recent years, neural networks (NNs) have revolutionized computer science, solving many problems out of reach of classical methods. Thanks to their flexibility, they can process raw data, such as images, audio, or text, and defeat humans in games. However, a critical challenge remains: they often fail on test data that follow the same underlying rules as the training data but differ superficially, for example through longer inputs or unseen word combinations. Generalization to such structurally related data is called systematic generalization. Analysis suggests that NNs often learn a smart interpolation between their training data points and rarely learn a generally applicable rule-based solution. This limits both their applicability and their trustworthiness, which makes systematic generalization critically important.

This work consists of multiple parts. First, we improve the performance of differentiable neural computers on algorithmic and reasoning tasks. Then we analyze the implicit modularity of neural networks and show that it does not support compositionality. Motivated by compositionality, we introduce architectural changes to transformers, significantly boosting generalization on multiple well-known datasets. Pushing this idea further, we introduce the purely connectionist Neural Data Router (NDR) architecture, which can generalize to longer inputs on algorithmic tasks. Then we move our focus to systematicity and propose a new dataset to analyze the behavior of the model. Finally, we focus on scaling up NDRs to real-world tasks and on improving Mixture of Experts models, matching the performance of parameter-equivalent dense baselines. We hope that the high-level ideas outlined in this thesis can provide guidance for further research aiming to achieve compositional generalization.
Dissertation Committee:
- Prof. Jürgen Schmidhuber, Università della Svizzera italiana, Switzerland (Research Advisor)
- Prof. Cesare Alippi, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Rolf Krause, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Jacob Andreas, MIT, USA (External Member)
- Prof. Dzmitry Bahdanau, McGill University, Canada (External Member)
- Prof. Marco Baroni, Universitat Pompeu Fabra, Spain (External Member)