Fast Weight Programmers for Greater Systematic Generalisation in Language

Staff - Faculty of Informatics

Date: 22 May 2023 / 16:00 - 18:00

USI Campus Est, room C1.05, Sector C

You are cordially invited to attend the PhD Dissertation Defence of Imanol Schlag on Monday 22 May 2023 at 16:00 in room C1.05.

Over the past decade, deep neural networks have achieved impressive results across domains including computer vision, natural language processing, and game playing. However, there is an ongoing debate about whether connectionist models can serve as a substrate for general AI, given the lack of systematicity that persists in modern deep learning models. To address this challenge, we propose the use of Fast Weight Programmers (FWPs) to enable structured representations and adaptive, context-specific computations. An FWP is a two-network system introduced in the early '90s, in which a slow network with regular weights continuously updates the fast weights of a fast network. The fast weights therefore depend on the context of the current input data, so the fast network's computation adapts to that context. In this work, we present novel neural architectures that build upon existing FWPs and contemporary neural networks to improve their systematicity. We also establish the formal equivalence of FWPs and linear Transformers, a variant of the Transformer architecture that linearises the attention mechanism for improved scalability. We demonstrate that modern FWP models can facilitate more structured representations and adaptive, context-specific computation, leading to significant improvements in tasks such as question answering, machine translation, and natural language modelling.
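The equivalence mentioned in the abstract can be illustrated with a small numerical sketch. It is not taken from the dissertation. It assumes the simplest setting: causal linear attention without a feature map or normalisation, and a fast weight matrix updated by a purely additive outer product. The function names and the random keys, values, and queries are hypothetical; in an actual FWP the slow network would generate these vectors from the input.

```python
import numpy as np


def linear_attention(keys, values, queries):
    """Causal, unnormalised linear attention: y_t = sum_{i<=t} v_i * (k_i . q_t)."""
    out = np.zeros_like(values)
    for t in range(keys.shape[0]):
        scores = keys[: t + 1] @ queries[t]   # dot products k_i . q_t, shape (t+1,)
        out[t] = scores @ values[: t + 1]     # weighted sum of the values seen so far
    return out


def fast_weight_programmer(keys, values, queries):
    """Fast weight view: W_t = W_{t-1} + v_t k_t^T, then y_t = W_t q_t."""
    d_k, d_v = keys.shape[1], values.shape[1]
    W = np.zeros((d_v, d_k))                  # fast weights start empty
    out = np.zeros_like(values)
    for t in range(keys.shape[0]):
        W = W + np.outer(values[t], keys[t])  # additive outer-product write
        out[t] = W @ queries[t]               # read out with the current query
    return out


# Illustrative data only (assumption): random stand-ins for slow-network outputs.
rng = np.random.default_rng(0)
T, d_k, d_v = 6, 4, 3
K = rng.normal(size=(T, d_k))
V = rng.normal(size=(T, d_v))
Q = rng.normal(size=(T, d_k))

same = np.allclose(linear_attention(K, V, Q), fast_weight_programmer(K, V, Q))
print("outputs match:", same)
```

Both functions compute the same outputs because W_t q_t expands to the sum over i ≤ t of v_i (k_i · q_t). The fast weight form keeps a fixed-size matrix instead of revisiting all past keys and values at every step.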

Dissertation Committee:
- Prof. Jürgen Schmidhuber, Università della Svizzera italiana, Switzerland (Research Advisor)
- Prof. Cesare Alippi, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Rolf Krause, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Jimmy Ba, University of Toronto, Canada (External Member)
- Prof. Sepp Hochreiter, Johannes Kepler Universität Linz, Austria (External Member)