Towards Unsupervised Multi-Object Perception in Neural Networks

Staff - Faculty of Informatics

Date: 8 August 2022 / 16:30 - 18:30

You are cordially invited to attend the PhD Dissertation Defence of Klaus Greff on Monday 8 August 2022 at 16:30. You can join online at this link.

By decomposing the world in terms of objects, humans are able to recombine their existing knowledge in a virtually unbounded number of ways to understand unfamiliar situations, make novel inferences, or generate new behavior. This ability to form meaningful entities from unstructured sensory information is of central importance for our impressive ability to generalize far beyond our direct experience. Contemporary neural networks still fall short of human-level generalization, which we argue is due to their inability to dynamically and flexibly bind information that is distributed throughout the network. This binding problem affects their capacity to acquire a compositional understanding of the world in terms of symbol-like entities (such as objects), which is crucial for generalizing in predictable and systematic ways. We focus in particular on the process of perceptually grouping raw sensory inputs into meaningful objects. Importantly, we aim to enable neural networks to learn about objects in an unsupervised fashion, because the required scope and flexibility render adequate supervision or engineering infeasible. To that end, we propose a functional definition of objects in terms of predictive modularity, and use it to derive a formalization of perceptual grouping as a particular form of clustering. We demonstrate the feasibility of this approach by developing several neural network models that learn to segment and represent meaningful objects without supervision. Using simple synthetic datasets, we show that these representations are useful for prediction and semi-supervised classification tasks, and that they facilitate certain kinds of systematic generalization. The resulting representations are also more interpretable than non-object-centric representations.
We believe that a compositional approach to AI, in terms of grounded symbol-like representations, is of fundamental importance for realizing human-level generalization, and we hope that this thesis may contribute towards that goal.

Dissertation Committee:
- Prof. Jürgen Schmidhuber, Università della Svizzera italiana, Switzerland (Research Advisor)
- Prof. Cesare Alippi, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Rolf Krause, Università della Svizzera italiana, Switzerland (Internal Member)
- Prof. Michael C. Mozer, University of Colorado, Boulder, USA (External Member)
- Prof. Wolf Singer, Ernst Strüngmann Institute (ESI) for Neuroscience, Germany (External Member)