Self-Supervised Neural Topic Modeling

Dean's Office - Faculty of Informatics

Date: 2 June 2023 / 11:30 - 12:30

USI East Campus, Room D0.02

Speaker: Dr. Ali Bahrainian, University of Tübingen

Topic models are useful tools for analyzing and interpreting the main underlying themes of large text corpora. Most topic models rely on word co-occurrence to compute a topic, i.e., a weighted set of words that together represent a high-level semantic concept. In this presentation, we discuss a new lightweight Self-Supervised Neural Topic Model (SNTM) that captures rich context by learning a topic representation jointly from three co-occurring words and the document that the triple originates from. Our experimental results indicate that SNTM outperforms existing topic models on coherence metrics as well as document clustering accuracy. Beyond topic coherence and clustering performance, the proposed model is also computationally efficient and easy to train.

Biography: Ali Bahrainian is a senior researcher at the University of Tübingen, Germany. He received a PhD from the University of Lugano in 2019 and held postdoctoral positions at EPFL, Switzerland, and Brown University, USA. His research focuses mainly on Natural Language Processing, and generative models in particular. This includes sequence-to-sequence text generation models as well as topic modeling. More specifically, his current research focuses on integrating common-sense knowledge and semantic concepts into text generation models and introducing controllability mechanisms to satisfy pre-defined requirements. He has served as a PC member at various international conferences such as ACL, SIGIR, and IJCAI.

Host: Prof. Fabio Crestani