Language modeling beyond language modeling

Mariya Toneva (MPI SWS)

Live Stream

Abstract

Language models trained to predict the next word over billions of text documents have also been shown to significantly predict brain recordings of people comprehending language. Understanding the reasons behind the observed similarities between language in machines and language in the brain can lead to more insight into both systems. Additionally, the human language system integrates information from multiple sensory modalities, which puts text-only language models at a fundamental disadvantage as cognitive models. In this talk, we will discuss a series of recent works that make progress on these questions along different dimensions. The unifying principle among these works, which allows us to make scientific claims about why one black box (a language model) aligns with another black box (the human brain), is our ability to make specific perturbations in the language model and observe their effect on its alignment with the brain. Building on this approach, these works reveal that the observed alignment is due to more than next-word prediction and word-level semantics and is partially related to joint processing of select linguistic information in both systems. Furthermore, we find that brain alignment can be improved by training a language model to summarize narratives and to incorporate auditory and visual information about a scene. Taken together, these works make progress towards determining the necessary and sufficient conditions under which language in machines aligns with language in the brain.

Bio: Mariya Toneva is a tenure-track faculty member at the Max Planck Institute for Software Systems, where she leads the BrAIN (Bridging AI and Neuroscience) group. Her research is at the intersection of Machine Learning, Natural Language Processing, and Neuroscience, with a focus on building computational models of language processing in the brain that can also improve natural language processing systems. Prior to MPI-SWS, she was a C.V. Starr Fellow at the Princeton Neuroscience Institute. She obtained her Ph.D. from Carnegie Mellon University in a joint program between Machine Learning and Neural Computation.

Math Machine Learning seminar MPI MIS + UCLA

MPI for Mathematics in the Sciences Live Stream

Upcoming Events of this Seminar

Thursday, 14.08.25: Jonathan Siegel, title to be announced
Thursday, 02.10.25: Marcello Carioni, title to be announced