Measuring concept interpretations with seeded topic models
- Marc Keuschnigg (Leipzig University)
Abstract
Social scientists are discussing the need for more formal ways to extract meaning from digital text. We focus attention on an old-school model, the seeded topic model, that allows domain knowledge to be infused into the computational learning of meaning structures. Seed words help crystallize topics around known concepts, while retaining topic models’ ability to identify associations in text based on word co-occurrences. We implement a scalable version of that model to estimate a concept’s shared interpretation (or framing) via its associations with other frequently co-occurring topics. In a case study, we extract longitudinal measures of the understanding of immigration from a corpus of millions of Swedish newspaper articles from the period 1945–2019. We infer turning points that partition the immigration discourse into meaningful eras and locate Sweden’s era of multicultural ideals, which established the country’s reputation for tolerance.