Abstract for the talk on 06.12.2022 (15:00 h)

Group Seminar

Riccardo Carlucci (MPI for Dynamics and Self-Organization, Göttingen)
Modelling discussion dynamics across Reddit communities

With more than 10 million monthly contributing users, Reddit is one of the largest and most influential social media platforms in the world. Understanding its dynamics at different scales is an important research challenge, both in its own right and in relation to, for example, the study of political polarization. Reddit has grown steadily over the last few years in both content and user base: half of all threads and comments ever created date from the last two years, even though Reddit is over 15 years old. I will therefore start with a general overview of Reddit statistics, updating some figures from an earlier review [1].

Next, I will present results on modelling discussions across various subreddits. The number of comments per thread, Nc, is known to follow approximately a power-law distribution across the whole of Reddit [1]. We analysed the 500 largest subreddits in the years 2019-2022 using the Pushshift dataset [2] and found that the distribution of Nc varies considerably across individual communities. In most subreddits, Nc follows approximately a power-law distribution with an upper cut-off; both the width and the exponent of the power-law window depend on the particular community. In other subreddits, however, a power-law fit appears inappropriate.
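
As a rough illustration of what estimating a power-law exponent for Nc can look like, the sketch below uses the standard continuous-approximation maximum-likelihood (Hill) estimator for a pure power-law tail above a lower bound. It is not the fitting procedure used in the talk, and it ignores the upper cut-off. The names hill_exponent, sample_power_law and n_min are hypothetical, and the data are synthetic.

```python
import math
import random


def hill_exponent(counts, n_min=1.0):
    """Continuous-approximation MLE of alpha for p(n) ~ n^(-alpha), n >= n_min.

    Assumption: a pure power-law tail with no upper cut-off.
    """
    tail = [n for n in counts if n >= n_min]
    if not tail:
        raise ValueError("no observations at or above n_min")
    log_sum = sum(math.log(n / n_min) for n in tail)
    if log_sum <= 0.0:
        raise ValueError("all observations equal n_min; exponent undefined")
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, n_min, size, rng):
    """Draw synthetic continuous power-law samples by inverse-transform sampling."""
    # rng.random() lies in [0, 1), so (1 - u) lies in (0, 1] and the power is finite.
    return [
        n_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(size)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    synthetic = sample_power_law(alpha=2.5, n_min=1.0, size=20000, rng=rng)
    print("estimated exponent:", hill_exponent(synthetic, n_min=1.0))
```

For real comment counts, which are discrete and truncated from above, a discrete or truncated likelihood would be more appropriate; the sketch only conveys the basic idea of fitting within a window that starts at n_min.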

To explain this variability, we developed a preferential attachment model in which the ability of a thread to attract comments depends on its age and on an intrinsic fitness. I will conclude by discussing the phenomenology of the model, its limitations, and some possible extensions.
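
To make the idea of preferential attachment with age and fitness concrete, here is a minimal toy simulation. It is not the model presented in the talk: the exponential ageing kernel, the uniform fitness distribution, the weight fitness * (comments + 1) * exp(-age / tau), and the parameters p_new and tau are all assumptions chosen for illustration.

```python
import math
import random


def simulate_threads(n_steps, p_new=0.1, tau=50.0, seed=0):
    """Toy preferential attachment with ageing and fitness (illustrative only).

    At each step either a new thread is created (probability p_new), or one
    comment is added to an existing thread chosen with probability
    proportional to fitness * (comments + 1) * exp(-age / tau).
    Returns the list of comment counts, one entry per thread.
    """
    rng = random.Random(seed)
    birth = [0]                          # creation time of each thread
    fitness = [rng.uniform(0.1, 1.0)]    # assumed intrinsic fitness, fixed at birth
    comments = [0]                       # comment count per thread

    for t in range(1, n_steps + 1):
        if rng.random() < p_new:
            birth.append(t)
            fitness.append(rng.uniform(0.1, 1.0))
            comments.append(0)
            continue
        weights = [
            f * (c + 1) * math.exp(-(t - b) / tau)
            for f, c, b in zip(fitness, comments, birth)
        ]
        if sum(weights) <= 0.0:
            # All threads too old to attract comments (numerical underflow).
            continue
        chosen = rng.choices(range(len(weights)), weights=weights)[0]
        comments[chosen] += 1
    return comments


if __name__ == "__main__":
    counts = simulate_threads(n_steps=2000)
    print("threads:", len(counts), "largest thread:", max(counts))
```

In this toy version, the ageing time scale tau limits how long a thread keeps attracting comments, which is one simple way an upper cut-off in the distribution of Nc could arise, while the fitness term lets equally old threads grow at different rates.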

[1] - Medvedev, A.N. et al. (2019) doi.org/10.1007/978-3-030-14683-2_9

[2] - Baumgartner, J. et al. (2020) doi.org/10.1609/icwsm.v14i1.7347


08.12.2022, 00:10