Workshop

Evolutionary Linguistics in Scientific Articles: A Study of Linguistic Development and Optimization in Scholarly Discourse

  • Stefania Degaetano-Ortlieb (Saarland University, Germany)
E1 05 (Leibniz-Saal)

Abstract

Our research explores the evolution of scientific writing since the inception of the first scientific journal, "Philosophical Transactions of the Royal Society," established in 1665 under Henry Oldenburg's visionary editorship. Although the journal's primary function—to disseminate new scientific discoveries—has remained constant, the language and style of scientific communication have undergone significant changes over the centuries. Our study is anchored in the hypothesis that the linguistic development of scientific writing is driven by the dual processes of specialization and conventionalization, aimed at enhancing the efficiency of communication.

To investigate these socio-linguistic changes, we apply methods developed at our Collaborative Research Center, rooted in information-theoretic measures. This approach facilitates the study of language variation on a large scale, enabling a comprehensive understanding of the evolution of scientific discourse. Our methodology includes two pivotal components:

Data-Driven Feature Selection by Relative Entropy: This aspect of our approach allows for the identification of distinctive features of linguistic variation. Using the principle of relative entropy, as outlined by Kullback & Leibler (1951) and Fankhauser et al. (2014), we can compare entire linguistic levels across different dimensions—such as time, situational context, and social variables—without the need for pre-selecting specific features. This data-driven selection process reveals the unique characteristics and changes in scientific language over time.
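As a rough illustration of how relative entropy can surface distinctive features, the sketch below computes the Kullback-Leibler divergence between the word distributions of two token lists and reports each word's contribution to the total. It is a minimal sketch, not the pipeline used in the study: the unigram model, the add-alpha smoothing, the function name `relative_entropy`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def relative_entropy(tokens_a, tokens_b, alpha=1.0):
    """Return (D(A||B) in bits, per-word contributions).

    Unigram distributions with add-alpha smoothing over the joint
    vocabulary (an illustrative assumption, not the study's setup).
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)

    contributions = {}
    for word in vocab:
        p = (counts_a[word] + alpha) / total_a
        q = (counts_b[word] + alpha) / total_b
        # Pointwise contribution of this word to D(A||B).
        contributions[word] = p * math.log2(p / q)
    return sum(contributions.values()), contributions


# Toy data standing in for two time periods (hypothetical).
early = ["i", "i", "i", "the"]
late = ["the", "the", "the", "pressure"]

kld, contrib = relative_entropy(early, late)
ranked = sorted(contrib, key=contrib.get, reverse=True)
print(round(kld, 3), ranked[0])  # prints: 0.714 i
```

Words with the largest positive contributions are those most typical of the first sample relative to the second, so no feature has to be chosen in advance. Here the first-person pronoun ranks highest for the "early" sample.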

Context-Aware Analysis by Surprisal: Building on the concept of surprisal, which goes back to Shannon (1948), our approach models linguistic variation based on the amount of information a linguistic unit conveys given its preceding context. This method allows us to trace how certain words or phrases become more or less predictable over time within scientific discourse, reflecting the evolving norms and expectations of the scientific community.
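Surprisal is the negative log probability of a unit given its context, so predictable units carry few bits and unexpected ones carry many. The following sketch estimates it with a bigram model, where the context is only the single preceding word. This is a simplifying assumption, as are the add-alpha smoothing, the helper names `train_bigram` and `surprisal`, and the toy corpus. The abstract does not specify the context window or the model the study actually uses.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Collect bigram and context counts from a token list."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])
    vocab_size = len(set(tokens))
    return bigrams, contexts, vocab_size


def surprisal(word, prev, model, alpha=1.0):
    """Surprisal in bits: -log2 P(word | prev), add-alpha smoothed."""
    bigrams, contexts, vocab_size = model
    p = (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * vocab_size)
    return -math.log2(p)


# Toy corpus (hypothetical); real studies use large diachronic corpora.
corpus = "the gas expands the gas cools the liquid cools".split()
model = train_bigram(corpus)

print(round(surprisal("gas", "the", model), 3))     # prints: 1.415
print(round(surprisal("liquid", "the", model), 3))  # prints: 2.0
```

Because "gas" follows "the" more often than "liquid" does in the toy corpus, it has lower surprisal in that context. Training one such model per time period and comparing the values is one simple way to see a phrase becoming more conventional.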

With these two techniques, our research aims to shed light on the mechanisms that have shaped the language of scientific communication over the centuries. We seek to understand not just how scientific writing has changed, but also why these changes have occurred, revealing the underlying socio-linguistic forces at play in the evolution of this specialized form of communication.

Antje Vandenberg

Max Planck Institute for Mathematics in the Sciences

Guillermo Restrepo

Max Planck Institute for Mathematics in the Sciences