Poster session in the entrance hall of the institute
Authors: Martin Gerlach, Francesc Font-Clos, and Eduardo G. Altmann
In this work we investigate how the usage of language varies over time and between authors. We quantify the difference between vocabularies by looking at the statistics of word frequencies in written text and applying tools rooted in information theory (e.g. mutual information, Jensen-Shannon divergence). This not only yields a well-defined measure of how much information is shared between two different texts (or samples of text), but also allows the expected fluctuations to be calculated in order to assess the significance of these differences. We use these tools to compare individual books to a reference corpus with yearly resolution (Google n-gram database). We confirm that our results match the publication dates of the books well, and we develop measures that quantify how innovative the vocabulary of individual books and authors was.
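As an illustration of the kind of measure mentioned above, the following is a minimal sketch of the Jensen-Shannon divergence between the word-frequency distributions of two texts. It is not the authors' implementation: the function names, the whitespace tokenisation, and the equal weighting of the two texts are assumptions made for this example, and the estimation of expected fluctuations is not covered.

```python
import math
from collections import Counter


def word_frequencies(text):
    """Relative word frequencies of a text.

    Assumption: words are obtained by lower-casing and splitting on whitespace.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def shannon_entropy(dist):
    """Shannon entropy (in bits) of a distribution given as {word: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two word distributions.

    Assumption: both texts receive equal weight 1/2 in the mixture.
    """
    vocabulary = set(p) | set(q)
    mixture = {w: 0.5 * p.get(w, 0.0) + 0.5 * q.get(w, 0.0) for w in vocabulary}
    return shannon_entropy(mixture) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)


# Hypothetical toy texts, used only to show the calling convention.
book = word_frequencies("the whale swam past the ship")
reference = word_frequencies("the ship sailed past the harbour")
divergence = jensen_shannon_divergence(book, reference)
```

With base-2 logarithms and equal weights, the value lies between 0 (identical word distributions) and 1 bit (no shared words).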
Throughout their history, the language sciences have dealt with numerous phenomena that either are inherently complex, dynamic systems or display typical qualities of such systems. Within an individual, examples include perceptual dynamics and categorisation in speech, the emergence of phonological templates, and word and sentence processing; across society, they include variation and typology, the rise of new grammatical constructions, semantic bleaching, language evolution in general, and the spread and competition of both individual expressions and entire languages.
A representative handful of language phenomena will be presented that are known to exhibit properties such as hysteresis, phase transitions, bifurcations, attractor states, or power-law distributions. The multifaceted dynamism and complexity of the process of language acquisition will also be discussed, highlighting the importance of adopting designs with different timescales in order to trace language development as a process of change over time, the utility of time-series analyses, and the ability to determine optimal temporal integration windows, e.g. in analyses of dynamic motifs in human communication.
As has been pointed out, the quest to understand the causal origin of
linguistic structures is ultimately an investigation into the
diachronic processes that shape languages across time. Consequently,
new approaches to causal inference can be informed and strengthened by
considering insights on the nature of language change from the long
tradition of diachronic studies. In this poster I want to investigate
how the concept of arbitrariness, as well as the sporadic nature and
so-called 'actuation problem' of language change relate to causal
explanations of language. I will try to argue that these central
linguistic tenets call into question whether causal explanations of
language are possible, or even desirable. Rather than completely
discarding the role of causality in language, I hope to show that much
could be gained from studying the inverse (but related) problem, i.e.
from identifying and characterising the mechanisms which keep
languages from becoming completely determined by their environment. A
general understanding of the dynamics of linguistic changes would seem
to be an important prerequisite for making links between particular
changes and their underlying causes, particularly in terms of
identifying the exact nature of 'causal triggers' of change.