The statistics of non-linguistic symbol systems.

Richard Sproat (Google, Inc., USA)

Lecture hall, Max Planck Institute for Evolutionary Anthropology (Leipzig)

Abstract

For 5000 years humans have been using visible marks to encode spoken language. For a far longer period, they have been using visible marks to encode concepts, ideas or, in general, a variety of non-linguistic information. When faced with an ancient symbol system whose meaning is unknown, can one tell if it was linguistic (and therefore worth trying to decipher as a language), or some sort of non-linguistic system?

On the face of it, it seems reasonable to use statistical information on the behavior of symbols in the system as evidence. If the symbols are distributed in a way that is similar to the distribution of elements (phonemes, morphemes, words, etc.) in language, then this could serve as evidence that the system is writing. In causal terms, the fact that the system is writing causes it to show the statistical properties it has.

Recent work that has used this line of argumentation suffers from a variety of problems. First, while such work invariably makes the claim that the statistical measures used are evidence for structure, often the measures actually tell us little or nothing about structure. Second, even if the measures do relate to structure, do they specifically imply linguistic structure? A parse tree looks very similar to a tree that describes the structure of a mathematical formula, so structure per se hardly seems enough. This leads to a third problem: such work depends to some degree on a widespread misconception that non-linguistic systems are structureless. Finally, there is the question of whether sample sizes for such systems are ever large enough to make robust statistical claims.

In this talk I review the results of my own work on the statistics of non-linguistic symbol systems, and draw a mostly negative conclusion about the possibility of finding statistical measures that are useful in deciding whether a symbol system is linguistic.

Conference

13.04.15 to 15.04.15

Causality in the language sciences

Max Planck Institute for Evolutionary Anthropology, lecture hall

Antje Vandenberg

Max Planck Institute for Mathematics in the Sciences

Damián Blasi

Max Planck Institute for Mathematics in the Sciences and Max Planck Institute for Evolutionary Anthropology (Leipzig), Germany

Jürgen Jost

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Peter Stadler

Leipzig University, Germany

Russell Gray

Max Planck Institute for the Science of Human History (Jena), Germany

Bernard Comrie

Max Planck Institute for Evolutionary Anthropology (Leipzig), Germany

Stephen C. Levinson

Max Planck Institute for Psycholinguistics (Nijmegen), Netherlands

Nihat Ay

Max Planck Institute for Mathematics in the Sciences (Leipzig), Germany

Sean Roberts

Max Planck Institute for Psycholinguistics (Nijmegen), Netherlands

Leonardo Lancia

Max Planck Institute for Evolutionary Anthropology (Leipzig), Germany