Pressures for processing and communicative efficiency bias language development: evidence from big(ish) data and direct causal manipulation

Workshop: Causality in the language sciences (MPI MIS)

As an illustration of how my lab explores and tests causal links, I will present some of our research on functional biases in language change. Such biases can in turn explain typological patterns, including (hypothesized) linguistic universals. In tune with the motivation for this workshop, I’ll argue that advances in Big Data and experimental methods afford exciting new approaches to these questions, which have enjoyed continued prominence in the language sciences.

For this talk, I’ll focus on two specific pressures on language use. The first pressure relates to the fact that linguistic communication takes place in the presence of noise, so listeners need to infer the intended message from noisy input, which makes less probable messages harder to infer (e.g., Levy, 2008; Norris & McQueen, 2008; Bicknell & Levy, 2012; Gibson et al., 2013; Kleinschmidt & Jaeger, 2015). The second pressure relates to memory demands during language processing, where longer dependencies are associated with slower processing (Gibson, 1998, 2000; Lewis et al., 2006; Vasishth & Lewis, 2005). Both pressures are well known to affect language processing, as shown by evidence from both experimental data (e.g., McDonald & Shillcock, 2003; Grodner & Gibson, 2005) and broad-coverage corpus studies (e.g., Demberg & Keller, 2008; Boston et al., 2010; Smith & Levy, 2013). This means that an ideal speaker (in the sense of ideal observers) should a) support low-probability (i.e., high-information) messages with ‘better’ linguistic signals to the extent that this is warranted against the effort it implies (e.g., due to aiming for more precise articulation or due to articulating additional words; cf. Lindblom, 1990; Jaeger, 2006, 2013; Gibson et al., 2013) and b) aim for short dependencies (e.g., by reordering constituents; Hawkins, 2004, 2014).
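
To make the link between probability and information concrete, here is a minimal sketch that assumes the standard information-theoretic definition of surprisal, -log2(p). The function name and the example probabilities are hypothetical and are not taken from any of the cited studies.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Information content (surprisal) of a message in bits: -log2(p)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


# Hypothetical probabilities of two messages in a given context
# (illustrative values only, not data from the cited work).
likely_message = surprisal_bits(0.5)      # 1.0 bit
unlikely_message = surprisal_bits(0.125)  # 3.0 bits
```

Under this definition, the less probable message carries more information, which is the sense in which low-probability messages are "high information" and may warrant a more robust signal.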

Here, I ask whether the same pressures affect language learning, change, and/or the distribution of languages across the world. Case study 1 asks whether actual natural languages have syntactic properties that increase processing efficiency, as would be expected if processing efficiency biases language learning and/or change. Using data from five large syntactically annotated corpora, I show that natural languages have lower information density and shorter dependency lengths than expected by chance (Gildea & Jaeger, in prep; for dependency length, see also Gildea & Temperley, 2010). Previous work has found similar properties for phonological and lexical systems (e.g., Manin, 2006; Piantadosi et al., 2011, 2012; Wedel et al., 2013). The present work is the first to find that the same properties extend even to the syntactic system (which involves considerably more complex latent structure and has often been assumed to be encapsulated from functional pressures).
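
The logic of comparing observed dependency lengths against chance can be sketched as follows. This is a simplified illustration, not the method of the cited studies: it assumes that "chance" means uniformly random word orders, whereas corpus studies typically use more constrained baselines. The toy sentence, its dependency arcs, and all function names are hypothetical.

```python
import random
from typing import List, Tuple

# A dependency is a (head, dependent) pair of word identifiers.
Dependency = Tuple[int, int]


def total_dependency_length(order: List[int], deps: List[Dependency]) -> int:
    """Sum of linear distances between each head and its dependent."""
    position = {word: index for index, word in enumerate(order)}
    return sum(abs(position[head] - position[dep]) for head, dep in deps)


def random_order_baseline(
    n_words: int, deps: List[Dependency], samples: int = 1000, seed: int = 0
) -> float:
    """Mean total dependency length over uniformly random word orders.

    Assumption: unconstrained shuffling stands in for 'chance' here.
    """
    rng = random.Random(seed)
    words = list(range(n_words))
    total = 0
    for _ in range(samples):
        rng.shuffle(words)
        total += total_dependency_length(words, deps)
    return total / samples


# Toy sentence (hypothetical): "the dog chased the cat"
# word ids: 0=the, 1=dog, 2=chased, 3=the, 4=cat
toy_deps = [(1, 0), (2, 1), (4, 3), (2, 4)]
observed = total_dependency_length([0, 1, 2, 3, 4], toy_deps)
baseline = random_order_baseline(5, toy_deps)
```

An observed value below the baseline is the pattern that a pressure toward short dependencies would predict. A single toy sentence of course shows nothing about languages at large.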

Case studies 2 and 3 employ a miniature language learning approach to the same question. I show that learners of such languages restructure them in a way that improves the inferability of messages and shortens dependency lengths (Fedzechkina et al., 2011, 2013, under review; Fedzechkina & Jaeger, 2015). Unlike approaches that rely on statistical modeling of typological data, miniature language learning does not suffer from data sparsity and can, if applied correctly, assess causality by directly manipulating the relevant factors. Additionally, the approach I describe differs from almost all other miniature language learning experiments in that the observed learning biases were neither in the input nor in the native language of learners (see also Culbertson et al., 2012; Culbertson & Adger, 2014; though see Goldberg, 2013).

Time permitting, I will also present big data on the role of probabilistic transfer during L2 learning (Schepens & Jaeger, 2015). I close with some caveats about statistical work on these questions.

Some references to related work from my lab

Fedzechkina, M. and Jaeger, T. F. 2015. ‘Long before short’ preference in a head-final artificial language: In support of dependency minimization accounts. The 28th CUNY Sentence Processing Conference. USC, CA, March 19th-21st.
Fedzechkina, M., Jaeger, T. F., and Newport, E. 2013. Communicative biases shape structures of newly acquired languages. In Knauff, M., Pauen, M., Sebanz, N., and Wachsmuth, I. (eds.) Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci13), 430-435. Austin, TX: Cognitive Science Society.
Fedzechkina, M., Jaeger, T. F., and Newport, E. 2012. Language learners restructure their input to facilitate efficient communication. Proceedings of the National Academy of Sciences 109(44), 17897-17902. [doi:10.1073/pnas.1215776109]
Fedzechkina, M., Jaeger, T. F., and Newport, E. L. 2011. Functional Biases in Language Learning: Evidence from Word Order and Case-Marking Interaction. In Carlson, L., Hoelscher, C., and Shipley, T. F. (eds.) Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (CogSci11), 318-323. Austin, TX: Cognitive Science Society.
Schepens, J. and Jaeger, T. F. 2015. L2 phonological learning in adults: The role of language background, length of exposure, and age of acquisition. DGfS, Leipzig, Germany, March 4th-6th.
