Meaningful mistakes in language behaviour [...]

2 • Simulating cultural transmission with evolects

See the full set of slides from this talk.

I would like to briefly present how language structure is modelled in the Iterated Learning methodology by means of an ‘evolect’. I propose evolect as a term for an artificial miniature language used in such simulations, for two reasons. Firstly, simulation models like this have become a recognisable trend in language evolution research, so having a label for this kind of model comes in very handy, especially when presenting them to the general public. But, more importantly, using such a term in appropriate contexts will help us to maintain a clear distinction between a conceptual tool for modelling language and the real-world phenomenon under study.

To date, these two meanings of ‘language’ (artificial mini-language and the phenomenon of language itself) are often used interchangeably in the literature, which leaves room for confusion. I therefore consider evolect to be quite a convenient term for any mathematical construct designed for testing hypotheses about the cultural evolution of language.

Returning to the design of evolects, the next two slides (based on Fig. 5 from Smith, Kirby & Brighton, 2003:381) show how the amount of structure within the so-called ‘meaning space’ can be quantified. The simulated meanings are more or less dense clusters of categories (in computational models they can be arrays or vectors, for example), hence an anticipated order of some kind is already there. In fact, we can easily recognise implicitly imposed structural constraints in the very design of the meaning space. In the examples on the slides the meanings are constrained by a geometrical space of three dimensions, divided into a 5×5×5 grid of cells for subcategorisations.

For the purposes of this presentation I will consider the evolect designed for the lab experiment of Kirby, Cornish & Smith (2008), which is probably the most influential work in this line of research. The appeal of this experiment to scholars from many diverse disciplines (as well as to the general public) lies in its extreme simplicity. This simplicity is reflected in the very design of the meaning space, which was “maximally minimised”, so to speak: as far as the meanings are concerned, we have here a rather basic division into triplets (3×3×3), that is, 27 meanings in total. However, such a minimal meaning space turned out to be complex enough to illustrate the main idea.
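
To make the idea of a meaning space as a set of vectors more tangible, here is a minimal Python sketch. It assumes that every meaning can be written as a tuple of feature-value indices; the function name `meaning_space` and the index encoding are my own illustrative choices, not part of the original models.

```python
from itertools import product

def meaning_space(values_per_dimension, dimensions=3):
    """Return every meaning as a tuple of feature-value indices."""
    return list(product(range(values_per_dimension), repeat=dimensions))

small = meaning_space(3)  # a 3x3x3 space, as in the lab experiment
large = meaning_space(5)  # a 5x5x5 space, as in the slide examples

print(len(small), len(large))  # 27 125
print(small[:3])               # [(0, 0, 0), (0, 0, 1), (0, 0, 2)]
```

Even in this bare form the structure is built in: two meanings are "close" when they share values on some dimensions, which is exactly the kind of implicitly imposed order mentioned above.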

The psychological experiment, conducted in laboratory conditions, was basically a scaled-down version of some of the earlier computer simulations. The only difference was the design of the evolect, which had been specifically adapted to the cognitive constraints of human agents. Nevertheless, the experiment addressed in sufficient detail the ‘logical problem of language evolution’ (Chater & Christiansen, 2010) and revealed the crucial role of cumulative cultural transmission in the evolutionary dynamics which gave rise to the emergence of lectal structure.

The signal space of the evolect consisted of randomised strings of up to three open syllables which were (also randomly) mapped to the cartoon-like objects from the meaning space. A small repertoire of artificial words (i.e. SIGNAL—MEANING pairs coined from elements of these two sets) was presented on the computer screen as visual stimuli to be learned and recalled by the participant, a human agent.
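
A random initial evolect of this kind could be sketched as follows. The consonant and vowel inventories, the fixed seed and the function names are assumptions made for illustration only; the sketch just follows the description above (strings of one to three open, i.e. consonant-vowel, syllables, randomly assigned to meanings).

```python
import random
from itertools import product

CONSONANTS = "ptkmnls"  # assumed inventory, for illustration only
VOWELS = "aeiou"        # assumed inventory, for illustration only

def random_signal(rng, max_syllables=3):
    """Build a string of one to max_syllables open (CV) syllables."""
    n = rng.randint(1, max_syllables)
    return "".join(rng.choice(CONSONANTS) + rng.choice(VOWELS) for _ in range(n))

def initial_evolect(meanings, rng):
    """Assign an independently drawn random signal to every meaning."""
    return {meaning: random_signal(rng) for meaning in meanings}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
meanings = list(product(range(3), repeat=3))
evolect = initial_evolect(meanings, rng)

for meaning in meanings[:5]:
    print(meaning, evolect[meaning])
```

Because the signals are drawn independently, nothing in a word reflects the features of its meaning, which is the unstructured starting point of the transmission chain.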

In the course of the iterated learning procedure, the randomised structure of the initial evolect was passed along a transmission chain of several individuals, who represented succeeding generations of language users. Following the defining characteristic of this method, the learning output of one learner served as the input for the next learner. The transmission of the evolect to naïve learners (or “fresh brains” as Simon Kirby likes to call them) was supposed to model first language acquisition by children. In fact, however, the iterated regime of memorising fragments of the evolect seems rather to mirror the circumstances of second language learning.
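
The defining loop of the method (one learner's output is the next learner's input) is simple enough to write down. The sketch below is only a skeleton: `run_chain` and the toy `forgetful_learner`, which drops the last character of any word longer than two characters, are hypothetical stand-ins and make no claim about how human participants actually behave.

```python
def run_chain(initial_language, learn, generations):
    """Pass a language down a chain: each output becomes the next input."""
    history = [initial_language]
    for _ in range(generations):
        history.append(learn(history[-1]))
    return history

def forgetful_learner(language):
    """Toy learner (an assumption): truncates words longer than two characters."""
    return {m: (s[:-1] if len(s) > 2 else s) for m, s in language.items()}

history = run_chain({"A": "kalu", "B": "mipo"}, forgetful_learner, generations=3)
for generation, language in enumerate(history):
    print(generation, language)
# 0 {'A': 'kalu', 'B': 'mipo'}
# 1 {'A': 'kal', 'B': 'mip'}
# 2 {'A': 'ka', 'B': 'mi'}
# 3 {'A': 'ka', 'B': 'mi'}
```

The point of the skeleton is that small individual distortions accumulate over generations until the language settles into a form that the learner reproduces without change.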

I will come back to this in a moment, but now let me move to the main result of this experiment, which is basically the emergence of compositional structure in one of the conditions. The kind of combinatorial compositionality you see in the final evolect (on the right) emerges in about 8 to 10 iterations. As discussed in the paper by Kirby et al., it is mainly due to the already mentioned selective pressures which both conflict and complement each other. I prefer to call these adaptive pressures ‘combinatorial’ because of their essentially probabilistic, estimative character.

Firstly, the emergence and development of the particular organisation of the lectal structure can be seen as an adaptive response to the pressure for combinatorial learnability (i.e. increasing compression) of the whole evolect. I’m not sure whether Mr. Gabelenz meant exactly this with his Bequemlichkeitstrieb, but the human agents learning the alien language in this experiment do simplify their vocabulary. What is crucial is that they do it unintentionally. Since there is a bottleneck on the transmission and every agent learns only half of the SIGNAL—MEANING pairs, he/she discerns structural analogies in the input data set and imposes them on the remaining forms when tested on the whole set of meanings.

In this way, the subsequent participants memorise some of the reproductive blends made by their predecessors, and also insert their own mistakes into the evolect. If this had been a real second language learning situation, then at least some of the analogical, paradigmatic blends would have been treated as errors to be corrected by the teacher.

And indeed, they are registered as transmission errors in the evaluation of the test results conducted after each iteration, which gives a convenient measure of transmission fidelity. The evolect becomes more learnable by gradually losing its diversity: over time, there will be fewer and fewer words and some of them will start to denote more and more objects. It effectively becomes a kind of degenerate language, in extreme cases much like the language resources of Mr. Leborgne, who never stopped trying to communicate, even though the only thing he could say was “tan tan”.
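
The loss of diversity under a bottleneck can be reproduced with a very crude simulated learner. In the sketch below every detail is an assumption of mine: the learner memorises a random half of the pairs and, for an unseen meaning, simply borrows the word of the most similar seen meaning; the error measure is just the share of meanings whose word changed, a simple stand-in rather than the measure used in the experiment.

```python
import random
from itertools import product

def shared_features(a, b):
    """Count the dimensions on which two meanings have the same value."""
    return sum(x == y for x, y in zip(a, b))

def learn_through_bottleneck(language, rng):
    """Train on a random half of the pairs, then name every meaning."""
    meanings = sorted(language)
    seen = rng.sample(meanings, len(meanings) // 2)
    seen_set = set(seen)
    output = {}
    for m in meanings:
        if m in seen_set:
            output[m] = language[m]  # memorised pair
        else:
            # Assumed strategy: reuse the word of the most similar seen meaning.
            nearest = max(seen, key=lambda s: shared_features(m, s))
            output[m] = language[nearest]
    return output

def transmission_error(parent, child):
    """Share of meanings whose signal changed between two generations."""
    return sum(parent[m] != child[m] for m in parent) / len(parent)

rng = random.Random(1)
meanings = list(product(range(3), repeat=3))
language = {m: f"w{i}" for i, m in enumerate(meanings)}  # 27 placeholder words

distinct_counts = [len(set(language.values()))]
errors = []
for _ in range(10):
    child = learn_through_bottleneck(language, rng)
    errors.append(transmission_error(language, child))
    distinct_counts.append(len(set(child.values())))
    language = child

print(distinct_counts)  # number of distinct words per generation
print(errors)           # transmission error per generation
```

With this learner the vocabulary can only shrink, because every produced word is copied from a pair the learner has seen. The toy chain therefore drifts towards a handful of highly ambiguous words, which is the degenerate outcome described above.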

However, in one of the conditions the designers of the experiment demonstrated how this effect might be counterbalanced by the opposing pressure for combinatorial expressivity. To achieve that, they filtered out all the ambiguous signals (homonyms) from every test result. We could interpret this as modelling our need to be informative, and as observing in laboratory conditions how this need influences the lectal structure. Indeed, if we want to convey some informative content precisely, we usually avoid ambiguous expressions and try to make our utterances as conclusive as possible.
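
One reading of this filtering step can be sketched in a few lines of Python. The sketch assumes that every pair whose signal is shared by more than one meaning is dropped before the data are passed on; the function name, the meaning labels and the words are made up for illustration.

```python
from collections import Counter

def filter_homonyms(language):
    """Keep only the pairs whose signal denotes exactly one meaning."""
    counts = Counter(language.values())
    return {m: s for m, s in language.items() if counts[s] == 1}

# Made-up test result: two meanings share the word "kapa".
test_result = {
    ("red", "square"): "kapa",
    ("red", "circle"): "kapa",
    ("blue", "square"): "mulo",
    ("blue", "circle"): "nesi",
}

filtered = filter_homonyms(test_result)
print(filtered)
# {('blue', 'square'): 'mulo', ('blue', 'circle'): 'nesi'}
```

A possible variant would keep one pair per ambiguous signal instead of dropping them all. Either way, ambiguous words are penalised in transmission, so the next learner only receives evidence for distinctions that the evolect still expresses.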

As a result, the evolect continuously adapts to contain more structure, and the transmission error rate also goes down, which makes the structural patterns easily recognisable. The dynamics coming from these two opposing selection pressures ensure that the combinatorial structure of the evolect becomes compositional, and thus more compressible (in the end there will be only a few morphemes plus one or two rules for combining them).
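
What "a few morphemes plus a rule" buys in terms of compression can be shown with a hand-made compositional evolect. The morphemes below are invented, and the single rule (concatenate one morpheme per dimension in a fixed order) is an assumption chosen for simplicity, not a description of the languages that actually emerged in the experiment.

```python
from itertools import product

# Invented morphemes: one per feature value, grouped by dimension.
MORPHEMES = [
    ("ka", "mu", "ne"),  # values of dimension 0
    ("pi", "lo", "sa"),  # values of dimension 1
    ("ti", "ko", "me"),  # values of dimension 2
]

def compose(meaning):
    """Single combination rule: concatenate the morphemes in a fixed order."""
    return "".join(MORPHEMES[d][v] for d, v in enumerate(meaning))

meanings = list(product(range(3), repeat=3))
language = {m: compose(m) for m in meanings}

items_to_memorise = sum(len(row) for row in MORPHEMES)
print(compose((0, 1, 2)))            # kalome
print(len(set(language.values())))   # 27 distinct, unambiguous words
print(items_to_memorise)             # 9 morphemes instead of 27 holistic words
```

Such a language satisfies both pressures at once: all 27 meanings remain distinguishable, yet a learner who has seen each morpheme at least once can, in principle, produce the words for meanings never encountered.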

To sum up this section, an important contribution of this work is the identification of two basic selective pressures acting at the combinatorial level of lectal structure, which are responsible for the emergence and further development of fundamental structural properties of language resources. What this lab experiment and the previous computer simulations tell us is that the appearance of design we observe in human language could be, at least partly, explained by the mere fact that language is transmitted culturally.

However, the modelling work accounts for these dynamics at a very high level of abstraction.

One common feature of these studies has been that they look at language “from above” and extrapolate the predictions of simple models to the human language resources as a whole. In my thesis I wish to explore a complementary perspective, by zooming in on grammatical features of particular lects,1 where we can observe structuring processes of a similar kind to those revealed in the simulation models. To this end, I am going to reinterpret cases of morphosyntactic variation in (spoken) Polish and German, which are well-documented in terms of “language errors” by linguists following the traditional, descriptive research agenda.


by Arkadiusz Jasiński
published on: 26.09.2015

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).




  • Chater, N. & Christiansen, M. H. (2010). Language acquisition meets language evolution. Cognitive Science 34, 1131–1157.
  • Hurford, J.R. (2011). The Origins of Grammar: Language in the Light of Evolution. Oxford, UK: Oxford University Press.
  • Kirby, S., Cornish, H. & Smith, K. (2008). Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proceedings of the National Academy of Sciences 105(31), 10681–10686.
  • Smith, K., Kirby, S. & Brighton, H. (2003). Iterated learning: A framework for the emergence of language. Artificial Life 9(4), 371–386.



1 This argumentation relies on two underlying assumptions: that the evolution of lectal structure proceeded at a gradual pace (although it might have been punctuated by occasional catastrophic events) and that we can infer some crucial information about it from the processes going on at the present day. Such a view is inspired by the uniformitarian assumption, originally applied in 19th-century geological explanations for the formation of Earth. As for this particular study, I am adopting the notion of dynamic uniformitarianism discussed in Hurford, 2011.