Adult Learning and Language Simplification

Abstract
Languages spoken in larger populations are relatively simple. A possible explanation for this is that languages with a greater number of speakers tend to also be those with higher proportions of non‐native speakers, who may simplify language during learning. We assess this explanation for the negative correlation between population size and linguistic complexity in three experiments, using artificial language learning techniques to investigate both the simplifications made by individual adult learners and the potential for such simplifications to influence group‐level language characteristics. In Experiment 1, we show that individual adult learners trained on a morphologically complex miniature language simplify its morphology. In Experiment 2, we explore how these simplifications may then propagate through subsequent learning. We use the languages produced by the participants of Experiment 1 as the input for a second set of learners, manipulating (a) the proportion of their input which is simplified and (b) the number of speakers they receive their input from. We find, contrary to expectations, that mixing the input from multiple speakers nullifies the simplifications introduced by individuals in Experiment 1; simplifications at the individual level do not result in simplification of the population's language. In Experiment 3, we focus on language use as a mechanism for simplification, exploring the consequences of the interaction between individuals differing in their linguistic competence (as native and non‐native speakers might). We find that speakers who acquire a more complex language than their partner simplify their language during interaction.
We ultimately conclude that adult learning can result in languages spoken by more people having simpler morphology, but that idiosyncratic simplifications by non‐natives do not offer a complete explanation in themselves; accommodation—by comparatively competent non‐natives to less competent speakers, or by native speakers to non‐natives—may be a key linking mechanism.


Stimuli set for Experiments 1 and 2
The set of images for Experiments 1 and 2 is shown in Fig. 1.

Fig. 1. Stimuli set. Participants were trained on an artificial language which provided descriptions for these 18 scenes, made up of every combination of 3 Animals (duck, bird, and crocodile), 2 Numbers (1 or 2), and 3 Movements (a straight motion, bouncing, and looping).
Our meaning-dependent measure of the complexity of the suffix sets is described in Section 2.2.2 in the main manuscript. Here, we consider an alternative meaning-independent measure of complexity, the entropy of the suffixes for each stem class (Q, N or V). The entropy of a set of signals, H(S), is given by:

$$H(S) = -\sum_{s \in S} P(s) \log_2 P(s)$$

where P(s) is the probability of suffix s. We calculate entropy separately for the suffixes associated with each word type (Q, N, and V). Entropy therefore captures the extent to which a single word type is associated with multiple suffixes: it is low when one suffix is used for most stems (e.g. entropy will be 0 when a single suffix is used consistently for all stems in a category) and high when multiple suffixes are used with equal frequency (e.g. entropy would be 1 if 2 suffixes were used with equal probability).
In the target language, entropy for quantifier suffixes, H(S_Q), is 1.918, for noun suffixes, H(S_N), is 0.918, and for verb suffixes, H(S_V), is 2.224. In the Round 2 data shown in Table 2 in the main manuscript, H(S_Q) = 1, H(S_N) = 0.991 and H(S_V) = 1.194; lower entropy reflects the relative invariance of forms.
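The entropy measure described above can be illustrated with a minimal Python sketch. The function name `suffix_entropy` and the suffix strings are hypothetical and chosen only for illustration; they are not taken from the experimental materials. The sketch assumes base-2 logarithms, which is consistent with the statement that two equiprobable suffixes give an entropy of 1.

```python
from collections import Counter
from math import log2


def suffix_entropy(suffixes):
    """Shannon entropy (in bits) of a sequence of suffix tokens.

    `suffixes` holds one suffix per produced word of a given word type
    (Q, N or V); P(s) is estimated as the relative frequency of suffix s.
    """
    counts = Counter(suffixes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one suffix token")
    # p * log2(1/p) is equivalent to -p * log2(p)
    return sum((n / total) * log2(total / n) for n in counts.values())


# Hypothetical suffix tokens, not the suffixes of the actual language.
one_suffix = ["-a"] * 6                # a single suffix used throughout
two_equal = ["-a", "-b"] * 3           # two suffixes, equal frequency
two_to_one = ["-a"] * 4 + ["-b"] * 2   # one suffix twice as frequent

print(suffix_entropy(one_suffix))            # 0.0
print(suffix_entropy(two_equal))             # 1.0
print(round(suffix_entropy(two_to_one), 3))  # 0.918
```

The last case shows that a two-to-one split between two suffixes yields 0.918 bits. This is only an arithmetic illustration of how such a value can arise, not a claim about the actual distribution of noun suffixes in the target language.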
Entropy by word type is illustrated in Fig. 2. As can be seen from this figure, entropy for each suffix type converges to the entropy of the target language over rounds; for Q and V suffixes this involves a steady increase in entropy, whereas for the (relatively simple) N suffixes participants over-shoot the target entropy from Round 2, and gradually converge on the target entropy from above.
Entropy scores were submitted to a linear regression (Bates, Maechler, & Bolker, 2013; R Core Team, 2013), with fixed effects of round (revalued such that the model intercept reflects entropy at Round 1), suffix (Q, N or V; this predictor was contrast-coded, such that the model intercept reflects the estimated entropy for the N suffix at Round 1) and their interaction; we included by-participant random intercepts and random slopes for round and suffix. This model confirms that the suffixes differ in their entropy at Round 1, as indicated by a significant intercept (b = 0.977, SE = 0.079, t = 12.290, p < 0.001; this simply reflects the fact that N entropy at Round 1 is non-zero) and significant effects for Q and V suffix types indicating that these have higher entropy (Q: b = 0.334, SE [...]

[...] This suggests that the morphological systems produced at Round 2 are somewhat less complex than those at Round 8, in that entropy is lower initially and increases with further [...]

Entropy increases with training for quantifier and verbal suffixes, but is essentially flat for nominal suffixes. Error bars are 95% confidence intervals.

[...] noun suffixes may be more likely to increase complexity than for the quantifiers and verbs.

Experiment 2 input data speakers
The subset of 12 Experiment 1 participants from which the Experiment 2 input is drawn is generally representative of the full set of Experiment 1 participants, showing the same trend of an increase in complexity from the Round 2 to the Round 8 data. To confirm this, the entropy scores were again submitted to a linear regression (Bates et al., 2013; R Core Team, 2013), with fixed effects of round (revalued so that the model intercept reflects entropy at Round 2), suffix (Q, N or V; this predictor was contrast-coded, such that the model intercept reflects the estimated entropy for the N suffix at Round 2) and their interaction; we included by-participant random intercepts. Unlike the model for the full data set, this model includes no random slopes; omitting them was necessary for model convergence.
The model confirmed that the suffixes differ in their entropy at Round 2, as indicated

Experiment 2 input complexity
The same two measures reported for the participant productions (suffix entropy and complexity) can be applied to the input data that participants in each of our conditions received.
These results are plotted in Fig. 4. As for the output of the participants trained on these various languages, the input languages themselves, despite being composed of rather different constituent languages, show no systematic differences between conditions. This is confirmed by regression analyses using identical models to those described in the preceding sections, which again show no effects of population size or input composition on entropy (lowest p observed in the fixed effect of input composition, b = -0.023, SE = 0.013, t = -1.765, p = .095, indicating at best a suggestion that overall entropy might be higher in [...]

Fig. 4. Entropy (upper row) and complexity (lower row) by condition and word type for the inputs. The left-hand panels show means averaging over the 3 suffix types; the right-hand panels show entropy/complexity broken down by suffix. There is no evidence of a condition-dependent difference in entropy or complexity. Error bars are 95% confidence intervals. Points illustrate data from individual participants.

Stimuli set. Made up of every combination of 3 Animals (duck, dog, and crocodile) and 3 Movements (a straight motion, bouncing, and looping).