In this issue of Acta Paediatrica, Christine Moon, Hugo Lagercrantz and Patricia K. Kuhl report their findings on phoneme learning before birth. They demonstrate that prenatal exposure to the voice of the mother (and possibly of other people) speaking her native language alters the processing of phonemes. Specifically, they show that newborn infants seem to perceive variants of phonemes differently depending on whether the phonemes belong to the mother's native language or not.
This result adds a completely new dimension to foetal language learning. Foetal or very early neonatal learning of several key aspects of language has already been demonstrated in several studies. There is plenty of evidence that the early human auditory system is very effective in learning from sounds and language in a variety of ways. For example, high-level abstract rules are automatically extracted by neonatal brains from the auditory environment, allowing them to segregate sounds into streams, form coherent sound objects on the basis of sound feature combinations and spot rule-breaking events after forming predictions on the basis of previously heard sound patterns. The neonatal capabilities in statistical learning and the amazing speed at which such learning occurs in a sleeping neonate demonstrate that the neonatal brain is highly capable of differentiating 30 syllables, grouping them according to their statistical co-occurrences, keeping this information in memory and reacting to syllables with high information content (processing the first syllables of the ‘words’ differently from other syllables). In addition, newborn infants have been shown to react to prosodic aspects of speech, to prefer their mother's native language and even to replicate the learned melody and intensity contours of their native language in their cry. It thus seems that the neonatal brain is very well tuned towards learning from sounds and specifically from language. The study of Moon, Lagercrantz and Kuhl shows for the first time that neonates already possess information on their native language phonemes and that this information was obtained during the foetal period. This new view of foetal and neonatal language learning challenges the previous views.
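The statistical-learning mechanism described above can be illustrated with a small sketch. The syllable inventory and stream below are hypothetical, constructed only to mimic the classic design of such experiments (not the materials of the studies cited): ‘words’ are defined purely by the fact that transitions between syllables within a word are far more predictable than transitions across word boundaries.

```python
import random
from collections import Counter

# Hypothetical three-syllable 'words', concatenated in random order,
# as in classic statistical-learning designs (illustration only).
words = [("tu", "pi", "ro"), ("go", "la", "bu"), ("bi", "da", "ku")]
random.seed(0)
stream = [syll for _ in range(200) for syll in random.choice(words)]

# Count how often each syllable pair occurs in adjacent positions.
pair_counts = Counter(zip(stream, stream[1:]))
first_counts = Counter(stream[:-1])

def transitional_probability(a, b):
    """P(b | a): how predictably syllable b follows syllable a."""
    return pair_counts[(a, b)] / first_counts[a]

# Within-word transitions are perfectly predictable (probability 1.0),
# whereas transitions across a word boundary are not: this contrast is
# the statistical cue that segments the continuous stream into 'words'.
print(transitional_probability("tu", "pi"))  # within a word -> 1.0
print(transitional_probability("ro", "go"))  # across a boundary -> below 1.0
```

The point of the sketch is only that the required computation over co-occurrence statistics is simple in principle; the remarkable finding is that a sleeping neonatal brain performs something functionally equivalent.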
Until now, we have assumed that foetal and neonatal language learning would be based on such language components as slow, clear emotional prosodic cues or on co-occurrences over hundreds of repetitions, whereas these new results suggest that even very rapidly changing events, such as phoneme formant frequencies, can be learned by the foetus. This implies that the foetal brain close to term can actually perceive and learn most of the key aspects of language.
Prosody is an interesting part of human language. Prosodic cues can be very strong, and one could assume that in many cases in infancy they would override phonemic cues. Infant brains react strongly to changes in the emotional prosody of a voice. We share emotional prosodic features (and gestures) with nonhuman primates: it is very easy for a human to understand the emotions in monkey vocalizations, such as a tender call by a female monkey to her offspring compared with the aggressive roar of a violent male. Naturally, infants too are very capable of recognizing such emotional expressions in voice.
The set of acoustic features that we use to form emotional prosody is shared not only in the communication of all primates but also in the expressions of music. Musical expressions are built on the same universal skills of the human auditory system as those of language. Both music and speech use the same features to signal fear, with loud, low, noisy sounds and voices, and both express joy using higher, faster and lighter sounds and voices. Some researchers even claim that the native language of a composer of instrumental music affects the structure of the music. There are also many examples of similarities between musical emotional features and physical emotional gestures, as well as between gestures and language prosody. From this point of view, prosody may well be the most multisensory part of language and, as such, clearly the most comprehensible across humans of different ages and different language backgrounds, and across species; we may thus call prosody ‘our first language’.
The results of Moon et al. are also very interesting from the point of view of in utero exposure alone. With respect to auditory stimulation produced by the mother's voice or by sounds outside the mother's body, it is challenging to come to a conclusion on the total frequency response curve of the foetus. Such a curve would have to take into account all the attenuating and masking phenomena that result from factors such as the mother's tissues and their attenuating and reflective qualities for different types of sounds, the amniotic fluid, the position of the foetal head, the liquid in the foetal ear canals and middle ear and the developing sensitivity of the foetal middle and inner ear to different frequencies at different stages of pregnancy. Several models exist (e.g. in sheep), and demanding underwater measurements in humans have been reported, but a full consensus has not been reached. Importantly, this study contributes greatly to this issue. For the foetal brain to construct memory trace prototypes of the native language phonemes, it must perceive the formant frequencies of the speech heard in utero quite accurately. It is not enough that the auditory system receives information from frequencies below, say, 500 Hz; to differentiate the vowel variants in the study, accurate hearing up to approximately 2600 Hz (approximately 1750 mel) is required. Because this study shows learning that requires hearing such high frequencies in the uterus, it can be taken as evidence that this information is available to the foetus, is not too significantly attenuated, reflected or masked by the conditions in utero and is processed by the foetal auditory system.
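The mel figure quoted above can be checked with the widely used O'Shaughnessy approximation of the mel scale, mel = 2595 · log10(1 + f/700); whether the original authors used this exact variant of the scale is an assumption here, but it reproduces the cited value closely.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel using the common
    O'Shaughnessy approximation: 2595 * log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

# 2600 Hz comes out at roughly 1747-1748 mel, matching the
# 'approximately 1750 mel' figure in the text; 500 Hz, the
# low-frequency range often assumed available in utero, is
# only about 600 mel by comparison.
print(hz_to_mel(2600.0))
print(hz_to_mel(500.0))
```

The comparison makes the argument concrete: on the perceptual mel scale, the vowel formants in question lie almost three times higher than the often-assumed 500 Hz limit of intrauterine hearing.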
Moon et al. conclude that ‘birth is not a benchmark that reflects a complete separation between the effects of nature versus those of nurture’; that is, exposure to sounds starts shaping the developing brain already before birth, providing it with the necessary tools to perceive the native language. Here, in our modern society, there is a caveat. Millions of mothers across the world work in noisy environments during the last months of their pregnancy. As we now know that the foetal brain changes and learns from such minute features of the external auditory environment as phoneme formants, it may also change in response to repetitive, loud environmental sounds such as machines or vehicles. The effects of such workplace noise exposure on foetal hearing, and especially on foetal brain development, should thus be investigated. In addition, the effects of the auditory environment in neonatal care units on the developing infant brain, especially with respect to the amount of speech input it provides compared with other sounds, should be studied.
Our view of the first steps of human language is quickly changing. Many key aspects of language that are evident in the behaviour of a child at 1–2 years have been shown to develop their neural bases very early, during the foetal period or early infancy. Modern neuroscience and other experimental methods are developing rapidly and allow us to study such capabilities. In the future, we expect exciting new findings on infant language capabilities and hope that this information will also prove useful in cases where language development is compromised for any reason. These new results highlight the fact that, in normal development, a great deal is learned at a very early age and that care during the foetal period and early infancy may prove to be extremely important for childhood language development.