Thirty-one children participated in the experiment. The data from six subjects were discarded from the analysis either because there were < 60% of artifact-free trials (n = 4) or because of incomplete questionnaire data (n = 2). The mean age of the remaining 25 subjects (13 females) was 2.79 years (range 2.38–3.29 years). The ERP data of 13 subjects were reported earlier in Putkinen et al. (2012). Signed informed consent was obtained from the parents for their child's participation in the experiment. The child's consent was obtained verbally. The experiment protocol was approved by the Ethical Committee of the former Department of Psychology, University of Helsinki, Finland.
The participants had Finnish as their native language and were from families with two parents and one to three children. For 18 of the families, at least one parent had either a bachelor's (or equivalent), master's, or doctoral degree and for the majority of the families their monthly income was at or above the Finnish average level. The parents were asked about their child's possible hearing difficulties and other illnesses. The parents also provided the child's health summary, which contained information from the child's regular visits to a nurse and/or medical doctor that had occurred at least three times per year. Except for allergies, atopic skin or asthma, the subjects had no illnesses and no reported hearing or other medical problems. The children were born at full term, had normal birth weights, and their weight and height had developed normally.
All of the children also had some musical experience outside the home as they had all attended the same playschool involving musical activities. The playschool sessions took place on a weekly basis except for the summer months and national holidays (max. approximately 30 sessions/year). In the playschool, the emphasis was on the enjoyment of playful musical group activities such as singing in a group, rhyming, and moving with the music, and not on a formal music-educational program involving training on musical instruments. According to the parents, all the children had attended the playschool regularly and displayed great interest in the playschool activities. One of the parents always accompanied the children in the playschool.
Stimuli and procedure
During the experiment, the children sat in a recliner chair either on a parent's lap, or by themselves while the parent sat on a chair next to them in an acoustically attenuated and electrically shielded room. The children and their parents were instructed to move as little as possible and to silently concentrate on a self-selected book and/or children's DVD (with the volume turned off) during the experiment. Generally, the children were able to comply with these instructions well although all children talked and switched their position at least a few times during the recordings. The subjects were video-monitored throughout the 50 min experiment.
The multi-feature paradigm (Näätänen et al., 2004; Putkinen et al., 2012) was used in the experiment. In the paradigm, deviant tones (probability = 0.42) from five categories and novel sounds (probability = 0.08) alternated with standard tones (probability = 0.50). The order of the deviant tones and novel sounds was pseudo-random (with the restriction that two successive non-standard sounds were never from the same category).
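The ordering constraint can be illustrated with a minimal sketch. It assumes strict alternation (every non-standard sound is preceded by one standard tone) and a simple retry strategy when the random draw reaches a dead end; the function name `make_sequence`, the category labels, and the counts are hypothetical placeholders, not the values or the software used in the study.

```python
import random


def make_sequence(counts, seed=0, max_tries=1000):
    """Alternate standards with non-standard sounds; never repeat a category back to back."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        remaining = dict(counts)
        order, previous = [], None
        while any(remaining.values()):
            # Candidates: categories still available and different from the previous one.
            options = [c for c, n in remaining.items() if n > 0 and c != previous]
            if not options:
                break  # dead end: only the previous category is left, so retry
            weights = [remaining[c] for c in options]
            previous = rng.choices(options, weights=weights)[0]
            remaining[previous] -= 1
            order.append(previous)
        else:
            # All sounds placed: interleave one standard tone before each of them.
            sequence = []
            for sound in order:
                sequence.extend(["standard", sound])
            return sequence
    raise RuntimeError("no valid order found")


# Illustrative counts only (assumption), far smaller than in the experiment.
illustrative_counts = {"frequency": 6, "duration": 6, "intensity": 4,
                       "location": 4, "gap": 4, "novel": 4}
sequence = make_sequence(illustrative_counts)
nonstandard = sequence[1::2]
```

Weighting each draw by the remaining count keeps the categories spread over the whole sequence instead of leaving one category to pile up at the end.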
The stimulus sequence included 1875 standard tones, 1590 deviant tones, and 288 novel sounds. The sounds were presented with a stimulus onset asynchrony of 800 ms. The first six tones of the block were standard tones, of which the first five were excluded from the analysis. The stimuli were presented through two loudspeakers in front of the participant at a distance of 1.5 m and at an angle of 45° to the right and left.
The standard and deviant tones included the first two upper partials of the fundamental frequency. Compared with the fundamental, the intensities of the second and third partials were −3 and −6 dB, respectively.
The standard tones had a fundamental frequency of 500 Hz, were 200 ms in duration (including 10 ms rise and 20 ms fall times), and were presented at an intensity of 80 dB (sound pressure level) via both loudspeakers.
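As a rough illustration of the tone structure described above, the following sketch synthesizes a standard tone. Several details are assumptions not given in the text: the partials are taken to be exact harmonics (2 and 3 times the fundamental), the dB values are treated as amplitude ratios (20 log10), the ramps are linear, and the sampling rate is 44.1 kHz. Calibration to 80 dB sound pressure level depends on the playback chain and is not modelled.

```python
import math


def standard_tone(f0=500.0, duration=0.200, rise=0.010, fall=0.020, fs=44100):
    """Return a peak-normalized harmonic tone as a list of samples."""
    partial_gains_db = [0.0, -3.0, -6.0]  # fundamental, second and third partials
    gains = [10 ** (db / 20) for db in partial_gains_db]
    n_samples = round(duration * fs)
    samples = []
    for i in range(n_samples):
        t = i / fs
        value = sum(g * math.sin(2 * math.pi * f0 * (k + 1) * t)
                    for k, g in enumerate(gains))
        # Linear onset and offset ramps (assumed shape).
        envelope = min(1.0, t / rise, (duration - t) / fall)
        samples.append(envelope * value)
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]


tone = standard_tone()
```

A deviant would be produced by changing a single argument or processing step, for example `standard_tone(f0=750.0)` for the large frequency increment.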
Each deviant tone differed from the standard tones in frequency, intensity, duration, sound-source location, or by having a silent gap in the middle, but otherwise they were identical to the standard tones. The frequency deviants included large (f0: 750 or 333.3 Hz), medium (f0: 625 or 400 Hz) and small (f0: 550 or 454.5 Hz) frequency increments and decrements. The duration deviants included large, medium and small duration decrements, which were 100, 150, and 175 ms in duration, respectively. Only the responses to the largest frequency and duration deviants were included in the analysis because of their better signal-to-noise ratio compared with the responses to the smaller deviants. The gap deviant had a 5 ms silent gap (5 ms fall and rise times) in the middle of the sound. The intensity deviants were either −6 or +6 dB compared with the standard. Finally, the sound-source location deviants were delivered through either only the left or right speaker (no intensity compensation was employed). The large frequency and duration deviants were each presented 140 times and the intensity, sound-source location, and gap deviants, in turn, were presented 250 times each.
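The listed frequency deviants are consistent with symmetric frequency ratios around the 500 Hz standard. The ratios 1.5, 1.25, and 1.1 below are inferred from the reported values rather than stated in the text, so the following is only a worked check under that assumption.

```python
standard_f0 = 500.0
# Ratios inferred from the reported deviant frequencies (assumption).
ratios = {"large": 1.5, "medium": 1.25, "small": 1.1}

deviants = {}
for size, ratio in ratios.items():
    increment, decrement = standard_f0 * ratio, standard_f0 / ratio
    deviants[size] = (increment, decrement)
    print(f"{size}: increment {increment:.1f} Hz, decrement {decrement:.1f} Hz")
```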
In addition, repeating and varying novel sounds were included in the sequence. Similarly to the standard tones, the novel sounds were 200 ms in duration and their mean intensity was 80 dB. The varying novel sounds were machine sounds, animal calls, etc., whereas the repeating novel sound was the word /nenä/ (‘nose’ in Finnish), spoken in a neutral female voice. The repeating and varying novel sounds were presented 216 and 72 times, respectively. Unlike the repeating novel sounds, each individual varying novel sound was presented no more than four times during the whole experiment. Furthermore, one-third of the varying novel sounds were presented via the right, one-third via the left, and one-third via both loudspeakers, whereas the repeating novel sounds were always presented through both loudspeakers. Because of these factors, the varying novel sounds are arguably more likely to trigger cognitive processes related to novelty detection and distraction than the repeating novel sounds. Consequently, only the responses to the varying novel sounds were included in the analysis of the current study.
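The presentation counts can be checked against the stated stimulus probabilities and the approximate session length. The sketch below takes the novel-sound total as the sum of the repeating and varying presentations (216 + 72) and assumes that every sound occupies exactly one 800 ms stimulus onset asynchrony.

```python
SOA_S = 0.8  # stimulus onset asynchrony in seconds

counts = {
    "standard": 1875,
    "deviant": 1590,
    "novel": 216 + 72,  # repeating + varying novel sounds
}

total = sum(counts.values())
proportions = {kind: n / total for kind, n in counts.items()}
duration_min = total * SOA_S / 60

for kind, p in proportions.items():
    print(f"{kind}: {p:.2f}")
print(f"total duration: {duration_min:.1f} min")
```

Under these assumptions the proportions round to 0.50, 0.42, and 0.08, and the sequence lasts about 50 min.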
The parents of the children filled out a detailed questionnaire concerning the musical behaviour of their children and their own musical activities at home. With regard to singing, both parents were asked to report (i) how often they sang to their children, and more specifically (ii) how often this involved singing familiar songs (e.g. well-known children's songs) or (iii) songs they had invented themselves. With regard to the musical behaviours of the children at home, the parents rated (i) how often their children sang familiar melodies, (ii) sang self-invented melodies, (iii) drummed rhythms, or (iv) danced at home. For all the aforementioned questions, the answers were given using a five-point scale (1, almost never; 2, once a month at most; 3, several times a month; 4, approximately once a week; 5, almost daily).
The scores for the questions related to singing were added together to form a composite singing score for each parent separately. Similarly, the scores for the questions regarding the musical behaviour of the children were summed to form a composite musical behaviour score for each child. Finally, these composite scores were normalized by subtracting the mean of the variable from each score and dividing this difference by the SD of the variable (hence, scores below the mean are negative). The normalized musical behaviour scores and the fathers' singing scores were added together to form an overall composite score for musical activities at home. In line with previously reported differences in the prevalence of maternal and paternal singing (Trehub et al., 1997), the overwhelming majority of the mothers responded with the highest possible value to all the questions related to child-directed singing. In contrast, there was considerable variation in the amount of singing reported by the fathers. Therefore, for the questions regarding child-directed singing, only the fathers' scores were included in the analysis.
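The scoring steps can be sketched as follows. The ratings are invented for illustration, and the use of the sample SD (n − 1 denominator) is an assumption, since the text does not specify which SD was used.

```python
from statistics import mean, stdev


def zscores(values):
    """Normalize by subtracting the mean and dividing by the sample SD (assumed)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


# Hypothetical five-point ratings: four child items and three father items per family.
child_items = [[5, 4, 3, 5], [2, 3, 1, 4], [4, 4, 4, 5], [3, 2, 2, 3]]
father_items = [[3, 3, 1], [5, 4, 2], [2, 2, 1], [4, 5, 3]]

child_composite = [sum(row) for row in child_items]
father_composite = [sum(row) for row in father_items]

# Overall musical activities at home score: sum of the two normalized composites.
overall = [c + f for c, f in zip(zscores(child_composite), zscores(father_composite))]
```

Normalizing before summing gives the child and father composites equal weight even though they are built from different numbers of items.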
Electroencephalographic recording and data analysis
The electroencephalogram (band pass during recording 0.10–70 Hz, 24 dB per octave roll off, 500 Hz sampling rate) was recorded (NeuroScan 4.3) from the channels F3, F4, C3, C4, Pz, and the left and right mastoids using Ag/AgCl electrodes with a common reference electrode placed at Fpz. The electro-oculogram was recorded with electrodes placed above and at the outer canthus of the right eye. At the beginning of the measurement, the impedance of the electrodes was lower than 10 kΩ.
The data were filtered offline between 0.5 and 20 Hz. Electroencephalographic epochs from 100 ms before to 800 ms after tone onset were extracted and baseline corrected against the 100 ms prestimulus interval. Epochs with a voltage exceeding ± 100 μV at any channel were discarded. After averaging the remaining epochs separately for each stimulus and subject, the resulting ERPs were re-referenced to the average of the two mastoids. Grand-average responses were formed by averaging the individual ERPs separately for each deviant type, novel sounds and the standards. Difference signals were computed by subtracting the responses elicited by the standard tone from the responses elicited by each deviant tone and the novel sounds.
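The epoch-level steps can be summarized in a short sketch. It is not the software used in the study: filtering is omitted, epochs are represented as plain dictionaries mapping channel names to lists of microvolt samples, and the mastoid labels `LM` and `RM` as well as all function names are hypothetical.

```python
BASELINE_SAMPLES = 50  # 100 ms prestimulus interval at a 500 Hz sampling rate
THRESHOLD_UV = 100.0


def baseline_correct(epoch):
    """Subtract the mean of the prestimulus interval from every channel."""
    corrected = {}
    for channel, samples in epoch.items():
        base = sum(samples[:BASELINE_SAMPLES]) / BASELINE_SAMPLES
        corrected[channel] = [s - base for s in samples]
    return corrected


def is_clean(epoch):
    """Keep the epoch only if no channel exceeds +/- 100 microvolts."""
    return all(abs(s) <= THRESHOLD_UV for samples in epoch.values() for s in samples)


def average_epochs(epochs):
    """Average the epochs sample by sample for each channel."""
    return {ch: [sum(col) / len(epochs) for col in zip(*(e[ch] for e in epochs))]
            for ch in epochs[0]}


def rereference(erp, mastoids=("LM", "RM")):
    """Re-reference the averaged ERP to the mean of the two mastoid channels."""
    ref = [(a + b) / 2 for a, b in zip(erp[mastoids[0]], erp[mastoids[1]])]
    return {ch: [s - r for s, r in zip(samples, ref)]
            for ch, samples in erp.items() if ch not in mastoids}


def erp_for(epochs):
    kept = [e for e in map(baseline_correct, epochs) if is_clean(e)]
    return rereference(average_epochs(kept))


def difference_wave(deviant_erp, standard_erp):
    return {ch: [d - s for d, s in zip(deviant_erp[ch], standard_erp[ch])]
            for ch in deviant_erp}


def make_epoch(post_uv):
    """Synthetic 450-sample epoch: flat baseline, constant post-stimulus offset."""
    return {"F3": [1.0] * 50 + [1.0 + post_uv] * 400,
            "LM": [0.0] * 450, "RM": [0.0] * 450}


deviant = erp_for([make_epoch(-4.0), make_epoch(-2.0), make_epoch(500.0)])
standard = erp_for([make_epoch(1.0), make_epoch(1.0)])
diff = difference_wave(deviant, standard)
```

In the synthetic example the third deviant epoch exceeds the rejection threshold and is dropped, so the deviant-minus-standard difference at F3 is −4 μV after stimulus onset and 0 μV in the baseline.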
The peak latencies of the responses were determined from the deviant/novel-standard difference signals at channel F3, which was deemed representative of the responses at all four channels included in the analysis. For the deviant tones, the peak latency for the MMN was defined as the latency of the largest negativity between 200 and 300 ms, for the P3a as the latency of the largest positivity between 200 and 300 ms, and for the LDN as the latency of the largest negativity between 500 and 600 ms after the deviant became physically distinct from the standard. For the novel sounds, in turn, the peak latency of the P3a was determined as the latency of the largest positivity between 200 and 300 ms and for the LDN/RON as the latency of the largest negativity between 600 and 700 ms.
For the analysis of the MMN and P3a, mean amplitudes of the responses were calculated on channels F3, F4, C3 and C4 over 50 ms time windows centred on the peak latencies. These values were then averaged together separately for each response and the average value was used for testing the significance of the response and for the correlation analyses. An identical procedure was used for the LDN and novelty P3a except that a 100 ms time window was used in the analyses as these responses spanned a longer time period than the MMN and the P3a elicited by the deviant tones.
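A minimal sketch of the peak-latency and mean-amplitude measures is given below. It assumes a 500 Hz sampling rate with the first sample at −100 ms, measures latencies from time zero of the difference signal (any offset for the point at which a deviant becomes physically distinct is not modelled), and uses a synthetic difference wave; the function names are hypothetical.

```python
import math

FS = 500          # sampling rate in Hz
PRESTIM_MS = 100  # the first sample of each wave is at -100 ms


def window_indices(start_ms, end_ms):
    """Indices of all samples whose latency lies within [start_ms, end_ms]."""
    first = math.ceil((start_ms + PRESTIM_MS) * FS / 1000)
    last = math.floor((end_ms + PRESTIM_MS) * FS / 1000)
    return range(first, last + 1)


def index_to_ms(index):
    return index * 1000 / FS - PRESTIM_MS


def peak_latency(wave, start_ms, end_ms, polarity):
    """Latency of the largest negativity or positivity within the search window."""
    pick = min if polarity == "negative" else max
    return index_to_ms(pick(window_indices(start_ms, end_ms), key=lambda i: wave[i]))


def mean_amplitude(waves, channels, centre_ms, width_ms):
    """Mean over a window centred on the peak, then averaged across channels."""
    idx = window_indices(centre_ms - width_ms / 2, centre_ms + width_ms / 2)
    per_channel = [sum(waves[ch][i] for i in idx) / len(idx) for ch in channels]
    return sum(per_channel) / len(per_channel)


# Synthetic difference wave: a triangular negativity of -5 microvolts peaking at 250 ms.
wave = [-max(0.0, 5.0 - abs(index_to_ms(i) - 250) / 10) for i in range(450)]
waves = {ch: wave for ch in ("F3", "F4", "C3", "C4")}

latency = peak_latency(waves["F3"], 200, 300, "negative")
amplitude = mean_amplitude(waves, ("F3", "F4", "C3", "C4"), latency, 50)
```

Using the mean over a window rather than the single peak value makes the amplitude measure less sensitive to residual noise in individual averages.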
To test the statistical significance of the MMN, P3a and the LDN for a given deviant, the mean amplitudes were compared with zero with a two-tailed one-sample t-test. Pearson's correlation coefficients between the overall musical activities at home score and the MMN, P3a, and LDN amplitudes were calculated. Partial correlations between the response amplitudes and the overall musical activities at home score were also calculated to control for various external factors. These factors included the child's age, gender, and socioeconomic status. The socioeconomic status measure included the income and education of both parents measured on six-step scales (income scale: 1, under 1000 Euros/month; 2, 1000–2000 Euros/month; 3, 2000–3000 Euros/month; 4, 3000–4000 Euros/month; 5, 4000–5000 Euros/month; 6, over 5000 Euros/month; education scale: 1, comprehensive school; 2, upper secondary school or vocational school; 3, a higher degree than upper secondary school or vocational school that is not a bachelor's, master's, licentiate, or doctoral degree; 4, bachelor's degree or equivalent; 5, master's degree or equivalent; 6, licentiate or doctoral level degree). The answers of both parents to these questions (i.e. a number from one to six) were added together to form a composite socioeconomic status score for the parents of each child.
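The two kinds of test can be sketched with the standard library alone. The partial correlation below uses the textbook recursive formula, which removes one control variable at a time; this is one common way to compute it and is not necessarily the procedure used by the statistics package in the study. P-values are left out because they require the t distribution, and all data are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values):
    """t statistic and degrees of freedom for a test of the mean against zero."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n)), n - 1


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y controlling for a list of covariates (recursive formula)."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    rxy = partial_corr(x, y, rest)
    rxz = partial_corr(x, z, rest)
    ryz = partial_corr(y, z, rest)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical values: activity scores, a control variable, and response amplitudes.
score = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
control = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
amplitude = [s + 2 * c for s, c in zip(score, control)]

t_value, df = one_sample_t([-1.0, -2.0, -3.0])
r_partial = partial_corr(score, amplitude, [control])
```

In this constructed example the amplitude is an exact linear combination of the score and the control variable, so the partial correlation equals 1 once the control variable is removed.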
Exposure to recorded music at home was not included in the musical activities index because it was expected that the more active and interactive musical behaviours would be more likely to be associated with auditory development in 2–3-year-olds (cf. Gerry et al., 2012). The number of hours per week that the parents listened to music from CDs, DVDs, radio etc. with their children was nevertheless included as a control variable in the partial correlations to verify that the correlations between the measures of main interest were not mediated by this variable. For this particular factor, either the mother's or father's response was missing for five children and was replaced by the median response. The duration of the playschool attendance (average 17 months; range 1–30 months) was also included as a control variable. It should be noted that neither the exposure to recorded music nor the duration of the playschool attendance correlated significantly with the response amplitudes or the measures included in the musical activities index at the conventional 0.05 level. For all of the control variables, however, the P-value for the correlation with one or more of the responses or the musical activities index was lower than 0.20, which justifies the inclusion of these factors in the statistical model (Maldonado & Greenland, 1993) despite the reduction in parsimony. As a further control, two-tailed independent samples t-tests were conducted to compare the response amplitudes and the composite musical activities scores of the children whose parents (one or both) were active musicians (N = 10) with those of the rest of the children.
These preliminary analyses revealed no evidence for differences between these groups: musical activities at home score: t(23) = 0.06, P = 0.95; duration: MMN t(23) = 1.82, P = 0.081, P3a t(23) = −1.00, P = 0.326, LDN t(23) = −0.345, P = 0.733; gap: MMN t(23) = 1.05, P = 0.306, P3a t(23) = −0.793, P = 0.436, LDN t(23) = −0.484, P = 0.633; frequency: LDN t(23) = −0.504, P = 0.619; intensity: LDN t(23) = 1.55, P = 0.136; location: LDN t(23) = −0.390, P = 0.700; and novel sounds: P3a t(23) = −1.23, P = 0.212, RON t(23) = 0.125, P = 0.902.