Concurrent affective and linguistic prosody with the same emotional valence elicits a late positive ERP response

Abstract: Change in linguistic prosody generates a mismatch negativity response (MMN), indicating neural representation of linguistic prosody, while change in affective prosody generates a positive response (P3a), reflecting its motivational salience. However, the neural response to concurrent affective and linguistic prosody is unknown. The present paper investigates the integration of these two prosodic features in the brain by examining the neural response to separate and concurrent processing by electroencephalography (EEG). A spoken pair of Swedish words—[ˈfɑ́ːsɛn] phase and [ˈfɑ̀ːsɛn] damn—that differed in emotional semantics due to linguistic prosody was presented to 16 subjects in an angry and neutral affective prosody using a passive auditory oddball paradigm. Acoustically matched pseudowords—[ˈvɑ́ːsɛm] and [ˈvɑ̀ːsɛm]—were used as controls. Following the constructionist concept of emotions, accentuating the conceptualization of emotions based on language, it was hypothesized that concurrent affective and linguistic prosody with the same valence—angry [ˈfɑ̀ːsɛn] damn—would elicit a unique late EEG signature, reflecting the temporal integration of affective voice with emotional semantics of prosodic origin. In accordance, linguistic prosody elicited an MMN at 300–350 ms, and affective prosody evoked a P3a at 350–400 ms, irrespective of semantics. Beyond these responses, concurrent affective and linguistic prosody evoked a late positive component (LPC) at 820–870 ms in frontal areas, indicating the conceptualization of affective prosody based on linguistic prosody. This study provides evidence that the brain not only distinguishes between these two functions of prosody but also integrates them based on language and experience.

The MMN response is typically followed by a fronto-centrally maximal P3a response, reflecting an involuntary attention switch to the auditory change, which indexes the salience and contextual novelty of stimuli (Escera, Alho, Winkler, & Näätänen, 1998; Escera & Corral, 2007; Näätänen et al., 2007; Polich, 2007). The P3a response has been shown to be especially sensitive to prosodic salience (Wambacq & Jerger, 2004; Wang, Friedman, Ritter, & Bersick, 2005). Prosodic changes in unattended speech sounds have, for instance, been shown to capture more attention than temporal changes and, therefore, to elicit a P3a response (Wang et al., 2005). Similarly, a larger P3a response has been shown for affective prosody than for neutral prosody (Carminati, Fiori-Duharcourt, & Isel, 2018; Pakarinen et al., 2014), probably due not only to the acoustic salience but also to the motivational salience of affective prosody (Wambacq & Jerger, 2004; Wang et al., 2005; see also Bradley, Codispoti, Cuthbert, & Lang, 2001; Bradley et al., 2003; Schupp et al., 2004). The affective function of prosody differentiates well among emotional expressions and is equally important for spoken communication. Previous ERP research has provided evidence for the integration of affective prosody and emotional semantics in the efficient communication of auditory emotional sentences (Kotz & Paulmann, 2007; Paulmann & Kotz, 2008; Wambacq & Jerger, 2004). Concurrent affective prosody and emotional semantics with the same valence have, for instance, been shown to elicit a larger P3a response than either affective prosody or emotional semantics alone, reflecting the automatic integration of emotional information (Wambacq & Jerger, 2004).
A major issue in comprehending speech communication is to determine how the brain handles multiple prosodic cues concurrently to extract information from speech that is encoded linguistically and affectively. In the present paper, we examined the integration of these prosodic cues by investigating the pre-attentive neural response to concurrent and separate processing of affective and linguistic prosody, respectively. To this end, a Swedish word pair in which the linguistic prosody modulates the emotional semantics, 'fasen' [ˈfɑ́ːsɛn] phase and 'fasen' [ˈfɑ̀ːsɛn] damn, was investigated once with an angry and once with a neutral affective prosody. Anger was chosen because the recognition of vocal anger tends to be higher than that of other emotions irrespective of language (Pell, Monetta, & Paulmann, 2009a) and to achieve the valence match between affective prosody and the emotional semantics of the swear word [ˈfɑ̀ːsɛn] damn. Acoustically matched pseudowords, [ˈvɑ́ːsɛm] and [ˈvɑ̀ːsɛm], were used to distinguish sensory prosodic processing from cognitive prosodic processing. In accordance with the previously established contribution of linguistic prosody to semantic processing (Zora et al., 2015, 2016), we hypothesize, firstly, that linguistic prosody will generate a larger MMN response in words than in pseudowords. Secondly, in line with previous literature (Carminati et al., 2018; Pakarinen et al., 2014), we hypothesize that a change from neutral to affective prosody will elicit an enhanced P3a response, reflecting its intrinsic motivational salience. Thirdly, we hypothesize that a unique late ERP response will be elicited by concurrent affective and linguistic prosody with the same valence, reflecting the temporal integration of affective prosody with emotional semantics of prosodic origin.
According to the constructionist view, the attribution of emotions is the result of a conceptual analysis of core affect, a term used to refer to pre-conceptual physiological states of valence responding to motivationally salient stimuli, based on language and previous experience (Barrett, 2006; Barrett & Bliss-Moreau, 2009; Russell, 2003, 2009; for a review, see Lindquist, Wager, Kober, Bliss-Moreau, & Barrett, 2012). Thus, we expect that the conceptualization of affective prosody based on linguistic prosody will be reflected in general topological principles of brain network organization (Bullmore & Sporns, 2009; Hickok & Poeppel, 2007; Rauschecker & Scott, 2009; Smith et al., 2009; Specht, 2014; Frühholz, Wiebke, & Kotz, 2016), giving rise to the unfolding of the integrative brain response. The resulting activation pattern is argued to index the continuous interaction between core affect and representations of linguistic memory (conceptual knowledge) in the production of cognitive processes (Lindquist et al., 2012).

| Participants
Sixteen native speakers of Swedish (eight males, eight females; age range 20-47 years, M = 26.4, SD = 6.3) were recruited and tested in Stockholm. As assessed with the Edinburgh Handedness Inventory (Oldfield, 1971), all participants were right-handed and reported normal development and hearing. All participants gave written informed consent before the experiment, which was carried out in accordance with the Declaration of Helsinki. The protocol was approved by the Stockholm Regional Ethics Committee (2015/63-31).

| Stimuli
The study used the Swedish words [ˈfɑ́ːsɛn] phase and [ˈfɑ̀ːsɛn] damn, which are identical in segmental structure but differ in linguistic prosody (low tone and high tone, respectively) as well as in their emotional semantics (emotionally neutral and emotionally valenced, respectively). Phonologically acceptable pseudowords, [ˈvɑ́ːsɛm] and [ˈvɑ̀ːsɛm], which differed from the real words only in the initial and final segments, were used as controls. All stimuli were pronounced once with an affective prosody (angry voice) and once with a neutral prosody (neutral voice). Two forms of anger have previously been reported: hot (i.e., uncontrolled) anger and cold (i.e., suppressed) anger. While hot anger tends to be pronounced with a relatively high f0 mean and intensity, cold anger is expressed with a moderate or low f0 mean and f0 range (Banse & Scherer, 1996; Hammerschmidt & Jürgens, 2007; Pell, Paulmann, Dara, Alasseri, & Kotz, 2009b). In the present study, cold anger was used in order to avoid possible effects of the inherent acoustic salience of explosive anger on the neural responses.
All stimuli were articulated several times by a classically trained Swedish female singer and speech-language pathologist (from Stockholm, 60 years old) in an anechoic chamber. The recordings were made using a Brüel & Kjær 1/2" free-field microphone (Type 4189) and the REAPER digital audio workstation (version 5.93), and were sampled at a rate of 44.1 kHz with 16 bits per sample. The recordings were analyzed and manipulated using the open-source speech analysis software Praat (version 6.0.33; Boersma & Weenink, 2014). In order to eliminate possible clicks, 10 ms ramps were added to both ends of the stimuli; the total duration of the stimuli was 800 ms. Considering their importance in conveying the linguistic and affective functions of prosody (Banse & Scherer, 1996), intensity and f0 were kept constant. Pseudowords were created out of the word stimuli by a cross-splicing technique. The initial and final segments, /f/ and /n/, in [ˈfɑ́ːsɛn] and [ˈfɑ̀ːsɛn] were extracted and replaced with their pseudoword equivalents, /v/ and /m/, which were identical to their word equivalents in manner of articulation (fricative and nasal, respectively). To preserve the natural flow of the waveform, the critical segments were extracted and spliced at zero-crossings, and, when necessary, pulses were added or deleted incrementally to eliminate traces of co-articulation.
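The onset/offset ramping and the zero-crossing constraint for splicing can be sketched in a few lines of NumPy. This is an illustrative simplification only (the actual manipulations were done in Praat); the function names and the linear ramp shape are assumptions of this sketch.

```python
import numpy as np

def apply_ramps(x, sfreq=44100, ramp_ms=10):
    """Apply linear onset/offset ramps to a waveform to suppress clicks."""
    n = int(round(ramp_ms / 1000 * sfreq))
    y = x.astype(float).copy()
    ramp = np.linspace(0.0, 1.0, n)
    y[:n] *= ramp           # fade in over the first ramp_ms
    y[-n:] *= ramp[::-1]    # fade out over the last ramp_ms
    return y

def next_zero_crossing(x, idx):
    """Return the first sample index >= idx where the signal changes sign,
    so that segments spliced there join without a discontinuity."""
    sign = np.signbit(x[idx:])
    change = np.nonzero(sign[1:] != sign[:-1])[0]
    return idx + int(change[0]) + 1 if change.size else idx
```

Splicing at the index returned by `next_zero_crossing` keeps the waveform continuous across the junction, which is the point of the zero-crossing constraint described above.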
To ensure that the sound files were free from unnatural signals after the manipulations and that the intended emotion was expressed successfully, a stimulus validation was performed on the online research platform FindingFive (version 1.0). Five native speakers of Swedish with a linguistics background (two males, three females; age range 35-60 years, M = 49, SD = 8.5) reported on the meaning and the emotion conveyed by prosody across all the stimuli. In contrast to previous emotion recognition studies, which used forced choice and were therefore criticized for inflating recognition rates (Barrett, 2006; Russell, 1994), the listeners were left free in their emotion judgments and labeling. The mean accuracy for both linguistic prosody (low tone vs. high tone) and emotion (negative vs. neutral) identification was 90%.

| Experimental paradigm
The stimuli were presented in a passive auditory oddball paradigm, where infrequent deviant stimuli (p = 20%) were interspersed among frequent standard stimuli (p = 80%). The present study had two blocks, a word block and a pseudoword block, each block consisting of 1 standard (N = 1,440) and 3 deviants (N = 360, 120 for each deviant). The paradigm is illustrated in Figure 1. The affectively neutral stimuli with low tone (i.e., phase, neutral in emotional-semantic valence, pronounced with neutral prosody) and its pseudoword equivalent always served as Standards; the affectively neutral stimuli with high tone (i.e., damn, emotionally valenced, pronounced with neutral prosody) and its pseudoword equivalent, as Deviant I; the affectively valenced stimuli with low tone (i.e., phase, neutral in emotional-semantic valence, pronounced with angry prosody) and its pseudoword equivalent, as Deviant II; and the affectively valenced stimuli with high tone (i.e., damn, emotionally valenced, pronounced with angry prosody) and its pseudoword equivalent, as Deviant III. This design enabled us to investigate the neural responses to deviations in linguistic prosody (Deviant I) and affective prosody (Deviant II) alone as well as in linguistic and affective prosody combined (Deviant III). The deviants within each block randomly replaced the standards, and there were at least two intervening standards between two consecutive deviants. The stimulus onset asynchrony (SOA) was set at 1,200 ms (interstimulus interval, ISI: 300 ms).
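The constraint of at least two intervening standards between consecutive deviants can be implemented by distributing the surplus standards over the gaps around the deviants. A minimal sketch for one sub-block (480 standards, 120 deviants); the function name and the gap-distribution scheme are assumptions of this illustration, not the study's actual presentation script:

```python
import random

def oddball_sequence(n_standards=480, n_deviants=120, min_gap=2, seed=0):
    """Pseudorandom oddball sequence ('S' = standard, 'D' = deviant) with at
    least `min_gap` standards between consecutive deviants (and before the
    first deviant)."""
    rng = random.Random(seed)
    extra = n_standards - n_deviants * min_gap
    if extra < 0:
        raise ValueError("not enough standards to satisfy the gap constraint")
    # Randomly assign the surplus standards to the gaps around the deviants.
    counts = [0] * (n_deviants + 1)
    for _ in range(extra):
        counts[rng.randrange(n_deviants + 1)] += 1
    seq = []
    for i in range(n_deviants):
        seq += ['S'] * (min_gap + counts[i]) + ['D']
    seq += ['S'] * counts[-1]
    return seq
```

Because every deviant is preceded by `min_gap` obligatory standards plus a random share of the surplus, the 20% deviant probability and the minimum-spacing rule are both satisfied by construction.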

| Procedure
The EEG experiment was run using E-Prime (version 2.0) in an electrically insulated and sound-attenuated recording booth. The stimuli were delivered via loudspeakers at a comfortable listening level of 60-65 dB SPL at source. The participants were instructed to ignore the auditory stimuli, and instead watch a silent documentary (without subtitles), covering approximately a quarter of the screen. The experiment had two blocks, the word block and the pseudoword block, each having 3 sub-blocks for each deviant (see Figure 1). The order of the blocks and the sub-blocks was counterbalanced across participants, and the participants were given a chance to take a break between the blocks. The whole experiment, including the electrode application, lasted about 2-2.5 hr.

| EEG recordings
The electroencephalography (EEG) data were collected with the BioSemi ActiveTwo system and ActiView acquisition software (BioSemi, Netherlands) at a sampling rate of 2,048 Hz. Recordings were made from 16 cap-mounted active electrodes (Fp1, Fp2, F3, Fz, F4, T7, C3, Cz, C4, T8, P3, Pz, P4, O1, Oz, O2), which are equipped with pre-amplifiers that provide an impedance transformation directly on the electrode. The electrode positioning was in accordance with the International 10-20 system. Two additional electrodes, a common mode sense (CMS) active electrode and a driven right leg (DRL) passive electrode, were used instead of a traditional ground electrode. In addition, seven external electrodes were used: four for electrooculogram (EOG) recordings to monitor horizontal and vertical eye movements, two for mastoid recordings, and one for a nose recording, which was used for offline referencing.

| EEG data analysis
The offline EEG data analysis was performed in Matlab (version 9.4; The MathWorks Inc., Natick, Massachusetts, USA) using the EEGLAB toolbox (Delorme & Makeig, 2004). The continuous EEG data were first resampled to 256 Hz and then low-pass filtered at 30 Hz and high-pass filtered at 0.5 Hz. The signals were then referenced to the nose channel. To identify and remove eye artifacts, an independent component analysis (ICA), a computational method that separates data into statistically independent components (Jung et al., 2000), was carried out. The EEG data were then segmented into epochs from −100 to 900 ms, time-locked to the word onset. A time window of 100 ms before the onset was used for the baseline correction. Any epochs containing EEG fluctuations exceeding ±100 μV were automatically removed. The grand average ERPs were computed for each stimulus, and deviant-minus-standard subtractions were calculated for each deviant. One participant was excluded from the data analysis due to noisy data. The electrodes were grouped into three regions of interest (ROI): Frontal (F3, Fz, and F4), Central (C3, Cz, and C4), and Parietal (P3, Pz, and P4).

FIGURE 1 Schematic representation of the passive auditory oddball paradigm. Both word and pseudoword blocks had three sub-blocks. Stimuli that are neutral in both affect and emotional semantics always served as Standards (S). Each sub-block presented the associated deviant and had 480 standards (p = 8/10) and 120 deviants (p = 2/10). Stimuli that are emotionally loaded but affectively neutral served as Deviant I (DI); stimuli that are emotionally neutral but affectively valenced served as Deviant II (DII); and stimuli that are both emotionally loaded and affectively valenced served as Deviant III (DIII). Stimulus onset asynchrony (SOA): 1,200 ms; interstimulus interval (ISI): 300 ms.
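The epoching, baseline correction, and amplitude-based artifact rejection described in this section can be sketched in NumPy. This is an illustrative simplification under stated assumptions (continuous data as a channels-by-samples array in microvolts, events as onset sample indices); the actual pipeline used EEGLAB in Matlab, and the function name is hypothetical.

```python
import numpy as np

def epoch_and_reject(data, events, sfreq=256, tmin=-0.1, tmax=0.9,
                     thresh_uv=100.0):
    """Segment continuous EEG (channels x samples, microvolts) into epochs
    time-locked to `events` (onset sample indices), subtract the pre-stimulus
    baseline, and drop any epoch exceeding +/- thresh_uv."""
    n_pre = int(round(-tmin * sfreq))          # baseline samples before onset
    n_len = int(round((tmax - tmin) * sfreq))  # total samples per epoch
    kept = []
    for ev in events:
        start = ev - n_pre
        if start < 0 or start + n_len > data.shape[1]:
            continue                           # epoch would run off the record
        seg = data[:, start:start + n_len].astype(float)
        # Baseline correction: subtract the mean of the pre-stimulus interval.
        seg = seg - seg[:, :n_pre].mean(axis=1, keepdims=True)
        if np.abs(seg).max() <= thresh_uv:     # automatic artifact rejection
            kept.append(seg)
    return np.array(kept)
```

Averaging the surviving epochs per stimulus type, then subtracting the standard average from each deviant average, yields the deviant-minus-standard difference waveforms analyzed below.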
The time windows for ERP quantification were defined based on the grand average peaks, and the component identification was based on previous ERP literature. Amplitudes were then computed as the mean voltage within a 50-ms window centered at the peak latency in the grand average waveforms.
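The peak-centered mean-amplitude measure can be sketched as follows (an assumed illustration with hypothetical names; `polarity=-1` targets a negative peak such as the MMN, `polarity=+1` a positive one such as the P3a or LPC):

```python
import numpy as np

def mean_amplitude(erp, grand_avg, sfreq=256, tmin=-0.1,
                   window=(0.3, 0.35), win_ms=50, polarity=-1):
    """Mean voltage of `erp` in a win_ms window centered at the grand-average
    peak latency found within `window` (seconds relative to stimulus onset)."""
    lo = int(round((window[0] - tmin) * sfreq))
    hi = int(round((window[1] - tmin) * sfreq))
    # Peak latency from the grand average, searched only inside the window.
    peak = lo + int(np.argmax(polarity * grand_avg[lo:hi]))
    half = int(round(win_ms / 1000 * sfreq / 2))
    return float(erp[peak - half:peak + half + 1].mean())
```

Defining the window on the grand average and measuring each condition's waveform inside it is what keeps the amplitude measure unbiased across deviants.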

| Statistical analysis
The statistical analysis was performed in SPSS (version 24; International Business Machines Corp., Armonk, New York, USA). A three-way repeated-measures ANOVA with factors of ROI (three levels: Frontal, Central, and Parietal), Block (two levels: Word and Pseudoword), and Deviant (three levels: Deviant I, linguistic prosody; Deviant II, affective prosody; and Deviant III, linguistic-affective prosody) was performed in each time window (50-100 ms, 150-200 ms, 300-350 ms, 350-400 ms, and 820-870 ms). In the case of significant interactions, follow-up ANOVAs and post-hoc pairwise comparisons with Bonferroni corrections were performed. If significant interactions involved ROI, additional pairwise comparisons were carried out to investigate the lateralization effect in the relevant ROI; for this purpose, the electrodes were grouped into three hemispheric regions: left, mid, and right. Mean values are reported with standard deviations. Greenhouse-Geisser corrected p-values are given in case of sphericity violations. Effect sizes are reported as partial η².
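For orientation, the partial η² values reported alongside each F statistic follow directly from F and its degrees of freedom; a quick check (illustrative only, not part of the SPSS analysis):

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic:
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)
```

For example, the main effect of ROI in the 50-100 ms window reported in the Results, F(2, 28) = 8.424, yields partial η² ≈ 0.376, matching the reported effect size.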

| RESULTS
The grand average difference waveforms (i.e., deviant-minus-standard subtractions) recorded from Fz and the topographic difference maps are displayed for all three deviants in the word block (Figure 2; Deviant I, linguistic prosody 'neutral damn'; Deviant II, affective prosody 'angry phase'; Deviant III, linguistic-affective prosody 'angry damn').
The results of the three-way repeated-measures ANOVA, the follow-up ANOVAs, and the descriptive information are presented in Tables 1-4. Mean ERP amplitudes and 95% confidence intervals are displayed in Figure 3. Deviations in both linguistic and affective prosody elicited neural responses as early as about 50 ms after change onset (i.e., word onset) in both the word and pseudoword blocks. The results of the ANOVA in this early latency, that is, the time window 50-100 ms (Table 1), demonstrated a main effect of ROI (F(2, 28) = 8.424, p = .001, η² = 0.376), but there were no statistically significant main effects of any other factor and no statistically significant interactions between them. The time window 150-200 ms displayed a statistically significant interaction limited to the two-way interaction of ROI and Deviant (F(4, 56) = 3.750, p = .037, η² = 0.211), as was the case for the time window 350-400 ms (F(4, 56) = 15.779, p < .001, η² = 0.530). A significant three-way interaction of ROI with Block and Deviant occurred only in the time windows 300-350 ms (F(4, 56) = 4.188, p = .024, η² = 0.230) and 820-870 ms (F(4, 56) = 5.121, p = .007, η² = 0.268). These initial ANOVAs indicate that there are significant differences between words and pseudowords with regard to the deviants only in the time windows 300-350 and 820-870 ms (Figure 3), whereas this difference is absent in the time windows 150-200 and 350-400 ms. These time windows reveal the integrative EEG response, unfolding with the different aspects of prosody, that is, the affective, linguistic, and affective-linguistic conditions. Further analyses of the significant interactions in these time windows are presented in the following sub-sections.

| Time window 150-200 ms
There is a clear negative deflection to all the deviants in the time window 150-200 ms. This negative voltage deflection is consistent with an MMN response, reflecting the acoustic deviation. The follow-up analyses that were carried out to investigate the two-way interaction of ROI and Deviant did not, however, yield any significant differences between the deviants in any of the ROIs (Table 3).
This finding is consistent with an MMN response and with previous research indicating that lexically relevant prosody generates a stronger MMN response in words than in pseudowords (Zora et al., 2015, 2016).

| Time window 350-400 ms
After 350 ms, there is no statistically significant difference in mean ERP amplitude between the word and pseudoword blocks. However, both the Affective prosody and the Linguistic-Affective prosody deviants elicit positive responses instead. This later ERP response clearly reflects the affective valence of the stimuli (i.e., Affective prosody and Linguistic-Affective prosody; Figures 2 and 3, Dev II and Dev III, respectively), since they are clearly distinguished from the neutral stimuli (i.e., Linguistic prosody; Figures 2 and 3, Dev I). We believe that the listeners attend to the affectively valenced prosody irrespective of the lexical-semantic information in this time window, which explains the non-significant difference between words and pseudowords both in the two-way (F(2, 28) = 1.818, p = .193, η² = 0.115) and in the three-way interactions (F(4, 56) = 0.361, p = .707, η² = 0.025; Table 1). The follow-up analysis of the two-way interaction of ROI and Deviant, on the other hand, indicates that there is a significant difference between the deviants in the frontal ROI (F(2, 28) = 8.121, p = .002, η² = 0.367). The pairwise comparisons indicate that the difference between Linguistic prosody (M = −0.227 μV) and Linguistic-Affective prosody (M = 1.413 μV) is significant (p = .009), as is the difference between Linguistic prosody and Affective prosody (M = 1.153 μV; p = .040; Figure 3; Table 3). The lateralization analysis did not show any significant results in the left (F(4, 56) = 3.166, p = .60, η² = 0.182), mid (F(4, 56) = 2.571, p = .094, η² = 0.155), or right regions (F(2, 28) = 2.917, p = .071, η² = 0.172). The results show a significantly larger positive response to the affectively valenced stimuli (Affective prosody and Linguistic-Affective prosody) than to the neutral stimuli (Linguistic prosody). We argue that this corresponds to a P3a response.


| DISCUSSION
The main result of this study shows that when affective prosody is combined with linguistic prosody of the same valence, it generates a distinct frontal response beyond the classical MMN and P3a waves. This study provides evidence that, besides distinguishing them on the basis of acoustics and semantics, the brain temporally integrates and conceptualizes these two functions of prosody based on language and experience. Moreover, this novel frontal ERP response suggests that core affect and linguistic knowledge facilitate each other in the production of cognitive processes, and is in line with the constructionist view (Lindquist et al., 2012) that cognition builds on large-scale co-operative brain network operations (Bullmore & Sporns, 2009; Spreng, Mar, & Kim, 2009; see also Murphy, Nimmo-Smith, & Lawrence, 2003).

| Linguistic prosody generates an MMN response
Linguistic prosody has previously been shown to generate a stronger MMN response in words than in pseudowords, revealing the salience and relevance of prosodic information in semantic processing (Zora et al., 2015, 2016). Zora et al. (2016), for instance, showed that segmentally identical Turkish words, ˈbebek 'a place name' versus beˈbek 'baby', are distinguished on the sole basis of linguistic prosody (stress pattern), and argued that linguistic prosody activates memory traces associated with words and accelerates semantic processing pre-attentively. This representation of prosody in the mental lexicon is explained by associative learning and long-term memory representations, in line with previous studies that indicated an enhanced MMN response to words compared to pseudowords, which lack such representations (Alexandrov, Boricheva, Pulvermüller, & Shtyrov, 2011; Pulvermüller et al., 2001; Pulvermüller, Shtyrov, & Hauk, 2009; Shtyrov & Pulvermüller, 2002).
In the present study, a larger negative response was found to tone pattern changes at 300-350 ms. Considering the frontally maximal negativity, in line with the typical topographical distribution of the MMN response (Näätänen et al., 2007), and the larger amplitude in words than in pseudowords, this response is considered to be an MMN response, indexing the existence of long-term memory traces associated with words and their prosodic features (Zora et al., 2015, 2016). Thus, when co-activated, segments and tone patterns develop into functional long-term memory networks that guarantee rapid semantic processing independent of attentional processes, supporting the previous literature on the representation of linguistic prosody in the lexicon.

| Affective prosody generates a P3a response
Affectively modulated stimuli elicited a frontally maximal P3a response at 350-400 ms irrespective of lexical-semantic information. This is in line with previous research indicating the sensitivity of the P3a response to prosodic information and to emotional and biological arousal (Carminati et al., 2018; Foti et al., 2009; Hajcak et al., 2010; Olofsson, Nordin, Sequeira, & Polich, 2008; Pakarinen et al., 2014; Polich & Kok, 1995; Wambacq & Jerger, 2004; Wang et al., 2005), and reflecting the motivational salience of affective prosody (Wambacq & Jerger, 2004; Wang et al., 2005; see also Bradley et al., 2001; Bradley et al., 2003; Schupp et al., 2004). We speculate that the deviance from neutral to affective prosody increases the intrinsic vigilance of the participants. The difference between words and pseudowords probably disappears because the neurocognitive system is more sensitive to affect than to semantics (Pell et al., 2015), which is also in line with previous research demonstrating a larger P3a for prosodic information compared to semantic information (Wambacq & Jerger, 2004). Irrespective of lexical-semantic information, due to the intrinsic motivational salience of affective prosody, this ERP signature provides empirical evidence for the psychological basis of core affect (Barrett, 2006; Barrett & Bliss-Moreau, 2009; Duncan & Barrett, 2007; Russell, 2003). Given that the constructionist model builds on basic psychological concepts (see psychological primitives, Lindquist et al., 2012), such as core affect, which reflects the response of visceral control systems to motivational salience, the P3a can be considered an index of motivational relevance.

TABLE 2 Interactions of Block (two levels: word block and pseudoword block) and Deviant (three levels: Deviant I, linguistic prosody; Deviant II, affective prosody; Deviant III, linguistic-affective prosody) in the time windows 300-350 and 820-870 ms in each region of interest (ROI: frontal, central, and parietal).

| Concurrent linguistic and affective prosody generate an LPC response
A frontally maximal positive response was elicited by the concurrent linguistic and affective prosody in the time window 820-870 ms, which is argued to be an LPC response, reflecting the valence match between emotional semantics and affective prosody. According to the constructionist approach to emotions, the attribution of emotions is the result of an automatic conceptual analysis of core affect based on stored representations of prior experiences, that is, memory and knowledge (Barrett, Lindquist, & Gendron, 2007; Lindquist et al., 2012). Accordingly, it is speculated that core affect, which was conveyed by the affective prosody and gave rise to the P3a response at 350-400 ms, underwent a conceptual analysis later in time based on linguistic prosody and became psychologically meaningful, manifesting itself as an LPC response. That is, with the help of linguistic prosody, the brain was able to conceptualize and interpret the core affect conveyed by affective prosody. This is in line with the positive slow wave response, which is associated with conceptual processes such as linguistic comparison, semantic judgment, and memory retrieval (Neville, Kutas, Chesney, & Schmidt, 1986; Ruchkin, Johnson, Mahaffey, & Sutton, 1988).

| Limitations and future directions
To understand the processing of different prosodic functions, it is essential to estimate the timing and coordination of ongoing brain activity in relation to defined psycho-physical stimuli, which are expected to activate different aspects of the entire brain network. The fact that a distinct LPC response arose with the supra-linear integration of two forms of prosody may have relevance for computational studies that have begun to determine the effects of functional activity on structural brain topology, suggesting that large populations of neurons across the neocortex are involved (Ringach, 2009). Since the constructionist approach builds on basic network operations that are functionally specific neither to language nor to emotions, but common across various perceptual and cognitive domains such as executive control, memory, and language, it will be necessary to combine physiological measurements of cortical activity on different scales, that is, from single neurons and cortical microcircuits in animals to the whole-brain response measured by EEG and functional magnetic resonance imaging (fMRI) in humans, in order to comprehend the physics of the brain's connectome. Further research is planned comparing the effects of passive (Opitz, Rinne, Mecklinger, Yves von Cramon, & Kruggel, 1999; Opitz, Rinne, Mecklinger, Yves von Cramon, & Schröger, 2002) and active (Buchanan et al., 2000) fMRI paradigms, in combination with EEG, in order to better understand the interplay between and relative contribution of bottom-up and top-down brain network activity in the processing of prosody.
The fact that the oddball paradigm used in the present study builds on one single minimal word pair limits generalization. In order to test whether the effects truly represent a general linguistic principle (for a discussion, see Baayen, Davidson, & Bates, 2008), replication is necessary using different individual language items. Future work is therefore planned employing further homonyms with different types of prosodic-semantic and affective valence, such as joy or sadness, while keeping physical and psycholinguistic parameters under tight control. Prosody is strongly related to pitch perception (Oxenham, 2012). People with hearing loss and with cochlear implants often experience difficulties with pitch perception (McDermott, 2004; Oxenham, 2008), which affects the processing of emotional acoustic cues and their contribution to cognitive and emotional processing (Zinchenko et al., 2018). Fewer socio-emotional acoustic cues may lead to an impoverished environment or social isolation, which in turn may contribute to delayed cognitive development or hastened decline in later life (Arlinger, 2003; Lin et al., 2013; Livingston et al., 2017; Pichora-Fuller et al., 2016; Rönnberg, Rudner, & Lunner, 2011; Rudner, Seeto, Keidser, Johnson, & Rönnberg, 2019). Thus, it will be important, from a clinical point of view, to clarify the contributions and consequences of reduced influence of prosodic cues in both young and aging or hearing-impaired individuals.

TABLE 4 Results of follow-up one-way repeated-measures ANOVAs for the two-way interaction of Block (two levels: word block and pseudoword block) and Deviant (three levels: Deviant I, linguistic prosody; Deviant II, affective prosody; Deviant III, linguistic-affective prosody) in the frontal region of interest (ROI) in the time windows 300-350 and 820-870 ms. Effect sizes are reported as partial η². Mean values (M) are reported with standard deviations (SD).