Investigating implicit emotion processing in autism spectrum disorder across age groups: A cross‐modal emotional priming study

Cumulating evidence suggests that atypical emotion processing in autism may generalize across different stimulus domains. However, this evidence comes from studies examining explicit emotion recognition. It remains unclear whether domain‐general atypicality also applies to implicit emotion processing in autism and its implication for real‐world social communication. To investigate this, we employed a novel cross‐modal emotional priming task to assess implicit emotion processing of spoken/sung words (primes) through their influence on subsequent emotional judgment of faces/face‐like objects (targets). We assessed whether implicit emotional priming differed between 38 autistic and 38 neurotypical individuals across age groups as a function of prime and target type. Results indicated no overall group differences across age groups, prime types, and target types. However, differential, domain‐specific developmental patterns emerged for the autism and neurotypical groups. For neurotypical individuals, speech but not song primed the emotional judgment of faces across ages. This speech‐orienting tendency was not observed across ages in the autism group, as priming of speech on faces was not seen in autistic adults. These results outline the importance of the delicate weighting between speech‐ versus song‐orientation in implicit emotion processing throughout development, providing more nuanced insights into the emotion processing profile of autistic individuals.


Lay summary
Research is needed to know whether there are differences in how autistic and nonautistic individuals of different ages process emotions unconsciously. Our study shows that hearing emotionally spoken words unconsciously influenced how nonautistic people understood facial expressions across all age groups, while only nonautistic children were influenced by emotionally sung words. In contrast, only autistic children and adolescents, but not autistic adults, were influenced by emotionally spoken words when interpreting facial expressions. Autistic individuals of all age groups were influenced by emotionally sung words when interpreting faces. These results suggest that autistic people are less influenced by spoken information during unconscious emotion processing, which can affect real-world social communication, as emotional cues in speech can be used to support judgment of facial expressions.

INTRODUCTION
Social communication difficulties are a hallmark symptom of autism spectrum disorder (American Psychiatric Association, 2013), a core part of which is the lack of sophisticated understanding of nonverbal communicative functions, such as the perception of emotional cues (Trevisan & Birmingham, 2016). Despite the extensive research efforts devoted to understanding emotion processing in autism, the scientific literature remains equivocal about the nature of how, and the extent to which, emotion processing in autism differs from that in neurotypical development. One important question to be answered relates to whether emotion processing difficulties in autism generalize across domains or are specific to certain domain(s) of the visual and auditory modalities (Leung et al., 2022). Addressing this question would help to elucidate whether differences in emotion processing between autistic and neurotypical individuals stem from general mechanisms that span various domains (i.e., domain-general mechanisms) or from factors specific to processes within particular domains (i.e., domain-specific mechanisms) (Connolly et al., 2020; Lewis et al., 2016; Peelen et al., 2010).
Research has shown that implicit emotion processing, defined as an unintentional, uncontrolled, unconscious, efficient, and fast process (Birnboim, 2003; Schneider & Chein, 2003; Schneider & Shiffrin, 1977; see De Houwer & Moors, 2012 for a detailed discussion), is distinct from explicit emotion processing. This differentiation is supported by evidence that suggests a dissociation between the two processes, such that impairment at the explicit level does not necessarily imply impairment at the implicit level (e.g., Roux et al., 2010; Wagenbreth et al., 2016; Wieser et al., 2006). Thus, while several studies indicate that differences in explicit emotion processing between autistic and neurotypical individuals are likely driven by domain-general mechanisms, with comparatively poorer accuracy and slower response speed in the autism group generalized across domains (e.g., Leung et al., 2023; Philip et al., 2010), inferring implicit emotion processing in autism from these findings is challenging due to the dissociation between the two processing levels. Crucially, our understanding of implicit emotion processing in autism remains limited due to the lack of research adopting a multi-domain approach.
Previous research using neuroimaging methods has detected differences in implicit emotion processing between autistic and neurotypical individuals. Functional magnetic resonance imaging (fMRI) studies have consistently reported reduced levels of activation in brain regions in autistic individuals compared to neurotypical individuals during implicit emotion processing of facial and body expressions irrelevant to the central tasks (Ciaramidaro et al., 2018; Critchley et al., 2000; Kana et al., 2016). Similarly, electroencephalogram (EEG) studies have reported diminished neural responses to emotionally spoken syllables during passive listening tasks in autistic individuals compared to neurotypical individuals (Fan & Cheng, 2014; Lindström et al., 2018). These findings suggest that autism may be associated with atypical implicit emotion processing, and more specifically, with implicit appraisal of emotions and implicit discrimination between emotional expressions at the neural level.
The implicit processing of emotional information can be examined not only neurally, but also behaviorally. The emotional information induced by a preceding stimulus from one modality can implicitly modulate emotional judgment of a stimulus in another modality through emotional priming (Carroll & Young, 2005; Murphy & Zajonc, 1993). It has been proposed that the spreading of activation underlies this phenomenon. That is, the preceding prime stimulus is thought to activate emotionally congruent representations by spreading activation throughout the conceptual network. The preactivated representations, thereby, facilitate the encoding of congruent targets (Collins & Loftus, 1975; De Houwer & Randell, 2004; Hermans et al., 1994). Supporting this proposal, research using a cross-modal emotional priming paradigm has shown that emotional judgment of a target stimulus is faster and more accurate when the target is preceded by a prime of a congruent emotion (e.g., angry-angry) than when the prime and target are of different emotions (e.g., angry-sad) (Carroll & Young, 2005). Importantly, the manipulation of the time interval between the prime and target onsets, known as the stimulus onset asynchrony (SOA), is essential for capturing the early, automatic processing of the prime stimuli in such paradigms (e.g., <300 ms; Posner & Snyder, 2004).
The emotional priming paradigm has been less commonly used to study emotion processing in autism, and findings have been inconsistent. Kamio et al. (2006) found that, although explicit emotion recognition of faces was comparable between groups, subliminally presented emotional faces primed subsequent liking ratings of ideographs only in neurotypical individuals and not in individuals with pervasive developmental disorders. By contrast, Vanmarcke and Wagemans (2017) showed that the priming effects of both briefly presented coarse and fine emotional faces on subsequent valence judgment did not differ between autistic and neurotypical individuals. Considering the paucity and inconsistency of these findings, more research is needed to further scrutinize implicit emotional priming in autism.
Priming effects have also been demonstrated with auditory cues, including emotional prosody (Pell et al., 2011; Scherer & Larsen, 2011; Schwartz & Pell, 2012) and musical chords (Zhou et al., 2019) in neurotypical individuals. Yet, to our knowledge, no previous research has investigated implicit emotional priming of auditory cues between autistic and neurotypical individuals, and importantly, whether any group differences generalize across auditory domains. Evidence from the literature focusing on cross-modal influences and multisensory integration may provide some insights into this. Despite studies varying substantially in their designs, findings seem to converge on suggesting reduced cross-modal modulation by emotional vocal expressions (Charbonneau et al., 2013; O'Connor, 2007; Xavier et al., 2015), with typical cross-modal modulation by music seen in autism (Brown, 2017; Wagener et al., 2020). However, it is unclear whether the discrepant patterns between the two domains are reflective of implicit, automatic processing. Specifically, attention may be directed to different channels concurrently using synchronous designs (Charbonneau et al., 2013; O'Connor, 2007; Wagener et al., 2020; Xavier et al., 2015), as well as asynchronous designs when SOA exceeds the critical timeframe for capturing automatic priming (i.e., ≥1500 ms) (Brown, 2017). The direct comparison between the speech and music domains within a cross-modal prime-target paradigm using a short SOA will contribute to our understanding of the domain-specificity (or otherwise) of implicit emotional priming in autism compared to neurotypical development.
In the present study, we implemented a cross-modal emotional priming paradigm, where an auditory prime (speech prosody or song) expressing either a congruent or incongruent emotion to a visual target (human face or face-like object) was presented to the participants with a short SOA of 200 ms on each trial, an appropriate SOA postulated to reflect implicit, automatic emotion processing (Hermans et al., 1994, 2001; Herring et al., 2013). Specifically, according to the cognitive and functional decompositional models of implicit processing, this paradigm enabled us to investigate the unintentional, uncontrolled, efficient, and fast features of automaticity (De Houwer & Moors, 2012). That is, the emotional priming process would unfold within a very short timeframe, given the little time elapsed between the presence of the prime and participants' response to the target. Additionally, the process of emotional priming would be task-independent, as participants were instructed not to pay attention to the prime.
Through this paradigm, this study aimed to address several important gaps in the literature. First, we assessed whether emotional speech prosody and song primed subsequent emotional judgment differently in autistic and neurotypical groups, examining whether differences (if any) were specific to one domain or generalized across domains within the auditory modality.
Secondly, gaining insight into emotion processing in autism requires not only describing differences between autistic and neurotypical individuals, but also tracking and understanding how differences emerge over development. Consistent evidence indicates that group differences in explicit emotion processing of human faces between autistic and neurotypical individuals are notably more pronounced among adults, whereas these differences are less prominent among younger groups (e.g., Lozier et al., 2014; O'Hearn & Lynn, 2023; Rump et al., 2009; but also see Leung et al., 2023). This pattern may be attributed to a plateau in the development of this skill beyond late childhood in autism, in contrast to the continuous maturation of such skills into adulthood in neurotypical development (Rump et al., 2009). However, the development of implicit emotion processing in autism has not been well-explored. To tackle this, we also investigated whether implicit emotional priming differed between the autistic and neurotypical groups as a function of age.
Thirdly, emotional priming effects for prime-target pairs that occur more frequently in the environment (e.g., vocal bursts-faces) have been shown to be stronger compared to those that occur less frequently (e.g., vocal bursts-printed words) in neurotypical individuals (Carroll & Young, 2005). However, this effect has not been previously studied in autistic individuals. The use of human faces and face-like objects as target stimuli facilitated an investigation into the potential influence of prime-target co-occurrences on emotional priming. Specifically, we explored whether human faces, arguably accompanied more frequently by the human voice than everyday objects, would play a privileged role in cross-modal emotional priming to similar extents in autistic relative to neurotypical individuals. Altogether, this study aimed to shed light on implicit emotional priming in autistic and neurotypical individuals, while considering the effects of prime type, target type, and age group.

Participants
Seventy-six native British English speakers residing in the UK, with 38 autistic individuals and 38 neurotypical controls, were recruited via mailing lists, local experimental participant databases, local advertisements, and social media. Following the age group classification in the Autism-Spectrum Quotient (AQ; Auyeung et al., 2008; Baron-Cohen et al., 2001, 2006), the autism group and the neurotypical group each consisted of 14 children (7-11 years), 11 adolescents (12-15 years), and 13 adults (16-56 years).¹ The two groups were matched on chronological age, gender, receptive vocabulary (Receptive One-Word Picture Vocabulary Test-Fourth Edition [ROWPVT-4]; Martin & Brownell, 2011), and nonverbal reasoning ability (Raven's Standard Progressive Matrices [RSPM]; Raven, 1983), while the autism group had significantly higher overall autistic traits than the neurotypical group (Table 1). All participants reported normal hearing and normal or corrected-to-normal vision, and all participants in the autism group had a clinical diagnosis of autism spectrum disorder from UK professionals independent of this study (see Hayes et al., 2018 for a review on diagnostic procedures in the UK). The study protocol was granted ethical approval by the University of Reading Research Ethics Committee, and written informed assent/consent was obtained from participants and carers prior to the experiments.

¹ A different age group classification has also been adopted in previous related research on emotion processing in autism (e.g., Rump et al., 2009). To check the robustness of our results, we reconducted all analyses using this classification, with the autism group and the neurotypical group each consisting of 17 children (≤12 years), 9 adolescents (13-17 years), and 12 adults (≥18 years). Although we found some subtle differences, the significant effects and interactions observed in our original models remained unchanged (see Supplementary Tables S3-S5).

Stimuli
The current stimulus set was developed from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS; Livingstone & Russo, 2018) and validated for emotional content reliability by an independent group of judges as part of a previous study (see Leung et al., 2023 for full details). This set included a total of 16 spoken and 16 sung words (i.e., "door") to be used as auditory primes, and 64 facial and 64 face-like object images to be used as visual targets in this study. The stimuli for each emotion (angry, scared, happy, and sad) were evenly distributed within each domain. The use of static images stems from the necessity to closely align human facial stimuli with face-like object stimuli as visual targets. Notably, well-validated dynamic emotion databases for face-like objects are currently lacking for adoption in our research. Additionally, the cross-modal priming paradigm inherently involves a temporal discrepancy in the onset of prime and target stimuli. Incorporating dynamic stimuli would introduce a mismatch between auditory and visual presentations, potentially interfering with the process of emotional priming and recognition. The use of static images, therefore, not only addresses these practical constraints but also ensures applicability for the methodological demands of our experimental design.

Procedure
The cross-modal emotional priming task, conducted using E-prime 2.0 (Schneider & Zuccoloto, 2007), assessed participants' emotion recognition of the visual targets after hearing an emotionally congruent or incongruent auditory prime. There were 256 trials in total, with two blocks of 64 trials presenting a human face target and two blocks of 64 trials presenting a face-like object target. The order of the human face and face-like object target conditions was counterbalanced between participants and each target condition was preceded by four practice trials. Each auditory prime was paired with a visual target from each of the four emotional categories to create congruent (e.g., angry-angry) and incongruent (e.g., angry-scared, angry-happy, angry-sad) prime-target pairs. This resulted in 16 congruent and 48 incongruent trials, which were pseudorandomized within each prime-target condition (speech prosody-face, song-face, speech prosody-object, song-object). Two versions of pseudorandomization were adopted and counterbalanced between participants.² On each trial, an auditory prime (speech prosody or song) expressing a congruent or incongruent emotion was presented 200 ms prior to the onset of the visual target (human face or face-like object), that is, with an SOA of 200 ms. Participants were instructed to decide, as quickly and accurately as possible, which emotion label best described the expression presented in the visual target. Responses were made on a Cedrus RB-740 response pad with colored key cap lenses indicating each of the corresponding emotion categories (red for angry, green for scared, yellow for happy, and blue for sad). The position of the corresponding keys was counterbalanced between participants but was held constant throughout the experiment for each participant. The visual and auditory presentation stopped as soon as participants responded, and the visual target remained on the screen until a response was made. A response was considered correct if the emotion selected by the participant corresponded to the intended emotion expressed in the visual targets. Accuracy and response time (RT) from the target onset were recorded for each trial. For an illustration of the task procedure, see Figure 1.

Table 1 note: ROWPVT-4 = Receptive One-Word Picture Vocabulary Test-Fourth Edition (Martin & Brownell, 2011); RSPM = Raven's Standard Progressive Matrices (Raven, 1983); AQ = Autism-Spectrum Quotient (Auyeung et al., 2008; Baron-Cohen et al., 2001, 2006); NS = nonsignificant differences between autism and neurotypical groups; *** = significant differences between autism and neurotypical groups at p < 0.001.
It should be noted that, as part of a wider test battery, the same participants also completed simple emotion recognition tasks across the four visual and auditory domains, as well as cognitive and pitch measures (see Leung et al., 2023). To minimize habituation and potential interference with the prime-target association, we ensured that the priming task was always administered prior to the recognition tasks. This sequencing aimed to prevent participants from forming fixed associations between the face and object images (visual targets) and emotion labels during the priming task, which could hinder the observation of emotional priming effects.
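The prime-target pairing described in this section can be sketched in a few lines of Python. This is a minimal illustration and not the E-prime script used in the study: it assumes four primes per emotion (which follows from 16 primes being evenly distributed over four emotions), and the function name and prime identifiers are hypothetical.

```python
from itertools import product

EMOTIONS = ["angry", "scared", "happy", "sad"]

def build_block(prime_type, target_type, primes_per_emotion=4):
    """Pair every auditory prime with one target from each emotion category.

    Returns the trial list for one prime-target condition.
    Prime identifiers are hypothetical placeholders, not real file names.
    """
    trials = []
    for prime_emotion, k in product(EMOTIONS, range(primes_per_emotion)):
        for target_emotion in EMOTIONS:
            trials.append({
                "prime": f"{prime_type}_{prime_emotion}_{k + 1}",
                "prime_emotion": prime_emotion,
                "target_type": target_type,
                "target_emotion": target_emotion,
                "congruent": prime_emotion == target_emotion,
            })
    return trials

block = build_block("speech", "face")
n_congruent = sum(t["congruent"] for t in block)
print(len(block), n_congruent, len(block) - n_congruent)  # 64 16 48
```

Across the four prime-target conditions this gives 4 × 64 = 256 trials, consistent with the total reported in the Procedure. Pseudorandomization of trial order is not shown.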

Data analysis
All analyses were performed in RStudio (R Core Team, 2022). As all participants scored above the chance level of 0.25 for each prime-target condition, all accuracy data were retained for analyses. Participants' RT was based on correct responses only, with those less than 150 ms or more than 2.5 SD of the mean of each participant excluded. To quantify cross-modal emotional priming effects, a congruency difference score was calculated in terms of both accuracy (Mean accuracy for congruent prime-target pairs − Mean accuracy for incongruent prime-target pairs) and RT (Mean RT for incongruent prime-target pairs − Mean RT for congruent prime-target pairs). A larger congruency difference score indicated facilitated emotional judgment of the target for congruent primes (i.e., faster and more accurate compared to incongruent primes), and hence stronger cross-modal emotional priming effects.
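As an illustration of the two congruency difference scores, the following Python sketch computes them for one participant in one prime-target condition. It is not the authors' R code: the function name, the trial dictionary layout, and the toy data are hypothetical, and RT trimming is assumed to have been applied beforehand.

```python
from statistics import mean

def congruency_scores(trials):
    """Return (accuracy_diff, rt_diff) for one participant and condition.

    Each trial is a dict with keys 'congruent' (bool), 'correct' (bool),
    and 'rt' (ms). RT outliers are assumed to be removed already.
    """
    def acc(flag):
        return mean(1.0 if t["correct"] else 0.0
                    for t in trials if t["congruent"] == flag)

    def rt(flag):
        # Mean RT is based on correct responses only
        return mean(t["rt"] for t in trials
                    if t["congruent"] == flag and t["correct"])

    accuracy_diff = acc(True) - acc(False)  # congruent minus incongruent
    rt_diff = rt(False) - rt(True)          # incongruent minus congruent
    return accuracy_diff, rt_diff

# Toy data, invented purely for illustration
toy = [
    {"congruent": True, "correct": True, "rt": 900},
    {"congruent": True, "correct": True, "rt": 1100},
    {"congruent": False, "correct": True, "rt": 1300},
    {"congruent": False, "correct": False, "rt": 1500},
]
acc_diff, rt_diff = congruency_scores(toy)
print(acc_diff, rt_diff)
```

Both scores are signed so that positive values mean the congruent prime helped: higher accuracy and shorter RT relative to incongruent primes.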
To examine whether differences in priming effects between diagnostic groups and age groups were modulated by prime and target type, two separate linear mixed effects models were constructed using the lme4 package (Bates et al., 2015), with congruency difference scores for mean accuracy and mean RT as dependent measures, respectively. Each model included Diagnostic group (autism vs. neurotypical), Age group (child vs. adolescent vs. adult), Prime type (speech prosody vs. song), Target type (human face vs. face-like object), and all possible interactions as fixed effects. The maximal random effects structure was initially specified with by-subject intercept and by-subject slopes for prime type and target type for both models³; any arising non-convergence issues were addressed following Brown's (2021) recommendations.

FIGURE 1 The experimental procedure of the cross-modal emotional priming task. On each trial, a prime stimulus (speech prosody or song) is presented 200 ms prior to the presentation of the target stimulus (human face or face-like object) which remains on the screen until a response is made to identify the emotion (angry, scared, happy, or sad) presented in the image.
The statistical significance of the fixed effects was obtained using anova() from lmerTest (Kuznetsova et al., 2017). For effect sizes, partial eta-squared (ηp²) was computed for each fixed effect using eta_squared() from effectsize (Ben-Shachar et al., 2020), with ≥0.01, ≥0.09, and ≥0.25 interpreted as small, medium, and large effects, respectively (Cohen et al., 2013). Significant effects and interactions emerging from the models were followed up through post-hoc tests. Correction of the post-hoc tests for multiple comparisons was performed with the Benjamini-Hochberg (false discovery rate) procedure (Benjamini & Hochberg, 1995).
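For readers unfamiliar with the Benjamini-Hochberg procedure, the step-up adjustment can be sketched as follows. This is a generic Python illustration of the method, not the code used in the study (the analyses were run in R); the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented p-values for illustration
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Each raw p-value is scaled by m/rank, and the running minimum ensures that a smaller raw p-value never ends up with a larger adjusted value than a bigger one.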

RESULTS
Table 2 displays the results summary of the linear mixed effects analyses on both the congruency mean accuracy and RT difference scores.

Congruency mean accuracy difference
The analysis revealed a significant main effect of Prime type, with speech prosody (M (SD) = 0.12 (0.18)) inducing stronger cross-modal emotional priming effects than song (M (SD) = 0.09 (0.16)) overall. Importantly, the four-way interaction of Diagnostic group × Age group × Prime type × Target type was significant, while no other effects and interactions reached significance.
To guide the post-hoc analyses for unpacking the four-way interaction, boxplots were visually inspected (see Figure 2). Interesting trends were noted with regard to the age-related pattern of emotional priming characterizing each diagnostic group. As such, we first examined the presence of priming, through one-sample t-tests comparing the congruency difference scores to the test value of 0 (i.e., no priming) for each diagnostic by age group per prime-target condition.

³ Applying a more stringent criterion for determining matching of cognitive abilities between groups, differences in verbal and nonverbal ability observed between our autism and neurotypical groups may be considered "unclear if matched" (see Mervis & Klein-Tasman, 2004). Therefore, we conducted additional analyses controlling for potential effects of verbal and nonverbal ability in the mixed effects models. Results showed that the significant effects and interactions observed remained unchanged, confirming the robustness of our results (see Supplementary Table S6).
Regarding the presence of priming, one-sample t-tests revealed that for the speech prosody-face condition, the congruency difference score significantly differed from 0 in children and adolescents but not in adults within the autism group, while it significantly differed from 0 across all age groups within the neurotypical group. In contrast, for the song-face condition, the congruency difference score significantly differed from 0 across all age groups within the autism group, while it significantly differed from 0 only in children but not in adolescents and adults within the neurotypical group. For the speech prosody-object condition, the congruency difference score significantly differed from 0 only in autistic and neurotypical children, with no such differences observed in the adolescent or adult groups. For the song-object condition, the congruency difference score did not differ from 0 across all groups. For a full results summary, see Table 3.
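The logic of testing for the presence of priming against 0 can be sketched as below. This is an illustrative Python version, not the authors' R code: the function name and the toy scores are hypothetical, the effect size shown (mean divided by sample SD) is one common definition of Cohen's d for a one-sample test and is an assumption here, and the p-value lookup from the t distribution is omitted.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(scores, mu=0.0):
    """Return (t, df, d) for a one-sample t-test of scores against mu."""
    n = len(scores)
    m = mean(scores)
    sd = stdev(scores)             # sample SD (n - 1 in the denominator)
    t = (m - mu) / (sd / sqrt(n))  # t statistic
    d = (m - mu) / sd              # Cohen's d (assumed definition)
    return t, n - 1, d

# Invented congruency difference scores for one group and condition
toy_scores = [0.10, 0.20, 0.05, 0.15, 0.00]
t, df, d = one_sample_t(toy_scores)
print(round(t, 3), df, round(d, 3))
```

A t value reliably above 0 indicates that congruent primes improved performance relative to incongruent primes in that group and condition, that is, that priming was present.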
Notably, diverging age-related patterns emerged in the autism and neurotypical groups, with regard to the presence of priming on face targets dependent on prime type. Accordingly, we examined whether the strength of priming significantly differed across diagnostic and age groups. A linear mixed effects model with Diagnostic group, Age group, Prime type, and their possible interactions as fixed effects, alongside a by-subject intercept as random effects, was conducted for the face condition. However, no significant effects or interactions were observed, indicating that facial emotion recognition was primed by speech prosody and song to similar extents across all groups. For a full results summary, see Supplementary Table S1.
Additionally, as priming was present in both autistic and neurotypical children for the speech prosody-object condition, we examined whether the strength of priming significantly differed between the two groups. Independent samples t-test results revealed no significant difference (t(26) = −0.41, p = 0.686, d = 0.15), indicating the strength of priming of speech prosody on object emotion recognition did not differ between autistic and neurotypical children.
Altogether, these results suggest that, in terms of accuracy, while priming effects of speech prosody on face targets were prominent across ages in the neurotypical group, these effects were not observed in adults within the autistic group. Conversely, while priming effects of song on face targets appeared prominent across ages in the autistic group, these effects were not observed in adolescents and adults within the neurotypical group. Importantly, these age-related differences were subtle and did not reach statistical significance when directly comparing between groups. Additionally, regardless of speech prosody or song as primes, emotional priming was generally not observed for object targets across groups.
FIGURE 2 Boxplots of mean accuracy across congruency levels by prime-target condition for each diagnostic by age group.

Congruency mean response time difference
The analysis revealed a significant two-way interaction of Prime type × Target type. Pairwise comparisons showed stronger cross-modal emotional priming effects when object targets were primed by song (M (SD) = 316.09 (481.03)) than when they were primed by speech prosody (M (SD) = 121.94 (410.86)) (t(75) = 2.85, p = 0.006, d = 0.43). For face targets, there was no significant difference in the strength of priming between speech prosody and song as primes (p = 0.121). Additionally, stronger priming effects were found when object targets were primed by song (M (SD) = 316.09 (481.03)) than when face targets were primed by song (M (SD) = 126.70 (358.98)) (t(75) = 2.98, p = 0.004, d = 0.45). There was no difference in the strength of priming between face and object targets when primed by speech prosody (p = 0.085). For a full breakdown, see Supplementary Table S2.
These results indicate that in terms of processing speed, emotional priming of song on object targets was particularly prominent. By contrast, emotional priming of speech prosody and song had similar effects on face targets. Importantly, these results were observed across diagnostic groups and age groups.

DISCUSSION
This is the first study to directly compare the implicit emotional priming of speech prosody and song on subsequent visual emotional judgment in autistic and neurotypical individuals of different age groups. Regarding our research questions, our main findings are: (1) there were no significant differences in the strength of emotional priming between the autism and neurotypical groups across ages, regardless of prime and target types; (2) there were divergent age-related patterns in the presence of emotional priming between the two groups, depending on the prime and target type. Specifically, in the neurotypical group, while emotional priming of speech prosody on human faces was found across age groups, emotional priming of song was found only in children. Conversely, in the autism group, emotional priming of speech prosody on human faces was found only in children and adolescents, whereas emotional priming of song was found across all age groups; (3) in general, emotional judgment of face-like objects was less well primed than human faces across all groups.

No overall impairments in cross-modal emotional priming in autism relative to neurotypical development
This study provided no evidence for impaired implicit emotional priming in autistic individuals relative to neurotypical individuals, given that there were no significant group differences in the extent to which visual emotion recognition was primed by emotions expressed auditorily. These findings support previous work by Vanmarcke and Wagemans (2017), while contradicting that by Kamio et al. (2006). The discrepancy in findings may relate to the association between the implicitly preactivated concept and the concept required for the target response. For instance, the present study found no impairments in autism when participants were presented with emotional auditory primes and then made emotional judgments of visual targets. Likewise, Vanmarcke and Wagemans (2017) also found no impairments in autism when participants were presented with emotional face primes (happy vs. sad) and then made valence judgments of face targets (positive vs. negative). These two studies, thus, demonstrate a shared concept of emotional meaning to be transferred from the prime to the target response. By contrast, Kamio et al. (2006) found impaired emotional priming in autism when participants were presented with emotional face primes and made liking judgments of ideographs, which arguably denotes a less related concept between emotional meaning of the prime and preferential judgments required as responses to targets. Taken together, it may be that atypicality in autism lies in the extent to which implicitly processed emotional information can guide behaviors in a wider (e.g., non-emotionally related) context.

More broadly, our findings appear to contradict several neurophysiological studies that reported atypical implicit emotion processing in autism, particularly for speech prosody (Fan & Cheng, 2014; Lindström et al., 2018). The mismatch between behavior and neural underpinnings of implicit emotion processing in autistic individuals may indicate the potential contributing role of compensation. Compensation in neurodevelopmental disorders refers to the processes that contribute to an improved behavioral presentation, despite persisting core deficit(s) at cognitive and/or neurobiological levels (see Livingston & Happé, 2017 for a detailed discussion). Neural compensation may be evident in additional 'neural effort' through the recruitment of other networks to support emotion processing in autistic individuals (e.g., Gebauer et al., 2014). Alternatively, it is also possible that reduced neural activation is not at a threshold to be reflected in the behavioral performance (e.g., Caria et al., 2011). Future research would benefit from the combined use of both behavioral and neuroimaging measures (e.g., EEG, fMRI) to provide insights into whether the present behavioral results are indicative of neural compensation, low task sensitivity to reflect poor neural encoding, or both.

Table note: Sensitivity power analyses using G*Power 3.1 (Faul et al., 2007) suggested that our sample size had 80% power to detect large effect sizes (ds ≥0.81) across comparisons. All p-values were adjusted using the Benjamini-Hochberg method for the number of tests conducted within each prime-target condition. Significant effects are in bold.
The special status of speech prosody for cross-modal emotional priming in neurotypical development but not in autism
While no overall group differences were observed, our study revealed distinct domain-specific developmental patterns regarding the presence of emotional priming between the autism and neurotypical groups. In the neurotypical group, emotional priming for speech prosody on human faces remained stable across ages, implying that speech prosody cues may have a special status in priming the interpretation of facial expressions for neurotypical individuals of all ages. The absence of emotional priming for song on human faces in the older neurotypical groups further supports this notion. By contrast, in the older autism group, there was a lack of priming for speech prosody on human faces. This suggests that speech prosody cues may not carry the same importance for priming the interpretation of facial expressions in autistic individuals as they do for neurotypical individuals across development, and that autistic individuals may instead orient more towards song cues.
While a plausible explanation for these findings could be a plateau in the development of emotional skills for recognizing human faces (e.g., Rump et al., 2009), this explanation does not fully elucidate why similar age effects were not observed across different contexts (i.e., the absence of priming in autistic adults was seen for speech prosody but not for song). Importantly, the varied emotional priming patterns in the autism and neurotypical groups cannot be attributed to their baseline emotion recognition abilities (i.e., in the absence of priming). As reported in a separate study with the same sample, no significant differences in emotion recognition accuracy between autistic and neurotypical groups were found across ages for all four types of stimuli (Leung et al., 2023).
The reduced implicit processing bias towards speech prosody in autistic individuals compared to neurotypical individuals is well documented in the speech perception literature. Studies have shown that neurotypical individuals prefer speech over non-speech stimuli from a very early age (Alegria & Noirot, 1978; Vouloumanos et al., 2010), whereas autistic individuals do not have such a bias (Filipe et al., 2018; Järvinen-Pasley & Heaton, 2007; Klin, 1991, 1992). Moreover, findings from event-related potential (ERP) studies using a passive listening paradigm provide substantial evidence for atypical early, preattentive processing of speech stimuli in autism (Čeponienė et al., 2003; Kujala et al., 2005; Whitehouse & Bishop, 2008; Zhang et al., 2019). Accordingly, the atypical initial orientation may have resulted in reduced automatic responses to emotional speech prosody, exerting an influence on its subsequent cross-modal emotional transfers (e.g., through implicit priming) in autism.
One possible explanation for the more stable presence of emotional priming of song across ages in the autism group is the greater neural activation for this stimulus type. In support of this, neuroimaging studies using fMRI and diffusion tensor imaging (DTI) have demonstrated that, relative to neurotypical individuals, activation in the neural system associated with speech and song processing (e.g., the left inferior frontal gyrus) was reduced for speech, but was comparable or greater for song in autistic individuals (Lai et al., 2012; Sharda et al., 2015). Additionally, fronto-temporal connectivity was found to be greater for song compared to speech in autistic individuals, while these differences were not seen in neurotypical individuals (Lai et al., 2012; Sharda et al., 2015). The greater neural responses to song over speech may imply greater orientation towards sung cues for emotional information to be processed and transferred through priming in autism.
The present findings corroborate previous multisensory integration research by showing the reduced cross-modal influence of speech (Ben-Yosef et al., 2017; Charbonneau et al., 2013; O'Connor, 2007; Vannetzel et al., 2011; Xavier et al., 2015), but typical cross-modal influence of musical stimuli on the interpretation of facial expressions (Brown, 2017; Wagener et al., 2020) in autism. Taken together, the cross-modal influence of music appears to occur both at the explicit and implicit levels in autism, even as a transient cue in a sung format, as demonstrated in the present study. Our findings, thus, outline the potentially important role of automatic response to emotional cues as a prerequisite to cross-modal transfers and/or integration of emotion. In particular, speech prosody cues may play a less prioritized role in emotional priming in autistic individuals compared to neurotypical individuals.
The modulating effect of prime-target co-occurrence on cross-modal emotional priming in both autism and neurotypical development
Our accuracy data provided evidence for a priming advantage of higher co-occurring prime-target pairs in emotional priming. In general, there was a lack of priming effects of both speech prosody and song on the emotional judgment of face-like objects across most age and diagnostic groups. As hypothesized, this may be due to their low co-occurrence with the human voice, as opposed to the high co-occurrence of human faces with the human voice, irrespective of whether it is spoken or sung. There may, thus, be a weaker association formed between face-like objects and the human voice, as they are not typically encountered simultaneously in our everyday social interaction. The present findings extend those reported in Carroll and Young (2005), by demonstrating the role of prime-target co-occurrence with different types of stimuli.
Priming effects of speech prosody on emotional judgment accuracy of face-like objects were, nonetheless, noted for both autistic and neurotypical children. To our knowledge, no previous studies have examined the role of co-occurrence in emotional priming or the ability to process emotions from audiovisual stimuli involving nonhuman faces in children. The lack of research in this area makes it challenging to interpret this finding. Based on the co-occurrence framework, it is possible that children may encounter pairings of the human voice and nonhuman faces more often compared to adolescents and adults, perhaps through watching cartoons or playing with talking toys that are nonhuman looking. However, it is not feasible to extrapolate from the current data whether priming effects on face-like objects in children were confounded by frequent exposure to such pairings during early development. Continued investigation into whether and how exposure to socioemotional stimuli, such as television viewing (e.g., Black & Barnes, 2015), contributes to emotion processing would provide greater understanding of how social experience is shaped and refined throughout development.

Limitations and future directions
The present findings should be interpreted in the context of some limitations. We found that our linear mixed effects analyses were adequately powered to detect small effects within the highest order interactions. However, it should be noted that our post-hoc analyses, specifically the one-sample t-tests within groups and the independent samples t-tests between groups, were constrained in power. This restriction stemmed from the reduced sample size due to age group breakdown, which limited our ability to detect smaller effects. Effect sizes were presented for all effects and interactions, so that the present results could be treated as preliminary to guide future large-scale studies in this area, contributing to a better understanding of the developmental trajectory of implicit emotion processing in autism.
It is noteworthy that our adult sample had a more uneven gender ratio, with a higher proportion of females, compared to the child and adolescent samples. This discrepancy is inherent to the volunteer sampling approach employed and may also reflect the ongoing discussion surrounding delayed autism diagnosis in females (e.g., Begeer et al., 2013). While beyond the scope of the present study, the potential influence of gender ratio on the observed differences across age groups should not be overlooked, considering the role of gender differences in emotion processing highlighted in previous autism-related research (e.g., Livingston et al., 2022). Subsequent replications of this study should aim to explore these findings further across different genders.
As this is the first study to investigate implicit emotion processing in autism from a multi-domain perspective, prototypical emotional expressions were used as stimuli to provide relatively comparable levels of recognition difficulty across domains, while ensuring sufficient saliency of these stimuli for priming and their suitability to accommodate the wide age range of participants (see Leung et al., 2023). The use of prototypical expressions may have reduced the sensitivity of the present task in detecting subtle differences between the autism and neurotypical groups. Experimental manipulation of the emotional intensities of stimuli across domains (i.e., to represent more naturally occurring expressions) will improve task sensitivity in future work, while allowing the threshold at which group differences (if any) emerge to be established.
Relatedly, as an exploratory investigation, the present study paired each emotional prime with targets expressing the four emotions of interest in turn. This resulted in an uneven proportion of congruent (25%) versus incongruent (75%) trials of prime-target relations. The lower occurrence of congruent trials could potentially diminish emotional priming effects (Spruyt et al., 2007). It is proposed that on incongruent trials, the automatic stimulus-response route may lead to interference and become suppressed in the following trial, potentially reducing facilitation in the subsequent congruent trials (Kunde, 2003). Therefore, further investigation is warranted to determine if the lack of group differences was limited by an overall reduction in priming effects within each group. In a similar vein, while an SOA of 200 ms is postulated to capture implicit emotion processing, both shorter and longer SOAs have also been employed to study emotional priming in previous studies (e.g., Jiang et al., 2016). This raises further questions about whether subtle group differences may emerge as the amount of information in the prime to be processed changes (i.e., at different SOAs) during this fast-acting cognitive process. Future studies may alter congruence proportions to examine the replicability of the present findings, as well as vary SOAs to elucidate the time course of implicit emotional priming in the autism and neurotypical groups.
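The 25% versus 75% split follows directly from crossing every prime emotion with every target emotion. The short Python sketch below illustrates that arithmetic only; it is not the study's experiment code, and the emotion labels and function names are hypothetical placeholders, since this section refers only to "the four emotions of interest".

```python
from itertools import product

# Placeholder labels (assumption): the section does not name the four emotions.
EMOTIONS = ["emotion_A", "emotion_B", "emotion_C", "emotion_D"]


def build_pairs(emotions):
    """Pair every prime emotion with every target emotion exactly once."""
    return [
        {"prime": prime, "target": target, "congruent": prime == target}
        for prime, target in product(emotions, repeat=2)
    ]


def congruent_proportion(pairs):
    """Share of prime-target pairs whose emotions match."""
    return sum(pair["congruent"] for pair in pairs) / len(pairs)


pairs = build_pairs(EMOTIONS)
print(len(pairs))                   # 16 prime-target combinations
print(congruent_proportion(pairs))  # 0.25, i.e. 4 congruent out of 16
```

With a full crossing of k emotions, only k of the k × k combinations are congruent, so the congruent share is 1/k. Raising it (for example to 50%) would require repeating congruent pairs or dropping some incongruent ones, which is the kind of manipulation suggested above for future work.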

Implications
This study has a potential practical implication. If playing congruent emotional speech prosody and song primes to enhance facial emotion recognition skills proves effective, it could form the basis of learning strategies to boost the decoding of facial expressions for both autistic and neurotypical children. Relatedly, positive intervention effects of song on facial emotion recognition have been demonstrated in previous work (e.g., Katagiri, 2009). Given that the reduced speech-orientation at an early stage of emotion processing may underlie social communication difficulties in autistic individuals, the development of such learning strategies may be guided to incorporate speech prosody in addition to song as a facilitating tool for facial emotion recognition, in order to strengthen the cross-modal transfers between the two domains. Taking advantage of the larger priming effects observed at a younger age, such strategies would perhaps work best earlier in life, that is, before the cross-modal influence of auditory stimuli (particularly of speech prosody cues) on facial emotional judgment begins to weaken with age in autism. Future work examining whether and how these approaches are effective in the long term will help practitioners develop learning shortcuts to improving the quality of social interactions in autistic individuals.

Conclusion
Overall, the present study found no differences in implicit emotional priming between autistic individuals and neurotypical individuals, as assessed on a cross-modal emotional priming task. This is true for different types of auditory primes (speech prosody and song) and visual targets (human faces and face-like objects). Differential developmental patterns in implicit emotion processing for the two auditory prime types between the two groups were, nonetheless, observed. Speech prosody but not song implicitly primed subsequent emotional judgment of human facial expressions in neurotypical individuals across ages. This speech-orienting tendency for implicit emotional priming, however, appeared to be less prominent with age in autistic individuals. Across groups, emotional priming was generally not observed between pairs of stimuli that had a low co-occurrence in the environment (song- and speech prosody-object). Together, these results indicate that the cross-modal influence of implicitly processed emotional cues may become more fine-tuned for interpersonal events (i.e., speech prosody-face) in neurotypical development. Conversely, the reduced speech-orienting tendency seen in autism may affect real-world social communication, where emotional information from speech can be used to support more efficient judgment of facial expressions. Altogether, these results outline the importance of the delicate weighting between speech- versus song-orientation at an early stage of emotion processing throughout development, contributing to a more nuanced understanding of the emotion processing profile of autistic individuals.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.

KEYWORDS: affective priming, age, autism, emotion processing, faces, pareidolia, song, speech prosody

TABLE 1 Characteristics of the autism and neurotypical groups by age group.
Note: ROWPVT-4 = Receptive One-Word Picture Vocabulary Test-Fourth Edition.

TABLE 2 Linear mixed effects model results for diagnostic group, age group, prime type, target type, and their interactions on congruency mean accuracy and RT difference scores, respectively.
Note: A power analysis using powerSim() from simr (Green & MacLeod, 2016) determined that our linear mixed effects models and sample size had sufficient power (97.40%, 95% CI [96.21%, 98.29%]) to detect small (β = 0.52) interactions between the four variables of interest (i.e., Diagnostic group × Age group × Prime type × Target type), at an alpha level of 0.05, based on 1000 simulations. R model equation for congruency mean accuracy difference score: lmer(Congruency Mean Accuracy Difference ~ Diagnostic Group × Age Group × Condition × Emotion + (1 + Prime Type + Target Type | Subject)). R model equation for congruency mean RT difference score: lmer(Congruency Mean Response Time Difference ~ Diagnostic Group × Age Group × Prime Type × Target Type + (1 + Prime Type | Subject)). Significant effects and interactions are in bold.

TABLE 3 Results summary of one-sample t-tests on priming congruency mean accuracy difference against the test value of 0 (i.e., no priming) by diagnostic group, age group, and prime-target
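To make the logic of testing a congruency difference score against 0 concrete, the following Python sketch computes a one-sample t statistic by hand. It is an illustration under stated assumptions rather than the authors' analysis: the difference score is assumed here to be congruent minus incongruent accuracy, the function names are hypothetical, and the participant values are invented purely for demonstration.

```python
import math
import statistics


def congruency_difference(congruent_acc, incongruent_acc):
    """Assumed definition: congruent minus incongruent mean accuracy."""
    return congruent_acc - incongruent_acc


def one_sample_t(scores, test_value=0.0):
    """Return the t statistic and degrees of freedom for H0: mean == test_value."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 in the denominator)
    t = (mean - test_value) / (sd / math.sqrt(n))
    return t, n - 1


# Invented (congruent, incongruent) accuracies for five hypothetical participants.
accuracies = [(0.90, 0.80), (0.85, 0.80), (0.75, 0.78), (0.95, 0.85), (0.80, 0.70)]
scores = [congruency_difference(c, i) for c, i in accuracies]
t_stat, df = one_sample_t(scores)
print(f"t({df}) = {t_stat:.2f}")
```

A positive t statistic indicates higher accuracy on congruent than incongruent trials, that is, priming in the expected direction. Obtaining a p-value would additionally require the t distribution, for example from a statistics library.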