Heterogeneity in response to repeated intranasal oxytocin in schizophrenia and autism spectrum disorders: A meta‐analysis of variance

Intranasal oxytocin (OT) has been suggested as a putative adjunctive treatment for patients with schizophrenia and autism spectrum disorders (ASD). Here, we examine available evidence from trials investigating the effects of repeated administrations of intranasal OT on the core symptoms of patients with schizophrenia and ASD, focusing on its therapeutic efficacy and heterogeneity of response (meta‐ANOVA). Repeated administration of intranasal OT does not improve most of the core symptoms of schizophrenia and ASD, beyond a small tentative effect on schizophrenia general symptoms. However, we found significant moderator effects for dose in schizophrenia total psychopathology and positive symptoms, and percentage of included men and duration of treatment in schizophrenia general symptoms. We found evidence of heterogeneity (increased variance) in the response of schizophrenia negative symptoms to intranasal OT compared with placebo, suggesting that subgroups of responsive and non‐responsive patients might coexist. For other core symptoms of schizophrenia, or any of the core symptom dimensions in ASD, the response to repeated treatment with intranasal OT did not show evidence of heterogeneity.

psychiatric disorders (Cochran et al., 2013). Most of these studies have focused on patients with schizophrenia (Zheng et al., 2019) and autism spectrum disorders (ASD) (Ooi et al., 2017). So far, these RCTs have largely failed to provide convincing evidence that treatment regimens involving the repeated administration of intranasal OT improve any of the core symptoms of patients with schizophrenia (Williams & Burkner, 2017; Zheng et al., 2019) or ASD (Ooi et al., 2017; Wang et al., 2019). Some evidence contradicting all previous meta-analyses was reported in a recent meta-analysis including nine schizophrenia and 10 ASD RCTs published up to 2019: Peled-Avron et al. (2020) found that repeated intranasal OT improved total psychopathology in patients with schizophrenia, and repetitive behaviours and social functioning (but not global symptoms) in patients with ASD. However, this meta-analysis did not control for pretreatment symptom scores, which may have biased its estimates of treatment efficacy (i.e. differences in symptom severity between the placebo and intranasal OT groups at the post-treatment assessment might simply reflect pre-existing differences between groups at baseline).
A further shortcoming of existing meta-analyses is that they rely on mean group-level estimates of efficacy and, therefore, cannot elucidate whether a repeated intranasal OT treatment regime might have beneficial effects for a subset of responsive patients. Such heterogeneity in treatment response is often invoked to explain the lack of overall findings in group-level analyses, as the effects on a subset of responsive patients might be diluted by the lack of effects on non-responsive patients (Bradley & Woolley, 2017; Cochran et al., 2013), but we are not aware of any study testing this hypothesis formally.
The idea of substantial underlying heterogeneity in the response to repeated intranasal OT treatment in patients with schizophrenia and ASD is supported by two sources of evidence. First, both schizophrenia (Liang & Greenwood, 2015) and ASD (Mottron & Bzdok, 2020) are highly heterogeneous disorders; thus, it is plausible that intranasal OT might benefit specific subsets of patients in whom a deficit in OT signalling contributes to symptoms to a greater extent. Second, a number of inter-individual factors have also been shown to moderate the response to intranasal OT, such as (epi)genetic variation (Chen et al., 2020; Feng et al., 2015) and childhood trauma (Joseph et al., 2019; van Zuiden et al., 2017), likely by affecting the capacity of the OT receptor to respond to OT binding (Kraaijenvanger et al., 2019). Hence, to properly evaluate the therapeutic potential of a daily-repeated intranasal OT treatment regime in patients with schizophrenia and ASD, it is important to investigate whether subgroups of patients may show a differential response to treatment that is masked by mean group-level analyses.
The aims of this study are twofold. First, given the inconsistencies and methodological shortcomings among recent meta-analyses, and the emergence of new RCTs that have not been previously included (Bernaerts et al., 2020), we conducted updated meta-analyses of all RCTs investigating the effects of repeated intranasal OT treatment on the core symptoms of patients with schizophrenia and ASD, using gold-standard methods. Specifically, to estimate the effects of repeated intranasal OT on symptoms we used the standardized mean change using raw score standardization (SMCR) statistic (Morris, 2000), which, unlike other effect sizes such as Hedges' g, not only contrasts treatment and control groups, but also controls for possible differences in the pretreatment values. This approach allows us to re-evaluate the evidence regarding the efficacy of intranasal OT in these two populations of patients using state-of-the-art methodology that protects researchers and clinicians from over-optimistic and perhaps misguided conclusions. Second, we investigated whether heterogeneity (i.e. subsets of responsive and non-responsive patients) might exist in the response to repeated intranasal OT treatment by conducting meta-analyses of variance. Recent meta-analyses using this approach investigated indirectly the presence of individual responses in RCTs (Mizuno et al., 2020; Radua et al., 2020; Senior et al., 2016; Winkelbeiner et al., 2019). These methods are based on the quantification of variance in the treatment group compared with the placebo group and have already been applied to address many other important questions related to heterogeneity in psychiatry (Mizuno et al., 2020; Pillinger et al., 2019; Radua et al., 2020; Winkelbeiner et al., 2019). A key prediction of this variance model is that, in the presence of treatment-by-individual (or groups of individuals) interactions, variance in the response to treatment should be higher in the intranasal OT than in the placebo group.
In this case, group-level estimates are unlikely to be good approximations of the estimates for the individual, calling for a personalized or stratified medicine approach.

| Literature search
We followed the PRISMA guidelines (Moher et al., 2010) (flowcharts in Figures S1 and S2) during our systematic review process. We conducted searches in the PubMed, Web of Science and Cochrane Central Register of Controlled Trials from inception until 18th June 2020.
To search for intranasal OT schizophrenia RCTs, we used the following search query: '(Oxytocin OR Pitocin) AND (Schizo* OR Dementia praecox OR Psychosis OR Psychotic)'. To search for intranasal OT ASD RCTs, we used the following search query: '(Oxytocin OR Pitocin) AND (Autism OR ASD OR Autist* OR Asperger OR Pervasive developmental disorder)'. Our search queries were adapted in accordance with the specification of each database. Our search strategy was tailored to capture all OT studies in schizophrenia and ASD and not only RCTs. We used a broad search query to minimize search bias.

| Study selection
We performed study selection with the help of the reference manager software Rayyan (Ouzzani et al., 2016). We imported all results from our literature search into Rayyan and started by removing all duplicated studies using the Rayyan smart group function for duplicates. Then, two independent reviewers (DM and MP) screened all titles, abstracts and keywords to select eligible references for further scrutiny. Any disagreements were adjudicated by consensus. The two independent reviewers (DM and MP) then assessed all potentially eligible full-texts against the inclusion/exclusion criteria to decide on the inclusion/exclusion of the reference. Any disagreements were adjudicated by consensus. The inclusion/exclusion criteria were tailored to retain all RCTs assessing the efficacy of intranasal OT, as compared with placebo, on core symptoms measured before and after a period of repeated treatment administration. The inclusion criteria were: original article; written in English; published and peer-reviewed; placebo-controlled, double-blind, randomized clinical trial investigating the effects of repeated administration of intranasal OT on core symptoms of schizophrenia or ASD; enough available data for meta-analysis (i.e. means and SDs of pre-and post-treatment symptoms scores). We excluded: systematic reviews and meta-analyses; studies in healthy volunteers; studies not involving repeated intranasal administration of OT (i.e. single acute administrations); not placebo-controlled; not focusing on core symptoms as outcomes (e.g. using only social cognition tasks); studies where we could not retrieve sufficient data for meta-analysis.

| Outcomes and data extraction
Two authors independently read the full text of all included articles. We extracted the following information: first author, publication year, study design, sample size, proportion of male participants, duration of illness, diagnosis and diagnostic criteria, whether patients were in- or outpatients, frequency of intranasal OT administration, daily dose, scheme of administration, total duration of treatment and the concomitant administration of any other treatment. Discrepancies were adjudicated by consensus.
For the RCTs on schizophrenia, we focused on total psychopathology (measured with the Positive and Negative Syndrome Scale (PANSS) (Kay et al., 1987), Brief Psychiatric Rating Scale (BPRS) (Flemenbaum & Zimmermann, 1973) or the Clinical Global Impression-Severity scale (CGI-S) (Busner & Targum, 2007)), or any of the three symptom sub-domains: positive symptoms (measured with the BPRS, PANSS or Scale for Assessment of Positive Symptoms (SAPS) (Minas et al., 1994)), negative symptoms (measured with the PANSS, Clinical Assessment Interview for Negative Symptoms (CAINS) (Kring et al., 2013) and Scale for Assessment of Negative Symptoms (SANS) (Andreasen, 1989)) and general symptoms (measured exclusively with the PANSS). Briefly, positive symptoms include symptoms such as delusions, hallucinations and disorganized thinking; negative symptoms are characterized by deficits in cognitive, affective and social functions, including blunting of affect and passive withdrawal; general symptoms include many deficits in cognition such as disorientation, poor attention, lack of insight and active social avoidance, that are not captured in the positive/negative symptom domains; and total psychopathology reflects the sum of all of the above (Kay et al., 1987).
For all of our outcome measures, we extracted the means and respective SDs of the pre- and post-treatment scores for each treatment group separately for the efficacy meta-analyses. If means and SDs of the changes from pre- to post-treatment were also reported, we extracted those values as well for the variance meta-analyses. In crossover studies, we considered only data from the first randomization period to avoid potential treatment-by-sequence interactions. For those studies that did not report statistics on the score changes from pre- to post-treatment, we calculated the mean difference by subtracting the post- from the pretreatment scores and the respective SD of the difference using formula 1 below (Cumpston et al., 2019):

$$SD_{change} = \sqrt{SD_{pre}^{2} + SD_{post}^{2} - 2\, r_{pre,post}\, SD_{pre}\, SD_{post}} \qquad (1)$$

For the schizophrenia RCTs, we assumed r pre,post to be 0.50, in line with correlations typically obtained in schizophrenia studies (e.g. see the Schizo_PANSS data in the R package Surrogate). For the ASD RCTs, we assumed r pre,post to be 0.70, as suggested by available patient data from two of the RCTs included in this study (Bernaerts et al., 2020; Munesue et al., 2016). For completeness, to exclude the possibility that our findings were driven by the specific assumptions we made regarding the correlation of pre- and post-treatment scores (r pre,post), in both cases we recalculated the estimated SD of the change using three different r pre,post values (0.5, 0.7 and 0.9) and repeated all of our main analyses to check whether our pooled estimates would change considerably as a function of r pre,post.
Because we did not find any considerable change in the main conclusions of our analyses, for conciseness we report only our main analyses for r pre,post = 0.50 in schizophrenia and r pre,post = 0.70 in ASD.
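The imputation of the change-score SD and its sensitivity to the assumed pre-post correlation can be illustrated with a minimal Python sketch. The function name and the SD values are hypothetical and chosen only for illustration; they are not taken from any included trial.

```python
import math

def sd_change(sd_pre, sd_post, r):
    """SD of the pre-to-post change score, given both SDs and their correlation r."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2.0 * r * sd_pre * sd_post)

# Hypothetical pre- and post-treatment SDs (illustrative values only)
sd_pre, sd_post = 12.0, 10.0

# Sensitivity check over the three correlations considered in the text
for r in (0.5, 0.7, 0.9):
    print(f"r = {r}: SD_change = {sd_change(sd_pre, sd_post, r):.2f}")
```

The imputed SD shrinks as the assumed correlation increases, which is why the main analyses were repeated across several values of r pre,post.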
Where information was missing, we contacted the authors directly to access the missing data. If after contacting the authors, we could not gather all the information we needed to calculate the effect sizes for meta-analysis, but this information could be retrieved from available plots in the manuscript, then we used the DigitizeIt software (http:// www.digitizeit.de/) to extract data presented graphically (this happened with the pretreatment social functioning scores from Parker et al., 2017).
If we could not retrieve the missing data for any outcome measures from the authors or from available plots, then these outcome measures were excluded from further analyses (this happened with the pretreatment scores of repetitive/stereotypical behaviours from Parker et al., 2017).

| Risk of bias
We used the revised Cochrane Risk of Bias tool (RoB 2) (Higgins et al., 2011) to assess and classify the risk of bias in each of the included studies, according to criteria defined a priori. We judged whether each study had a high, low or unclear risk of bias in each of the following five domains: randomization process, deviations from intended intervention, missing outcome data, measurement of the outcome, and selection of the reported results. The overall risk of bias was classified as low if none of the above domains was rated as high risk and three or fewer were rated as unclear risk. It was classified as moderate if one domain was rated as high risk, or if none was rated as high risk but four or more were rated as unclear risk. All other studies were classified as having a high risk of bias. The risk of bias was assessed independently by two reviewers and any disparity was resolved by consensus. Risk of bias assessments for each study can be found in Figures S3 and S4.
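The overall classification rule described above can be written as a small decision function. This is only a sketch of the stated rule; the function name and the rating labels are assumptions made for illustration.

```python
from collections import Counter

def overall_risk(ratings):
    """Combine five domain ratings ('low', 'unclear', 'high') into an overall rating."""
    counts = Counter(ratings)
    high, unclear = counts["high"], counts["unclear"]
    if high == 0 and unclear <= 3:
        return "low"
    if high == 1 or (high == 0 and unclear >= 4):
        return "moderate"
    return "high"  # two or more domains rated as high risk

# Hypothetical study: one unclear domain, no high-risk domain
print(overall_risk(["low", "low", "unclear", "low", "low"]))
```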

| STATISTICAL ANALYSIS
3.1 | Calculation of effect sizes and respective variance

| Meta-analysis of efficacy
To estimate the effects of repeated intranasal OT on symptoms we used the standardized mean change using raw score standardization (SMCR) statistic (Morris, 2000). The SMCR, unlike other effect sizes such as Hedges' g, not only contrasts treatment and control groups, but also controls for possible differences in the pretreatment values.
This statistic is estimated in two steps:
i. computing the standardized mean change between pre- and post-treatment time points separately for the treatment and control groups (formula 2), where the indices t0 and t1 indicate the different measurement times (e.g. pre- and post-treatment) and a correction factor (formula 3) is applied to achieve an unbiased estimator;
ii. computing the final effect size of the treatment effect as the difference in the SMCR between the treatment and placebo groups (formula 4).
Negative SMCR values mean improvement in symptoms from pre- to post-treatment in the repeated intranasal OT group as compared with placebo.
The SMCR provides a better estimate of treatment effects, but it requires knowledge of the correlations of outcomes across time points (r pre,post ) to compute its variance. As described above, we assumed 0.5 for schizophrenia and 0.7 for ASD studies, but repeated all our analyses for r pre,post of 0.5, 0.7 and 0.9.
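As a rough illustration of the SMCR computation described in this section, the Python sketch below computes the standardized mean change per arm and the difference between arms. It rests on assumptions that may differ from the authors' exact formulas: the small-sample correction uses the common approximation c(df) = 1 - 3/(4 df - 1), the per-arm variance uses a widely used large-sample approximation, and the variance of the difference is taken as the sum of the two arm variances (independent groups). All input values are hypothetical.

```python
def bias_correction(df):
    # Assumed approximation of the small-sample correction factor
    return 1.0 - 3.0 / (4.0 * df - 1.0)

def smcr(mean_pre, mean_post, sd_pre, n, r):
    """Standardized mean change (raw-score standardization) for one arm.

    Returns the effect size and an approximate sampling variance, where r is
    the assumed pre-post correlation.
    """
    d = bias_correction(n - 1) * (mean_post - mean_pre) / sd_pre
    var = 2.0 * (1.0 - r) / n + d**2 / (2.0 * n)  # assumed large-sample form
    return d, var

def treatment_effect(ot, pl):
    """Difference in SMCR (OT minus placebo) and its variance."""
    d_ot, v_ot = smcr(**ot)
    d_pl, v_pl = smcr(**pl)
    return d_ot - d_pl, v_ot + v_pl

# Hypothetical arms (illustrative values only)
ot_arm = dict(mean_pre=80.0, mean_post=70.0, sd_pre=10.0, n=20, r=0.5)
pl_arm = dict(mean_pre=80.0, mean_post=75.0, sd_pre=10.0, n=20, r=0.5)
effect, variance = treatment_effect(ot_arm, pl_arm)
print(f"SMCR difference = {effect:.3f}, variance = {variance:.4f}")
```

In this toy example the difference is negative, which under the sign convention in the text corresponds to a larger symptom reduction in the OT arm than in the placebo arm.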
We estimated the variance of the SMCR for each experimental group (treatment, placebo) using formula 5, and the variance of the final treatment effect estimate using formula 6.

3.1.2 | Meta-ANOVA
For our meta-analyses of variance, we could have used either the logarithm of the variation ratio (VR) or the coefficient of variation ratio (CVR) to estimate the relative variability in the intranasal OT and placebo groups (Senior et al., 2020). The main difference between these two metrics lies in the fact that the CVR adjusts the VR for mean differences between groups, because in biological systems variance often scales with the mean; in this case, unadjusted variation ratios cannot provide an unbiased measure of variance (Senior et al., 2020). To inform the selection of the most appropriate variance effect size in our case, we examined the correlations between mean pre- to post-treatment score changes and the respective SDs, for each treatment group and symptom domain separately. The correlation coefficients were weighted by study size using the 'weights' package from R (version 1.0). We used bootstrapping (1,000 samples) to assess significance. As expected, we found that in the studies we included in our meta-analyses, mean changes in symptom scores from pre- to post-treatment correlated positively with the SD of the changes in many instances (Table S1).
Hence, we focused our variability meta-analyses on the CVR (formula 7) and estimated the variance of the CVR effect size using formula 8, where x OT and x PL are the mean changes in symptom scores, SD OT and SD PL the reported SDs of the difference, n OT and n PL the sample sizes for the OT and placebo treated groups respectively, and ρ the correlation between converted (log-transformed) means and SDs of changes in scores. The use of the CVR to quantify group differences in variability is possible only where data have a true zero point, that is, no scores lower than 0 exist. This is not the case for raw change scores, which can be positive or negative. Hence, following the approach suggested by McCutcheon et al. (2019), we converted values of mean change to a ratio scale (using formula 9),
where x converted is the converted mean change in symptoms, x reported is the absolute value of the reported mean change in symptoms, and C represents the minimum score possible on the rating scale (e.g. 30 in the case of the PANSS). In all instances where x is mentioned in the text this refers to x converted unless otherwise stated.
A CVR above 1 indicates greater variability in the intranasal OT arm, whereas a value below 1 indicates greater variability in the placebo arm.
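The variability comparison can be sketched in Python as follows. The sketch assumes the log coefficient of variation ratio in the form commonly used in meta-analyses of variance, with a small-sample bias correction and a sampling variance that includes the mean-SD correlation ρ; this may differ in detail from the authors' formulas. It takes already converted (strictly positive) mean changes as input and does not implement the conversion to a ratio scale. All numbers are hypothetical.

```python
import math

def ln_cvr(x_ot, sd_ot, n_ot, x_pl, sd_pl, n_pl, rho=0.0):
    """Log coefficient of variation ratio (OT vs placebo) and its variance.

    x_ot and x_pl must be positive (converted mean changes); rho is the assumed
    correlation between log means and log SDs across studies.
    """
    cv_ot, cv_pl = sd_ot / x_ot, sd_pl / x_pl
    estimate = (math.log(cv_ot / cv_pl)
                + 1.0 / (2.0 * (n_ot - 1)) - 1.0 / (2.0 * (n_pl - 1)))

    def arm_variance(x, sd, n):
        a = sd**2 / (n * x**2)       # contribution of the mean
        b = 1.0 / (2.0 * (n - 1))    # contribution of the SD
        return a + b - 2.0 * rho * math.sqrt(a * b)

    variance = arm_variance(x_ot, sd_ot, n_ot) + arm_variance(x_pl, sd_pl, n_pl)
    return estimate, variance

# Hypothetical study: same mean change, larger SD in the OT arm
est, var = ln_cvr(x_ot=10.0, sd_ot=8.0, n_ot=21, x_pl=10.0, sd_pl=5.0, n_pl=21)
print(f"CVR = {math.exp(est):.2f} (lnCVR = {est:.3f}, variance = {var:.4f})")
```

Exponentiating the pooled log ratio gives the CVR on the scale interpreted in the text, where values above 1 indicate greater variability in the intranasal OT arm.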

| Meta-analysis
Our meta-analyses were carried out using the metafor package (version 2.0.0) from R. For the analyses on schizophrenia studies we used four separate univariate random effects models, one for each symptom category to obtain pooled estimates of the overall effect of repeated intranasal OT, as compared with placebo, across studies for (i) total psychopathology; (ii) positive symptoms; (iii) negative symptoms; and (iv) general symptoms. This approach was appropriate for the schizophrenia studies, given that we did not include more than one effect size from the same study in the same model, and, thus, all effect sizes in a model were independent. However, in the case of the ASD studies, one study tested two different doses. The effect sizes of these two doses cannot be considered independent because they originated from the same study. Hence, as a way of dealing with this dependence, for the ASD meta-analyses we used a hierarchical multivariate random effects model, where we specified study as a clustering factor.
Applying multivariate meta-analytic models can be challenging when the covariance structure is unknown and cannot be estimated based on previous literature-which was our case. To overcome this issue, we estimated the variance-covariance matrix from the data using the clubSandwich package from R (0.5.0). We used these models to obtain separate pooled estimates of the effects of intranasal OT, as compared with placebo, on (i) global symptoms; (ii) repetitive/stereotypical behaviours; and (iii) social functioning.
For all of our meta-analytic models, we assessed the presence of influential effect sizes by calculating the Cook's distances (we took a conservative approach where we considered influential any effect size with a Cook's distance >0.5). When an effect size was deemed influential, we then conducted further sensitivity analyses repeating our main analysis excluding the influential effect sizes.

| Heterogeneity
Because studies differ to some extent in their experimental design, tools used for assessing outcomes, and treatment schemes, it is likely that some heterogeneity exists between the outcomes of different studies (Xu et al., 2008). Statistical heterogeneity occurs when the true effects of the different studies show a larger variation than would be expected from random error alone.
Assessing heterogeneity is an important part of interpreting the results or appraising the conclusions that can be drawn from a meta-analysis. In our study, we assessed between-studies heterogeneity using the Cochran's Q test (Kulinskaya & Dollinger, 2015).
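To make the heterogeneity statistic concrete, the sketch below computes Cochran's Q together with a random-effects pooled estimate. Note that it uses the DerSimonian-Laird estimator of the between-study variance for simplicity, which is a swapped-in technique and not necessarily the estimator used in the analyses reported here. Under the null hypothesis of homogeneity, Q is compared against a chi-square distribution with k - 1 degrees of freedom. The function name and the data are hypothetical, and at least two studies are required.

```python
import math

def random_effects_dl(effects, variances):
    """Cochran's Q and a DerSimonian-Laird random-effects pooled estimate."""
    k = len(effects)
    w = [1.0 / v for v in variances]              # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)                 # between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"Q": q, "df": df, "tau2": tau2, "pooled": pooled, "se": se}

# Hypothetical effect sizes and sampling variances (illustrative only)
result = random_effects_dl([-0.40, -0.10, 0.05, -0.25], [0.07, 0.05, 0.09, 0.06])
print(result)
```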

| Moderators
To evaluate the potential influence of moderators, we used meta-regression models, applied separately for each moderator. We focused on the following factors: year of publication, percentage of male participants included, mean age, design of the study (i.e. within- versus between-subjects design), daily dose of OT administered, daily frequency of administration, duration of treatment and the presence of a concomitant additional intervention (i.e. psychological intervention).
We applied Bonferroni correction for the number of moderators investigated.
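As a minimal sketch of this correction, the snippet below derives the adjusted significance threshold for the eight moderators listed above and flags which tests would survive. That the correction was applied across exactly these eight tests at a family-wise alpha of 0.05 is an assumption, and the p-values are hypothetical placeholders, not results from this study.

```python
ALPHA = 0.05

# Moderators listed in the text; the p-values are hypothetical placeholders
p_values = {
    "year_of_publication": 0.410,
    "percent_male": 0.030,
    "mean_age": 0.620,
    "study_design": 0.750,
    "daily_dose": 0.004,
    "daily_frequency": 0.330,
    "treatment_duration": 0.020,
    "concomitant_intervention": 0.540,
}

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

threshold = bonferroni_threshold(ALPHA, len(p_values))
survives = {name: p <= threshold for name, p in p_values.items()}
print(f"Corrected threshold: {threshold}")
print([name for name, ok in survives.items() if ok])
```

A moderator can therefore be nominally significant (p < 0.05) without surviving the correction, which is the situation the results describe for several moderators.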

| Publication bias
We assessed the presence of publication bias through the inspection of funnel plots and the use of the rank correlation test for funnel plot asymmetry (Song et al., 2002).
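The rank correlation test can be sketched as follows: effect sizes are standardized around the fixed-effect pooled estimate, and Kendall's τ is then computed between these standardized effects and their sampling variances. This is a simplified illustration of the general idea that returns only the τ statistic (no p-value); the function names and data are hypothetical, and the details may differ from the implementation used by the authors.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue                  # tied on both variables: ignored
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x)
                      * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom

def rank_correlation_tau(effects, variances):
    """Kendall's tau between standardized effects and their variances."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Variance of each deviation from the fixed-effect pooled estimate
    v_star = [v - 1.0 / sum_w for v in variances]
    z = [(y - pooled) / math.sqrt(vs) for y, vs in zip(effects, v_star)]
    return kendall_tau_b(z, variances)

# Hypothetical asymmetric funnel: less precise studies show more negative effects
tau = rank_correlation_tau([-0.1, -0.3, -0.6, -1.0], [0.01, 0.04, 0.09, 0.16])
print(f"Kendall's tau = {tau:.2f}")
```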

| Power calculations
To inform the appraisal of our findings, we also conducted simulations to calculate the statistical power of our different meta-analytic models to detect small (0.3), medium (0.5) and large (0.7) effect sizes under low (τ² = 0.33), moderate (τ² = 1) and large (τ² = 3) heterogeneity scenarios. We performed these simulations taking into account both the number of effect sizes and the average sample size of the studies included in each of our meta-analyses, using the R script available at https://bit.ly/30lSple. These analyses give an idea of the magnitude of effect sizes our meta-analyses were sufficiently powered to detect, given the number of effect sizes, the average sample size of the included studies and the between-studies heterogeneity.
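A simulation of this kind can be sketched in Python as below. This is not the script referenced above but a simplified stand-in under explicit assumptions: all studies have the same per-arm sample size, the within-study sampling variance is approximated as 2/n, τ² is treated as an absolute variance (it could instead be defined relative to the within-study variance), and significance is assessed with a two-sided z-test at α = 0.05 after DerSimonian-Laird estimation of τ². At least two studies are required.

```python
import math
import random

def simulate_power(delta, tau2, k, n_per_arm, reps=2000, seed=1):
    """Monte Carlo power of a random-effects meta-analysis (equal-sized studies)."""
    rng = random.Random(seed)
    v = 2.0 / n_per_arm                   # assumed within-study variance
    hits = 0
    for _ in range(reps):
        # Draw each study's true effect, then its observed effect
        ys = [rng.gauss(rng.gauss(delta, math.sqrt(tau2)), math.sqrt(v))
              for _ in range(k)]
        mean = sum(ys) / k                # pooled estimate (equal weights)
        q = sum((y - mean) ** 2 for y in ys) / v
        tau2_hat = max(0.0, (q - (k - 1)) * v / (k - 1))  # DerSimonian-Laird
        se = math.sqrt((v + tau2_hat) / k)
        if abs(mean / se) > 1.96:
            hits += 1
    return hits / reps

# Hypothetical scenario: 10 studies with 20 patients per arm
for delta in (0.3, 0.5, 0.7):
    print(delta, simulate_power(delta, tau2=0.33, k=10, n_per_arm=20))
```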
3.1.8 | Individual patient data
Because we could retrieve the raw data from two ASD RCTs (Bernaerts et al., 2020; Munesue et al., 2016), in a secondary analysis of these data we examined whether there was any evidence of deviation from unimodality in the distribution of pre- to post-treatment change scores in the repeated intranasal OT and placebo groups of those two studies. We assessed deviations from unimodality using the

| Meta-analyses
Below, we present the results of our quantitative syntheses. For clarity, we first present the results of our meta-analyses of efficacy. Within each disorder, we present the results for each symptom domain in separate subsections. We then present the results of our meta-analyses of variance, using a similar structure.
[...] We did not find significant heterogeneity (Q(9) = 10.77, p = 0.29). Modabbernia et al. (2013) was deemed an influential study (Cook's distance >0.5). Repeating our meta-analysis without this study did not [...] We could not find evidence of publication bias (Kendall's τ = −0.29, p = 0.29) (Figure S5A).

Positive symptoms
Compared with placebo, repeated intranasal OT did not significantly [...] (Figure S5B).
We could not find evidence of publication bias (Kendall's τ = −0.20, p = 0.48) (Figure S5C). [...] (Figure 1c). We did not find significant heterogeneity (Q(5) = 6.00, p = 0.31). Both percentage of male patients included and total duration of treatment were significant moderators, though none would have survived Bonferroni correction (Table S5).

FIGURE 1 Efficacy of repeated intranasal oxytocin (OT) for the core symptoms of patients with schizophrenia. Forest plots of the standardized mean change using raw score standardization (SMCR) effect sizes separated by symptom domain. Negative effect sizes favour intranasal OT whereas positive effect sizes favour placebo. Note that here we show forest plots and summary effect sizes obtained after excluding the influential studies identified in the total psychopathology, positive and general symptoms sets of studies (Cook's distance >0.5). CI, confidence interval; ES, effect size

FIGURE 2 Moderators of the effects of repeated intranasal oxytocin (OT) on schizophrenia general symptoms. Meta-scatter plots depicting the relationship between percentage of male patients included in the studies meta-analysed (a), total duration of treatment in weeks (b) and the effect size (standardized mean change using raw score standardization, SMCR) quantifying the response of schizophrenia general symptoms to intranasal OT. Negative effect sizes favour intranasal OT whereas positive effect sizes favour placebo

FIGURE 3 Efficacy of repeated intranasal oxytocin (OT) for the core symptoms of patients with autism spectrum disorders. Forest plots of the standardized mean change using raw score standardization (SMCR) effect sizes separated by symptom domain. Negative effect sizes favour intranasal OT whereas positive effect sizes favour placebo. Note that here we show forest plots and summary effect sizes obtained after excluding one influential study identified in the repetitive/stereotypical behaviours set of studies (Cook's distance >0.5). CI, confidence interval; ES, effect size

General symptoms
We did not find any significant differences in variability between the repeated intranasal OT and placebo groups (pooled effect [...] (Figure 4d), but we no longer found significant heterogeneity (Q(6) = 8.06, p = 0.09). We did not find any significant moderator (Table S12). We could not find evidence of publication bias (Kendall's τ = −0.33, p = 0.47) (Figure S7D).

FIGURE 4 Heterogeneity in the response of the core symptoms of patients with schizophrenia to repeated intranasal oxytocin (OT). Forest plots of coefficient of variation ratio (CVR) effect sizes separated by symptom domain. CVRs above 1 indicate higher variance in the intranasal OT than in the placebo group. Note that here we show forest plots and summary effect sizes obtained after excluding the influential studies identified in the total psychopathology, positive and general symptoms sets of studies (Cook's distance >0.5). CI, confidence interval; ES, effect size

FIGURE 5 Heterogeneity in the response of the core symptoms of patients with autism spectrum disorders (ASD) to repeated intranasal oxytocin (OT). Forest plots of coefficient of variation ratio (CVR) effect sizes separated by symptom domain. CVRs above 1 indicate higher variance in the intranasal OT than in the placebo group. Note that here we show forest plots and summary effect sizes obtained after excluding the influential studies identified for the global symptoms and repetitive/stereotypical behaviours sets of studies (Cook's distance >0.5). CI, confidence interval; ES, effect size

Risk of bias
The overall risk of bias of our meta-analyses was low for both schizophrenia and ASD (Figures S3 and S4).

Individual ASD patient data
Using data from Munesue et al. (2016), we found evidence of non-unimodality only for global symptoms (as measured using the CARS) in the repeated intranasal OT group, but not in the placebo group. However, this was driven by a single patient. For repetitive/stereotypical behaviours and social functioning we did not find evidence of deviation from unimodality in either the repeated intranasal OT or placebo groups (Figure S9). Using data from Bernaerts et al. (2020), we did not find evidence of deviation from unimodality, in either the repeated intranasal OT or placebo groups, for either repetitive/stereotypical behaviours or social functioning (Figure S10).

Statistical power simulations
Our power simulations demonstrated that at best our meta-analyses would have been sufficiently powered to detect medium to large, but not small, pooled effect sizes, given the amount of data we included and the overall heterogeneity panorama of our samples (Tables S16 and S17).

| DISCUSSION AND CONCLUSIONS
We conducted two sets of meta-analyses of all RCTs investigating the effects of a daily-repeated intranasal OT treatment regime on the core symptoms of patients with schizophrenia and ASD, using gold-standard methods to investigate efficacy, and heterogeneity in the response to treatment. Using the largest pool of data to date, we show that repeated treatment with intranasal OT may have a small effect in improving general symptoms of patients with schizophrenia, but has no effects on any other symptom category in schizophrenia or ASD. Additionally, we report evidence of heterogeneity in the response to repeated treatment with intranasal OT for negative symptoms in schizophrenia, suggesting that, despite lack of evidence of treatment effects at the group level, beneficial effects might still occur in a subset of responsive patients.
However, we did not find evidence of heterogeneity in the response to repeated treatment with intranasal OT for other core symptoms of schizophrenia, or any of the core symptom dimensions in ASD (beyond a tentative decrease in variance after repeated intranasal OT, as compared with placebo, for repetitive/stereotypical behaviours). We discuss each of our main findings below.
Apart from general symptoms in schizophrenia, daily-repeated treatment with intranasal OT did not ameliorate or exacerbate any of the other symptom dimensions of schizophrenia and ASD. For general symptoms in schizophrenia, we found a small, but significant, positive effect of intranasal OT in reducing symptoms, as compared with placebo, after excluding one influential study. We note that general symptoms, as assessed by the PANSS, include many aspects of human behaviour for which effects of intranasal OT have gathered supportive evidence in a number of studies. These are, for instance, anxiety, preoccupation, tension or active social avoidance (Kay et al., 1987). Indeed, the anxiolytic effects of OT are among the most well replicated effects in both humans (Neumann & Slattery, 2016) and rodents (Yoshida et al., 2009). Although our meta-analysis did not investigate effects on specific symptoms (these data are not typically reported), our finding of a beneficial effect of intranasal OT in reducing general symptoms in patients with schizophrenia is broadly consistent with the well-established role of OT in regulating neurocognitive processes potentially involved in the genesis of these symptoms, i.e. regulation of anxiety and the stress response (Neumann & Slattery, 2016). Based on our findings, we suggest that future larger intranasal OT trials in schizophrenia should pay particular attention to general symptoms.
Most previous meta-analyses have concluded that repeated treat- [...] studies. However, we urge caution in interpreting these findings because these significant moderating effects were driven by two studies using large 80 IU daily doses, and one of these studies was deemed influential in both meta-analyses. These findings suggest that further work is required to elucidate dose-response curves for improvements in psychopathology in patients with schizophrenia before drawing conclusions with any confidence.
Interestingly, we also found that the percentage of male patients included and total duration of treatment were significant moderators of the effects of repeated treatment with intranasal OT on general symptoms of schizophrenia. Here, studies including higher percentages of male patients tended to report larger beneficial effects of intranasal OT; studies with longer periods of treatment administration tended to report smaller improvements after intranasal OT. These findings are interesting for two reasons. First, gender is a well-known moderator of the effects of intranasal OT (Lynn et al., 2014; Xu et al., 2017), including on anxiety and stress response (Steinman et al., 2016), which contribute to the genesis of general symptoms in schizophrenia. For instance, although studies in men have consistently reported anxiolytic effects of intranasal OT (Heinrichs et al., 2003; Spengler et al., 2017), studies in women have reported opposite effects (Domes et al., 2007, 2010; Lischke et al., 2012). Hence, it is plausible that intranasal OT might attenuate general symptoms more strongly in male patients, but have no effects on, or even exacerbate, these symptoms in female patients. This would fit well with our observation that studies including higher percentages of male patients report larger improvements in general symptoms after intranasal OT. Further studies should investigate this hypothesis.
Second, there has been intense debate about the optimal scheme of administration when treatment with intranasal OT involves repeated applications (Horta et al., 2020). Given the high propensity of the OT receptor to desensitization (Robinson et al., 2003), long-term exposure to high concentrations of OT might down-regulate the OT signalling machinery and thus attenuate the effect of OT on brain function. Down-regulation of OTR expression after chronic exposure to OT has been clearly demonstrated in rodents (Huang et al., 2014; Pisansky et al., 2017). Supporting this idea, a recent study investigating the impact of dose frequency on the effects of intranasal OT in healthy volunteers found that the repeated administration of intranasal OT every other day over five days was more effective in dampening the response of the amygdala to negative affect, when compared with daily administrations (Kou et al., 2020). This hypothesis fits well with our observation that studies with longer periods of administration reported smaller improvements in general symptoms after intranasal OT. It is imperative that future studies systematically investigate the efficacy of different dosing and administration-frequency regimes for RCTs using repeated applications of intranasal OT. Suboptimal regimes can only increase inconsistencies in reported findings, reducing our confidence in the role of intranasal OT as a potential treatment for psychopathology.
For the first time, using meta-analyses of variance to investigate whether a repeated intranasal OT treatment regime might have beneficial effects for subsets of responsive patients, we report evidence of increased variance in the intranasal OT group, compared with the placebo group, for negative symptoms in schizophrenia. This suggests the presence of heterogeneity in treatment response, which in turn might reflect the existence of a subgroup of patients that respond positively to intranasal OT with respect to improvement in negative symptoms, even though the overall mean group effect was not significant. Our meta-analysis cannot identify which factors might determine which patients will respond to intranasal OT in terms of negative symptoms. This needs to be elucidated in future RCTs, for example by using a repeated N-of-1 design, where the effects of treatment are studied by following an individual patient over time with the treatments given being randomised from period to period (Araujo et al., 2016). Parsing this heterogeneity will be important to identify treatment response biomarkers which could guide the selection of patients with higher likelihood of responding to intranasal OT. This is particularly important for negative symptoms, which currently still lack highly effective treatments (Remington et al., 2016). In the absence of this knowledge, future RCTs based on heterogeneous cohorts of patients will, most likely, result in inconclusive findings and missed opportunities to advance the treatment of negative symptoms in schizophrenia by targeting the OT system.
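To make the variance comparison concrete, the following is a minimal sketch of one commonly used effect size for meta-analyses of variance, the log variability ratio (lnVR), with its small-sample bias correction and sampling variance. It is an illustration under stated assumptions, not necessarily the exact estimator used in our analyses; the function name and the example numbers are hypothetical.

```python
import math


def log_variability_ratio(sd_treat, n_treat, sd_ctrl, n_ctrl):
    """Return (lnVR, sampling variance) for two independent groups.

    lnVR > 0 means more variability in the treatment group than in the
    control group; lnVR = 0 corresponds to a variance ratio of one.
    """
    if n_treat < 2 or n_ctrl < 2:
        raise ValueError("each group needs at least two participants")
    if sd_treat <= 0 or sd_ctrl <= 0:
        raise ValueError("standard deviations must be positive")
    # Small-sample correction terms for each group.
    corr_treat = 1.0 / (2.0 * (n_treat - 1))
    corr_ctrl = 1.0 / (2.0 * (n_ctrl - 1))
    ln_vr = math.log(sd_treat / sd_ctrl) + corr_treat - corr_ctrl
    sampling_var = corr_treat + corr_ctrl
    return ln_vr, sampling_var


def variability_ratio_ci(ln_vr, sampling_var, z=1.96):
    """Back-transform lnVR and its normal-approximation CI to the ratio scale."""
    half_width = z * math.sqrt(sampling_var)
    return (math.exp(ln_vr - half_width),
            math.exp(ln_vr),
            math.exp(ln_vr + half_width))


# Hypothetical single study (made-up numbers, for illustration only):
# post-treatment SD of 6.0 in the OT arm vs 4.5 in the placebo arm, n = 30 each.
ln_vr, var = log_variability_ratio(6.0, 30, 4.5, 30)
lower, ratio, upper = variability_ratio_ci(ln_vr, var)
print(f"VR = {ratio:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Study-level lnVR values and their sampling variances can then be pooled with standard inverse-variance meta-analytic models. When group means differ substantially, a ratio of coefficients of variation may be preferable, because standard deviations often scale with the mean.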
Although the exact factors underlying the heterogeneity in the response to intranasal OT treatment for negative symptoms in schizophrenia remain unclear, we can postulate some hypotheses here. First, negative symptoms encompass varied behavioural problems that include blunted affect, emotional withdrawal, poor rapport, apathetic social withdrawal, difficulties in abstract thinking, lack of spontaneity and stereotyped thinking (Kay et al., 1987). It is conceivable that given the preferential effects of OT on social affiliation (Feldman, 2012) Instead, it is more likely that variance in these cases simply reflects inter-individual differences in the course of symptom progression, which was similar for the two groups. In ASD, this conclusion was further supported by analyses of individual patient data from two RCTs, where we did not find systematic evidence of non-unimodality in the response to intranasal OT. We note, though, that for repetitive/stereotypical behaviours in ASD, we found tentative evidence for decreased variance after repeated intranasal OT, as compared with placebo, suggesting that treatment with repeated intranasal OT might make repetitive/stereotypical behaviours in these patients more uniform. One possible explanation is that repeated intranasal OT might have had a stabilizing effect (Cortes et al., 2018). One example of such variance stabilization might be ceiling or floor effects, where the assessment instrument is too coarse to capture patients' improvements beyond a certain level (Cortes et al., 2018). Interestingly, duration of treatment was a significant moderator of variance in the response of repetitive/stereotypical behaviours to repeated intranasal OT: studies with longer treatment durations tended to report less variance than those with shorter durations. This observation is in keeping with the idea that treatment duration might play an important role in moderating the response to repeated intranasal OT.
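A moderator analysis of this kind can be thought of as a meta-regression of study-level effect sizes (for example, log variability ratios) on a study-level covariate such as treatment duration. The sketch below assumes a fixed-effect, inverse-variance-weighted regression with a single moderator, which is a simplification: random-effects meta-regression additionally estimates between-study variance. The function name and all numbers are hypothetical and are not data from the included trials.

```python
import math


def weighted_meta_regression(moderator, effect, sampling_var):
    """Inverse-variance-weighted regression of effect sizes on one moderator.

    Returns (intercept, slope, standard error of the slope) under a
    fixed-effect model in which the sampling variances are treated as known.
    """
    if not (len(moderator) == len(effect) == len(sampling_var)):
        raise ValueError("all inputs must have the same length")
    weights = [1.0 / v for v in sampling_var]
    total_w = sum(weights)
    # Weighted means of the moderator and of the effect sizes.
    x_bar = sum(w * x for w, x in zip(weights, moderator)) / total_w
    y_bar = sum(w * y for w, y in zip(weights, effect)) / total_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, moderator))
    if sxx == 0:
        raise ValueError("the moderator must vary across studies")
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, moderator, effect))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope


# Hypothetical studies (made-up values, for illustration only):
# treatment duration in weeks, lnVR, and sampling variance of lnVR.
duration_weeks = [4, 6, 8, 12]
ln_vr = [0.30, 0.20, 0.10, -0.10]
ln_vr_var = [0.04, 0.05, 0.03, 0.06]

intercept, slope, se = weighted_meta_regression(duration_weeks, ln_vr, ln_vr_var)
print(f"slope per week = {slope:.3f} (SE {se:.3f})")
```

A negative slope in such a model would correspond to the pattern described above, with longer treatment durations associated with lower relative variance in the intranasal OT group.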
Certain limitations of our study and of existing RCTs provide caveats that need to be considered, and where possible investigated in future studies, before any firm conclusions can be drawn. First, the number of intranasal OT RCTs is still relatively small for both disorders. We also note that these studies are mostly small-scale, including rather small cohorts of patients. These two aspects pose important limitations to statistical power. According to our statistical power simulations, our meta-analyses were sufficiently powered to detect medium-to-large, but not small, effect sizes if they existed. Second, although we found no robust evidence for differences in variance between intranasal OT and placebo for most of the schizophrenia and ASD symptom domains that we examined, intervention-by-individual interactions are still possible even if the variance ratio is equal to one. This would correspond to a scenario where some subsets of patients show high variance in their response, whereas in others, treatment leads to a more uniform response (i.e. less variance). However, at least for ASD, this is unlikely given that in our analyses of modality of individual patient data from two RCTs, we could not find systematically robust evidence of deviations from unimodality in symptoms. Third, we were not able to investigate the effects that concomitant treatments (e.g. antipsychotics) might have on the response to OT because descriptions of medication regimens were too sparse and too varied in the primary studies. Fourth, although not specific to this study, we should highlight that, as a general limitation in the field, the exact pharmacological actions driving potential changes in symptoms in patients, if they exist, remain unclear. There has been intense debate on whether intranasal OT modulates behaviour through direct effects on the brain, effects on peripheral organs or both (Leng & Ludwig, 2016).
Supporting the central effects hypothesis, recent evidence has shown that (i) exogenous OT, when administered intranasally, can reach the brain in primates (Lee et al., 2020); (ii) intranasal OT modulates brain physiology beyond those effects produced by intravenous OT (Martins et al., 2020; Martins, Dipasquale, et al., 2021); and (iii) the effects of intranasal OT on the physiology of the living human brain can be predicted by the distribution of OXTR mRNA expression in the post-mortem human brain (Martins, Broadmann, et al., 2021), which provides indirect support for the idea that intranasal OT can target the brain OT system in humans. However, we have also recently shown that exogenous OT, when administered intravenously, can modulate brain physiology (Martins et al., 2020) and connectivity with patterns that partly overlap with those changes produced by intranasal OT. The exact mechanisms driving these changes in the brain after intravenous OT remain unknown, but recent studies in rodents have shown that they might involve signalling through the vagus nerve (Everett et al., 2021) or RAGE-mediated transport across the blood-brain barrier (Yamamoto et al., 2019). The exact source of the effects of intranasal OT on the brain and behaviour will need to be dissected in future studies combining intranasal OT with selective brain-penetrant and non-brain-penetrant antagonists to isolate the potential contribution of peripheral actions, if they exist.
In summary, our meta-analysis presents evidence for a limited therapeutic effect of a repeated intranasal OT treatment regime, possibly restricted to general symptoms in schizophrenia. We identify a number of important factors, such as gender and the dose, frequency, pattern and duration of repeated intranasal OT administration, whose effects must be elucidated in future studies. Our study further reports that for negative symptoms in schizophrenia, despite the lack of effects at the group level, there is evidence for a heterogeneous response to treatment in patients treated with intranasal OT. This is compatible with the idea that at least some patients might respond positively to intranasal OT. Further studies dissecting this heterogeneity will be paramount in providing a better appraisal of the potential of intranasal OT in treating negative symptoms in schizophrenia.