Executive Function Skills Are Linked to Restricted and Repetitive Behaviors: Three Correlational Meta Analyses

There is a consensus on the centrality of restricted and repetitive behaviors (RRBs) in the diagnosis of Autism Spectrum Disorder (ASD), yet the origins of these behaviors are still debated. We ask whether executive function (EF) accounts of RRBs should be revisited. EF deficits and high levels of RRBs are often pronounced in individuals with ASD and are also prevalent in young typically developing children. Despite this, the evidence is mixed, and there has been no systematic attempt to evaluate the relationship across studies and between task batteries. We review recent evidence and, in three highly powered random-effects analyses (N = 2964), examine the strength of the association between RRB levels and performance on set shifting, inhibitory control, and parental-report based EF batteries. The analyses confirm significant associations between high levels of the behaviors and poor EF skills. Moreover, the associations remained stable across typical development and in individuals with ASD and across different types of EF measures. These meta-analyses consolidate recent evidence identifying that cognitive mechanisms correlate with high RRBs that are seen in individuals with ASD, as well as in typical development. We propose that the EF account may be critical for guiding future interventions in ASD research.


Introduction
Many major puzzles in our understanding of Autism Spectrum Disorder (ASD) surround the nature of one of the two central diagnostic features, restricted and repetitive behaviors (RRBs) [American Psychiatric Association, 2013]. In particular, the literature is unclear about how these behaviors relate to other aspects of the diagnosis, including comorbid social-communicational problems and underlying cognitive skills [e.g., Lopez, Lincoln, Ozonoff, & Lai, 2005; Van Eylen, Boets, Steyaert, Wagemans, & Noens, 2015]. In the 1990s great expectations were placed on cognitive models to explain the joint components of the diagnosis, notably theory of mind [Baron-Cohen, 1995], central coherence [Frith, 1989, 2008], and executive function (EF: Russell, 1997). The first two of these cognitive accounts are either agnostic about the link between cognitive factors and RRBs [e.g., Frith, 2008] or simply suggest that the explanation of the social/communication problem, like undeveloped "theory of mind", is "related" to these RRBs [Baron-Cohen, 2000, pp. 78-79] without explanation. The third set of theories has been more explicit about hypothesized links between EF skills and these behaviors [Russell, 1997]. However, more recent analyses raise doubts about whether this third candidate theory can explain RRBs, or even the social and communicative difficulties in ASD. We reopen this debate, first by outlining the centrality of RRBs to the current diagnosis of ASD, then by summarizing a shift in the theoretical focus of accounts of the nature of these behaviors, playing down these cognitive accounts. We re-evaluate the possible role of EF and present three meta-analyses to re-examine the evidence for a possible role of EF skills in this neglected diagnostic feature of ASD.
Restrictive and Repetitive Behavior in Autism Spectrum Disorder
[...] of this neurodevelopmental disorder. Kanner [1943, p 245, his italics], for example, writes of an "anxiously obsessive desire for the maintenance of sameness," and it is not surprising that contemporary analyses of ASD show a high comorbidity between ASD and anxiety problems [e.g., Rodgers, Glod, Connolly, & McConachie, 2012; Lidstone et al., 2014]. Since the 1990s (DSM-IV: APA, 1994; ICD-10: WHO, 1995), RRBs have been divided into four sub-groups: stereotypies, preoccupation with objects, restricted interests and nonfunctional routines. They are manifest in actions that range from rocking and hand flapping to very specific food and routine preferences, such as eating only pizza. Parents often identify these behaviors as the most challenging ASD characteristics to manage [e.g., South, Ozonoff, & McMahon, 2007], as they can create barriers to learning opportunities and social interactions [Harrop, McBee, & Boyd, 2016]. Analysis of the origins and nature of RRBs can guide research on the outcomes of the disorder, and help design interventions to remediate these behaviors. We conducted a search of RRBs in the ISI (Clarivate Analytics) Search Engine. Following the definitional changes in the DSM-5 (APA, 2013), output more than doubled: 767 articles in the period since the change (2014 to April 2020) compared to 258 articles in the 6 years between 2008 and 2013. The rapid increase in research has led to more thorough analyses, often concerning how the broad range of RRBs cluster together. This has highlighted the lack of a universal definition of the term. Factor analytic studies, for example, increasingly divide RRBs into dichotomous groups: "low-" and "high-level" behaviors [Cuccaro et al., 2003; Honey, McConachie, Turner, & Rodgers, 2012].
The "low-level" RRBs consist of motor actions like rocking or hand flapping and a preoccupation with objects (including collecting unusual items, like fluff from carpets), whereas the "high-level" behaviors consist of restricted interests and nonfunctional routines, like obsessively repeating facts about a special interest, such as Star Wars or Harry Potter [Turner, 1999].

Measurement Issues
There is no "gold standard" RRB measure. Widely used questionnaires implement a variety of response methods ranging from calculating frequency or intensity to identifying which behaviors are repeated [Honey et al., 2012]. Some also include several of these metrics, making it difficult to compare the same behaviors, let alone different ones, across different measures [South et al., 2007]. In addition to the various metrics, types of assessments also differ. Although some measures comprise observations and interviews used to diagnose ASD (e.g., the Autism Diagnostic Observation Schedule, ADOS: Kim & Lord, 2010, and the Autism Diagnostic Interview-Revised, ADI-R: Rutter, Le Couteur, & Lord, 2003), others involve parental questionnaires that were created for the sole purpose of assessing RRBs (e.g., the Repetitive Behavior Questionnaire, RBQ: Turner, 1995). To complicate matters further, some scales such as the ADI-R consist of 12 items, while others, like the Repetitive Behavior Scale-Revised (RBS-R: Lam & Aman, 2007), include up to 44 items divided into as many as six sub-scales. This diversity between measures has made it difficult to draw any conclusions concerning which of the existing tools are sensitive enough to capture the wide variety of RRBs. This is especially the case as most measures have not been used frequently enough to test and analyze their concurrent and construct validity [Honey et al., 2012]. This meta-analysis explores the diversity between measures in more depth to examine whether such differences should be considered further. Evidence for measurement difficulties can be seen in South et al.'s [2007] study, assessing one group of individuals with ASD on three measures (ADOS, ADI-R, and the Repetitive Behavior Interview, RBI: Turner, 1997). They found concurrent validity in terms of associations between their cognitive flexibility measures and RRBs using the ADOS and ADI-R, but not with the more specific RBI.
These differences could be caused by the different levels of detail in each measure. Specifically, the ADI and ADOS are commonly used to diagnose ASD and hence rely on fewer questions, whereas the RBI is more comprehensive and created for the sole purpose of assessing the nature and extent of RRBs. Not only do findings like these imply that measurement issues may have negative implications for our understanding of the construct itself, they also stress the need for a systematic review of the RRB tools. Given the complexity of RRBs, it is perhaps naive to assume that we can develop one "gold standard" measure. Nonetheless, researchers should consider the different features of the available measures against a range of criteria, and in the light of the specific question they are asking [Honey et al., 2012]. This would then also make it easier to evaluate the various theoretical accounts in the field.

The Move Toward a More Complete Account of Restrictive and Repetitive Behaviors
The past two decades have witnessed a shift of emphasis within theoretical analyses of the nature and origins of RRBs. For the decades before and after Russell's [1997] influential analysis, many researchers focused on the link between a delay in the control of action and the persistence of these behaviors. The typical increase in both lower- and higher-level behaviors toward the end of the preschool period and decline thereafter coincides with the manifestation of EF skills. This close correlation suggested a causal relationship [Turner, 1997]. Over the past decade this view has received much critical scrutiny: "Taking a developmental perspective, it seems unlikely that EF could have a direct causal role since RRBs emerge so early in typical development, hence it may be more appropriate to consider the effect of repetitive behaviors on neurocognitive functioning, than any causal role" [Leekam, Prior, & Uljarević, 2011, p 578].
In this section, we review an alternative to the EF theory which is hinted at in this quotation: the developmental trajectory account. We will argue that neither is incompatible with an EF approach: indeed this review was motivated by the possibility that the EF account could benefit from a more holistic combination with this area of theorization. Leekam et al.'s [2011] developmental account suggests that RRBs are immature responses that are maintained more strongly within the behavioral repertoire of individuals with ASD. To explain this process, they suggest that neurobiological changes must be traced alongside behavioral ones. They highlight the importance of the development of the corticostriatal circuits in early childhood. This account is largely based upon Thelen's [1979] perspective on skilled motor action in which a high prevalence of stereotypies in the first year of life is caused by slow cortical maturation, as motor actions are not yet under voluntary control [Tinbergen, 1951]. At the end of the first year, motor behaviors become more goal directed, and RRBs more varied. This suggests that RRBs are more likely to be released within specific events which provoke extreme arousal states (high or low), and that triggers for RRBs need to be understood within a context that balances developmental and environmental factors. Leekam et al. propose that Thelen's account can be applied to the broader category of RRBs in ASD. Accordingly, these behaviors are immature responses that are a normal part of early development, which come increasingly under control as infants begin to develop goal-directed actions. The developmental approach would benefit from being linked to a cognitive model that explains how repetitive behavior changes with age. Leekam et al.'s [2011] observation that stereotypies reduce over time in typical development is widely supported in the literature [e.g., Çevikaslan, Evans, Dedeoğlu, Kalaça, & Yazgan, 2013].
It is plausible that typically developing infants' acquisition of goal-directed actions explains the reduction in lower-level behaviors, but this theory is less likely to account for the higher-level RRBs, as these follow a different trajectory, in which they first increase, then decline around the age of 5-6 [e.g., Çevikaslan et al., 2013; Evans et al., 1997]. Without an additional dimension, this account would struggle to explain what purpose the higher-level RRBs have, and what it is that drives such changes.
In addition to focusing on their developmental trajectory, Leekam et al. [2011] suggest that RRBs become more likely to be triggered by specific events, particularly extreme physiological arousal states (high or low). This echoes an early RRB account that they are caused by hyper- or hypo-arousal [Hutt & Hutt, 1965]. The hyperarousal prediction suggests that these behaviors are coping mechanisms that develop to reduce high arousal or anxiety. Goodall and Corbett [1982] expanded this theory by proposing that RRBs may develop to regulate under-arousal caused by a lack of stimulation from the environment. The suggestion that anxiety plays a central role in RRBs is perhaps not surprising, as it was highlighted in Kanner's [1943] original article. A recent meta-analysis by Steensel, Bögels, and Perrin [2011], however, refocused interest in this account by identifying that as many as 40% of individuals with ASD also met the criteria for an anxiety disorder. Indeed, higher-level RRBs are commonly associated with anxiety levels in typically developing children [Evans, Gray, & Leckman, 1999; Laing, Fernyhough, Turner, & Freeston, 2009; Zohar & Felz, 2001] and children with ASD [Uljarević & Evans, 2017]. Findings like these have suggested that higher-level RRBs serve the purpose of controlling the environment and thus reducing anxiety. Some support for this can be seen in a study by Lydon, Healy, and Dwyer [2013] that investigated heart rate before, during, and after challenging behaviors in three children diagnosed with ASD. Their findings suggest that, for some individuals with ASD, the behaviors serve to increase arousal and to sustain a preferred state of heightened arousal. Despite the interesting focus, these results do not reveal a causal pathway through which arousal and anxiety lead to the manifestation of these behaviors.
This highlights the need to reopen the EF account, as it is possible that anxiety and RRBs are part of a causal chain in which poor cognitive control leads to hyperattentiveness to negative information, creating anxiety and then RRBs [e.g., Spiker, Lin, Van Dyke, & Wood, 2012].
The emphasis on goal-directed actions in the developmental account leads easily to the proposal that the different RRB trajectories are driven by an individual's EF skills. A widely cited definition of EF skills is "the ability to maintain an appropriate problem-solving set for attainment of a future goal" [Ozonoff, Pennington, & Rogers, 1991, p. 1083]. Examples of these skills are planning, inhibitory control, and the flexibility of thought and action. Although Leekam et al. suggest that EF deficits are not vital for the development of RRBs, they stress the importance of corticostriatal circuits. As Langen et al. [2011, p. 2] state, "cognitive models have provided valuable hypotheses for how neurobiological circuitry might be disturbed in repetitive behaviour". Given that the main function of the corticostriatal circuit is to control goal-directed behavior, this statement points to EF processes, or even bidirectional influences between behavior and neurophysiology. Evans, Lewis, and Iobst [2004] suggest that variable EF skills and RRB trajectories across disorders may be caused by how different cognitive processes are governed by different regions of the orbitofrontal cortex. There is ample support for variable RRB levels across disorders. For example, individuals with Williams Syndrome have been found to engage in more stereotypies than those with Prader-Willi syndrome [Royston et al., 2018]. Individuals with ASD and OCD are thought to engage in significantly more RRBs than typically developing children [Zandt, Prior, & Kyrios, 2009]. Similar variability has been found in the EF literature as the same study by Zandt, Prior, and Kyrios found that individuals with ASD performed better on inhibitory control tasks than individuals with OCD. The EF hypothesis could then possibly account for the frequency of RRBs in ASD, as individuals with ASD may be more impaired on EF skills.
At the same time, it can help explain the wide range of these behaviors, as particular skills may be responsible for individual RRBs. Moreover, it can account for the heterogeneity within disorders, as well as the change in RRBs across typical and atypical development.

Do Individuals with ASD Show EF Impairments?
One reason why the link between EF and RRBs has been contested is that despite extensive research, numerous reviews, and meta-analyses on the definition of EF impairments in ASD over the past 15 years, the role that these skills play in the disorder remains unclear. For example, Geurts, Corbett, and Solomon [2009] discerned no firm evidence in 29 studies for a cognitive flexibility deficit in adults with ASD. They focused largely on tasks that they considered to have high ecological validity, using mechanistic approaches (e.g., task switching paradigms that warned participants about a rule change, and presented switch trials throughout the task). However, they found clear impairments on tasks that did not meet this criterion, such as the Wisconsin Card Sorting Task (WCST: Berg, 1948). Despite these positive results, Geurts et al.'s [2009] article has been widely cited as evidence against the EF hypothesis (Web of Science citations = 162) and it steered some researchers away from the EF explanation.
Perhaps paradoxically, Geurts and colleagues' subsequent research has identified an EF profile in ASD. In several meta-analyses, they have found strong prepotent response inhibition and interference control inhibition impairments [n = 41, g = 0.55 and 0.31, respectively; Geurts, Van Den Bergh, & Ruzzano, 2014], as well as planning impairments (n = 50, g = 0.52) [Olde Dubbelink & Geurts, 2017] in individuals with ASD. These positive links with inhibition and planning cast doubt on the suggestion that individuals with ASD are unimpaired on EF skills. In addition, recent meta-analyses by Lai et al. [2017] and Demetriou et al. [2018] find even stronger evidence for the view that overall EF performance, as well as performance on separate EF skills, is essential to control thoughts and behaviors in individuals with ASD. Demetriou et al.'s analysis consisted of 235 studies (n: ASD = 6816, Control = 7265). They found a moderate effect size (g = 0.49) for overall measures, suggesting that individuals with ASD are significantly more impaired on EF than control groups. This effect also applied evenly across the 6 individual EF domains (concept formation, mental flexibility, fluency, planning, inhibition, and working memory; g = 0.46-0.55). Lai et al.'s analysis, on the other hand, of 98 studies (n = 5991, ASD = 2985, Control = 3005), concentrated on a narrower age range of younger children and adolescents. Another difference was that Lai et al.'s analysis only examined individual EF domains (verbal and spatial working memory, flexibility, inhibition, generativity, and planning). Like Demetriou et al., they found moderate to strong effect sizes for all individual skills (g = 0.57-0.67), although a lower inhibition effect (g = 0.41). These recent and more thorough analyses suggest that EF impairments play a crucial role in ASD and that separate skills are important, despite controversy over what overarching EF is.
Thus, it is now more relevant to examine the theoretical and clinical implications of the EF account. Initially we wanted to examine the associations between RRB levels and Miyake et al.'s [2000] three "foundational" EF skills: set shifting, inhibitory control and working memory. Unfortunately, not enough studies (<10) have examined the relationship between RRBs and working memory, so our analyses focus only on set shifting and inhibitory control.

Are EF Skills Related to the High Levels of RRB in ASD?
A spurt of new research offers renewed support linking elevated RRB levels to EF impairments such as set shifting [e.g., Jones et al., 2018; Miller, Ragozzino, Cook, Sweeney, & Mosconi, 2015], inhibitory control [e.g., Jones et al., 2018; Thakkar et al., 2008; Mosconi et al., 2009], and planning [e.g., Van Eylen et al., 2015]. Jones et al. [2018], for example, investigated the relationship between RRBs and multiple EF skills in 100 adolescents with ASD and found significant associations with set shifting and inhibitory control, but not planning. Similarly, Miller et al. [2015] found that in individuals with ASD their overall set shifting errors predicted RRB levels. Studies like these have led to the call for set shifting interventions to remediate RRBs in ASD [e.g., Mostert-Kerckhoffs, Staal, Houben, & de Jonge, 2015], and highlight the need to reopen the EF account. Some studies suggest that we need to consider EF skills in combination with genetic components, specifically, the need for gene-brain-behavior models of ASD using set shifting [Yerys et al., 2009] or inhibitory control [Thakkar et al., 2008] as possible links between the components. Thakkar et al., for example, found that elevated RRB levels in ASD participants related to hyperactive response monitoring in the rostral anterior cingulate cortex (rACC) during an antisaccade task. These findings complement a genetic account (see Lewis & Kim, 2009) but also highlight the importance of cognitive factors, strengthening the view that the EF account must be re-examined. Such links, however, are not pervasive, as some recent investigations have also failed to find relationships between RRBs and set shifting [Ozonoff et al., 2004], inhibitory control [Joseph & Tager-Flusberg, 2004] and planning [e.g., Jones et al., 2018].
The inconsistent literature makes it timely to examine the relationships further through a meta-analytic framework to assess the strengths of the proposed relationships and evaluate if EF interventions may have the potential to help manage challenging RRBs.

Task Impurity
Several explanations have been given for the inconsistent findings in the literature. First, like RRBs, EF measures have consistently been scrutinized in terms of their ecological validity [e.g., Kenworthy, Yerys, Anthony, & Wallace, 2008;Rabbitt, 1997]. Given that the executive system incorporates a variety of skills [Miyake et al., 2000] it is not surprising that psychometric measures need to accommodate such diversity. Geurts et al.'s [2014] meta-analysis, for example, confirmed that WCST impairments that relate to RRBs may identify cognitive inflexibility, but might also identify difficulties with staying on task, learning from feedback and/or inhibiting irrelevant information. EF tasks have commonly been criticized for their complex structures [Burgess, Alderman, Evans, Emslie, & Wilson, 1998], and the impure nature of the WCST task has been highlighted as a clear example. It may also tap into cognitive flexibility [Everett, Lavoie, Gagnon, & Gosselin, 2001], working memory [Medalia, Revheim, & Casey, 2001], and inhibitory control [Geurts et al., 2009] skills. Nevertheless, Miyake et al.'s [2000] confirmatory factor analysis identified that the WCST task loaded onto the factor 'shifting' and not the two other factors. Thus, the overall conclusion that there are no clear shifting impairments in individuals with ASD may be mistaken.
Findings like these led to the development of EF rating scales completed by parents or teachers, such as the Behavior Rating Inventory of Executive Function (BRIEF: Gioia, Isquith, Guy, & Kenworthy, 1996). Whereas psychometric tasks require a response to a single event and are conducted in carefully controlled environments, EF performance in the real world involves a stream of tasks [Dawson & Marcotte, 2017]. The BRIEF consists of two smaller scales, the Behavioral Regulation Index (BRI) and the Metacognition Index (MI). The BRI consists of four skills: Shift, Inhibit, Self-Monitoring and Emotional Control. The MI comprises five skills: Plan/Organize, Initiate, Task Monitoring, Working Memory, and Organization of Materials. The outcomes on both scales of the BRIEF have been found to be consistent with clinical expectations; they correlate with biological markers, and even show predictive relationships with academic skills [Isquith, Roth, & Gioia, 2013]. This leads to a second possible reason for the inconsistent findings: the use of EF rating measures may have moderated the results. It has been suggested that rating scales have a higher ecological validity [Rabbitt, 1997], and consequently may be the only measures that can reliably predict EF impairments. To date, many studies find that parent-based EF ratings are not correlated with scores on performance-based EF tasks in children with ASD [Teunisse et al., 2012] or other neurodevelopmental disorders such as ADHD [Mahone & Hoffman, 2007; Toplak, Bucciarelli, Jain, & Tannock, 2008]. Mahone and Hoffman, for example, found that 2-5-year-old children with ADHD improved significantly with age on performance-based EF measures. In contrast, parent ratings remained relatively stable during this age range. These patterns again suggest that parent ratings are measuring a different aspect of the EF construct than most performance-based measures.
Despite widely reported concerns like these, researchers often interpret the findings in rating scales and performance-based tasks in the same way. This may be problematic, as Toplak, West, and Stanovich [2013] not only found low reliability between scales and psychometric measures (r = 0.19) but also found that each assessed different levels of cognition, namely cognitive abilities and goal pursuit achievement. As well as highlighting the need to examine potential moderating factors further, these findings emphasize the rationale for the third meta-analysis, reported below. Parent-based ratings may be useful when behavioral problems interfere with performance-based testing, and they can provide supplemental data that may relate more to "real-life" situations. We examine if measuring EF skills by behavior vs. parental report makes a difference regarding their relationship with RRBs.

Predictions
Numerous possible explanations for RRBs have been proposed but their cause is unknown, since no hypothesis has yet stood up to rigorous evaluation. The nature of the database has shifted slightly since Leekam et al.'s [2011] review, making it appropriate to re-examine the link between EF skills and RRBs. Not only have recent studies found strong links with EF skills, there is also not enough evidence to support an alternative framework to explain the developmental trajectory of these behaviors.
Nonetheless, there is still ample evidence to suggest that some task or sample characteristics may play a key role in the relationship, even if EF impairment may not be able to explain the full picture.
In order to assess the relationships between RRBs and set shifting, inhibition and parental-report scores, a correlational meta-analytic approach was applied. This type of methodology is useful as it assesses the overall strength of relationships by combining data from all the available findings in the literature. One criticism of the approach is that analyses may combine results that are not comparable, since they have implemented different measures or statistical methods [Rosenthal & DiMatteo, 2001]. Other authors, however, argue that a certain degree of dissimilarity needs to be accepted in order to allow for generalization [Smith, Glass, & Miller, 1980]. Meta-analysis can give us an indication of the strength of a relationship, but it has been further suggested that it does not necessarily give us a clearer understanding of its nature, particularly any directions of causality. Nonetheless, correlational relationships specify important research needs and identify children who may benefit from specific interventions.
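To make the pooling step concrete, the following is a minimal sketch of how correlations from several studies can be combined under a random-effects model. It assumes Fisher's z transformation and the DerSimonian-Laird estimator of between-study variance, which is one common choice and not necessarily the estimator used in the analyses reported here. The function name `pool_correlations` and the input values are hypothetical and serve only as illustration.

```python
import math


def pool_correlations(rs, ns):
    """Random-effects pooling of correlations (Fisher z, DerSimonian-Laird).

    Assumption: this estimator is chosen for illustration only.
    """
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need at least two studies with matching r and n")
    if any(n <= 3 for n in ns):
        raise ValueError("each sample size must exceed 3")

    # Fisher z-transform each r; the sampling variance of z is 1 / (n - 3).
    zs = [math.atanh(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in vs]
    sum_w = sum(w)
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum_w
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))

    # DerSimonian-Laird estimate of between-study variance, floored at 0.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(zs) - 1)) / c)

    # Random-effects weights add tau^2 to each study's sampling variance.
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # Back-transform the estimate and its 95% interval to the r metric.
    return {
        "r": math.tanh(z_re),
        "ci95": (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se)),
        "tau2": tau2,
        "Q": q,
    }


# Hypothetical inputs for illustration only (not values from any study).
result = pool_correlations(rs=[0.10, 0.35, 0.50], ns=[40, 60, 80])
```

When the between-study variance estimate is zero, the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled correlation.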
The inconsistent literature reported above makes it difficult to make strong predictions. Previous meta-analyses indicate strong general EF impairments in ASD, apart from inhibition, in which the role is less clear [Lai et al., 2017]. It is therefore possible that we find stronger effects in the first meta-analysis to be conducted, on the relationship between set shifting and RRBs, than in our second on the links between repetitive behavior and inhibitory control. For the parental report analysis, we may find a strong overall association of EF with RRBs, as not only are we looking at the Meta-cognition Index (MI; Gioia et al., 1996) scale in which shifting and inhibition are combined, but parental report measures have also been argued to be more ecologically valid than psychometric measures [Kenworthy et al., 2008]. If we take the inconsistent evidence into account, it is also possible that we will not find significant relationships between any of the EF skills and repetitive behaviors, calling the EF hypothesis into question. If we do find an overall relationship, we need to highlight moderators that should be explored further.

Systematic Literature Search and Inclusion Criteria
This systematic review and meta-analysis were performed in accordance with the PRISMA guidelines [Moher, Liberati, Tetzlaff, & Altman, 2009] and those specifically for correlational meta-analysis [Quintana, 2015]. To collect the relevant data on the relationship between EF abilities and levels of RRBs, we searched Scopus and the ISI Search Engines [initially on 10.10.2017]. The following combinations of keywords were used: (restricted, repetitive behaviors OR stereotypies OR insistence of sameness OR circumscribed interests) AND (executive function OR set shifting OR planning OR working memory OR inhibition OR inhibitory control OR BRIEF). These produced 177 results in Scopus and 138 in ISI. We closely examined previous reviews and asked leading researchers in the field (n = 10) to provide unpublished data on the topic to avoid the risk of possible publication bias, or inaccessible data that we needed to calculate an effect size. Two provided additional data on set shifting. The results made it possible to run set shifting, inhibitory control and parental report based questionnaire analyses, but not planning and working memory, as too few studies (<10) measured the relationships between these skills and RRBs. See Figure 1 for the PRISMA flow diagram of the studies that were included in the meta-analysis, and Tables 1-3 for summary data of the studies included in the set shifting, inhibitory control and parental report analyses, respectively.

Statistical Dependence of the Samples
If an article reported multiple effect sizes, these were included and treated as separate studies if they fulfilled one of three criteria:
1. The effect sizes were independent and representative of different diagnostic groups [Borenstein, Hedges, Higgins, & Rothstein, 2010].
2. Individual differences were examined within a specific participant group (e.g., ASD individuals divided into two groups, low- and high-functioning individuals, based on their IQ scores).
3. A study assessed participants on multiple tasks that measured different EF skills (e.g., one set shifting and one inhibition task).
This rule did not apply, however, if the same participant group was tested on several set shifting or inhibitory control tasks, if a study included correlations for several task outcomes (e.g., perseverative errors and reaction time), or if participants were assessed on multiple RRB measures. To include the same comparison group in the same analysis several times would have violated the assumption of statistical independence and rendered the standard errors, and thus the confidence intervals, inaccurate. We created further inclusion criteria for our analyses when this occurred:
1. If a study reported several outcome measures, we always chose the most widely reported outcomes for our analysis, as these were better comparisons. As a result, if a study reported perseverative errors and reaction times [e.g., Dichter et al., 2010], we always chose perseverative errors; if frequency and duration were presented [e.g., LeMonda et al., 2012], we included the effect size for frequency; if it reported commission (incorrect button press) and omission (no button press) rates for set shifting scores [de Vries & Geurts, 2012], we reported the effect size for commission rates, as these are more comparable to perseverative errors and frequency scores.
2. If a study reported several correlations for different EF tasks with the same measure outcome, we included the correlation from the most widely used task. For example, in Van Eylen et al.'s [2015] study, the correlation for the WCST task was chosen over the Switch task (Rubia, Smith, & Taylor, 2007), and the Go/No-Go task [e.g., Fillmore, Rush, & Hays, 2006] was chosen over the Flanker task [Christ, Kester, Bodner, & Miles, 2011]. In Mostert-Kerckhoffs et al.'s [2015] study, effect sizes were listed for the auditory stimulus condition (SSA) and the visual stimulus condition (SSV) [tasks from the Amsterdam Neuropsychological Tasks, De Sonneville, 1999]. We reported the correlation for the visual task, since other widely used shifting tasks (e.g., the WCST) rely heavily on visual skills. Finally, in Joseph and Tager-Flusberg's [2004] study, two types of inhibition tasks were reported, the Day and Night [Gerstadt, Hong, & Diamond, 1994] and the Knock and Tap [Korkman, Kirk, & Kemp, 1998] tasks. We decided to report the correlation for the Knock and Tap task, since it relies heavily on motor skills, making it similar to the frequently reported Walk/Do not Walk task, while the Day and Night task requires good verbal skills, which are known to be compromised in ASD.
3. If participants in a study were assessed on multiple RRB measures, we included the most widely used one. For instance, in Van Eylen et al.'s [2015] study, effect sizes for the social responsiveness scale (SRS, Roeyers, Thys, Druart, Schryver, & Schittekatte, 2011) and the repetitive behavior scale-revised (RBS-R, Bodfish, Symons, & Lewis, 1999) were reported. We used the RBS-R correlation, as this is more widely used [Honey et al., 2012]. In other studies, behaviors were measured through two widely used diagnostic measures, the Autism Diagnostic Interview (ADI, Le Couteur et al., 1989) and the Autism Diagnostic Observation Schedule (ADOS, Lord et al., 1989). When a study provided correlations for both measures, we report the ADOS, as it includes a wide range of behaviors and is based on observation (following Turner, 1999). This criterion also meant that we included some nontraditional RRB measures such as the CY-BOCS [Scahill et al., 1997]. We decided to include this measure because no traditional measures were used in those studies and we wanted to include as many studies as we possibly could.
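The selection rules above can be read as a simple priority scheme: one effect size is kept per participant group and EF domain, and the most widely reported outcome wins. The following minimal Python sketch illustrates that reading. The field names, the priority table, and the example records are hypothetical and invented for illustration; they are not the data or the code used in these analyses.

```python
# Hypothetical priority table distilled from the rules above
# (lower rank = preferred outcome). Labels are illustrative only.
OUTCOME_PRIORITY = {
    "perseverative_errors": 0,
    "frequency": 0,
    "commission_rate": 0,
    "reaction_time": 1,
    "duration": 1,
    "omission_rate": 1,
}


def select_effect_sizes(records):
    """Keep one effect size per (study, sample, EF domain).

    Different EF domains from the same sample are kept separately,
    whereas several outcomes within one domain are reduced to the
    preferred outcome.
    """
    best = {}
    for rec in records:
        key = (rec["study"], rec["sample"], rec["ef_domain"])
        rank = OUTCOME_PRIORITY.get(rec["outcome"], 99)
        if key not in best or rank < best[key][0]:
            best[key] = (rank, rec)
    return [rec for _, rec in best.values()]


if __name__ == "__main__":
    # Invented placeholder records, not values from any real study.
    example = [
        {"study": "Study A", "sample": "ASD", "ef_domain": "set_shifting",
         "outcome": "reaction_time", "r": 0.10},
        {"study": "Study A", "sample": "ASD", "ef_domain": "set_shifting",
         "outcome": "perseverative_errors", "r": 0.20},
        {"study": "Study A", "sample": "ASD", "ef_domain": "inhibition",
         "outcome": "commission_rate", "r": 0.15},
    ]
    for rec in select_effect_sizes(example):
        print(rec["ef_domain"], rec["outcome"], rec["r"])
```

Keying on the participant group rather than on the article is what preserves statistical independence: the same sample never contributes twice to one analysis.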

Statistical Analyses
We ran random-effects models to estimate the overall means, to account for heterogeneity within studies, since a wide variety of tasks had been used to assess both RRBs and EF skills. Pearson r-values were converted to z scores to ensure that measures were normally distributed. For this analysis, the packages "metafor" [Viechtbauer, 2010] and "robumeta" [Fisher & Tipton, 2015] for R (R Development Core Team, 2015) were used. Following Cohen [1988], we interpreted a correlation coefficient of 0.10 as weak, of 0.30 as moderate and 0.50 or larger as strong. Between-study heterogeneity for each measure was assessed using the index of inconsistency (I²). This calculates a percentage of heterogeneity resulting from study differences that is not due to chance; therefore, larger values indicate greater heterogeneity. Forest plots were created for all analyses.
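To make the pooling steps concrete, the following minimal Python sketch applies the Fisher r-to-z transformation, pools the z scores under a random-effects model, back-transforms the summary estimate to r, and reports I². It is an illustration only: the analyses reported here were run with metafor and robumeta in R, and the DerSimonian-Laird estimator, the function name random_effects_pool, and the example inputs are assumptions made for this sketch rather than the settings actually used.

```python
import math


def random_effects_pool(rs, ns):
    """Pool Pearson correlations with a DerSimonian-Laird random-effects model.

    rs: correlations per study; ns: sample sizes per study (each n > 3).
    Returns the pooled r, its 95% CI, tau^2 and I^2 (in percent).
    """
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need at least two studies with matching r and n")
    zs = [math.atanh(r) for r in rs]        # Fisher r-to-z
    vs = [1.0 / (n - 3) for n in ns]        # sampling variance of z
    w = [1.0 / v for v in vs]               # inverse-variance weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))  # Cochran's Q
    df = len(zs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)           # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]   # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "r": math.tanh(z_re),               # back-transform z to r
        "ci": (math.tanh(z_re - 1.96 * se), math.tanh(z_re + 1.96 * se)),
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Invented example values, not data from the included studies.
    result = random_effects_pool([0.10, 0.50, 0.30], [100, 100, 100])
    print(result)
```

Pooling on the z scale and back-transforming only the summary estimate keeps the confidence interval inside the valid range for a correlation.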

Measures of Data Quality
We assessed whether nonsignificant results may have been suppressed from the literature. As the response rate to our e-mails asking for unpublished data was poor (2 out of 10 requests), this was particularly important. We examined publication bias through funnel plots, as studies with stronger effects may be more likely to get published and thereby be included in a meta-analysis. However, this type of analysis only offers a subjective measure of potential publication bias. Egger's regression test [Egger, Davey Smith, Schneider, & Minder, 1997] was employed to offer an objective view. This is best suited to small meta-analyses (<25 studies) and evaluates if effect estimates and sampling variances for each study are related.
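The logic of Egger's test can be sketched in a few lines of Python. In the classical formulation, each study's standardized effect (effect divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero signals funnel plot asymmetry. The sketch below uses ordinary least squares from the standard library; the function name egger_intercept and the example values are assumptions for illustration, and the regression test as implemented in metafor may use a different model specification.

```python
import math


def egger_intercept(effects, ses):
    """Classical Egger regression: standardized effect on precision.

    effects: effect sizes (e.g., Fisher z); ses: their standard errors.
    Returns (intercept, standard error of intercept, t statistic).
    """
    n = len(effects)
    if n < 3 or n != len(ses):
        raise ValueError("need at least three studies with matching inputs")
    y = [e / s for e, s in zip(effects, ses)]   # standard normal deviates
    x = [1.0 / s for s in ses]                  # precision
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)   # residual variance
    se_int = math.sqrt(s2 * (1.0 / n + mx ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t


if __name__ == "__main__":
    # Invented example values, not data from the included studies.
    effects = [0.35, 0.28, 0.40, 0.22, 0.31]
    ses = [0.10, 0.15, 0.20, 0.12, 0.18]
    print(egger_intercept(effects, ses))
```

The t statistic would be compared against a t distribution with n - 2 degrees of freedom to obtain the P value.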

Moderator Analyses
In all comparisons, we ran meta-regression analyses to identify potential moderators for the relationships. These were: age, diagnosis (ASD vs. TD), type of RRB scale (diagnostic vs. RRB specific) and testing modality (experimenter-administered vs. computer-administered). Age and diagnosis were examined further to explore the developmental trajectory for the relationship between EF skills and RRBs. If the continuous age effect was significant, we ran an additional analysis in which we split the factor into three age categories: child (0-11 years old), adolescent (12-18 years old), and adult (19 and above), following Van Eylen et al. (2015). This was to pinpoint whether the relationship is at its strongest during a particular stage of development. We explored the moderating effect of testing modality (computerized vs. experimenter-administered) because it has been suggested that the difficulties of individuals with ASD in experimenter-administered EF tasks may be due to their social interaction problems [e.g., Perner & Lang, 1999]. For the RRB scale moderator analysis, we divided the scales into two types of assessment: diagnostic and specific. The diagnostic measures comprised observations and interviews used to diagnose ASD (e.g., the ADOS and ADI-R), whereas the specific measures were created for the sole purpose of measuring RRBs (e.g., the RBQ). We explored the differences between these two types as, although they have a similar structure, large differences are found between them. This is likely to reflect the depth of analysis. Whereas the ADI-R uses 12 items to assess RRBs, the RBS-R includes 44 items divided into six sub-scales [Lam & Aman, 2007]. These differences might produce variations in the results. For our set shifting analysis, we ran a moderator analysis that examined types of EF task (WCST vs. others), since research has found particularly strong relationships between performance on the WCST task and RRB levels [e.g., South et al., 2007].
We wanted, additionally, to examine the effect of IQ on the relationship between EF and RRB levels but were unable to do so as insufficient information was available.
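For a categorical moderator such as type of RRB scale, the moderator test asks whether the pooled effects of the subgroups differ by more than chance. The following minimal Python sketch computes a between-groups Q statistic from subgroup summaries and converts a Q with one degree of freedom into a P value. It is a simplified stand-in for the meta-regressions run in R: it uses fixed-effect weights within subgroups for brevity, and the function names and example values are assumptions for illustration.

```python
import math


def pooled(zs, vs):
    """Inverse-variance pooled estimate and total weight for one subgroup."""
    w = [1.0 / v for v in vs]
    est = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    return est, sum(w)


def q_between(groups):
    """Between-groups Q for a categorical moderator.

    groups: dict mapping a label to (list of Fisher z, list of variances).
    Returns (Q statistic, degrees of freedom).
    """
    stats = [pooled(zs, vs) for zs, vs in groups.values()]
    total_w = sum(w for _, w in stats)
    grand = sum(est * w for est, w in stats) / total_w
    q = sum(w * (est - grand) ** 2 for est, w in stats)
    return q, len(stats) - 1


def p_value_df1(q):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


if __name__ == "__main__":
    # Invented example values, not data from the included studies.
    groups = {
        "diagnostic": ([0.05, 0.10], [0.02, 0.03]),
        "specific": ([0.45, 0.55, 0.50], [0.02, 0.02, 0.03]),
    }
    q, df = q_between(groups)
    print(q, df, p_value_df1(q))
```

Applied to a statistic of the form Q(1), p_value_df1 gives approximately 0.04 for Q = 4.32 and approximately 0.004 for Q = 8.50.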

Results
The first analysis tested the hypothesis that there is an association between high levels of RRBs and poor performance on set-shifting tasks. The second examined the strength of the relationship between RRBs and inhibitory control measures. The final analysis investigated if a similar relationship can be found between high RRB levels and performance on parental-rated EF measures. Note that in all analyses the EF measure is of errors, so both scores (the EF measure and RRBs) are scored in the same direction with higher values indicative of poor psychological functioning.

Meta Analysis 1: Set Shifting Scores and RRB Levels
The performance-based set shifting analysis revealed a summary correlation and 95% CI indicative of a significant, but modest relationship with RRB levels [r = 0.31; 95% CI (0.19, 0.41), P < 0.0001]. Figure 2 presents a forest plot of effect sizes. The contour-enhanced funnel plot (Fig. 3) indicates a low risk of publication bias, as it does not show an overrepresentation of effect sizes in the significance contour and points fell on both sides of the summary effect size. Egger's regression confirmed this by revealing no overall evidence of small study bias (P = 0.44). Since there was no sign of publication bias, we did not run a trim-and-fill analysis [Vevea & Woods, 2005]. A set of influence diagnostics, derived from standard linear regression, identified none of the studies as potential outliers [Viechtbauer & Cheung, 2010]. The degree of heterogeneity between effect sizes, I² = 65.80% (95% CI; 42.2, 83.2), represents moderate variance. Given that a heterogeneity score around 25.00% is considered low, 50.00% moderate, and 75.00% high [Higgins, Thompson, Deeks, & Altman, 2003], we can infer that 65.80% of the observed variation can be attributed to actual differences between the studies, suggesting that a few moderators may have had an influence on the results. Accordingly, moderator analyses were performed to identify sources of heterogeneity.
Moderator analyses: We found no moderating effects for age, diagnosis, type of RRB scale, testing mode or type of EF scale. Table 4 summarizes the effects of each moderator.

Meta Analysis 2: Inhibitory Control Scores and RRB Levels
A significant, weak to modest, relationship was found between the inhibitory control measures and repetitive behavior levels [r = 0.21; 95% CI (0.04, 0.37), P = 0.02]. See Figure 4 for a forest plot. Egger's regression found no evidence for small study bias (P = 0.38). A contour-enhanced funnel plot showed a low risk of publication bias (see Fig. 5). A set of diagnostics derived from standard linear regression identified none of the studies as potential outliers [Viechtbauer & Cheung, 2010]. The I² for the inhibitory control analysis was 77.75% (95% CI; 59.00, 91.18), so moderator analyses were performed to identify sources of heterogeneity.
Moderator analyses: The analyses revealed that part of the heterogeneity in the model between inhibitory control performance and RRB levels was caused by an age effect [Q(1) = 4.32, P = 0.04]. We examined this effect further and found a positive relationship between inhibitory control and RRBs in adolescents (r = 0.29, P < 0.0001, CI (0.16-0.42), k = 5) and adults (r = 0.50, P = 0.0002, CI (0.24-0.78), k = 4), but not in children (r = 0.02, P = 0.88, CI (−0.25-0.29), k = 9). The relationship between RRB levels and inhibitory control thus becomes stronger with age.
We found no effects for diagnosis, testing mode or type of RRB scale. Table 5 summarizes the effects of each moderator.

Meta Analysis 3: Parent-rated EF Scores and RRB Levels
The parent-rated EF analysis showed a summary correlation and 95% CI indicative of a significant, modest, relationship with repetitive behavior levels [r = 0.33; 95% CI (0.04, 0.62), P < 0.03]. See Figure 6 for a forest plot. Egger's regression test showed no evidence for small study bias (P = 0.44). Our contour-enhanced funnel plot presented in Figure 7 indicated a low risk of publication bias. A set of diagnostics derived from standard linear regression identified none of the studies as potential outliers [Viechtbauer & Cheung, 2010]. The degree of heterogeneity between effect sizes was 90.98% (95% CI; 78.3, 97.7). This suggests that a high proportion of observed variation can be attributed to actual differences between the studies. We carried out moderator analyses to identify the sources of this heterogeneity.
Moderator analyses: These revealed that part of the heterogeneity in the model correlating parent-rated EF measures and RRB levels was caused by the type of RRB measure used [Q(1) = 8.50, P = 0.004]. We split the measures into two types: diagnostic and RRB specific. We found a positive relationship between the parent-rated measure and RRBs when RRBs were assessed through an RRB specific measure (r = 0.50, P < 0.0001, CI (0.28-0.73), k = 7), but not when they were assessed using a diagnostic measure (r = −0.23, CI (−0.86-0.39), P = 0.46, k = 3). This analysis thereby suggests that the relationship between parent ratings and RRBs is stronger in studies that examine RRBs through measures that were created for the sole purpose of measuring RRBs.
Figure 2. A forest plot containing effect sizes and 95% confidence intervals for the relationship between RRBs and set shifting performance with the impact of RRB scale and if the task was administered by a computer vs. an experimenter. Dyad refers to a measure that is used to diagnose autism (e.g., ADOS) whereas a specific measure is a questionnaire that only examines repetitive behavior (e.g., RBQ).
Finally, we found no evidence to suggest that the relationship between parent rated EF measures and RRB levels was caused by an age effect or diagnosis. Table 6 summarizes the effects for all the moderator analyses.

Discussion
These meta-analyses are the first of their kind to gather all the available evidence concerning the relationship between RRB levels and performance on set shifting, inhibitory control, and EF parental-report ratings. They revealed moderate but significant associations between high levels of RRBs and errors in two behavioral measures, set shifting and inhibitory control, as well as parental report. Age and the type of RRB scale moderated the inhibitory control and parental report results, respectively. Yet diagnosis, testing modality, and type of EF measure did not have an impact on the results. We discuss three implications of these findings, which we examine in relation to each other. First, the significant relationships in each meta-analysis suggest that recent analyses of RRBs have been hasty to reject the EF hypothesis. These skills may play a role in the development of the behaviors or vice versa. Second, the extent to which age moderates the inhibition results should be researched further, as this finding may offer support to a framework in which EF skills must be considered in combination with developmental factors. Finally, future research should examine whether individual factors involved in each type of EF measure may pinpoint what relates them to repetitive behaviors.
The significant associations between RRB levels and poor EF skills suggest that attention needs to be refocused on the EF account, as the relationship between EF impairments and RRBs may be more central in ASD than has been suggested in key analyses of the origins of repetitive behavior [Leekam et al., 2011; Lewis & Kim, 2009]. Two findings are of interest. First, set shifting effects were stronger than those for inhibitory control. For ASD individuals, this finding is perhaps not surprising, considering that impairments have often been identified for set shifting skills, whereas evidence for the role of inhibition has been less consistent [Lai et al., 2017]. Second, the strongest effects were uncovered in the parental report measures. These need to be considered further as they may support the view that these measures are more ecologically valid than psychometric assessments [Rabbitt, 1997]. We return to these two effects in more detail. Our findings nevertheless offer support to the recent meta-analyses that show strong evidence for the view that impairments in overall EF performance, as well as in separate EF skills, are identified in individuals with ASD and that these skills appear to be essential to their control of thoughts and behaviors [Demetriou et al., 2018; Lai et al., 2017]. They are also consistent with a spurt of recent research linking elevated RRB levels to set shifting [e.g., Miller et al., 2015] and inhibitory control impairments [e.g., Mosconi et al., 2009].
Despite the significant correlations in the meta-analyses, it is unlikely that the EF account explains the full range and intensity of behaviors which are so prevalent in typical preschoolers and persist in ASD. Autism has a strong genetic component [Tick, Bolton, Happé, Rutter, & Rijsdijk, 2016], but this needs to be partly channeled through other nonshared environmental factors identified by Sandin et al. [2017]. The associations we identified highlight that issues to do with self-control are involved in the manifestation of repetitive behaviors, and the need not only to reopen the EF account, but also to explore the relationship within longitudinal research designs, training studies and the possible mutual influences of genetic factors and nonshared environmental influences on the development of EF. In addition to the research that has found strong evidence for a hereditary component in ASD, studies on the topic [e.g., Van Eylen et al., 2015], and indeed our analyses, have suggested that sample characteristics may moderate the relationships between EF skills and RRBs. Age moderated the relationship with inhibitory control: it was present in adolescents and adults, but not in children. As previous research has shown that young children with or without ASD engage in high levels of RRBs, and that children with ASD show strong evidence for set shifting but not inhibitory control impairments [Lai et al., 2017], these findings suggest that inhibitory control skills may not initially play a role in the development of RRBs. Indeed, it is possible that the inhibitory control skills only play a role in the development of higher-level RRBs which necessarily develop later, consistent with Mosconi et al.'s [2009] finding of relationships between these skills and higher-level RRBs in adolescents, but not children.
Alternatively, it is possible that such age-related findings are caused by measurement issues. The inhibitory control tasks for adolescents and adults used in the current meta-analysis tend to include a wider range of skills than those for children. In the widely used Stroop task, participants see color names printed in different colored inks, and the printed word interferes with naming the color of the ink. This involves inhibition of an overlearned response, but it also requires set shifting skills in order to successfully switch between a wide variety of stimuli. This differs in complexity from the child-friendly "knock-do not-knock" task, where participants first match the actions of the examiner (knocking the table top with their knuckles or flat of their palm) and then have to respond with the opposite action to the action of the examiner. Although this task is difficult for children, set shifting demands are not high. The same measurement issues are not present in all the studies included in the set shifting analysis, as two of the widely used tasks in this meta-analysis were the ID/ED and the WCST. These are very similar as both require the ability to identify a relevant rule, maintain it and shift between different rules, making it possible that set shifting or working memory skills moderate the analysis. There is evidence to suggest that simpler forms of set shifting in children do not relate as closely to RRBs [Dichter et al., 2010]. Thus, task demands in tests for adults and children might explain variations between studies of these groups.
Figure 4. A forest plot containing effect sizes and 95% confidence intervals for the association between inhibitory control tasks and RRB levels and the impact of age and if the task was administered by a computer vs. by an experimenter.
In addition to the age-related findings in our inhibitory control analysis, we also identified stronger correlations between RRBs and parental report measures when the behaviors were measured through RRB specific assessments, not clinical measures. Parental questionnaires are rated by the individuals who know the children best, allowing them to consider behaviors across a wider range of situations and settings, and potentially providing a better perspective on a child's behaviors than a brief test in a clinical setting. The RRB specific measures also cover a wider range of behaviors in a single scale so they consequently identify more, making it possible that the questionnaires tap into some RRBs that the clinical tools do not. Nevertheless, this association may be explained by the fact that both EF parental reports and RRB specific measures are scored by parents. These factors highlight another potential measurement issue in the literature and emphasize the need to create more robust and convergent measures to tackle this inconsistency.
Instead of moving away from the EF account, recent research and the results of these meta-analyses lead to a need to consider issues to do with control in explaining RRBs. We are not arguing that these higher functions can explain why such behaviors continue, as the amount of variance still to be accounted for in each analysis was large. We suggest that EFs should be explored in combination with other models. For example, a cognitive framework would offer support for Leekam et al.'s developmental account to explain why stereotypies reduce over time in typical development. Specifically, we suggest that these so-called "immature responses" must be driven by cognitive processes, given that the behaviors reduce as infants develop goal-directed actions. Moreover, a cognitive framework is not incompatible with claims that genetic factors [Möhrle et al., 2020; Lewis & Kim, 2009] and an imbalance of corticostriatal connectivity play important roles in the development of RRBs [Abbott et al., 2018]. Abbott et al. found decreasing connectivity in limbic corticostriatal circuits with age and an underconnectivity in frontoparietal and motor circuits that did not increase with age. A joint model could account for why these behaviors change with age in typical development, and why they persist in individuals with ASD.
A correlational meta-analysis has high informative value in the current context, as it identifies shared variance in large chunks of data between factors. Notwithstanding this analytic strength, it is important to emphasize that it does not determine cause or effect. The literature is not yet positioned to conclude on the direction of causality, and this remains an important objective for future research. In addition, the current state of the literature has also identified some limitations, such as the difficulty with determining if cognitive abilities confound the picture. Some studies controlled for Full-Scale IQ (FSIQ) and found that the correlation remained [e.g., Jones et al., 2018; Van Eylen et al., 2015], whereas others controlled for verbal IQ (VIQ) and found that the relationship disappeared [Yerys et al., 2009]. Moreover, whereas some studies in our analyses included representative samples of individuals below the average IQ range [Reed et al., 2013; Jones et al., 2018], most of them were confined to individuals in the average IQ range, making it more difficult to explore the role of IQ in the relationship. Due to the limited amount of data on the topic, we were also unable to assess the relationship between RRBs and EF skills in other developmental disorders in which the behaviors and the EF difficulties are prevalent, such as OCD and Williams Syndrome. Surprisingly, we found a small amount of shared variance between the EF skills and RRBs. This may suggest that moderators influence the results. For example, a specific RRB sub-type, such as higher-level behavior, may have had an impact on the correlation. It is possible that insistence of sameness and intense interests rely on set shifting and inhibitory control skills, as the behaviors require the ability to modify a cognitive rule [Turner, 1997] and inhibit incorrect responses [Diamond, 2013]. Alternatively, gender could have played a role. Recent research by Antezana et al. [2019] found that boys with ASD engaged in high levels of stereotyped behaviors and restricted interests, and girls in high levels of compulsive, insistence of sameness and self-injurious behaviors. Despite findings like these, it was not possible to include sub-types or gender differences as moderators in our analyses as only one of the studies provided gender data [Bölte et al., 2011] and only a few studies examined behaviors in sub-groups [Boyd et al., 2009; D'Cruz et al., 2013]. Future studies should examine this further as Bölte et al. found stronger correlations in males and Boyd et al. found that shifting and inhibition predicted rituals and sameness, but not restricted interests.
Figure 6. A forest plot containing effect sizes and 95% confidence intervals for the association between parent-rated EF tasks and RRB levels and the impact of diagnosis and diagnostic vs. specific RRB measures.
The task impurity issues that were emphasized in this analysis highlight that we should assess if similar associations can be identified between RRB levels and other EF skills, such as planning and working memory. We could not consider these in our analyses due to the lack of studies on the topic. Future studies should examine the relationship between these skills and the behaviors, as the limited research in the area on planning [e.g., Jones et al., 2018; Van Eylen et al., 2015; Sachse et al., 2013] and working memory [Joseph & Tager-Flusberg, 2004; Van Eylen et al., 2015] skills has not found that they predict RRB scores. Future research should also focus on disentangling different EF measures to pinpoint what it is about the tasks that makes them correlate with these behaviors. We suggest that set shifting measures are of interest. Not only did they produce stronger associations than the inhibitory control measures, they were also the only skills that predicted RRBs in children. This is in line with a previous review by Geurts et al. [2009] which concluded that isolating crucial cognitive processes will aid in ultimately resolving the gap between inflexibility in daily life, and the ability measured in the set shifting tasks. Recent developments highlight that set shifting processes such as the ability to activate previously irrelevant stimuli may be of further interest, as these errors have been found to play an important role in set shifting development in both children and adults [e.g., Müller, Dick, Gela, Overton, & Zelazo, 2006; Maes, Damen, & Eling, 2004; Maes, Vich, & Eling, 2006]. Moreover, a wide variety of RRB measures are used, where the response options range from identifying whether a behavior is present/not present, its frequency, intensity or its impact upon others, making it hard to compare the responses. Some studies also implement nontraditional measures that are not RRB specific (e.g., CY-BOCS) and some state that they measure RRBs with scales that do not focus on the four types of RRBs identified in the ICD-10 (e.g., the SRS mannerism scale), making it difficult to assess the full picture. Future research should separate sub-groups of lower-level and higher-level behaviors. These two subtypes may have different causes and could therefore help to explain the inconsistent results, as well as why we found no associations between high RRBs and poor inhibitory control skills in children. To examine these factors in more depth may help identify more causal links in the relationship between specific set shifting errors and RRBs. It may also result in clinical or educational interventions that have the potential to help manage difficult and persistent RRBs.
Figure 7. Contour-enhanced funnel plot for the association between parent-rated EF task performance and RRB levels. Contour lines are at 1%, 5%, and 10% levels of statistical significance.