Reward Devaluation in Autistic Children and Adolescents with Complex Needs: A Feasibility Study

Rewards act as a motivator for positive behavior and learning. Although compounding evidence indicates that reward processing operates differently in autistic individuals who do not have co‐occurring learning disabilities, little is known about individuals who have such difficulties or other complex needs. This study aimed first to assess the feasibility of using an adapted reward devaluation paradigm to examine basic reward processes in this underrepresented population, and second to investigate whether autistic children and adolescents with complex needs would show dynamic behavioral changes in response to changes in the motivational value of a reward. Twenty‐seven autistic children and adolescents with complex needs and 20 typically developing 5‐year‐old children took part in the study. Participants were presented with two visual cues on a touchscreen laptop, which triggered the delivery of a video, music, or physical reward. One of the rewards was then presented in abundance to decrease its motivational value. Participants showed decreased interest in the video and music rewards after devaluation. The experimental setup was found to be suitable to test individuals with complex needs, although recommendations are made for the use of physical rewards. The results suggest that autistic participants with complex needs demonstrate goal‐directed behavior and that it is feasible to develop experimental paradigms that can shed important light on learning processes that are fundamental to many education and intervention strategies for this population. Autism Res 2020, 13: 1915‐1928. © 2020 The Authors. Autism Research published by International Society for Autism Research and Wiley Periodicals LLC.


Introduction
Rewards act as a motivator for positive behavior and learning. Although rewards are widely used in various educational and support settings for children with an autism spectrum disorder (ASD), it is unclear whether the status and dynamics of rewards (e.g., how long a reward remains desirable) are similar in ASD children compared to typically developing (TD) children. In addition, much of the research on reward processing in ASD (and indeed on autism) focuses on individuals without significant learning disabilities, whilst we know very little about the approximately 30% of individuals who present with very significant learning disabilities, language impairments, and other significant barriers to participation in everyday activities. The main aim of the present study, therefore, was to investigate a basic reward learning process, reward devaluation, in a group of autistic children and adolescents with such complex needs.
The role of rewards in the learning process relates intrinsically to an individual's understanding of the consequence of their actions. The dual-system theory describes effective behavior as predominantly habitual or goal-directed [Dickinson, 1985]. Habitual actions are mechanistic and consist of actions triggered by the presence of an associated cue. In contrast, goal-directed actions are characterized by an awareness of the action's outcome and a direct response to its motivational value (i.e., the individual seeks to either obtain a reward or avoid a punishment). The difference between these two types of action manifests in their sensitivity to dynamic changes in the motivational value of the reward [Horstmann et al., 2015]. While habitual actions do not depend on the reward value of their outcome, the frequency of goal-directed actions is expected to follow the value of their associated reward. Evidence shows that young TD infants are able to detect and adapt their behavior to action-outcome contingencies. For instance, young infants increase the frequency of their behavior (e.g., sucking, limb movement) when that behavior is associated with an immediate positive reward [Kalnins & Bruner, 1973; Siqueland & DeLucia, 1969], but not when a positive outcome is presented independently from their action [Rovee & Rovee, 1969; Rovee-Collier, Morrongiello, Aron, & Kupersmidt, 1978]. In early development, these learning patterns are not, however, goal-directed.
A classic experimental paradigm in animal research to distinguish habitual from goal-directed behavioral control is the selective satiation or devaluation procedure [Adams & Dickinson, 1981; Colwill & Rescorla, 1985]. This procedure consists of providing a reward in abundance (until satiety) before testing to decrease its motivational value. Typically, the animals subsequently show diminished motivation to perform the action associated with this reward, known as the "devaluation effect." Adaptations of the devaluation paradigm for infants and children indicate that the goal-directedness of actions is acquired over the course of development: children under 24 months show no or only a transient devaluation effect, whereas older children show a persistent devaluation effect [Kenward, Folke, Holmberg, Johansson, & Gredebäck, 2009; Klossek, Russell, & Dickinson, 2008; Klossek, Yu, & Dickinson, 2011].
Many interventions for individuals on the autism spectrum (e.g., approaches based on applied behavior analysis) [Cooper, Heron, & Heward, 2007] assume that autistic individuals follow a goal-directed model of behavior. Due to the developmental nature of the disorder, however, it is possible that some autistic individuals, in particular those who present developmental delays, may not have acquired a goal-directed model for action selection. As a result, rather than adapting to changes in the motivational value of rewards, they might instead base their behaviors solely on the cue-reward contingency and respond habitually. In the context of autism, the dichotomy between habitual and goal-directed actions also resonates with the tendency for autistic individuals to engage in restricted and repetitive behaviors (RRBs), one of the core clinical criteria of ASD as defined by the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [American Psychiatric Association, 2013]. RRBs include stereotypical, repetitive sensory, and motor behaviors on the one hand (such as rocking and finger flicking), and insistence on sameness on the other hand (which manifests in rigid routines, rituals and a tendency to persevere in a given behavior). We could speculate that RRBs represent an imbalance toward habitual actions. Alternatively, RRBs might have a disproportionate intrinsic motivational value [Cascio et al., 2012], in which case the tendency to engage in RRBs could be seen as goal-directed. Some evidence from clinical data and animal models indicates an association between RRBs and both functional and structural alterations in the cortical-basal ganglia circuitry, which is involved in reward processing [Langen, Durston, Kas, van Engeland, & Staal, 2011; Lewis & Kim, 2009], although the nature of RRBs is still largely unclear.
The devaluation effect might therefore not manifest in autistic people either because their reward system is developmentally less mature, or because a reward provided in abundance might not actually devalue but instead increase in value by virtue of its very repetition. Our first goal in this study is therefore to assess whether autistic individuals show a devaluation effect.
An important question for reward processing in ASD is whether rewards of a social nature are processed differently compared to nonsocial rewards. This question is anchored in the idea that autistic individuals are intrinsically less motivated by social stimuli such as faces and scenes involving social interactions (social motivation hypothesis) [Chevallier, Kohls, Troiani, Brodkin, & Schultz, 2012; Dawson, Webb, & McPartland, 2005], or even that social stimuli are in fact aversive for ASD individuals (social punishment hypothesis) [Tanaka & Sung, 2016]. A recent review [Bottini, 2018] concludes that results are very mixed in terms of empirical support for the social motivation hypothesis. Instead, the review underlines that a substantial amount of evidence supports more domain-general differences in reward processing in ASD [Dichter & Adolphs, 2012; Kohls, Antezana, Mosner, Schultz, & Yerys, 2018; Kohls, Yerys, & Schultz, 2014; but see Demurie, Roeyers, Baeyens, & Sonuga-Barke, 2011; Ewing, Pellicano, & Rhodes, 2013; Pankert, Pankert, Herpertz-Dahlmann, Konrad, & Kohls, 2014 for null findings]. Autistic adults, for example, have reported decreased responsivity to rewards [Soderstrom, Rastam, & Gillberg, 2002], and adolescents and young adults with an ASD were found to make decisions that were less consistently dependent on the reward value of their choice [Johnson, Yechiam, Murphy, Queller, & Stout, 2006].
Importantly, all the studies included in the recent review by Bottini [2018] involved ASD individuals with IQs of 80 or above, and virtually no evidence exists about reward processes in the approximately 30% of autistic individuals who have IQs in the range of profound learning disabilities (IQ < 50). Various barriers exist that make research with this population challenging, both for ethical concerns (e.g., consent and assent) and practical reasons (e.g., ability to understand the task). Recent evidence suggests that nearly 90% of autistic individuals with an IQ < 70 also present at least one comorbid DSM disorder [Salazar et al., 2015], most commonly depression and anxiety disorder (69%), ADHD (61%), and general anxiety disorder (54%), which often results in the exclusion of these participants from research protocols. Finally, the at least 30% of autistic individuals who remain minimally verbal [Howlin, Savage, Moss, Tempier, & Rutter, 2014; Pickles, Anderson, & Lord, 2014; Tager-Flusberg et al., 2017] and individuals who present disruptive or challenging behaviors such as pervasive sensory and repetitive behaviors regularly fail to meet inclusion criteria for research. There is likely a significant overlap between these populations; for the purpose of this research, we will refer to autistic individuals who also present significant difficulties that prevent them from engaging independently in daily living tasks as well as research protocols as autistic individuals with complex needs. The selection bias against autistic individuals with learning disabilities is widespread in the literature [Russell et al., 2019], and it is critical to rectify it because such individuals with complex needs have the greatest need for support that should come in the form of evidence-based practices. Models of autism are currently rooted in research that predominantly does not include individuals with complex needs.
As a result, some of the interventions implemented might be inefficient or even detrimental. As argued above, there are a number of reasons why autistic individuals with complex needs might adapt their behaviors differently to the reward contingencies that are part of many education and intervention strategies. In the present study, we therefore focus on this underrepresented population.
Our aim in this study is twofold: (a) test whether autistic children and adolescents with complex needs show goal-directed behavior based on the motivational value of a reward; (b) in the process, evaluate a procedure that could, in principle, be further developed to probe other basic learning processes in this group. Based on the findings that younger children show no or only a transient devaluation effect, we predicted that autistic children with complex needs would not show a sustained devaluation effect.
The principal objective of the current study was to examine reward devaluation processes in a group of autistic individuals who are typically excluded from research due to their complex needs. Because of the highly heterogeneous and individual nature of complex needs, and the challenges of evaluating accurate levels of functioning in this population, seeking to formally match our experimental group to control participants on some criterion would be near-impossible. In addition, our aim was not to compare our group's performance to nonautistic peers, but to establish whether or not they showed goal-directed behavior, which can be addressed without a comparison group. Because of the novelty of the experimental design, however, we tested a group of young TD children separately to ensure that the paradigm elicited the expected reward devaluation phenomenon (Study 1). The experimental study is reported subsequently as Study 2.

Stimuli and Apparatus
The experimental design was adapted from Klossek et al. [2011]. The procedure was presented as a touchscreen game on a Dell Inspiron laptop using E-Prime 2.0 (Psychology Software Tools, Inc.). A pilot study showed that the presence of a keyboard was distracting for participants; therefore, it was concealed and the setup was kept as plain as possible. In Study 1, a two-in-one laptop was available, so its keyboard was concealed by rotating it behind the chassis; rewards consisted of short video clips. In Study 2, the laptop had rugged protection and was encased in a cardboard box so the keyboard was not accessible. Because we had less information about what types of stimuli would be rewarding for autistic individuals with complex needs, participants were given the opportunity to perform the task in three reward conditions: video reward, music reward, and physical reward. A custom-built reward dispenser was placed at the back of the laptop and connected via USB port. The dispenser consisted of two tubes that dispensed two different types of rewards and was controlled electronically so that the reward delivery was entirely automated and not mediated by the experimenter. Rewards were delivered in see-through capsules and rolled down an inclined tube concealed in the cardboard box, coming out into a small tray in front of the participant (see Fig. 1).
Cues. During the task, participants were required to touch cues to trigger the presentation of certain rewards. The cues consisted of two identical images that only differed in color (red or yellow; blue or green). Cues were flowers, fishes, and stars in the video, music, and physical reward conditions, respectively. All cues appeared in a white 1280 × 720 px rectangle and were displayed on a black background. Color-reward associations were counterbalanced across participants.
Video rewards. Four color, nonverbal children-friendly cartoon sources were selected: Tom & Jerry©, Minions©, Twirlywoos©, and Pixar© For The Birds short film. Four different 4-sec 1280 × 720 px clips were created from each cartoon, and each clip represented a meaningful action unit (e.g., Jerry hitting Tom with a pan).
Music rewards. Four different tunes were selected for each of two types of children-friendly music (eight clips altogether): nursery rhymes and instrumental tunes. Each tune was trimmed to a 4-sec clip, which represented a meaningful musical unit. For instance, one of the music clips consisted of the phrase "Humpty-Dumpty sat on a wall".
Physical rewards. For all but one participant, rewards consisted of stickers. Stickers varied for each participant but were always drawn from two different "series" such as smiley faces, animals, lorries, and so on. For one participant, based on parents' recommendation, rewards consisted of slices of carrots and raisins.
Each participant saw one example of each type of reward (other examples within the category were used if one session was interrupted and the task was run a second time), which were counterbalanced across participants.
The devaluation paradigm comprised the following five phases (see Fig. 2):
Demonstration phase (six trials). A single cue was presented on the screen (either left, center, or right). The participant was encouraged to touch the cue using simple verbal instructions and gestures, upon which the cue disappeared to reveal a short clip in the same position (in the video and music reward condition), or a 500-msec blank screen as the reward rolled down (in the physical reward condition). Both cue-reward pairs were presented three times (once in each possible position) in pseudorandom order.
Acquisition phase (16 trials). For each trial, both cues were presented simultaneously on either side of the screen. The participant was encouraged to touch one of the two cues, which triggered the delivery of the corresponding reward as in the demonstration phase. Each cue-reward pair was presented eight times; presentation order was pseudorandomized and the position of each cue-reward pair on the screen was counterbalanced across trials.
Devaluation phase (single trial). One of the two cue-reward pairs was selected at random as the to-be-devalued stimulus. The cue appeared centrally on the screen and the participant was encouraged to touch it, which triggered the delivery of the corresponding reward ten times in a row.
Extinction phase (16 trials). Similar to the acquisition phase, with the exception that touching a cue did not trigger the delivery of any reward (simply moving on to the next trial). This was the critical phase in which we assessed whether children chose either the devalued or nondevalued cue more often without any reinforcement.
Reacquisition phase (16 trials). Identical to the acquisition phase. The primary aim of this phase was to allow the participant to obtain the reward once again, although it also provided insight into the persistence of the devaluation effect.
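The five phases described above can be laid out as a trial schedule. The following Python sketch is illustrative only: the original task was run in E-Prime, and the names `CUES`, `two_cue_block`, and `build_schedule`, as well as the dictionary fields, are hypothetical. It assumes that "counterbalanced" means each cue appears on the left in half of the two-cue trials, and it uses a plain shuffle in place of the unspecified pseudorandomization constraints.

```python
import random

CUES = ("A", "B")                      # hypothetical labels for the two cue-reward pairs
POSITIONS = ("left", "center", "right")

def two_cue_block(phase, rewarded, n_trials, rng):
    """Both cues on screen; each cue sits on the left in half of the trials."""
    layouts = [CUES, CUES[::-1]] * (n_trials // 2)
    rng.shuffle(layouts)
    return [{"phase": phase, "left": left, "right": right, "rewarded": rewarded}
            for left, right in layouts]

def build_schedule(seed=0):
    rng = random.Random(seed)
    # Demonstration: each cue shown alone once in each of the three positions.
    demo = [{"phase": "demonstration", "cue": c, "position": p, "rewarded": True}
            for c in CUES for p in POSITIONS]
    rng.shuffle(demo)
    devalued = rng.choice(CUES)        # pair selected at random for devaluation
    schedule = list(demo)
    schedule += two_cue_block("acquisition", True, 16, rng)
    # Devaluation: one central cue, one touch, ten reward deliveries in a row.
    schedule.append({"phase": "devaluation", "cue": devalued,
                     "position": "center", "deliveries": 10})
    # Extinction: same layout as acquisition, but touching a cue yields no reward.
    schedule += two_cue_block("extinction", False, 16, rng)
    schedule += two_cue_block("reacquisition", True, 16, rng)
    return devalued, schedule
```

Under these assumptions a full run comprises 6 + 16 + 1 + 16 + 16 = 55 trials, with the devaluation phase counted as a single trial.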

Repetitive Patterns of Responses
We aimed to evaluate whether participants adapted their behavior to the dynamic motivational value of a reward. Because autistic individuals tend to engage in RRBs, however, their responses in the task might have been driven by repetitive patterns and insistence on sameness. We hypothesized that rigid patterns of responses could include persistent response (e.g., [11…11]), persistent response side, strictly alternating response, and strictly alternating response side. For each participant, we aligned the sequence of 16 responses in the acquisition phase (before the reward value manipulation) with the 8 possible sequences representing rigid patterns of responses, and calculated a similarity score by summing the number of "matches". For instance, the response sequence [1221211122121121] and the strictly alternating response sequence [1212121212121212] get a similarity score of 8. Participants with a perfect or near-perfect similarity score (15 or 16) were excluded from the devaluation analysis.
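The screening for rigid response patterns can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it reads "matches" as position-by-position agreement between two sequences (the scoring actually used may differ), it codes both cue choices and screen sides as 1 and 2, and all function names are hypothetical.

```python
def rigid_templates(n=16):
    """Constant and strictly alternating sequences over the codes 1 and 2."""
    return {
        "persistent_1": [1] * n,
        "persistent_2": [2] * n,
        "alternating_12": [1 if i % 2 == 0 else 2 for i in range(n)],
        "alternating_21": [2 if i % 2 == 0 else 1 for i in range(n)],
    }

def similarity(responses, template):
    """Number of positions at which the two sequences agree."""
    return sum(r == t for r, t in zip(responses, template))

def max_similarity(cue_choices, side_choices):
    """Best match over 8 rigid patterns: 4 on cue identity, 4 on screen side."""
    scores = {}
    for name, template in rigid_templates(len(cue_choices)).items():
        scores["cue_" + name] = similarity(cue_choices, template)
        scores["side_" + name] = similarity(side_choices, template)
    best = max(scores, key=scores.get)
    return best, scores[best]

def is_excluded(cue_choices, side_choices, threshold=15):
    """Exclude when the best similarity score is perfect or near-perfect."""
    return max_similarity(cue_choices, side_choices)[1] >= threshold
```

For example, a hypothetical participant who touched the left side (coded 1) on 15 of 16 acquisition trials would be flagged by `is_excluded` regardless of which cue was chosen on each trial.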

Data Analysis
Figure 1. Experimental setup: a touchscreen laptop was encased in a cardboard box. The reward dispenser was placed at the back, and a tube was concealed in the box that delivered encapsulated physical rewards into a small tray at the front of the display.
We compared the frequency of choosing the devalued reward between acquisition and extinction (devaluation effect), between extinction and reacquisition (persistence of devaluation effect) and between acquisition and reacquisition (stability) in each condition. To facilitate comparison between conditions, the devaluation effect is also reported as the percentage difference in frequency between extinction and acquisition phases. A more negative percentage indicates a greater devaluation effect. Finally, as a crude measure of prevalence, the proportion of participants who showed an individual devaluation effect more negative than expected by chance (i.e., lower than −6.7%) is reported.
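The individual devaluation effect and the prevalence measure can be sketched as below. The sketch assumes that "percentage difference in frequency" means the difference in percentage points of trials on which the devalued cue was chosen (extinction minus acquisition); a relative change would be computed differently. The −6.7% threshold is taken from the text, and the function names are hypothetical.

```python
CHANCE_THRESHOLD = -6.7   # percent; value taken from the text

def devalued_frequency(choices, devalued):
    """Proportion of trials on which the devalued cue was chosen."""
    return sum(c == devalued for c in choices) / len(choices)

def devaluation_effect(acquisition, extinction, devalued):
    """Extinction minus acquisition, in percentage points of trials."""
    return 100.0 * (devalued_frequency(extinction, devalued)
                    - devalued_frequency(acquisition, devalued))

def shows_devaluation(effect):
    """True when the effect is more negative than the chance threshold."""
    return effect < CHANCE_THRESHOLD

def prevalence(effects):
    """Proportion of participants whose effect exceeds the chance threshold."""
    return sum(shows_devaluation(e) for e in effects) / len(effects)
```

Under this reading, a participant who chose the devalued cue on 8 of 16 acquisition trials and 6 of 16 extinction trials would have an effect of −12.5%, which counts toward prevalence, whereas a drop of a single trial (−6.25%) would not.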

Study 1: Validation Study
The study was approved by the University of Sussex Cross-school Research Ethics Committee (ER/MT346/1). Informed consent was obtained from all participants' caregivers.

Participants
Twenty TD children were recruited from reception class at a mainstream primary school in Sussex (4 females, mean age 64 ± 3 months, age range 60-68 months) on the basis of having no diagnosis of a learning disability nor any developmental or neuropsychological disorder.

Background measures
Participants completed the British Picture Vocabulary Scale third edition (BPVS3) [Dunn & Dunn, 2009]. Two children scored too low to obtain, respectively, a standardized score (raw score = 33) or an age equivalent (raw score = 49, standardized score = 83). The latter failed to complete the experimental task and was excluded from the subsequent analysis. BPVS3 scores, equivalent age based on the BPVS3, and chronological age are reported in Table 1.

Experimental Procedure
Participants were tested on a one-to-one basis in a quiet room in their school over the course of a single session. One participant was unable to complete the task. In addition, similarity scores resulted in the exclusion of one participant (strictly alternating response side). Results below are therefore reported for 18 participants.
Figure 2. (1) Demonstration phase: participants learn to associate two cues to distinct rewards (6 trials); (2) Acquisition phase: in each trial, participants select one cue and obtain the corresponding reward (16 trials); (3) Devaluation phase: one cue-reward is selected at random and the reward is dispensed ten times in a row; (4) Extinction phase (test phase): in each trial, participants select one cue, no reward is dispensed (16 trials); (5) Reacquisition phase: identical to acquisition (not represented here; 16 trials).
Figure 3 shows the frequency of selecting the devalued video across phases, which was lowest during extinction. Planned paired-sample t tests confirmed a significantly higher frequency of choosing the devalued video reward in acquisition compared to the extinction phase (t(17) = 2.74, p = 0.014, d = 0.65). There was, however, no significant difference in frequency between extinction and reacquisition phases (t(16) = −1.71, p = 0.107, d = 0.41) nor between acquisition and reacquisition phases (t(16) = 0.66, p = 0.521, d = 0.16). The mean devaluation effect was −11.1%. Devaluation was observed in 12/18 individual participants (67% prevalence).
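The planned comparisons can be illustrated with a short Python sketch of a paired-sample t statistic and an effect size computed on the paired differences. The text does not state which variant of d was used; the reported values are, however, numerically consistent with d = t/√n (for example, 2.74/√18 ≈ 0.65), which is what the sketch computes. The function name is hypothetical, and the p value is omitted because it requires the t distribution, which the standard library does not provide.

```python
import math
import statistics

def paired_t_and_d(phase_a, phase_b):
    """Paired-sample t statistic, degrees of freedom, and d on the differences."""
    diffs = [a - b for a, b in zip(phase_a, phase_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample SD of the differences
    d = mean_diff / sd_diff                    # standardized mean difference
    t = mean_diff / (sd_diff / math.sqrt(n))   # equals d * sqrt(n)
    return t, n - 1, d
```

Each input is a list with one value per participant (for instance, the frequency of choosing the devalued cue in acquisition and in extinction), in the same participant order.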

Results
The results show that TD children aged five changed their behavior in response to the changing value of a reward, as demonstrated by a devaluation effect. As a result, Study 1 confirmed that the experimental task was successful as a devaluation paradigm suitable for young children.

Study 2: Experimental Study
The study was approved by the Department of Psychology Research Ethics Committee at City, University of London (PSYETH (S/F) 15/16 179). Informed consent was obtained from all participants' caregivers.

Participants
Twenty-seven children and adolescents (4 females, mean age 105 ± 39 months, age range 50-184 months) were recruited from Special Educational Needs (SEN) schools in London and Sussex, UK, a majority of which were autism specialist schools (20/27 participants) on the basis of a diagnosis of ASD known to the school. Admission to an SEN school requires a clinical diagnosis through relevant assessments and following the National Institute for Health and Care Excellence (NICE) guidelines. Clinical diagnoses and other co-occurring conditions were confirmed by parent report and where possible with the information available in participants' Education, Health and Care Plan (EHCP; made available for 23/27 participants), which is a legal document issued by local authorities in the UK to describe a child's special educational needs (following appropriate assessments). All participants held a clinical diagnosis of Autism or ASD. In addition, 15 presented with language impairment or language delay (including 10 who were minimally verbal), 9 presented with a learning disability, 6 with Global Developmental Delay, 1 with ADHD, and 1 with Down syndrome. One participant also presented with epilepsy and microcephaly. Finally, EHCPs mentioned sensory difficulties for seven participants.

Background Measures
Note. One participant scored too low to obtain a standardized score and is reported separately ("low raw score"). Scores are reported as mean (SD; range).
To further characterize the clinical profile of autistic children, parents were asked to complete the Social Communication Questionnaire (SCQ) [Rutter, Bailey, & Lord, 2003] and the Repetitive Behavior Scale-Revised (RBS-R) [Bodfish, Symons, Parker, & Lewis, 2000]. The RBS-R rates the frequency and severity of RRBs from 0 (behavior does not occur) to 3 (behavior occurs and is a severe problem) using five empirically derived subscales: stereotypy, self-injurious, compulsive, ritualistic/sameness, and restricted [Lam & Aman, 2007]. Unfortunately, despite being offered individual help, some parents had difficulties filling in the questionnaires because English was not their first language. While we ensured that parents had sufficient understanding of the project to consent for their child to take part, this led to a fairly high proportion of missing data. Overall questionnaire response rates were 59% for the SCQ and 63% for the RBS-R. Two out of 16 (12.5%) respondents provided incomplete responses for the SCQ and 6 out of 17 (35.3%) respondents provided incomplete responses for the RBS-R. The proportion of missing data within respondents was 0.4% for SCQ and 2.7% for RBS-R. Missing values within the incomplete datasets were replaced using multiple imputation [Schafer, 1999; Sterne et al., 2009], which consists of generating replicate data sets and replacing the missing values in each of them with imputed values. Descriptive statistics of questionnaire scores are reported in Table 2. Twelve out of 16 participants scored above the ASD cutoff of 15 on the SCQ. The other four participants scored 10, 12, 13, and 14, respectively, and all had a confirmed clinical diagnosis of ASD.
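The general logic of multiple imputation (generate several completed replicates, estimate on each, then pool) can be sketched as follows. The text does not specify the imputation model or software, so this sketch swaps in a deliberately simple stand-in: each missing value is filled with a random draw from the observed values, and the replicate estimates are pooled with Rubin's rules. It should not be read as the authors' procedure; the function names and the `m` and `seed` parameters are hypothetical.

```python
import random
import statistics

def impute_once(values, rng):
    """Fill each missing value (None) with a random draw from the observed ones."""
    observed = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(observed) for v in values]

def pooled_mean(values, m=20, seed=0):
    """Mean pooled over m imputed replicates using Rubin's rules."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(values, rng)
        estimates.append(statistics.mean(completed))
        # Sampling variance of the mean within this replicate.
        variances.append(statistics.variance(completed) / len(completed))
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)          # average within-replicate variance
    between = statistics.variance(estimates)     # variance across replicates
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance
```

The between-replicate term is what distinguishes multiple imputation from filling each gap once: it carries the uncertainty about the missing values into the pooled variance.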
Despite introducing some flexibility in the administration of the task (see procedure), nine participants were not able to complete the BPVS3, mostly because they did not engage with the task or did not show sufficient understanding of the task (e.g., touching all the items in turn). Of the 18 other children who completed the BPVS3, only 4 scored high enough to obtain a standardized score. As expected for a group of participants with complex needs including a large number of individuals with language difficulties and developmental delay, participants showed a wide range of scores, with a mean raw score of 45 ± 35, ranging from 7 to 115. BPVS3 scores, chronological and equivalent age based on the BPVS3, are reported in Table 3.

Experimental Procedure
Participants were tested in a quiet room in their school (except for one participant who was tested at City, University of London) either on their own or in the presence of a familiar adult over one to four sessions (four sessions on average), some of which included related experimental tasks which are not included in this paper. Each session lasted about 30-40 min. Particular effort was put into making the sessions comfortable for participants with complex needs, including extra familiarization time with the experimenter, breaks as required, minimal verbal instructions and multiple opportunities to complete the task over the sessions when one attempt was unsuccessful. Additional adaptations were provided to individual participants based on behavioral observation and, where available, the advice of the familiar adult accompanying the child. General and individual adaptations are described and reported in Table 4. Sessions were interrupted whenever participants showed signs of discomfort or distress.
Note. Nine participants who were not able to complete the test ("Not completed"), and 14 participants who scored too low to obtain a standardized score ("Low raw score") are reported separately. Scores are reported as mean (SD; range).
Despite these adaptations, some participants engaged in disruptive or challenging behaviors, which are summarized in Table 5. Due to time constraints, 1, 7, and 3 participants did not complete the task in the video, music, and physical reward condition respectively. Technical issues resulted in invalid data for 3 participants in the video condition and 1 participant in the physical reward condition. In addition, 4, 2 and 5 participants were not able to complete the task in the video, music, and physical reward condition respectively because they either showed no interest in the task or were not able to sustain their attention for the duration of the task. Finally, similarity scores resulted in the exclusion of 2 participants in the video condition (persistent response and strictly alternating response patterns), 3 participants in the music condition (persistent response and persistent response side patterns) and 5 participants in the physical reward condition (persistent response and persistent response side patterns). Results below are therefore reported for 17, 15 and 13 participants in the video, music, and physical reward conditions, respectively.
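The participant flow reported above can be checked with a short tally. The counts below are taken directly from the text; the dictionary keys and the function name are hypothetical labels, and the zero for technical issues in the music condition reflects that none are mentioned.

```python
N_RECRUITED = 27

# Exclusion counts per reward condition, as reported in the text.
EXCLUSIONS = {
    "video":    {"time": 1, "technical": 3, "not_engaged": 4, "rigid_pattern": 2},
    "music":    {"time": 7, "technical": 0, "not_engaged": 2, "rigid_pattern": 3},
    "physical": {"time": 3, "technical": 1, "not_engaged": 5, "rigid_pattern": 5},
}

def analysed(condition):
    """Participants remaining after all exclusions in a given condition."""
    return N_RECRUITED - sum(EXCLUSIONS[condition].values())

for condition in EXCLUSIONS:
    print(condition, analysed(condition))
```

The tally reproduces the 17, 15 and 13 participants analysed in the video, music, and physical reward conditions.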

Results
Video reward condition. Figure 4 shows the frequency of selecting the devalued video across phases, which was lowest during extinction. Planned paired-sample t tests confirmed a significantly higher frequency of choosing the devalued video reward in acquisition compared to the extinction phase (t(16) = 2.30, p = 0.035, d = 0.56). There was a mild, but not statistically significant, difference between extinction and reacquisition phases (t(15) = −1.96, p = 0.068, d = 0.49) and no difference between acquisition and reacquisition phases (t(15) = 0.05, p = 0.961, d = 0.01). The mean devaluation effect was −9.6%. Devaluation was observed in 9/17 individual participants (53% prevalence).
Music reward condition. One participant with a devaluation score greater than 2 standard deviations above the mean was excluded from the analysis. Figure 5 shows the frequency of selecting the devalued music reward across phases, which decreased throughout the phases. Planned paired-sample t tests confirmed a significantly higher frequency of choosing the devalued music reward in acquisition compared to extinction phase (t(13) = 2.45, p = 0.029, d = 0.66) and reacquisition phase (t(12) = 4.45, p = 0.001, d = 1.24). There was no significant difference between extinction and reacquisition phases (t(12) = 1.45, p = 0.172, d = 0.40). The mean devaluation effect was −11.2%. Devaluation was observed in 7/14 individual participants (50% prevalence).
Note. Adaptations were selected by the experimenter based on behavioral observation and/or the advice of a familiar adult (parent, teacher, teaching assistant, or school psychologist).
Although at the group level, autistic participants with complex needs showed a significant devaluation effect, indicating their ability to perform goal-directed behavior, the prevalence of the devaluation effect in each condition suggests that almost half of the participants did not. Sample size does not allow here for the meaningful analysis of subgroups. However, because this study aims to pave the way for further research into autism with complex needs, two additional tables are provided to further qualify the results.
To provide some insight into how goal-directed behavior depends on functioning level, Table 6 reports the proportion of task completion and the prevalence of devaluation for each task as a function of participants' success in completing the BPVS3. Visual inspection suggests that both task completion and likelihood to yield a devaluation effect were greater for participants who performed better on the BPVS3. Table 7 further summarizes participants' characteristics as a function of their prevalence status (devaluation effect, no effect, reverse effect). Visual inspection indicates a possible relationship between participants' likelihood to show a devaluation effect and their score on the SCQ, although in view of the small numbers in each cell and the amount of missing data in the questionnaires this observation should be taken with great caution.

Discussion
In this study, we examined reward devaluation in a group of autistic children and adolescents with complex needs, using an experimental procedure that should, in principle, be suitable for investigating basic learning processes in this grossly underrepresented group of autistic individuals. The objective of this study was two-fold. First, to test the feasibility of our experimental procedure as a way of involving autistic individuals with complex needs in research. Second, to begin to shed light on a basic learning phenomenon (reward devaluation) that is fundamental to many educational and intervention practices. In the following discussion we first consider how successful the experimental procedures were in eliciting reliable data from participants with complex needs, before discussing the implications of our findings.

Evaluation of the Experimental Setup
Because, to our knowledge, this is the first instance of using a devaluation paradigm with individuals with complex needs, it is important to assess the success of the present setup in engaging participants in the task and measuring the target behavior. In Study 1, all but one child completed the task, and there was a robust devaluation effect, showing that the experimental design was appropriate to engage young children and elicit goal-directed behavior. In Study 2, 22/26 and 18/20 children successfully completed the task in the video and music conditions, respectively. For a majority of children, both the video and music reward conditions were therefore successful in terms of engaging participants. Signs of enjoyment varied between children and conditions but included getting close to the screen, smiling and laughing, verbalizing, singing and dancing along to the clips, and getting upset when the task ended. Video clips were particularly effective in eliciting interest and sustained attention. In addition, due to their audio-visual nature, they allowed participants with auditory or visual sensory sensitivities to engage with at least some aspects of the stimuli (e.g., participants who looked away or watched through their fingers still received an auditory reward, and participants wearing ear defenders still received a visual reward). We observed fairly robust devaluation effects in both conditions, suggesting that the setup generated meaningful behaviors.
Note. Participants who did not have the opportunity to complete a certain task, or for whom data were excluded on the basis of either technical issues or repetitive response patterns, are not included in the table. A participant is counted as showing a prevalence effect if their devaluation score is greater than expected by chance (i.e., more negative than −6.7%).
Successful adaptations were made that enabled participants with complex needs to take part in this study, including additional opportunities to do the task when completion was not achieved the first time and the use of a touchscreen interface.
In contrast, the physical reward condition presented limitations, even though 19/24 children completed the task and the condition elicited a fair degree of curiosity and excitement in participants. These limitations included diminished cue-reward contingency due to the delay between touching the cue and obtaining the reward, the intrinsic motivational value of triggering the dispenser, and the poor discriminability of the chosen types of rewards (stickers of different categories). Future adaptations using a reward dispenser should try to address the immediacy of the cue-reward contingency and increase the discriminability between different rewards. For example, using the same reward dispenser setup as in the present study, small balls of two distinct colors (which can be dispensed without the intermediary of capsules) could be used as rewards.

Goal-directed Actions in ASD and Complex Needs
Our findings showed that, contrary to our prediction, autistic children and adolescents with complex needs did show a devaluation effect in the video and music reward conditions. This result goes against our hypothesis that autistic individuals with complex needs might present with a developmentally immature reward processing system. The result also contradicts the hypothesis that behavior would be better explained by the tendency to engage in RRBs (although a small number of participants were excluded on the basis of repetitive patterns of responses) and the notion that repeating a stimulus might increase its motivational value on the basis of sameness. Our results suggest instead that participants adapted their behavior to dynamic changes in the motivational value of a reward, demonstrating goal-directed behavior. This is an important stepping stone for further research in this population, as rewards are a crucial factor in encouraging participants to engage with research tasks, and for developing adapted support strategies in educational and clinical contexts where rewards are used to elicit desired behaviors such as learning.
Note. Participants who did not have the opportunity to complete a certain task, or for whom data were excluded on the basis of technical issues, are not included in the table. A participant is counted as showing a prevalence effect in one of the tasks if their devaluation score is greater than expected by chance (i.e., more negative than −6.7%) and a reverse devaluation effect if their devaluation score is lower than expected by chance (i.e., more positive than 6.7%). Anything between −6.7 and +6.7 counted as an absence of effect. BPVS3, British Picture Vocabulary Scale 3rd edition in Study 1; RBS-R, Repetitive Behavior Scale-Revised; SCQ, Social Communication Questionnaire. a Because of constraints independent from the study, one participant did not get the opportunity to complete the BPVS3.
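The prevalence classification described in the table notes (devaluation effect, no effect, reverse effect) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the analysis code used in the study: the function names, the example scores, and the treatment of scores falling exactly on ±6.7 as "no effect" are all hypothetical, and the sketch takes devaluation scores (in percent) as already computed.

```python
from collections import Counter

# Chance threshold in percentage points, taken from the table notes.
CHANCE_THRESHOLD = 6.7

LABELS = ("devaluation effect", "no effect", "reverse effect")


def classify_devaluation(score_pct, threshold=CHANCE_THRESHOLD):
    """Assign a prevalence status to one devaluation score (in percent).

    Assumption: scores exactly equal to -threshold or +threshold are
    treated as "no effect"; the notes do not specify the boundary case.
    """
    if score_pct < -threshold:
        return "devaluation effect"  # more negative than -6.7%
    if score_pct > threshold:
        return "reverse effect"      # more positive than +6.7%
    return "no effect"


def prevalence(scores_pct):
    """Return the proportion of participants in each prevalence status."""
    if not scores_pct:
        return {label: 0.0 for label in LABELS}
    counts = Counter(classify_devaluation(s) for s in scores_pct)
    return {label: counts[label] / len(scores_pct) for label in LABELS}


# Made-up scores for illustration only; these are not data from the study.
example_scores = [-25.0, -6.7, 0.0, 10.0]
print(prevalence(example_scores))
```

Keeping the threshold as a parameter makes it straightforward to check how sensitive the reported prevalence is to the chance criterion.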
Although not the focal point in this study, data from the reacquisition phase gave us some insight into aspects of the devaluation effect: the frequency of choosing the devalued reward increased again in the reacquisition phase in the video condition, but decreased further in the music condition. Future research should determine whether the stability of the devaluation effect depends on the absolute motivational value held by a reward (a highly desired reward such as a video clip might show a more transient devaluation effect compared to a less desired reward such as a music clip).

Limitations and Why They Should Not Stop Us
This study presented some limitations. Sample sizes were fairly small, though the measured effect sizes for the devaluation effect were medium to large. The experimental group was highly heterogeneous in terms of age, clinical profile, and general abilities. Furthermore, a small proportion of participants did not succeed in completing the experimental task. The conclusions are therefore likely not representative of the entire autistic population with complex needs. Reporting the prevalence of the devaluation effect also allowed us to highlight some of the heterogeneity within the population and suggested that while a majority of autistic children with complex needs did show goal-directed behavior, a substantial number of them did not. Future research could explore whether developmental level or symptom severity can account for these differences. Questionnaires presented a high proportion of missing data, which was mainly due to the fact that many parents were immigrants to the United Kingdom and not native speakers of English. Two recent studies [Delobel-Ayoub, Ehlinger, & Klapouszczak, 2015; Kelly et al., 2019] report that ASD is both underdiagnosed and more prevalent in populations of lower socioeconomic status (SES), including immigrants, and flag the need to improve autism awareness in these groups. In order to obtain data representative of the autistic population, it is therefore important to include participants from lower SES areas and immigrant backgrounds in research, despite potential shortcomings such as a higher rate of missing data.
The BPVS3 was also not effective in capturing functioning levels in our experimental group (9 participants could not complete it, and of those who did, only 4/18 scored high enough to obtain a standardized score). Failure to capture participants' characteristics accurately is one of the main barriers to conducting research in populations with complex needs. In this study, the BPVS3 results (and failures) bring to the fore the heterogeneity of these participants and the difficulties associated with including them in standard research protocols. Developing better tools to evaluate levels of functioning is therefore a challenge that future research should address in order to describe patterns of strengths and difficulties in participants with complex needs. The design used in this study might pave the way for shorter, nonverbal, reward-driven procedures to that effect.
Overall, this study laid some important foundations for research in autism with complex needs, showing that this underrepresented population could participate in research when provided with adapted experimental designs. Contrary to our hypothesis, a majority of individuals with complex needs demonstrated goal-directed behavior.