Evidence‐based use of scalable biomarkers to increase diagnostic efficiency and decrease the lifetime costs of autism

Abstract Challenges associated with the current screening and diagnostic process for autism spectrum disorder (ASD) in the US cause a significant delay in the initiation of evidence‐based interventions at an early age when treatments are most effective. The present study shows how applying a second‐order diagnostic measure to high‐risk cases initially flagged positive by screening tools can further inform clinical judgment and substantially improve early identification. We use two example measures for the purposes of this demonstration: a saliva test and eye‐tracking technology, both scalable and easy‐to‐implement biomarkers recently introduced in ASD research. Results of the current cost‐savings analysis indicate that lifetime societal cost savings in special education, medical and residential care are estimated to be nearly $580,000 per ASD child, with annual cost savings in education exceeding $13.3 billion, and annual cost savings in medical and residential care exceeding $23.8 billion (of these, nearly $11.2 billion are attributable to Medicaid). These savings total more than $37 billion/year in the US. Initiating appropriate interventions faster and reducing the number of unnecessary diagnostic evaluations can decrease the lifetime costs of ASD to society. We demonstrate the value of implementing a scalable, highly accurate diagnostic in terms of cost savings to the US.

Lay Summary This paper demonstrates how biomarkers with high accuracy for detecting autism spectrum disorder (ASD) could be used to increase the efficiency of early diagnosis. Results also show that, if more children with ASD are identified early and referred for early intervention services, the system would realize substantial cost savings across the lifespan.


INTRODUCTION
Autism spectrum disorder (ASD) is an etiologically and phenotypically heterogeneous disorder with two symptom domains, social communication/interaction and restricted/repetitive behavior. It is frequently accompanied by co-occurring medical and mental health conditions and results in variable but lifelong functional impairments and challenges, particularly in social behavior. Over the last two decades, the prevalence of ASD has increased dramatically: in 2000, the prevalence in the US was estimated to be 1 in 150 children, and as recently as 2019, the prevalence was estimated to be 1 in 54 children (Centers for Disease Control and Prevention, 2020).
Caring for an individual with ASD affects all aspects of a family unit, requires considerable use of community resources and results in substantial costs to the family and society (Hyman et al., 2020). The cost of caring for Americans with ASD was estimated to be $268 billion in 2015 (Leigh & Du, 2015), of which $191 billion was attributable to adults with ASD. By 2025, total annual costs are expected to be $461 billion (Leigh & Du, 2015). Medical care represents a significant component of these costs and is 4-6 times higher for individuals with ASD than those without ASD (Shimabukuro et al., 2008).
Early accurate diagnosis followed by appropriate early and intensive intervention has the potential to significantly decrease lifetime cost while improving functioning and well-being. Randomized controlled trials indicate that a significant proportion of children with ASD who have access to early intensive behaviorally-based interventions, including naturalistic developmental and behavioral interventions, starting in the second or third year of life (hereafter EIBI for simplicity) show substantial cognitive and functional gains and symptom improvement relative to eclectic or treatment-as-usual conditions (Dawson et al., 2010; Eldevik et al., 2009; Granpeesheh et al., 2009; Green et al., 2017; Hardan et al., 2015; Howard et al., 2005; Howlin et al., 2009; Kasari et al., 2008; Lovaas, 1987; Mohammadzaheri et al., 2014; National Research Council Division of Behavioral and Social Sciences Education, 2001; Peters-Scheffer et al., 2011). Associated with these cognitive and functional gains are significant reductions in ongoing costs, including special education and medical care (Dawson & Bernier, 2013).
The current average age of ASD diagnosis in the US is 4 years (Centers for Disease Control and Prevention, 2019), which significantly reduces the ability to positively influence early developmental trajectories. To address the growing need to identify children with ASD early, the American Academy of Pediatrics (AAP) developed and published a Surveillance and Screening Algorithm for ASD (Johnson & Myers, 2007) and in 2020 released an updated clinical report with consistent recommendations for physicians in primary care to screen all children for ASD (Hyman et al., 2020). Despite the clear guidance to universally screen for ASD in primary care settings, fewer than 60% of pediatricians administer an ASD-specific screening tool at the 18- and 24-month preventive care visits (Siu et al., 2016). Limited screening adherence causes physicians to refer for diagnostic evaluations when symptoms fully manifest at older ages, contributing to a delay in final diagnosis and access to services (Siu et al., 2016). Yet fully implementing the recommendations would only further inundate specialty care clinics and limit access to comprehensive diagnostic evaluation.
After a child is identified as "at risk" using a screening tool, there are significant wait times in clinics due to the abundance of time needed to properly conduct the evaluation per patient (e.g., sometimes multiple visits 2-3 h each). CDC Pathways Survey data from 1420 families of children with ASD reported an average wait time of 3 years between parents' first concerns and receiving a diagnosis of ASD (Oswald et al., 2017).
Given these limitations, scalable, easy-to-implement diagnostic measures, used in an evidence-based medicine framework (Guyatt et al., 2002; Youngstrom et al., 2017) and applied to individuals identified at "elevated risk" by current screening measures, could substantially improve the efficiency of ASD identification. A scalable test would be implemented widely in the healthcare system and would need to be highly feasible, efficiently implemented, and cost-effective. Differentiation of ASD from non-ASD cases is also a key consideration for any tool implemented after screening because the majority of false positive screens will be children with other developmental or neuropsychiatric conditions (e.g., anxiety, language/communication disorder, ADHD, oppositional defiant disorder, etc.).
Recent progress investigating a saliva-based test (Hicks, Rajan, et al., 2018) and remote eye gaze tracking (Frazier et al., 2018) holds promise in ASD diagnostics. Researchers demonstrated that molecules in saliva are highly accurate in differentiating children with ASD from at-risk children aged 18 months to 83 months (area under the curve or AUC = 0.88) in a large clinical study (n = 451; Hicks, Rajan, et al., 2018). In 2019, this technology was released commercially for use in clinical practice as a diagnostic aid (Geddes, 2020). Similarly, eye tracking measures have shown consistent validity in differentiating ASD and non-ASD individuals responding to social and nonsocial stimuli. Recent studies have supported high levels of validity when measurements are aggregated across stimuli and paradigms (Frazier et al., 2016; Pierce et al., 2011), and a recent remote eye gaze tracking assessment indicated high diagnostic accuracy (AUC = 0.86) when an aggregate index was trained and tested in a study of 201 children (91 with ASD and 110 non-ASD) (Frazier et al., 2018).
The addition of a second order diagnostic aid to the screening process but prior to comprehensive evaluation has the potential to substantially increase the efficiency of ASD identification and lead to more rapid referral for early intervention services. Evidence-based assessment, including the use of multi-level diagnostic likelihood ratios, provides a method by which scalable, diagnostic aids with good accuracy could be applied to screen positive cases. Specifically, test scores at levels that decrease the likelihood of ASD could be combined with the postscreening probability to rule out false positives flagged from screening tools. Similarly, test scores at levels that increase the likelihood of diagnosis could be combined with the postscreening probability to identify cases at highest priority and need for evaluation, and, with very high score levels, rule-in patients with a very high probability of ASD that do not need an expensive and time-consuming ASD evaluation prior to treatment initiation (Figure 1). Furthermore, parents of children with and without ASD have demonstrated strong interest in an objective diagnostic aid for ASD (Wagner et al., 2019), suggesting that adding an objective diagnostic measure into the current diagnostic process can increase parental acceptance of the diagnosis and decrease the perceived necessity of seeking additional second opinions.
The first aim of the present study is to demonstrate how a scalable diagnostic evaluation tool used in an evidence-based assessment framework (Guyatt et al., 2002; Sackett et al., 2000) to further triage and inform clinical judgment for high-risk cases can substantially improve early ASD identification. Specifically, the paper demonstrates how test scores on scalable diagnostics could be used individually or jointly across a range of postscreening settings to determine whether additional evaluation is needed (test/no-test threshold) or whether treatment might be initiated (treatment threshold) (Frazier & Youngstrom, 2006; Guyatt et al., 2002; Jenkins et al., 2012; Sackett et al., 2000; Youngstrom et al., 2017). Improvements to the diagnostic process may ultimately lead to a decrease in lifetime costs of ASD by allowing appropriate interventions to be initiated faster and reducing the number of unnecessary diagnostic evaluations. Consequently, the second aim of the present study is to show the cost savings to the US from facilitating an early accurate diagnosis of ASD leading to appropriate early intervention.
FIGURE 1 Clinical flow diagram. Results of a biomarker diagnostic determine whether additional evaluation is needed (test/no-test threshold) and whether treatment might be initiated (treatment threshold). These thresholds are not fixed and are often dependent on additional considerations such as how important it is to identify the condition. Ultimately, exact thresholds are based on the clinical setting and determined by the clinician in consultation with the patient. If the first biomarker result is between thresholds (above the test/no-test threshold but below the treatment threshold), the second biomarker (whichever was not administered first) would be administered. In this case, Table 5 would be used to generate the final post-test probability for use of two biomarkers if results correspond to the presented outcomes (low-low, high-high, etc.). However, if some other combination of results were observed, the likelihood ratio values from Tables 1 and 2 could be applied in iterative fashion using Bayes' theorem to generate a final post-test probability based on both biomarkers.

Diagnostic measure validation samples
To illustrate the potential utility of ASD diagnostic biomarkers implemented after primary care screening, two distinct measures were used: one based on molecular data (Hicks, Rajan, et al., 2018; Table 1) and a second based on data from remote eye gaze tracking (Frazier et al., 2018; Table 2). Two measures are used to demonstrate that the approach is measure-agnostic as long as the diagnostic tool being considered is highly scalable and has good validity for differentiating ASD and non-ASD cases. These particular measures were chosen because they have shown promise in initial development and replication studies (Frazier et al., 2016, 2018; Hicks et al., 2016; Hicks, Rajan, et al., 2018; Hicks, Uhlig, et al., 2018) and are likely to provide incremental validity by evaluating different neurophysiological processes.
In the first sample, a molecular diagnostic panel was derived and replicated using data from 451 patients (238 children with ASD, 84 children with non-ASD developmental delay, and 134 neurotypical children). The panel had high overall accuracy for detecting ASD (AUC = 0.88; Hicks, Rajan, et al., 2018). For the purposes of this study, molecular diagnostic data for ASD and all non-ASD cases are used because presumably a mixture of healthy or neurodevelopmental non-ASD cases will screen positive on the M-CHAT-RF given recent positive predictive value estimates. In the second sample, remote eye gaze tracking to an audio-visual stimulus was administered using seven distinct social stimulus paradigms (7-min total administration). An autism risk index was empirically developed and validated for differentiating ASD (n = 90) and non-ASD neurodevelopmental disorder cases (n = 110), with high overall accuracy (AUC = 0.86; Frazier et al., 2018).
TABLE 1 Multi-level likelihood ratios, post-test probabilities for base rates .10-.50, and sensitivity and specificity values for the molecular diagnostic measure. Note: Sensitivity and specificity values were based on the mid-point of the range for low, indeterminant, and high score ranges. LR− reported for low and very low score ranges. LR+ reported for indeterminant, high, and very high score ranges. Abbreviations: BR, base rate; PP, posterior probability.

Applying evidence-based assessment
Evidence-based assessment methods focus on using available research to select measures and to guide the assessment process. Importantly, evidence-based assessment allows the clinician to more accurately evaluate test results in relation to the base rate of the condition and the predictive validity associated with a test score. In the present context, the base rate provides the pretest probability of an ASD diagnosis. Given that universal screening is recommended for ASD, that an increasing proportion of children are being screened, and that studies of the M-CHAT-RF have suggested positive predictive values as low as 14.6%, the present demonstration assumes a range of initial base rates from .10 (implying 1 in 10 M-CHAT-RF screen positives have ASD) to .50 (implying 1 in 2 M-CHAT-RF screen positives have ASD). The upper end of this range is provided to simulate specialty care clinics where base rates of ASD diagnosis often hover around 50% because many patients receive both early screening and additional triage as a result of parental or care provider concern.
In an evidence-based assessment framework, the predictive validity of a test score is often represented using a likelihood ratio. For tests with continuous or quasicontinuous scaling, multi-level likelihood ratios are recommended (Guyatt et al., 2002). Multi-level likelihood ratios quantify the predictive value of test scores across defined score ranges, in contrast to AUC values, which estimate accuracy across the full range of scores. In the case of ASD, multi-level likelihood ratios permit the articulation of score ranges that substantively decrease the probability of diagnosis (and in the extreme case may rule it out entirely) as well as score ranges that substantively increase the probability of diagnosis (or in the extreme case rule-in ASD). Multi-level likelihood ratios are superior to cut scores because they do not assume that all scores below or above a particular cut score have the same predictive validity. Furthermore, multi-level likelihood ratios can be easily combined with base rates to understand the probability of diagnosis. Generally, likelihood ratios <.50 are considered useful for decreasing the probability of a condition and those <0.10 are typically strong enough to rule out a diagnosis. Conversely, likelihood ratios >2.0 are considered useful for increasing the probability of a diagnosis and those >10.0 may be strong enough to rule-in the diagnosis. Likelihood ratios of 1.0 do not alter the probability of diagnosis.
By combining the most likely pretest probability with the likelihood ratio of the observed score, it is possible to generate a post-test or posterior probability. In an evidence-based framework, post-test probabilities can be used to grade clinical decision-making into more nuanced options and are typically interpreted with reference to two basic clinical decisions: the test/no test threshold and the treatment threshold. The test/no test threshold defines the probability of the disorder at which further evaluation is recommended and provides a rational approach for determining when to collect additional assessment information. The treatment threshold defines the probability of the disorder at which treatment is recommended. These thresholds are not fixed and are often dependent on a multitude of considerations such as how important it is to identify the condition. For example, the test/no test threshold could be quite low for a condition with a highly accurate, inexpensive, easily scaled, and readily administered test, while the treatment threshold could be very high for expensive, risky treatments. Ultimately, the exact threshold is dependent on the clinician and the clinical setting, and is often set in consultation with the patient. For the present demonstration, we suggest that, in most settings, screening generates at least a 10% postscreening probability of ASD (in children who screen positive). This probability is assumed to be sufficiently high to pass the test/no test threshold and merit application of scalable, easily administered diagnostic measures, particularly given that ASD is often associated with significant lifelong disability and many cases show good response to early intensive intervention. Furthermore, we assume that probabilities below 10% (implying that additional testing has not supported the presence of ASD) are sufficient to rule out additional evaluation.
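The combination of a pretest probability with a likelihood ratio can be sketched in a few lines using the odds form of Bayes' theorem. This is a minimal illustration, not the calculator used in the study; the function name is hypothetical, and the likelihood ratios in the loop are the generic benchmark values mentioned in the text (0.10, 2.0, 10.0) plus a neutral 1.0, not the values reported in Tables 1 and 2.

```python
def posterior_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Post-test probability from a pretest probability and a likelihood ratio.

    Uses the odds form of Bayes' theorem:
    post-test odds = pretest odds * likelihood ratio.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Base rates span the postscreening range assumed in the text (.10 to .50).
# The likelihood ratios are generic benchmarks, NOT values from Tables 1 and 2.
for base_rate in (0.10, 0.20, 0.30, 0.40, 0.50):
    row = [posterior_probability(base_rate, lr) for lr in (0.10, 1.0, 2.0, 10.0)]
    print(f"BR={base_rate:.2f}: " + "  ".join(f"{p:.3f}" for p in row))
```

A likelihood ratio of 1.0 returns the base rate unchanged, which matches the statement that such ratios do not alter the probability of diagnosis.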
The treatment threshold is more nuanced and will be highly dependent on the family and patient situation. However, for the purposes of this analysis, we assume that probabilities >50% are sufficient to initiate less intensive, scalable, insurance-billed treatments such as parent-mediated intervention (Green et al., 2017) while probabilities >80% are sufficient to rule-in ASD and initiate therapeutic interventions and de-prioritize additional expensive diagnostic evaluation (although evaluation for treatment tailoring and other recommendations may ultimately be needed). Finally, we assume that probabilities between 10 and 50% suggest a very high need for additional specialty evaluation, as these individuals are not yet recommended for any intervention, while probabilities between 50 and 80% suggest the next level priority for evaluation because they are recommended to receive less intensive intervention, but may actually require more intensive intervention.
Clinicians could adopt different probability levels or even more nuanced actions. For example, a clinician (in conjunction with the caregiver and patient) may decide that probabilities between .05 and .10 are sufficiently high that additional testing should still be considered, particularly if additional low-cost, easily acquired measures were available. Similarly, if EIBI were to become more widely available and supported through existing funding streams, the post-test probability level at which intensive treatment could be considered might be substantially lower.
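The probability tiers assumed for this demonstration can be written as a small decision function. The sketch below is only an illustration of the thresholds stated above (10%, 50%, 80%); the function name and the wording of the returned labels are hypothetical, and the handling of values exactly at a boundary is an assumption, since the text does not specify it.

```python
def triage(post_test_probability: float) -> str:
    """Map a post-test probability of ASD to the action tiers assumed in the text.

    Boundary handling (e.g., exactly 0.50) is an assumption of this sketch.
    """
    if not 0.0 <= post_test_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if post_test_probability < 0.10:
        return "rule out: no additional ASD evaluation"
    if post_test_probability <= 0.50:
        return "no intervention yet: highest priority for specialty evaluation"
    if post_test_probability <= 0.80:
        return "start less intensive intervention: next priority for evaluation"
    return "rule in: start intervention, de-prioritize diagnostic evaluation"


for p in (0.05, 0.30, 0.65, 0.90):
    print(f"{p:.2f} -> {triage(p)}")
```

A clinic adopting different levels, such as the .05 lower bound mentioned above, would only need to change the constants.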

Diagnostic efficiency analyses
Tables 1 and 2 present diagnostic efficiency statistics, including multi-level likelihood ratios and posterior probabilities across a range of potential base rates of ASD observed postscreening for the molecular and eye-tracking diagnostics, respectively. An upper base rate of .50 is provided because this pretest probability is common in specialty care clinics where formal screening has occurred and caregivers have decided to follow through with making and keeping appointments. In this scenario, the molecular and eye-tracking diagnostics are functioning in the role of specialty care triage, which is analogous to second-level screening. Multi-level likelihood ratios were derived for each diagnostic measure using the Evidence-Based Medicine Toolbox diagnostic test calculator (https://ebm-tools.knowledgetranslation.net/calculator/diagnostic/). For these calculations, score ranges for the molecular and eye tracking diagnostic measures were separately identified that corresponded to "very low," "low," "high," and "very high" proportions of ASD relative to non-ASD cases based on inspection of the score distributions. For the eye tracking diagnostic, an "indeterminant" category was also used to demonstrate how multi-level likelihood ratios could account for portions of the score distribution with strong overlap between ASD and non-ASD cases. This approach has been previously used (Frazier et al., 2007), follows evidence-based medicine recommendations (Frazier & Youngstrom, 2006; Sackett et al., 2000), and typically generates score ranges with likelihood ratios that are useful for substantially altering the probability of a clinical diagnosis and informing the test/no test and treatment thresholds.
Case examples with low and high scores are presented to demonstrate how biomarkers can be used to facilitate ASD identification, both individually or jointly (through iterative application). AUC values and sensitivity and specificity values at corresponding cut scores are also presented to connect familiar cut score approaches to interpretation.
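For readers unfamiliar with multi-level likelihood ratios, the computation for each score range is the proportion of ASD cases falling in that range divided by the proportion of non-ASD cases falling in the same range. The sketch below illustrates that definition only; the counts are hypothetical and are not the data behind Tables 1 and 2, and the function name is an assumption of this example.

```python
def multilevel_likelihood_ratios(asd_counts: dict, non_asd_counts: dict) -> dict:
    """Likelihood ratio per score range: P(range | ASD) / P(range | non-ASD)."""
    total_asd = sum(asd_counts.values())
    total_non_asd = sum(non_asd_counts.values())
    ratios = {}
    for score_range, n_asd in asd_counts.items():
        p_asd = n_asd / total_asd
        p_non_asd = non_asd_counts[score_range] / total_non_asd
        # An empty non-ASD cell gives an unbounded ratio.
        ratios[score_range] = p_asd / p_non_asd if p_non_asd > 0 else float("inf")
    return ratios


# Hypothetical counts of cases per score range (illustration only).
asd = {"very low": 5, "low": 15, "high": 40, "very high": 40}
non_asd = {"very low": 50, "low": 30, "high": 15, "very high": 5}

for score_range, lr in multilevel_likelihood_ratios(asd, non_asd).items():
    print(f"{score_range:>9}: LR = {lr:.2f}")
```

Unlike a single cut score, each range carries its own ratio, so a very low score and a merely low score shift the probability by different amounts.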

Cost savings from accurate early diagnosis and intervention
Using an evidence-based assessment approach with scalable biomarkers should result in a substantial proportion of children having sufficiently high post-test probabilities to initiate treatment and a large proportion of children having sufficiently low post-test probabilities to rule out additional evaluation. Thus, cost savings analyses conservatively assume, based on the biomarker accuracy estimates AUC = 0.86-0.88 (Frazier et al., 2018; Hicks, Rajan, et al., 2018), that at least 86% of screened ASD cases will be appropriately triaged. We use 86% in our cost-savings analysis as an indication of how often ASD-affected children who receive a scalable postscreening diagnostic test will receive appropriate intervention services.
To estimate US cost savings associated with (i) facilitating an early accurate diagnosis of ASD and (ii) early identification leading to appropriate early intensive behavioral interventions, an analysis was conducted to forecast estimates of costs associated with (a) special education (including federal, state and local district expenses for special education) and (b) medical and residential care expenses (including federal and state Medicaid expenses).
This cost-benefit analysis acknowledges the following:
1. Predictors of reduced severity of ASD symptoms as a result of EIBI include age at intervention enrollment, cognitive functioning, and initial ASD symptom severity (Landa, 2018).
2. The proportion of children who become functionally indistinguishable from their peers is probably lower than the proportion often reported in the literature (just under 50% [Lovaas, 1987]). Among children with ASD who receive competently delivered EIBI, between 20 and 50% will be functionally indistinguishable from age-matched peers; between 20 and 40% will achieve meaningful but moderate gains; and 10-40% will continue to require intensive special education and adult services. For this financial model, we use the results of a meta-analysis and assume that 29% achieve age-appropriate functional behavior, 34% achieve meaningful but moderate gains, and 37% require intensive special education and adult services (Peters-Scheffer et al., 2012).
3. Without EIBI, the majority of children with ASD will manifest enduring dependency on special education and adult developmental disability services: among children with ASD who have not received EIBI, a meta-analysis suggests that only 11% will achieve age-appropriate functional behavior, 8% will achieve meaningful but moderate gains, and 81% will require intensive special education and adult services (Peters-Scheffer et al., 2012).
For these reasons, this cost-benefit analysis is framed in terms of marginal gains as well as the attainment of age appropriate functional behavior.
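The marginal gains implied by the two outcome distributions above can be made explicit with simple subtraction. The sketch below uses the meta-analytic proportions quoted in the text; the function name and the optional `share_reached` scaling factor are hypothetical, and applying the 86% triage assumption as a simple multiplier is only one plausible reading. The study's actual methodology is described in its Supporting Information.

```python
# Outcome proportions quoted in the text (Peters-Scheffer et al., 2012).
OUTCOMES_WITH_EIBI = {"age_appropriate": 0.29, "moderate_gains": 0.34, "intensive_services": 0.37}
OUTCOMES_WITHOUT_EIBI = {"age_appropriate": 0.11, "moderate_gains": 0.08, "intensive_services": 0.81}


def marginal_shift(with_eibi: dict, without_eibi: dict, share_reached: float = 1.0) -> dict:
    """Change in the share of children in each outcome group attributable to EIBI.

    share_reached scales the shift by the fraction of children assumed to be
    appropriately triaged (a simplifying assumption of this sketch).
    """
    return {k: share_reached * (with_eibi[k] - without_eibi[k]) for k in with_eibi}


for label, share in (("all children reached", 1.0), ("86% reached", 0.86)):
    shift = marginal_shift(OUTCOMES_WITH_EIBI, OUTCOMES_WITHOUT_EIBI, share)
    summary = ", ".join(f"{k}: {1000 * v:+.0f}" for k, v in shift.items())
    print(f"Per 1,000 children ({label}): {summary}")
```

Because both distributions sum to 100%, the shifts sum to zero: every child who moves into a better outcome group leaves a more costly one.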
Additional assumptions in this analysis include the following:
1. Children who are diagnosed with ASD have access to EIBI services. Children who are not identified early as having ASD could still receive interventions; however, these interventions would be nonspecific to the treatment of core symptoms of autism (e.g., occupational therapy, physical therapy), and/or would be related but balanced between the EIBI and no EIBI scenarios (e.g., speech therapy), and/or would likely be much lower intensity.
2. The cost of EIBI services is assumed to be a representative average for both center-based and home-based services (average of $45,000/year). Given that children with ASD receive 20-40 h/week of EIBI (Reichow et al., 2018), we used 30 h/week and 50 weeks/year as a conservative estimate and assumed $30/hour (the average hourly rate of a board certified behavior analyst).
3. The average duration of EIBI is assumed to be 3 years (Jacobson et al., 1998).
4. Consistent with prior literature estimates, 31% of children with ASD are assumed to also have an intellectual disability (ID) (Centers for Disease Control and Prevention, 2014).
5. Children with ASD who achieve age-appropriate functional behavior are assumed to use family support services only during participation in EIBI; those who make moderate gains or realize minimal effects are assumed to use 18 years of services.
6. All savings shown are net of the expense of providing EIBI (which is assumed to be a medical expense).
7. Children with ASD who ultimately become functionally indistinguishable from their peers are assumed to participate in regular education and have normal medical expenses thereafter; those who make moderate gains are assumed to participate in special education (Peters-Scheffer et al., 2012) and have medical expenses associated with ASD children who have other comorbid conditions (Peacock et al., 2012); and children who make minimal gains are assumed to participate in intensive special education (Peters-Scheffer et al., 2012) and have medical expenses associated with ASD children who have other comorbid conditions including intellectual disabilities (Peacock et al., 2012).
8. Cost estimates that include the adult years are made only to age 54, consistent with the average age of mortality in ASD (Hirvikoski et al., 2016). This assumption is conservative since there is a high likelihood that future generations will live beyond 54 with improved medical care and awareness of ASD.
9. Cost estimates are based on Buescher et al. (2014): for medical and residential care, see Table 3; for special education and intensive special education for preschool children (ages 2-5), 2012 cost estimates were $31,460 and $62,920, respectively; for special education and intensive special education for school-age children (ages 6-21), 2012 cost estimates were $13,980 and $27,961, respectively. Present costs (year 2020) were derived from historic cost estimates (year 2012), using annual rates of inflation for Medical Services or Elementary and High School Tuition and Fees, as appropriate (see Table 4).
10. Calculated present-day costs are assumed to increase annually at the prior 10-year average annual rate of inflation for Medical Services or Elementary and High School Tuition and Fees, as appropriate (see Table 4).
11. Future costs are discounted to present value at a rate equivalent to the 30-year US Treasury yield (1.56% on March 5, 2020; U.S. Department of the Treasury, 2020).
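Two of the assumptions above involve arithmetic that is easy to check: the annual EIBI cost (hours per week times weeks per year times hourly rate) and the discounting of inflated future costs to present value. The sketch below is illustrative only. The 3% inflation rate is a hypothetical placeholder (the actual rates are in Table 4), and treating the first year as undiscounted is a timing convention assumed for this example rather than one stated in the text.

```python
# Inputs stated in the assumptions above.
HOURS_PER_WEEK = 30
WEEKS_PER_YEAR = 50
HOURLY_RATE_USD = 30
EIBI_YEARS = 3
DISCOUNT_RATE = 0.0156  # 30-year US Treasury yield on March 5, 2020

# HYPOTHETICAL placeholder; the study takes inflation rates from its Table 4.
ASSUMED_INFLATION = 0.03

annual_eibi_cost = HOURS_PER_WEEK * WEEKS_PER_YEAR * HOURLY_RATE_USD


def present_value(cost_today: float, years: int, inflation: float, discount_rate: float) -> float:
    """Present value of a cost that recurs for `years` years.

    The cost grows with inflation and is discounted each year; year 0 is
    treated as undiscounted (an assumption of this sketch).
    """
    growth = (1.0 + inflation) / (1.0 + discount_rate)
    return sum(cost_today * growth ** t for t in range(years))


print(f"Annual EIBI cost: ${annual_eibi_cost:,}")
pv = present_value(annual_eibi_cost, EIBI_YEARS, ASSUMED_INFLATION, DISCOUNT_RATE)
print(f"Present value of {EIBI_YEARS} years of EIBI: ${pv:,.0f}")
```

When the inflation rate exceeds the discount rate, as in this placeholder scenario, the present value of future costs is larger than the simple undiscounted sum.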
Using the assumptions outlined above, cost savings estimates were derived using the specific methodology detailed in Supporting Information. The present value of cost savings derived from this analysis is apportioned between federal, state and local taxpayers using the following methodology: 1. According to a CSEF Report on State Special Education Finance Systems, approximately 45% of support for special education programs comes from states, 46% from local districts, and 9% from federal IDEA funding (Dragoo, 2018; Parrish et al., 2003).  Medicaid. The fixed percentage the federal government pays, known as the "FMAP," varies by state, with poorer states receiving larger amounts for each dollar they spend than wealthier states. The national average of 76.5% (KFF, 2020) was used in this analysis.
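The apportionment step is a straightforward proportional split. As a minimal sketch, the code below applies the funding shares stated above to the rounded annual totals quoted in the abstract ($13.3 billion for education, $11.2 billion for Medicaid). Because those inputs are rounded, the outputs only approximate the figures reported in the results; the function name is hypothetical.

```python
# Funding shares stated in the text.
EDUCATION_SHARES = {"federal": 0.09, "state": 0.45, "local": 0.46}
FMAP = 0.765  # national average federal share of Medicaid used in the analysis


def apportion(total_billion: float, shares: dict) -> dict:
    """Split a total (in $ billions) across payers in proportion to their shares."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return {payer: total_billion * share for payer, share in shares.items()}


# Rounded annual savings quoted in the abstract, in $ billions.
education = apportion(13.3, EDUCATION_SHARES)
medicaid = apportion(11.2, {"federal": FMAP, "state": 1.0 - FMAP})

for name, split in (("Education", education), ("Medicaid", medicaid)):
    print(name + ": " + ", ".join(f"{payer} ${value:.1f}B" for payer, value in split.items()))
```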
Avalere Health conducted an independent examination of the underlying assumptions associated with this cost savings analysis (included in Supporting Information).

Improving ASD identification
For both diagnostic measures, low and very low scores show good sensitivity, while high and very high scores show good specificity. However, sensitivity and specificity values do little to guide clinical judgment, as the clinician needs to know the probability of ASD diagnosis after using one or both of these measures. This requires the application of likelihood ratios under realistic base rate conditions. As shown in Tables 1 and 2, both the molecular and eye tracking diagnostics have multi-level LRs that fall in the clinically useful ranges for decreasing (LR <.50) and increasing (LR >2.0) the probability of diagnosis. For example, both measures individually produce very low posterior probabilities when very low scores are observed, even under the postscreening high base rate scenario. Thus, very low scores on either measure are likely sufficient to rule out ASD and either avoid additional testing or re-focus the priority for evaluation on other issues (e.g., speech language evaluation; see Figure 1 scenario A). While low scores on either measure alone are insufficient to rule out ASD (except in the lowest base rate conditions, BR <.20), jointly observing low scores on these measures is sufficient, even under the highest base rate condition, to rule out ASD and avoid additional testing (Table 5). In Figure 1, this scenario is represented by the path where the first biomarker produces a between-threshold result but administration of the second biomarker results in a rule-out (middle branch followed by left-sided branch, scenario B).
Very high scores on either measure generate post-test probabilities indicating it is more likely than not (PPs >.60 in all base rate conditions) that the patient being evaluated has ASD. Given the potential benefit and low risks of early behavioral intervention approaches, this information is likely sufficient to recommend initiating treatment in many circumstances (scenario C). At minimum, this information could be used to prioritize individuals for specialty care evaluation (scenario D). High scores on one measure are probably not sufficient to assume ASD is present and recommend expensive interventions, but may be sufficient to recommend less expensive approaches such as parent-mediated intervention or outpatient social skills training. However, high scores on both measures yield posterior probabilities that suggest EIBI should be initiated (PPs ≥.77 in all base rate conditions), if deemed clinically appropriate for the child (Table 5).
Note: Score combinations were chosen to represent extreme and middle combinations (absent indeterminant values for the eye tracking diagnostic measure). For score combinations, likelihood ratios are used in an iterative fashion, assuming no substantive correlation (r <.40) between the molecular diagnostic and the eye tracking diagnostic. If the actual correlation between two measures used iteratively is higher, posterior probabilities will be inflated.
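Iterative application of two biomarkers means the post-test probability from the first measure becomes the pretest probability for the second. The sketch below illustrates that chaining under the same no-substantive-correlation assumption described for the score combinations. The likelihood ratios used (0.30 for a low score, 6.0 for a high score) are hypothetical illustration values, not those in Tables 1, 2, or 5, and the function names are assumptions of this example.

```python
def update(probability: float, likelihood_ratio: float) -> float:
    """One Bayesian update in odds form."""
    odds = probability / (1.0 - probability) * likelihood_ratio
    return odds / (1.0 + odds)


def sequential_posterior(base_rate: float, likelihood_ratios: list) -> float:
    """Apply likelihood ratios one after another.

    Valid only if the measures are not substantively correlated; otherwise the
    result overstates the evidence.
    """
    probability = base_rate
    for lr in likelihood_ratios:
        probability = update(probability, lr)
    return probability


# HYPOTHETICAL likelihood ratios, for illustration only.
LOW_SCORE_LR = 0.30
HIGH_SCORE_LR = 6.0

for base_rate in (0.10, 0.30, 0.50):
    low_low = sequential_posterior(base_rate, [LOW_SCORE_LR, LOW_SCORE_LR])
    high_high = sequential_posterior(base_rate, [HIGH_SCORE_LR, HIGH_SCORE_LR])
    print(f"BR={base_rate:.2f}: low-low PP={low_low:.3f}, high-high PP={high_high:.3f}")
```

Because the update multiplies odds, the order in which the two biomarkers are administered does not change the final probability.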
Overall, under most realistic base rate conditions, even when assuming only one postscreening diagnostic measure can be used, many evaluated cases are likely to have posterior probabilities that either rule out ASD or that increase the probability of ASD sufficiently that recommending intervention is warranted, substantially reducing the number of cases that require a specialty care evaluation and improving prioritization of the remaining cases.
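The iterative use of likelihood ratios described above follows from the odds form of Bayes' theorem: the base rate is converted to odds, multiplied by the LR for each observed result, and converted back to a probability. The Python sketch below illustrates that calculation under the same assumption of no substantive correlation between measures. The function name and the example inputs (a base rate of .50 and LRs of 2.0 and 0.5, chosen only because they sit at the boundaries of the clinically useful ranges) are illustrative assumptions and are not values taken from Tables 1, 2, or 5.

```python
def posterior_probability(base_rate, likelihood_ratios):
    """Return the post-test probability after applying each LR in turn.

    Assumes the measures are not substantively correlated, so their
    likelihood ratios can be multiplied (a simplifying assumption).
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    odds = base_rate / (1.0 - base_rate)  # pre-test odds
    for lr in likelihood_ratios:
        if lr <= 0:
            raise ValueError("likelihood ratios must be positive")
        odds *= lr  # each test result updates the running odds
    return odds / (1.0 + odds)  # convert odds back to a probability


# Hypothetical inputs for illustration only (not from the paper's tables).
base_rate = 0.50
one_high = posterior_probability(base_rate, [2.0])       # one LR of 2.0
two_low = posterior_probability(base_rate, [0.5, 0.5])   # two LRs of 0.5
print(f"one high score: {one_high:.2f}")  # 0.67
print(f"two low scores: {two_low:.2f}")   # 0.20
```

If the two measures were in fact correlated, multiplying their LRs in this way would overstate the evidence, which is why the posterior probabilities would be inflated.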

Cost savings from second order diagnostic aid
In prior research, the cost differential estimated over the lifetime for an ASD child relative to a neurotypical child ranges from $1.4 to $2.4 million per child (Buescher et al., 2014). Due to a range of improvements resulting from EIBI, the societal cost savings in (i) special education and (ii) medical and residential care associated with the recommended changes are estimated to average nearly $580,000 per ASD child (after accounting for a projected accuracy of at least 86% for each of the newly developed eye-tracking and molecular diagnostic tools; see Figure 2). With approximately 65,000 new children diagnosed with ASD each year in the United States, these cost savings total over $37 billion/year. Annual cost savings in education exceed $13.3 billion, with savings of approximately $1.2 billion, $6.0 billion, and $6.1 billion achieved at the federal, state, and local school district levels, respectively. Annual cost savings in medical and residential care exceed $23.8 billion, with savings of approximately $8.5 billion and $2.6 billion in Federal Medicaid and State Medicaid spending, respectively. Variations of key model parameters (high and low estimates) were used to estimate the sensitivity of cost savings to different inputs. Results of this sensitivity analysis indicate that even under the most conservative conditions, substantial cost savings are achieved in education (the most conservative model estimates over $5.6 billion saved). Not surprisingly, less conservative estimates yield even greater savings (the least conservative model estimates over $24.6 billion saved). This holds true for costs associated with Medicaid as well (the most conservative model estimates $3.0 billion saved; the least conservative model estimates $19.3 billion saved; see Table S1).
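As a rough consistency check on the headline figures, the rounded values reported above can be multiplied and summed directly. The short Python sketch below does only that; it is not the authors' cost model, and the variable names are illustrative.

```python
# Rounded figures as reported in the text (all in US dollars).
savings_per_child = 580_000      # "nearly $580,000" per ASD child
new_cases_per_year = 65_000      # "approximately 65,000" new diagnoses

education_by_payer = {"federal": 1.2e9, "state": 6.0e9, "local": 6.1e9}
education_total = sum(education_by_payer.values())
medical_residential_total = 23.8e9  # reported as a lower bound

# Route 1: per-child savings times annual new diagnoses.
total_from_per_child = savings_per_child * new_cases_per_year
# Route 2: sum of the two reported category totals.
total_from_categories = education_total + medical_residential_total

print(f"per-child x cases: ${total_from_per_child / 1e9:.1f} billion")   # 37.7
print(f"sum of categories: ${total_from_categories / 1e9:.1f} billion")  # 37.1
```

Both routes give a little over $37 billion per year, in line with the reported total; the small gap between them presumably reflects rounding in the published figures.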

DISCUSSION
The current ASD screening and diagnostic process is inefficient, causing large proportions of children with ASD to experience a delay in diagnosis and miss an opportunity to initiate treatment at an early age when it is most effective. There are significant wait times for diagnostic evaluations, contributing to a large gap between when children are first identified as "at risk" using screening methods and when interventions are initiated (the average age of diagnosis in the US is 4 years).
The present study demonstrates that using scalable diagnostic measures coupled with an evidence-based assessment framework could substantially improve ASD identification by: (i) ruling out, de-prioritizing, or refocusing the type of evaluation needed for likely non-ASD cases with low post-test probabilities, (ii) identifying cases at the highest need for specialty care evaluation (reducing wait lists for these clinics), (iii) immediately initiating low cost interventions for ASD cases with moderate to high probability, and (iv) immediately initiating EIBI for very high probability cases (Figure 1). While accuracy remains the most important feature to consider when assessing new diagnostic tests, several other strengths suggest that these measures will translate well into a clinical setting, including: rapid administration (<10 min), noninvasive testing methods, results containing quantitative information that is objective and not influenced by rater perceptions, inexpensive equipment, and applicability to a wide range of ages and symptom severities. Parents are often unsatisfied with the measures currently used to evaluate ASD and in many cases, parents do not accept the diagnosis, creating a barrier to access treatment (Crane et al., 2015; Makino et al., 2017).
FIGURE 2 Average cumulative cost savings per ASD child. The blue shaded area provides an estimate of the cumulative cost savings (or outlays) in real dollars (RD) for each ASD child who experiences early intervention as a consequence of early detection. The dotted lines indicate the same amount adjusted for different inflation estimates across time (±1 standard deviation). The solid line indicates the present value (PV) of real dollar savings (or outlays) experienced through the age indicated. Note that there is a net positive savings in both real dollar and present value terms beginning at age 11 years.
Secondarily, this study conducted a cost analysis to estimate savings associated with the use of evidence-based, postscreening diagnostic tools. Results indicated that substantial cost savings are achievable. Specifically, cost/benefit analysis yielded three major findings: (1) the present value of lifetime societal cost savings in special education, medical and residential care associated with implementing one or more scalable postscreening diagnostic measures is estimated to average nearly $580,000 per ASD child; (2) with approximately 65,000 new children diagnosed with ASD each year in the US, annual cost savings in education would exceed $13.3 billion and annual cost savings in medical and residential care would exceed $23.8 billion; and (3) a total of more than $37 billion/year would be saved when combining cost savings for education, medical and residential care. In the first several years following implementation, cost savings may significantly exceed these estimates due to existing pent-up demand for ASD diagnostic services. More importantly, these changes will positively impact the quality of life for ASD children and their families. To date, approximately one-fourth of children under age 8 with ASD go undiagnosed, most of whom belong to a minority population or live in census tracts with lower socioeconomic development, having reduced access to appropriate diagnostic services and therefore not receiving the support they need (Durkin et al., 2017; Wiggins et al., 2020); these children and families will likely benefit most from the recommended changes.
Here, we demonstrate the positive clinical and financial influences that implementing an ASD diagnostic measure can have, reinforcing the need to continue development of highly accurate, scalable biomarkers for ASD. Once promising measures are identified, funding mechanisms are needed to ensure that these tools can be clinically implemented and that widespread adoption can be achieved even in low resource settings. As these tools are validated and widely implemented, it will be key to incorporate them into the existing ASD identification processes and practice guidelines. The present results also emphasize the need to make early intervention programs, including less intensive and cost-effective parent-mediated interventions, as well as more intensive and costly EIBI and developmental and behavioral intervention packages, more widely available. Rapid progress in scalable biomarker identification and validation, including one measure that is already commercially available and several that are likely to become available in the coming years, means that many more children with ASD will receive an early accurate diagnosis. This deluge of early diagnoses will only be useful if individuals can receive appropriately calibrated early interventions.
While Medicaid and private medical insurance may pay for some portion of EIBI (this varies by insurer and by state), access to EIBI remains dependent on ASD symptoms being identified early, and financial access to these services is often delayed pending a clinical ASD diagnosis. Incorporating a scalable diagnostic measure into the ASD diagnostic process to allow more children with ASD to be identified in early childhood when treatment is most efficacious is a goal consistent with that of Early and Periodic Screening, Diagnostic, and Treatment (EPSDT), a child health service of Medicaid for beneficiaries under age 21 that requires coverage for all health care services (e.g., preventative and treatment services) that are found to be medically necessary to "discover and treat childhood health conditions before they become serious or disabling" (California's Healthcare Foundation, 2015; Center for Medicaid & Medicare Services, 2014; Medicaid.gov., 2019). In the fiscal year 2014, over 40 million children were eligible for EPSDT, suggesting a strong need for validated scalable diagnostic measures to aid identification of a significant number of children in the US with ASD.

Limitations and future directions
Demonstration of an evidence-based assessment approach was intentionally presented in a measure-agnostic fashion, so that any well-validated, scalable diagnostic measures for ASD could be applied using this framework. We chose two measures that were developed and validated in large samples, but also encourage all emerging ASD diagnostics, including the two used as examples, to continue to build predictive validity evidence, particularly across settings and sub-populations. It is crucial that ASD diagnostic biomarkers be validated in the most stringent fashion by comparison to both healthy controls and to non-ASD developmental disability or developmental neuropsychiatric controls, who often mimic ASD presentations and frequently screen positive during the initial screening process. Furthermore, validation should examine low resource and racial/ethnic subpopulations to ensure that existing validity evidence appropriately generalizes. Finally, we would not suggest using even well-validated ASD biomarkers without first implementing the recommended first-level questionnaire screening tools, as these tools provide an inexpensive and rapid method for altering the post-test probability prior to a more expensive or time-consuming biomarker data collection. Instead, it is optimal if biomarkers are used only for screen-positives and within the context of evidence-based evaluation to leverage existing screeners.
The cost-benefit analysis also makes several assumptions, many of them consistent with prior cost savings analyses (Ganz, 2007; Leigh & Du, 2015). It is important to note that even if fewer ASD cases are accurately identified than assumed, cost savings remain substantial. Furthermore, modest variations in assumptions were found not to impact the overall message that early accurate identification followed by effective early intervention will generate massive savings.
In conclusion, application of scalable ASD diagnostic biomarkers using an evidence-based assessment framework is likely to substantially enhance early ASD identification and provide support for more nuanced clinical recommendations. With improved early identification, a greater proportion of ASD-affected children would receive appropriate early intervention, and substantial lifetime cost savings could be realized.

ACKNOWLEDGMENTS
The authors thank Avalere Health for their comprehensive and independent examination of the underlying assumptions associated with the EBA analysis and estimated US cost-savings analysis. The authors also thank Autism Speaks for their continuing support and leadership in autism research, and the American Academy of Pediatrics for their support of the initial research related to the molecular diagnostic used in this demonstration. The molecular work is currently supported by an STTR grant from the National Institute of Mental Health (awarded to Frank A. Middleton, Steven D. Hicks, and Quadrant Biosciences).