Using machine learning to advance disparities research: Subgroup analyses of access to opioid treatment

Abstract Objective To operationalize an intersectionality framework using a novel statistical approach and with these efforts, improve the estimation of disparities in access (i.e., wait time to treatment entry) to opioid use disorder (OUD) treatment beyond race. Data source Sample of 941,286 treatment episodes collected in 2015–2017 in the United States from the Treatment Episodes Data Survey (TEDS‐A) and a subset from California (n = 188,637) and Maryland (n = 184,276), states with the largest sample of episodes. Study design This retrospective subgroup analysis used a two‐step approach called virtual twins. In Step 1, we trained a classification model that gives the probability of waiting (1 day or more). In Step 2, we identified subgroups with a higher probability of differences due to race. We tested three classification models for Step 1 and identified the model with the best estimation. Data collection Client data were collected by states during personal interviews at admission and discharge. Principal findings Random forest was the most accurate model for the first step of subgroup analysis. We found large variation across states in racial disparities. Stratified analysis of two states with the largest samples showed critical factors that augmented disparities beyond race. In California, factors such as service setting, referral source, and homelessness defined the subgroup most vulnerable to racial disparities. In Maryland, service setting, prior episodes, receipt of medication‐assisted opioid treatment, and primary drug use frequency augmented disparities beyond race. The identified subgroups had significantly larger racial disparities. Conclusions The methodology used in this study enabled a nuanced understanding of the complexities in disparities research. We found state and service factors that intersected with race and augmented disparities in wait time. Findings can help decision makers target modifiable factors that make subgroups vulnerable to waiting longer to enter treatment.

K E Y W O R D S racial disparities, regression tree, subgroup analysis, virtual twins, wait time for opioid treatment What is known on this topic • Existing research suggests racial disparities in access to opioid treatment.
• Knowledge on factors that contribute to disparities beyond race is limited.
• Machine learning approaches may improve estimation of disparities.

What this study adds
• Specific machine learning approaches can improve estimation of disparities and identify intersecting conditions that increase the probability of waiting longer for treatment.
• Largest variability in disparities exists among states, highlighting the importance of tailoring interventions based on each state's context.
• Service setting, referral source, medication-assisted treatment, and living arrangements can define the subgroup most vulnerable to racial disparities in access to treatment.

| INTRODUCTION
Opioid use disorder (OUD) and overdose have become a national crisis. 1,2 More than 115 people in the United States die every day due to opioid overdose. 1 The total cost of the opioid epidemic from 2015 to 2018 is more than $2.5 trillion, according to the White House Council of Economic Advisers. Improving access to treatment and recovery services is considered one of the major priorities by the US Department of Health and Human Services. Improving access to OUD treatment, particularly for underserved groups, can reduce overdoses and opioid-related deaths. 3 African Americans face more barriers to accessing OUD treatment than non-Hispanic Whites, hereafter referred to as Whites, [4][5][6][7][8][9] and wait longer to enter substance use disorder treatment. 10 Previous research has established racial and ethnic disparities in access to treatment. However, most of these studies focused on differences based on race and ethnicity, mainly between African Americans and Whites. 11 Limited disparities research has used large databases and advanced analytic models to identify subgroup characteristics beyond assigned categories based on self-reported race and ethnicity. It is likely that these other attributes may explain the variation in clients' access to care in a general racial and ethnic category.
Disentangling the heterogeneity in racial and ethnic minority groups can help health services researchers learn about additional sources of inequality and how to measure these sources or drivers, as noted in the extant literature, to inform and target effective interventions for improving treatment access. 12,13 We excluded Hispanics and Latinos, another minority group affected by access disparities, because our goal in this study was to explain the heterogeneity of two distinct groups, that is, African Americans and Whites. Hence, the scientific premise of this study is based on the need for analytic methods to identify the characteristics that make individuals more vulnerable to disparities in access to OUD treatment beyond race.
Subgroup analysis is a class of approaches in machine learning widely used in personalized medicine. [14][15][16] Machine learning can be considered a pure data-driven process of detecting patterns or making predictions without explicit instructions. It can be widely used in many fields as long as a supervised model, prediction and classification, or an unsupervised model, such as cluster analysis and a recommendation system, is needed. Subgroup analysis is used to evaluate heterogeneous treatment effects across subgroups in response to an intervention, such as the most effective pharmacological agent or behavioral treatment for an individual. For instance, using common comparative methods, when the heterogeneity among individuals is large enough, an effective pharmacological agent or behavioral intervention may have adverse effects for some individuals. From this point of view, subgroup analysis can personalize interventions to help eliminate the risk of using an inappropriate treatment. Applied to health care disparities, subgroup analysis can identify what characteristics beyond race make individuals vulnerable to waiting longer to receive care. Traditional methods such as incorporating effect modifiers can also help identify altered treatment effects, but regression models are restricted to a limited number of interactions. In addition, common effect modifications usually define subgroups in a linear way, which is not as flexible as the machine learning approach used in this study.

| Conceptual framework
We applied an intersectionality conceptual framework to understand the different sources of vulnerability that may lead to disparities in access to OUD treatment. Intersectionality theory argues for the nonadditivity of effects of categories, such as sex or gender and race and ethnicity. It posits that other domains, called categories here, play a role in access to care. Different intersections of identity, social position, processes of oppression or privilege, and policies or institutional practices may explain the heterogeneity of effects and causal processes contributing to health disparities. Using this framework, we examined characteristics beyond race (e.g., gender, socioeconomic status, drug of choice, treatment setting) that may increase the risk of a group to face disparities in access to OUD treatment.

| Analytic framework
We drew from a subgroup analysis approach called virtual twins 12 used in personalized medicine to search for specific groups most vulnerable to disparities. Several types of subgroup analyses exist.
For instance, Kehl and Ulm 17 introduced a bump-hunting-based method, with several versions, that identifies positive and negative responders to a new treatment. This method incorporates interactions between treatment and covariates, but it can be difficult to assess high-order interactions. Tree-based search methods are a popular option for subgroup analysis due to their flexibility in accounting for complex interactions. [18][19][20] Among those, the virtual twins method, a two-step algorithm, appears to be one of the most intuitive and efficient approaches in identifying a subgroup. 14 In the first step, it predicts the response difference for each observation, assuming in the treatment and control group, respectively. In this article, treatment and control groups refer to episodes from individuals identified as African Americans and Whites, respectively. In the second step, a regression tree is developed to define subgroups associated with the higher response difference. We adopted this method for the subgroup analysis to operationalize the intersectional framework and propose an approach that is parsimonious and easily interpretable.

| Analytic sample
The final analytic sample only included episodes that did not have missing values (104,741 episodes, or 10.0% of data, were removed due to missing values). We did not consider variables that were not analyzable (e.g., case ID, core-based statistical area) or had overlapping information with other variables (e.g., "heroin reported at admission" was removed because "heroin as primary substance abuse problem" had already been included). We also combined categories for categorical variables with too many levels. The final analytic dataset featured 941,286 episodes and 30 variables for the national data; subsamples featured 188,637 episodes for California and 184,276 episodes for Maryland. Deletion of 10% of the data did not impact the robustness of our estimates based on a comparative analysis of key variables between deleted and retained records. Implications are described in the Section 4.1.

| Dependent variable
In the original uncleaned TEDS-A dataset, the variable for number of days waiting to enter the treatment had five categories: 0, 1-7, 8-14, 15-30, 31 or more, and unknown. The episodes in the unknown category were deleted in the analytic sample. We only considered episodes that listed the number of wait days. We dichotomized the outcome variable and treated wait days equal to 0 as class "0" and all others as class "1." 21

| Explanatory variables
Because more than 95% of the data involved episodes for individuals who identified as African American or White, we focused on analyses of racial disparities between those two groups. Other independent variables can be summarized in the following categories. Sociodemographic variables included age, gender, marital status, education level, employment status, primary source of income, pregnancy, and homeless status. Drug use variables were the primary drug used (heroin or other opioids), route of using the primary drug, frequency of using the primary drug at admission, and age at first use of the primary drug. Variables related to treatment information at admission included principal source of referral, number of prior treatment episodes, receipt of medication-assisted opioid therapy, the type of service and treatment setting in which the current episode occurred at the time of admission, or transfer.
Veteran status, number of arrests in the 30 days before admission, the presence of a psychiatric problem at admission, health insurance, and the primary source of payment for this treatment episode were also included, which related to access to care and have been tested in other studies. 22

| Data analysis
We first examined descriptive data of independent variables across the group of episodes that involved waiting and the group of episodes with no wait to enter care (see Table 1). This analysis can be considered an investigation of marginal associations of each independent variable with the outcome variable: any wait to enter treatment.
Results are presented in Table 1

| Analytic approach
We relied on a common statistical method for subgroup analysis called virtual twins. 12 The formulation of the method is as follows. We applied the two-step virtual twins approach in Foster et al. 14 to wait time to enter OUD treatment in this study. In Step 1, a classification model was built to predict the probability of waiting 1 day or more instead of no wait, given the race and other independent variables, denoted as outcome variable. To ease interpretation, the regression tree method was used for Step 2.
As previously described, it is critical to ensure that the model in Step 1 is equipped with good prediction performance because the outcome variable in Step 2 is constructed based on the predicted probability from the model in Step 1. Therefore, we examined and compared multiple variable selection methods.

| Selecting the best prediction model
First, we evaluated a few commonly used prediction methods and picked the best one for Step 1: random forest, 25 elastic net, 26 and boosting tree. 27 In addition to the three classic approaches, some new approaches can also be used in Step 1; for example, two-scale distributional nearest neighbors, which can be the subject of future work. 28 The three classic methods were implemented using R packages random forest, glmnet, and xgboost, respectively. Data were randomly split into training (188,257 episodes, or 20% of the data) and test data using 100 iterations to eliminate the randomness from data splitting. A comparison of prediction performance on the test data in these 100 replications is presented in Table S1. We evaluated the misclassification percentage overall and for treatment group and control group, sepa-

| Implementing the two-step approach
Observing such results in the previous section, random forest was used as the prediction method for Step 1. Data were randomly split into two parts: 20% in the first part and the remaining 80% in the second part. The first part was used to train the classification model; then we applied this model to the second part to calculate Z i , the outcome variable to be used in Step 2 of the virtual twins method. The dataset was split to avoid artificially creating the outcome variable and fitting the regression tree with this outcome variable in the same dataset. Also, the dataset was split once instead of 100 times because it was not feasible to interpret 100 trees obtained from 100 datasets in the second part. Overall, this process allowed us to no longer need to evaluate and select the best classification model for Step 1, in which randomness can affect the selection process. For the next step, please refer to the following explanation of the probability difference.
Denote by P rf Á ð Þ the function to obtain the probability of waiting 1 day or more based on this model. For each observation in the full dataset, irrespective of race, we calculated the following virtual probability difference: wherein D i ¼ 1 denotes African American and D i ¼ 0 denotes White.
This quantity measures the enhanced probability of waiting only because a client is African American instead of White. In Step 2, with Z i as the dependent variable and all other variables except for race used as independent variables, we built a regression tree that defined subgroups associated with the high enhanced probability of, or more specifically, greater vulnerability to racial disparities in access treatment. Note that although all independent variables other than race were considered in the analysis, the regression tree may not necessarily use all of them to construct the tree model. The flowchart in Figure 1 shows the main steps of the approach. We applied the twostep approach on the national data first and then on the two states with the largest number of episodes.

| RESULTS
The regression tree based on national data is presented in Figure 2, highlighting the main findings in the footnotes. The branch to the left of a splitting node indicates that the condition in the node is satisfied or met, whereas the condition in the right branch is not satisfied or met. The same rule applies to the other trees in this study. The first splitting node in Figure 2  The regression tree based on episodes of Maryland is provided in Figure 4. The subgroup of episodes most vulnerable to racial disparities favoring Whites had an increased probability of 3.4% in waiting 1 day or more (the right-most branch in a green oval at the bottom in Figure 4). This subgroup includes episodes satisfying the following four conditions: (a) services setting of detox 24-h free-standing residential, short-term rehab or residential, or ambulatory (nonintensive outpatient); (b) more than one prior treatment episode, (c) receipt of medication-assisted opioid treatment, and (d) some and daily use of the primary drug. We emphasize that there were no episodes in Maryland in services setting of rehabilitation or residential or hospital.
As shown in Figures 3 and 4, in some subgroups, African Americans had decreased probabilities of waiting 1 day or more (those lightcolored leaves on the left of the two trees). For example, the leaf at the bottom left of Figure 3 shows a decreased probability of 0.067, with 5% of episodes falling into this category. This shows that African American episodes in this category entered OUD treatment more rapidly.
F I G U R E 2 Regression tree for the national data. Left branch indicates that the condition in the splitting node is met or satisfied. The decimal number in the node shows the increased or decreased probability of waiting 1 day or more due to race. The percent value shows the percentage of episodes falling into that node. Nodes with high positive decimal numbers include episodes more subject to racial disparities [Color figure can be viewed at wileyonlinelibrary.com] F I G U R E 3 Regression tree for the subgroup analysis of California episodes. Left branch indicates that the condition in the splitting node is met or satisfied. The decimal number shows the increased or decreased probability of waiting 1 day or more due to race for that subgroup. The percent value shows the percentage of episodes falling into that node. The most vulnerable subgroup is enclosed in the red circle [Color figure can be viewed at wileyonlinelibrary.com] F I G U R E 4 Regression tree for the subgroup analysis of Maryland episodes. Left branch indicates that the condition in the splitting node is met or satisfied. The decimal number shows the increased/decreased probability of waiting 1 day or more due to race for that subgroup. The percent shows the percentage of episodes falling into that node. The most vulnerable subgroup is enclosed in the green circle [Color figure can be viewed at wileyonlinelibrary.com]

| Characteristics of subgroups
After obtaining and interpreting the regression trees, we took a closer look at the identified subgroups most vulnerable to waiting longer to enter treatment. For the state of California, 3% of the episodes used in the second step of the virtual twins method fell into the most vulnerable subgroup, as indicated by the red oval in Figure 3.   29 Although this definition is fairly comprehensive, attributing responsibility for disparities to systems and client-level factors, it does not address the intersection of these factors. The regression trees presented here allowed us to examine the interaction among some variables representing system and client factors, including health care system (e.g., referral sources and service settings) and client characteristics (e.g., homelessness, employment, method of using primary drug).

| Limitations
The subgroup analysis was one of the first attempts to identify key characteristics associated with disparities in OUD treatment. Although racial disparities have been detected by other studies, they were limited to a racial or ethnic category-hence, ignoring other intersecting characteristics that may predispose individuals to wait longer to initiate OUD treatment. The strength of this study is our rigorous machine learning approach and intersectionality framework to identify conditions that intersect with race beyond the use of traditional regression analysis. However, the proposed approach has several limitations. The predictive performance of our model can reach an accuracy of more than 80%. This is acceptable as a pure predictive model, but we recognize that the dataset did not include several variables relevant to the outcome. Such an accuracy level may not be optimal to produce the outcome variable in the second step of subgroup analysis, but it is still adequate. 30 Another limitation is that our analysis focused on treatment episodes and not individuals with multiple episodes. The average number of episodes per client is unknown from this TEDS-A dataset, but we do not expect this number to be large based on other studies using TEDS. 31,32 Another limitation is the deletion of 10% of data with missing information. Although our comparative analysis of retained and deleted records due to missing data showed that the deleted sample had almost identical mean age levels, proportion of episodes involving women, racial distribution, wait days, employment status, and service settings, it is important to mention here. Finally, due to the nature of data collected by TEDS, the analysis is possibly subject to the issue of selection bias in that individuals with longer waiting periods may never present for care.
However, we do not expect the number of such individuals to be high; as indicated in the codebook of TEDS-A 2015-2017, the percentage of episodes involving waits of 31 or more days was only 1.5%. Despite these limitations, we believe that the application of the machine learning method used in this study may advance the conceptualization and operationalization of disparities in access to care. As such, findings can clearly inform health policy interventions.

| Future research
The growing literature suggests racial disparities in access to OUD treatment, but only general differences in wait time to enter treatment have been identified. It is unclear whether the racial difference is heterogeneous across various intersecting conditions. The general difference may be explained by characteristics of subgroups. Traditionally, it is common to use regular regression to detect the average racial difference by considering race as a predictor, and some control variables may be included to eliminate confounding. Thus, insufficient characterization of heterogeneous racial differences can be one big issue. Moreover, if only a small subgroup of episodes is subject to racial differences while most others are not, such regression methods can possibly have insufficient power to detect the difference because the average effects can be small. The subgroup analysis conducted in this study provides a more precise way to differentiate heterogeneous racial differences and does not rely on the significance of the average racial difference. This approach is more sensitive and accurate than common regression methods and responds to calls for the application of artificial intelligence (e.g., machine learning) to intersectionality problems. 33 Future research should build from our approach to identify conditions and factors that contribute to access to care beyond African American race. Exploring further these factors in this and other underserved groups like Hispanics and Latinos(x) is the next question in disparities research. As such, policy makers, health care administrators, and providers can make better decisions to allocate resources to reduce or even eliminate such racial disparities.