Transparent machine learning suggests a key driver in the decision to start insulin therapy in individuals with type 2 diabetes

Abstract Aims The objective of this study is to establish a predictive model using transparent machine learning (ML) to identify any drivers that characterize therapeutic inertia. Methods Data in the form of both descriptive and dynamic variables collected from electronic records of 1.5 million patients seen at clinics within the Italian Association of Medical Diabetologists between 2005 and 2019 were analyzed using logic learning machine (LLM), a “clear box” ML technique. Data were subjected to a first stage of modeling to allow ML to automatically select the most relevant factors related to inertia, and then four further modeling steps identified key variables that discriminated the presence or absence of inertia. Results The LLM model revealed a key role for average glycated hemoglobin (HbA1c) threshold values correlated with the presence or absence of insulin therapeutic inertia with an accuracy of 0.79. The model indicated that a patient's dynamic rather than static glycemic profile has a greater effect on therapeutic inertia. Specifically, the difference in HbA1c between two consecutive visits, what we call the HbA1c gap, plays a crucial role. Namely, insulin therapeutic inertia is correlated with an HbA1c gap of <6.6 mmol/mol (0.6%), but not with an HbA1c gap of >11 mmol/mol (1.0%). Conclusions The results reveal, for the first time, the interrelationship between a patient's glycemic trend defined by sequential HbA1c measurements and timely or delayed initiation of insulin therapy. The results further demonstrate that LLM can provide insight in support of evidence‐based medicine using real world data.

When the HbA1c gap is <6.6 mmol/mol (0.6%), a timely initiation of insulin therapy is less probable. Furthermore, for individuals initiated on insulin in a timely manner, the HbA1c gap is systematically higher than for those patients who have experienced clinical inertia. This key driver correlated with insulin therapy initiation could help combat clinical inertia.

1 | INTRODUCTION
In spite of abundant evidence demonstrating that in type 2 diabetes early glycemic control is correlated with a reduction in long-term complications, 1 data from many health systems indicate that delay in initiation and/or intensification of insulin therapy remains systemic. 2,3 The end result is that glycemic control in type 2 diabetes is globally inadequate, and individuals live years with poor glycemic control. 4,5 Several studies have provided explanations for the motivation behind the delay in glycemic control, [6][7][8] but none, to our knowledge, have examined real world data with artificial intelligence (AI) techniques to identify which factors are more likely to be associated with a provider's behavior (presence or absence of inertia) with regard to initiation of insulin therapy.
In 2005, the Italian Association of Medical Diabetologists (AMD) initiated the Annals project that led to the creation of a network of diabetes clinics, representing half of the total number of diabetes clinics in Italy, with the aim of monitoring, standardizing, and sharing the main parameters used for the evaluation of the quality of care given to patients. In this way, the AMD Annals were able to collect from patient electronic records up to 180 parameters, including clinical, pharmacological, organization related, and provider related, from each patient [9][10][11] to create "big data" sets (approximately 1.5 million patients and about 9 million visits), which permitted a thorough analysis using AI techniques.
Previously published studies have utilized AI methods to investigate other aspects of diabetes and have yielded promising results. 12 The present study applied AI techniques, specifically a transparent machine learning (ML) methodology, to overcome the problems associated with "black box" AI algorithms, which can deliver performant models but do not furnish explanations as to how the results were obtained. 13,14 The transparent ML technique used in this study is based on a proprietary algorithm of "explainable artificial intelligence" known as logic learning machine (LLM), which yields performance that is on par with the best ML algorithms while at the same time allowing full control over the algorithmic logic and permitting the correlation between predictive factors and outcome. 14 LLM has already been used to great effect in the analysis of biomedical datasets included in the Statlog benchmark. 15,16 For the purposes of this study, LLM was used to generate predictive and explanatory models that identify the combination of factors (clinical, personal, organizational) correlated with provider inertia in situations which would otherwise require the initiation of insulin therapy. This approach has previously been used to good effect to analyze other types of clinical data. 17,18

2 | MATERIAL AND METHODS

2.1 | Participants
Data from people with type 2 diabetes were obtained from electronic medical records located on the AMD Annals database. The medical records are from patients who, prior to being referred to one of the clinics within the AMD, showed a high risk for the development of type 2 diabetes or had lab values that were indisputably consistent with type 2 diabetes. No patient referred to one of the diabetes clinics was initiated on insulin prior to their first visit at the clinic. Every patient had visited at least one of the Italian diabetes clinics between 2005 and the first half of 2019. 5 A total of 2.3 billion data points corresponding to information on 1 186 247 people with a confirmed diagnosis of type 2 diabetes (as indicated in the diagnosis field of the electronic medical record found on the database) and 9 954 976 visits were selected. These individuals were followed over time, and a total of 91 variables, including glycated hemoglobin (HbA1c), were checked periodically (on average every 0.6 years). The data preparation can be summarized as follows (for more in-depth information, please see 17):
• Time interval between two HbA1c measurements ≥2 months.
• For each HbA1c measurement, "clinical factors" (eg, blood pressure, lipid panel, albuminuria, etc.) were tracked over time with an interval of maximum 4 months before and after the date of each measurement.
• For each HbA1c measurement, irreversible comorbidities (eg, acute myocardial infarction, stroke, etc.) were tracked starting from the date of first detection.
Table 1 provides the inclusion criteria for this study. Following the application of the inclusion criteria, measurements from 129 373 individuals were included.
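The two time-window rules in the data preparation above can be illustrated with a short sketch. It is not the authors' pipeline: the function names, the approximation of "2 months" as 60 days and "4 months" as 120 days, and the choice to keep the nearest reading inside the window are all assumptions made for illustration.

```python
from datetime import date

MIN_GAP_DAYS = 60    # assumption: "2 months" approximated as 60 days
WINDOW_DAYS = 120    # assumption: "4 months" approximated as 120 days


def thin_hba1c(measurements):
    """Keep HbA1c measurements taken at least MIN_GAP_DAYS after the last kept one.

    `measurements` is a list of (date, value) tuples for a single patient.
    """
    kept = []
    for when, value in sorted(measurements):
        if not kept or (when - kept[-1][0]).days >= MIN_GAP_DAYS:
            kept.append((when, value))
    return kept


def nearest_in_window(anchor, readings):
    """Return the reading closest in time to `anchor` within +/- WINDOW_DAYS, else None."""
    candidates = [
        (abs((when - anchor).days), value)
        for when, value in readings
        if abs((when - anchor).days) <= WINDOW_DAYS
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])[1]


# Hypothetical patient: HbA1c in mmol/mol, systolic blood pressure in mmHg.
hba1c = [(date(2015, 1, 10), 60), (date(2015, 2, 1), 62), (date(2015, 6, 15), 66)]
systolic = [(date(2014, 8, 1), 150), (date(2015, 3, 1), 138)]

print(thin_hba1c(hba1c))
print(nearest_in_window(date(2015, 1, 10), systolic))
```

In this hypothetical series the February HbA1c value falls only 22 days after the January one and is dropped, and only the March blood pressure reading lies within the window around the January HbA1c measurement.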
Data related to drug therapies and comorbidities were grouped as described in our previous study. 17 Prescribed medications were grouped into eight main diabetes therapies to simplify the number of drug combinations, yielding 18 combinations. To ensure a robust estimate of comorbidities, we grouped information from across different fields in the electronic medical record. Figure 1 provides a flow chart with participant characteristics. Table 2 provides the means, medians, SDs, and interquartile ranges for the characteristics of the cohorts.

2.2 | LLM characteristics and ML modeling
ML has the ability to both analyze data without making any a priori assumptions and predict new output values from the data. The ML technique, "rule generation methods," builds models described by a set of intelligible rules that permit the extraction of knowledge about variables in the analysis as well as their relationships with a target attribute. Two different paradigms for rule generation have been proposed in the literature. Decision trees 19 adopt a divide-and-conquer approach for generating the final model. Methods based on Boolean function reconstruction follow an aggregative procedure for building the set of rules. 20,21 LLM is a proprietary algorithm that implements the switching neural network model, 22 which allows for solving classification problems and produces sets of intelligible rules expressed in the form: "if premise …, then consequence …," where "premise" includes one or more conditions on the input variables, and "consequence" contains the output value or information about the target function in terms of "yes or no." Thus, the LLM rule generation technique produces a subset of relevant variables associated with a specific outcome and informs on explicit intelligible conditions related to a particular outcome as well as relevant thresholds for each input variable. Furthermore, the "clear box" approach used by LLM yields "explainable AI," which provides comprehensible and trustworthy results and output created by the ML algorithms. 23 In the present study, data were subjected to a first stage of modeling to allow ML to select automatically the most relevant factors related to inertia. The model incorporated both descriptive variables (clinical and demographic) and dynamic variables (HbA1c gap and drop speed, mean, SD, and trend for several clinical measurements) collected from each individual (Table S1).
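As an illustration of what "dynamic variables" can look like, the sketch below derives an HbA1c gap, mean, SD, and trend from one patient's series of visits. The gap follows the definition used in this report (difference between two consecutive visits); treating the trend as a least-squares slope per year is an assumption, and the function name and dictionary keys are hypothetical.

```python
from statistics import mean, stdev


def dynamic_features(times_years, hba1c):
    """Derive dynamic variables from one patient's HbA1c series.

    times_years: visit times in years, strictly ascending.
    hba1c: HbA1c values in mmol/mol, aligned with times_years.
    """
    if len(hba1c) < 2 or len(times_years) != len(hba1c):
        raise ValueError("need at least two aligned visits")
    # HbA1c gap: difference between the two most recent consecutive visits.
    gap = hba1c[-1] - hba1c[-2]
    t_bar, y_bar = mean(times_years), mean(hba1c)
    sxx = sum((t - t_bar) ** 2 for t in times_years)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times_years, hba1c))
    return {
        "gap": gap,
        "mean": y_bar,
        "sd": stdev(hba1c),
        "trend": sxy / sxx,  # assumption: trend = least-squares slope (mmol/mol per year)
    }


# Hypothetical patient with four visits, six months apart.
features = dynamic_features([0.0, 0.5, 1.0, 1.5], [58, 60, 61, 73])
print(features)
```

A static profile would record only the last value (73 mmol/mol); the dynamic features additionally capture that the value jumped by 12 mmol/mol since the previous visit.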
After the preliminary phase, four further modeling steps were completed to identify the key variables that discriminate the presence or absence of inertia. The role and relevance of the different variables that influenced YES/NO inertia were assessed through several modeling steps (learning set = 70% and test set = 30%) as outlined in Table 3.
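A 70%/30% partition of records into a learning set and a test set can be sketched as follows. Whether the split in this study was purely random or stratified is not stated, so a seeded random shuffle is assumed here; the function name is hypothetical.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle records reproducibly and return (learning_set, test_set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


learning, test = split_70_30(range(100))
print(len(learning), len(test))
```

Fixing the seed keeps the same partition across the modeling steps, so differences in accuracy between steps reflect the input variables rather than a different split.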
LLM affords the advantages of ML, which permits the analysis of a very large number of variables, along with transparent access to the ranking of the most relevant variables, which can help guide the analysis.
As such, Step 1 began by incorporating all of the descriptive variables (Table S1) into the model, which resulted in accuracy = 0.70 and area under the curve (AUC) = 0.76 (also reported in Table S1). This was followed by Step 2 to verify whether the addition of dynamically derived variables could improve the performance of the model. The hypothesis was that a medical practitioner's inertia could be influenced by factors related to the patient's progress across time rather than only by static parameters related to a single visit. This hypothesis was confirmed by the results from Step 2, in which the performance of the model improved significantly (accuracy = 0.79 and AUC = 0.87). Furthermore, the relevant variables revealed by the transparent ML highlighted the important role of those variables that are related to the patient's progress. For example, the HbA1c gap achieved the second position in the ranking, and the average HbA1c across 4 years was in third position (Table 3). Given that the first three positions in the ranking of the variables in Step 2 were all related to HbA1c, we carried out a third step to verify whether the dominant role of glycemia and HbA1c was real. Therefore, for Step 3, input for ML consisted only of variables that were related to glycemic factors (static and dynamic values for fasting HbA1c and glycemia).
Step 3 results had accuracy = 0.78 and AUC = 0.84, which confirm the dominant role of the glycemic factors as determinants of the medical practitioner's decision to initiate insulin therapy in a patient.
As a counterproof to Step 3, we carried out a final step in modeling that included the use of all variables, both dynamic and descriptive, EXCEPT for those related to fasting HbA1c and glycemia, as input for ML. Step 4 of modeling resulted in accuracy = 0.64 and AUC = 0.67. These results were modest compared to the previous three modeling steps and serve to demonstrate the small, but not absent, influence that other variables have on the medical practitioner's decision to initiate insulin therapy.

3 | RESULTS
The total data pool was comprised of 129 373 individuals, 32 752 of whom were started on insulin therapy in accordance with the 2020 guidelines of the American Diabetes Association, 43 375 whose insulin therapy initiation was delayed, and 53 246 who never received insulin as reported in Table 1.
The results indicate that the best model was derived from the second modeling iteration. This best performing model, which includes all the dynamic and descriptive variables outlined in Table S1, underscores the relevance of the variables selected for the analysis and prediction of the phenomenon studied. The area under the receiver operating characteristic curve (ROC) of the model is  glucose, as well as yearly HbA1c reduction speed are also relevant, but to a lesser degree than the HbA1c gap. The model indicates that the glycemic profile, which for the purposes of this study refers specifically to the change in either HbA1c or glycemia seen in a patient from one visit to the next, and a patient's glycemic trend, that is, the direction that change takes, have a greater effect on a provider's therapeutic inertia than any one individual datapoint in the patient's static profile. ML indicates that for threshold HbA1c values, on average, an HbA1c gap of <6.6 mmol/mol (<0.6%) is correlated with inertia. On the other hand, an HbA1c gap of >11 mmol/mol (>1.0%) is correlated with non-inertia. Thus, the data suggest that the HbA1c gap between two consecutive visits appears to play a crucial role in the decision to start insulin therapy in a person with type 2 diabetes. Moreover, not only are the current HbA1c and the change in HbA1c between two consecutive measurements the key drivers with the strongest influence on the presence or absence of inertia, but the average HbA1c across 4 years is also relevant in the decision-making process. Furthermore, in terms of relevance, following the HbA1c level at the time of the most recent visit, ML suggests that below a value of 72 mmol/mol (8.7%), clinical inertia is most probable, whereas values above 73 mmol/mol (8.8%) lead to a greater probability that insulin will be initiated.
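The paired units quoted for these thresholds can be checked with the standard IFCC-to-NGSP master equation, NGSP (%) = 0.09148 × IFCC (mmol/mol) + 2.152. That this is the conversion the authors used is an assumption. The sketch below also shows why a gap converts differently from an absolute value: the intercept cancels in a difference, so only the slope applies. Function names are hypothetical.

```python
SLOPE = 0.09148    # NGSP % per IFCC mmol/mol (master equation slope)
INTERCEPT = 2.152  # NGSP % offset (master equation intercept)


def ifcc_to_ngsp(mmol_per_mol):
    """Convert an absolute HbA1c value from mmol/mol (IFCC) to % (NGSP)."""
    return SLOPE * mmol_per_mol + INTERCEPT


def gap_ifcc_to_ngsp(gap_mmol_per_mol):
    """Convert an HbA1c difference; the intercept cancels, so only the slope applies."""
    return SLOPE * gap_mmol_per_mol


for value in (72, 73):
    print(value, "mmol/mol ->", round(ifcc_to_ngsp(value), 1), "%")
for gap in (6.6, 11):
    print("gap", gap, "mmol/mol ->", round(gap_ifcc_to_ngsp(gap), 1), "%")
```

Under this assumption, 72 and 73 mmol/mol round to 8.7% and 8.8%, and gaps of 6.6 and 11 mmol/mol round to 0.6% and 1.0%, matching the values quoted in the text.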
There are other parameters not directly linked to HbA1c but which in all iterations of modeling also suggest some relevance. For example, estimated glomerular filtration rate (eGFR) (mean and trend) stood out in all three iterations of modeling as the most important comorbidity and, when the eGFR mean is <59.99, it is correlated with non-inertia. A body mass index >24, either as a static variable or as the mean over 4 years, is correlated with inertia. Stable triglycerides and the absence of complications (in particular cardiac complications, hepatopathy, and hyperuricemia) are both correlated with a greater probability of inertia.
The model was also able to confirm data previously reported in the literature. Namely, HbA1c values of less than 68 mmol/mol (8.4%) are associated with insulin therapy inertia whereas values above 75 mmol/mol (9.0%) are associated with an increased probability of insulin therapy initiation. The presence of a sudden increase in HbA1c as a driver of non-inertia led to a more in-depth analysis to verify the accuracy of the results provided by LLM.
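The "if premise, then consequence" form of these findings can be written down as a small rule table. This is only an illustration of the rule format, using the single-variable thresholds reported in the text; it is not the rule set produced by LLM, whose rules may combine several conditions. The field names `gap` and `hba1c` (both in mmol/mol) are hypothetical.

```python
# Each rule: (premise as text, predicate on a patient record, consequence).
# Thresholds are the mmol/mol values quoted in the text; the layout is illustrative.
RULES = [
    ("HbA1c gap < 6.6", lambda p: p["gap"] < 6.6, "inertia"),
    ("HbA1c gap > 11", lambda p: p["gap"] > 11, "no inertia"),
    ("HbA1c < 68", lambda p: p["hba1c"] < 68, "inertia"),
    ("HbA1c > 75", lambda p: p["hba1c"] > 75, "no inertia"),
]


def fired_rules(patient):
    """Return (premise, consequence) for every rule whose premise holds."""
    return [(premise, consequence)
            for premise, holds, consequence in RULES
            if holds(patient)]


# Hypothetical patient: moderate HbA1c but a large jump since the last visit.
print(fired_rules({"gap": 12, "hba1c": 70}))
```

For this hypothetical patient only the gap rule fires, which mirrors the observation that a sudden increase in HbA1c is associated with non-inertia even when the absolute value lies between the two static thresholds.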
Given that among the transparent ML rankings the HbA1c gap is an innovative finding in that it has not yet been described in the literature as a factor that plays a role in insulin inertia, we wanted to verify with traditional statistics the correlation between the variables revealed by transparent ML and insulin inertia. To confirm the results from the model, a statistical analysis was carried out on those individuals identified by ML who at some point in their clinical history had HbA1c values of >58 mmol/mol (>7.5%) for one or two consecutive visits. HbA1c was compared between inertia-YES and inertia-NO conditions at "T0." In the inertia-YES condition, T0 represents the point at which an individual presents with an HbA1c >58 mmol/mol (>7.5%) for the second consecutive time and is the point at which insulin therapy initiation would have been appropriate. 24 In the inertia-NO condition, T0 represents the point at which a patient who has had an HbA1c >58 mmol/mol (>7.5%) for one or two consecutive visits is prescribed insulin. The total data pool comprised 129 373 individuals, 32 752 of whom were started on insulin therapy in accordance with the 2020 guidelines of the American Diabetes Association, 43 375 whose insulin therapy initiation was delayed, and 53 246 who never received insulin as reported in Table 1. Figure 2 provides an illustration of the stratification of HbA1c levels across patients who experienced a delay in insulin initiation (inertia-yes) and those who did not (inertia-no).
The primary goal of the statistical analyses was to verify if the HbA1c gap was able to discriminate the presence or absence of inertia for any HbA1c level at a particular visit. Figure 4 shows the calculated HbA1c gap across different ranges of HbA1c at T0 for individuals who did and did not experience therapeutic inertia. What can be observed in Figure 4 is that the HbA1c gap is systematically higher across all ranges of HbA1c in those individuals who did not experience therapeutic inertia.
To confirm the statistical validity of the results, we began by determining whether the data have a normal distribution by submitting the entire range of HbA1c values to a Jarque-Bera test, which is typically used for large data sets such as ours. The results of the test indicated a nonnormal distribution (p < .05). The data were then submitted to a Mann-Whitney test for each of the pairwise comparisons starting with 58-64 mmol/mol (7.5%-8%) and ending with 119 mmol/mol (13%). The analyses confirmed a statistically significant difference in the average HbA1c gap between the two groups (presence or absence of inertia) (p < .01, two-tailed test) across all ranges. A further verification was made by calculating the ROC curve for the HbA1c gap related to inertia. Figure 5 shows the AUC (0.776), which confirms the ability of the HbA1c gap to discriminate situations of therapeutic inertia from those where no therapeutic inertia was present, as suggested by ML. The Youden index for the HbA1c gap indicates a threshold value equal to 5.5 mmol/mol (0.505%), which is consistent with the threshold value obtained with ML. It should be noted that the threshold values obtained with the two techniques were not identical, given that ML takes into consideration the HbA1c gap along with all other variables input into the model, whereas the Youden index refers strictly to the HbA1c gap as a single variable.
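For readers unfamiliar with the ROC-based check, the sketch below computes an AUC and a Youden-index threshold for a single variable. It uses a handful of made-up gap values, not study data, and treats "insulin started without delay" as the positive class (label 1); both the data and the function names are assumptions. The quadratic-time AUC is chosen for readability, not for data sets of the size analyzed here.

```python
def roc_points(scores, labels):
    """(threshold, TPR, FPR) per distinct score; predict positive when score >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((thr, tp / pos, fp / neg))
    return points


def youden_threshold(scores, labels):
    """Threshold maximising J = sensitivity + specificity - 1, i.e. TPR - FPR."""
    return max(roc_points(scores, labels), key=lambda p: p[1] - p[2])[0]


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count 0.5
    return wins / (len(pos) * len(neg))


# Made-up HbA1c gaps (mmol/mol); label 1 = insulin started, 0 = inertia.
gaps = [2, 4, 5, 6, 7, 9, 12, 14]
started = [0, 0, 0, 1, 1, 0, 1, 1]
print(youden_threshold(gaps, started), auc(gaps, started))
```

Because the Youden index looks at one variable in isolation, its threshold need not coincide with a threshold learned jointly with other inputs, which is the point made in the paragraph above.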
FIGURE 2 Stratification of HbA1c levels at "T0" across patients who experienced a delay in insulin initiation (Inertia-yes) and those who did not (Inertia-no)

Finally, in order to verify whether the HbA1c gap continues to discriminate between situations in which therapeutic inertia is present and those in which it is not even after T0, all patient visits after T0 (T0 + 1, T0 + 2…T0 + n) were analyzed and the HbA1c gap was calculated. Figure 6 illustrates the results of this analysis for individuals who received insulin and those who continued to experience therapeutic inertia. Figure 6 once again shows that even for patient visits subsequent to T0, the HbA1c gap is systematically elevated in those individuals who receive insulin relative to those who continue to experience therapeutic inertia. The Jarque-Bera test was also applied to this series of data and results indicated a nonnormal distribution (p < .05). A Mann-Whitney test revealed that for all periods following T0, the differences in average values between the two groups were statistically significant (p < .01, two-tailed test). This finding further confirms the discriminating role of the HbA1c gap between situations in which therapeutic inertia is present and those in which it is not.
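The two-stage procedure used here (a normality check followed by a rank-based comparison) can be sketched with the textbook formulas. This is not the software used in the study: the Jarque-Bera p-value relies on the asymptotic chi-square distribution with 2 degrees of freedom, and the Mann-Whitney p-value uses a normal approximation with tie correction and no continuity correction, both of which are simplifying assumptions suited to large samples. Function names are hypothetical.

```python
import math


def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value) for a test of normality."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, math.exp(-jb / 2)  # survival function of chi-square with 2 df


def mann_whitney_u(a, b):
    """Return (U statistic of sample a, two-sided p-value by normal approximation)."""
    pooled = sorted(list(a) + list(b))
    n1, n2 = len(a), len(b)
    ranks, tie_term, i = {}, 0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    u1 = sum(ranks[v] for v in a) - n1 * (n1 + 1) / 2
    n = n1 + n2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - n1 * n2 / 2) / sigma
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Made-up HbA1c gaps (mmol/mol) for an inertia group and a non-inertia group.
inertia_gaps = [1, 2, 3, 3, 4, 5]
no_inertia_gaps = [6, 8, 9, 11, 12, 15]
print(jarque_bera(inertia_gaps + no_inertia_gaps))
print(mann_whitney_u(inertia_gaps, no_inertia_gaps))
```

A rank-based test is the natural follow-up once normality is rejected, because it compares the ordering of values across the two groups rather than assuming a particular distribution.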

4 | DISCUSSION
The current state of therapeutic inertia and delay in initiation of insulin therapy in individuals with type 2 diabetes is systemic and unsettling. 2,3 This inadequate approach to glycemic control in individuals with type 2 diabetes is a global concern and one that is related to poor patient outcome as well as increased socioeconomic and clinical burden. 4,5,25,26 Though several studies have provided explanations for the motivation behind the delay in insulin therapy, 27,28 clearly, new approaches are needed to get at the root of the factors that are associated with therapeutic inertia, particularly the identification of key drivers that may break a health care provider's tendency to delay initiation of therapy.

FIGURE 3 Area under the receiver operating characteristic (AUC ROC) curve of the best performing model

FIGURE 4 Average HbA1c gap across HbA1c ranges at T0 for individuals who did and did not experience therapeutic inertia. Mann-Whitney (U) and number of subjects is given for each range. (*) indicates statistical significance
Our objective for this study was to uncover as yet unrecognized factors that motivate a provider to move away from behavioral inertia and initiate insulin therapy using ML techniques and evidence-based medicine. LLM, a "clear box" ML with "explainable AI," is able to delineate the characteristics of those individuals who face therapy initiation inertia compared with those who undergo treatment with insulin in accordance with established guidelines. It was thus possible to establish a predictive model capable of identifying key drivers associated with the initiation of insulin therapy with high accuracy (0.79).

FIGURE 5 Area under the receiver operating characteristic (AUC ROC) curve of the HbA1c gap relative to YES/NO inertia

FIGURE 6 HbA1c variation for subgroups of individuals who are initiated on insulin either at T0 + 1 or on subsequent visits (T0 + 2… T0 + n). Mann-Whitney (U) and number of subjects is given for each range. (*) indicates statistical significance
Specifically, our results revealed that a medical practitioner was more likely to initiate insulin not only as a consequence of excessively elevated HbA1c as one might expect, but also when, across two consecutive visits, a patient showed a difference in HbA1c of 1% or greater. This difference, which we are calling the HbA1c gap, is correlated with a movement toward insulin initiation irrespective of the absolute HbA1c value and could play an important role in the decision to start insulin. That is to say, it was the HbA1c gap that prompted insulin initiation rather than only the absolute HbA1c at a current visit, which in our study varied between 7.5% and 11%.
We believe that our results could provide the medical practitioner with an additional and new measure to monitor in existing or new patients. One could even envision the inclusion of an AI algorithm as part of a patient's electronic health record, which could alert the provider in real time as to risks related not only to inertia but to other health concerns that can be modeled in a manner similar to what has been described in this report. This type of analysis, which has been termed "augmented intelligence," is a way to use AI to improve the quality of decision-making rather than substitute or automate human decision-making. Rather than waiting for a critical point at which the patient begins to experience new and potentially dangerous comorbidities as the moment to begin insulin therapy, the medical practitioner could remain more aware of variability in a patient's condition in real time and intervene more promptly and appropriately. The model's high level of precision confirms the adequacy and utility of the input variables to identify key drivers that influence a health care provider's tendency to exhibit or not exhibit inertia when faced with an individual with type 2 diabetes. The model clearly points to the role of dynamic variables related to glycemia as crucial for the determination of whether or not a health care provider will make a timely decision or remain inert with respect to the initiation of insulin therapy.
A weakness in the study is related to the need to validate findings in a pilot study first. Furthermore, a weakness of any ML model is its reliance on electronic medical data, which to be fully functional for use with ML, must somehow remain in a public domain and not be privately owned data. Otherwise, access to the data can be terminated when an attempt is made at integration with ML modeling from external software.
The data from this study provide new insight for health care providers as they face the challenges of understanding their patients and providing the individualized care that each needs. The information generated by these LLM analyses not only will allow health care providers to gain insight into the factors that drive their behavior leading to therapeutic inertia but also clearly paves the way for in-depth investigations of other unknown factors using ML techniques that could help identify further subgroups at greater risk for therapeutic inertia.
In sum, the presence of inertia in the absence of complications suggests that the importance of timely and decisive therapeutic action is still often underestimated, even though current guidelines clearly indicate that timely therapeutic action prevents future complications. The present real-life study has demonstrated how research on inertia can generate new points of view and novel approaches, which will give health care providers the ability to create innovative, effective, and realistic training on this highly debated topic.