Opening the black box: Personalizing type 2 diabetes patients based on their latent phenotype and temporal associated complication rules

It is widely considered that approximately 10% of the population suffers from type 2 diabetes. Unfortunately, the impact of this disease is underestimated. Patient mortality often results from complications caused by the disease rather than from the disease itself. Many techniques used to model diseases take the form of a “black box” whose internal workings and complexities are extremely difficult to understand, from both the practitioners' and the patients' perspectives. In this work, we address this issue and present an informative model/pattern, known as a “latent phenotype,” that aims to capture the complexities of the associated complications over time. We further extend this idea by combining temporal association rule mining and unsupervised learning in order to find explainable subgroups of patients with more personalized prediction. Our extensive findings show how uncovering the latent phenotype helps to distinguish the disparities among subgroups of patients based on their complication patterns. We gain insight into how best to enhance prediction performance and reduce bias in the applied models by using the uncertainty in the patients' data.


INTRODUCTION
Predicting the complications associated with the disease is challenging. They can be numerous and can interact in complex nonlinear ways throughout the disease process. However, if we can better predict the onset of different complications in individual patients, then we can intervene more effectively. In addition, to gain patients' trust and satisfaction, it is essential to understand and explain the influencing factors of the disease that guide decisions. Black box AI models in the clinical decision-making process are models that attempt to predict/diagnose/forecast/group patients using complex parameters that are not easily understood. For example, the complexity of the many hidden layers in a deep neural network and their interconnections makes it challenging to determine precisely how predictions are being made. Compare this to decision trees or graphical models, where inference is more transparent and therefore explainable. Previously, we have explored the use of probabilistic graphical models to build more transparent methods of modeling disease progression. In particular, we used dynamic Bayesian networks to model clinical data and predict the onset of type 2 diabetes mellitus (T2DM) complications. 1 We developed methods to infer the location of hidden variables within these models in order to improve prediction. 2 The behavior of these hidden variables over the course of the disease process can be thought of as a "temporal phenotype" for an individual patient, 3 which is considered here as a "latent phenotype." Preliminary experiments in Reference 4 showed that it is possible to find subgroups of patients based only on their latent phenotype. Nevertheless, the techniques used in these investigations were not validated for interpreting each subgroup to enhance the prediction of the associated complications.
Therefore, this study adopts a hybrid approach that uses a variety of patient subgroups to improve the prediction of the associated complications. These techniques can also be combined for a better understanding of the latent variable as well as the underlying pattern of complications for each type of patient. In this article, temporal association rules (TARs) are used to identify the frequent co-occurrence of complications over time. An integration of TARs and pattern clustering attempts to build meaningful subgroups. The obtained clusters of the rules are compared with clusters of the latent phenotypes, which are extracted from the hidden variable by using dissimilarities and the dynamic time warping (DTW) distance among patients. 3 In Section 3, we discuss the data used to explore our approach along with details of the methods introduced. In Section 4, we document the results of these methods. The prediction accuracy of the complications is also used to validate the contribution of this study: the discovered subgroups yield higher performance than the raw dataset, which considers neither the presence of the latent phenotype nor the proposed hybrid methodology. Section 5 discusses the challenges and our solution in more detail when tested on the diabetes data before concluding in Section 6.

Related work
The World Health Organization (WHO) reported that T2DM accounts for at least 90% of all diabetes types. Another WHO study revealed that T2DM patients are at increased risk of long-term vascular comorbidities, which are known as an "underlying cause of death" and a severe phenotype of the disease. 5 It has previously been observed that patients with T2DM are also at an increased risk of microvascular comorbidities, including nephropathy, neuropathy, and retinopathy. 5 As with type 1 diabetes patients, although genetic factors affect the development of T2DM, it is believed that overlooking developing complications harms patients' lives, because each patient may develop a different profile of complications and features, which changes over time at each follow-up visit. However, these life-threatening complications remain undiagnosed for a long time because of the hidden patterns of their associated risk factors. 6 The underlying pattern of the complications, including how their co-occurrence is followed or caused by other complications associated with the disease, is known as the major source of mortality and morbidity in T2DM. 7 This is because predicting a target complication can be challenging without considering the effects of its associated complications.
Understanding the associated pattern of complications has been used significantly in the clinical domain. 8 It provides insight into the prediction and relative prevention of the associated complications that are expected to occur in a patient's follow-ups. 9 It can generally lead to less suffering for patients while saving time and cost for healthcare providers. However, this is highly dependent on the stage of the disease along with the previously occurring complications, which calls for time series analysis. In time series analysis, every disease risk factor and complication is determined by various features in previous patient visits (time intervals).
In this work, we attempt to address this issue and present informative rules and ordering patterns of patient behavior, with the aim of capturing the complexities of the associated complications over time. The proposed descriptive strategy relies on a useful tool known as association rules (ARs) to detect interesting relationships among T2DM complications. The AR strategy originated from learning patterns in supermarket transaction data and was introduced by Agrawal. 10 Temporal abstraction (TA) has also been employed for the segmentation and aggregation of time series data into a symbolic representation suitable for decision making and data mining. 11 TARs 12 are an extension of ARs 10 for analyzing basket data that include a temporal dimension to order related items. Many algorithms for temporal rules work by dividing the temporal transaction database into different partitions based on the time granularity. For example, different mining algorithms have been reformulated and presented to reflect the new general TARs, including the progressive partition miner (PPM), the segmented progressive filter (SPF), and the TAR algorithm. 10 Various algorithms have been proposed for the incremental mining of TARs, especially for numerical attributes. 13 Allen's rules 14 were used to abstract time series data into a relation (PRECEDES) in order to find TARs in Reference 15. Various ways have been proposed to explore the problem of TAR discovery. 16 Nevertheless, previous studies employed the AR strategy on a given subset specified by time, 17 without considering the specific exhibition period of the elements.
Association rule mining (ARM) finds frequent patterns by mining ARs with the use of two basic parameters, support and confidence. 18 The majority of the previous ARM algorithms worked by dividing the temporal transaction database into different partitions based on the required time granularity. TARs were then mined by locating frequent temporal item subsets within these partitions. However, the incremental mining of TARs for numerical attributes cannot always be easily adapted to a transaction database. Despite all these efforts, it appears that no existing method can find meaningful subgroups of patients based on the underlying pattern of complications in the presence of latent risk factors. With an objective similar to that of this work, Moskovitch and Shahar 11 conducted a study in which time-interval mining methods obtained informative temporal patterns for finding relationships in the transitivity inherent in time series of diabetic patients. They also exploited TA for the segmentation and aggregation of a time series into a symbolic representation suitable for decision making and data mining. Although Moskovitch's paper is consistent with this study in using supervised learning on time series diabetes data, it differs from this work in that it finds meaningful time series patterns based only on gender, rather than complex temporal patterns from a longitudinal clinical dataset in the presence of latent risk factors.
A considerable amount of literature has been published on TARs to discover interesting rules based on several quality filtering metrics known as constraints. Luna et al 19 conducted an empirical study in the optimization of the most interesting groups of metrics. In addition, recently, they provided a rich review on the commonly used frequent itemsets mining algorithms. 20 Part of this work is motivated by Hashler and Karpienko, 21 which introduced a distance-based clustering of ARs. It then is supported by Li et al, 22 which revealed that applying a postprocessing method to ARs to find the most frequent calendar patterns improves interpretability in the descriptive analysis. Unfortunately, the previous methods were not only limited in time granules but also increased the uncertainty in the relationship among rules, while there was overlap among clusters in k-means clustering. The frequent pattern mining research significantly affects data mining techniques in longitudinal data. A postprocessing approach in Reference 23 attempted to extract interesting subsets of temporal rules within T2DM data. However, it only considered characteristic patterns of administrative data without the appearance of latent variables. Other researchers have undertaken AR mining of clinical data. Lee et al attempted to address the issue in Reference 24, and these have led to the proposal of the concept of general TARs, where the items are allowed to have varying exhibition periods, and their support is made based on that. Another piece of research conducted by Plasse et al 25 looked at finding homogeneous groups of variables. They suggested that a variable clustering method could be applied to the data in order to achieve a better result in pattern discovering methodology. However, their strategy to mine ARs differs from this study in which the number of rules was reduced only based on hierarchical clustering applied to items, not to multiple identical binary attributes. 
Among these, some methods uncovered temporal patterns and relationships among clinical variables, including causal information 26 and numeric time series analysis. 27 In longitudinal clinical data (eg, T2DM), one of the most important factors in the high number of dependencies among features and complications is the appearance of unmeasured risk factors. Surprisingly, the effect of understanding unmeasured variables, which play an important role in disease prediction, does not seem that closely examined. The reason behind this might be because of the recent focus on the AI models with a black box nature. What is more, there are several issues with TARs when there are some rare rules of particular interest. 28 Given the strong association between the complications, another challenge is the existence of unknown (latent) factors in the data. It is crucial to understand better the latent variables and other associated risk factors to be able to predict their underlying patterns earlier than their actual occurrence time. That can be done by exploring a well-chosen group of potentially all significant patients' patterns while identifying temporal phenotypes based on their unmeasured risk factors with reasonably minimal outliers. Having insight into the causal associations, among disease complications, we attempt to open a black box model to ease interpretation of the hidden patterns of complications in an accurate predictive model. We, therefore, need to take into consideration both descriptive and predictive data mining strategies.
Nevertheless, Lakkaraju et al 29 suggested that there is a trade-off between patient personalization (in a descriptive analysis) and prediction performance (in a predictive analysis). In other words, explainability (in an explainable/interpretable model) is often achieved at the cost of the predictive accuracy offered by a black box model. 8 Therefore, in black box models, it can be challenging to determine from temporal clinical data alone what is coordinating the visible patterns, and to separate the underlying causes into meaningful and spurious ones, which would help patient stratification through an understanding of the hidden variables. Black box AI models in decision making are mostly based on deep learning techniques with many latent variables. For example, these models map a patient's latent/risk factors into a class based only on combinations of weights, without exposing the reasons why. Black box models are problematic not only because of their lack of transparency but also because of possible biases that the algorithms inherit from clinicians' mistakes. 30 This issue arises from human prejudices and from underestimating the impact of the underlying behavior/pattern of the risk factors, as well as from the existence of latent variables in the dataset, which may lead to incorrect and unfair decisions.
Nevertheless, considering all of this evidence, none of the above studies has clustered uneven time series clinical data based on a hidden variable in order to extract the temporal phenotypes and behaviors of patients. There are quite a few research studies on predicting T2DM complications and on T2DM black box models. However, studies that explain an unknown risk factor/latent phenotype by using a hybrid data mining methodology (including descriptive and predictive strategies) are rare in the literature.
In this work, we argue that binary complications could be predicted accurately by discovering the latent factors and adding them to the observed data. 1 Another study, in Reference 3, concentrated primarily on a clustering approach based on the latent variable to personalise the patients. That is consistent with the more recent work in Reference 4, which also provided a comparison methodology to evaluate the discovered latent variable clusters by using a combination of unsupervised learning, such as clustering, and TARs among the binary complications. Hence, Reference 4 found clusters similar to those obtained in Reference 3. This article extends the previous work in Reference 4 in order to take into consideration both descriptive and predictive analysis, with the basic idea of precise prediction through an explainable model.
To sum up, the motivation behind this work is to suggest that the identification of a "latent phenotype" can be used to separate patients into meaningful subgroups while considering the relations among T2DM complications. In general, because the balancing strategies in prior studies deal with imbalanced data for one complication at a time, it is challenging to enhance the prediction performance for all complications. Therefore, another motivation for this study is to improve the performance of predicting associated complications while considering the imbalance issue.

Data
The data for this study consist of prediagnosed T2DM patients aged 25

Preliminaries
From diabetes health status records, the T2DM dataset (denoted here as DS) is accumulated from prediagnosed diabetic patients. For each patient in the T2DM dataset, $i$ identifies a distinct patient with $i \le p$, where $p = 356$ denotes the number of patients in DS. $V_i$ refers to the visits of patient $i$, and $T_i$ is the number of visits recorded for the $i$th patient. The number of visits is not necessarily the same for different patients and varies ($2 \le T_i \le 300$). Hence, there is a total of $T = 3959$ visits/instances in DS, which contain temporal observations of the occurring complications.
Let $V_i = \{V_{i1}, \ldots, V_{iT_i}\}$ be the set of visits for the $i$th patient with $T_i$ time points, where $V_{iv}$ represents the $v$th visit of patient $i$ (as demonstrated in Equation (1)). For each of the patients in DS, a linear order is defined over the visits: $v \le z$ means that $V_{iv}$ occurs before, or is earlier than, $V_{iz}$ in $[V_{iv}, V_{iz}]$. In order to clarify the dataset, a vector of patients is demonstrated in Equation (1).
Tables 1 and 2 represent the selected T2DM complications (comorbidities), risk factors, and their clinical control values. Data are discretized into qualitative states (binary and nonbinary features) of ordinal clinical risk by using statistical parameters such as mean, median, and SD. The main goal of this thesis is to understand the underlying patterns of associated binary complications.
In this study, the association of nonbinary risk factors/symptoms has not been considered when extracting rules among T2DM complications. The reason is that, by utilizing the discovered latent variable, the overall behavior of the T2DM risk factors is captured by using the IC*LS algorithm in a DBN framework (which is called a "latent phenotype"). Therefore, this study concentrates only on five binary complications as predictive target classes in a binary classification problem (two categories of classes: "high" or "low" risk). A complication class value of low risk (zero) represents a patient visit in which the complication is not present; otherwise, it is at high risk (one). In contrast, other risk factors associated with a patient (symptoms/clinical tests) are abstracted into multiclass classification problems with more than two targets, including "high," "medium," and "low" risk patients, according to a diabetes expert's definitions. 32,33 The set of binary complications in DS comprises five complications, each of which is one of HYP, NEU, NEP, LIV, and RET and takes a clinical class value from {low, high}. For example, if the $i$th complication of the $k$th patient is diagnosed negatively (the patient does not have the complication), the class value becomes zero (low); otherwise, it is set to one (high), which shows that the patient is diagnosed positively (having the $i$th complication).
For retrieving the conditional rules (if-then patterns) among the complications, we need to make use of some concepts within the associated complication rules. Here, the preliminaries for ARs are defined according to a study conducted by Parvez et al. 34 ARs in this article aim to uncover all such relationships between complications in the T2DM dataset. A TAR of the form {antecedent ⇒ consequent} represents finding the consequent in the patient visits (which are called the basket) following the corresponding antecedent.

Latent phenotype discovery and time series clustering
The previous work by Yousefi and co-authors in Reference 4 stated that a discovered latent phenotype could be used to capture the temporal risk factors while monitoring the pattern changes in the disease. The latent phenotype for each patient is extracted from the most influential hidden variable identified using the IC* Stepwise algorithm, 4 which uses a DBN framework for inferring the model structure and any potential hidden variables simultaneously. A latent variable H is defined as the expected values of this hidden variable, calculated using the EM algorithm within the DBN framework. Time series clustering with DTW is applied to these expected values of the latent variable to generate clusters of patients and to identify the "medoid" patient at the center of each cluster. Having discovered the latent phenotype clusters (called "H clusters"), we assume that patients within a cluster share a similar risk factor profile as well as a similar pattern of occurring complications. In this study, this pattern for each H cluster represents the most frequent ordering pattern of complications, which is associated with the corresponding latent phenotype. However, the meaning of H and its influence on the complication pattern of each subgroup of patients has remained unclear. In order to understand how the latent phenotype helps to group patients, a combination of TAR mining and time series clustering is performed in the next section.
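The DTW-based clustering step can be illustrated with a minimal sketch. It assumes the textbook DTW recurrence with an absolute-difference local cost, and it takes the medoid to be the patient with the smallest total DTW distance to the other patients in the cluster; the function names and the sample values of H are hypothetical and are not taken from the study.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def medoid(series_by_patient):
    """Patient id with the smallest summed DTW distance to all patients (assumed definition)."""
    ids = list(series_by_patient)

    def total(pid):
        return sum(
            dtw_distance(series_by_patient[pid], series_by_patient[q]) for q in ids
        )

    return min(ids, key=total)


# Hypothetical expected values of H; visit counts differ between patients.
h_series = {
    "p1": [0.1, 0.2, 0.8, 0.9],
    "p2": [0.1, 0.8, 0.9],
    "p3": [0.9, 0.9, 0.1, 0.1, 0.1],
}
print(medoid(h_series))
```

Because DTW aligns sequences of different lengths, patients with uneven numbers of visits can be compared directly, which is why it suits this kind of longitudinal data.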

TAR and AR mining
In this study, ARM is a method that discovers all combinations/sequences/sets of items (complications), called itemsets, whose frequency of transactions (referred to as support) is greater than a predefined minimum threshold based on large itemsets (in this case, greater than 0.001). To generate interesting rules with a confidence greater than the default threshold, it was important to find large itemsets. However, for the sake of simplicity, and because the dataset is small and contains sensitive clinical data, a confidence constraint of 25% is chosen. In the T2DM dataset, support is regarded as an explicit constraint for identifying outliers. Thus, the minimum constraints must be set at a low level. This is because complication rules with predefined constraints vary from one patient to another. Moreover, in a small dataset where bias is present, it is necessary to ascertain that the frequent items do not affect the associations of items other than HYP. The T2DM binary complications represent the items of TARs in the shopping basket problem. An itemset of {antecedent, consequent} represents the sequence of complications that occurs between two visits in $[V_{iv}, V_{iz}]$. An itemset I is a transaction that represents the pattern of all associated complications over a patient's time series (from the first recorded visit to the last visit). I is a transaction in database R, and a rule is an implication of the form {antecedent ⇒ consequent}. The maximum number of items in I is five (|I| = 5), which is equal to the number of binary complications (items). In terms of temporal notation, any two itemsets with similar complication co-occurrences are treated as equivalent, and any redundant complication in their intersection is ignored.
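As an illustration of the support constraint described above, the following sketch enumerates every itemset over the five complications by brute force and keeps those whose support exceeds the 0.001 threshold. It is not the mining algorithm used in the study; the function names and the example transactions are hypothetical.

```python
from itertools import combinations

ITEMS = ("HYP", "NEU", "NEP", "LIV", "RET")


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support=0.001):
    """All non-empty itemsets over ITEMS whose support exceeds `min_support`.

    With only five items there are 31 candidate itemsets, so exhaustive
    enumeration is affordable in this sketch.
    """
    result = {}
    for k in range(1, len(ITEMS) + 1):
        for combo in combinations(ITEMS, k):
            s = support(combo, transactions)
            if s > min_support:
                result[frozenset(combo)] = s
    return result


# Hypothetical transactions: one set of developed complications per patient.
transactions = [
    frozenset({"HYP", "LIV"}),
    frozenset({"HYP"}),
    frozenset({"HYP", "LIV", "RET"}),
    frozenset({"NEU"}),
]
for itemset, s in sorted(frequent_itemsets(transactions).items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), s)
```

Setting the threshold this low keeps rare itemsets in play, which matches the stated intention of using support to identify outliers rather than to discard them early.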
In order to analyze antecedent and consequent itemsets, we state the following definitions: {i, j} is an ordered pair of complications (two-tuple) representing the set consisting of both complications i and j with respect to their ordering pattern. {i, j, k} is an ordered triple (three-tuple), while ∅ or {} is the empty tuple (zero-tuple). The consequent itemsets may consist of more than one item per rule. In the process of pruning/analyzing the rules to pick the most interesting ones, our main priority in the predictive model for decision making is based on the consequents. Note that although the empty set (no complication is diagnosed) is an empty type and a subtype of every rule antecedent, it is not allowed to appear in consequents. Database $R = \sum_{l=1}^{m} R_l$, with $m = 87$, is retrieved based on the relationships among the complications for all patients within DS. In addition, if there is an "OR" (|) operation among the complications in a rule, {i | j} means that either of the complications (i or j) can occur, or neither of them ({}), as shown in Equation (2).
Each of the patients in DS can develop any combination of the items, where C(i) represents all complications/items that patient i has developed during the recorded visits. A k-combination of items is a subset of k distinct complications/items chosen from the item set (a k-itemset, which is called a subrule). For each patient i with a set of visits $V_i$, the number of k-combinations is equal to the binomial coefficient. C(i) ⋈ C(j) is the natural join of the relations C(i) and C(j), consisting of all combinations of tuples in C(i) and C(j) that are equal on their common complications.
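One way to picture the ordered tuples is the sketch below. It assumes that the ordering pattern is the order in which complications first appear across a patient's visits, and that a rule is formed by splitting the ordered tuple into an antecedent and a non-empty consequent. Both assumptions, and the function names, are illustrative only and are not confirmed by the text.

```python
def onset_order(visits):
    """Complications in order of first appearance across chronologically ordered visits.

    `visits` is a list of sets, one per visit, holding the complications
    recorded as high risk at that visit (assumed representation).
    """
    seen = []
    for visit in visits:
        for complication in sorted(visit):  # sorted only to break ties deterministically
            if complication not in seen:
                seen.append(complication)
    return tuple(seen)


def split_rule(ordered, n_antecedent):
    """Split an ordered tuple into (antecedent, consequent).

    The antecedent may be the empty tuple, but the consequent may not,
    mirroring the constraint that {} is not allowed in consequents.
    """
    antecedent, consequent = ordered[:n_antecedent], ordered[n_antecedent:]
    if not consequent:
        raise ValueError("the empty set is not allowed as a consequent")
    return antecedent, consequent


# Hypothetical patient with three visits.
visits = [{"HYP"}, {"HYP", "LIV"}, {"HYP", "RET"}]
ordered = onset_order(visits)
print(split_rule(ordered, 2))
```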
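The count of k-combinations can be checked with a short sketch. The set of developed complications below is hypothetical, and `k_itemsets` is an illustrative helper name rather than anything defined in the study.

```python
from itertools import combinations
from math import comb


def k_itemsets(items, k):
    """All subsets of exactly k distinct complications (k-itemsets, or subrules)."""
    return [frozenset(c) for c in combinations(sorted(items), k)]


# Hypothetical C(i): complications developed by one patient over all visits.
developed = {"HYP", "LIV", "RET"}

for k in range(len(developed) + 1):
    subsets = k_itemsets(developed, k)
    # The number of k-itemsets equals the binomial coefficient "n choose k".
    assert len(subsets) == comb(len(developed), k)
    print(k, [sorted(s) for s in subsets])
```

With all five complications present, the same count gives 1, 5, 10, 10, 5, and 1 itemsets for k = 0 to 5, that is, 32 subsets in total including the empty one.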
Given a set of is the set of all complications denoted by Thus, a sequence of complications co-occurrence of is assumed as a partial ordering on ( ), where ( , ⊆) is a poset considering . The inclusion relation ⊆ defined as a partial ordering on the power set of with a reflexive, antisymmetric, and transitive nature. For example, i with a pattern of the . Thus, patients ith and jth are developing a similar rules and belonging to a subgroup if their antecedents follows . This means that the set of items in the consequent of is belonged to intersection of both rules (RHS(

Quality metrics
Support is the fraction of patients containing the itemset (which is called a transaction or a basket of items). Confidence is the probability of occurrence of {consequent} given that {antecedent} is present. Lift is the ratio of confidence to the baseline probability of occurrence of {consequent}. A frequent itemset is an itemset included in at least a significant number of patients. ARM involves the generation of itemsets and TARs. Maximal frequent itemsets are itemsets for which none of the corresponding supersets is frequent. The support measure of an itemset C(i), supp(C(i)), is defined as the proportion of transactions in the dataset containing RHS(C(i)). In particular, an AR of C(i) ⇒ C(j) has a support of P(C(i) C(j)), the joint probability of both itemsets. The confidence measure of a rule identifies the proportion of transactions with the most interesting/important relationships. In addition, the confidence of a rule is defined as confidence(C(i) ⇒ C(j)) = support(C(i) ∪ C(j)) / support(C(i)), which satisfies Equation (8).
The minimum support and the minimum confidence are user-defined parameters. Instead of using accuracy, efficiency is an appropriate way to evaluate ARs. 34 To obtain the frequent itemsets, TARs are first filtered by using support and confidence. However, these two measures are not able to filter complication rules based on the different dependencies among the rules. For this purpose, a measure of the independence of C(i) and C(j), known as lift, is used. Lift is the deviation of the support of the whole rule from the support expected under independence, given the support of both sides of the rule. Higher lift values indicate stronger associations. A lift of 1 indicates that C(i) and C(j) are independent, as shown in Equation (9).
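For reference, the three measures can be written in their standard textbook forms. The shorthand $X = C(i)$ for the antecedent and $Y = C(j)$ for the consequent is introduced here only for readability, $R$ is the transaction database, and these expressions are a restatement under that assumption rather than a reproduction of the numbered equations.

```latex
\begin{align*}
\operatorname{supp}(X) &= \frac{\lvert \{\, t \in R : X \subseteq t \,\} \rvert}{\lvert R \rvert}, \\
\operatorname{conf}(X \Rightarrow Y) &= \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)}, \\
\operatorname{lift}(X \Rightarrow Y) &= \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{supp}(Y)}
  = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)\,\operatorname{supp}(Y)}.
\end{align*}
```

Under independence, $\operatorname{supp}(X \cup Y) = \operatorname{supp}(X)\,\operatorname{supp}(Y)$, so the lift equals 1; values above 1 indicate that the antecedent and consequent co-occur more often than independence would predict.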
For example, the probability of developing both HYP and LIV is associated with the likelihood of developing RET. Confidence of HYP, LIV implying RET is given as the likelihood of developing HYP, LIV, and also RET over the likelihood of developing only HYP and LIV (see Equation (10)).
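The HYP, LIV ⇒ RET example can be worked through numerically. The five baskets below are invented purely for illustration, and the helper functions follow the standard definitions of support, confidence, and lift rather than any implementation from the study.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """supp(antecedent and consequent together) / supp(antecedent).

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the baseline support of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


# Hypothetical baskets, one per patient.
baskets = [
    frozenset(b)
    for b in (
        {"HYP", "LIV", "RET"},
        {"HYP", "LIV"},
        {"HYP"},
        {"RET"},
        {"NEU"},
    )
]

print(confidence({"HYP", "LIV"}, {"RET"}, baskets))
print(lift({"HYP", "LIV"}, {"RET"}, baskets))
```

In these baskets, two of five patients develop both HYP and LIV and one of those also develops RET, so the confidence is one half; RET itself appears in two of five baskets, which gives a lift slightly above 1.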
The confidence measures whether {RET, HYP, NEU, RET} implies LIV. This reveals how likely a given patient is to develop {RET, HYP}, NEU, RET, and LIV. In order to find the most interesting itemsets, support ensures that all subrules of the frequent itemsets are also frequent; hence, no superset of an infrequent itemset can be frequent. Confidence is very sensitive to the frequency of the consequent. It has been reported that consequents with higher support will produce higher confidence even when there is no association between the antecedent and the consequent. Thus, it might not perform effectively when bias is present in dataset DS, which has a small number of patients and relatively few complications. Confidence measures the strength of the ARs in which the patients that have complication C(i) have also developed C(j). There are a number of choices for the filtering measures, 35 such as lift, leverage, and coverage. In the T2DM dataset, there is a strong association (indicated by the highest lift) among the complications, which shows the likelihood of a complication being developed relative to its general development rate, given that the patient has developed other complications. For instance, the conditional probability of a patient developing both HYP and LIV is associated with the likelihood of the patient developing RET. Coverage, in contrast, filters the rules mostly based on their antecedents.
This is opposite to the preference of this article, where the consequents (the complications that occur in future visits) are considered the most revealing itemsets in the decision-making and prediction process. Similar to lift, the conviction metric assesses the likelihood that an antecedent appears while the corresponding consequent does not occur.
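The additional filtering measures can be sketched from their commonly used definitions, taking already computed supports as inputs. These formulas are the standard ones from the association rule literature and are assumed here; the text does not spell out the exact variants used, and the numeric inputs are invented.

```python
def coverage(supp_antecedent):
    """Coverage depends only on the antecedent: it is the antecedent's support."""
    return supp_antecedent


def leverage(supp_antecedent, supp_consequent, supp_joint):
    """Observed joint support minus the joint support expected under independence."""
    return supp_joint - supp_antecedent * supp_consequent


def conviction(supp_antecedent, supp_consequent, supp_joint):
    """(1 - supp(consequent)) / (1 - confidence).

    Returns infinity when the rule never fails, that is, when confidence is 1.
    """
    if supp_joint >= supp_antecedent:
        return float("inf")
    rule_confidence = supp_joint / supp_antecedent
    return (1.0 - supp_consequent) / (1.0 - rule_confidence)


# Hypothetical supports for a rule such as {HYP, LIV} => {RET}.
supp_x, supp_y, supp_xy = 0.4, 0.4, 0.2
print(coverage(supp_x))
print(leverage(supp_x, supp_y, supp_xy))
print(conviction(supp_x, supp_y, supp_xy))
```

Coverage ignores the consequent entirely, which is why it sits poorly with a study whose priority is the consequents; leverage and conviction both compare the rule against what independence would predict.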
Overall, the question remains whether these metrics can be trusted when they rely on user-defined thresholds. In particular, there are many challenges in finding the most interesting rules 36 by relying only on TARs. Moreover, most of the previously mentioned metrics in this study depend mainly on support and frequency. In a small dataset like DS, which has a different imbalance ratio for each item (complication) as well as bias and latent factors, it may not be beneficial to trust only the itemsets obtained by using support, confidence, and lift.
Moreover, some itemsets are called frequent itemsets because their occurrence exceeds the threshold in the database. In order to generate interesting rules, one could come across many frequent itemsets with minimal confidence. In other words, when a rigid constraint is applied to biased data, the final itemsets can be wrongly identified as interesting itemsets. This is because interestingness is then based only on the association of HYP with the items, not on the relationships among the items themselves. An item like HYP with a high occurrence rate can affect how the other items are associated with each other. To avoid this issue in a small dataset, we aim to discover all types of associations regardless of the effect of bias (eg, HYP) and focus mostly on relaxed or flexible filtering metrics.
It does not seem possible to rely only on lift, as it may not be trustworthy enough and may not perform effectively when bias exists in incomplete data. Lift suffers from having a nonfixed range. It only assesses the dependency and correlation of the items without taking into consideration the importance of the cause and effect relationships among antecedents and consequents. Similar to the issue with support and confidence, lift is susceptible to infrequent items: complication rules with a relatively low probability can be wrongly ranked as the most interesting itemsets. Applying very low or minimal constraints to the quality metrics does not eliminate this issue, which is caused by generating all possible permutations of complications for all transactions, a non-optimal option. This is because Tables 3 and 4 contain many different antecedents and consequents, which increase the database size exponentially with the number of items. It also leads to a large number of uninteresting distances among many small rules, despite the previously chosen optimal/minimal thresholds for support and confidence. In this situation, neither clustering nor ARM performs effectively, and the problem can be even worse in a sparse dataset (such as T2DM). In conclusion, to make a better decision, the uninteresting rules need to be reduced at another level, which is addressed in the next section.

METHODS
This section explains the methods used to find explainable subgroups of patients. Our recent work in References 3 and 4 suggested that the identification of a "phenotype" can be used to separate patients into meaningful subgroups while considering the relationships between T2DM risk factors and complications. Here we first identify an informative pattern based on latent variables, which we call a "latent phenotype." This is then used to group patients and capture the complexities/homogeneity of the risk factors/complications over time. Studies on enhancing the interpretability of latent variables while significantly improving prediction performance have been relatively scant. There is no study focusing on utilizing ARM on the underlying patterns of temporal complication rules (which we refer to as "complication rules") in order to explain the latent variable behavior. Since a clinical model can have serious consequences, it is imperative to better understand the associated complication rules in trustworthy/interpretable patient models. These models are relatively complex; however, they can be accurately built by using data mining techniques (including both descriptive and predictive strategies). We further extend this idea by using a hybrid methodology of TARs, ARM, time series clustering, statistical analysis, Bayesian structure modeling, and predictive analysis in order to find explainable subgroups of patients with more personalized prediction. To implement the model, the associated complication rules are mined to assess the likelihood of occurrence of binary complications in relation to the rest of the complications associated with a prediagnosed T2DM patient. For example, we ask whether the increasing prevalence of HYP has been accompanied by an increase in NEU, or whether patients with NEP are also diagnosed with LIV. Then, TARs are chosen according to the needs of the study to discover underlying relationships among the complications.
Similarly, pattern mining and sequence discovery are performed to explain and highlight the potential usefulness of the complication rules with a deeper understanding of their causal structure within the clinical data. With ARM we are interested in the absolute number of patients that have a particular set of complications. By utilizing TARs, given many patterns of complication rules (itemsets), we attempt to find which itemsets belonging to a patient predict another complication for that patient. We then use a postprocessing approach (called minimum coverage itemsets [MCI]) to prune the rules to the most important ones and to find the most useful distances in order to obtain meaningful clustering outcomes. Finally, we attempt to explain and validate these groups through the integration of TARs combined with time series clustering. Figure 1 illustrates the overall process, which includes hidden variable discovery, used to identify the latent phenotype and, in turn, to generate the latent phenotype clusters (H clusters); TAR clusters; and comparison and validation strategies (involving Jaccard distance metrics and sensitivity analysis). The proposed hybrid methodology to find explainable subgroups of patients and interpret the latent variable by personalizing diabetic patients in precision medicine is shown as a multiple-stage process in Figure 1, whose stages are labeled and explained as follows:
1. Data discretization and preparation are employed to generate the original T2DM dataset (DS) in the preprocessing approach.
2. For each patient, an informative pattern (latent phenotype) is identified based on the latent variable discovery approach using DBNs and the IC* Stepwise algorithm.
3. DTW finds dissimilarities between the discovered latent phenotypes and captures the complexities/homogeneity of the risk factors/complications over time.
4. Time series clustering based on DTW distance is applied to stratify patients into four latent phenotype clusters.
5. The multiple binary complications, as items from the preprocessed dataset DS, are extracted and mined to retrieve the temporal patterns of items for all patients.
6. TARs are applied to the patterns obtained from DS and generate Tables 3 and 4. These rules consist of (87 × 2) subrules (including 87 antecedents and 87 consequents).
7. A postprocessing ARM methodology is applied to the complication rules, where metrics such as support and confidence with predefined soft thresholds filter the frequent rules. These constraints are strengthened by requiring that the lift of the frequent rules reaches the highest lift boundary. Then another algorithm (which we call MCI) generates the fewest itemsets in D that cover the interesting rules from R. MCI locates alternative optimal combinations of the subrules in which the number of repetitive items can be reduced. As a result, dataset D is generated based on the most important rules.
8. All rules in R are mapped to the relevant objects/itemsets in D based on the implications of the antecedents and consequents. The Jaccard index is measured between the objects to cluster the complication rules (TAR clusters).
9. By using agglomerative clustering, objects are grouped into five groups. Patients are assigned to the corresponding cluster based upon their associated unique pattern of complications.
10. Jaccard similarity and further statistical methods are applied to compare and validate the discovered clusters to find meaningful subgroups of patients from the intersection of H and TAR clusters.
11. The prediction performance of the discovered meaningful subgroup (DS1) as a subset is compared to DS.
12. Sensitivity analysis is utilized to assess DS1 and analyze its prediction performance compared to DS.
13. The latent variable is explained for patients with a similar pattern of TARs and latent phenotype.

FIGURE 1 The proposed hybrid methodology to find explainable subgroups of patients and to interpret the latent variable by personalizing diabetic patients in precision medicine

Latent phenotype discovery and time series clustering
Previously, we stated that a discovered latent phenotype could be used to capture the temporal risk factors while monitoring the pattern changes in the disease. The latent phenotype for each patient is extracted from the most influential hidden variable identified using the IC* Stepwise algorithm, 4 which uses a DBN framework for inferring the model structure and any potential hidden variables simultaneously. We define H to be the expected values of this hidden variable, calculated using the expectation-maximization (EM) algorithm 37 within the DBN framework. Time series clustering with DTW is applied to these expected values of the latent variable to generate clusters of patients as well as to identify the "medoid" patient at the center of each cluster. Having discovered the latent phenotype clusters (which we call "H clusters"), we assume that patients within a cluster share a similar risk factor profile as well as a similar pattern of occurring complications. In this study, the pattern for each H cluster is the most frequent ordering of complications, which is associated with the corresponding deep latent phenotype. However, the meaning of H and its influence on the complication pattern of each subgroup of patients has remained unclear. In order to understand how the latent phenotype helps to group patients, a combination of TAR mining and time series clustering is performed in the next section.
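As an illustration of the DTW and medoid step described above, the following is a minimal sketch, assuming the textbook DTW recurrence with an absolute-difference cost and a medoid defined as the series with the smallest total DTW distance to all others. The function names and the toy trajectories in `h_values` are hypothetical and are not taken from the study.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def medoid(series):
    """Index of the series with the smallest summed DTW distance to the rest."""
    totals = [sum(dtw_distance(s, t) for t in series) for s in series]
    return totals.index(min(totals))


# Hypothetical expected values of H over five visits for three patients.
h_values = [
    [0.1, 0.2, 0.4, 0.7, 0.9],
    [0.1, 0.1, 0.3, 0.6, 0.9],
    [0.9, 0.8, 0.5, 0.3, 0.1],
]
print(medoid(h_values))
```

The pairwise DTW distances computed this way could serve as the input to any distance-based clustering routine; the specific clustering settings used in the study are not stated in this section.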

TARs and AR mining
In this study, ARM is a method that discovers all combinations/sequences/sets of items (complications), called itemsets, whose frequency across transactions (referred to as support) is greater than a predefined minimum threshold based on large itemsets (in our case, greater than 0.001).
To generate interesting rules with a confidence greater than the default threshold, it was important to find large itemsets. However, for the sake of simplicity, and because we have a small-sized dataset with sensitive clinical data, we choose a confidence constraint of 25%. In the T2DM dataset, support is regarded as an explicit constraint to identify the outliers. Thus, the minimum constraints must be set at a low level. This is because complication rules under predefined constraints vary from one patient to another. Moreover, in a small-sized dataset with bias, we need to ascertain that the frequent items do not affect the associations of items other than HYP. In finding the most interesting itemsets, support ensures that all subrules of a frequent itemset are also frequent; hence, no superset of an infrequent itemset can be frequent. Confidence measures the strength of the ARs in which patients that have complication i also developed complication j. It is very sensitive to the frequency of the consequent: it has been reported that consequents with higher support produce higher confidence even when there is no association between the antecedent and consequent. Thus, confidence might not perform effectively in the presence of bias in dataset DS, which has a small number of patients and complications. We have a number of choices for the filtering measures, 35 such as lift, leverage, and coverage.
In the T2DM dataset, there is a strong association (indicated by the highest lift) among the complications, which shows the likelihood of a complication being developed relative to its general rate of development, given that the patient developed other complications. For example, the conditional probability of a patient developing both HYP and LIV is associated with the likelihood of the patient developing RET. Coverage, in contrast, filters the rules mostly based on their antecedents. This is the opposite of the preference of the present paper, where the consequents (the complications that occur in future visits) are considered the most revealing itemsets in the decision-making and prediction process. Similar to lift, the conviction metric assesses the likelihood of the appearance of an antecedent without the corresponding consequent.
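To make the filtering metrics concrete, the sketch below computes them for a single rule using their standard textbook definitions: support as the fraction of transactions containing an itemset, confidence as supp(A ∪ C)/supp(A), lift as confidence/supp(C), leverage as supp(A ∪ C) − supp(A)·supp(C), coverage as supp(A), and conviction as (1 − supp(C))/(1 − confidence). The toy `patients` list is hypothetical and does not come from DS.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Standard interestingness measures for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    s_ac = support(transactions, a | c)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    leverage = s_ac - s_a * s_c
    # Conviction is unbounded when the rule never fails (confidence == 1).
    conviction = (1 - s_c) / (1 - confidence) if confidence < 1 else float("inf")
    return {
        "support": s_ac,
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "coverage": s_a,
        "conviction": conviction,
    }


# Hypothetical complication sets for five patients (one transaction each).
patients = [
    {"HYP", "LIV", "RET"},
    {"HYP", "LIV"},
    {"HYP", "NEU"},
    {"HYP", "LIV", "RET"},
    {"NEU"},
]
metrics = rule_metrics(patients, {"HYP", "LIV"}, {"RET"})
print(metrics)
```

Note how an item that appears in most transactions, such as HYP here, inflates the support of every itemset containing it, which is the bias discussed in the surrounding text.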
Nevertheless, the question remains whether we can trust these metrics under user-defined thresholds. In particular, there are many challenges in finding the most interesting rules 36 based only on the TARs and their constraints. For example, all of the previously mentioned metrics in this article depend only on support and frequency. In a small-sized dataset like DS, where there is a different imbalance ratio for each item (complication), as well as bias and latent factors, it may not be beneficial to rely only on the itemsets obtained by using support, confidence, and lift. Unfortunately, some itemsets are labeled frequent itemsets simply because their occurrence exceeds the threshold in the database.
Moreover, in order to generate interesting rules, we come across many frequent itemsets with minimal confidence. In other words, when a rigid constraint is applied to biased data, the final itemsets can be wrongly identified as interesting. This is because interestingness is based only on the association of HYP with the items, not on the relationships among the items themselves. An item like HYP with a high occurrence rate can affect how other items are associated with each other. To avoid this issue in a small-sized dataset, we need to find all types of associations regardless of the effect of HYP, using relaxed or flexible filtering metrics.
Having said that, if we rely only on lift, the result might not be trustworthy enough and might not perform effectively when bias exists in incomplete data. Lift suffers from having a nonfixed range of values. It only assesses the dependency and correlation of the items without taking into consideration the importance of the cause and effect relationships among antecedents and consequents. As with support and confidence, lift is susceptible to infrequent items: complication rules with a relatively low probability can be wrongly ranked as the most interesting itemsets.
Although a very low minimum could eliminate the above issue, generating all possible permutations of complications for all transactions is not an optimal option. This is because Tables 3 and 4 contain many different antecedents and consequents, which increase the database size exponentially with the number of items. It also generates a large number of uninteresting distances among many small rules despite the previously chosen optimal minimum threshold for support and confidence. In this situation, neither clustering nor the ARM methodology performs effectively, and the problem can be even worse in a sparse dataset (such as T2DM). In conclusion, to make a better decision, we need to reduce uninteresting rules at another level, which is addressed in the next section.

Interesting itemsets in complication rules using minimal coverage itemsets algorithm
Thus far, metrics such as support, confidence, and lift were used to identify the most interesting rules. However, we argued that many uninteresting/uninformative rules might still remain, which would be challenging to interpret due to the complex nature of the associated complications. To overcome this, we intend to discover the minimum coverage of rules by using MCI, which is motivated by a variation on the methodology proposed by Liu et al to enhance k-means clustering in Reference 38. The identified sequence of complications is mined to extract the useful rules and detect an appropriate ordering of the complications as a minimum coverage set of itemsets. As can be seen on the left-hand side of Figure 2, temporal patterns of the complication co-occurrences are retrieved from DS. The database is mined to include the temporal relationships among the multiple complications in their associated rules. We used TARs on the temporal co-occurrence pattern of the complications to obtain 87 rules. Then, MCI analyzes the subrules (antecedents and consequents) as input and produces the minimum coverage itemsets (41 objects found) as output in Table 5. A minimum number of aggregated subrules is produced based on their uniqueness/intersection while covering the most frequent/interesting rules. Once all of the objects are identified and mapped to the rules in Tables 3 and 4, we refer to database R to find the objects related to the relevant associated rules. By choosing the objects instead of the rules, a minimum overlap among the data points is produced, which cannot be achieved using lift alone. Thus, the distance among the objects represents higher quality data points with less repetition of unimportant rules as the clustering input. In addition, MCI helps in achieving the optimal number of meaningful subgroups in the clustering method.
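The idea of replacing repeated subrules with a smaller set of unique objects can be illustrated with a simplified sketch. This is not the MCI algorithm itself, whose full steps are not reproduced here: it only deduplicates antecedents and consequents into objects and maps each rule to a pair of object identifiers. The rule list and the names `build_objects` and `map_rules` are hypothetical.

```python
def build_objects(rules):
    """Assign an integer id to each distinct subrule (antecedent or consequent)."""
    objects = {}
    for antecedent, consequent in rules:
        for subrule in (frozenset(antecedent), frozenset(consequent)):
            # setdefault keeps the first id given to a subrule seen before.
            objects.setdefault(subrule, len(objects) + 1)
    return objects


def map_rules(rules, objects):
    """Rewrite each rule as (antecedent object id, consequent object id)."""
    return [(objects[frozenset(a)], objects[frozenset(c)]) for a, c in rules]


# Hypothetical temporal rules: (antecedent, consequent).
rules = [
    ({"HYP"}, {"LIV"}),
    ({"HYP", "LIV"}, {"NEU"}),
    ({"LIV"}, {"NEU"}),
    ({"HYP"}, {"NEU"}),
]
objects = build_objects(rules)
mapped = map_rules(rules, objects)
print(len(objects), mapped)
```

In this toy case, four rules contain eight subrules but only four distinct objects, which mirrors in miniature the reduction from 87 × 2 subrules to 41 objects reported in the text.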

Combined methodology of ARM and clustering
In this article, a hybrid methodology of TAR mining and clustering attempts to validate and give meaning to the H clusters. We also proposed the MCI algorithm to find a minimum rule set as the most interesting itemsets from the temporal complications within T2DM. Furthermore, the meaningful rules obtained after applying MCI, based on the aggregation of only the most frequent and important antecedents and consequents, are listed in Table 5. The problem of discovering frequent itemsets (ARM) differs from the similarity search in the clustering method. Instead of using all rules as the clustering input, we use only the significant itemsets (objects) in the hierarchical clustering method. The clustering method allocates objects, as itemsets, in such a way that objects in the same subgroup resemble each other more than objects in other subgroups, based upon the Jaccard index. The Jaccard index between two itemsets (objects) I_i and I_j is the number of items shared by I_i and I_j divided by the number of unique items in both itemsets; the Jaccard distance is one minus this index. For a set of m itemsets, there are m(m − 1)∕2 distances overall that can be used to cluster the objects and, further, the patient subgroups. Therefore, clustering tries to find objects that have a significant fraction of their associated pattern of complications in common; the absolute number of those objects is not of interest. Thus, patients are assigned to a cluster if their patterns of complications match the most frequent object/itemsets in the corresponding cluster. In other words, patients that have been diagnosed with a similar pattern of complications occurring over time (the corresponding frequent itemsets) are gathered in one cluster. In the next part, we attempt to measure the distance among the objects. The proposed MCI procedure to discover the most interesting itemsets (which we call objects/clustering data points) is illustrated below and shown in Figure 2.

Input: R in Tables 3 and 4, considering minimum support and minimum confidence thresholds
Output: Interesting itemsets (objects) in Table 5
...
If lift(R_l) ≥ max lift(R)
...

Jaccard index and TAR clusters
To handle a large number of rules, we grouped the rules using agglomerative hierarchical clustering. 39 The combined use of unsupervised learning is motivated by the research of Hahsler et al in Reference 39, which introduced a distance-based clustering of ARs. However, we adopted a different method for a more in-depth analysis of the correlation between rules to find dissimilarities (distances). In the clustering literature, frequent rule sets, as a fundamental concept of TARs, have enhanced the overall clustering methodology. 38 Agglomerative hierarchical clustering is employed to group the associated rules into more informative rules, the so-called itemsets. Accordingly, the Jaccard index is applied to create distances between itemsets. Comparisons between two patients from two different clusters are made using unrelated rules on their associated complications. Table 6 lists the elements of the clusters, which are represented as objects. Finally, patients are allocated to a cluster when the object matching their rules belongs to the itemsets within the corresponding cluster. We cluster patients based on their TAR clusters (C_TAR), where each cluster shares a similar complication sequence (co-occurrence pattern of complications). To compare two different sequences of complications (i and j) in the hierarchical clustering of the itemsets I_i and I_j, we use the Jaccard index (Jaccard(I_i, I_j)) and the Jaccard distance (d_i,j) in Equation (6).
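As a worked illustration of the Jaccard index and distance between itemsets, the sketch below uses the standard set-based definitions, Jaccard(I_i, I_j) = |I_i ∩ I_j| / |I_i ∪ I_j| and d_i,j = 1 − Jaccard(I_i, I_j), and builds the m(m − 1)/2 pairwise distances for a small list of objects. The example objects are hypothetical, and treating two empty itemsets as identical is an assumption of this sketch.

```python
from itertools import combinations


def jaccard_index(a, b):
    """Shared items divided by all unique items across both itemsets."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption: two empty itemsets are treated as identical.
    return len(a & b) / len(union) if union else 1.0


def jaccard_distance(a, b):
    """Dissimilarity used as the clustering distance."""
    return 1.0 - jaccard_index(a, b)


def pairwise_distances(itemsets):
    """All m(m - 1)/2 distances, keyed by the pair of itemset positions."""
    return {
        (i, j): jaccard_distance(itemsets[i], itemsets[j])
        for i, j in combinations(range(len(itemsets)), 2)
    }


# Hypothetical objects (itemsets of complications).
example_objects = [
    {"HYP", "LIV"},
    {"HYP", "NEU"},
    {"RET", "HYP"},
    {"NEP"},
]
distances = pairwise_distances(example_objects)
print(distances)
```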

Clustering comparison and validation strategies
We intend to ascertain the usefulness and trustworthiness of the TAR clusters in understanding the underlying disease as well as in being a reliable source to validate the latent phenotype. Internal validation is applied to assess the validity of C_TAR through the use of the information contained within the given database of complication rules. In order to remove uninformative and rare rules from the database, the most infrequent itemsets are ignored. Then the dissimilarities (distances) among TAR clusters are used to filter the discovered rules down to the meaningful ones. For example, rules with a high lift and confidence score are selected. Thus, the number of TARs is reduced to a manageable number while concentrating on the most interesting rules. For external validation, the H clusters are assessed against another data source (the TAR clusters). Jaccard similarity is applied to calculate the proportion of overlapping patients for each pair of latent phenotype and TAR clusters. Although the Jaccard similarity seems useful for measuring the overlap between two clusters, the resulting value does not indicate the likelihood of the observed overlap. As a result, the normal approximation for the binomial approximation of the hypergeometric distribution (NBH) metric 40 is utilized to evaluate the probability of observing an overlap between each pair of clusters from C_H and C_TAR. A low value (probability) indicates that a given overlap is highly unlikely to occur by random chance. For a given C_i^TAR of size s_i (where i indicates the cluster number) compared to a C_j^H of size k_j (where j indicates the cluster number), the probability of the overlap occurring randomly can be modeled using a binomial distribution, as shown in Equation (7). 40 Pr(observing x from group j) =

where n is the number of patients in the union of all of the C_i^TAR and all of the C_j^H. If both n and npq are large, the binomial distribution can be approximated by a normal distribution. For example, a very low NBH probability indicates a considerable/significant overlap between two clusters from different data sources. We illustrate the findings and give more explanation of the NBH probability in the following section.
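A rough sketch of how such an overlap probability could be computed is given below. It rests on several assumptions that are not confirmed by the text: that the success probability is p = k_j / n with q = 1 − p, that the number of trials is s_i, that the quantity of interest is the upper tail (observing at least x shared patients by chance), and that a continuity correction of 0.5 is applied. The function names are hypothetical.

```python
import math


def normal_upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def overlap_tail_probability(x, s_i, k_j, n):
    """Normal approximation to a binomial upper tail for a cluster overlap.

    x   : observed number of patients shared by the two clusters
    s_i : size of the TAR cluster (assumed number of trials)
    k_j : size of the H cluster
    n   : number of patients in the union of all clusters
    """
    p = k_j / n  # assumption: chance that a random patient falls in the H cluster
    q = 1.0 - p
    mean = s_i * p
    sd = math.sqrt(s_i * p * q)
    # Continuity correction for approximating a discrete count.
    z = (x - 0.5 - mean) / sd
    return normal_upper_tail(z)


# Hypothetical sizes: an overlap of 30 patients is far above the 10 expected by chance.
print(overlap_tail_probability(30, s_i=40, k_j=50, n=200))
```

Under these assumptions, a smaller returned value means the observed overlap is less plausible as a chance event, which matches how low NBH probabilities are interpreted in the text.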

EXPERIMENTAL RESULTS
In this section, we validate the TAR clusters and compare them with the latent phenotype to understand whether the latent phenotype reduces some of the uncertainty caused by the complex relationships among the temporal complications. In Table 5, each of the most frequent and interesting itemsets (ordering patterns of complications) is identified by an object. In order to quantify a distance between two heterogeneous rules, one solution could be to cluster rules based on their features (support, confidence, and lift). However, these measures can only capture the interaction of rules with the data, and each characterizes only a single rule. Thus, a more in-depth analysis of the correlation/causation between rules is possible when we find dissimilarities among the itemsets. Agglomerative hierarchical clustering is employed in order to build homogeneous groups of objects.
ARs are grouped according to the descriptors (itemsets or objects), as shown in Table 6; they are not grouped according to their coverage, as explained in the MCI algorithm. Patients within DS that have been diagnosed with a similar occurring pattern of complications (the corresponding frequent itemsets) are gathered in one cluster. The distances among the frequent itemsets are aggregated for two patients within a cluster by using the Jaccard distance, which is applied to the group of objects associated with the corresponding pattern.
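The grouping of objects by agglomerative hierarchical clustering on Jaccard distances can be sketched as follows. The linkage criterion is not specified in this section, so average linkage is an assumption here, and the example objects are hypothetical.

```python
def jaccard_distance(a, b):
    """One minus the share of common items among all unique items."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerative(itemsets, k):
    """Merge the closest clusters (average linkage, assumed) until k remain."""
    clusters = [[i] for i in range(len(itemsets))]

    def linkage(c1, c2):
        # Average Jaccard distance over all cross-cluster pairs.
        total = sum(
            jaccard_distance(itemsets[i], itemsets[j]) for i in c1 for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Find the pair of clusters with the smallest linkage distance.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical objects; indices in the result refer to positions in this list.
toy_objects = [
    {"HYP", "LIV"},
    {"HYP", "LIV", "NEU"},
    {"RET"},
    {"RET", "NEP"},
]
groups = agglomerative(toy_objects, k=2)
print(groups)
```

A patient could then be assigned to the group containing the object that matches their own complication pattern, in the spirit of the assignment described above.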

Discovered clusters
We obtained the initial five TAR clusters, C_TAR = {C_1^TAR, C_2^TAR, C_3^TAR, C_4^TAR, C_5^TAR}, according to the dissimilarities between associated rules (itemsets), using the Jaccard dissimilarity.

FIGURE 3 The proposed complication pattern mining methodology by using ARM and MCI to obtain the interesting itemsets as clustering objects

The optimal number of clusters, here five, is established and validated by using the elbow method. 41 T2DM patients are grouped based upon C_TAR. If two rules do not share patients, we assume that they are not in the same cluster. In Figure 3, there were four T2DM patient clusters for the discovered hidden variable H, C_H = {C_1^H, C_2^H, C_3^H, C_4^H}, which were obtained using dissimilarity (1 − correlation). Each one had a unique deep temporal phenotype (latent phenotype) and risk factor profile. In Figure 3, in the right-hand column (the most frequent ordering pattern of the complications), a > symbol between two complications indicates that the complication on the left-hand side of the symbol occurred before the one on the right-hand side, with a higher occurrence rate.

Clustering comparison and validation findings
In this section, the latent phenotype clusters are compared with the TAR clusters by applying a number of comparison and validation strategies to the identified clusters. These strategies assess the similarities among subgroups of patients that are clustered based upon different data sources. The comparison also aims to ensure a more appropriate decision for discovering the most meaningful subgroup of patients as well as explaining the behavior of the latent phenotype. For example, the intersection of C_4^H and C_3^TAR (the right-hand column in Table 7) revealed that a significant number of patients (with an overlap of >50%) shared a similar complication co-occurrence pattern. C_4^H, with the complication pattern {HYP, NEU, RET, LIV}, and C_3^TAR, with the occurrence order {RET, HYP}, NEU, LIV, also coincided. The intersection of C_3^TAR and C_4^H showed that they greatly resembled each other, and it revealed an important link between the two clustering methods. Overall, we believed that there was a strong link between C_1^H and C_1^TAR, where both clusters shared a similar complication co-occurrence pattern of {HYP, LIV, NEU}. In order to ascertain that the overlap was not random, we used the NBH metric, as illustrated in Table 7. Patients in C_4^H ∩ C_3^TAR were likely to develop {RET, HYP}, NEU, and LIV (see Table 8), revealing a significant as well as meaningful relationship between those two clusters (C_4^H and C_3^TAR). Moreover, a C_i^TAR pattern, for example, {RET, HYP}, {NEU}, {LIV}, revealed that {RET, HYP} was more likely to be seen than NEU, that NEU was more likely to be developed than LIV, and that the rest of the complications were not likely to be developed in patients within the corresponding cluster C_3^TAR (as shown in Table 8). In particular, we checked our hypothesis that C_2^H resembled C_4^TAR. As can be seen in Table 8, for the patients within C_4^TAR, the chances of having RET, HYP, and NEP were approximately 67%, 50%, and 33%, respectively. Similarly, the chance of having RET, HYP, and NEP as a consequence for patients in C_2^H was high (see evidence in Table 8). Additionally, as shown in Table 7, C_2^H ∩ C_4^TAR, with the lowest NBH probability of <7.9E-90 and the second highest overlap of patients of 25%, revealed a significant and meaningful relationship between those two clusters (C_2^H and C_4^TAR). In this article, the dissimilarities (distances) between clusters are analyzed as the interestingness measure to filter the discovered rules, which was optimized after effectively filtering out uninteresting rules. These results will help a domain expert to choose interesting patterns from the remaining small set of rules. For instance, itemsets consisting of similar items are uninteresting, whereas frequent itemsets with different items are interesting. Figure 4 represents a dendrogram of the TAR clusters based upon the objects.

FIGURE 4 Hierarchical clustering for objects (items in association rules), using the Jaccard distance as dissimilarity. The x-axis and y-axis illustrate the Jaccard distance among objects and the object IDs obtained in Table 5, respectively

The meaningful subgroup of the personalized patients
In this section, we attempted to investigate how the similarities between the C_i^TAR and C_j^H could validate and give meaning to the latent phenotype. Figure 3 showed that patients in C_1^H, with a decreasing and an increasing pattern in their deep temporal phenotype, shared similar trajectories over the observed risk factor profiles. Almost 90% of the patients within C_1^H were found in C_1^TAR. More importantly, this was validated from a statistical point of view, as the likelihood of randomly observing this overlap was very low, with an NBH probability of <0.001, as shown in Table 7. Thus, there was sufficient evidence to suggest that nearly all of these patients belonged to a similar TAR cluster (C_1^TAR). It also appeared that the most frequent ordering pattern of complications belonging to C_1^H (HYP, LIV, and NEU) matched the pattern {HYP, LIV, NEU} belonging to C_1^TAR. The fact that patients within C_1^H and C_1^TAR were selected from two different data sources not only statistically validated our clusters but also revealed the meaningfulness of the latent phenotype. Therefore, patients in the intersection of C_i^TAR and C_j^H (C_i^TAR ∩ C_j^H) with the highest similarities among other clusters might represent a link between their latent phenotype and the temporal associated complications.
The most significant intersection of the TAR and latent phenotype clusters (C_1^TAR ∩ C_1^H) was considered the most informative (meaningful) subgroup and is denoted DS1 (see Figure 1). We are interested in predicting the complications and in personalizing patients based on their latent phenotype as well as the underlying pattern of complications. The latent variable is discovered based on the whole set of features (using the IC* Stepwise approach in the DBN framework). We trained on the data, including all risk factors and complications, to compare two datasets: the original dataset (DS) and the meaningful subgroup (DS1). In the next section, the prediction results are presented.

TABLE 9 The prediction accuracy of a target complication (MAP), posterior likelihood level (clinical level), patients' group (dataset), evidence (E), P(MAP|E), P(E), and P(MAP,E) are compared between DS and DS1

EVALUATING THE PREDICTION PERFORMANCE
The evaluation strategy in this section argues that uncertainties in the cause and effect relationships among T2DM data can affect the prediction performance negatively. It also suggests that DS1 (obtained by personalizing patients) can be considered a dataset with less uncertainty compared to DS. This section does not concentrate only on the descriptive study. By utilizing a predictive strategy (as a contribution of this article), the underlying patterns of complications were predicted for each of the patients within DS1 (who were discovered in the descriptive strategy of the proposed hybrid methodology). These results were then compared to the prediction performance for the whole group of patients (which also includes DS1). This comparison attempted to add explainability to the state-of-the-art method in order to uncover the meaning behind the latent AI model and gain insight into opening the black box. Thus, the prediction results were analyzed to investigate the difference between DS1 and DS in terms of how accurately the hybrid complications were predicted in the personalized dataset (DS1) compared to the raw dataset DS. Table 9 shows the prediction accuracy of the hybrid complications, compared between DS and DS1, where an optimal posterior likelihood of a high or low clinical level was the question of interest.

Improvement in the overall prediction accuracy
The prediction accuracy for each target complication was assessed for both DS and DS1 in Table 9. For example, the prediction accuracy for DS1 patients being diagnosed with HYP was 1, while for LIV and NEU it was 0.88 and 0.81, respectively. Additionally, as shown in Table 9, the overall prediction accuracy across all complications for DS1 was 0.88, compared to a lower overall accuracy of 0.81 for DS. Similarly, the prediction accuracy for individual complications in DS was significantly lower than in DS1: 0.90, 0.77, and 0.76 for HYP, LIV, and NEU, respectively. These results indicated that by applying the proposed methodology and discovering the meaningful subgroup, the prediction accuracy was increased for each complication within the most frequent ordering pattern of complications belonging to DS1. Accordingly, the overall prediction accuracy across all complications with a different pattern was improved significantly.

Optimal posterior likelihood
Predicting a target complication and deciding whether a diagnostic test result was positive or negative were challenging. One possible solution is to compute the expected utility as a likelihood of each decision alternative. The clinical decision alternative with the highest expected gain is the optimal option, which was chosen by the clinicians. Thus, an approach was utilized to approximate the posterior likelihood of developing complications when optimizing the Bayesian parameters. An integration of maximum entropy and a Bayesian optimization methodology was applied to the parameters. For this purpose, the posterior likelihood of developing complications was approximated by using the "Maximum A posteriori Probability" (MAP) algorithm, 42,43 which converged toward the set of parameters. In the proposed model with latent variables, MAP was utilized as an iterative strategy to discover the maximum a posteriori estimate of the parameters. Then, an optimization procedure, the simulated annealing algorithm, 44 was applied to produce optimal posterior results along with the evidence. The simulated annealing algorithm was combined with a stochastic simulation of the hidden Markov chain, which relied on data augmentation in the same way as the EM algorithm.
The causal relationships are explained in terms of static and dynamic correlations between T2DM risk factors (attributes) to describe the inference problem. Causal inference focuses more on distinguishing causes from other associations than on uncovering detailed temporal relationships. It also supports a hybrid approach that yields useful information for inference in probabilistic graphical models (Bayesian networks), which aim to distinguish and understand different categories while discovering causes. In this section, prediction is based on prior knowledge as well as the current stage of the risk factors and complications. In Table 9, the optimal posterior likelihood of developing RET, NEU, LIV, and SMK is compared between DS1 and DS given a prior/evidence (already developed complications such as HYP and LIV). Following the findings in Table 9, the cause-and-effect relationships were investigated in influence diagrams (Figures 5 and 6), which show the Bayesian structures for DS and DS1, respectively. In these figures, the class values for HYP and LIV are set to their highest/lowest clinical level, as the evidence, to observe changes in the clinical level of a targeted complication in the Bayesian structure. The posterior likelihoods of being at high risk of LIV and NEU should agree with the Bayesian structures shown. For example, once patients in DS1 have been diagnosed with HYP and LIV, the likelihood of developing NEU increases to 0.84. In contrast, for patients in DS who have already developed HYP and LIV, the optimal level for NEU was low, with a posterior likelihood of 0.57.
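Setting evidence and reading off the posterior of a target complication can be illustrated with inference by enumeration on a very small network. The sketch below assumes a three-node structure in which LIV and a hidden binary variable H are both parents of NEU; the structure and every probability in it are hypothetical values chosen for illustration and are not taken from Table 9 or from Figures 5 and 6.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative values only).
# H is an unobserved (hidden) binary variable.
P_H = {1: 0.25, 0: 0.75}            # P(H)
P_LIV = {1: 0.4, 0: 0.6}            # P(LIV)
P_NEU_GIVEN = {                     # P(NEU = 1 | LIV, H)
    (1, 1): 0.9, (1, 0): 0.7,
    (0, 1): 0.5, (0, 0): 0.2,
}

def joint(liv, h, neu):
    """P(LIV, H, NEU) from the factorization P(LIV) P(H) P(NEU | LIV, H)."""
    p_neu = P_NEU_GIVEN[(liv, h)]
    return P_LIV[liv] * P_H[h] * (p_neu if neu == 1 else 1.0 - p_neu)

def posterior_neu(liv_evidence):
    """P(NEU = 1 | LIV = liv_evidence), summing out the hidden variable H."""
    numerator = sum(joint(liv_evidence, h, 1) for h in (0, 1))
    evidence = sum(joint(liv_evidence, h, neu)
                   for h, neu in product((0, 1), repeat=2))
    return numerator / evidence

high_liv = posterior_neu(1)   # evidence: LIV at its high level
low_liv = posterior_neu(0)    # evidence: LIV at its low level
```

With these made-up tables, `posterior_neu(1)` is about 0.75 and `posterior_neu(0)` is about 0.275, so raising the evidence on LIV raises the posterior of NEU. Dedicated tools perform the same computation on much larger structures.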
FIGURE 6 An influence diagram representing the Bayesian structure applied to the subgroup of patients in DS1
As can be seen in Table 9, when the HYP and LIV class values were set to their high clinical level, the probability of developing NEU, P(NEU|{HYP, LIV}), was 0.83, while the likelihood of not developing RET, P(RET|{HYP, LIV}), was 0.96. Thus, the evidence showed that DS1 patients with HYP and LIV were at high risk of being diagnosed with NEU and had a high probability of not being diagnosed with RET. Again, in Table 9, in DS1, when the posterior likelihood of LIV rose above 88%, the growth of damaged eye cells in developing RET decreased to 96%. In DS1, RET was negatively affected by the occurrence of LIV, shown by a thick red arrow in Figure 6, whereas no such influence arrow appears for DS in Figure 5. Similarly, NEP in DS1 seemed less likely to develop once HYP, LIV, RET, and NEU had occurred, with an optimal posterior likelihood of 0.86 at the low clinical level. However, for DS1, the posterior with the same evidence was 0.76 at a high clinical level (see Table 9). The influence of HYP (with a diagnosis likelihood of 1) on the remaining complications is neglected because it is a macrovascular complication that commonly develops in T2DM data, so the likelihood is the same with or without adding it to the evidence: P(NEU|LIV) = P(NEU|{HYP, LIV}). This observation is supported by the thicker red arrow pointing from LIV to RET in DS1 and the absence of an arc in DS, as shown in Figures 5 and 6.
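The equality P(NEU|LIV) = P(NEU|{HYP, LIV}) stated above follows from elementary probability whenever HYP is diagnosed with probability 1. A short derivation, assuming P(HYP) = 1 and P(LIV) > 0:

```latex
\begin{align*}
P(\mathrm{HYP}) = 1 \;&\Rightarrow\; P(A, \mathrm{HYP}) = P(A) \quad \text{for any event } A,\\
P(\mathrm{NEU} \mid \mathrm{HYP}, \mathrm{LIV})
  &= \frac{P(\mathrm{NEU}, \mathrm{HYP}, \mathrm{LIV})}{P(\mathrm{HYP}, \mathrm{LIV})}
   = \frac{P(\mathrm{NEU}, \mathrm{LIV})}{P(\mathrm{LIV})}
   = P(\mathrm{NEU} \mid \mathrm{LIV}).
\end{align*}
```

Evidence that is certain therefore carries no additional information about the target, which is why adding HYP to the evidence leaves the posterior unchanged.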
In Figure 5, a thick purple edge from NEP to SMK illustrates that the development of NEP causes SMK. Additionally, in Figure 6, positive causation is represented by a green edge from NEP to SMK. Comparing the P(SMK|NEP) values for DS and DS1 in Table 9, these findings suggest that once a patient has been diagnosed with NEP, the probability of being a smoker increases from 0.33 to 1.00.

DISCUSSION
Failure to predict the onset of associated diseases/complications can negatively affect a patient's health in many ways. Complications can be numerous and can interact in complex nonlinear ways throughout the disease process. Patients must switch to another medication if more complications develop, for example, when a patient uses a treatment that is not suitable for another complication. This leads to unsuccessful treatment, where clinicians are pushed to follow an unreliable and suboptimal approach in prescribing treatment options. In this situation, the medicine prescribed to help a patient with a particular complication might lead to patient dissatisfaction and more severe health outcomes. On the other hand, T2DM is potentially reversible, treatable, and manageable if caught early enough. Early diagnosis and management of the disease reduce the risk of complication development. 45 State-of-the-art modeling techniques for analyzing T2DM progression focus on either descriptive or predictive strategies. In contrast, the present research personalises patients for precise prediction using both descriptive methodology and predictive analysis. For this purpose, we developed a new methodology based on a framework that combines notions of causality in medicine with algorithmic approaches built on a Bayesian model, as well as statistical techniques for analyzing causal relationships. Additionally, greater insight into the discovered subgroups, together with a prior understanding of the interesting complication rules, helps in interpreting the predictive results correctly. Therefore, the discovered hidden variable/latent phenotype can be combined with the meaningfully associated complication rules for optimal performance of patient personalization.
TABLE 11 A subset of database R ($r_2$) of the associated rules
Despite the importance of predicting an expected complication at a given time, a patient model that simultaneously takes into account the chance of occurrence of other associated complications can be more precise. To investigate whether a particular patient is at high risk of developing a target complication, we need to analyze multiple factors. These may depend on the patient's clinical history, the stage of the disease, and fluctuations in the related risk factors. More importantly, the risk can be affected by the associations of prior complications with expected complications (likely to be diagnosed and yet to be developed). In T2DM data, the worsening of microvascular diseases and HYP is known as a significant cause of death. 46 Even though microvascular complications such as RET, NEP, and NEU are less frequent than HYP, inadequate estimation of them causes long-term suffering and life-threatening comorbidities. 7 Fowler et al 9 studied American patients with type 2 diabetes. Their research used key T2DM risk factors such as HbA1c, SBP, and DBP to investigate relationships among complications such as HYP, NEP, RET, and NEU. In addition, LIV is a severe phenotype of diabetes and is associated with T2DM complications, especially NEU. 47 They also revealed that HDL has a negative effect on HYP, NEP, NEU, and RET, whereas HbA1c is negatively associated with HYP. A study conducted by Ramachandran et al 49 reported a high prevalence of NEU and RET in type 2 diabetes in India. Similarly, research in Reference 50 suggested that most diabetic patients have objective evidence of some variety of NEU, but only a few of them have been identified by symptoms. This research also showed that there is a strong association among NEP, NEU, and RET.
Altogether, it is worth remembering that understanding the underlying pattern of complications relies on the correlation and causation of their co-occurrences (both positive and negative). Two cases illustrate what we mean. In the first case, an occurring complication is caused or followed by other complications. In the second case, a combination of two complications is less likely to be followed or caused by another one; that is, the occurrence of some complications may negatively affect the occurrence of another complication. We provide one case study to clarify the contribution of this article. We conclude that if the levels of HYP and LIV in the patient population rise, the risk of developing RET decreases while the chance of developing NEU increases (based upon "causality backwards"). Moreover, since DS appears to be more complicated than DS1, there could be unmeasured/hidden risk factors that affect both LIV levels and the likelihood of having RET in DS. In this situation, with a considerable amount of uncertainty, one could argue that RET is caused by other underlying risk factors (latent phenotype), such as exercise, genes, and diet. Thus, we attempt to open the clinical black box model by using an appropriate methodology to discover correlation and causation among temporal risk factors and complications in the presence of hidden factors. We used DBNs, which allow the description of the time between each cause and effect and the likelihood of this relationship being discovered. We obtained this causal phenotype with the associated probabilities by calculating the joint impact a cause has on its effect and then identifying statistically significant causes through multiple hypothesis testing (treating each causal relationship as a hypothesis) and false discovery control.
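Treating each candidate causal relationship as a hypothesis and controlling false discoveries can be sketched as follows. The text does not name the specific procedure used, so the Benjamini-Hochberg step-up rule below is an assumed, commonly used choice; the edge labels and p-values are hypothetical and serve only to show the mechanics.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of hypotheses rejected at false discovery rate q.

    Benjamini-Hochberg step-up rule: sort the p-values, find the largest
    rank k with p_(k) <= k * q / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Hypothetical p-values, one per candidate causal edge (illustration only).
edges = ["LIV->RET", "LIV->NEU", "HYP->NEP", "NEP->SMK", "RET->NEU"]
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]

significant = [edges[i] for i in benjamini_hochberg(p_values, q=0.05)]
```

With these made-up values only the first two edges survive the correction, although four of the five p-values fall below an uncorrected 0.05 threshold.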
Given the investigation above, it seems reasonable to assume that the ordering of the complications' co-occurrence and their temporal transactions produces informative knowledge for interpreting the patient model. This is not the only evidence supporting this study's claim: there is also evidence that informative patterns of complications significantly improve the prediction performance for the personalised subgroup compared with the original dataset.

CONCLUSIONS
Discovering a latent phenotype by identifying the underlying sequence of temporally associated complications in order to explain an AI black box model is notably absent from the scholarly literature. This becomes even more problematic when a significant improvement in the predictive model is vital. Our main contribution in this article addresses the challenge of constructing meaningful explanations of patients' subgroups for precise prediction by uncovering the hidden factors. Because the constraints and the latent phenotype are difficult to explain, we proposed a combination of data mining techniques while exploring knowledge for discovering cause and effect. To explain the black-box model and hidden variables, we explored a well-known group of patients. We applied TARs to the data, followed by the MCI algorithm, which filters out the most interesting itemsets based only on the underlying patterns of complications. The discovered interesting patterns were then used as the input to the descriptive methods. In turn, the resulting subgroups of patients from the descriptive study became a new dataset for analyzing the predictive model. We then used the predictive model to capture the behavior of the latent variable, and, using descriptive data mining techniques such as unsupervised learning, patients were allocated to four clusters based only on their latent phenotype. In this work, a combined data mining methodology was adopted to help understand and validate the latent phenotype in order to find a meaningful subgroup of patients. It is also intended to assist clinicians in the decision-making process by supporting the early and precise diagnosis of complications.
Existing approaches in pattern discovery from time series clinical data have not yet exploited the representational power of the integrated data mining techniques such as hidden variable discovery, TAR mining, time series clustering, patient personalization, and enhanced prediction methodology.
To sum up, in this research, we addressed three goals. First, we used rich clinical data to provide a fine temporal phenotype of associations. Second, we illustrated cluster analysis of time series data with an underlying causal structure in T2DM. Third, considering the hybrid complications as a class in the classification/prediction problem, we addressed the class imbalance issue. Our promising experimental results showed that patient personalization using the proposed integrated data mining techniques could provide better prediction accuracy and interpretability in discovering the temporal associated complication rules and understanding the latent phenotype. More importantly, these findings revealed that the proposed hybrid techniques could handle uncertainty in the clinical decision-making process. They also helped clinicians prepare a future prognosis of the most likely occurring complications.
Nevertheless, several questions remain to be answered, as we have only begun to open black box AI models. In future work, we aim to provide more interpretation of the results from the clinician's point of view. The generalizability of these results is limited to the T2DM dataset. Thus, we intend to apply the proposed methodology to a new dataset with more risk factors and patient visits, with the aim of understanding the black box latent DBN model. This new understanding should help to improve predictions of the impact of the latent phenotype on associated complications. Several further directions are also of interest. For instance, causal confidence and support could be combined with other metrics in order to uncover these types of uncertainties. We will also consider employing Fisher's p-value, which is ranked as the most robust measure, to ensure that the interesting itemsets act as an informative input to the predictive model.

Implementation tools
We perform AR mining using the R extension package "arules". 39 The original (imbalanced) dataset is used to find patterns in the development of different complications across patients' visits. Additionally, we use the R extension package "arulesViz" and "Gephi" for visualization, to explore the ARs clearly. The visualization techniques are used to examine a considerable number of rules, allowing interesting information to be discovered from the transaction data. Finally, "GeNIe" is used to infer the BN, to draw the influence diagrams, and to apply the diagnosis test and sensitivity analysis.

TARs
The support of an itemset X, $\mathrm{supp}(X)$, is defined as the proportion of transactions in the dataset that contain X. In particular, an AR $X \Rightarrow Y$ has a support of $P(XY)$. The confidence of a rule, $\mathrm{conf}(X \Rightarrow Y) = P(XY)/P(X)$, is the proportion of transactions containing X that also contain Y, and it identifies the most interesting or important relationships. Equivalently, in terms of support, $\mathrm{conf}(X \Rightarrow Y) = \mathrm{supp}(X \cup Y)/\mathrm{supp}(X)$, which must satisfy Equation (8).
The parameters of this constraint are the minimum support and minimum confidence thresholds. Rather than accuracy, efficiency is an appropriate way to evaluate ARs. 34 To obtain the frequent itemsets, we first filter TARs using support and confidence. However, these measures cannot filter complication rules based on the different dependencies among the rules. For this purpose, we use lift, a measure of the independence of X and Y, defined as $\mathrm{lift}(X \Rightarrow Y) = P(XY)/(P(X)P(Y))$. A lift of 1 indicates that the two itemsets X and Y are independent, as shown in Equation (9).
Lift is the deviation of the support of the whole rule from the support expected under independence, given the support of both sides of the rule. Higher lift values indicate stronger associations. For instance, the conditional probability of a patient developing both HYP and LIV is associated with the likelihood of the patient developing RET. For example, the confidence of $\{HYP, LIV\} \Rightarrow RET$ is the likelihood of the patient developing HYP, LIV, and also RET divided by the likelihood of developing HYP and LIV, that is, $\mathrm{conf}(\{HYP, LIV\} \Rightarrow RET) = P(HYP, LIV, RET)/P(HYP, LIV)$ (see Equation (10)). The confidence measure of {RET, HYP, NEU, RET} implying LIV reveals how likely it is that a given patient who developed {RET, HYP}, NEU, and RET also developed LIV.
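The three measures defined in this section can be computed directly from a list of transactions. The sketch below is written in Python for illustration (the analysis in this work uses the R package "arules"); the transactions are hypothetical, each standing for the set of complications recorded for one patient, and the temporal ordering of the rules is ignored.

```python
def support(transactions, itemset):
    """Proportion of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(transactions, lhs, rhs):
    """conf(lhs => rhs) = supp(lhs U rhs) / supp(lhs); assumes supp(lhs) > 0."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)

def lift(transactions, lhs, rhs):
    """lift(lhs => rhs) = conf(lhs => rhs) / supp(rhs); 1 means independence."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)

# Hypothetical transactions (illustration only).
transactions = [
    {"HYP", "LIV", "RET"},
    {"HYP", "LIV"},
    {"HYP", "LIV", "RET", "NEU"},
    {"HYP"},
    {"HYP", "LIV", "NEU"},
    {"RET"},
    {"HYP", "NEU"},
    {"HYP", "LIV", "RET"},
]

rule_support = support(transactions, {"HYP", "LIV", "RET"})
rule_confidence = confidence(transactions, {"HYP", "LIV"}, {"RET"})
rule_lift = lift(transactions, {"HYP", "LIV"}, {"RET"})
```

For these eight made-up transactions the rule {HYP, LIV} ⇒ RET has support 3/8, confidence 3/5, and lift 1.2, so RET occurs somewhat more often among patients with HYP and LIV than in the toy dataset overall.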

MCI
The MCI step ascertains whether patients in each cluster develop a similar pattern of complications (the most frequent itemsets that are also unique to the corresponding cluster) and a different pattern from patients in other clusters. For instance, if HYP happens before LIV and RET, and NEU, NEP, or no complication occurs after them, there is a co-occurrence pattern of {HYP, {LIV, RET}, {NEU|NEP}}. For example, we select two subsets of rules with maximum lift and reasonable support and lift (meeting the constraints), as shown in Tables 10 and 11. The union of the objects in these subsets covers the most items in D. We need to find out whether the rule sets cover the optimal/minimal number of associated objects. There is an ideal itemset, MCI, given by the intersection of $r_1$ and $r_2$, which is defined as $\mathrm{MCI} = \{R_{10}, R_{11}, R_{42}, R_{60}, R_{62}\}$ (as illustrated in Table 12). These itemsets are generated from the intersection of objects in MCI, representing a unique/minimum coverage set of items in D, and are illustrated in Figure 2.
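The intersection and coverage check described above can be sketched with plain set operations. Everything in the example is hypothetical except the five rule identifiers of the resulting MCI set: the contents of each rule, the extra rules R7 and R15, and the membership of the two subsets are invented for illustration, since the actual rules are those listed in Tables 10 to 12.

```python
# Hypothetical rule contents and subset membership (illustration only).
rules = {
    "R7":  {"HYP", "NEU"},
    "R10": {"HYP", "LIV"},
    "R11": {"HYP", "RET"},
    "R15": {"LIV", "RET"},
    "R42": {"LIV", "RET", "NEU"},
    "R60": {"HYP", "LIV", "NEU"},
    "R62": {"HYP", "LIV", "RET"},
}
r1 = {"R7", "R10", "R11", "R42", "R60", "R62"}
r2 = {"R10", "R11", "R15", "R42", "R60", "R62"}

def covered_items(rule_ids):
    """Union of the items mentioned by the given rules."""
    return set().union(*(rules[r] for r in rule_ids))

# Rules shared by both subsets.
mci = r1 & r2

# Check that the shared rules still cover every item covered by r1 and r2 together.
mci_covers_all = covered_items(mci) == covered_items(r1 | r2)
```

In this toy setting the shared rules cover the same items as the two subsets combined, so the rules outside the intersection add no coverage and can be dropped.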