Artificial intelligence enabled applications in kidney disease

Abstract Artificial intelligence (AI) is considered the next natural progression of traditional statistical techniques. Advances in analytical methods and infrastructure enable AI to be applied in health care. While AI applications are relatively common in fields like ophthalmology and cardiology, their use is scarcely reported in nephrology. We present the current status of AI in research toward kidney disease and discuss future pathways for AI. The clinical applications of AI in progression to end‐stage kidney disease and dialysis can be broadly subdivided into three main topics: (a) predicting events in the future such as mortality and hospitalization; (b) providing treatment and decision aids such as automating drug prescription; and (c) identifying patterns such as phenotypical clusters and arteriovenous fistula aneurysm. At present, the use of prediction models in treating patients with kidney disease is still in its infancy and further evidence is needed to identify its relative value. Policies and regulations need to be addressed before implementing AI solutions at the point of care in clinics. AI is not anticipated to replace the nephrologists’ medical decision‐making, but instead to assist them in providing optimal personalized care for their patients.


| INTRODUCTION
Artificial intelligence (AI) is anticipated to transform health care through advancements in clinical decision support. Rapid advancements in computational power and improvements in statistical techniques ultimately enable AI to be leveraged to identify hidden interactions and patterns within large, complex, multi-level datasets.
AI has been suggested as the next natural progression of traditional statistical techniques (eg, logistic regression, linear regression, etc), and these analytical advancements can be applied to the practice of medicine. 1,2 An AI-based "virtual coach" using a diverse set of inputs and algorithms may have the potential to aid in personalized medical guidance for patients. 3 AI medical decision support tools for clinicians may also improve efficiency by optimizing routine workflows and aiding them in the process of providing care. 4 A recent bibliometric study on the global evolution of AI in health care and medicine showed that clinical applications of AI are relatively common in fields like ophthalmology, oncology, and cardiology. 5 However, the use of AI is scarcely reported in nephrology, despite the field's large datasets 6 and one of the highest disease burdens. 7 In-center hemodialysis (HD) is typically performed three times per week for 3-5 hours, thus amassing a large volume of clinical data captured in electronic medical records (EMR). These large treatment datasets are ideal for AI applications. With advances in technology, remote treatment monitoring applications allow clinical data to be collected from patients dialyzing at home. Recently, it has also become possible to measure and store beat-to-beat hemodynamic and respiratory values during dialysis treatment. 8 Furthermore, the emerging field of medical grade wearables is anticipated to yield even more robust data in all populations. 9
DOI: 10.1111/sdi.12915
The aim of this review is to: (a) provide an overview of the AI application process in a clinical setting; (b) provide brief descriptions of select advanced Machine Learning (ML) algorithms; (c) present the current status of AI in research toward kidney disease and dialysis; and (d) explore future pathways for AI within the discipline of nephrology.
This review focusses on the applications of AI in progression to end-stage kidney disease (ESKD) and dialysis, omitting the unique acute kidney injury population.

| TYPES OF AI
There is no universal definition of AI, but central to most definitions is the ability of a learning system to mimic human behavior.
As depicted in Figure 1, AI is an umbrella term that brings together concepts from several fields such as computer science, statistics, algorithmics, ML, information retrieval, and data science at large. 10 ML techniques are very powerful in their ability to detect hidden patterns in large datasets that are otherwise difficult to identify by traditional statistical techniques.
The types of ML techniques that currently exist for building AI applications broadly fall into three families (Figure 2), namely supervised learning (SL), unsupervised learning (UL), and reinforcement learning (RL). SL and UL are briefly discussed below, although technical details are beyond the scope of this review. 11 Most applications of RL are in the fields of board and video games and are beyond the scope of this paper.

| Supervised learning
Supervised learning is the most frequently used type of ML. The objective of SL is to build a predictive model that takes historical input features to predict a specific output. For example, one may want to predict if a patient will miss their next dialysis treatment (binary output Yes/No) or predict how long it would take until a patient will transition to dialysis (continuous output).
Supervised learning can be divided into two categories (classification and regression) depending on the type of the output (Figure 2). In classification, the output belongs to a set of distinct classes (eg, missed treatment vs not missed treatment). In regression, the output is usually a continuous numerical quantity (eg, N days until transitioning to dialysis).
There are many ML algorithms for building predictive models, ranging from traditional to more advanced methods. The prediction performance of these models is usually presented as the area under the receiver operating characteristic curve (AUROC). 12 The most common traditional SL methods are logistic regression (for classification) and ordinary least squares regression (for regression). 13 These traditional methods are popular analysis techniques within health care and hence are not discussed here for brevity. Over the past decade, more advanced techniques, such as tree-based methods and deep learning (DL) algorithms, have grown in popularity.
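To make the AUROC metric concrete, the following is a minimal sketch, not taken from any of the cited studies. It uses the rank interpretation of AUROC: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the example labels and scores are hypothetical and chosen only for illustration.

```python
def auroc(labels, scores):
    """Probability that a random positive case is scored above a random negative case."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = missed treatment, 0 = attended treatment.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(auroc(labels, scores))
```

In this toy example, 8 of the 9 positive-negative pairs are ranked correctly. An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.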
FIGURE 1 Relationship between artificial intelligence (AI), machine learning (ML), and deep learning (DL). ML is a subset of AI and DL is a subset of ML. ML is a sub-discipline of AI that uses training examples of how to perform a specific task without explicit instructions to identify associations for a given outcome measure. DL is a subfield of ML that mimics neural networks to learn
The foundation of tree-based methods is the decision tree, an ML technique for sequentially dividing the samples based on determining if a selected feature is greater than, or less than, a threshold determined by the model. At every level of the decision tree, the ML model learns which feature to use and which threshold is the best. Unfortunately, a single decision tree can memorize the training data, resulting in poor performance on unseen data. As a result, many advanced analytical techniques (eg, random forest and Gradient Boosting Classifier) have been created to improve upon traditional single decision trees, increasing generalization to new data. 14 In random forest methods, multiple decision trees are created using random subsets of samples (ie, bootstrap aggregating, or bagging) and random subsets of the input features. On the other hand, Gradient Boosting methods sequentially add decision trees with few levels of nodes (shallow), which leads to a progressive improvement in model performance. One Gradient Boosting method known as XGBoost is currently one of the top performing models in the ML field. 15
An extensive bibliography of new SL techniques, their application, and performance compared to traditional techniques is becoming available. Akbilgic et al compared several different ML modeling techniques to predict risk of death in incident dialysis patients. 16 The random forest model outperformed logistic regression with an AUROC of 0.76 compared to an AUROC of 0.68.
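The way a decision tree chooses a feature and a threshold at one level can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the method of any cited study: it scores candidate splits with Gini impurity (one common criterion; the text does not specify which one is used), and the helper names `gini`, `best_split`, and `bootstrap_sample`, as well as the toy age and albumin values, are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the node is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best single split."""
    best = (None, None, gini(labels))
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) samples with replacement, as done per tree in a random forest."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Hypothetical toy data: [age in years, serum albumin in g/dL]; label 1 = event.
rows = [[70, 3.1], [55, 4.2], [62, 3.0], [48, 4.0], [52, 3.3], [71, 4.1]]
labels = [1, 0, 1, 0, 1, 0]
feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

On this toy data the search selects the second feature (albumin) with a threshold between 3.3 and 4.0, because that split separates the two classes completely. A full tree repeats the search within each resulting subset, and a random forest repeats the whole procedure on many bootstrap samples with a random subset of features considered at each split.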
Deep learning, which uses artificial neural networks (ANNs), is another SL technique that has grown in popularity in the last decade.
ANN began in the 1950s with the MADALINE algorithm. 17 During training, input features are fed forward through the neural network to create a set of predictions. The predictions are then compared to the actual output labels, and this difference (ie, the error) is fed backward through the hidden layers. Over several iterations, the network "learns from its mistakes" and adjusts its unit weights to a point where it can accurately predict the outcome.
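The loop described above (feed the inputs forward, compare predictions with labels, feed the error backward, adjust the weights) can be sketched for a network with one hidden layer. This is a minimal illustration under several assumptions that are not from the source: sigmoid units, a cross-entropy loss, full-batch gradient descent, and hypothetical names (`forward`, `train`) and toy data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Feed inputs forward; return hidden activations and the predicted probability."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x + [1.0]))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden + [1.0])))
    return hidden, output


def train(data, n_hidden=3, epochs=500, lr=0.1, seed=0):
    """Train by gradient descent; return weights and the mean loss seen at each epoch."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # The last weight of every unit is its bias (paired with a constant input of 1.0).
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    losses = []
    for _ in range(epochs):
        grad_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        grad_out = [0.0] * (n_hidden + 1)
        loss = 0.0
        for x, y in data:
            hidden, p = forward(x, w_hidden, w_out)
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            err_out = p - y  # output error for a sigmoid unit with cross-entropy loss
            for k, h in enumerate(hidden + [1.0]):
                grad_out[k] += err_out * h
            for j, h in enumerate(hidden):
                # Error fed backward through the hidden unit.
                err_h = err_out * w_out[j] * h * (1.0 - h)
                for i, xi in enumerate(x + [1.0]):
                    grad_hidden[j][i] += err_h * xi
        losses.append(loss / len(data))
        # Adjust every weight a small step against its averaged gradient.
        for k in range(n_hidden + 1):
            w_out[k] -= lr * grad_out[k] / len(data)
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_hidden[j][i] -= lr * grad_hidden[j][i] / len(data)
    return w_hidden, w_out, losses


# Hypothetical toy data: two scaled input features and a binary outcome.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]
w_hidden, w_out, losses = train(data)
print(losses[0], losses[-1])
```

The recorded loss should fall over the epochs, which is the numerical counterpart of the network "learning from its mistakes". Practical DL frameworks automate the gradient computation and use far larger networks and datasets.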
To optimize these weights, the DL algorithm uses a technique known as backpropagation, which was invented in the 1980s. 18 As layers are added to the neural network, the number of weights and connections increases dramatically. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs), as shown in Figures 5 and 6, are two variants of ANN that were created to reduce the number of weights, resulting in increased performance and decreased training time. A CNN is mostly used for image processing, and an RNN is widely used for natural language processing (NLP). 19 In the medical field, DL 20 (specifically CNN) has mainly been applied to image processing in radiology, histology, dermatology, and retinopathy, where it has demonstrated performance at or above the clinical level. 21-23 In cardiology, for example, DL has been used to predict outcomes after cardiac arrest. 24
Support vector machine (SVM) is a form of SL in which the ML algorithm performs complex data transformations on the labeled data and defined output to draw boundaries within the input data. SVMs can be used to solve classification as well as regression problems. 25

| Unsupervised learning
In UL, there is no output label; rather, the objective is to learn about patterns in the input data itself. UL techniques usually focus on clustering, dimensionality reduction, or anomaly detection. 26 A commonly used clustering technique is k-means clustering. 27 k-means clustering uses an iterative refinement algorithm that alternates between an assignment step and an update step to partition the data into k clusters; the algorithm aims to minimize the within-cluster variance and maximize the between-cluster variance. It is critical to determine an appropriate number of clusters, k, when using the k-means clustering method. Hierarchical clustering 28 [...]
Table 1 shows a very high-level overview of the differences between traditional statistical techniques and advanced analytical methods.
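The assignment and update steps of k-means can be sketched in a few lines. This is a minimal illustration of the standard iterative refinement procedure (often called Lloyd's algorithm), not code from any cited study. The function names, the choice of explicit starting centroids, and the toy two-dimensional points are assumptions made for the example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Iterative refinement on tuples of floats; returns (centroids, assignments)."""
    centroids = [tuple(c) for c in centroids]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


def within_cluster_ss(points, centroids, assignments):
    """Total within-cluster sum of squares, the quantity k-means tries to reduce."""
    return sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignments))


# Hypothetical toy data: two visually separate groups of 2-D points, k = 2.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, centroids=[points[0], points[3]])
print(assignments, within_cluster_ss(points, centroids, assignments))
```

Because the result depends on k and on the starting centroids, a common practice is to run the procedure for several values of k and several initializations and compare the within-cluster sum of squares.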

| AI APPLICATION PROCESS
The AI application process in a clinical setting generally consists of [...]
This capability allows new kinds of data (eg, free text, images, videos, sound, and temporal data) to be utilized. The volume and complexity of such data add additional challenges in analyzing the data.
With a set of well-engineered features, the predictive models are trained and tuned until acceptable performance is achieved.

| APPLICATIONS IN KIDNEY DISEASE
There are several unmet needs in nephrology and there is a huge potential for use of big data and AI in patients with kidney disease.
The applications of AI in kidney disease can be broadly subdivided into three main topics: (a) predicting events in the future; (b) treatment and decision aids; and (c) identification of existing, but unrecognized, patterns (Table 2).

| Predicting chronic kidney disease progression
Chronic kidney disease (CKD) is a growing health crisis across the world. 7 Detecting it early and managing the progression of the disease are critical for positive patient outcomes and controlling [...]
The traditional approach to correcting medical errors is to create new rules and procedures that need to be utilized in a health-care setting. 57 However, data-driven AI approaches can also be applied, particularly when historic evidence already exists. The most common application of AI in minimizing medical errors is to guide which therapeutic approaches may or may not be ideal for a given patient.
Paredes et al 56

| Identifying phenotypical patterns
In patients with ESKD, several patterns, such as the malnutrition-inflammation-atherosclerosis syndrome, have been discovered by traditional statistical methods. These patterns have increased our pathophysiological understanding and were shown to be strong prognostic indicators. Recently, studies have shown that fluid overload can also be part of a pathophysiologic spectrum including malnutrition and inflammation. 61,62 The concomitant presence of these three risk factors yielded a near six-fold increase in mortality risk.
However, unlike in other chronic diseases, pattern detection based on UL techniques has not yet been published in nephrology. In patients with heart failure with preserved ejection fraction, three different phenotypical patterns were identified based on clinical, laboratory, and echocardiographic parameters by agglomerative hierarchical clustering. These clusters differed greatly in mortality risk. In cardiology, the use of UL techniques to detect phenotypical patterns was termed "phenomapping" by the authors. 63 Another example, from infection medicine, applied k-means clustering to a cohort of patients with sepsis. Four different phenotypes were observed with a distinct difference in outcome, of which one was characterized by older patients with more chronic illness and renal dysfunction (β phenotype). The highest 28-day mortality (40%) was observed in the δ phenotype, characterized by patients with septic shock and liver dysfunction, as compared to 13% in the β phenotype and 5% in the α phenotype, which had the lowest risk. 64 Another study used k-means clustering on a set of clinical parameters and biomarkers to identify different metabolic clusters in older adults without diabetes. In the clusters characterized by lower eGFR and albuminuria and in the cluster with the highest inflammation, the risk of cardiovascular endpoints was comparable to that of the diabetic cluster. 65 Whether phenomapping in different diseases has relevance for personalized treatment prescription needs to be addressed in future trials.

| Identifying unknown comorbidities
One area of concern involves patient comorbidities. ESKD patients with multiple medical comorbidities face decreased survival likelihoods. 66 [...]
[...] there will be instances where models will predict incorrectly. In such situations, a precedent of accountability needs to be established.

| Image classification for arteriovenous fistula aneurysm and biomarker fingerprints
AI solutions should be transparent and traceable. It is important that the predictive models use data collected routinely in the standard of care; otherwise, the resulting models will likely be biased by indication.
Teams developing and using AI solutions should be aware of this limitation. Thorough evaluation of the input data variables should be conducted as a key step in the selection of outcomes and the process of building predictive models.
While a lot of emphasis is placed on developing powerful and accurate models, more emphasis should be directed toward building an end-to-end team of practitioners in data analytics, data engineering, trainers, care providers, and patients to create effective solutions that benefit all stakeholders. The effectiveness of the prediction models depends heavily on the ability to use their insights to make clinical interventions. At the same time, interventions need to be thoroughly thought through, depending on the unique factors driving the clinical outcome, and personalized for every patient.
Lastly, AI solutions when implemented at the point of care for nephrologists should be viewed as a clinical decision support tool to extend providers' insights about the patients. AI is not anticipated to replace providers' medical decision-making, but instead assist them in providing optimal personalized care for their patients.