Genetic prediction of ICU hospitalization and mortality in COVID‐19 patients using artificial neural networks

Abstract There is an unmet need of models for early prediction of morbidity and mortality of Coronavirus disease‐19 (COVID‐19). We aimed to a) identify complement‐related genetic variants associated with the clinical outcomes of ICU hospitalization and death, b) develop an artificial neural network (ANN) predicting these outcomes and c) validate whether complement‐related variants are associated with an impaired complement phenotype. We prospectively recruited consecutive adult patients of Caucasian origin, hospitalized due to COVID‐19. Through targeted next‐generation sequencing, we identified variants in complement factor H/CFH, CFB, CFH‐related, CFD, CD55, C3, C5, CFI, CD46, thrombomodulin/THBD, and A Disintegrin and Metalloproteinase with Thrombospondin motifs (ADAMTS13). Among 381 variants in 133 patients, we identified 5 critical variants associated with severe COVID‐19: rs2547438 (C3), rs2250656 (C3), rs1042580 (THBD), rs800292 (CFH) and rs414628 (CFHR1). Using age, gender and presence or absence of each variant, we developed an ANN predicting morbidity and mortality in 89.47% of the examined population. Furthermore, THBD and C3a levels were significantly increased in severe COVID‐19 patients and those harbouring relevant variants. Thus, we reveal for the first time an ANN accurately predicting ICU hospitalization and death in COVID‐19 patients, based on genetic variants in complement genes, age and gender. Importantly, we confirm that genetic dysregulation is associated with impaired complement phenotype.


| INTRODUCTION
Coronavirus disease-19 (COVID-19), caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), has led to unprecedented morbidity and mortality worldwide. 1 Although vaccination against SARS-CoV-2 has positively impacted the course of this pandemic, 2 the unmet need of reducing morbidity and mortality due to severe COVID-19, especially in special populations, remains.
Accumulating evidence suggests that SARS-CoV-2 induces a vicious cycle of immune dysfunction, endothelial injury, complement activation and microangiopathy. In particular, severe COVID-19 is a multisystemic vascular disease characterized by endothelial dysfunction. 3,4 Therefore, an improved understanding of endothelial dysfunction and complement activation is of utmost importance.
In this context, several groups worldwide have shown evidence of complement activation in experimental and clinical studies of severe COVID-19. [5][6][7][8][9][10][11][12][13] Based on the paradigm of genetic susceptibility in complement-mediated disorders or complementopathies, 14 our group and other researchers have suggested genetic susceptibility by identifying complement genetic variants in COVID-19 patients. [15][16][17] In parallel, complement inhibitors have been safe and effective in severe COVID-19 during the first wave. 18 Encouraging results have been reported in case series for terminal complement inhibition with eculizumab, 19 C3 inhibition with AMY-101, 20 C1 inhibition with conestat alfa 21 and lectin pathway inhibition with narsoplimab. 22 The comparison of AMY-101 to eculizumab suggested a broader involvement of C3 in thromboinflammation. 23 Based on promising data, randomized controlled trials are ongoing for AMY-101 and eculizumab in severe COVID-19 (NCT04346797 and NCT04395456).
Interim analysis is still pending on the paused phase 3 study of the long-acting terminal complement inhibitor ravulizumab.
Nevertheless, several issues need to be considered for the wider use of complement inhibitors in COVID-19, including the complex setting of inflammatory responses in COVID-19, the cost and the limitation of drug accessibility. All require proper selection of patients that would potentially benefit from complement inhibition.
Taking into account the constantly evolving COVID-19 landscape due to vaccination and viral mutations, there is an unmet clinical need for a prediction tool based on robust variables. Therefore, we aimed initially to identify critical complement-related genetic

22 Hematology Laboratory - Blood Bank, Aretaieion Hospital, School of Medicine, NKUA, Athens, Greece

Correspondence
Eleni Gavriilaki, Hematology Department - BMT Unit, G. Papanicolaou Hospital, Exochi, 57010, Thessaloniki, Greece. Email: elenicelli@yahoo.gr

Funding Information
Our research was supported by independent investigator-driven grants (Prefecture of Macedonia and Pfizer Pharmaceuticals).

A few studies employing ANNs have also emerged focussing on COVID-19 prediction problems. [33][34][35][36][37][38][39][40] The basic building block of ANNs is the artificial neuron, which is a mathematical model mimicking the behaviour of the biological neuron (Figure 1). Information is passed on to the artificial neuron as an input parameter and is then processed using a mathematical function to derive an output which determines the behaviour of the neuron (similar to the fire-or-not situation of the biological neuron).
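The behaviour of a single artificial neuron can be sketched in a few lines of Python. This is a minimal illustration and not the model used in the study: the sigmoid activation, the weights, the bias and the 0.5 firing threshold are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Logistic activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical values for illustration only (not taken from the study).
inputs = [1.0, 0.0, 1.0]
weights = [0.5, -0.3, 0.8]
bias = -1.0

activation = neuron_output(inputs, weights, bias)
fires = activation >= 0.5  # assumed threshold mimicking the fire-or-not behaviour
```

In a full network, many such neurons are arranged in layers and the weights and biases are learned from the training data rather than set by hand.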

| Genetic studies
Genetic studies were performed as previously described. 17 Briefly, peripheral blood samples were used to isolate genomic

| Functional assessment of complement activation
Plasma was isolated from EDTA tubes collected at hospitalization for non-ICU patients or at ICU admission for ICU patients and stored
Continuous variables were compared using the t test or the Mann-Whitney test, according to normality.

| Genetic analysis
We identified a total of 381 variants, ranging from 40 to 101 per patient (mean value 71 and standard deviation 12). The database of variants is presented as supplementary material in the supplemental Excel file entitled Variants Database of 133 COVID-19 Patients.
Using the pre-defined set of criteria, the total of 381 variants was reduced to 5 critical variants, which are mainly associated with severe COVID-19 infection (morbidity and mortality), as shown in Table 1: rs2547438 (C3), rs2250656 (C3), rs1042580 (THBD), rs800292 (CFH) and rs414628 (CFHR1). Variant characteristics are shown in detail in Supplementary Table 1. Interestingly, Figure 3 shows that variants satisfying Criterion II are 15% more frequent in male than in female patients and the reverse.

FIGURE 2
Study population characteristics categorized by age, gender and infection severity (requiring or not requiring intensive care unit (ICU), mortality)

The number of variant combinations is defined as follows:
where nv is the number of variants and npv is the number of variant patterns.
In this study, nv = 381 variants were investigated in a sample of 133 COVID-19 patients.
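The defining expression is not reproduced in this text. If the number of variant combinations is assumed to be the standard binomial coefficient, that is, the number of unordered selections of npv variants out of nv, it can be computed as sketched below. The function name `variant_combinations` is hypothetical, and the binomial form itself is an assumption rather than a confirmed reading of the source.

```python
from math import comb

def variant_combinations(nv, npv):
    """Unordered selections of npv variants out of nv (assumed binomial coefficient)."""
    return comb(nv, npv)

nv = 381  # number of variants reported in the study
pairs = variant_combinations(nv, 2)  # 381 * 380 / 2 = 72390
```

Under this assumption the count grows very quickly with npv, which illustrates why a reduction to a small set of critical variants is needed before building a predictive model.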

| Development of ANNs
The database used in this research comprised 133 data sets, with each data set containing 7 input parameters (age, gender and 5 parameters indicating the presence or absence of each of the 5 critical variants). Figure 4 illustrates a statistical analysis of the selected input parameters.
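As a rough sketch of how one such data set could be laid out, the snippet below encodes a patient as a 7-element input vector. The coding of gender as 0/1, the use of raw age, the ordering of the variants and the helper name `encode_patient` are assumptions made for illustration; the text does not specify them.

```python
# The five critical variants named in the text; the ordering is an assumption.
CRITICAL_VARIANTS = ["rs2547438", "rs2250656", "rs1042580", "rs800292", "rs414628"]

def encode_patient(age, gender, variants_present):
    """Return [age, gender flag, five 0/1 variant indicators]."""
    gender_flag = 1 if gender == "male" else 0  # assumed coding
    indicators = [1 if rs in variants_present else 0 for rs in CRITICAL_VARIANTS]
    return [age, gender_flag] + indicators

# Hypothetical patient, not taken from the study database.
x = encode_patient(64, "female", {"rs2250656", "rs800292"})
# x == [64, 0, 0, 1, 0, 1, 0]
```

In practice the age input would typically be rescaled before training so that it is on a range comparable to the binary inputs; whether and how this was done here is not stated.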

| Functional assessment of complement activation
[Figure: variant frequency based on sex difference (%)]

| DISCUSSION
We reveal for the first time an ANN able to accurately predict ICU hospitalization and death in COVID-19 patients, based on genetic variants in complement genes, age and gender. Importantly, we also confirm that genetic dysregulation is associated with an impaired complement phenotype.
Considering that these analyses were not able to provide tools for disease severity prediction that could be helpful in clinical practice or a clinical trial setting, our group recently developed an algorithm identifying complement-related variants in C3, CFH and THBD that predict COVID-19 severity. 17 Nevertheless, the prediction rate of this logical algorithm reached values above 80% only in patients not requiring ICU hospitalization and did not incorporate basic features associated with morbidity and mortality, such as age and gender. In the present study, we improved this logical algorithm in order to identify both ICU and non-ICU patients. Using the updated algorithm, we identified variants in complement-related genes (C3, CFH,
Two of the five critical variants were also the ones that composed the initial algorithm. Based on the five critical variants derived from the updated algorithm, we further implemented an ANN incorporating age and gender. This tool is able to predict not only morbidity but also mortality in COVID-19 patients.
that may be also associated with differences in morbidity and mortality, such as differences in socioeconomic factors or comorbidities, are also incorporated in the algorithm by the addition of age and gender.
Our study is also the first to show an association between gen-
indicating the need for future studies. In addition, our patient population is rather small, and therefore, the suggested ANN needs to be further validated in other real-world cohorts.

| CONCLUSIONS
An artificial neural network model was developed, trained and evaluated for the prediction of morbidity and mortality in COVID-19 patients. A number of complement-related genetic variants associated with severe COVID-19 were identified and used as inputs to the model, together with the patient's age and gender. Using a sample of 133 patients, the developed ANN model was found capable of successfully predicting COVID-19 severity in 89.47% of the study population.
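The reported percentage is a plain accuracy figure: the share of patients whose predicted outcome matches the observed one. A minimal sketch follows. Note that the count of 119 correct predictions is inferred arithmetically from 89.47% of 133 and is not stated in the text, and that the class coding and the labels in the toy example are hypothetical.

```python
def accuracy_percent(predicted, observed):
    """Percentage of cases where the predicted class equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Toy labels (hypothetical coding): 0 = no ICU, 1 = ICU, 2 = died.
observed = [0, 0, 1, 2, 0, 1]
predicted = [0, 0, 1, 1, 0, 1]
toy_accuracy = accuracy_percent(predicted, observed)  # 5 of 6 correct

# Consistency check of the reported figure, using the inferred count of 119.
reported = round(100.0 * 119 / 133, 2)
```

The same function can be applied separately to the male and female subgroups, or to each severity class, to obtain group-wise percentages of the kind reported for the optimum model.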
In conclusion, germline complement-related genetic variants along with age and gender predict morbidity and mortality in COVID-19 patients. Given that vaccinations and viral mutations constantly change the landscape of COVID-19, a prediction tool based on such robust variables is of high importance in the future of this pandemic. Additionally, such a prediction tool is also expected to significantly contribute to better selection of patients that would benefit from targeted complement inhibition, considering the clinical phenotype associated with these variants. Last but not least, this novel approach of artificial intelligence paves the way for future application in additional clinical entities.

ACKNOWLEDGEMENT
We would like to thank the biologist Maria Spachidou for her technical assistance.

CONFLICTS OF INTEREST
E.G. has consulted for Omeros Corporation and is supported by the ASH Global Research Award. The remaining authors declare no competing financial interests.

FIGURE 6
Percentage predictions of COVID-19 severity based on the optimum ANN model. The first group of three column bars represents the correct predictions of the ANN model for patients that did not require ICU treatment (achieving more than 93% successful predictions). The percentage of successful predictions is better for the female patients (more than 96%), while for male ones the respective percentage is 91.67%. The last group of three column bars represents the correct predictions of the ANN model for all cases of patient infection severity (requiring or not requiring intensive care unit and died). The percentage of successful predictions is more than 90% for male patients and close to 90% for female ones, while the combined percentage is 89.47%. The remaining groups of column bars report the respective results for the patients requiring ICU and those who have died.