Alveolar–arterial oxygen gradient nonlinearly impacts the 28‐day mortality of patients with sepsis: Secondary data mining based on the MIMIC‐IV database

Abstract Objective The lungs are often involved in sepsis, resulting in acute respiratory distress syndrome (ARDS). The alveolar–arterial oxygen gradient [D(A‐a)O2] reflects lung diffusing capacity, which is usually compromised in ARDS. Whether D(A‐a)O2 affects the prognosis of patients with sepsis remains to be explored. Our study aims to investigate the association between D(A‐a)O2 and 28‐day mortality in patients with sepsis using the large‐sample Medical Information Mart for Intensive Care (MIMIC)‐IV database. Methods We extracted data on 35 010 patients with sepsis from the retrospective cohort MIMIC‐IV database and investigated the independent effect of D(A‐a)O2 on 28‐day death risk, with D(A‐a)O2 as the exposure variable and 28‐day mortality as the outcome variable. Binary logistic regression and a two‐piecewise linear model were employed to explore the relationship between D(A‐a)O2 and 28‐day death risk after adjustment for confounding factors, including demographic indicators, Charlson comorbidity index (CCI), Sequential Organ Failure Assessment (SOFA) score, drug administration, and vital signs. Results A total of 18 933 patients were included in the final analysis. The patients' average age was 66.67 ± 16.01 years, and the 28‐day mortality was 19.23% (3640/18 933). Each 10‐mmHg rise in D(A‐a)O2 was associated with a 3% increase in the odds of death at 28 days both in the unadjusted model and after adjustment for demographic variables (odds ratio [OR]: 1.03, 95% CI: 1.02 to 1.03). After adjustment for all covariates, each 10‐mmHg increase in D(A‐a)O2 was likewise associated with a 3% increase in the odds of death (OR: 1.03, 95% CI: 1.023 to 1.033).
Using smoothed curve fitting and generalized additive models, we found a nonlinear relationship between D(A‐a)O2 and 28‐day death: D(A‐a)O2 had no association with the prognosis of patients with sepsis when it was less than or equal to 300 mmHg, but once D(A‐a)O2 exceeded 300 mmHg, every 10‐mmHg elevation was accompanied by a 5% increase in the odds of 28‐day death (OR: 1.05; 95% CI: 1.04 to 1.05, p < 0.0001). Conclusion Our findings suggest that D(A‐a)O2 is a valuable indicator for the management of patients with sepsis, and it is recommended that D(A‐a)O2 be kept below 300 mmHg as far as possible during the course of sepsis.


| INTRODUCTION
Sepsis is one of the most severe disease processes in clinical settings. It has received increasing attention from clinicians since 1991, and the Surviving Sepsis Campaign (SSC) guideline has been updated several times since its first version in 2001. 1,2 The most recent definition of sepsis is life-threatening organ dysfunction caused by a dysregulated host response to infection. 2 Sepsis is estimated to affect more than 19 million people each year and to kill approximately 5 million worldwide. 3 Despite substantial advances in understanding the host response to microbial pathogens, sepsis mortality has not decreased so far. One of the bottlenecks is the lack of a valid model to predict the prognosis of sepsis and thus provide a basis for prognostic enrichment, which leaves clinical trials of new drugs with insufficient power. 4 The alveolar-arterial oxygen gradient (D[A-a]O2) is a specific indicator of the diffusion capacity of the lung, with a normal physiologic range of 7-14 mmHg 5 in healthy adults breathing room air (fraction of inspired oxygen [FiO2] of 21%); it also varies with age. Diffusion capacity is impaired in some conditions, such as severe pulmonary disease, and especially in acute respiratory distress syndrome (ARDS). Clinical studies have reported that D(A-a)O2 can identify, at an early stage, COVID-19 patients at risk of developing severe pneumonia 6 and can predict mortality from interstitial pneumonia in patients with dermatomyositis treated with cyclosporine A/glucocorticosteroid combination therapy. 7 D(A-a)O2 can also predict the short-term prognosis of patients with submassive pulmonary embolism and is useful for their risk stratification. 8 During sepsis, the lungs are among the most frequently involved organs, which makes sepsis-related ARDS a common complication. 9,10 The occurrence of ARDS in turn aggravates sepsis because of its pathologic characteristics, among which diffusion capacity is severely decreased (seen as an increase in D[A-a]O2), resulting in oxygenation dysfunction. 11 This mechanism plays a pivotal role in sepsis pathogenesis, suggesting that D(A-a)O2 may be linked to sepsis prognosis.
The goal of our study is to investigate the relationship between D(A-a)O2 and the risk of 28-day mortality in patients with sepsis, based on the large-sample Medical Information Mart for Intensive Care (MIMIC)-IV database from the United States. A larger sample size should provide more consistent and trustworthy results and a better understanding of the relationship between these two variables in sepsis.

| Source of information
The data were sourced from the MIMIC-IV database, which was designed for retrospective cohort research and includes clinical data of patients treated at Beth Israel Deaconess Medical Center (BIDMC) from 2008 to 2019. The database can be downloaded free of charge after completing an approved course on the official website. Lu Chen, one of our authors, completed the certified course, acquired access to the database, and was therefore in charge of data extraction (record ID: 50668217). Our analysis conforms to the RECORD (REporting of Studies Conducted Using Observational Routinely-Collected Health Data) statement.

| Cohort information
The MIMIC-IV database includes 377 207 adult patient records in total, from which we extracted the data of 35 010 patients with a sepsis diagnosis. The extraction was based on the ICD-9 and ICD-10 codes recorded in the database (ICD-9 codes 99591-99592 or ICD-10 codes R652, R6520, and R6521), spanning 2008 to 2019. All of these patients received treatment at BIDMC. We examined the association between 28-day death and D(A-a)O2, with 28-day death as the outcome variable (dichotomous: Y = 1, death; Y = 0, survival) and D(A-a)O2 as the exposure variable (recorded as a continuous variable). The worst D(A-a)O2 value from the first day of ICU admission was extracted for analysis. Patients with missing exposure variable information were excluded from this study.
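For orientation, D(A-a)O2 is conventionally derived from the simplified alveolar gas equation. The sketch below shows that textbook calculation only; it is an assumption for illustration, because the text does not state how the D(A-a)O2 values stored in the database were computed. The function name, the barometric pressure (760 mmHg), the water vapour pressure (47 mmHg), the respiratory quotient (0.8), and the example blood gas values are all assumed.

```python
def aa_gradient(fio2, paco2, pao2, p_atm=760.0, p_h2o=47.0, rq=0.8):
    """Return D(A-a)O2 in mmHg from the simplified alveolar gas equation.

    Assumed defaults (not taken from the study): sea-level barometric
    pressure, water vapour pressure at 37 C, respiratory quotient of 0.8.
    """
    if not 0.21 <= fio2 <= 1.0:
        raise ValueError("fio2 must be a fraction between 0.21 and 1.0")
    # Alveolar oxygen partial pressure: PAO2 = FiO2 * (Patm - PH2O) - PaCO2 / RQ
    alveolar_po2 = fio2 * (p_atm - p_h2o) - paco2 / rq
    # The gradient is alveolar minus arterial oxygen partial pressure.
    return alveolar_po2 - pao2


# Room air with normal gas exchange (hypothetical values).
room_air = aa_gradient(fio2=0.21, paco2=40.0, pao2=90.0)
# Ventilated patient on 100% oxygen with poor oxygenation (hypothetical values).
ventilated = aa_gradient(fio2=1.0, paco2=40.0, pao2=80.0)
print(round(room_air, 1), round(ventilated, 1))
```

Under these assumptions the room-air example falls within the 7-14 mmHg normal range quoted in the Introduction, whereas a high FiO2 combined with a low arterial oxygen pressure yields a gradient of several hundred mmHg.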

| Statement on ethics and informed consent
The MIMIC-IV database was approved by the institutional review boards of BIDMC (2001-P-001699/14) in Boston, Massachusetts, and the Massachusetts Institute of Technology (0403000206). It is now available on the internet. The requirement for informed consent was waived because the data are publicly available and patients' identifying information has been removed.

| Description of missing data
Multiple imputation was not used to fill in missing values because the missing rate of the variables used in this study was less than 5% (0-4.1%).

| Statistical analysis
Continuous variables were expressed as the mean ± standard deviation (Gaussian distribution) or as the median (minimum, maximum) (skewed distribution). Categorical variables were presented as rates (percentages). Because this was a cohort study, we categorized the exposure variable into four groups (quartiles, Q1 to Q4) and examined the distribution of patient baseline information across the subgroups. To evaluate statistical differences between the means and proportions of the groups, we performed one-way ANOVA (Gaussian distribution), the Kruskal-Wallis H test (skewed distribution), and chi-square tests (categorical variables). We also employed univariable and multivariable binary logistic regression to investigate the relationship between D(A-a)O2 and 28-day mortality under three different models: Model 1 (no factors adjusted), Model 2 (adjusted for demographic characteristics only), and Model 3 (adjusted for all covariates presented in Table 1). Finally, we performed a series of sensitivity analyses: (1) We converted D(A-a)O2 from a continuous to a categorical variable (quartiles, Q1 to Q4) and calculated the P value for trend, to determine whether the results for the categorical variable were as robust as those for the continuous variable. (2) We employed several different adjustment strategies to evaluate the robustness of our results.
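The quartile grouping described above can be sketched as follows. This is a minimal illustration on made-up values, not the authors' R/EmpowerStats code; the function name, the toy data, and the use of Python's default quantile method are assumptions, and other software may place the cut points slightly differently.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def quartile_labels(values):
    """Assign each value to Q1..Q4 using the sample quartile cut points."""
    cuts = quantiles(values, n=4)  # three cut points between the four groups
    # bisect_right counts how many cut points are <= the value (0..3).
    return ["Q%d" % (bisect_right(cuts, v) + 1) for v in values]


# Hypothetical D(A-a)O2 values in mmHg, for illustration only.
toy_gradients = list(range(1, 101))
labels = quartile_labels(toy_gradients)
group_sizes = Counter(labels)
print(sorted(group_sizes.items()))
```

The resulting labels can then be entered into a regression model as a categorical exposure, with Q1 as the reference group, or coded as an ordinal term when testing for trend.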
As D(A-a)O2 is a continuous variable, a nonlinear association between D(A-a)O2 and 28-day mortality could not be excluded. Therefore, we used generalized additive models (GAM) and smoothed curve fitting to investigate the relationship between D(A-a)O2 and 28-day mortality in sepsis. If a nonlinear relationship was found, we determined the inflection point with a recursive technique and then used a two-piecewise linear model to obtain the OR and 95% confidence interval on each side of the inflection point.
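A two-piecewise linear model can be written with two regressors that split the exposure at the inflection point. The sketch below shows only that parameterization, using the inflection point (300 mmHg) and the odds ratio above it (1.05 per 10 mmHg) from the Abstract. The odds ratio of 1.00 below the inflection point is an assumption standing in for "no association", and the function names are hypothetical. The sketch does not fit a model or search for the inflection point.

```python
import math

KNOT = 300.0  # inflection point reported in the Abstract, in mmHg
OR_BELOW_PER_10 = 1.00  # assumed: stands in for "no association" below the knot
OR_ABOVE_PER_10 = 1.05  # reported odds ratio per 10 mmHg above the knot


def piecewise_terms(x, knot=KNOT):
    """Split x into the two regressors of a two-piecewise linear model."""
    below = min(x, knot)        # grows until the knot, then stays constant
    above = max(x - knot, 0.0)  # zero until the knot, then grows
    return below, above


def log_odds_shift(x, knot=KNOT):
    """Change in log-odds relative to x = 0 (intercept and covariates omitted)."""
    below, above = piecewise_terms(x, knot)
    slope_below = math.log(OR_BELOW_PER_10) / 10.0  # per mmHg
    slope_above = math.log(OR_ABOVE_PER_10) / 10.0  # per mmHg
    return slope_below * below + slope_above * above


def odds_ratio(x_from, x_to):
    """Odds ratio implied by moving from x_from to x_to."""
    return math.exp(log_odds_shift(x_to) - log_odds_shift(x_from))


print(round(odds_ratio(100, 300), 3), round(odds_ratio(300, 400), 3))
```

Because both regressors are continuous at the knot, the fitted curve has no jump at 300 mmHg; only its slope changes. In a real analysis the two slopes would be estimated by logistic regression, and the knot is typically chosen as the candidate value that maximizes the model likelihood.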
All analyses were carried out with the statistical programs R (http://www.r-project.org, The R Foundation) and EmpowerStats (http://www.empowerstats.com, X&Y Solutions, Inc, Boston, MA). A two-sided P value of less than 0.05 was considered statistically significant.

| Description of the patient screening process
The MIMIC-IV database contains the data of 377 207 patients, of whom 342 197 were excluded because they did not have sepsis, leaving 35 010 patients. Of these, a further 16 077 patients were excluded: 15 787 had no D(A-a)O2 information and another 290 had a D(A-a)O2 value of less than 0. Ultimately, the data of 18 933 patients were included in the final analysis (Figure 1).
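The screening counts can be checked with simple arithmetic. The short sketch below uses only numbers quoted in the text and the Abstract; the variable names are illustrative.

```python
total_records = 377_207
non_sepsis = 342_197
missing_gradient = 15_787   # no D(A-a)O2 information
negative_gradient = 290     # D(A-a)O2 value below 0
deaths_28d = 3_640          # 28-day deaths reported in the Abstract

sepsis = total_records - non_sepsis
excluded = missing_gradient + negative_gradient
analysed = sepsis - excluded
mortality_pct = 100 * deaths_28d / analysed

print(sepsis, excluded, analysed, round(mortality_pct, 2))
```

The counts are internally consistent: the two exclusion steps reproduce the 35 010 sepsis patients, the 16 077 exclusions, and the final cohort of 18 933, and the reported deaths correspond to the stated 28-day mortality of 19.23%.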

| Patient baseline characteristics
In order to observe the association between D(A-a)O2 as a categorical variable and 28-day mortality, we categorized D(A-a)O2 into four groups (Q1 to Q4) (Table 1). We observed the tendencies in the distribution of each  antibiotics, vancomycin, penicillin antibiotics, dobutamine, norepinephrine, and a 28-day mortality when compared with the Q1, Q2, and Q3 groups, but had a lower PO2/FiO2 value and a lower proportion of cephalosporin use than the Q1, Q2, and Q3 groups. Patients in the Q2 and Q3 groups were older and had a lower rate of methylprednisolone use than those in the Q4 group. Compared with the Q4 group, patients in the Q1 and Q2 groups had higher rates of dexamethasone use.

| Univariate and multivariate analyses for the association between D(A-a)O2 and 28-day mortality
To explore the relationship between D(A-a)O2 and 28-day death in patients with sepsis, we conducted univariate and multivariate analyses. Each 10-mmHg rise in D(A-a)O2 was associated with a 3% increase in the odds of death at 28 days both in the unadjusted model and after adjustment for the variables in Model 2 (odds ratio [OR]: 1.03, 95% CI: 1.02 to 1.03). After adjustment for the variables in Table 1, each 10-mmHg increase in D(A-a)O2 was accompanied by a 3% increase in the odds of 28-day death (OR: 1.03, 95% CI: 1.023 to 1.033). To test for trend, we conducted a sensitivity analysis in which we converted D(A-a)O2 from a continuous variable into a categorical variable (quartiles, Q1 to Q4) (Table 2). Furthermore, we investigated the influence of treatment strategies on outcomes, including the use of vasoactive drugs, glucocorticoids, and adaptation of antibiotic therapy. There was no significant change in the effect values for D(A-a)O2 with or without adjustment for these variables (Supplemental Tables S1, S2, and S3).
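An odds ratio reported per 10 mmHg can be rescaled to other increments, because a logistic regression coefficient is linear on the log-odds scale. The sketch below applies this to the fully adjusted estimate (OR 1.03, 95% CI 1.023 to 1.033). The helper names are hypothetical, the standard error assumes a Wald-type interval, and the outputs are approximate because the published values are rounded.

```python
import math


def rescale_or(or_per_10, delta_mmhg):
    """Odds ratio for a delta_mmhg increase, given the OR per 10 mmHg."""
    beta_per_mmhg = math.log(or_per_10) / 10.0
    return math.exp(beta_per_mmhg * delta_mmhg)


def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error of log(OR) from a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


or_100 = rescale_or(1.03, 100)        # implied OR per 100 mmHg
se_log_or = se_from_ci(1.023, 1.033)  # from the reported 95% CI
print(round(or_100, 3), round(se_log_or, 4))
```

A small odds ratio per 10 mmHg therefore compounds multiplicatively over the range of several hundred mmHg seen in ventilated patients, provided the association is linear on the log-odds scale over that range.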

| Results of the nonlinear association between D(A-a)O2 and 28-day mortality
We used smoothed curve fitting and generalized additive models to investigate the relationship between D(A-a)O2 and 28-day death. After adjusting for all covariables shown in Table 1 Table 3).

| DISCUSSION
In this large retrospective cohort analysis, we found a curvilinear association between D(A-a)O2 and 28-day mortality in patients with sepsis, with an inflection point at a D(A-a)O2 of 300 mmHg. When D(A-a)O2 was less than or equal to 300 mmHg, it had no association with patient prognosis, but once D(A-a)O2 exceeded 300 mmHg, each 10-mmHg increase was associated with a 5% increase in the odds of death (OR: 1.05; 95% CI: 1.04 to 1.05, p < 0.0001).
As far as we know, few previous clinical studies have focused on the relationship between D(A-a)O2 and the prognosis of patients with sepsis. D(A-a)O2 is the difference between the alveolar and the arterial oxygen partial pressure. It can detect hypoxia and is mainly associated with pulmonary diffusion function, anatomical shunt, and so forth. Under physiologic conditions, D(A-a)O2 increases as the concentration of inhaled oxygen rises, but rarely exceeds 56 mmHg. 5 However, some respiratory disorders in which diffusion dysfunction or ventilation/perfusion (V/Q) mismatch occurs, such as interstitial pneumonia, severe pulmonary fibrosis, and ARDS, can raise D(A-a)O2 dramatically. D(A-a)O2 is therefore often used as an indicator of the severity of these diseases. For patients with submassive pulmonary embolism, Ince et al found that a D(A-a)O2 of ≥42.38 mmHg had good predictive value for 90-day mortality, with an area under the curve of 0.83 and a sensitivity, specificity, and negative predictive value of 93.3%, 65.1%, and 98.6%, respectively. 8 Pipitone et al reported that D(A-a)O2 is superior to PaO2/FiO2 in identifying, at an early stage, COVID-19 patients at risk of developing severe pneumonia, with an area under the curve of 0.877, a sensitivity of 77.8%, and positive and negative predictive values of 75% and 94%, respectively, at a D(A-a)O2 of ≥60 mmHg. 6 However, our study found a markedly different "inflection point" from those of these previous studies. Ince et al included patients with submassive pulmonary embolism, and arterial blood samples were obtained while the patients were breathing room air to avoid interference from supplemental oxygen administration. Their population was therefore clearly different from ours. Our study population consisted of sepsis patients, most of whom were on ventilator therapy, which can explain why D(A-a)O2 was significantly higher in this population than in previous studies. In the study by Pipitone et al, the population was COVID-19 patients, the outcome was whether or not they developed severe pneumonia, and D(A-a)O2 was obtained on admission to hospital, before lung function had deteriorated. This suggests that the blood gas specimens were likely collected and analyzed while mechanical ventilation was not being used.
TABLE 2 Results of univariate and multivariate analysis using non-adjusted and adjusted Cox regression models.

In our retrospective analysis, we first examined the relationship between D(A-a)O2 and 28-day death in patients with sepsis using univariate and multivariate analyses, in which D(A-a)O2 was the exposure variable and 28-day death the outcome variable, with adjustment for other variables. We found that 28-day mortality rose with increasing D(A-a)O2 whether D(A-a)O2 was treated as a continuous or a categorical variable. We then performed smoothed curve fitting with adjustment for all other covariables and found a nonlinear association between D(A-a)O2 and the risk of 28-day death in patients with sepsis: when D(A-a)O2 was more than 300 mmHg, each 10-mmHg rise was associated with a 5% increase in the odds of death. Because we employed multiple statistical analyses and adjusted for the other confounders, we consider our results reliable.
Sepsis begins with an infection and progresses to multiple organ dysfunction via a cytokine storm; the respiratory system is particularly vulnerable, and its involvement mostly presents as sepsis-induced ARDS. Clinical data have demonstrated that once sepsis is complicated by ARDS, the patient's condition deteriorates and the probability of survival is significantly reduced. 14,15 We speculate that the important underlying reasons include dysfunction of pulmonary diffusion capacity and V/Q mismatch, which are important pathophysiologic characteristics of ARDS. We did not know whether patients with sepsis-induced ARDS were included in our analysis, but the distribution of the data shows that the D(A-a)O2 values of many patients exceeded 300 mmHg, a value far above the normal threshold, implying that many patients with sepsis-induced ARDS were included. We therefore have reason to think that patients with a D(A-a)O2 of more than 300 mmHg are those with diffusion dysfunction and/or V/Q mismatch, that is, sepsis-induced ARDS. The nonlinear relationship found in our results thus suggests that the more severe the sepsis-induced ARDS, the higher the probability of 28-day death. Overall, our findings suggest that D(A-a)O2 is a useful indicator for predicting the risk of 28-day death in patients with sepsis. Our work has the following advantages: (1) It has a large sample size and adequate statistical power. (2) It employs a generalized additive model and a two-piecewise linear model, both of which are advanced methods for better characterizing the genuine association between D(A-a)O2 and death. (3) The large number of covariates, together with the multiple adjustment strategies and sensitivity analyses, supports the robustness of the results and decreases the probability of chance findings.
TABLE 3 Non-linear relationships addressing.
Meanwhile, our analysis has some limitations: (1) Because this is a retrospective clinical study, it is inevitably subject to confounding factors. Nevertheless, we systematically accounted for the confounding factors and examined the robustness of the results by sensitivity analysis. (2) Owing to the nature of observational studies, we can only observe associations rather than determine cause and effect. (3) We can only account for measured confounding, not unmeasured confounding, so extensive clinical research with stronger levels of evidence in larger populations is required to corroborate our findings. (4) As the population investigated in this study consists of septic patients in the United States, researchers should exercise caution when extrapolating our findings to other populations. (5) Because this study is a secondary analysis of a large critical care database, we were unable to include patients on the basis of the definitions and clear criteria for sepsis and septic shock used in actual clinical practice. As a result, some patients who met the definition or diagnostic criteria of sepsis may not have been included. However, because it is not possible to determine whether this misclassification is related to the exposure or outcome variable, its impact on our findings is unknown.

| CONCLUSION
Our results demonstrated that D(A-a)O2 has a nonlinear impact on the risk of 28-day death in patients with sepsis, suggesting that D(A-a)O2 is a reliable indicator for predicting prognosis in this population.

AUTHOR CONTRIBUTIONS
Ying Wang contributed to the study concept and design and drafted the manuscript. Lu Chen gained access to the database and was responsible for the data extraction. Yan He interpreted the data. Ying Liu, Jia Yuan, Hongying Bi, Qimin Chen, and Xianjun Chen helped with the data arrangement. Feng Shen contributed to the study concept and supervision and organized the final manuscript. All authors have read and approved the manuscript for publication.