Association between days alive without life support/out of hospital and health‐related quality of life

Trials in critically ill patients increasingly focus on days alive without life support (DAWOLS) or days alive out of hospital (DAOOH) and health‐related quality of life (HRQoL). DAWOLS and DAOOH convey more information than mortality and are simpler and faster to collect than HRQoL. However, whether these outcomes are associated with HRQoL is uncertain. We thus aimed to assess the associations between DAWOLS and DAOOH and long‐term HRQoL.

inadequate calibration. Moderate associations were found when including non-survivors, although predictions remained uncertain and calibration inadequate.
Conclusion: DAWOLS and DAOOH were poorly associated with HRQoL in adult survivors of severe or critical illness included in the COVID STEROID 2 and HOT-ICU trials.
KEYWORDS: critical care, days alive out of hospital, days alive without life support, health-related quality of life, outcome selection

Editorial Comment
More patient-centred outcomes are preferred in current ICU treatment trials, including days alive without life support or days alive out of hospital, as well as health-related quality of life. The results from two recent trials were assessed for associations between these outcomes. The numbers of days alive without life support or out of hospital were not associated with later quality-of-life scores among survivors in these two cohorts.

| INTRODUCTION
All-cause mortality is highly patient-important and frequently used as the primary outcome in randomised clinical trials (RCTs) in the critical care setting 1,2 despite several limitations, 1,3,4 including the need for larger samples than for non-dichotomous outcomes. 5 Consequently, critical care RCTs are often only powered to detect mortality differences substantially larger than what could be considered clinically important or plausible. 1,6,7 Furthermore, survivors of critical illness survive to very different health states, which are quantifiable using other outcomes such as health-related quality of life (HRQoL). HRQoL is highly patient-important and increasingly used in critical care RCTs as a secondary, long-term outcome. 2 However, HRQoL comes with limitations, including handling of non-survivors (focusing on survivors only may yield substantially misleading results 8 ) and loss to follow-up (which may be related to the actual HRQoL 9 ). Finally, long follow-up durations may be an additional disadvantage in emergency situations (e.g., pandemics) or if used to guide adaptation in adaptive trials. 10 Days alive without life support (DAWOLS) and days alive out of hospital (DAOOH) convey more information than mortality 5 and can be considered as composites of mortality and illness durations and severity, which may be hypothesised to be associated with long-term HRQoL. 11 These outcomes have increasingly been used as primary outcomes during the coronavirus disease 2019 (COVID-19) pandemic, 12,13 and similar outcomes have previously been validated in surgical patients where fewer days at home were found to be associated with an increased number of post-operative complications. 14 Although these outcomes also have limitations, 3,15,16 they are objective, easy to register and generally assessed after short-to-medium follow-up durations, all making them less prone to missing data compared with HRQoL.
In this study, we assessed the associations between DAWOLS and DAOOH and long-term HRQoL in two large, international RCTs in severely and critically ill adults.

| METHODS
This study was conducted according to a protocol and statistical analysis plan published prior to the conduct of the analysis and before HRQoL-follow-up was completed for one of the included trials. 11,17,18 This manuscript was prepared in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist 19 (Data S1).

| Population and data sources
We included severely and critically ill adults enrolled in two investigator-initiated, international RCTs: the COVID STEROID 2 and HOT-ICU trials. 20 Both trials were conducted in accordance with the Declaration of Helsinki with enrolment after informed consent by patients or their legal surrogates; additional details on consent procedures and approvals are available elsewhere. 12,20-22 No further approvals were required for this secondary study.
EQ VAS scores range from 0 to 100 (worst to best imaginable health states, respectively), while EQ-5D-5L index values are calculated using previously derived country-specific value sets 23 based on studies conducted by interviewing representative samples asked to 'weigh' different health states defined by different responses to the five individual domains. EQ-5D-5L index values are anchored at 1 (perfect health) and 0 (a health state considered as bad as being dead) with negative values corresponding to health states considered worse than death. 23 As recommended, 24

| Statistical analyses
Analyses were conducted separately in the two trial databases using R v. 4.1.0 (R Core Team, R Foundation for Statistical Computing, Vienna, Austria) with the tidyverse and mice packages.
Descriptive baseline and outcome data were calculated for all patients and stratified by survival/response status at HRQoL-follow-up. Numeric data were summarised using medians with interquartile ranges (IQRs), and categorical data were summarised using absolute and relative frequencies.
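The descriptive summaries above can be illustrated with a minimal Python sketch using only the standard library. The function names are hypothetical, and the quartile method (`inclusive`) is an assumption, as the source does not state how quartiles were computed.

```python
from collections import Counter
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, lower quartile, upper quartile), ignoring missing values (None)."""
    observed = [v for v in values if v is not None]
    # Assumption: quartiles by linear interpolation including both extremes.
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    return median(observed), q1, q3


def frequencies(categories):
    """Return {category: (absolute count, percentage of all entries)}."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: (n, 100 * n / total) for category, n in counts.items()}
```

For example, `median_iqr` could be applied to DAWOLS within each survival stratum, and `frequencies` to a categorical baseline variable.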

| Primary analyses
The primary analyses were conducted in patients known to be alive at HRQoL-follow-up only and assessed all combinations of DAWOLS or DAOOH (at both time points) and EQ VAS or EQ-5D-5L index values.
The associations were modelled using the best fitting (lowest root mean squared error [RMSE]) first- or second-degree fractional polynomial transformations 30 as described in detail in the protocol 11 and Data S1. We present the full model fits for the selected models; fit was evaluated using RMSEs and Spearman's non-parametric rank correlation coefficient with 95% confidence intervals (CIs) in the trial dataset used to fit each model. Models were assessed externally in the other trial dataset using RMSEs, calibration-in-the-large (mean prediction error; ideally 0, values >0 indicate systematic over-prediction, while values <0 indicate systematic under-prediction) and calibration slopes (systematic over-/underfitting; ideally 1, values <1 suggest too extreme predictions, while values >1 suggest too moderate predictions). 31 Finally, best fits from both datasets were visualised as curves with 95% confidence bands overlaid on scatterplots of each trial dataset.
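The selection of a fractional polynomial transformation by lowest RMSE can be sketched as follows. This is a minimal illustration, not the analysis code used in the study (which was written in R). It assumes the conventional power set, ordinary least squares fitting, and a shift of +1 so that 0 days can be log-transformed; none of these details are stated in this section, and the function names are hypothetical. Multiple imputation and confidence bands are not covered.

```python
import math
from itertools import combinations_with_replacement

# Assumption: the conventional fractional polynomial power set (0 means log).
POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def design_row(x, powers):
    """Intercept plus one term per power; a repeated power gains a log(x) factor."""
    row = [1.0]
    for i, p in enumerate(powers):
        term = math.log(x) if p == 0 else x ** p
        if i > 0 and p == powers[i - 1]:
            term *= math.log(x)
        row.append(term)
    return row


def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (m[r][n] - tail) / m[r][r]
    return beta


def fit(xs, ys, powers):
    """Ordinary least squares via the normal equations; returns (coefficients, RMSE)."""
    rows = [design_row(x, powers) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    preds = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    rmse = math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return beta, rmse


def best_fp(xs, ys, shift=1.0):
    """Try all first- and second-degree models; return (powers, coefficients, RMSE)."""
    shifted = [x + shift for x in xs]  # assumption: shift keeps x strictly positive
    candidates = [(p,) for p in POWERS]
    candidates += list(combinations_with_replacement(POWERS, 2))
    best = None
    for powers in candidates:
        beta, rmse = fit(shifted, ys, powers)
        if best is None or rmse < best[2]:
            best = (powers, beta, rmse)
    return best
```

Here `xs` would hold DAWOLS or DAOOH and `ys` the corresponding EQ VAS or EQ-5D-5L index values for one trial dataset.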

| Secondary analyses
The distributions of DAWOLS and DAOOH (at both time points) according to each level of each EQ-5D-5L dimension were assessed numerically (medians with IQRs) and graphically in both datasets.

| Sensitivity analyses
Three sets of pre-specified 11 sensitivity analyses were conducted for the primary analyses.
First, all patients were included, with EQ VAS and EQ-5D-5L index values set to 0 (the lowest possible value for EQ VAS and the value that corresponds to a health state as bad as being dead, respectively) in non-survivors at HRQoL-follow-up. Patients who died before DAWOLS-DAOOH follow-up were assigned 0 days (worst possible value) for these outcomes as previously recommended and frequently done in trials to make death the worst possible outcome in the analyses. 15,32 Of note, to focus on patients where the prediction of HRQoL based on DAWOLS or DAOOH is most difficult, we focused on survivors only in the primary analyses, as the assignment of specific, fixed values to non-survivors for HRQoL values was expected to lead to stronger associations as mortality also affects the DAWOLS-DAOOH outcomes. As we only focused on associations between outcomes in the complete trial populations (i.e., not separated by allocation), we consider this the most appropriate approach for the primary analyses in this study and supplemented it with the sensitivity analyses including non-survivors. This may be in contrast with the optimal analysis of treatment effects on HRQoL in trials conducted in populations with high mortality where treatments are hypothesised to affect mortality. In such cases, focusing on survivors only may be misleading, 8 as treatments that improve survival are likely to lead to more of the most ill patients surviving, who, in turn, are prone to have relatively low HRQoL, causing potentially beneficial treatments to appear inferior.
Second, all patients were included, with the actual DAWOLS-DAOOH values used (i.e., no penalisation of death).
Third, sensitivity analyses were conducted using EQ-5D-5L values calculated using the Danish value set 25 for all patients as most patients were included in Denmark.
Fourth, a post hoc sensitivity analysis was conducted for DAOOH after 28 days, excluding 402 HOT-ICU patients whose exact number of days were unobtainable from the available data; for the other analyses, values from these patients were multiply imputed and truncated to the possible range of values as described in Data S1.
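The handling of non-survivors in the first two sensitivity analyses can be sketched as follows. The `Patient` structure and field names are hypothetical, and the sketch assumes that HRQoL is set to 0 in non-survivors in both analyses, with only the penalisation of the day counts differing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    days: int                      # observed DAWOLS or DAOOH
    hrqol: Optional[float]         # EQ VAS or EQ-5D-5L index value; None if not available
    dead_at_days_followup: bool    # died before DAWOLS/DAOOH follow-up
    dead_at_hrqol_followup: bool   # died before HRQoL follow-up


def sensitivity_values(patient, penalise_days=True):
    """Return (days, hrqol) as analysed when non-survivors are included.

    penalise_days=True: deaths before DAWOLS/DAOOH follow-up count as 0 days.
    penalise_days=False: the actual day counts are kept (no penalisation of death).
    """
    if penalise_days and patient.dead_at_days_followup:
        days = 0
    else:
        days = patient.days
    # Non-survivors at HRQoL follow-up are assigned 0 for EQ VAS and the index value.
    hrqol = 0.0 if patient.dead_at_hrqol_followup else patient.hrqol
    return days, hrqol
```

Because death then drives both variables towards 0, stronger associations are expected in these analyses than in survivors only.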

| Sample size
The sample size was fixed to the relevant intention-to-treat populations from the COVID STEROID 2 and HOT-ICU trials, that is, 982 patients (615 survivors at HRQoL-follow-up) and 2910 patients (1476 survivors at HRQoL-follow-up), respectively. 12,17,18,20 Consequently, formal sample size calculation was forgone. 11

| Missing data
The proportion of missingness for all variables is presented; as patients with missing data for at least one outcome exceeded 5% in both trials, we used multiple imputation 33 with 25 imputed datasets for each trial as specified in the protocol 11 and detailed in Data S1.
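The 5% rule described above can be expressed as a short check. This sketch only covers the decision to impute; the imputation itself is detailed in Data S1 and is not reproduced here. The record layout and function names are hypothetical.

```python
def proportion_with_missing_outcome(records, outcomes):
    """Proportion of patients with a missing value (None) for at least one outcome."""
    incomplete = sum(
        any(record.get(outcome) is None for outcome in outcomes)
        for record in records
    )
    return incomplete / len(records)


def needs_multiple_imputation(records, outcomes, threshold=0.05):
    """True if the proportion of patients with any missing outcome exceeds the threshold."""
    return proportion_with_missing_outcome(records, outcomes) > threshold
```

With one dictionary per patient, `needs_multiple_imputation(records, ["eq_vas", "eq_index", "dawols_90"])` would return `True` whenever more than 5% of patients lack at least one of the listed outcomes.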

| RESULTS
Descriptive baseline and outcome data are presented in Table S1 for the COVID STEROID 2 trial and Table S2 for the HOT-ICU trial. In both trials, patients who died before HRQoL-follow-up were older, received more life support at baseline and had more co-morbidities than those alive; HRQoL respondents and non-respondents were similar at baseline and for DAWOLS and DAOOH outcomes, while imputed HRQoL values were somewhat lower in non-respondents than observed values in respondents in both trials.

| Associations between DAWOLS or DAOOH and HRQoL in survivors (primary analyses)
In total, 615 (62.6%) of the included patients in COVID STEROID

| Sensitivity analyses
The results from all sensitivity analyses are presented in Tables S7-S11.

The poor associations between DAWOLS or DAOOH and HRQoL may be explained by multiple factors. First, as HRQoL is generally measured much later than DAWOLS or DAOOH, 2 it may be affected by many external factors and events happening after ICU or hospital discharge. 34 Second, EQ-5D-5L domains and especially EQ VAS scores may be hypothesised to be somewhat volatile and expected to vary from day to day making the detection of associations more difficult due to the high variation. Third, long-term HRQoL in survivors of critical illness may be substantially affected by HRQoL prior to hospitalisation with the hospital admission contributing less to overall HRQoL as time from discharge increases. 35 The lack of convincingly strong associations between DAWOLS or DAOOH and HRQoL precludes using DAWOLS or DAOOH to predict HRQoL, and these shorter-term outcomes can, thus, not be considered reliable proxies of long-term HRQoL. Importantly, the lack of strong associa-

| Strengths and limitations
This study comes with several strengths. These include the prespecification and publication of the protocol 11 before HRQoL-follow-up was completed for the COVID STEROID 2 trial; the pre-defined sensitivity analyses including non-survivors using two different strategies and using the Danish value set in all patients; 11 the assessment of DAWOLS and DAOOH at two different but commonly used time points; and finally, the assessment of associations and external validation of predictions in two different yet comparable RCT populations, which increases external validity.
The study has limitations, too. First, as expected, 11 the amount of missing HRQoL data was non-negligible in both trial databases. This was handled using multiple imputation as specified in the protocol, 11 since HRQoL data are unlikely to be missing-completely-at-random. 38 We assumed that missing data were missing-at-random and that missing data could be reasonably predicted from the other available data; 11 inherently, the missing-at-random assumption cannot be verified. Importantly, even if the data are not truly missing-at-random, multiple imputation is still expected to decrease bias and loss of power compared to (and so preferred over) complete case analyses, which were not conducted. 11,39 In addition to the expected missing data for HRQoL, data were only partially complete for DAOOH at Day 28 for 13.8% in the HOT-ICU trial. Importantly, the results were similar in our primary and post hoc sensitivity analyses with different handling of partially complete data for this outcome. Second, we did not assess whether associations differed in the two intervention arms in each trial; this was not planned as we did not expect this to be the case and as we found similar HRQoL data for survivors in both intervention groups in both trials. 17,18 Third, the assessment of HRQoL at two different time points in the two trials is a limitation that may partially explain the systematic over-/under-prediction observed when assessed externally in the other dataset, although we consider the external validation valuable and the systematic over-/under-prediction to be more likely to be explained by population differences. Fourth, although country-specific EQ-5D-5L value sets were available for most included patients, this was not the case for all countries, and we had to use the Danish value set for some non-Danish patients. 11,40 Importantly, the results were similar when comparing the primary analyses using different value sets and the sensitivity analysis using the Danish value set for all patients.
Finally, DAWOLS or DAOOH may be more strongly associated with changes from baseline (i.e., before severe or critical illness) in HRQoL than with absolute long-term HRQoL values. However, baseline HRQoL data are generally not registered in trials conducted in severely or acutely ill adults, including the COVID STEROID 2 and HOT-ICU trials, and consequently, we were unable to assess such associations in this study.

FIGURE 4 Associations between DAOOH and HRQoL in survivors only in the HOT-ICU trial. Scatter plots with data from the HOT-ICU trial using the multiply imputed datasets (survivors only, using mean values across all imputations); darker points indicate more patients with identical values. DAOOH after 28 and 90 days are presented on the horizontal axes, while HRQoL values (EQ VAS or EQ-5D-5L index values) are presented on the vertical axes. The predicted values according to the best fitting fractional polynomial transformation models from both trials are presented with 95% confidence bands; predictions based on the best model from the HOT-ICU trial are presented in blue with full lines, while predictions based on the best model from the COVID STEROID 2 trial are presented in red with dashed lines (external validation). DAOOH28/90, days alive out of hospital after 28 or 90 days; EQ-5D-5L, EuroQol 5-dimension 5-level survey; EQ VAS, EuroQol visual analogue scale; HRQoL, health-related quality of life.

| CONCLUSIONS
We found limited or weak associations between DAWOLS or DAOOH and HRQoL in adult severely or critically ill patients included in the COVID STEROID 2 and HOT-ICU trials. There was substantial variability in outcomes, and prediction accuracy from the best fitted flexible models was poor both internally and externally in the alternate trial dataset, which also showed inadequate calibration.
Although moderately strong associations were found when including non-survivors, this seemed mostly driven by the assignment of the value 0 for HRQoL in these patients.

Note: Performance measures of the selected (the best-fitting) model for each association in each dataset for the primary analyses. Spearman's rank correlation coefficient is a non-parametric measure of the relationship between two variables ranging from −1 (one variable perfectly monotonically decreases as the other increases) through 0 (no monotonic relationship) to 1 (one variable perfectly monotonically increases as the other increases). The RMSE is a measure of the differences between values predicted by a statistical model and the observed values on the same scale as the dependent variable (EQ VAS or EQ-5D-5L index values here); RMSEs of 0 indicate perfect predictions, while increasing RMSEs indicate increased lack of fit, that is, that the model is increasingly worse at predicting the dependent variable (EQ VAS or EQ-5D-5L index values here) using the independent variable(s) (DAWOLS or DAOOH after 28 or 90 days here). RMSEs were assessed both in the trial dataset in which models were developed (internally) and in the other trial dataset (externally). The calibration-in-the-large was used to assess the model fit in the other trial dataset (externally) and corresponds to the mean prediction error; ideally, this value is 0, while values >0 and <0 indicate systematic over- and under-prediction, respectively, of the dependent variable (EQ VAS or EQ-5D-5L index values here). 31 Calibration slopes were similarly used to assess the model fit in the other trial dataset (externally) and measure systematic over- or underfitting of models, with values of 1 being ideal, while values <1 and >1 suggest too extreme or too moderate predictions, respectively. 31

Abbreviations: CI, confidence interval; DAOOH, days alive out of hospital; DAWOLS, days alive without life support; EQ-5D-5L, EuroQol 5-dimension 5-level survey; EQ VAS, EuroQol Visual Analogue Scale; RMSE, root mean squared error.
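The performance measures described in the note can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's implementation: calibration-in-the-large is taken as the mean of predicted minus observed values (so that values >0 indicate over-prediction), the calibration slope as the least squares slope from regressing observed on predicted values, and confidence intervals are omitted. All function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error on the scale of the dependent variable."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def calibration_in_the_large(observed, predicted):
    """Mean prediction error; >0 means systematic over-prediction."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def calibration_slope(observed, predicted):
    """Slope of observed regressed on predicted; <1 suggests too extreme predictions."""
    mean_o = sum(observed) / len(observed)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / var_p


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

For external validation, `predicted` would come from the model developed in one trial and `observed` from the HRQoL values in the other trial.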

Conception
and Morten Hylander Møller were involved in the conduct or analysis of the COVID STEROID 2 trial and/or the HOT-ICU trial.