Ingar Holme, Centre of Preventive Medicine, Department of Preventive Cardiology, Building 19, Oslo University Hospital, Kirkevn 166, 0407 Oslo, Norway. (fax: +47-221-19975; e-mail: firstname.lastname@example.org).
Abstract. Holme I, Fellström BC, Jardin AG, Schmieder RE, Zannad F, Holdaas H (Oslo University Hospital, Ullevål, Oslo, Norway; British Heart Foundation Glasgow Cardiovascular Research Centre, Glasgow, UK; University Hospital, Erlangen, Germany; Centre d'Investigation Clinique; Centre Hospitalier Universitaire, and Nancy Université, Nancy, France; and Oslo University Hospital, Oslo, Norway). Prognostic model for total mortality in patients with haemodialysis from the Assessments of Survival and Cardiovascular Events (AURORA) study. J Intern Med 2012; 271: 463–471.
Objectives. Risk factors of mortality in patients with haemodialysis (HD) have been identified in several studies, but few prognostic models have been developed with assessments of calibration and discrimination abilities. We used the database of the Assessment of Survival and Cardiovascular Events study to develop a prognostic model of mortality over 3–4 years.
Methods. Five factors (age, albumin, C-reactive protein, history of cardiovascular disease and diabetes) were selected from experience and forced into the regression equation. In a 67% random try-out sample of patients, no further factors amongst 24 candidates added significance (P <0.01) to mortality outcome as assessed by Cox regression modelling, and individual probabilities of death were estimated in the try-out and test samples. Calibration was explored by calculating the prognostic index with regression coefficients from the try-out sample to patients in the 33% test sample. Discrimination was assessed by receiver operating characteristic (ROC) areas.
Results. The strongest prognostic factor in the try-out sample was age, with small differences between the other four factors. Calibration in the test sample was good when the calculated number of deaths was multiplied by a constant of 1.33. The five-factor model discriminated reasonably well between deceased and surviving patients in both the try-out and test samples with an ROC area of about 0.73.
Conclusions. A model consisting of five factors can be used to estimate and stratify the probability of death for individuals. The model is most useful for long-term prognosis in an HD population with survival prospects of more than 1 year.
A number of risk factors for total mortality have been reported in patients with haemodialysis (HD), including elevated levels of high-sensitivity C-reactive protein (hsCRP), low diastolic blood pressure, increased left ventricular mass, short HD treatment time, reduced adequacy of HD (i.e. low Kt/V value), hypoalbuminaemia, hypercalcaemia/hyperphosphataemia and prolonged QT interval [1–7]. Reported studies have included significant risk factors with specific and adjusted estimates of mortality and morbidity reductions per unit of risk factor change, mostly with an aetiological purpose of analysis [2, 8]. There have been few studies with a purely prognostic aim to estimate the probability of death in patients with HD with appropriate validation; they have comprised patients from HD registers and have so far investigated only short-term prognosis [9–11].
Large randomized trials to evaluate the effects on mortality of new treatments in patients with HD have been performed for several treatment modalities including statins [12, 13]. One such randomized placebo-controlled trial is the Assessment of Survival and Cardiovascular Events (AURORA) study of 2776 patients with HD to investigate whether rosuvastatin 10 mg daily reduces the risk of cardiovascular disease (CVD) and mortality. No significant effects on any of the prespecified endpoints were demonstrated. In AURORA, a number of risk factors were measured according to standardized protocols with inclusion of a central biochemical laboratory.
Although patients with several high-risk co-morbidities and poor long-term prognosis were excluded from AURORA, we wanted to use this high-quality database to develop a prognostic model of total mortality including readily measurable risk factors. Our aim was to investigate an HD population at medium risk and with reasonably high probability of survival at 3 years, so that nephrologists would be able to calculate the long-term probability of dying for patients at a moderate level of risk.
Patients and methods
Study design and patient baseline characteristics and outcome of the AURORA trial have been described previously [12, 14, 15]. In short, 2776 patients (aged 50–80 years) with end-stage renal disease who had been treated with regular HD for a minimum of 3 months were randomly assigned to rosuvastatin 10 mg daily or matching placebo. Three patients were erroneously randomized, thus 2773 were included in the intention-to-treat population. In this cohort study, both treatment groups were combined, because mortality was similar in the two groups, and the aim was to develop a prognostic model of total mortality.
Vital status was ascertained for all patients, so the endpoint of death was classified without errors and with knowledge of the precise day of death. Risk factors were selected from patient characteristics, typical HD-related factors, lipids and lipoproteins, history of disease and drug therapy.
Differences in baseline risk factor levels or proportions between deceased and surviving patients were tested by t-test or chi-square statistics, as appropriate. Potential prognostic factors (n = 29), see Table 1, were selected on the basis of the authors’ clinical experience. The following variables were judged to be clinically important enough to be forced into the model: age, presence of CVD at baseline, diabetes, albumin and hsCRP. However, because patients in AURORA were subject to a number of inclusion and exclusion criteria, it was difficult to judge which further factors should be forced into the final prognostic model, and which could be eliminated by stepwise procedures. We therefore chose to let the remaining most important factors in this population be determined by a backward stepwise Cox regression likelihood ratio elimination procedure amongst the remaining 24 potential candidates. To avoid inclusion of too many falsely significant risk factors in the final model, P-values for inclusion and exclusion during the procedure were both required to be <0.01. Two of these factors, haematocrit and haemoglobin, were highly correlated (r = 0.98), and only haemoglobin was used as a candidate factor in the backward elimination procedure.
Two strategies were compared. First, we randomly selected a try-out sample of 67% of the AURORA population and ran the procedure described. Only the five factors forced into the model remained significant at P <0.01. Second, we ran the same elimination procedure within each treatment group. In the placebo group, only the original five factors remained significant (P <0.01), but in the rosuvastatin group, oxidized low-density lipoprotein (LDL) and haemoglobin remained significant over and above the five factors (both P =0.003). However, when a receiver operating characteristics (ROC) analysis was performed with and without these two extra factors as part of the prognostic index (i.e. the sum of products between the regression coefficients and centralized risk factor levels), the ROC area changed by only 0.003 units. It was, therefore, decided that for prognostic purposes only the original five factors would be used, and that the model should be developed on the basis of the 67% random sample. hsCRP was best modelled by using its natural logarithm. The model was then estimated in 1868 patients.
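The random division into try-out and test samples described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_tryout_test`, the sequential patient identifiers and the seed are assumptions; only the sample sizes (2773 patients in total, 1868 in the try-out sample) come from the text.

```python
import random

def split_tryout_test(patient_ids, n_tryout, seed=0):
    """Randomly partition patient IDs into a try-out and a test sample.

    The seed is an arbitrary assumption, used only to make the sketch
    reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_tryout], shuffled[n_tryout:]

# Hypothetical IDs 1..2773 stand in for the intention-to-treat population.
ids = range(1, 2774)
tryout, test = split_tryout_test(ids, n_tryout=1868)
print(len(tryout), len(test))  # 1868 905
```

The model is then fitted in the try-out sample only, and the test sample is kept aside for calibration and discrimination checks.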
Table 1. Potential prediction factors at baseline (mean, SD) by vital status at study end
Columns: Potential predictors at baseline; Dead (n = 880); Alive (n = 988).
Continuous variables (mean, SD): Ln hsCRP (10*mg/dL); Syst BP (mmHg); Diast BP (mmHg); Pulse P (mmHg); Years on HD treatment; Curr duration of HD at baseline (hrs).
Categorical variables (N, %): History of CVD; Curr HD treatment.
SD = standard deviation; BMI = body mass index; HDL-C = high density lipoprotein cholesterol; LDL-C = low density lipoprotein cholesterol; ApoB/apoA-1 = apolipoprotein B to apolipoprotein A-1 ratio; OX-LDL = oxidized LDL; Ln hsCRP = natural logarithm of high sensitivity C-reactive protein; Syst BP = systolic blood pressure; Diast BP = diastolic blood pressure; Pulse P = pulse pressure (Syst BP minus Diast BP); HD = hemodialysis; CVD = cardiovascular disease; ACE = angiotensin converting enzyme; B-blocker = beta blocker.
Additionally, an unrestricted approach was used, in which no variable was forced into the model, and treatment group was also added as a candidate factor. When the try-out sample was used, three more factors were included: LDL-cholesterol (inverse association), serum phosphate and body mass index (inverse association). However, ROC area was only increased from 0.730 to 0.744; such a small change is usually considered to be of minor clinical importance. For simplicity, we, therefore, decided to use the five-factor model described earlier for further evaluations.
A computer program (SPSS 18, PASW Statistics 18 for Windows, IBM Corporation, Somers, NY, USA) was used to calculate both the prognostic index and the probability of survival for each patient. In that way, the underlying probability of surviving could be calculated for a given time period in a patient, with no variation between patients. Further calculation details are given in the Appendix. Calibration of the final model was performed internally using the regression coefficients in the random 67% try-out sample, to find a centralized prognostic index (product sum of the five regression coefficients and risk factor levels from the try-out sample) in the 33% remaining test sample and then fitting the centralized prognostic index within the 33% sample itself. In this internal validation from two randomly selected subgroups from the same population, all relative statistical information about the probability of surviving is contained in the prognostic indices. If the relative calibration was good, a high squared correlation coefficient should emerge between these two scores. A correlation plot was made to illustrate the degree of correlation. A method of absolute calibration is to partition the estimated probabilities of death in the test sample (again using the try-out sample regression coefficients, when calculating the centralized prognostic index) into deciles and calculate the observed number of deaths in such groups. Deviations should be small if the calibration is good. However, because the model was fitted in patients other than those tested, the total number of deaths predicted could be different from the observed number. A calibration coefficient, calculated as the ratio of the total observed number of deaths to the total calculated number, was applied to each calculated number per decile, which was then compared with the observed number of deaths. The same procedure was applied to noncases. The total deviation in cases and noncases across deciles was determined by the Hosmer–Lemeshow test.
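The absolute calibration steps described above can be sketched as follows. This is an illustrative reading of the procedure under stated assumptions, not the SPSS analysis itself: the function name `decile_calibration` is hypothetical, and the data are synthetic, generated so that the model underpredicts deaths by about 25%.

```python
import random

def decile_calibration(probs, died, n_groups=10):
    """Scale calculated deaths by observed/calculated and compare per decile.

    Returns the calibration coefficient, per-decile rows of
    (decile, observed deaths, calibrated calculated deaths), and a
    Hosmer-Lemeshow-type chi-square summed over cases and noncases.
    """
    pairs = sorted(zip(probs, died))          # order patients by calculated risk
    coef = sum(died) / sum(probs)             # observed / calculated deaths
    n = len(pairs)
    rows, chi2 = [], 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        observed = sum(d for _, d in group)
        expected = coef * sum(p for p, _ in group)
        non_observed = len(group) - observed  # surviving patients (noncases)
        non_expected = len(group) - expected
        chi2 += (observed - expected) ** 2 / expected
        chi2 += (non_observed - non_expected) ** 2 / non_expected
        rows.append((g + 1, observed, expected))
    return coef, rows, chi2

# Synthetic data (assumption): true risk is the calculated risk divided by 0.75.
rng = random.Random(1)
probs = [0.05 + 0.55 * i / 299 for i in range(300)]
died = [1 if rng.random() < p / 0.75 else 0 for p in probs]
coef, rows, chi2 = decile_calibration(probs, died)
```

By construction, the calibrated calculated deaths sum to the observed total, so the chi-square reflects only how the deviations are distributed across deciles.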
The discrimination ability of the model was assessed using the traditional ROC area calculation. This measure determines the ratio of the probability of death in deceased individuals compared with survivors across different cut-off levels of the mortality prognostic index, calculated from the Cox regression model using the five factors in the try-out sample. The influence of a single factor on the discrimination was assessed by recalculating the ROC area, removing one factor at a time from the full model. A large drop in ROC area (usually regarded as more than 0.02 area units), as compared to that of all factors, would indicate greater clinical influence of this factor on prognostic ability.
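One common way to compute an ROC area is the rank-based (Mann–Whitney) form: the proportion of deceased–survivor pairs in which the deceased patient has the higher prognostic index, with ties counted as one half. The sketch below uses that form as an assumption; the text does not state which algorithm the software applied, and the function name `roc_area` and the toy values are hypothetical.

```python
def roc_area(index_values, died):
    """ROC area as the share of (deceased, survivor) pairs ranked correctly."""
    cases = [x for x, d in zip(index_values, died) if d]
    controls = [x for x, d in zip(index_values, died) if not d]
    wins = 0.0
    for c in cases:
        for s in controls:
            if c > s:
                wins += 1.0      # deceased patient has the higher index
            elif c == s:
                wins += 0.5      # ties count as half
    return wins / (len(cases) * len(controls))

# Toy prognostic indices for four hypothetical patients (1 = died).
print(roc_area([0.2, 0.5, 0.9, 1.4], [0, 1, 0, 1]))  # 0.75
```

Removing one factor from the prognostic index and recomputing this quantity gives the drop in ROC area attributed to that factor.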
As age had a dominating influence on the discrimination ability for mortality, a stratified backward stepwise Cox regression procedure was repeated for different age groups (<60, 60–69 and 70+) to determine whether the selection of factors was heterogeneous across the age categories.
Table 1 shows that there was a difference between deceased patients and survivors in the mean level of a number of potential predictors, including age, albumin, atherogenic lipids and lipoproteins, hsCRP, haemoglobin, haematocrit and history of diabetes (all P ≤0.001). Five variables – age, albumin, Ln-hsCRP, history of CVD and diabetes – were forced into the regression model. None of the 24 remaining risk factors had a P-value for inclusion <0.01 after the backward stepwise Cox regression analysis of time to death. Statistical information for the five variables is given in Table 2 for the try-out and test samples. There were slight variations in the regression coefficients between the two samples, but none was statistically significant, either by comparison of hazard ratio or ROC area (all P >0.10). Table 3 shows ROC areas calculated by different approaches. If the try-out sample prognostic index calculated with the five factors was used to discriminate death from survival in just that sample, the highest ROC area was achieved (0.730). It was almost as high in the test sample when the prognostic index estimated from just that sample was used (0.727). Even when the regression coefficients from the try-out sample were applied to calculate the prognostic index in the test sample, the ROC area was only slightly reduced (0.722). The influence of each factor, deleting one at a time in the model, is shown in Table 4. Age was the most influential, and the other four factors changed ROC area to a lesser degree. However, collectively, the remaining factors added much to the ROC area over and above age alone. For the three continuous factors, probabilities of death in the total AURORA population (n = 2773), calculated by deciles and adjusted for the other four factors, are shown in Fig. 1. Age was the most dominant of the three, whereas the other two were almost equally strongly associated with mortality.
The log-likelihood chi-square is a measure of total statistical information of risk. Table 4 shows that age alone, when deleted from the risk function consisting of all five factors, was responsible for about 25% of the total chi-square information value. Given the other factors, each of the other four factors was responsible for only about 30–40% of the independent information value of age. Collectively, however, the four factors added more information value than age alone, which as the only predictor contributed 44.1% of the total five-factor chi-square.
Table 2. Predictors of total mortality after backward likelihood ratio Cox regression modelling in a randomly selected 67% sample and estimates in the remaining 33% of the Assessments of Survival and Cardiovascular Events population. HR per SD for continuous variables*
Column groups: Random 67% sample (n = 880 deaths); Remaining 33% sample (n = 409 deaths). Columns per sample: Regression coefficient (SE).
SE, standard error; HR, hazard ratio; CI, confidence interval; Y/N, yes/no; CVD, cardiovascular disease.
Table 3. Ability of prognostic indices to discriminate dead from alive patients with five predictors selected from stepwise Cox regression analysis
Rows: Based on 67% dataset, applied to same set; As found in random 67% dataset, applied to 33% test sample; Same five predictors fitted to 33% dataset test sample, applied to the same sample.
Table 4. Discrimination ability of single factors on total mortality measured by ROC and log-likelihood information in the 67% random sample
Columns: Predictors, removed individually; ROC (95% CI); Log likelihood chi-square; Fraction of age contribution (%).
Rows: History of CVD (Y/N); Age as only predictor.
ROC, receiver operating characteristics (relation between proportion of predicted deaths amongst dead patients and survivors); CI, confidence interval; Ln-hsCRP, natural logarithm of high-sensitive C-reactive protein; CVD, cardiovascular disease; Y/N, yes/no.
Possible heterogeneity in prognostic ability across age groups was explored using regression coefficients with standard errors by age categories in the try-out sample (Table 5). Age was not entered into any of these prognostic indices. There were only two common factors across the age groups, diabetes and Ln-hsCRP. Phosphate and lipids were significant predictors in the two younger groups, whereas albumin was associated (inversely) with mortality in both the youngest and the oldest groups. Ln-hsCRP did not show any trend towards weakening of the relationship with mortality with increasing age. The discrimination ability was good in those below 70 years, but was only acceptable in those above 70 years of age.
Table 5. Regression coefficients of selected predictor variables in the try-out sample, after backward stepwise Cox regression analysis within each of the three age groups*
Internal calibration was carried out using the regression coefficients from the try-out sample to calculate a centralized prognostic index in the test sample of patients, which was then compared with the fitted prognostic index in the same sample. Figure 2 shows a correlation plot between these two scores with a high correlation (R² = 0.96) and no major deviations from the ideal regression line with slope of 1.00 (observed slope of 0.99), indicating a high degree of relative internal calibration. Calibration in absolute terms was further explored by dividing the estimated probability of death in the test sample into deciles using coefficients from the try-out sample. The calculated total number of deaths according to the model was 25% lower than the observed number, so the calculated numbers per decile were multiplied by 1.33. Table 6 shows the observed and calibrated number of deaths; the Hosmer–Lemeshow test indicated only small deviations per decile amongst cases and noncases in total (P >0.20).
Table 6. Calibration of calculated number of deaths from the model in the try-out sample to patients in the test sample, compared with observed number of deaths by deciles of calculated risk in the test sample
Columns: Decile of calculated probability of death; Calculated number of deaths; Observed number of deaths; Ratio of observed to calculated; Calibrated calculated number of deaths.
Deviations between observed and calibrated calculated number of deaths by Hosmer–Lemeshow chi-square test: χ2 = 10.4, P >0.20.
The calculated (using try-out sample coefficients) probabilities of death in the test sample were split into thirds with cut-off points at 0.25 and 0.40, which after calibration became 0.33 and 0.53. The proportions of deaths in these thirds (<0.33, 0.33–0.53 and >0.53) were 23.1%, 44.3% and 68.9%, respectively. Thus, the absolute risk was three times higher in the upper third than in the lower third. This indicates that a risk stratification of the probability of death at 0.53 could be used as a target for administering more intensive monitoring and care for patients with a 3-year mortality risk above that level.
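The arithmetic behind the calibrated cut-off points and the threefold risk gradient can be checked with a short sketch. The calibration constant (1.33), the raw cut-offs (0.25 and 0.40) and the observed death proportions are taken from the text; the function name `risk_group` and the example probabilities are hypothetical.

```python
CALIBRATION = 1.33            # calibration constant reported in the text
RAW_CUTOFFS = (0.25, 0.40)    # cut-offs on the uncalibrated probabilities

calibrated_cutoffs = tuple(round(c * CALIBRATION, 2) for c in RAW_CUTOFFS)
print(calibrated_cutoffs)     # (0.33, 0.53)

def risk_group(calculated_prob):
    """Assign a patient to a risk third from the uncalibrated probability."""
    p = calculated_prob * CALIBRATION
    if p < 0.33:
        return "lower"
    if p <= 0.53:
        return "middle"
    return "upper"

# Observed proportions of deaths in the lower and upper thirds (from the text).
print(round(68.9 / 23.1, 2))  # 2.98, i.e. about three times higher
print(risk_group(0.45))       # upper
```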
The proportionality assumption of the Cox model was tested in the try-out sample by adding, in separate time-dependent Cox regression models, an interaction term between study time and each variable, with the five variables retained as covariates. None showed a significant association (all P >0.10). Pairwise interactions between the five variables on mortality were also explored. Only one interaction, between age and diabetes, was found to be nominally significant (P =0.028). However, the ROC area only increased from 0.730 without the interaction term to 0.731 when it was included. It was, therefore, decided not to include this interaction term in the prognostic model.
The results of this study showed that there are relatively few potent single prognostic factors for total mortality in the AURORA HD population. Age was the strongest risk factor of mortality in the try-out sample. The remaining four factors were weaker predictors, but collectively added a large proportion of ROC area to age alone. A global ROC area of about 0.73 is regarded as good and was the same in both the try-out and test samples. If the calculated individual probabilities were stratified into thirds with cut-off points at 0.33 and 0.53 after calibration, the risk was about three times higher in the upper third than in the lower third. This indicates that more intensive monitoring and care could be needed in patients with a calculated probability of death within 3 years above 0.53.
Age was a dominant factor in the total try-out sample. However, after stratification by age, it did not enter into the prognostic index, probably because it had a more limited span of variation. The prognostic ability was good in the two lower-age strata, but only acceptable in the upper age group. Whereas factors such as phosphate, lipids and diabetes showed diminishing strength of association with mortality with increasing age, Ln-hsCRP was associated with mortality equally strongly in all age groups. Thus, low-grade inflammation may be affected differently by age, compared with the other factors.
Previous efforts to provide prognostic models have done so mostly from HD registry data. Different scoring systems have been developed, but the models are mostly for short-term prognosis, typically 6 months [9–11]. For long-term prognosis, when probability of survival for at least 1 year is substantial, more specific scores may be needed.
There are many risk factors for death in patients with HD. In registry data, such as the Dialysis Outcomes and Practice Patterns Study (DOPPS), a number of risk factors were common to those found in AURORA, such as age, history of CVD and diabetes. However, several factors of importance in DOPPS were not detected in AURORA, such as smoking and absence of hypertension. Other risk factors such as concomitant disease (cancer, lung disease) were not recorded in AURORA, because such patients were excluded owing to a small probability of long-term survival.
Similar risk factors to those in the AURORA population were also found in a number of randomized clinical trials. In the Fosinopril in Dialysis Study, using a broad cardiovascular endpoint, history of CVD, age, diabetes and hsCRP were common risk factors, whereas left ventricular mass was not computerized in AURORA. In addition, smoking, gender and high-density lipoprotein cholesterol were not significant risk factors in the AURORA population. In the Assessment of Lescol in Renal Transplantation (ALERT) study, age and history of coronary heart disease were found to be important risk factors for cardiac death. In none of these studies was there any attempt to build a prognostic model for estimation of patient survival probability for risk stratification.
A prognostic model is clinically useful. Being able to calculate individual probabilities of death by a simple formula is appealing for clinicians to summarize their patients’ prospects. By risk stratification, different strategies for patient monitoring, treatment and care can be implemented depending on the level of risk. Furthermore, how much the estimated probability of death is changed when manipulating a risk factor can be of clinical value and a motivational tool for patients. For future research, the prognostic index formula can be used to define appropriate inclusion criteria, so that a realistic risk level can be calculated and used as input to a sample size calculation. In addition, one or several risk factors may be changed by drug intervention, and the estimated absolute risk reduction can then be calculated by the same formula.
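How a change in one risk factor shifts the estimated probability of death can be sketched with the usual Cox relation, in which the probability of death is 1 − So(t)^exp(CPI). In the sketch below, only So(t) = 0.606 at 3 years comes from the Appendix; the coefficient, the patient's index and the size of the risk factor change are hypothetical placeholders, not values from Table 2.

```python
import math

S0_3Y = 0.606  # underlying 3-year survival probability (from the Appendix)

def death_probability(cpi, s0=S0_3Y):
    """Probability of death within the period for a centralized index."""
    return 1.0 - s0 ** math.exp(cpi)

def absolute_risk_reduction(cpi, beta, delta_x):
    """Change in death probability when one factor changes by delta_x."""
    before = death_probability(cpi)
    after = death_probability(cpi + beta * delta_x)
    return before - after

# HYPOTHETICAL inputs: a patient with CPI = 0.5, a coefficient of 0.3 per
# unit of the risk factor, and a one-unit reduction achieved by treatment.
arr = absolute_risk_reduction(cpi=0.5, beta=0.3, delta_x=-1.0)
print(round(arr, 3))
```

The same calculation, applied to an assumed average treatment effect, gives the expected event rates needed as input to a sample size calculation.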
The major strength of the AURORA trial is its size, providing narrow confidence limits of discrimination estimates. However, developing a prognostic model from a randomized population may be misleading, because inclusion and exclusion criteria are different from those of an HD registry. This may introduce selection bias, as compared with an unselected HD population. The yearly mortality rate in AURORA was 12%, and thus not very different from that found in the European section of the DOPPS registry (16%). However, AURORA patients were 4 years older and were more likely to have diabetes, but had a lower rate of CVD. In summary, a slight selection bias towards lower risk in AURORA patients may have occurred.
The prognostic model had no significant deviations from the proportional assumption of the Cox regression model, and a weak interaction was only found for one pair of factors; therefore, it is unlikely that an influential loss of discrimination occurred by keeping the simple additive model without stratifications.
The final model included a small number of readily available risk factors and thus is easy to apply in everyday practice. A computerized spreadsheet would be helpful for repeated calculations of the probability of death once the levels of the five risk factors have been entered into the clinical database, as demonstrated in Appendix Table A1. In this way, patients could be stratified according to a high or low level of risk and thus treated differently.
Currently, we do not have access to another validation sample with similar patient characteristics and length of follow-up, so the external validity of the model was not controlled formally. Nevertheless, the most important risk factors found in AURORA were also considered to be important risk factors in other HD trials [2, 13].
Age, history of CVD, hsCRP, diabetes and albumin were the most influential factors for mortality discrimination in AURORA. These factors gave an ROC area of about 0.73. Only age had a clinically marked influence on the ROC area, when removed from the model. Within age strata, other factors still produced good to acceptable discrimination properties of total mortality. The prognostic model given in Table 2 with additional information from Table A1 can be used to calculate individual survival probabilities for a 3-year period, and risk stratification at, for example, the upper third of the mortality risk (53%) could be used to initiate more intensive monitoring and care of patients in this risk group.
Conflict of interest statement
I.H. received steering committee fees from Pfizer, Roche and Novartis; B.C.F. received consulting fees from AstraZeneca, Novartis, Roche and Wyeth, lecture fees from AstraZeneca, Novartis and Roche, and grant support from Novartis, Roche, Merck-Schering-Plough and Wyeth, and served as a national coordinator for the Study of Heart and Renal Protection (SHARP) study at Oxford University Clinical Trial Service Unit; A.G.J. received consulting fees from Novartis and Astellas; R.E.S. received consulting and lecture fees from AstraZeneca, Novartis, Merck Sharp & Dohme and Pfizer; F.Z. received consulting fees from Servier and Medtronic; H.H. received consulting fees from Novartis, AstraZeneca and Schering-Plough and lecture fees from Novartis and AstraZeneca, and served as a national coordinator for the SHARP study.
This study could not have been performed without the sponsorship of AstraZeneca together with the assistance of the investigators, nurses and patients in the AURORA study.
Risk estimation from the Cox model
We have selected an 80-year-old patient in the database to demonstrate how to calculate 3-year mortality in the try-out sample. The individual levels of the five risk factors are given for this patient in Table A1. First, we calculate the centralized prognostic index: CPI = Σβi*(Xi−Mean), where βi is the regression coefficient from the try-out sample, Xi is the level of risk factor i for the patient and Mean is the corresponding average value in the same sample. The coefficients βi are taken from Table 2. The underlying probability of surviving for the given time period t, So(t), can be calculated as a constant, independent of the risk factor profile. In this database, So(t) = 0.606 at t = 3 years. The survival probability for any patient with a given risk factor set Xi is then given by the formula:
S(t) = So(t)^exp(CPI),
where exp is the exponential function, i.e. So(t) is raised to the power exp(CPI).
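The calculation can be sketched in a few lines, assuming the standard Cox relation in which the survival probability is So(t) raised to the power exp(CPI). Only So(t) = 0.606 is taken from the text; the coefficients, sample means and patient levels below are hypothetical placeholders and must be replaced by the values in Table 2 and Table A1.

```python
import math

S0_3Y = 0.606  # underlying 3-year survival probability in the try-out sample

def centralized_prognostic_index(coefs, levels, means):
    """CPI = sum of beta_i * (X_i - Mean_i) over the risk factors."""
    return sum(coefs[k] * (levels[k] - means[k]) for k in coefs)

def survival_probability(cpi, s0=S0_3Y):
    """Survival probability: s0 raised to the power exp(CPI)."""
    return s0 ** math.exp(cpi)

# HYPOTHETICAL coefficients, means and patient levels (not the Table 2 values).
coefs = {"age": 0.04, "albumin": -0.05, "ln_hscrp": 0.15, "cvd": 0.30, "diabetes": 0.35}
means = {"age": 64.0, "albumin": 40.0, "ln_hscrp": 1.6, "cvd": 0.4, "diabetes": 0.25}
patient = {"age": 80.0, "albumin": 38.0, "ln_hscrp": 2.0, "cvd": 1.0, "diabetes": 0.0}

cpi = centralized_prognostic_index(coefs, patient, means)
print(round(survival_probability(cpi), 3))
```

A patient whose risk factor levels all equal the sample means has CPI = 0 and therefore a survival probability equal to So(t).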
Table A1. Calculation example of survival probability for an 80-year-old patient in the try-out sample, who survived for 3 years (1098 days)
Columns: Risk factor level in patient; Regression coefficient from try-out sample.
Rows: History of CVD (0 vs. 1); Diabetes mellitus (0 vs. 1).
The centralized prognostic index for this patient is as follows: