Risk factors for incident heart failure in age‐ and sex‐specific strata: a population‐based cohort using linked electronic health records

Aims Several risk factors for incident heart failure (HF) have been previously identified; however, large electronic health records (EHR) datasets may provide the opportunity to examine the consistency of risk factors across different subgroups of the general population. Methods and results We used linked EHR data from 2000 to 2010 as part of the UK‐based CALIBER resource to select a cohort of 871 687 individuals 55 years or older and free of HF at baseline. The primary endpoint was the first record of HF from primary or secondary care. Cox proportional hazards analysis was used to estimate hazard ratios for associations between risk factors and incident HF, separately for men and women and by age category: 55–64, 65–74, and > 75 years. During 5.8 years of median follow‐up, a total of 47 987 incident HF cases were recorded. Age, social deprivation, smoking, sedentary lifestyle, diabetes, atrial fibrillation, chronic obstructive pulmonary disease, body mass index, haemoglobin, total white blood cell count and creatinine were associated with HF. Smoking, atrial fibrillation and diabetes showed stronger associations with incident HF in women compared to men. Conclusion We confirmed associations of several risk factors with HF in this large population‐based cohort across age and sex subgroups. Mainly modifiable risk factors and comorbidities are strongly associated with incident HF, highlighting the importance of preventive strategies targeting such risk factors for HF.


Introduction
Heart failure (HF) is one of the leading causes of morbidity and mortality and is one of the initial presentations of cardiovascular disease (CVD). 1 The lifetime risk in individuals aged 55 years and older is about one in five, and 5-year survival after first diagnosis ranges from 20% to 50%. 2-4 Furthermore, management of well-known risk factors could be partly responsible for a declining incidence of HF. 9 However, in a population where 'classic' risk factors such as hypertension are successfully treated, for example with blood pressure (BP)-lowering medication to decrease CVD risk, the balance between risk factors, which depends on age, sex and risk factor distribution, could have shifted, and less well-known risk factors could emerge.
Previous studies of risk factors for HF may lack the data richness or sheer volume needed for a thorough assessment of differences in the contribution of risk factors across different patient groups of interest (notably strata of age and sex). 10-12 Very large databases of electronic health records (EHR) may provide the opportunity to study risk factors among age- and sex-specific groups of patients in the general population.
In the current study, we used EHR to examine a large population-based cohort, with a highly heterogeneous HF phenotype, to identify risk factors for developing HF and to compare these risk factors between men and women across different age groups. 13

Study population
A cohort of 871 687 individuals was constructed from the CALIBER resource (CArdiovascular research using LInked Bespoke studies and Electronic health Records), which links four sources of EHR in England: primary care records from the Clinical Practice Research Datalink (CPRD), secondary care hospital discharges in Hospital Episodes Statistics (HES), disease registration in the Myocardial Ischaemia National Audit Project (MINAP) registry and the national death registration in the Office for National Statistics (ONS) registry. 13 Individuals were included if they were 55 years or older between 1 January 2000 and 25 March 2010 and had been registered with a general practitioner for at least 1 year, in a practice that had at least 1 year of up-to-standard data recording in CPRD. The cohort entry date (index date) was the latest of the dates on which these criteria were met.
We excluded individuals with a history of HF in CPRD, HES or MINAP before their index date. Individuals were censored at first diagnosis of HF, death, de-registration from a practice, last practice data collection, or at the study end date, whichever occurred first. The study flow diagram of participants can be found in the online supplementary Figure S1.
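The entry and exit rules described above can be sketched in code. This is a minimal illustration in Python, not the authors' implementation (the study used R): the function and argument names are hypothetical, and adding whole calendar years to represent "at least 1 year" is an assumption made for the example.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_START = date(2000, 1, 1)   # study window as stated in the text
STUDY_END = date(2010, 3, 25)


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; 29 February falls back to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def index_date(birth: date, registration: date, practice_up_to_standard: date) -> date:
    """Cohort entry: the latest date on which all inclusion criteria are met."""
    return max(
        STUDY_START,
        add_years(birth, 55),                   # aged 55 years or older
        add_years(registration, 1),             # registered for at least 1 year
        add_years(practice_up_to_standard, 1),  # 1 year of up-to-standard data
    )


def follow_up_end(
    hf: Optional[date],
    death: Optional[date],
    deregistration: Optional[date],
    last_collection: Optional[date],
) -> Tuple[date, bool]:
    """Return (end of follow-up, whether follow-up ended with an HF event)."""
    known = [d for d in (hf, death, deregistration, last_collection) if d is not None]
    end = min(known + [STUDY_END])  # whichever occurred first
    return end, hf is not None and hf <= end
```

An individual would enter the cohort only if the computed index date falls on or before the study end date and no HF record precedes it.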
Study approval was granted by the Independent Scientific Advisory Committee of the Medicines and Healthcare products Regulatory Agency (protocol 14_246) and the MINAP Academic Group.
Baseline risk factors were identified as the closest measurement to index date up to 3 years before and 1 year after index date. All determinants were recorded during consultations in CPRD or HES. Reported ethnicity was used to classify individuals as Caucasian, black, Asian, or other. Social deprivation was measured as quintiles of the index of multiple deprivation, a score calculated based on seven indices of deprivation: income, employment, health and disability, education, barriers to housing and services, crime, and living environment. 14 Furthermore, we classified hypertension as three systolic blood pressure (SBP) measurements >140 mmHg and/or use of BP-lowering medication, obesity as a body mass index (BMI) measurement >30 kg/m², and smoking status as never, ex- or current smoker. The patient's level of physical activity, as recorded in primary care, was classified as sedentary lifestyle or active lifestyle. Definitions of all risk factors can be found at https://www.caliberresearch.org/portal/.
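The baseline window and the threshold-based definitions above can be illustrated with a short Python sketch. It is not the authors' code: the function names are hypothetical, years are approximated as 365 days, and "three SBP measurements >140 mmHg" is read here as "at least three", which is an assumption.

```python
from datetime import date
from typing import List, Optional, Tuple


def baseline_value(
    measurements: List[Tuple[date, float]],
    index: date,
    days_before: int = 3 * 365,  # assumption: 3 years approximated as 1095 days
    days_after: int = 365,       # assumption: 1 year approximated as 365 days
) -> Optional[float]:
    """Return the measurement closest to the index date within the window."""
    in_window = [
        (abs((when - index).days), value)
        for when, value in measurements
        if -days_before <= (when - index).days <= days_after
    ]
    if not in_window:
        return None  # left missing; the study imputed such values
    return min(in_window, key=lambda pair: pair[0])[1]


def has_hypertension(sbp_values: List[float], on_bp_lowering_medication: bool) -> bool:
    """Three SBP readings >140 mmHg and/or BP-lowering medication."""
    return on_bp_lowering_medication or sum(v > 140 for v in sbp_values) >= 3


def is_obese(bmi: Optional[float]) -> bool:
    """Obesity defined as BMI >30 kg/m2."""
    return bmi is not None and bmi > 30
```

Returning `None` when no measurement falls in the window keeps missingness explicit, so that a later imputation step can handle it.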

Endpoints
The primary endpoint was incident HF and was based on the first record of HF from CPRD or HES. 4 Events in CPRD were defined by a diagnosis of HF or diagnosis of chronic left ventricular dysfunction on echocardiogram with READ codes, and in HES by a diagnosis of HF with ICD-10. The secondary endpoint was the first record of HF, excluding patients with a previous myocardial infarction (MI) event at baseline. READ and ICD-10 codes for HF and MI definitions can be found in the online supplementary Table S1.

Statistical analysis
Incidence rates of HF (per 1000 person-years of follow-up) were estimated by calendar time including 95% confidence intervals (CI), stratified by sex and age category: 55-64, 65-74 and > 75 years.
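As an illustration of the incidence rate calculation, the sketch below computes a rate per 1000 person-years with a 95% CI. The text does not state how the intervals were derived; the log-scale normal approximation for a Poisson count used here is an assumption, and the function name is hypothetical.

```python
import math
from typing import Tuple


def incidence_rate(
    events: int, person_years: float, per: float = 1000.0, z: float = 1.96
) -> Tuple[float, float, float]:
    """Rate per `per` person-years with an approximate 95% CI.

    Assumes events follow a Poisson distribution, so the standard error
    of log(rate) is 1 / sqrt(events).
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = events / person_years * per
    se_log = 1.0 / math.sqrt(events)
    return rate, rate * math.exp(-z * se_log), rate * math.exp(z * se_log)


# Illustrative numbers only, not taken from the study.
rate, lower, upper = incidence_rate(events=100, person_years=10_000)
print(f"{rate:.1f} per 1000 person-years (95% CI {lower:.1f}-{upper:.1f})")
```

In a stratified analysis the same function would be applied to the event count and person-time of each sex, age category and calendar year.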
Missing data in all baseline risk factors, except comorbidities and prescriptions, were imputed by multiple imputation using the mice (multivariate imputation by chained equations) algorithm in the statistical software package R. We stratified imputations by sex and age category and created 10 imputed datasets. Analyses were performed on the imputed datasets separately and results were pooled using Rubin's rules. Multivariable Cox proportional hazards analysis was used to estimate hazard ratios (HRs) for associations between baseline risk factors and incident HF, separately by sex and age categories for all baseline risk factors. The proportional hazards assumption was verified by assessment of the Schoenfeld residuals. For our secondary analysis, we repeated the above analysis in a subset of individuals without a history of MI. The Bonferroni correction was used to account for multiple testing. We tested for interaction with age categories (55-64, 65-74 and > 75 years) and sex for all associations presented.
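Pooling with Rubin's rules, as mentioned above, can be sketched as follows. This is an illustrative Python version, not the R code used in the study; the function names are hypothetical, and the confidence interval uses a normal quantile rather than the t-distribution reference that Rubin's rules prescribe, which is a simplification.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def pool_rubin(log_hrs: List[float], std_errors: List[float]) -> Tuple[float, float]:
    """Pool log hazard ratios from m imputed datasets with Rubin's rules.

    Returns the pooled log HR and its pooled standard error.
    """
    m = len(log_hrs)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates from at least two imputations")
    pooled = mean(log_hrs)                        # average of the m estimates
    within = mean([se ** 2 for se in std_errors]) # average within-imputation variance
    between = variance(log_hrs)                   # between-imputation variance (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


def hazard_ratio_ci(log_hr: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Back-transform a pooled log HR to the HR scale with an approximate 95% CI."""
    return math.exp(log_hr), math.exp(log_hr - z * se), math.exp(log_hr + z * se)
```

Pooling is done on the log scale because Cox model coefficients, not the hazard ratios themselves, are approximately normally distributed.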
We estimated the population attributable risk (PAR) for incident HF of the following risk factors: social deprivation, smoking, sedentary lifestyle, obesity and diabetes. To assess the impact of these risk factors, we estimated the PAR (95% CI) with the standard formula: PAR = [P(F)*(HR-1)]/[1 + P(F)*(HR-1)], where P(F) is the prevalence of the risk factor in the population and HR is the hazard ratio for incident HF associated with that risk factor. 15 In sensitivity analyses, we compared the results after multiple imputation to those based on a complete case analysis and to a subset of individuals not using BP-lowering medication at baseline. Furthermore, we compared inter-practice/hospital variation in a frailty Cox proportional hazards model where practice is a random effects variable and we compared associations of risk factors for incident HF stratified by endpoints from different sources of EHR (CPRD and HES).
All analyses were performed using R version 3.2.3.

Table footnotes: BP, blood pressure; COPD, chronic obstructive pulmonary disease; DBP, diastolic blood pressure; eGFR, estimated glomerular filtration rate; HDL, high-density lipoprotein; HF, heart failure; IQR, interquartile range; LDL, low-density lipoprotein; SBP, systolic blood pressure; SD, standard deviation; WBC, white blood cell. a Assessed by index of multiple deprivation. b Measurement closest to and within 3 years before baseline. c Denotes prior medical history of given comorbidity 3 years before baseline, prescription use 3 years before baseline.
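The PAR formula given above translates directly into code. The following Python sketch is for illustration only: the function name is hypothetical and the example prevalence and HR are made-up values, not results from the study.

```python
def population_attributable_risk(prevalence: float, hazard_ratio: float) -> float:
    """PAR = [P(F) * (HR - 1)] / [1 + P(F) * (HR - 1)].

    prevalence   -- P(F), proportion of the population with the risk factor (0 to 1)
    hazard_ratio -- HR for incident HF associated with the risk factor
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if hazard_ratio <= 0.0:
        raise ValueError("hazard_ratio must be positive")
    excess = prevalence * (hazard_ratio - 1.0)
    return excess / (1.0 + excess)


# Illustrative values only: a risk factor present in 20% of the population
# with HR = 1.5 gives 0.1 / 1.1, roughly 9% of cases attributable to it.
par = population_attributable_risk(prevalence=0.20, hazard_ratio=1.5)
print(f"PAR = {par:.1%}")
```

A hazard ratio of 1 yields a PAR of zero, and for a fixed HR above 1 the PAR rises with prevalence, which is why common risk factors with modest HRs can still account for many cases.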

Baseline characteristics
The study cohort included 871 687 individuals aged 55 years or older of whom 47 987 (5.5%) individuals developed incident HF during a median follow-up of 5.8 years [interquartile range (IQR) 2.7-9.9], with a median time to event of 3.7 years [IQR 1.8-6.4].
A Kaplan-Meier time-to-event plot for incident HF can be found in the online supplementary Figure S2. Baseline characteristics are presented separately for men ( Table 1) and women (

Incidence rates
Incidence rates of HF events per 1000 person-years varied between sexes and age categories. Overall, incidence rates in men were higher than in women (Figure 1). Incidence rates were stable over calendar time for men and women aged 55-64 years with a mean incidence rate per 1000 person-years of 3.6 and 1.9, respectively; these incidence rates increased with older age to an average of 13.6 for men and 9.2 for women at age 65-74 years. The highest incidence rate per 1000 person-years was observed for the age category >75 years with a mean incidence rate per 1000 person-years of 34.4 for men and 28.0 for women.

Risk factors for incident heart failure
Results from the multivariable Cox proportional hazards models show that diabetes, atrial fibrillation (AF) and chronic obstructive pulmonary disease (COPD) had the strongest associations with incident HF in men and even stronger associations with HF in women in all age categories, with associations attenuating with older age (P-value for interaction with age < 0.05). In men, we found associations with HF for age, lowest quintile of social deprivation, BMI, haemoglobin, total WBC count and creatinine in all age categories (Figure 2). In women, haemoglobin, total WBC count and creatinine were likewise associated with incident HF in all age categories. However, compared to men, women showed stronger associations of creatinine, diabetes, AF and COPD; these were associated with incident HF in all age categories (P-value for interaction with sex < 0.05) (Figure 2). Similar to men, associations of social deprivation, smoking, BP and diabetes attenuated in older women (P-value for interaction with age < 0.05).
We found no associations with incident HF in either men or women for platelets, total plasma cholesterol, triglycerides, or albumin (Figure 2). We found an association for SBP (per 20 mmHg) for the youngest age category in women (HR 1.11, 95% CI 1.05-1.18), but not for men. Furthermore, SBP was inversely associated in the oldest age category for both sexes (HR 0.96, 95% CI 0.94-0.99 and HR 0.97, 95% CI 0.96-1.00, respectively). Diastolic blood pressure (DBP, per 10 mmHg) was inversely associated with incident HF in the two younger age categories, whereas no association in the oldest age group was observed (Figure 2). Overall results from the multivariable Cox proportional hazards model, men and women and all ages combined, are shown in the online supplementary Figure S3. When patients with and without a history of MI were analysed, similar HRs were found for the associations between risk factors and incident HF in men and women (online supplementary Figures S4 and S5), with a trend towards a positive association of total cholesterol with HF, though not significant.
When we added history of MI to the main model, it did not change the observed associations of other risk factors (data not shown). When we compared individuals using BP-lowering medication with those who were not, we observed an attenuation of most associations in individuals not prescribed BP-lowering medication, except for SBP and diabetes; the associations of these risk factors with incident HF became stronger in all age categories for both men and women (online supplementary Figures S6 and S7).

Figure 1 Incidence rate/1000 person-years with 95% confidence interval (band), table with absolute number of cases stratified by age category and sex.

Relative contribution of modifiable risk factors and comorbidities
In men, the largest proportion of HF cases could be prevented if COPD, AF and hypertension did not occur in the population. Relative contributions of risk factors to incident HF appeared to be stronger in women compared to men. In women, COPD and AF, but not hypertension, accounted for the largest proportion of potentially preventable HF cases. As in men, obesity and diabetes accounted for a smaller proportion of potentially preventable HF cases (Table 4). In both men and women the relative contributions attenuated with older age, whereas the relative contribution of sedentary lifestyle remained similar across age categories.

Sensitivity analysis
Patient characteristics were similar between imputed data and complete case data for men and women (online supplementary Tables S2 and S3). Sensitivity analysis showed that a complete case analysis yielded similar directions of associations for risk factors with incident HF in both men and women (online supplementary Tables S4 and S5); however, associations were attenuated in the imputed data analysis. General practice variability had no effect on the overall associations in men and women (online supplementary Tables S6 and S7), since the random effects models resulted in near identical estimates to our main analysis. Lastly, analyses stratified by different sources of EHR showed that the associations of social deprivation, current smoking and diabetes with incident HF were stronger in HES cases compared to CPRD, whereas the association of AF was stronger in younger (55-64 years) men and women in CPRD compared to HES (online supplementary Tables S8 and S9).
Overall, the analyses were comparable with our main analysis.

Discussion
In this large population-based cohort study using linked EHRs, we investigated the association of risk factors with the development of HF. We found independent associations of diabetes, AF, COPD, age, social deprivation, modifiable lifestyle factors and inflammatory markers, but not SBP, with incident HF, in a population using BP-lowering and lipid-regulating medication. In England, we found higher incidence rates for men and the elderly (>75 years), which were stable in the period 2000-2005 but increased from 2006 onwards for all categories. Some previous studies reporting sex- and/or age-specific incidence rates of HF indicate that the incidence of HF is stable over time, whereas others suggest it might be increasing or even decreasing. 16-21 These differences might reflect varying follow-up time, diverse patient populations, diversity in quality of data, lack of distinction of incidence rates based on both age and sex, and regional or cultural differences underlying these incidence rates.

Risk factors for incident heart failure
We confirmed several associations of risk factors with HF, such as diabetes, BMI, and smoking. Our study supports and contributes to previous studies in CALIBER, 21-25 which have shown associations of these risk factors with a range of CVDs. We observed similar patterns of association between men and women as well as attenuation of the associations of risk factors with HF at older age. Compared to men, women showed stronger associations of modifiable lifestyle factors, such as smoking, a sedentary lifestyle and diabetes, with incident HF. This could reflect a different aetiology between men and women in the development of HF. We found no, or weak, independent association between SBP and incident HF in our multivariable analyses. This contrasts with papers reporting on the association of SBP with incident HF. 5,6 However, similar associations between SBP and incident HF, as previously reported, 6 could be reproduced by excluding individuals using BP-lowering medication in our analyses. This reinforces the importance of treating high BP accordingly.
Our results show that in a population with high prescription rates of BP-lowering medication, smaller independent associations of other risk factors become more evident. For example, we found levels of total WBC count independently associated with HF; this could indicate an underlying inflammatory process leading to HF. 26,27 Inflammation could be triggered by comorbidities such as diabetes or obesity or via endothelial dysfunction and atherosclerosis from an underlying heart disease; however, it remains to be investigated how exactly inflammation and HF interact. Similar results have recently been reported for other CVDs. 24 Additionally, we found an association of creatinine and an inverse association of haemoglobin with incident HF. Low haemoglobin, or anaemia, and raised creatinine levels are frequently observed among HF patients and are associated with worse outcomes and increased mortality. 28,29 Lastly, our results show an inverse association of DBP with incident HF. This is likely due to reverse causality induced by the relatively old age of our study population [median age 61.5 years (IQR 55-71.9)]; it is known that DBP is lower in the elderly and is associated with worse survival. 30,31
Given the substantial prevalence of modifiable risk factors and comorbidities, such as COPD, AF, obesity, a sedentary lifestyle and smoking, our results suggest that preventive strategies could be an opportunity to reduce the risk of incident HF. Previous research has already shown that adherence to a healthy lifestyle reduces the lifetime risk of HF. 32,33 Future studies should verify these results in population-based studies, and focus should be directed to implementing effective preventive strategies in clinical practice.

Study strengths and limitations
Strengths of this study are the linkage of multiple EHR sources, which allowed for the collection of a large representative sample of 871 687 individuals across England and the study of a large population of HF patients. Previous studies have shown the feasibility and validity of routinely collected data in CPRD and HES. 34,35 However, several limitations of this study should be considered when interpreting these findings. First, due to the nature of EHR, the accuracy and amount of detailed information recorded are limited, though findings based on the multiply imputed datasets showed a similar direction of association compared to complete case analysis. Residual confounding may still exist. Second, we were unable to differentiate between HF phenotypes, since there was no access to detailed echocardiography estimates to assess systolic function. This is likely to conceal a greater degree of heterogeneity. Third, all measurements are prone to measurement error and/or misclassification. To define HF, we used data from two different EHR sources, each having its own measurement error. Yet, associations were similar between CPRD and HES cases in our sensitivity analysis, and others have provided evidence of the validity of using linked EHRs. 4,36

Conclusions
In this large population-based cohort study using linked EHRs in England, we observed that diabetes, AF, COPD, age, social deprivation, modifiable lifestyle factors such as smoking, sedentary lifestyle, BMI, and physiological measures such as haemoglobin, total WBC count and creatinine were associated with incident HF across age- and sex-specific groups. Mainly modifiable risk factors and comorbidities are of interest, considering a substantial PAR. This highlights the importance of preventive strategies targeting modifiable lifestyle risk factors for HF, besides BP management.

Supplementary Information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Figure S1. Flowchart of the study population.
Figure S2. Kaplan-Meier time-to-event for incident heart failure.
Figure S3. Risk factors associated with incident heart failure.
Figure S4. Risk factors associated with incident heart failure in men stratified by age and prior myocardial infarction.
Figure S5. Risk factors associated with incident heart failure in women stratified by age and prior myocardial infarction.
Figure S6. Risk factors associated with incident heart failure in men stratified by age and blood pressure lowering medication.
Figure S7. Risk factors associated with incident heart failure in women stratified by age and blood pressure lowering medication.
Table S1. Overview of READ and ICD-10 codes used to identify heart failure and myocardial infarction in CPRD and HES data sources.
Table S2. Complete case baseline characteristics stratified by age in men.
Table S3. Complete case baseline characteristics stratified by age in women.
Table S4. Complete case analysis for risk factors associated with incident heart failure stratified by age in men.
Table S5. Complete case analysis for risk factors associated with incident heart failure stratified by age in women.
Table S6. Evaluation of heterogeneity at practice level for the association of risk factors with heart failure stratified by age in men.
Table S7. Evaluation of heterogeneity at practice level for the association of risk factors with heart failure stratified by age in women.
Table S8. Associations of risk factors with incident heart failure stratified by age and endpoints from different sources of EHR in men.
Table S9. Associations of risk factors with incident heart failure stratified by age and endpoints from different sources of EHR in women.