Objective: The shape of the association between BMI (kg/m2) and mortality has important methodological implications as it partially determines the optimal form for operationalizing BMI for use in analyses. We examined various BMI operationalizations in relation to mortality from all causes and specific causes.
Methods and Procedures: A clinical examination with measurements of height and weight was conducted at baseline (1967–1970) for 18,860 working men aged 40–69 in the total cohort and 7,865 men in the healthy subcohort, that is, those who had no unexplained weight loss, no cardiovascular disease (CVD) or respiratory disease, were nonsmokers and did not die during the first 5 years of follow-up (the original Whitehall study). A mean follow-up of 35 years for mortality gave rise to 13,498 deaths, of which 4,766 were in the healthy subcohort.
Results: There was a dose-response relation between BMI and CVD and coronary heart disease (CHD) mortality in the total cohort and healthy subcohort, with an increasingly steep slope at the high end of the BMI distribution. For noncardiovascular, cancer, and respiratory mortality, an excess risk was also associated with a BMI <18.5; in the healthy subcohort, this was true only for respiratory deaths. The association between BMI and all-cause mortality was J-shaped in the total cohort and healthy subgroup and even after excluding underweight participants.
Discussion: For associations with all-cause and cause-specific mortality, a linear and quadratic term in combination provided a more parsimonious BMI operationalization than the WHO definition, obese-nonobese dichotomy or BMI treated as a continuous linear variable.
High BMI (kg/m2) is associated with many diseases that contribute to premature death, such as coronary heart disease (CHD), stroke, diabetes, and certain types of cancer (1,2,3,4,5,6,7,8,9,10,11). However, a low BMI is also associated with an increased risk of mortality (12,13,14). This is because some preexisting illnesses and smoking are related to reduced BMI and increased mortality over time (11,12,14,15,16,17). In addition, underweight might have its own effect on mortality and/or contribute to worse prognosis of diseases. Such multiple and partly opposing influences are likely to affect the shape of the BMI-mortality association and these effects may vary between causes of death.
Many different operationalizations of BMI are used in epidemiological studies (7,8,11,12,13,14,17,18,19,20,21,22,23,24,25,26,27,28) and the extent to which each of them accurately captures the shape of the BMI-mortality association has not been well demonstrated. Operationalizations of BMI include the WHO classification that divides the distribution of BMI into categories such as “underweight” (BMI <18.5 kg/m2), “normal” (18.5–24.9 kg/m2), “overweight” (25.0–29.9 kg/m2) and “obese” (≥30.0 kg/m2) (29); the simple dichotomy obese vs. nonobese; BMI treated as a continuous variable (a linear term); and BMI fitted both as linear and quadratic (i.e., BMI × BMI) terms.
We examined BMI in relation to mortality from all causes and specific causes in the original Whitehall study, which is based on an occupational cohort of British men followed up over 35 years. An important feature of this study is the exceptionally large number of deaths, which allows a detailed examination of associations with mortality in the total sample and in a healthy, nonsmoking subcohort. To determine which BMI measure provides the most parsimonious fit, we compared the degree of interindividual variability in mortality accounted for by various operationalizations of BMI.
Methods and Procedures
Data were collected on 19,019 nonindustrial London-based male government employees aged 40 to 69 years at screening between September 1967 and January 1970 (response rate 74%). Screening involved the completion of a study questionnaire and participation in a medical examination (30). Mortality records to 2005 were successfully flagged for 18,863 men (99.2%). Three men had missing BMI measurements. The remaining 18,860 men form the total study population in subsequent analyses. To obtain a healthy subcohort, we excluded men who reported recent unexplained weight loss (n = 417), were hospitalized for cardiovascular disease (CVD) (n = 767), were doctor-diagnosed for hypertension or heart disease (n = 785), had ischemia (n = 2,947), intermittent claudication (n = 339), were dyspneic (n = 1,072), were bronchitic (n = 552), were diabetic (n = 248), were smokers (n = 7,853), or died within the first 5 years of follow-up when deaths are more likely to reflect preexisting disease (n = 757). We also excluded those in the “other” employment grade category (n = 1,789), as there is a high prevalence of people with disability in this group (30). This left a healthy subcohort of 7,865 men.
Assessment of BMI
Height was measured with the participant wearing shoes and standing with his back to a measuring rod; readings were taken to the nearest ½ inch (∼12.7 mm) below (10). Weight was recorded with the participant wearing shoes but with jacket removed; readings were taken to the nearest ½ lb (227 g) (10). Following conversion to metric units, BMI was computed as weight (kg) divided by height squared (m2).
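The conversion and BMI calculation described above can be sketched as follows. This is a minimal illustration rather than the study's own procedure: the function name and the example measurements are hypothetical, and no correction for shoes is applied because the text describes none.

```python
LB_TO_KG = 0.45359237  # exact definition of the avoirdupois pound in kilograms
INCH_TO_M = 0.0254     # exact definition of the inch in metres


def bmi_from_imperial(weight_lb: float, height_in: float) -> float:
    """Return BMI (kg/m^2) from weight in pounds and height in inches."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * INCH_TO_M
    return weight_kg / height_m ** 2


# Hypothetical participant (not study data): 168.5 lb and 69.5 in,
# i.e. readings on the half-pound and half-inch grid described in the text.
print(round(bmi_from_imperial(168.5, 69.5), 1))
```

Because height was read to the half inch below the true value, BMI computed this way would tend to be slightly overestimated; the sketch does not attempt to adjust for this.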
Mortality data were obtained from the National Health Services mortality register for all participants who died between study entry and 30 September 2005, using the National Health Services identification number assigned to each UK citizen. Among the 13,498 men who died, 83.8% of death certificates were coded according to the eighth revision of the International Classification of Diseases (ICD), 6.2% according to the ninth revision and 10.0% according to the tenth revision. Five disease categories were utilized: deaths due to CVD (ICD-8/9: 390–458; ICD-10: I00–I99), CHD (ICD-8/9: 410–414; ICD-10: I20–I25), noncardiovascular causes (i.e., remaining deaths with specified cause), cancer (ICD-8: 140–208; ICD-9: 140–209; ICD-10: C00–C97), and respiratory causes (ICD-8/9: 460–519; ICD-10: J00–J99).
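The code ranges listed above can be turned into a small classifier. The sketch below is illustrative only: the function name and category labels are assumptions, it expects purely numeric ICD-8/9 rubrics (no E or V codes), and it treats every supplied code as a specified cause. Because CHD is a subset of CVD, and cancer and respiratory deaths are subsets of noncardiovascular deaths, it returns a set of categories rather than a single label.

```python
def classify_death(code: str, revision: int) -> set:
    """Map an ICD code to the overlapping outcome categories in the text."""
    code = code.strip().upper()
    if revision in (8, 9):
        rubric = int(code.split(".")[0])  # three-digit rubric, e.g. "410.9" -> 410
        is_cvd = 390 <= rubric <= 458
        is_chd = 410 <= rubric <= 414
        cancer_upper = 208 if revision == 8 else 209
        is_cancer = 140 <= rubric <= cancer_upper
        is_resp = 460 <= rubric <= 519
    elif revision == 10:
        chapter, number = code[0], int(code[1:3])  # e.g. "I21.0" -> ("I", 21)
        is_cvd = chapter == "I"
        is_chd = chapter == "I" and 20 <= number <= 25
        is_cancer = chapter == "C" and number <= 97
        is_resp = chapter == "J"
    else:
        raise ValueError("revision must be 8, 9 or 10")

    categories = set()
    if is_cvd:
        categories.add("CVD")
        if is_chd:
            categories.add("CHD")
    else:
        # Assumption: any code passed in is a specified cause of death.
        categories.add("non-CVD")
        if is_cancer:
            categories.add("cancer")
        if is_resp:
            categories.add("respiratory")
    return categories


print(sorted(classify_death("410", 8)))
print(sorted(classify_death("J44", 10)))
```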
Mortality rates were calculated per 1,000 person-years at risk and were standardized for age at entry using 5-year age groups with the total Whitehall population as the standard. The associations between BMI and each mortality outcome were studied using Cox proportional hazards regression models with follow-up period as the time scale. For the total cohort, the follow-up period started at baseline and participants were censored at the time of death or the time of loss to follow-up or at the end of September 2005. The proportional hazards assumption was tested by fitting interaction terms between BMI and the logarithm of follow-up period. All-cause and non-CVD mortality outcomes showed significant interaction effects due to stronger negative BMI-mortality associations in the first 5 years of follow-up. On exclusion of the first 5 years of follow-up, the proportional hazards assumption was not violated. Analyses for the total cohort were therefore repeated excluding the first 5 years of follow-up and showed similar results to those presented using the total follow-up. All analyses for the healthy subcohort also started 5 years after baseline.
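As an illustration of the age-standardized rates mentioned above, the following sketch applies direct standardization, which is assumed here because the text names a standard population but not the exact method. The function name, the age-group labels and all numbers are hypothetical.

```python
def age_standardized_rate(deaths, person_years, standard_pop):
    """Directly standardized mortality rate per 1,000 person-years.

    Each argument is a dict keyed by age group (e.g. 5-year bands at entry).
    Stratum-specific rates are weighted by the standard population's share
    of each age group.
    """
    total_standard = sum(standard_pop.values())
    if total_standard <= 0:
        raise ValueError("standard population must be positive")
    rate = 0.0
    for group, standard_n in standard_pop.items():
        if person_years[group] <= 0:
            raise ValueError(f"no person-years at risk in group {group}")
        stratum_rate = deaths[group] / person_years[group]
        rate += stratum_rate * standard_n / total_standard
    return 1000.0 * rate


# Hypothetical two-stratum example (not study data).
deaths = {"40-44": 10, "45-49": 20}
person_years = {"40-44": 1000.0, "45-49": 1000.0}
standard_pop = {"40-44": 3000, "45-49": 1000}
print(age_standardized_rate(deaths, person_years, standard_pop))
```

In this toy example the younger stratum carries three quarters of the weight, so the standardized rate sits closer to its rate of 10 per 1,000 than to the older stratum's 20 per 1,000.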
Different forms of operationalizing BMI were tested for parsimony: BMI dichotomized as obese (≥30 kg/m2) vs. nonobese (BMI < 30 kg/m2) (form A); BMI categories based on the WHO classification (<18.5, 18.5–24.9, 25.0–29.9, and ≥30 kg/m2) (form B); a linear term for each BMI integer category (formed by rounding down all BMI values to the nearest integer with <18.5 and 18.5–18.9 as the lowest two categories and ≥33 as the highest) (form C); a linear term (as in C) and a quadratic (i.e., C × C) term (form D); and separate BMI categories for each BMI integer category as in C (form E). For forms C, D, and E, the rounding of the BMI values prior to fitting linear and quadratic terms was only done to ensure that all models (A-E) were hierarchical and could be compared. However, a sensitivity analysis with BMI treated as a continuous variable without rounding to integers (as would be the usual situation) replicated the findings from forms C and D. Note that form E cannot be tested without rounding because of the extensive number of categories.
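The five codings can be made concrete with a short sketch. The function names are hypothetical, and one detail is an assumption: the text does not say which numeric scores were attached to the two lowest categories for the linear term, so the sketch uses 17 for BMI <18.5 and 18 for 18.5–18.9, with 33 for all values of 33 or more.

```python
import math


def integer_score(bmi: float) -> int:
    """Score for forms C-E: BMI rounded down, with special end categories.

    Assumption: scores 17 (<18.5) and 18 (18.5-18.9) for the lowest two
    categories; the source does not state the values actually used.
    """
    if bmi < 18.5:
        return 17
    if bmi < 19.0:
        return 18
    return min(math.floor(bmi), 33)  # 19, 20, ..., 32, then 33 for >=33


def who_category(bmi: float) -> str:
    """WHO classification used in form B."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def operationalize(bmi: float) -> dict:
    """Return the coding of one BMI value under forms A-E."""
    score = integer_score(bmi)
    return {
        "A": int(bmi >= 30.0),       # obese vs. nonobese
        "B": who_category(bmi),      # four WHO categories
        "C": score,                  # linear term
        "D": (score, score * score), # linear and quadratic terms
        "E": f"cat_{score}",         # one indicator category per score
    }


print(operationalize(18.7))
print(operationalize(35.2))
```

With this coding every WHO boundary (18.5, 25, and 30) coincides with a boundary between integer categories, which is what allows the simpler forms to be expressed as special cases of form E.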
The first step in the analyses examined the extent to which the different forms of operationalizing BMI (A-E) accounted for the interindividual variability in mortality when included in a model with age. The extent to which forms A-E accounted for the interindividual variability for each mortality outcome in addition to age was examined using the likelihood ratio statistic (−2 (difference in log likelihood of two models)), which allows the goodness of fit of forms A-E to be compared. Under the hypothesis of no difference between the two models, this statistic has a χ2 distribution with degrees of freedom equal to the difference in the number of parameters in the two models. Subsequent analyses examined differences between forms A-E. As it is possible to calculate P values only for nested forms, this comparison involved forms B vs. A, D vs. C, and E vs. D. These analyses were run in (i) the total sample, (ii) the healthy subcohort, and (iii) the healthy subcohort excluding all underweight participants (BMI < 18.5 kg/m2).
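The likelihood ratio comparison of two nested models can be sketched with the standard library alone. The log-likelihood values below are hypothetical, as are the function names; the χ2 tail probability uses the closed-form expressions that hold for integer degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        return math.exp(-y) * sum(
            y ** j / math.factorial(j) for j in range(df // 2)
        )
    # Odd df: complementary error function plus a finite series.
    series = sum(
        y ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * series


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Return (LRS, P value) for a reduced model nested in a fuller one."""
    lrs = -2.0 * (loglik_reduced - loglik_full)
    return lrs, chi2_sf(max(lrs, 0.0), extra_params)


# Hypothetical log-likelihoods for form C (linear) vs. form D
# (linear + quadratic); form D has one extra parameter.
lrs, p_value = likelihood_ratio_test(-1000.0, -995.0, extra_params=1)
print(lrs, p_value)
```

The same function covers the B vs. A and E vs. D comparisons by passing the appropriate difference in the number of parameters.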
Finally, the shape of the association between BMI and mortality was illustrated by smooth curves for age-adjusted hazard ratios based on the most parsimonious form of BMI operationalization, that is, the model with BMI fitted as a linear term and a quadratic term (form D). We used actual BMI values rather than rounding to integers as these illustrations included no comparison with other forms of BMI operationalizations. The nadir of the cause-specific mortality vs. BMI curves appears to be between BMI values 22 and 24 kg/m2. We chose the midpoint, BMI = 23 kg/m2, as the reference (hazard ratio = 1.0). This value is within the larger reference category of 22.5–24.9 kg/m2 used previously (14).
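To show how such a curve follows from a fitted linear and quadratic term, the sketch below computes the hazard ratio at a given BMI relative to the reference value of 23 kg/m2. The coefficients are hypothetical, chosen only so that the curve has its nadir at 23; they are not estimates from the study. Age terms are omitted because they cancel when two BMI values are compared at the same age.

```python
import math

REFERENCE_BMI = 23.0  # reference value stated in the text (hazard ratio = 1.0)


def hazard_ratio(bmi, beta_linear, beta_quadratic, reference=REFERENCE_BMI):
    """Hazard ratio at `bmi` vs. `reference` for a linear + quadratic model."""
    log_hr = (beta_linear * (bmi - reference)
              + beta_quadratic * (bmi ** 2 - reference ** 2))
    return math.exp(log_hr)


def nadir(beta_linear, beta_quadratic):
    """BMI at which the fitted log-hazard is lowest (needs an upward curve)."""
    if beta_quadratic <= 0:
        raise ValueError("a minimum exists only when the quadratic term is positive")
    return -beta_linear / (2.0 * beta_quadratic)


# Hypothetical coefficients, for illustration only.
b1, b2 = -0.23, 0.005
for bmi in (17.0, 23.0, 28.0, 33.0):
    print(bmi, round(hazard_ratio(bmi, b1, b2), 2))
print("nadir:", round(nadir(b1, b2), 1))
```

With these illustrative values the hazard ratio rises on both sides of the reference and more steeply at the upper end of the range shown, which is the general J shape described for all-cause mortality.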
All the analyses were performed using the SAS software, version 8.2 (SAS Institute, Cary, NC).
Table 1 presents sample characteristics. The proportion of participants who are underweight or obese is slightly higher in the total sample than in the healthy subcohort. Mean BMI (kg/m2) is 24.7 for all men, 24.9 for the healthy subcohort, and 24.9 for the healthy subcohort excluding underweight individuals (BMI < 18.5 kg/m2). The age-adjusted hazard ratios for all-cause, CVD and non-CVD mortality for unhealthy vs. healthy participants are 1.67 (95% confidence interval 1.61–1.73), 1.64 (95% confidence interval 1.56–1.73), and 1.70 (95% confidence interval 1.64–1.78), respectively, with little change after excluding underweight individuals from the healthy group of participants (all hazard ratios between 1.64 and 1.68). This suggests that the definition of the healthy subcohort is successful. Age-adjusted hazard ratios for each BMI integer in relation to all-cause and cause-specific mortality for the total sample and the healthy subcohort are provided in the Supplementary Table S1 online.
Table 1. Sample characteristics
Forms of operationalizing BMI: comparison of the fit
A comparison of the forms of operationalizing BMI (A-E) as predictors of mortality in the total sample is presented in Table 2. Based on the likelihood ratio statistics, the combination of linear and quadratic terms for BMI (form D) accounts for more of the variability in all-cause and cause-specific mortality than the dichotomous form (A), the WHO form (B), and the linear-only BMI form (C). Replacing the linear and quadratic terms with categories for each BMI integer (E) does not add to the variability explained. For all mortality outcomes, a larger proportion of the variability is accounted for by the WHO form (B) than by dichotomization (A). The linear BMI term (form C) explains the smallest proportion of all-cause mortality whereas the dichotomous form (A) explains the smallest proportion of non-CVD, CVD, and CHD deaths.
Table 2. The total sample: likelihood ratio statistic (LRS)a comparison for various BMI operationalizations in explaining all-cause and cause-specific mortality (N = 18,860 for all-cause analyses; N = 18,816 for cause-specific analyses)
The combination of linear and quadratic terms for BMI (form D) also accounts for more variability in mortality than the other forms in the healthy subcohort (Table 3) and the healthy subcohort excluding all underweight participants (Table 4). This was also the case for the 2,599 healthy men who had never smoked (1,362 deaths, data not shown).
Table 3. The healthy subcohort: likelihood ratio statistic (LRS)a comparison for various BMI operationalizations in explaining all-cause and cause-specific mortality (N = 7,865 for all-cause analyses; N = 7,848 for cause-specific analyses)
Table 4. The healthy subcohort excluding those underweight (BMI < 18.5): likelihood ratio statistic (LRS)a comparison for various BMI operationalizations in explaining all-cause and cause-specific mortality (N = 7,815 for all-cause analyses; N = 7,798 for cause-specific analyses)
We have rounded the BMI measurements to integer values to ensure that all the models (forms A-E) are hierarchical, thus allowing the amount of variability in mortality explained by each model to be compared. However, in practice, when fitting linear or quadratic models, the actual BMI values themselves should be used in these models. In our data, models with actual BMI values explain slightly more of the variability in mortality than the model forms C and D.
Shape of the BMI-mortality association
Figure 1 presents smooth curves for age-adjusted hazard ratios of BMI in relation to mortality outcomes in the total cohort and in the healthy subcohort (curves for the healthy subcohort excluding underweight participants are not shown as they largely overlap with those for the healthy subcohort). The curves are calculated from models which include the linear and quadratic terms (form D) of actual BMI values. In the total sample, the association of BMI with all-cause mortality is J-shaped (Figure 1a) and the associations with mortality from non-CVD, cancer, and respiratory diseases in particular are an inverse J with the highest risk being associated with underweight (Figure 1d-f). There is a positive association between BMI and mortality from CVD and CHD, with an increasingly steep slope for greater BMI values (Figure 1b, c).
In the healthy subcohort, hazard ratios for the association between underweight and mortality are not as high for deaths from all causes (Figure 1a), non-CVD (Figure 1d), cancer (Figure 1e) and respiratory disease (Figure 1f) as among all men (Figure 1). In contrast, overweight and obesity in the healthy subcohort are related to greater hazard ratios for CVD, CHD, non-CVD, cancer, and respiratory mortality compared with the cohort of all men (Figure 1b-f). Thus, in the healthy subcohort, the association between BMI and all-cause and cause-specific mortality is J-shaped, with the exception that the association between BMI and death from respiratory causes is U-shaped.
Data on 13,498 deaths over the 35-year follow-up period show that the association between BMI and overall mortality in the total sample and healthy subcohort is J-shaped. The shape of this association reflects the sum of the different patterns by which BMI is associated with cause-specific mortality. First, for noncardiovascular deaths, a substantial excess risk is associated with the lower end of the BMI distribution with the elevated hazard ratios for underweight; in the healthy subcohort, this is true only for respiratory deaths. High BMI is also associated with increased noncardiovascular mortality in both cohorts. Second, there is a dose-response association between BMI and CVD and CHD deaths, with a slightly steeper slope at the upper end of the BMI distribution. A linear and quadratic term in combination provided the most parsimonious form of operationalizing BMI for all these associations.
Comparison with other studies
The elevated risk for noncardiovascular mortality among underweight men may be in part accounted for by preexisting disease and smoking, as the excess risk related to underweight was smaller in the healthy nonsmoking subcohort than in the total sample. A smaller excess risk related to underweight in the healthy subsample compared to the total sample was also reported for the Renfrew/Paisley study and the Collaborative study (including men and women who resided in two Scottish towns) (12), as well as for clinical samples (31,32). In the Physicians' Health Study of US male physicians aged 40–84 years, excess total mortality related to a BMI <20 kg/m2 disappeared after excluding participants with preexisting disease or any smoking history (14), but in a study of over 520,000 US men and women the J-shaped association between BMI and mortality was observable even after such exclusions (11).
Differences between cohorts were also apparent for other causes of death. In the healthy nonsmoking subcohort, a BMI equal to or above 33 was associated with a 2.7 times higher risk of death from CVD and CHD compared with a BMI of 23. In all men, these hazard ratios were 2.3 while in the INTERHEART study of 12,000 patients with first acute myocardial infarction, the odds ratio for the top vs. the bottom quintile of BMI was even smaller, 1.4 (16). Finally, a meta-analysis of studies on patients with coronary artery disease prospectively followed for a mean of 4 years found no increase in cardiovascular mortality among obese patients (13). Taken together, these findings demonstrate the ability of prevalent disease and smoking to attenuate the association between obesity and cardiovascular outcomes.
The fact that the association between BMI and mortality assumes different shapes dependent on baseline morbidity, smoking and cause of death is potentially an important source of the inconsistent findings observed in previous studies of BMI and total mortality (7,11,12,13,14,17,18,19,20,21,22,23,24,25,26). The distribution of cause-specific mortality varies between populations and this may be reflected in the patterns of association with total mortality. The health of participants also varies between cohorts and so will contribute to inconsistency. Studies have also applied different ways of dealing with baseline morbidity, including adjustments for health indicators (33,34,35), exclusions of unhealthy and underweight participants (21), as well as exclusion of the first years of mortality follow-up, a method criticized by some commentators (36).
As our data were based on European-origin male British civil servants, further research is needed to confirm the generalizability of the present results. An unavoidable consequence of the 35-year follow-up is that the findings are based on a cohort recruited a long time ago, in the late 1960s. Since then, the prevalence of overweight and obesity has grown from 45% in this cohort to over 65%, for example, among US adults (37,38). Although the US National Health and Nutrition Examination Survey found some evidence that the association of obesity with mortality may have decreased over time (39), it seems unlikely that the linear and curvilinear components would have completely disappeared in contemporary cohorts.
Implications for future studies
The shape of the association between BMI and mortality potentially has important methodological implications as it partially determines the optimal form for operationalizing BMI for use in analyses. This study suggests that fitting BMI simultaneously as a linear term and as a quadratic term provides the most parsimonious form for examining these associations, whereas other BMI forms, such as the WHO definition, obese-nonobese dichotomy, and BMI treated as a continuous linear variable, accounted for less interindividual variability in all-cause and cause-specific deaths. This finding holds even after excluding unhealthy and underweight participants and smokers from the analysis.
Interestingly, only a few previous studies on mortality have actually fitted both linear and quadratic terms of BMI. Nonoptimal forms for BMI have several potential consequences for research findings. According to our data, the contribution of a linear form for BMI to all-cause and non-CVD mortality is particularly small, and thus studies using a linear BMI term as the main exposure variable may underestimate the association between BMI and these mortality outcomes. Studies of other main exposures in which BMI is used only as a covariate are also affected. Entering a linear BMI covariate into the model, instead of both linear and quadratic terms, may make the adjustment more open to residual confounding; thus the independent association between the main exposure variable and mortality is potentially overestimated. All this also applies to BMI treated dichotomously in studies of CVD and CHD mortality.
There was a difference in parsimony between the combined linear and quadratic terms of BMI and the WHO BMI categories. The advantage of the WHO categorization of BMI is that it provides clear categories that form an unambiguous basis for therapeutic decisions as well as defines clinically meaningful groups. However, our findings suggest that supplemental analyses with linear and quadratic terms of BMI may provide additional information based on the exact form of the association between BMI and mortality.
By virtue of the large number of deaths, long follow-up period and identification of a healthy subcohort, our findings extend research on the shape of the BMI-mortality association. The association of BMI with all-cause and cause-specific mortality seems to incorporate linear and curvilinear components even after taking account of reverse causality and smoking and excluding all underweight participants. This should be taken into account in operationalizations of BMI.
M.K. is supported by the Academy of Finland (project no. 117604), J.E.F. by the MRC (Grant no. 47413), M.G.M. by an MRC Research Professorship, and M.J.S. by a grant from the British Heart Foundation. G.D.B. is a Wellcome Trust Fellow.