Estimating the effects of body mass index and central obesity on stroke in diabetics and non‐diabetics using targeted maximum likelihood estimation: Atherosclerosis Risk in Communities study

Summary Objectives The association of body mass index (BMI) with the risk of cardiovascular disease among diabetic patients is controversial. This study compared the effects of BMI and central obesity on stroke in diabetics and non‐diabetics using targeted maximum likelihood estimation (TMLE). Materials and methods This analysis included 12 725 adults aged 45–75 years from the Atherosclerosis Risk in Communities study, after excluding prevalent cases and participants with missing data. Obesity was defined using BMI, waist circumference, waist‐to‐hip ratio (WHR), waist‐to‐height ratio (WHtR), body shape index (BSI) and body roundness index (BRI), each of which categorized participants as obese or non‐obese. Generalized linear models and TMLE (with the tmle package) were used to estimate risk ratios (RRs). Results During 27 years of follow‐up, 1078 (8.47%) cases of stroke occurred. After adjustment for demographic, behavioural, biologic and central obesity variables, the effect of BMI decreased in both diabetics and non‐diabetics. In the full model, the effect of BMI was more attenuated in diabetics (RR: 1.04 [0.90, 1.20]) than in non‐diabetics (RR: 1.11 [1.00, 1.24]). This attenuation was related more to biologic variables in non‐diabetics and to central obesity in diabetics. With respect to central obesity, BSI (RR [95% CI]: 1.15 [0.96, 1.38]) and WHR (RR [95% CI]: 1.15 [0.87, 1.52]) had the strongest, marginally significant effects in diabetics, as did BSI (RR [95% CI]: 1.10 [1.02, 1.20]) in non‐diabetics. Conclusions Among diabetics, the BSI and WHR indices were associated with a higher incidence of stroke. Future studies should consider how central obesity affects the incidence of stroke among diabetics stratified by sex and age group.


| INTRODUCTION
Stroke is a major cause of mortality and disability worldwide. 1,2 Obesity is a recognized risk factor for many diseases, including cardiovascular and cerebrovascular diseases. 2 The association between obesity and stroke, however, is controversial. Body mass index (BMI), the most commonly used index of obesity and overweight, cannot differentiate between excess fat mass and other body mass. [3][4][5] For example, individuals with high BMI may have excess muscle mass rather than fat mass. 6 Given the importance of visceral fat distribution for chronic diseases, waist circumference (WC), waist-to-hip ratio (WHR) and waist-to-height ratio (WHtR) have gained popularity as measures of obesity. These indices are closely related to central fat mass and can be used as indicators of central obesity. 7 In addition, the contradictory results regarding the effect of obesity on stroke could be due to a lack of attention to the different fat distributions and obesity definitions in males and females. 8,9 Furthermore, previous studies have indicated a paradoxical effect of fat distribution in diabetics and non-diabetics. [10][11][12] Specifically, some studies have shown an inverse association between BMI and stroke, whereas others have shown that central obesity is a useful predictor of stroke in diabetics.
Model misspecification is another problem in assessing this complex and multifactorial relationship, especially when observational studies are used for causal inference. 13,14 Targeted maximum likelihood estimation (TMLE) is a two-stage estimator that reduces bias in the estimation of the target parameter if either the exposure or the outcome model is estimated consistently. 14,15 Furthermore, this estimator rests on causal assumptions under which observational data may emulate inference from a perfect randomized trial, allowing estimation of effects that are as close as possible to the true causal effects. 16 Moreover, TMLE is a double-robust method and naturally integrates loss-based super learning, which further reduces the risk of bias due to model misspecification. 16 The current study was designed to examine the unbiased association of BMI and central obesity with the risk of stroke separately for diabetics and non-diabetics in the Atherosclerosis Risk in Communities (ARIC) cohort study using the TMLE method. It was hypothesized that diabetic participants with central obesity are at higher risk of developing a stroke.

| Study design and participants
The ARIC study is a prospective cohort study designed to evaluate the risk factors of atherosclerotic disease. ARIC is a community-based cohort comprising participants from four U.S. communities (Washington County, Maryland; Jackson, Mississippi; Forsyth County, North Carolina; and the suburbs of Minneapolis, Minnesota). In 1987-1989, 15 792 males and females aged 45-64 were recruited and completed the baseline clinic examination (Visit 1). All participants were then invited to follow-up examinations every 3 years, during 1990-1992, 1993-1995 and 1996-1998. Cohort participants were selected by probability sampling. In these communities, all age-eligible persons were selected as potential cohort participants. The baseline examination evaluated cardiovascular conditions and assessed the related risk factors. After each visit, a telephone questionnaire was administered annually, and medical conditions were identified from this annual questionnaire. Response rates for the successive examinations were 93%, 86% and 80%, respectively. The institutional review board of each participating field centre approved the ARIC study protocol, and informed consent was obtained from participants at each study visit. Details of this study are described elsewhere. 17 In the current study, the baseline data, the exposure and covariate values at Visit 1, and all outcomes up to 2014 were included. Type 2 diabetes was defined as a blood glucose level ≥200 mg dl⁻¹ or a blood glucose level ≥140 mg dl⁻¹ after 8 h or more of fasting. According to this cut point, all participants were divided into diabetic and non-diabetic groups. (1 − ((WC/2π)^2)/(0.5 * Height)^2)))) ≥ 4. 18,19 Because there is no universal agreement on the BSI and BRI cut points, obesity by these indices was also evaluated using the best threshold cut-off value on the receiver operating characteristic (ROC) curve.
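As an illustration of how the six anthropometric indices can be derived from raw measurements, the following Python sketch may be helpful. It assumes the commonly published formulas for the body shape index (WC / (BMI^(2/3) × height^(1/2))) and the body roundness index (364.2 − 365.5 × √(1 − (WC/2π)² / (0.5 × height)²)); the formula fragment and the ≥ 4 threshold in the text above are consistent with the latter, but the exact definitions used in this study are those in references 18 and 19. The function name, argument names and example measurements are hypothetical.

```python
import math

def obesity_indices(weight_kg, height_m, waist_m, hip_m):
    """Compute six obesity indices from basic measurements (metres, kilograms).

    The BSI and BRI formulas are the commonly published ones and are an
    assumption here; they are not quoted in full in the text.
    """
    bmi = weight_kg / height_m ** 2
    whr = waist_m / hip_m            # waist-to-hip ratio
    whtr = waist_m / height_m        # waist-to-height ratio
    # Body shape index: waist circumference scaled by BMI and height.
    bsi = waist_m / (bmi ** (2.0 / 3.0) * math.sqrt(height_m))
    # Body roundness index: based on the eccentricity of an ellipse whose
    # axes are derived from waist circumference and height.
    eccentricity = math.sqrt(1.0 - (waist_m / (2.0 * math.pi)) ** 2 / (0.5 * height_m) ** 2)
    bri = 364.2 - 365.5 * eccentricity
    return {
        "BMI": bmi,
        "WC": waist_m,
        "WHR": whr,
        "WHtR": whtr,
        "BSI": bsi,
        "BRI": bri,
        # Only the BRI >= 4 threshold is visible in the text; cut points for
        # the other indices are not reproduced here.
        "BRI_obese": bri >= 4.0,
    }

# Hypothetical participant: 80 kg, 1.75 m tall, 0.95 m waist, 1.02 m hip.
example = obesity_indices(80.0, 1.75, 0.95, 1.02)
```

Because BSI and BRI have no agreed cut point, a data-driven threshold (for example, from a ROC curve, as described above) would replace the fixed comparison in practice.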

| Outcome and covariates
A definite or probable stroke that occurred by 31 December 2014 (after 27 years of follow-up) was considered the outcome (binary, coded 0 or 1) in the present study. Data were collected by annual telephone interviews that listed all hospitalizations during the past year.
In addition, all local hospitals provided lists of stroke occurrences. This outcome was identified by the presence of related hospital discharge codes (ICD-9 codes 430-438 until 1997 and ICD-9 codes 430-436), the presence of stroke findings on a computerized tomography (CT) or magnetic resonance imaging (MRI) report, or death certificates. 20 All included covariates were categorized into three dimensions (demographic, behavioural and biologic) as potential confounders. These covariates included age, sex, race, education level, resident centre, cigarette smoking status, drinker status, total physical activity score, total calorie intake (kcal), hypertension, plasma lipids (mg dl⁻¹) and history of stroke at baseline. The plasma lipids included total cholesterol, high-density lipoprotein (HDL) cholesterol and triglycerides. In addition, waist and hip circumference were included in the full models to evaluate a possible mediation effect, as appropriate.

| Causal diagram and notations
A directed acyclic graph (DAG) in Figure 1 represents the associations between the exposure, the outcome and other covariates. In this method, the observed data have the structure O = (W, A, Δ, ΔY), where W is a vector of measured baseline covariates, A is the exposure variable, Δ is the missingness indicator for the outcome of interest (Δ = 1 indicates that the outcome is observed and Δ = 0 that it is missing) and Y is the study outcome. In this study, W comprises all listed potential confounders, A is obesity based on the different indices, Δ is the missingness mechanism of the outcome, which is explained in the statistical analysis section, and Y is the occurrence of stroke.
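The Δ and ΔY components of this data structure can be made concrete with a short Python sketch. It assumes, following the statistical analysis section, that the outcome is treated as missing when death from another cause or loss to follow-up precedes any stroke; the `FollowUp` record, its field names and the 27-year horizon argument are hypothetical conveniences, not part of the ARIC data dictionary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class FollowUp:
    """Hypothetical per-participant follow-up record (times in years from baseline)."""
    stroke_year: Optional[float] = None  # first definite or probable stroke
    death_year: Optional[float] = None   # death from any other cause
    lost_year: Optional[float] = None    # loss to follow-up

def observed_outcome(f: FollowUp, horizon: float = 27.0) -> Tuple[int, int]:
    """Return (delta, delta_times_y) for one participant.

    delta = 1 when the stroke status at the horizon is known,
    delta = 0 when a competing death or loss to follow-up came first.
    """
    censor_times = [t for t in (f.death_year, f.lost_year)
                    if t is not None and t < horizon]
    first_censor = min(censor_times) if censor_times else None
    stroke_seen = (
        f.stroke_year is not None
        and f.stroke_year <= horizon
        and (first_censor is None or f.stroke_year <= first_censor)
    )
    if stroke_seen:
        return 1, 1          # outcome observed, stroke occurred
    if first_censor is not None:
        return 0, 0          # outcome missing; the product delta * Y is 0
    return 1, 0              # followed to the horizon without stroke
```

For example, `observed_outcome(FollowUp(stroke_year=10, death_year=12))` gives `(1, 1)`, whereas `observed_outcome(FollowUp(death_year=5))` gives `(0, 0)`.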

| Statistical analysis
Descriptive statistics were used to describe the participants (mean ± SD for continuous variables and number and percentage for categorical variables). An independent-samples t-test was used to examine differences in continuous covariates between participants with and without stroke. In addition, the χ² test was used to examine the associations of categorical variables with stroke. P < .05 was considered statistically significant.
TMLE was used to quantify the relationship of BMI and central obesity with stroke. As a double-robust estimator that uses both outcome and exposure models, TMLE was implemented in several steps. The statistical and mathematical details of this method and its components are described elsewhere. 15,16 The super learner is an ensemble machine learning approach that uses cross-validation to select an optimal statistical model from among many candidate models. In this study, a super learner with three candidate algorithms (generalized linear model [GLM], stepwise GLM and GLM with interactions) was used. Model misspecification may be present in many situations: when the models do not capture the true relationships among the variables, the resulting estimates may be biased. Robust methods have therefore been developed to reduce this problem. Double-robust methods use both exposure and outcome models and remain consistent if either model is estimated consistently. In addition, TMLE can incorporate super learner machine learning algorithms, which offers further protection against bias.
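The steps of a TMLE for a risk ratio can be sketched in plain Python on synthetic data. This is a deliberately simplified illustration and not the analysis performed in this study, which used the R tmle package with a super learner: it assumes a single binary confounder W, no outcome missingness, a saturated exposure model and an initial outcome model that intentionally ignores W. All function names and simulation parameters are hypothetical.

```python
import math
import random
from statistics import fmean

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def simulate(n=4000, seed=1):
    """Synthetic (W, A, Y) rows in which W confounds the A-Y relationship."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = int(rng.random() < 0.4)
        a = int(rng.random() < (0.6 if w else 0.3))
        y = int(rng.random() < expit(-2.0 + 0.4 * a + 1.0 * w))
        rows.append((w, a, y))
    return rows

def fit_epsilon(offset, h, y, iters=50):
    """Solve logit P(Y=1) = offset + eps * h for eps by Newton's method."""
    eps = 0.0
    for _ in range(iters):
        p = [expit(offset + eps * hi) for hi in h]
        score = sum(hi * (yi - pi) for hi, yi, pi in zip(h, y, p))
        info = sum(hi * hi * pi * (1.0 - pi) for hi, pi in zip(h, p))
        step = score / info
        eps += step
        if abs(step) < 1e-12:
            break
    return eps

def tmle_rr(rows):
    # Step 1: initial outcome model Q(A, W), misspecified on purpose (ignores W).
    q0 = {arm: fmean(y for _, a, y in rows if a == arm) for arm in (0, 1)}
    # Step 2: exposure model g(W) = P(A = 1 | W), saturated in the binary W.
    g = {lvl: fmean(a for w, a, _ in rows if w == lvl) for lvl in (0, 1)}
    psi = {}
    for arm in (0, 1):
        p_arm = {lvl: g[lvl] if arm == 1 else 1.0 - g[lvl] for lvl in (0, 1)}
        sub = [(w, y) for w, a, y in rows if a == arm]
        # Step 3: targeting, using the clever covariate 1 / P(A = arm | W).
        h = [1.0 / p_arm[w] for w, _ in sub]
        eps = fit_epsilon(logit(q0[arm]), h, [y for _, y in sub])
        q_star = {lvl: expit(logit(q0[arm]) + eps / p_arm[lvl]) for lvl in (0, 1)}
        # Step 4: plug-in estimate of the counterfactual risk under this arm.
        psi[arm] = fmean(q_star[w] for w, _, _ in rows)
    return psi[1] / psi[0]

def standardized_rr(rows):
    """Reference estimate: direct standardization over W with stratum-specific risks."""
    def risk(arm):
        total = 0.0
        for lvl in (0, 1):
            share = fmean(1.0 if w == lvl else 0.0 for w, _, _ in rows)
            total += share * fmean(y for w, a, y in rows if w == lvl and a == arm)
        return total
    return risk(1) / risk(0)
```

In this toy setting, `tmle_rr(simulate())` agrees with `standardized_rr(simulate())` up to numerical tolerance even though the initial outcome model ignored the confounder, because the exposure model is correct. This is the double-robustness property described above, shown under strong simplifying assumptions.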
The exposure variable (obesity) was dichotomized: values above the defined cut-off were considered 'obese' and all others 'non-obese'. The missingness mechanism (outcome missing at follow-up) was defined as the occurrence of a competing event (death from any other cause) or loss to follow-up before the occurrence of stroke. 21 To assess the possible mediation effect of biological and central obesity covariates in the association between obesity and stroke, this relationship was evaluated in four models in diabetics and non-diabetics for six obesity indices. In these four models, the effect of

| Outcome evaluation by diabetic groups (diabetics and non-diabetics)
The risk ratios with 95% confidence intervals for six obesity indices in four adjusted models are presented in Tables 2-4 and Figure 2 for all participants, diabetics and non-diabetics, respectively. The obtained effect for BMI adjusted for demographic covariates was 1.10 in diabetics, which increased to 1. 16

| Outcome evaluation for all participants and by sex (males and females)
The risk ratios with 95% confidence intervals estimated by TMLE for six obesity indices adjusted for all listed covariates are presented in Figure 3 and Table 2 for all participants and in Table 3

| Sensitivity analyses
In sensitivity analyses, the results from the complete data and the imputed data differed negligibly. The second sensitivity analysis, comparing models with and without biological variables, showed that the effects were sensitive to the exclusion of these covariates.
These results may indicate mediation by biological factors (Tables 2 and 4). The third sensitivity analysis showed that the results of the TMLE method were more precise than those of the GLM method. The point estimates of the two methods differed, and all the confidence intervals from the TMLE method were narrower (Figure 3).
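To make the notion of precision concrete, the following Python sketch computes an unadjusted risk ratio with the standard log-scale Wald 95% confidence interval and summarizes interval width as the ratio of the upper to the lower limit. It is only an illustration of the arithmetic: the 2 × 2 counts are hypothetical, the function names are assumptions, and the adjusted TMLE and GLM estimates in this study were not computed this way.

```python
import math

def risk_ratio_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Unadjusted risk ratio with a Wald confidence interval on the log scale."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    se_log = math.sqrt(
        1.0 / events_exposed - 1.0 / n_exposed
        + 1.0 / events_unexposed - 1.0 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

def ci_ratio(lower, upper):
    """Upper-to-lower limit ratio; smaller values indicate a more precise estimate."""
    return upper / lower

# Hypothetical counts: 30 strokes among 100 obese, 20 among 100 non-obese.
rr, lower, upper = risk_ratio_ci(30, 100, 20, 100)

# Intervals reported in the Summary for BMI in the full model:
width_diabetics = ci_ratio(0.90, 1.20)      # RR 1.04 [0.90, 1.20]
width_non_diabetics = ci_ratio(1.00, 1.24)  # RR 1.11 [1.00, 1.24]
```

By this measure, the reported BMI interval is wider in diabetics than in non-diabetics, which is what a smaller subgroup would be expected to produce.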

FIGURE 2 Decreasing risk of stroke for six obesity indices after adjustment for demographic, behavioural, biologic and central obesity variables in four models, respectively, in the Atherosclerosis Risk in Communities (ARIC) study, in total, diabetics, non-diabetics, males and females. BMI, body mass index; WC, waist circumference; WHR, waist-to-hip ratio; WHtR, waist-to-height ratio; BSI, body shape index; BRI, body roundness index
FIGURE 3 Comparison of risk ratios of stroke for six obesity indices based on targeted maximum likelihood estimation with super learner modelling (tmle) and generalized linear model (glm) in the Atherosclerosis Risk in Communities (ARIC) study. BMI, body mass index; WC, waist circumference; WHR, waist-to-hip ratio; WHtR, waist-to-height ratio; BSI, body shape index; BRI, body roundness index

| DISCUSSION
In this large cohort study, the TMLE method was used to estimate the association of BMI and central obesity with stroke in diabetics and non-diabetics. This association was also evaluated for all participants and by sex. Model misspecification and biased estimation are two important problems of conventional statistical methods, especially with observational data. To address these problems, this study used the TMLE method, which accounts for these limitations and relies on causal assumptions that allow estimation of effects that are as close as possible to the true causal effects. Regarding central obesity, the WHR and BSI indices had the strongest, marginally significant effects on stroke occurrence in diabetics, as did the BSI in non-diabetics. Furthermore, the effects of BMI were attenuated towards the null after adjusting for central obesity in both diabetics and non-diabetics. With respect to the effects in males, females and all participants, the results of the TMLE method (the fourth model) were more precise than those based on conventional models and showed that the strongest effects were related to BSI and BMI for all participants; to WC, BSI and WHR for males; and to BSI and BRI for females.
The results of the current study for all participants agree with those of a Mendelian randomization study of the effect of BMI on ischaemic stroke. 24 That study found a weak, marginally significant effect. However, the relationship was not assessed separately among males and females or among diabetics and non-diabetics. In addition, only BMI was assessed as an obesity index, which is subject to some limitations.
Consistent with our study, previous studies of the association between BMI and stroke in diabetics have found a null or inverse association between BMI and total stroke. 11,25 However, some studies have shown inconsistent results. 11,26 On the other hand, regarding the association between central obesity and stroke in diabetics, previous studies have found that central obesity plays an important role in the causal pathway of stroke. 27,28 These contradictions seem to arise largely from misclassification of persons with and without obesity by BMI and from misspecification of the models.
The misspecification problem could be minimized by the TMLE method, which uses super learning algorithms. 16 The limitations of can better explain these causal pathways. 6,7,18,19 Furthermore, fat distribution and body shape differ between males and females, and several studies have confirmed the need to estimate this relationship separately for males and females. 30 In addition, researchers should consider the mediating role of the biological risk factors confirmed in previous studies, separately for males and females. 31,32 It is worth noting that several studies in this field have been conducted using ARIC data; however, many of them have had limitations that lead to biased and contradictory results. 33 These studies are limited by their analysis methods, by the covariates included in the models and by the common problem of model misspecification.
Before interpreting the results and drawing conclusions, the strengths and limitations of this study need to be addressed. This study used a double-robust method, which consistently estimates the parameters under a semiparametric model when either of the two models (exposure or outcome) is correctly specified, regardless of which. By contrast, most previously published studies used conventional methods for their analyses, which may be associated with estimation error and model misspecification. 11,26 In addition, different indices of obesity were used to determine the index that best captures obesity for all participants and for each sex. Furthermore, the missingness mechanism of the outcome was taken into account for better estimation of the true effects. As a limitation, the package used here handles only a single time point of exposure and covariate measurement and cannot account for time-varying confounders. The authors aim to address this issue in future studies. In addition, future studies should evaluate these associations in diabetics and non-diabetics separately for males and females.
In summary, the findings of our study indicated that the effect of fat distribution differed between diabetics and non-diabetics. Moreover, this inconsistency was also shown between males and females, and previous studies have shown that females commonly have a higher percentage of body fat than males. 9,30 Previous findings were confirmed by considering the results for BMI and the central obesity indices. WC and WHR, which are more closely related to waist girth, showed stronger and more significant effects in males, whereas BSI and BRI, which take other body measures and fat mass into account, did so in females.

| CONCLUSION
From the present study, it can be concluded that, among the measures of obesity, central indices were better predictors of, and more strongly associated with, the incidence of stroke in diabetics. By sex, the WC and WHR indices in males and the BSI and BRI indices in females were the more appropriate indices for defining obesity as a risk factor for stroke.

ACKNOWLEDGEMENT
This manuscript was prepared using ARIC research materials obtained from the NHLBI Biologic Specimen and Data Repository Information Coordinating Center and does not necessarily reflect the opinions or views of the ARIC or the NHLBI. The authors thank the staff and participants of the ARIC study for important contributions.