A Body Shape Index (ABSI), hip index, and risk of cancer in the UK Biobank cohort

Abstract Abdominal size is associated positively with the risk of some cancers but the influence of body mass index (BMI) and gluteofemoral size is unclear because waist and hip circumference are strongly correlated with BMI. We examined associations of 33 cancers with A Body Shape Index (ABSI) and hip index (HI), which are independent of BMI by design, and compared these with waist and hip circumference, using multivariable Cox proportional hazards models in UK Biobank. During a mean follow‐up of 7 years, 14,682 incident cancers were ascertained in 200,289 men and 12,965 cancers in 230,326 women. In men, ABSI was associated positively with cancers of the head and neck (hazard ratio HR = 1.14; 95% confidence interval 1.03–1.26 per one standard deviation increment), esophagus (adenocarcinoma, HR = 1.27; 1.12–1.44), gastric cardia (HR = 1.31; 1.07–1.61), colon (HR = 1.18; 1.10–1.26), rectum (HR = 1.13; 1.04–1.22), lung (adenocarcinoma, HR = 1.16; 1.03–1.30; squamous cell carcinoma [SCC], HR = 1.33; 1.17–1.52), and bladder (HR = 1.15; 1.04–1.27), while HI was associated inversely with cancers of the esophagus (adenocarcinoma, HR = 0.89; 0.79–1.00), gastric cardia (HR = 0.79; 0.65–0.96), colon (HR = 0.92; 0.86–0.98), liver (HR = 0.86; 0.75–0.98), and multiple myeloma (HR = 0.86; 0.75–1.00). In women, ABSI was associated positively with cancers of the head and neck (HR = 1.27; 1.10–1.48), esophagus (SCC, HR = 1.37; 1.07–1.76), colon (HR = 1.08; 1.01–1.16), lung (adenocarcinoma, HR = 1.17; 1.06–1.29; SCC, HR = 1.40; 1.20–1.63; small cell, HR = 1.39; 1.14–1.69), kidney (clear‐cell, HR = 1.25; 1.03–1.50), and post‐menopausal endometrium (HR = 1.11; 1.02–1.20), while HI was associated inversely with skin SCC (HR = 0.91; 0.83–0.99), post‐menopausal kidney cancer (HR = 0.77; 0.67–0.88), and post‐menopausal melanoma (HR = 0.90; 0.83–0.98). Unusually, ABSI was associated inversely with melanoma in men (HR = 0.89; 0.82–0.96) and pre‐menopausal women (HR = 0.77; 0.65–0.91). 
Waist and hip circumference reflected associations with BMI, when examined individually, and provided biased risk estimates, when combined with BMI. In conclusion, preferential positive associations of ABSI or inverse of HI with several major cancers indicate an important role of factors determining body shape in cancer development.


Exclusion criteria
Data for 502,488 participants were available, after removing participants who had withdrawn consent. In total, 71,873 participants were excluded after applying sequentially the exclusion criteria listed below, such that each excluded individual was counted only once: 1. Ethnic background restricted to self-reported white (n=29,809): Field [21000-0.0] "Ethnic background"; included codes: 1 "White", 1001 "British", 1002 "Irish", 1003 "Any other white background". The total number of participants included in the main study dataset was 430,615.
For analyses involving endometrial cancer, 40,611 women were excluded from the main study dataset due to hysterectomy prior to baseline. The total number of women included in the endometrial cancer dataset was 189,715.
For analyses involving ovarian cancer, 16,274 women were excluded from the main study dataset due to bilateral oophorectomy prior to baseline. The total number of women included in the ovarian cancer dataset was 214,052.

Definition of prevalent and incident cancer
Information for prevalent cancers was obtained from the cancer registry and from self-reported cancer. Information for incident cancers was obtained from the cancer registry.
Alcohol consumption was based on Field [1558-0.0] "Alcohol intake frequency"; Question: "About how often do you drink alcohol?" as follows: Up to 3 times a month - Answer 4: "One to three times a month"; or 5: "Special occasions only"; or 6: "Never"; Up to four times a week - Answer 2: "Three or four times a week"; or 3: "Once or twice a week"; Daily or almost daily - Answer 1: "Daily or almost daily". Missing values were replaced with category "Up to four times a week" for both sexes.
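The alcohol recoding described above can be sketched as a simple lookup. This is only an illustration: the function and constant names are hypothetical, and treating any answer code not listed in the text (including `None`) as missing is an assumption.

```python
from typing import Optional

# Category labels as given in the text.
LOW = "Up to 3 times a month"
MID = "Up to four times a week"
DAILY = "Daily or almost daily"

# Answer codes of Field 1558 mapped to the three analysis categories.
_ALCOHOL_MAP = {1: DAILY, 2: MID, 3: MID, 4: LOW, 5: LOW, 6: LOW}


def alcohol_category(answer: Optional[int]) -> str:
    """Return the alcohol category for a Field 1558 answer code.

    Assumption: any code not listed in the text is handled as missing
    and replaced with the category "Up to four times a week".
    """
    return _ALCOHOL_MAP.get(answer, MID)
```

For example, `alcohol_category(5)` falls in "Up to 3 times a month", while `alcohol_category(None)` is imputed to "Up to four times a week".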
Physical activity was defined as follows: Very active - was based on Field [816-0.0] "Job involves heavy manual or physical work"; Question: "Does your work involve heavy manual or physical work?"; Answer 3: "Usually" or 4: "Always"; OR Field [904-0.0] "Number of days/week of vigorous physical activity 10+ minutes"; Question: "In a typical WEEK, how many days did you do 10 minutes or more of vigorous physical activity? (These are activities that make you sweat or breathe hard such as fast cycling, aerobics, heavy lifting)"; Answer (numerical) 3-7; Moderately active - was based on Field [904-0.0] Answer 1-2 OR Field [884-0.0] "Number of days/week of moderate physical activity 10+ minutes"; Question: "In a typical WEEK, on how many days did you do 10 minutes or more of moderate physical activities like carrying light loads, cycling at normal pace? (Do not include walking)"; Answer (numerical) 3-7; OR Field [864-0.0] "Number of days/week walked 10+ minutes"; Question: "In a typical WEEK, on how many days did you walk for at least 10 minutes at a time? (Include walking that you do at work, travelling to and from work, and for sport or leisure)"; Answer (numerical) 7, when participants were not already included in category very active; Less active - was based on Field [904-0.0] Answer (numerical) 0 OR Field [884-0.0] Answer (numerical) 0-2 OR Field [864-0.0] Answer (numerical) 0-6 or -2: "Unable to walk", when participants were not already included in category moderately or very active. Missing values were replaced with category "Moderately active" for both sexes.
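The hierarchical physical activity rule can be read as a cascade in which each category is checked only if the more active ones did not apply. The sketch below illustrates that reading; the function and argument names are hypothetical, and representing an unanswered field as `None` is an assumption.

```python
from typing import Optional


def _in_range(value: Optional[int], low: int, high: int) -> bool:
    """True when value is present and low <= value <= high."""
    return value is not None and low <= value <= high


def physical_activity(job_heavy: Optional[int],      # Field 816 answer code
                      vigorous_days: Optional[int],  # Field 904
                      moderate_days: Optional[int],  # Field 884
                      walking_days: Optional[int]    # Field 864
                      ) -> str:
    """Classify physical activity following the cascade in the text."""
    # Very active: heavy manual work "Usually"/"Always", or vigorous activity 3-7 days.
    if job_heavy in (3, 4) or _in_range(vigorous_days, 3, 7):
        return "Very active"
    # Moderately active: checked only when not already very active.
    if (_in_range(vigorous_days, 1, 2)
            or _in_range(moderate_days, 3, 7)
            or walking_days == 7):
        return "Moderately active"
    # Less active: checked only when not already moderately or very active.
    if (vigorous_days == 0
            or _in_range(moderate_days, 0, 2)
            or _in_range(walking_days, 0, 6)
            or walking_days == -2):  # -2: "Unable to walk"
        return "Less active"
    # No usable answer: replaced with "Moderately active", as stated in the text.
    return "Moderately active"
```

Because the checks run from most to least active, a participant who walks every day but reports no vigorous or moderate activity is still classified as moderately active.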
Townsend deprivation index was used as an indicator of socioeconomic status and was based on Field [189-0.0] "Townsend deprivation index at recruitment" (continuous) calculated by UK Biobank. This variable represents a score corresponding to the output area in which the participant's postcode was located immediately prior to joining UK Biobank, based on the preceding national census output areas. A greater score implies a greater degree of material deprivation. Missing values were replaced with the middle tertile for both sexes.
Fresh fruit and vegetable intake was based on the sum of two fields: Field [1309-0.0] "Fresh fruit intake" (continuous), Question: "About how many pieces of FRESH fruit would you eat per DAY? (Count one apple, one banana, 10 grapes etc as one piece; put '0' if you do not eat any)" and Field [1299-0.0] "Salad / raw vegetable intake" (continuous), Question: "On average how many heaped tablespoons of SALAD or RAW vegetables would you eat per DAY? (Include lettuce, tomato in sandwiches; put '0' if you do not eat any)". Answers: -10 "Less than one" were re-coded to 0.5. Answers: -1 "Do not know" and -3 "Prefer not to answer" were considered missing. The total was dichotomised as Less than five portions a day or Five or more portions a day and was used as an indication of a healthy lifestyle. Missing values were replaced with category "Less than five portions a day" for both sexes.
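A minimal sketch of the fruit and vegetable dichotomisation follows. The helper names are hypothetical, and one detail is an assumption not stated in the text: the total is treated as missing when either of the two fields is missing.

```python
from typing import Optional

LESS = "Less than five portions a day"
FIVE_PLUS = "Five or more portions a day"


def _portions(answer: Optional[float]) -> Optional[float]:
    """Recode one intake field (1309 or 1299) to a number of portions."""
    if answer is None or answer in (-1, -3):  # "Do not know" / "Prefer not to answer"
        return None
    if answer == -10:                         # "Less than one"
        return 0.5
    return float(answer)


def fruit_veg_category(fruit: Optional[float], salad: Optional[float]) -> str:
    """Dichotomise the summed intake at five portions a day."""
    f, s = _portions(fruit), _portions(salad)
    # Assumption: the total is missing if either component is missing;
    # missing totals are replaced with the lower category, as in the text.
    if f is None or s is None:
        return LESS
    return FIVE_PLUS if f + s >= 5 else LESS
```

With this reading, three pieces of fruit and two tablespoons of salad reach the upper category, while "Less than one" fruit plus four tablespoons (a total of 4.5) does not.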
Processed meat intake was based on Field [1349-0.0] "Processed meat intake", Question: "How often do you eat processed meats (such as bacon, ham, sausages, meat pies, kebabs, burgers, chicken nuggets)". Category Less than twice a week included answers: 0 "Never", 1 "Less than once a week" and 2 "Once a week". Category Twice or more a week included answers: 3 "2-4 times a week", 4 "5-6 times a week" and 5 "Once or more daily". Answers: -1 "Do not know" and -3 "Prefer not to answer" were considered missing. Missing values were replaced with category "Less than twice a week" for both sexes.
Red meat intake was based on the sum of three fields: Field [1369-0.0] "Beef intake", Question: "How often do you eat beef? (Do not count processed meats)", Field [1379-0.0] "Lamb/mutton intake", Question: "How often do you eat lamb/mutton? (Do not count processed meats)" and Field [1389-0.0] "Pork intake", Question: "How often do you eat pork? (Do not count processed meats such as bacon or ham)". The categorical answers were converted to a continuous scale as follows: Answer 0 "Never" remained 0; Answer 1 "Less than once a week" was coded as 0.5; Answer 2 "Once a week" was coded as 1; Answer 3 "2-4 times a week" was coded as 3; Answer 4 "5-6 times a week" was coded as 5.5; Answer 5 "Once or more daily" was coded as 7. Answers: -1 "Do not know" and -3 "Prefer not to answer" were considered missing. Categories Less than twice a week and Twice or more a week were derived with respect to the total of the three variables. Missing values were replaced with category "Twice or more a week" for men or category "Less than twice a week" for women.
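The conversion of the three red meat frequency answers to a weekly total can be sketched as below. The names are hypothetical, and two points are assumptions: the cut-off is taken as a total of two times a week (implied by the category labels), and the total is treated as missing when any of the three answers is missing.

```python
from typing import Optional

LESS = "Less than twice a week"
MORE = "Twice or more a week"

# Answer code -> times per week, as listed in the text.
_TIMES_PER_WEEK = {0: 0.0, 1: 0.5, 2: 1.0, 3: 3.0, 4: 5.5, 5: 7.0}


def red_meat_category(beef: Optional[int], lamb: Optional[int],
                      pork: Optional[int], sex: str) -> str:
    """Categorise total red meat intake; sex is "male" or "female"."""
    # Codes -1, -3 and None are absent from the mapping and count as missing.
    times = [_TIMES_PER_WEEK.get(code) for code in (beef, lamb, pork)]
    if any(t is None for t in times):
        # Sex-specific replacement of missing values, as stated in the text.
        return MORE if sex == "male" else LESS
    # Assumption: "twice or more" means a summed frequency of at least 2.
    return MORE if sum(times) >= 2 else LESS
```

For instance, answering "Less than once a week" for all three meats gives a total of 1.5 and the lower category, whereas "Once a week" for beef and lamb alone already reaches the upper category.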
Family history of cancer was based on three variables: Fields [20107-0.0/9] "Illness of father", Question: "Has/did your father ever suffer from? (You can select more than one answer)", Fields [20110-0.0/10] "Illness of mother", Question: "Has/did your mother ever suffer from? (You can select more than one answer)" and Field [20111-0.0/11] "Illness of siblings", Question: "Have any of your brothers or sisters suffered from any of the following diseases? (You can select more than one answer)". Category Yes was based on Answers: 3 "Lung cancer", 4 "Bowel cancer", 5 "Breast cancer" or 13 "Prostate cancer" to any of the three sets of fields and category No included the remaining participants.
Hormone replacement therapy (HRT) use was determined for women by Field [2814-0.0] "Ever used hormone-replacement therapy (HRT)"; Question: "Have you ever used hormone replacement therapy (HRT)?"; Answer 0: "No" (for Never user) or Answer 1: "Yes" and Field [3546-0.0] "Age last used hormone-replacement therapy (HRT)" Question: "How old were you when you last used HRT?" Answer -11: "Still taking HRT" (for Current user) or else Answer 1: "Yes" to Field [2814-0.0] (for Former user). Further information was derived from Fields [6153-0.0/3] "Medication for cholesterol, blood pressure, diabetes, or take exogenous hormones", Question: "Do you regularly take any of the following medications? (You can select more than one answer)". Women providing Answer 4 "Hormone replacement therapy" were considered Current user. Missing values were replaced with category "Never user".
Use of oral contraceptives was determined for women by Field [2784-0.0] "Ever taken oral contraceptive pill"; Question: "Have you ever taken the contraceptive pill? (include the 'mini-pill')"; Answer 0: "No" (for Never user) or Answer 1: "Yes" (for Ever user). Further information was derived from Fields [6153-0.0/3]. Women providing Answer 5 "Oral contraceptive pill or minipill" were considered Ever user. Missing values were replaced with category "Ever user".
Age at last live birth was defined as follows: No live births -was based on Field [2734-0.0] "Number of live births"; Question: "How many children have you given birth to? (Please include live births only)"; Answer (numerical) 0; < 30 years or ≥ 30 years -was based on Field [2764-0.0] Age at last live birth; Question: "How old were you when you had your LAST child?" and Field [3872-0.0] "Age of primiparous women at birth of child"; Question: "How old were you when you had your child?" (UK Biobank note: "Current Field was collected from women who indicated they had given birth to only one child, as defined by their answers to Field 2734"). Missing values were replaced with category "< 30 years".

Menopausal status was determined as follows: Post-menopausal -were classified women with age at baseline ≥58 years OR with bilateral oophorectomy from Field: [2834-0.0] "Bilateral oophorectomy (both ovaries removed)"; Question: "Have you had BOTH ovaries removed?"; Answer 1: "Yes" OR Field [20004-0] "Operation code (self-reported operation)" code: 1355 "bilateral oophorectomy"; OR with self-reported post-menopausal status from Field [2724-0.0] "Had menopause"; Question: "Have you had your menopause (periods stopped)?"; Answer 1: "Yes" OR with age at baseline ≥ 55 years when menopausal status was unknown, i.e. they had not answered 0: "No" to Field [2724-0.0]; Pre-menopausal -were classified women who had not been defined as post-menopausal above and had reported pre-menopausal status with Answer 0: "No" to Field [2724-0.0] OR had age at baseline < 55 years when menopausal status was unknown, i.e. not defined as post-or pre-menopausal according to the above criteria.
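The menopausal status rules can be expressed as a short decision function. This is a sketch only: the function and argument names are hypothetical, the two sources of bilateral oophorectomy information are collapsed into one boolean, and any answer to Field 2724 other than 0 or 1 (including `None`) is treated as unknown status, which follows the text's "had not answered 0" wording.

```python
from typing import Optional

POST = "Post-menopausal"
PRE = "Pre-menopausal"


def menopausal_status(age: float,
                      had_menopause: Optional[int],  # Field 2724 answer code
                      bilateral_oophorectomy: bool = False) -> str:
    """Classify menopausal status at baseline following the rules in the text."""
    # Post-menopausal by age, surgery, or self-report.
    if age >= 58 or bilateral_oophorectomy or had_menopause == 1:
        return POST
    # Self-reported pre-menopausal status (Answer 0: "No").
    if had_menopause == 0:
        return PRE
    # Status unknown: split at 55 years of age at baseline.
    return POST if age >= 55 else PRE
```

Under these rules a 56-year-old woman who answered "No" is pre-menopausal, while a 56-year-old woman with unknown status is classified as post-menopausal.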
Bilateral oophorectomy prior to baseline was determined as follows: women with Answer 1:
Skin colour was based on Field [1717-0.0] "Skin colour"; Question: "What best describes the colour of your skin without tanning?"; Answer 1: "Very fair" (for Very fair); Answer 2: "Fair" (for Fair); Answer 3: "Light olive", or Answer 4: "Dark olive", or Answer 5: "Brown", or Answer 6: "Black" (for Dark). Missing values were replaced with category "Fair" for both sexes.
Ease of skin tanning was based on Field [1727-0.0] "Ease of skin tanning"; Question: "What would happen to your skin if it was repeatedly exposed to bright sunlight without any protection?"; Answer 1: "Get very tanned"; Answer 2: "Get moderately tanned"; Answer 3: "Get mildly or occasionally tanned"; Answer 4: "Never tan, only burn". Missing values were replaced with category "Get moderately tanned" for both sexes.
Sunburn in childhood was based on Field [1737-0.0] "Childhood sunburn occasions"; Question: "Before the age of 15, how many times did you suffer sunburn that was painful for at least 2 days or caused blistering?"; Answer numerical 0 (for Never burned); Answer any positive numerical (for Ever burned); missing, or Answer -1: "Do not know", or Answer -3: "Prefer not to answer" (for Missing).
Solarium use was based on Field [2277-0.0] "Frequency of solarium/sunlamp use"; Question: "How many times a year would you use a solarium or sunlamp?"; Answer -10: "Less than once a year" or any positive numerical value (for Ever use); not missing, and not Answer -1: "Do not know", and not Answer -3: "Prefer not to answer" (for Never use). Missing values were replaced with category "Never use" for both sexes.
Sun / UV protection was based on Field [2267-0.0] "Use of sun/UV protection"; Question: "Do you wear sun protection (e.g. sunscreen lotion, hat) when you spend time outdoors in the summer?"; Answer 1: "Never / rarely"; Answer 2: "Sometimes"; Answer 3: "Most of the time"; Answers 4 or 5: "Always / do not go out in sunshine". Missing values were replaced with category "Sometimes" for men or category "Most of the time" for women.
Time spent outdoors in summer was based on Field [1050-0.0] "Time spent outdoors in summer"; Question: "In a typical DAY in summer, how many hours do you spend outdoors?"; Answer -10: "Less than an hour a day" or numerical 1-3 (for ≤ 3 hours a day); Answer positive numerical >3 (for > 3 hours a day); missing, or Answer -1: "Do not know", or Answer -3: "Prefer not to answer" (for Missing).
It should be noted that allometric indices calibrated for a given dataset are proportional to the residuals of log-linear models regressing out associations with weight and height, i.e. adjusting each body-shape measure for weight and height. The only part of the log-linear model omitted from the allometric formula is the intercept, which is a constant. As body mass index (BMI) is a combination of weight and height, the same models can also be re-parameterised to adjust for BMI and height, instead of weight and height, as previously explained [4,5].
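The relationship between an allometric index and the residuals of a log-linear model can be checked numerically. The sketch below uses synthetic data only (the simulated values and exponents are arbitrary and are not UK Biobank or NHANES estimates), and the function names are hypothetical. It fits log(measure) on log(weight) and log(height) by ordinary least squares and confirms that the logarithm of the resulting index equals the model residual plus the constant intercept.

```python
import numpy as np


def fit_allometric(measure, weight, height):
    """OLS fit of log(measure) = a + b*log(weight) + c*log(height)."""
    X = np.column_stack([np.ones_like(weight), np.log(weight), np.log(height)])
    coef, *_ = np.linalg.lstsq(X, np.log(measure), rcond=None)
    return coef  # array (a, b, c)


def allometric_index(measure, weight, height, b, c):
    """Index = measure / (weight**b * height**c); the intercept is omitted."""
    return measure / (weight ** b * height ** c)


# Synthetic cohort: arbitrary illustrative values, not real data.
rng = np.random.default_rng(0)
n = 500
height = np.exp(rng.normal(np.log(170.0), 0.05, n))   # cm
weight = np.exp(rng.normal(np.log(75.0), 0.15, n))    # kg
waist = 5.0 * weight ** 0.6 * height ** -0.2 * np.exp(rng.normal(0.0, 0.05, n))

a, b, c = fit_allometric(waist, weight, height)
index = allometric_index(waist, weight, height, b, c)

# Residuals of the log-linear model.
residuals = np.log(waist) - (a + b * np.log(weight) + c * np.log(height))

# log(index) differs from the residuals only by the constant intercept a.
log_index_matches = np.allclose(np.log(index), residuals + a)
# The index is therefore uncorrelated with log(weight) in the calibration data.
corr_with_weight = np.corrcoef(np.log(index), np.log(weight))[0, 1]
```

Because least-squares residuals are orthogonal to the regressors, an index calibrated in this way is uncorrelated with log weight and log height in the dataset used for calibration, which is the sense in which such indices are independent of body size by design.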

Calibration of allometric body shape indices for UK Biobank participants
To examine associations between ABSI and HI calibrated for participants in the National Health and Nutrition Examination Survey (NHANES) (5, 6) and ABSIUKB and HIUKB calibrated for UK Biobank participants, we used partial Pearson correlation coefficients, with adjustment for age at baseline and region of the assessment centre. We also repeated the main analyses with ABSIUKB, HIUKB and WHIUKB. Note that although both WHIUKB and WHI are calibrated for UK Biobank participants, WHIUKB uses the exact regression coefficients for the dataset in this study, while WHI uses coefficients rounded to simple fractions. The simplified version of WHI can be calculated with a conventional calculator with the following sequence, dependent on how square root is required to be entered on the specific device (prior to or after the number):
WC (cm) / HC (cm) / Weight (kg) √ √ * Height (cm) √ =
or
WC (cm) / HC (cm) / √ √ Weight (kg) * √ Height (cm) =
where / stands for key "division", * stands for key "multiplication", √ stands for key "square root" and
3) † - participants with missing values formed a separate category in the analysis; ǂ - female-specific variables (note that menopausal status was also used as a covariate in the models); n (%) - number of participants (percentage from total in cohort (for cohort size and cancer cases in men and women), or from total in women (for cohort size and cancer cases in pre- and post-menopausal women), or from total per column for categorical variables). The definition of variables is described in Supplementary Methods. Sun-exposure variables were used as covariates in the analyses for skin squamous-cell carcinoma and melanoma. Missing values were replaced with the sex-specific median category, except when marked with †.
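The calculator key sequence given for the simplified WHI in the calibration section (divide WC by HC, divide by the square root of the square root of weight, multiply by the square root of height) corresponds to WHI = WC / HC / Weight^(1/4) × Height^(1/2). A minimal sketch of that reading follows; the function and argument names are hypothetical, and the input values in the usage note are arbitrary numbers chosen for easy roots, not study data.

```python
import math


def whi_simplified(wc_cm: float, hc_cm: float,
                   weight_kg: float, height_cm: float) -> float:
    """Simplified waist-to-hip index, mirroring the calculator key sequence.

    WC / HC / sqrt(sqrt(Weight)) * sqrt(Height)
    """
    return wc_cm / hc_cm / math.sqrt(math.sqrt(weight_kg)) * math.sqrt(height_cm)
```

As a check of the arithmetic, with WC = 90 cm, HC = 100 cm, weight = 81 kg and height = 169 cm, the fourth root of the weight is 3 and the square root of the height is 13, so the function returns approximately 0.9 / 3 × 13 = 3.9.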

Supplementary
Women - models included adjustment variables as for men, with the addition of menopausal status (except for cancers marked with #), use of hormone replacement therapy, ever use of oral contraceptives and age at last live birth (with "no live births" as one of the categories). Pre-MP - for the models, this applies to the sub-group of women pre-menopausal at baseline, with adjustment as in models for women. Pre-menopausal - applies to breast cancers diagnosed below 55 years of age in women pre-menopausal at baseline. Post-MP - for the models, this applies to the sub-group of women post-menopausal at baseline, with adjustment as in models for women.
† - cancers with fewer than 20 cases in women pre-menopausal at baseline, for which models were not adjusted for menopausal status; ABSI - a body shape index; BMI - body mass index; CI - confidence interval; HI - hip index; HR - hazard ratio; SCC - squamous cell carcinoma; SD - standard deviation; WHI - waist-to-hip index. HRs (95% CI) were obtained from delayed entry Cox proportional hazards models stratified by age at baseline and region of the assessment centre;
Supplementary Figure S4 Sensitivity analyses excluding participants with less than two years of follow-up
† - cancers with fewer than 20 cases in women pre-menopausal at baseline, for which models were not adjusted for menopausal status; ABSI - a body shape index; BMI - body mass index; CI - confidence interval; HI - hip index; HR - hazard ratio; SCC - squamous cell carcinoma; SD - standard deviation; WHI - waist-to-hip index. HRs (95% CI) were obtained from delayed entry Cox proportional hazards models stratified by age at baseline and region of the assessment centre.
Note: In women, hazard ratio (HR) estimates based on ABSI and HI calculated with regression coefficients from NHANES were almost identical to HR estimates based on ABSI (UKB) and HI (UKB) calculated with regression coefficients derived from UK Biobank data.
In men, there was similarly no material difference between HR estimates based on ABSI and HI calculated with regression coefficients from NHANES and from UK Biobank, despite some weak inverse association between BMI and HI calculated with regression coefficients from NHANES.