Sarcopenia Definitions as Predictors of Fracture Risk Independent of FRAX®, Falls, and BMD in the Osteoporotic Fractures in Men (MrOS) Study: A Meta‐Analysis

Dual‐energy X‐ray absorptiometry (DXA)‐derived appendicular lean mass/height² (ALM/ht²) is the most commonly used estimate of muscle mass in the assessment of sarcopenia, but its predictive value for fracture is substantially attenuated by femoral neck (fn) bone mineral density (BMD). We investigated the predictive value of 11 sarcopenia definitions for incident fracture, independent of fnBMD, fracture risk assessment tool (FRAX®) probability, and prior falls, using an extension of Poisson regression in the US, Sweden, and Hong Kong Osteoporotic Fractures in Men Study (MrOS) cohorts. Definitions tested were those of Baumgartner and Delmonico (ALM/ht² only), Morley, the International Working Group on Sarcopenia, European Working Group on Sarcopenia in Older People (EWGSOP1 and 2), Asian Working Group on Sarcopenia, Foundation for the National Institutes of Health (FNIH) 1 and 2 (using ALM/body mass index [BMI], incorporating muscle strength and/or physical performance measures plus ALM/ht²), and Sarcopenia Definitions and Outcomes Consortium (gait speed and grip strength). Associations were adjusted for age and time since baseline and reported as hazard ratio (HR) for first incident fracture, here major osteoporotic fracture (MOF; clinical vertebral, hip, distal forearm, proximal humerus). Further analyses adjusted additionally for FRAX‐MOF probability (n = 7531; calculated ± fnBMD), prior falls (y/n), or fnBMD T‐score. Results were synthesized by meta‐analysis. In 5660 men in the USA, 2764 in Sweden, and 1987 in Hong Kong (mean ages 73.5, 75.4, and 72.4 years, respectively), sarcopenia prevalence ranged from 0.5% to 35%. Sarcopenia status, by all definitions except those of FNIH, was associated with incident MOF (HR = 1.39 to 2.07). Associations were robust to adjustment for prior falls or FRAX probability (without fnBMD); adjustment for fnBMD T‐score attenuated associations. 
EWGSOP2 severe sarcopenia (incorporating chair stand time, gait speed, and grip strength plus ALM) was most predictive, albeit at low prevalence, and appeared only modestly influenced by inclusion of fnBMD. In conclusion, the predictive value for fracture of sarcopenia definitions based on ALM is reduced by adjustment for fnBMD but strengthened by additional inclusion of physical performance measures. © 2021 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).


Introduction
Sarcopenia (accelerated loss of muscle strength, function, and mass), (1) while now having an associated International Classification of Diseases (ICD) code, presents a diagnostic challenge, given the current wide range of approaches to its operational characterization. By various definitions, sarcopenia has been associated with outcomes such as falls, fractures, and death, (1) but it is increasingly apparent that there are limitations inherent in these approaches resulting from dual-energy X-ray absorptiometry (DXA)-derived appendicular lean mass (ALM) being a component part. (2,3) We have recently demonstrated that the predictive capacity of DXA ALM for incident fracture, in the three Osteoporotic Fractures in Men (MrOS) cohorts, is attenuated to the null by inclusion of femoral neck bone mineral density (BMD) T-score. (4) Similar findings have been observed in the US Health ABC (5) and WHI cohorts (6) and are recognized in recent recommendations from the Sarcopenia Definitions and Outcomes Consortium (SDOC). (7,8) Other than that from SDOC, current definitions of sarcopenia incorporate DXA ALM as the measure of muscle mass, either alone, or in the newer definitions, together with measures of physical function/performance/strength, such as gait speed and grip strength. Indeed, the most recent European Working Group algorithm moves the focus from ALM to that of performance/function and strength as the important attributes. (9) Given the centrality of DXA ALM to current sarcopenia definitions and the apparent weakness of ALM as a predictor of incident fracture after adjustment for BMD, an important unanswered question is whether the predictive capacity for fracture of these sarcopenia definitions remains when bone mineral density is also taken into account. 
Clinically, there would seem little point in undertaking the assessments required for sarcopenia definition (including whole-body DXA, gait speed, and grip strength, for example) if the risk information conveyed by the assessment tells us nothing beyond that associated with femoral neck BMD (a very quick measure to obtain). This consideration also applies to independence from fracture risk assessment tool (FRAX®) probability and prior history of falls. A further unanswered question therefore relates to the magnitude of the risk relationships between sarcopenia definitions and fracture outcomes with these various adjustments. Finally, it is unknown how these associations might vary by age, an important consideration in terms of clinical impact. We therefore undertook a meta-analysis of the three MrOS cohorts (US, Sweden, Hong Kong) to investigate whether the predictive value of sarcopenia definitions for incident fracture was independent of BMD, FRAX probability, and history of falls and to quantify the magnitude and age dependence of these associations.

Materials and Methods
Participants
Details of the MrOS cohort studies have been published previously, (4,10-13) but briefly, MrOS is a multicenter study of community-dwelling men aged 65 years or older from three international cohorts, recruited and evaluated using similar protocols. To be eligible for the study, subjects had to be able to walk without aid. In the MrOS Hong Kong Study, 2000 Chinese men, aged 65 to 92 years, were enrolled between August 2001 and February 2003. (14) All were Hong Kong residents of Asian ethnicity. Stratified sampling was adopted to ensure that 33% of subjects were included in each of the following age groups: 65 to 69, 70 to 74, and ≥75 years. Recruitment notices were placed in housing estates and community centers for the elderly. In the MrOS Sweden Study, 3014 men, aged 69 to 81 years, were enrolled between October 2001 and December 2004. (12,15) The cohort comprised men from the cities of Malmo, Gothenburg, and Uppsala, identified and recruited using national population registers. More than 99% were of white ethnicity. The participation rate in the MrOS Sweden Study was 45%. In the MrOS United States study, 5994 men, aged 65 to 100 years, were enrolled at six sites between March 2000 and April 2002. (16,17) Each US clinical site designed and customized strategies to enhance recruitment of its population. Common strategies included mailings from the Department of Motor Vehicles, voter registration and participant databases, common senior newspaper features and advertisement, and targeted presentations. Self-defined racial/ethnic ancestry was ascertained through questionnaires at baseline (90% white ethnicity).

Exposure variables
The international MrOS questionnaire (16) was administered at baseline to collect information about current smoking, number and type of medications, fracture history, family history of hip fracture, past medical history (rheumatoid arthritis), and high consumption of alcohol (3 or more glasses of alcohol-containing drinks per day), calculated from the reported frequency and amount of alcohol use. Previous fracture at baseline was documented as all fractures after the age of 50 years, regardless of trauma. For glucocorticoid exposure, this was documented in MrOS as use at least 3 times per week in the month preceding the baseline assessment. Apart from glucocorticoid use and rheumatoid arthritis (both FRAX input variables), there was no information on secondary causes of osteoporosis and the "secondary osteoporosis" input variable for FRAX probability calculation was set to no for all men. Self-reported falls during the 12 months preceding the baseline were recorded by questionnaire (past falls).
At baseline, height (centimeters) and weight (kilograms) were measured, and body mass index (BMI) was calculated as kilograms per square meter. Time to complete 5 chair stands, walking speed over 6 m (at usual pace), and grip strength using JAMAR dynamometers (Sammons Preston Rolyan, Bolingbrook, IL, USA) were assessed at the baseline visit. Areal bone mineral density (aBMD) was measured at the femoral neck, and appendicular lean mass from whole body scans, using Hologic QDR 4500 A or W (Hologic, Bedford, MA, USA) or Lunar Prodigy (GE Lunar Corp., Madison, WI, USA) depending on the center, with cross-calibration of instruments for BMD. A T-score was calculated using NHANES young women (white) as a reference value. (18,19) In the subset in which the necessary variables were available (n = 7531), FRAX 10-year probability of major osteoporotic fracture (MOF: hip, proximal humerus, clinical vertebral, or distal forearm sites) was calculated using clinical risk factors described above with and without femoral neck BMD entered into country-specific FRAX models.
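The derived indices described above are simple ratios, and a short sketch may make the units explicit. The function names and all input values below are hypothetical, and the reference mean and SD passed to the T-score helper are placeholders rather than the NHANES values used in the study.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight (kg) and height (cm)."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def alm_over_height2(alm_kg: float, height_cm: float) -> float:
    """Appendicular lean mass divided by height squared (kg/m^2)."""
    height_m = height_cm / 100.0
    return alm_kg / height_m ** 2


def alm_over_bmi(alm_kg: float, weight_kg: float, height_cm: float) -> float:
    """Appendicular lean mass divided by BMI (the FNIH-style index)."""
    return alm_kg / bmi(weight_kg, height_cm)


def t_score(bmd: float, ref_mean: float, ref_sd: float) -> float:
    """T-score: SD units above or below a young reference mean.

    ref_mean and ref_sd must come from the chosen reference population;
    any values used in examples here are placeholders.
    """
    return (bmd - ref_mean) / ref_sd


if __name__ == "__main__":
    # Hypothetical participant: 80 kg, 175 cm, ALM 22 kg, femoral neck BMD 0.70 g/cm^2
    print(round(bmi(80, 175), 1))
    print(round(alm_over_height2(22, 175), 2))
    print(round(t_score(0.70, ref_mean=0.86, ref_sd=0.12), 2))  # placeholder reference values
```

Keeping height in centimeters at the interface mirrors how it was measured, with the conversion to meters done once inside each helper.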

Fracture and death outcomes
Hong Kong (20)
Incident fractures were captured via subject follow-up through phone call or visit to the research center. All fracture sites (hip, wrist, skull/face, ribs, shoulder, arm, vertebra, tibia, fibula, foot, metatarsal toes, hand, fingers, and pelvis) were recorded. Pathological fractures were excluded. Only incident fractures reported by participants and confirmed by X-ray or medical record review were included. Deaths were verified by death certificates.
Sweden (21)
Central registers covering all Swedish citizens were used to identify the subjects and the date of death for all subjects who died during the study. For incident fracture evaluation, the computerized X-ray archives in Malmo, Gothenburg, and Uppsala were searched for new fractures occurring after the baseline visit using the unique personal registration number allocated to every Swedish citizen. If additional fractures were reported by the study subject after the baseline visit, these were only included if confirmed by physician review of radiology reports.
USA (16)
Triannual questionnaires were mailed to each participant. If a participant reported a fracture, study staff conducted a follow-up telephone interview to determine the date the fracture had occurred, a description of how the fracture occurred, the type of trauma that resulted in the fracture, the participant's location and activities at the time of the fracture, symptoms just before or coincident with the fracture, and source of medical care for the fracture. All reported fractures were centrally verified by a physician adjudicator through medical records. Deaths were verified through centralized review of state death certificates.

Sarcopenia definitions
Individuals were classified as sarcopenic or non-sarcopenic according to each individual sarcopenia definition, as published by International Working Group on Sarcopenia (IWGS), (22) Baumgartner, (23) European Working Group on Sarcopenia in Older People (EWGSOP1), (24) Morley, (25) Delmonico, (26,27) Asian Working Group on Sarcopenia (AWGS), (28) and the Foundation for the National Institutes of Health (FNIH 1 and 2), (29) together with the recently published revised EWGSOP2 guidelines, incorporating definitions of "confirmed" and "severe sarcopenia", (9) and the recent US guidelines from the Sarcopenia Definitions and Outcomes Consortium (SDOC). (7,8) Thus, 11 sarcopenia definitions were explored as the exposure. The majority of these guidelines use thresholds derived from expert consensus, and all but SDOC include a measure of appendicular lean mass. In all but the two data-driven FNIH definitions (in which appendicular lean mass is divided by body mass index), this is incorporated as appendicular lean mass divided by height squared. In the earlier definitions (Baumgartner, Delmonico, FNIH1), the presence of sarcopenia is based solely on the measure of appendicular lean mass. In later definitions (FNIH2, IWGS, EWGSOP 1 and 2, Morley, and AWGS), this is combined with the requirement for impaired strength or function, assessed through grip strength, chair stand time, or gait speed. In EWGSOP2, "confirmed" sarcopenia is based on low DXA ALM/height² in combination with increased chair stand time or low grip strength; additional low gait speed constitutes severe sarcopenia. The SDOC approach dispenses with ALM entirely. The cut points used in the sarcopenia definitions, together with the prevalence of each definition by cohort, are shown in Table 1.
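To illustrate how a composite definition of this kind is operationalized, the sketch below encodes the EWGSOP2 logic described above (low ALM/height² plus low grip strength or slow chair stands for "confirmed"; additionally slow gait for "severe"). The class, function, and constant names are hypothetical, and the numeric cut points are illustrative placeholders; the cut points actually applied in this study are those given in Table 1.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    alm_kg: float          # DXA appendicular lean mass
    height_m: float
    grip_kg: float         # grip strength
    chair_stand_s: float   # time to complete 5 chair stands
    gait_speed_m_s: float  # usual-pace walking speed


# Illustrative placeholder cut points for men (assumptions, not taken from Table 1)
ALM_HT2_CUT = 7.0      # kg/m^2, low mass if below
GRIP_CUT = 27.0        # kg, low strength if below
CHAIR_STAND_CUT = 15.0 # s, impaired if above
GAIT_SPEED_CUT = 0.8   # m/s, impaired if at or below


def ewgsop2_status(m: Measurements) -> str:
    """Return 'none', 'confirmed', or 'severe' under the assumed cut points."""
    low_mass = m.alm_kg / m.height_m ** 2 < ALM_HT2_CUT
    low_strength = m.grip_kg < GRIP_CUT or m.chair_stand_s > CHAIR_STAND_CUT
    if not (low_mass and low_strength):
        return "none"
    if m.gait_speed_m_s <= GAIT_SPEED_CUT:
        return "severe"
    return "confirmed"


if __name__ == "__main__":
    man = Measurements(alm_kg=20.0, height_m=1.75, grip_kg=25.0,
                       chair_stand_s=12.0, gait_speed_m_s=0.7)
    print(ewgsop2_status(man))
```

Because each added criterion can only remove men from the sarcopenic group, definitions built this way necessarily have lower prevalence than the ALM-only definitions they extend.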

Statistical methods
Clinical outcomes comprised any fracture, osteoporotic fracture (defined according to Kanis and colleagues (30) as clinical vertebral, ribs, pelvis, humerus, clavicle, scapula, sternum, hip, other femoral fractures, tibia, fibula, distal forearm/wrist), MOF, and hip fracture. An extension of Poisson regression models (31) was used to study the association between the future risk of fracture and sarcopenia, FRAX, prior falls, and BMD. All associations were adjusted for age and time since baseline. In contrast to logistic regression, the Poisson regression uses the length of each individual's follow-up period and the hazard function is assumed to be $\exp(\beta_0 + \beta_1 \cdot \text{current time from baseline} + \beta_2 \cdot \text{current age} + \beta_3 \cdot \text{variable of interest})$. The observation period of each participant was divided into intervals of 1 month. One fracture per person and time to the first fracture were counted, and time at risk was censored at the time of first fracture, loss to follow-up, death, or end of follow-up. Unlike a Cox model, the Poisson model uses a data duplication method, accounting for the competing mortality risk for fracture risk prediction. (32) We initially investigated the predictive value of each sarcopenia definition adjusted only for age and follow-up time. Subsequently, we used multivariate models to investigate the predictive value of these definitions independent of FRAX, prior falls, or BMD (entered into the model as femoral neck T-score). The association between sarcopenia definition (yes/no) and risk of fracture is presented as a hazard ratio (HR) together with 95% confidence intervals (CI). Two-sided p values were used for all analyses and p < .05 considered to be significant. 
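The structure of this model can be sketched in a few lines: follow-up is split into 1-month records carrying the current time and current age, and the hazard in each record is the exponential of a linear predictor. The function names and the coefficient values below are hypothetical, and the sketch deliberately omits both model fitting and the data duplication step for competing mortality; it only shows the data layout and why exp(β3) is read as the hazard ratio.

```python
import math


def hazard(b0, b1, b2, b3, time_years, age_years, exposed):
    """Hazard assumed by the model: exp(b0 + b1*time + b2*age + b3*exposure)."""
    return math.exp(b0 + b1 * time_years + b2 * age_years + b3 * exposed)


def person_months(baseline_age, followup_months, fractured, exposed):
    """Split one participant's follow-up into 1-month records.

    The event flag is set only in the final month, and only if follow-up
    ended with a first fracture; otherwise the participant is censored.
    """
    rows = []
    for k in range(followup_months):
        t = k / 12.0  # time since baseline at the start of the interval, in years
        rows.append({
            "time": t,
            "age": baseline_age + t,
            "exposed": exposed,
            "event": int(fractured and k == followup_months - 1),
        })
    return rows


if __name__ == "__main__":
    # Made-up coefficients, for illustration only
    b0, b1, b2, b3 = -9.0, 0.02, 0.05, 0.7
    rows = person_months(baseline_age=70.0, followup_months=30, fractured=True, exposed=1)
    print(len(rows), sum(r["event"] for r in rows))
    # At any fixed time and age, exposed vs unexposed hazards differ by exp(b3)
    ratio = hazard(b0, b1, b2, b3, 2.0, 72.0, 1) / hazard(b0, b1, b2, b3, 2.0, 72.0, 0)
    print(round(ratio, 3), round(math.exp(b3), 3))
```

In practice the stacked records from all participants would be passed to a Poisson regression routine with the interval length as exposure.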
Analyses were undertaken separately within each cohort and then the β-coefficients from each cohort were weighted according to the variance and merged to determine the weighted mean of the coefficient and its standard deviation (fixed-effects meta-analysis, since heterogeneity was low to moderate as assessed by $I^2$). (33) The risk ratios are then given by $e^{\text{weighted mean coefficient}}$. Finally, we investigated whether the magnitude of associations differed by age.
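The pooling step described above corresponds to standard inverse-variance fixed-effects meta-analysis, sketched below. The function name and the three cohort coefficients and standard errors are hypothetical, chosen only to show the arithmetic; they are not results from the MrOS cohorts.

```python
import math


def fixed_effects(betas, ses):
    """Inverse-variance fixed-effects pooling of per-cohort coefficients.

    Returns the pooled coefficient, its standard error, and I^2 (%).
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    # Cochran's Q and the I^2 heterogeneity statistic derived from it
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, i2


if __name__ == "__main__":
    # Hypothetical log hazard ratios and standard errors for three cohorts
    betas = [0.30, 0.50, 0.40]
    ses = [0.10, 0.20, 0.20]
    pooled, pooled_se, i2 = fixed_effects(betas, ses)
    hr = math.exp(pooled)
    lower = math.exp(pooled - 1.96 * pooled_se)
    upper = math.exp(pooled + 1.96 * pooled_se)
    print(f"pooled HR {hr:.2f} (95% CI {lower:.2f}-{upper:.2f}), I2 = {i2:.0f}%")
```

Weighting by the inverse of the variance means the largest cohort, which has the most precise estimate, contributes most to the pooled coefficient.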

Results
Characteristics of participants
The study cohort consisted of 10,411 men who had information on the key exposures, together with prior falls and femoral neck BMD: (4) 5660 men in the USA (mean age 73.5 years; mean follow-up 10.9 years); 2764 men in Sweden (mean age 75.4 years; mean follow-up 8.7 years); and 1987 men in Hong Kong (mean age 72.4 years; mean follow-up 9.9 years) (Table 2). Previous fracture was more commonly reported in Sweden (35%) than in the USA (22%) and Hong Kong (14%). The frequency of past falls was similar across the cohorts at 20%, 16%, and 15%, respectively. Consistent with the known country-specific epidemiology of fracture, the highest mean FRAX MOF probability (with BMD) was observed in Sweden (11.4%), followed by the USA (7.8%) and Hong Kong (6.6%). Supplemental Table S1 presents the baseline characteristics according to whether FRAX probability was available for analysis or not, demonstrating that overall, the populations with or without FRAX were similar within countries.

Proportion of cohort defined as sarcopenic by individual definition
The proportion of the population defined as sarcopenic varied markedly by individual definition, much more so than by country cohort (Table 1). Thus, the Baumgartner definition gave the highest prevalence in all three cohorts (35% in Hong Kong, 22% in Sweden, and 21% in USA). In contrast, the proportion who were defined as sarcopenic was 10-fold or more lower using the FNIH2 definition (4% in Hong Kong, 0.4% in Sweden, and 0.9% in USA). The EWGSOP2 severe sarcopenia definition also yielded low prevalences: 3.6% Hong Kong, 0.6% Sweden, 0.5% USA.

Associations between sarcopenia definition and incident fracture
In base models adjusted for age and follow-up time only, all sarcopenia definitions other than FNIH 1 and 2 were predictive of incident fracture across the fracture groupings. Overall, the Baumgartner and Delmonico definitions, which are based on ALM alone, had somewhat lower hazard ratios for fracture (Baumgartner HR for MOF = 1.39 [95% CI 1.22-1.58] and Delmonico HR = 1.40 [95% CI 1.23-1.59]) than did the definitions incorporating ALM and a measure of function or strength, from IWGS, EWGSOP (1 and 2-confirmed), Morley, and AWGS, with hazard ratios for MOF ranging from 1.60 to 1.92. The highest hazard ratios were for the SDOC and EWGSOP2 severe sarcopenia definitions: for example, EWGSOP2 severe, MOF: HR = 2.07 (95% CI 1.28-3.33) and hip: HR = 2.40 (95% CI 1.26-4.57), albeit with relatively wide confidence intervals. These associations are summarized in Table 3 and Fig. 1, and representative associations by country cohort are presented in Supplemental Table S2.
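Because hazard ratios are exponentiated regression coefficients, a reported HR and its 95% CI can be mapped back to the log scale, which is a useful check on the symmetry of an interval. The sketch below does this for the EWGSOP2 severe MOF estimate quoted above (HR 2.07, 95% CI 1.28-3.33); the function names are hypothetical, and the standard error recovered this way is an approximation limited by the rounding of the published figures.

```python
import math


def log_hr_and_se(hr, lower, upper, z=1.96):
    """Recover the coefficient and its standard error from an HR and 95% CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


def hr_with_ci(beta, se, z=1.96):
    """Hazard ratio and 95% CI from a coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    beta, se = log_hr_and_se(2.07, 1.28, 3.33)
    hr, lower, upper = hr_with_ci(beta, se)
    print(f"beta = {beta:.3f}, SE = {se:.3f}")
    print(f"HR = {hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The wide interval for this definition reflects its large standard error on the log scale, which in turn follows from the small number of men classified as severely sarcopenic.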

Effect of adjustment for prior falls, FRAX probability, or femoral neck BMD T-score
Table 3 and Fig. 1 demonstrate that inclusion of prior falls or FRAX MOF probability (calculated without or with femoral neck BMD) in the models in addition to age and follow-up time did not materially change the magnitude of the relationships. In contrast, inclusion of femoral neck BMD T-score in addition to age and follow-up time attenuated the predictive value of sarcopenia definitions. Indeed, the definitions of Baumgartner and Delmonico, together with that of Morley, were no longer statistically significant predictors of incident major osteoporotic fracture.

Interaction between predictive value and age
We investigated whether the magnitude of association between sarcopenia presence and incident fracture differed by age. There was evidence of such an effect, summarized in

Discussion
In this large population of older men, uniformly assessed across three international cohorts, we observed that sarcopenia definitions other than those based on ALM divided by BMI were modestly predictive of incident fractures but that this association was attenuated when femoral neck BMD T-score was incorporated in the regression models. The SDOC and EWGSOP2 severe sarcopenia definitions appeared the most predictive of fracture outcomes but at low population prevalences. Indeed, as has been observed elsewhere, (34) the prevalence of sarcopenia varied 10-fold or more according to individual definitions within each country cohort, which clearly presents some practical difficulties with the operationalization of sarcopenia definitions in clinical practice or in their use as endpoints in clinical trials of agents aimed at treating the condition. This variability will also have an obvious impact on global sarcopenia epidemiology. Our recent findings that ALM/height 2 , a key component of sarcopenia definitions, is poorly predictive of incident fractures in men and women (and even potentially a risk factor for hip fractures) after adjustment for femoral neck BMD (4,35) are consistent with previous studies in the Women's Health Initiative (WHI). In one analysis, WHI participants were classified into mutually exclusive groups based on BMD and sarcopenia (dichotomous variable according to appendicular lean mass adjusted for height and fat mass) status. (6) Although low BMD was associated with increased risk of hip fracture, women with sarcopenia alone had similar hazard ratios for hip fracture to non-sarcopenic women with normal BMD, suggesting that sarcopenia alone is not predictive of hip fracture. In a further WHI study, appendicular lean mass was predictive of incident hip fracture among 872 participants 65 years or older who met Fried's criteria for frailty, but this association did not remain statistically significant after adjusting for total hip BMD. 
(36) Sarcopenia definitions of course reflect the contribution of their constituent parts, be that appendicular lean mass alone or together with gait speed and physical performance measures or grip strength. (2) Evidence to date suggests that it is the appendicular lean mass component, derived from DXA, which limits the predictive capacity of sarcopenia definitions for incident fracture, particularly when femoral neck BMD T-score is also included. (2) Thus, we demonstrated in MrOS that DXA ALM is modestly predictive of incident fracture independently of past falls and FRAX probability. (4) However, the relationship was markedly attenuated by the addition of femoral neck BMD T-score, and indeed greater ALM (or ALM/ht²) appeared to be a risk factor for hip fracture after accounting for femoral neck BMD. A similar finding was observed in the Health ABC study, (5) with the findings possibly suggesting that muscle mass in excess of bone mass might be a pro-fracture state. However, this would seem to be at odds with the general adaptation of bone to muscle, (37) and excess muscle mass or power over bone strength seems unlikely in older men (compared with younger athletes, for example). Similar findings come from the Framingham study, (38) although in the Swiss GERICO study adjustment of low lean mass for BMD did not substantially attenuate associations with incident fracture, (39) and differential patterns by sex have been noted elsewhere. (7)

Models are presented adjusted for age and follow-up time alone and then additionally for either prior falls, FRAX MOF probability without BMD, FRAX MOF probability with BMD, or femoral neck BMD T-score. Associations with p < .05 are in bold. n = 10,411 except for +FRAX with and without BMD (n = 7531).

The reason for this limited value of DXA appendicular lean mass in the prediction of fracture, and indeed in the prediction of other outcomes such as falls and mortality, may reflect several potential factors. (2) First, ALM is not a direct measure of muscle itself but reflects all of the non-fat, non-bone tissue within the body. (40) Second, it is derived from the same measurement instrument as femoral neck BMD; the body composition equations incorporate two compartments to solve the third, meaning that a mathematical relationship between lean mass and BMD is inevitable. (40) Third, there is a clear biological relationship between muscle and bone, elegantly laid out in the mechanostat hypothesis. (37) Thus, there is the potential for ALM to act as a surrogate for BMD if this relationship is not considered. 
Indeed, in our recent analysis of the MrOS cohort, measures of physical performance such as gait speed and chair stand time, together with grip strength, appeared to be rather more robust predictors of incident fracture than did DXA ALM, (4) supporting the notion that it is the non-DXA components of the more recent sarcopenia definitions that drive relationships with incident fracture. Other studies have similarly demonstrated the greater predictive capacity of physical function over DXA ALM as an estimate of muscle mass. (41-45) Taken together, these findings suggest that alternative measures of muscle, such as creatine dilution (46) or muscle cross-sectional area or density from (p)QCT, (47-49) may offer a more useful measure on which to base sarcopenia definitions. These notions have been recognized in the recently revised guidelines from the US Sarcopenia Definitions and Outcomes Consortium, which dispense with ALM completely, and from the European Working Group on Sarcopenia in Older People (EWGSOP2), which incorporate muscle strength rather than mass as the initial assessment, with the latter assessable via a range of measures rather than simply DXA ALM. (50) Of note is that the resulting SDOC and EWGSOP2 severe sarcopenia definitions within this revised guidance were associated with the greatest magnitude of association with incident fracture outcomes in our analyses and that these associations were only modestly attenuated by incorporation of femoral neck BMD T-score. The EWGSOP2 cut-offs for ALM/height², gait speed, and grip strength were similar to those used in the AWGS definition (which is, of course, predicated on the generally smaller body size in the Asian population), and for grip strength, in the FNIH2 definition, but the EWGSOP2 definition differs from AWGS in the incorporation of chair stand time (previously shown to predict fracture risk (4,44)) and from FNIH2 in the use of ALM/ht² rather than ALM/BMI. 
It is therefore likely that this additional functional component reduces the attenuating effect of BMD, but this apparent advantage must be taken in the context of the very low population prevalence of sarcopenia using these criteria. It is notable that adjustment for FRAX probability calculated with inclusion of femoral neck BMD had a much less obvious influence on these associations. Although this may at first sight appear surprising, it is important to recognize that femoral neck BMD T-score is a very different construct to FRAX probability calculated with BMD as one of several input variables, all interlinked through a multivariate structure and generating a probability over 10 years, which synthesizes risk of fracture with risk of death.
The evidence of an age interaction (illustrated for the EWGSOP2 definition in Table 4, and for other definitions in Supplemental Table S3) suggests that sarcopenia is actually more predictive of fracture at younger compared with older ages, although given the potential for healthy selection bias in cohorts, it is possible that a less marked pattern would be observed in a completely unselected general population. Importantly, the Poisson model uses a data duplication approach that accounts for the competing hazard of mortality, thus reducing the likelihood that this finding is simply attributable to higher mortality at older ages. Although this may lessen the clinical impact of such definitions in the populations at highest risk, this observation is similar to the age patterns documented with many risk factors; essentially, occurrence of a risk factor becomes less unusual compared with the general population as age increases. The perhaps more pertinent clinical implication arises in the markedly diverse range of prevalences according to definition and country cohort. Clearly a definition that detects less than 0.5% of the population as having the condition of interest is likely to have very limited impact on health care, a notion that seems rather at odds with the widespread recognition of the increasingly elderly and frail populations in many countries globally. (51)
We studied three well-characterized cohorts drawn from general populations with standardized assessments and prospective recording of fractures. However, there are some limitations that should be considered in the interpretation of our findings. (16) First, the population studied was exclusively male, and of a modest age range (64 to 99 years), so limiting generalizability of our findings. Second, we were limited to DXA measures of lean mass, so that both lean and bone measures were obtained from the same scanner and DXA only approximates muscle mass. 
Third, comparison of predictive value would ideally compare exposures with similar prevalence in the population. Because the sarcopenia definitions were dichotomous, this was clearly not feasible, and it is therefore possible that some of the difference in the effect size between definitions was determined by the differences in prevalence. It is notable, however, that patterns for the EWGSOP2 confirmed, Morley, and AWGS definitions, which were all of similar prevalence, were broadly similar. Fourth, there was no information on causes of secondary osteoporosis (other than rheumatoid arthritis and glucocorticoids), and this variable was therefore set to no. The effect of these considerations on our findings is uncertain but may have led to an underestimation of risk by FRAX. Finally, the definition of glucocorticoid use differed from those usually specified for incorporation into FRAX.
In conclusion, we have demonstrated novel findings that sarcopenia definitions based on appendicular lean mass corrected for height squared, but not corrected for BMI, are modestly predictive of incident fracture after adjustment for FRAX probability and more so for those definitions also incorporating measures of physical performance and/or muscle strength. The predictive value of such definitions may vary with age and is attenuated to different degrees by the inclusion of femoral neck BMD T-score. This latter observation is consistent with numerous observations suggesting the limited value of DXA appendicular lean mass as a measure of muscle in fracture risk assessment but has not been quantified previously. Our findings thus support the inclusion of physical performance measures in the assessment of sarcopenia and also the investigation of alternative measures of muscle such as creatine dilution and pQCT, which may prove to be more usefully incorporated into sarcopenia definitions, at least in the context of predicting incident fractures.
tion. EVM and JAK oversee FRAX and provided FRAX methodology. NCH