Adiposity assessments: agreement between dual-energy X-ray absorptiometry and anthropometric measures in US children

Objectives To evaluate the performance of anthropometric measures relative to percentage body fat (%BF) measured by dual-energy X-ray absorptiometry (DXA) in children. Design and Methods We used data from 8-19-y-old US children enrolled in a nationally representative cross-sectional survey in 2001-2004 (n=5,355) with measured %BF, body mass index (BMI), waist circumference (WC), waist-to-height ratio (WHtR), and triceps skinfold thickness (TSF). Agreement and prediction were evaluated based on standardized regression coefficients (β), kappa, and the area under the receiver-operating characteristic curve (AUC). Results The association between Z-scores for %BF and anthropometric measures was strong (β of ~0.75-0.90, kappa of ~0.60-0.75, and AUC of ~0.87-0.98; P<0.001 for all), with only some variation by race-ethnicity, mostly in girls. In boys, TSF and WHtR Z-scores had stronger agreement with %BF than BMI (β of 0.91 and 0.86 vs. 0.79, kappa of 0.75 and 0.71 vs. 0.59, and AUC of 0.97 and 0.97 vs. 0.91; P<0.05 for all). In boys with BMI < median but %BF ≥ median, the β value for the TSF Z-score was higher than that for BMI. In girls, TSF also showed higher agreement than BMI, but the difference was statistically significant only for kappa. Conclusions High agreement and small racial-ethnic variations in the association between percentage body fat and anthropometric measures support the use of anthropometric measures, especially waist-to-height ratio and triceps skinfold thickness, as proxy indicators for adiposity.


Introduction
Although obesity is defined as excess fatness (1), anthropometric measures such as body mass index (BMI) are used for screening, diagnosis, and treatment of childhood obesity (2)(3)(4). Because anthropometric measures are non-invasive, inexpensive, easy to use, and relatively precise, they are commonly collected in clinical practice and field research (5,6). Anthropometric measures, however, are indirect measures and might yield biased estimates of adiposity (5)(6)(7). In addition, challenges associated with the use of anthropometric measures in children are the variation of anthropometric measures by age, sex, and race-ethnicity (5-7) as well as associated measurement errors (8). Dual-energy X-ray absorptiometry (DXA) provides direct, precise, and accurate measures of adiposity, but it is a less practical approach because of the cost and operation issues (5,6,9). Quantifying the performance of different anthropometric measures based on directly measured adiposity and understanding potential racial-ethnic differences are needed to guide the use of anthropometric measures.
In earlier studies, increased %BF was more strongly associated with, or better predicted by, BMI than waist circumference (WC) in 9 to 10-year-old English children (17), triceps skinfold thickness (TSF) than BMI in 10 to 15-year-old Portuguese children (16), WC than waist-to-hip ratio in 3 to 19-year-old New Zealanders (15), and the sum of subscapular and triceps skinfold thicknesses than BMI in 8 to 19-year-old U.S. children (12). With one exception (12), however, these studies were based on small samples (15,16) with narrow age ranges (15)(16)(17) and did not evaluate racial-ethnic variations (15)(16)(17). The study by Freedman et al. (12) did not explore the performance of WC and waist-to-height ratio (WHtR).
Our study evaluated the performance of different anthropometric measures relative to %BF as measured by DXA in 8 to 19-year-old U.S. children based on nationally representative data. Specifically, by sex and race-ethnicity, we (a) examined the agreement between Z-scores for %BF and BMI, WC, WHtR, and TSF; and (b) compared the performance of these anthropometric measures for screening high %BF (≥75th percentile). Understanding the performance of different anthropometric measures and related potential differences by age, sex, and race-ethnicity would help to identify valid anthropometric measures as indicators of obesity and to evaluate the need for racial-ethnic-specific anthropometric references if they are based on measured adiposity. Using anthropometric measures to identify at-risk children promptly and correctly in clinics or population surveys for appropriate intervention would help to lessen the burden of obesity and related disorders.

Study population
We used data from 5,355 children aged 8 to 19 years who were enrolled in the 2001-2004 National Health and Nutrition Examination Survey (NHANES) and had measured fat mass, weight, height, WC, and TSF. NHANES is a nationally representative, multiethnic sample of the U.S. civilian population and included a household interview followed by a standardized physical examination and laboratory assessment (18). We restricted our analysis to the NHANES 2001-2004 data because (a) DXA-measured fat mass data were only available in NHANES 1999-2004 and (b) DXA data for NHANES 1999-2000 were not available in the public domain for 8 to 17-year-old girls (18).

Key study variables
Body fat and high %BF. Body fat was estimated by whole-body DXA scans (Hologic QDR 4500A fan-beam densitometer; Discovery software version 12.1) administered to participants 8 years of age and older, excluding pregnant females and participants who reported taking tests with radiographic contrast material in the preceding 72 hours, who had participated in nuclear medicine studies in the previous 3 days, or who had a self-reported weight or height over the DXA table limit (weight >300 lb or 136 kg; height >6′5″ or 196 cm) (18). DXA is considered a quick, accurate, precise, low-risk approach to assess body composition and is often used as a gold standard method in field studies (5,6).
In our analysis, we focused on %BF; high %BF was defined as an age- and sex-specific %BF at or above the 75th percentile (≥75th percentile) for this population. The 75th percentile of %BF in 2001-2004 was equivalent to a %BF of 26%-33% in boys and 36%-38% in girls (Table 1). Population-specific percentiles of %BF have been used by other authors (13,14) because there is no agreement about pediatric body fat cut points (7). We chose not to use fat mass (kg) as a study outcome because of the additional complexity in data analysis and interpretation due to its variation by body weight (5).
Anthropometric measures. We used the following measures: Body mass index (BMI; kg/m²) was calculated based on weight and height, which were measured by trained health workers who followed standardized procedures and used regularly calibrated Seca electronic scales and stadiometers (18).
Waist circumference (WC; cm) was measured at the uppermost lateral border of the hip crest (ilium) to the nearest mm using steel measuring tape (18). We also used WHtR (WHtR = waist circumference/height) to take into account the variation in height.
Triceps skinfold thickness (TSF; mm) was measured to the nearest 0.1 mm using a Holtain skinfold caliper (18). TSF represents extremity fat depots and is among the most commonly collected skinfold thicknesses (5).
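The two derived indices above (BMI and WHtR) follow directly from the measured weight, height, and waist circumference. The short Python sketch below only illustrates the arithmetic; the function names and the example child are hypothetical and are not taken from the NHANES data.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight (kg) and height (cm)."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def waist_to_height_ratio(waist_cm: float, height_cm: float) -> float:
    """Waist-to-height ratio; both inputs must be in the same unit."""
    return waist_cm / height_cm


# Hypothetical child (illustrative values only): 45 kg, 150 cm tall, 70 cm waist.
print(round(bmi(45.0, 150.0), 1))                    # 20.0
print(round(waist_to_height_ratio(70.0, 150.0), 2))  # 0.47
```

Because WHtR divides one length by another, it is unitless as long as waist and height share a unit.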
Z-scores and percentiles of %BF and anthropometric measures. We calculated age- and sex-specific Z-scores and percentiles for these measures to facilitate the comparison across age and sex groups (see below).
Demographic characteristics. Age, sex, and race-ethnicity information was obtained from the direct interview. Race-ethnicity was classified as non-Hispanic white (white), non-Hispanic black (black), Mexican American, and others (18). Racial-ethnic composition of this sample was 1,538 whites, 1,771 blacks, 1,627 Mexican Americans, and 419 others, which represent the U.S. population composed of 62.0% whites, 14.5% blacks, 11.5% Mexican Americans, and 12.0% others. We did not report findings for children belonging to other ethnicities because it was a small, heterogeneous group as regards %BF and anthropometric measures.

Statistical analysis
We calculated age- and sex-specific Z-scores using the LMSChartmaker Pro 2.54 (19) and then performed all analyses using Stata 11.2 (Stata Inc., TX). Survey commands were used in all analyses (except for the kappa statistic) to take into account NHANES' complex survey design and to represent the 40 million U.S. children. We stratified our analyses by racial-ethnic group and compared results across racial-ethnic groups and anthropometric measures using two-sided χ² tests (for proportions) and t tests (for means). Because we did not use imputed data, multiple-imputation data analysis was not necessary.
First, we calculated age- and sex-specific Z-scores to facilitate comparisons across these groups. We established age- and sex-specific reference values for %BF and each anthropometric measure based on the NHANES 2001-2004 population (n = 5,355) using the LMS method (19). The LMS method summarizes the distribution of each measure by age and sex in terms of three curves called L (lambda, skewness), M (mu, median), and S (sigma, coefficient of variation), which are needed to transform the data to near normality (19). This approach was used to develop the U.S. 2000 CDC Growth References, the World Health Organization (WHO) Growth Standards, and the International Obesity Task Force (IOTF) BMI cut points (20)(21)(22)(23). For the purpose of this study, we chose to use internal BMI Z-scores because this approach (a) facilitates the comparison between %BF and BMI (e.g., for kappa statistics) and (b) could be performed consistently with other anthropometric measures (no universal agreement about reference curves).
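To make the LMS step concrete, the sketch below applies the standard LMS transformation, Z = ((y/M)^L − 1)/(L·S) for L ≠ 0 and Z = ln(y/M)/S for L = 0, to a single measurement. It is a minimal illustration only: the function names and the L, M, S values are hypothetical, and the fitting of the L, M, and S curves themselves (done in LMSChartmaker in this study) is not shown.

```python
import math


def lms_z_score(y: float, L: float, M: float, S: float) -> float:
    """Z-score of measurement y given LMS parameters for the child's age and sex."""
    if y <= 0 or M <= 0 or S <= 0:
        raise ValueError("y, M, and S must be positive")
    if abs(L) < 1e-12:
        # Limiting case of the Box-Cox power transformation when L is zero
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)


def lms_value_at_z(z: float, L: float, M: float, S: float) -> float:
    """Inverse transformation: the measurement that corresponds to a given Z-score."""
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Hypothetical LMS parameters for one age-sex group (not values from this study).
L_par, M_par, S_par = -1.5, 18.0, 0.12
print(lms_z_score(18.0, L_par, M_par, S_par))  # 0.0, a value at the median
```

A measurement equal to the median M always maps to Z = 0, and the inverse function can be used to read off percentile values (for example, Z ≈ 0.674 for the 75th percentile).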
Values of %BF and anthropometric measures of each child were compared with corresponding, newly developed age- and sex-specific reference values to estimate Z-scores and percentiles. Selected percentile values are presented in Table 1. All measures varied substantially by age and sex, except for WHtR (the 50th percentile of about 0.47 and 75th percentile of about 0.52). The Z-scores/percentiles were used in continuous, dichotomized (e.g., <75th vs. ≥75th percentile), and categorical (quartiles) scales, when appropriate.
Second, to examine the pattern of different indicators for adiposity, we compared Z-scores for %BF, BMI, WC, WHtR, and TSF and the prevalence of high %BF (%BF ≥75th percentile) across sex and racial-ethnic groups.
Third, we used linear regression models to examine the agreement between Z-scores for %BF and BMI, WC, WHtR, and TSF (four pairs), stratified by sex and race-ethnicity. We repeated this analysis for children with a lower BMI but higher %BF (BMI < median but %BF ≥ median) and a higher BMI but lower %BF (BMI ≥ median but %BF < median). We reported regression coefficients (β) and coefficients of determination (R²) from all linear regression models. We also presented the best fit curves (LOWESS, locally weighted regression scatter-plot smoothing) for the association between the Z-score of %BF and the Z-score of each anthropometric measure.
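As a rough illustration of the regression step, the sketch below fits a weighted simple linear regression of the %BF Z-score on one anthropometric Z-score and returns the slope, intercept, and R². It is a minimal sketch in plain Python, not the Stata survey procedure used in the analysis: the function name and example values are hypothetical, the sampling weights are treated as simple case weights (an assumption), and no design-based standard errors are produced.

```python
def weighted_simple_regression(x, y, w=None):
    """Weighted least-squares fit of y = intercept + slope * x.

    x: anthropometric Z-scores; y: %BF Z-scores; w: optional case weights.
    Returns (slope, intercept, r_squared).
    """
    if w is None:
        w = [1.0] * len(x)
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical Z-scores for four children (illustrative only).
tsf_z = [-1.0, 0.0, 1.0, 2.0]
bf_z = [-0.8, 0.1, 0.9, 1.8]
slope, intercept, r2 = weighted_simple_regression(tsf_z, bf_z)
print(round(slope, 2))  # 0.86
```

Running the same function within each sex and racial-ethnic stratum, once per anthropometric measure, mirrors the four pairs described above.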
Fourth, the agreement between quartiles of %BF and of each of the anthropometric measures was assessed using weighted kappa, which ranges from 0 (lowest) to 1 (highest) (24), stratified by sex and race-ethnicity. A kappa of ≥0.8, 0.6-0.79, 0.4-0.59, and 0-0.39 denotes excellent, substantial, moderate, and marginal reproducibility, respectively (24). A weighted kappa assigns less weight to agreement as categories are further apart. Applying the linear weight equation (linear weight = 1 − distance/distance_max) to four groups, the weights are 1, 0.67, 0.33, and 0 for distances of 0, 1, 2, and 3, respectively (24).
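The linear weighting scheme can be sketched in a few lines of Python. The function below computes an unweighted-sample, linearly weighted kappa for two sets of quartile labels coded 0 to 3; it is an illustrative sketch with a hypothetical function name and made-up data, not the Stata routine used in the analysis.

```python
def linear_weighted_kappa(quartiles_a, quartiles_b, n_categories=4):
    """Linearly weighted kappa for two lists of category codes 0..n_categories-1."""
    n = len(quartiles_a)
    k = n_categories
    # Observed joint proportions of (category by measure A, category by measure B)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(quartiles_a, quartiles_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # 1, 0.67, 0.33, 0 for distances 0, 1, 2, 3 when k = 4
        return 1.0 - abs(i - j) / (k - 1)

    p_obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    p_exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when expected agreement equals 1")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical quartile codes for %BF and TSF in eight children (illustrative only).
bf_quartile = [0, 0, 1, 1, 2, 2, 3, 3]
tsf_quartile = [0, 1, 1, 1, 2, 3, 3, 3]
print(round(linear_weighted_kappa(bf_quartile, tsf_quartile), 2))
```

With linear weights, a child placed one quartile apart by the two measures still contributes two thirds of a full agreement, whereas a child placed in opposite extreme quartiles contributes nothing.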
Finally, we estimated the area under the receiver-operating characteristic (ROC) curve (AUC) (25) for the screening of a %BF ≥75th percentile by using different anthropometric measures, stratified by sex and race-ethnicity. AUC values are usually used as criteria to evaluate and compare the overall performance of different screening tests in correctly classifying those with and without a condition (25). For example, an AUC value of 0.8 indicates that 80% of the time, a randomly selected individual from the diseased group has a test value larger than that for a randomly selected individual from the nondiseased group. AUC values range from 0.5 (no prediction) to 1.0 (perfect prediction) (25).
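The pairwise interpretation of the AUC given above can be computed directly: compare every child with high %BF against every child without it and count how often the first has the larger anthropometric Z-score. The sketch below does exactly that in plain Python; the function name and the example Z-scores are hypothetical, ties are counted as one half (a common convention), and survey weights are ignored, which is a simplification relative to the analysis described here.

```python
def auc_by_pairs(scores_high_bf, scores_not_high_bf):
    """AUC as the share of (high, not-high) pairs ranked correctly; ties count 0.5."""
    if not scores_high_bf or not scores_not_high_bf:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for high in scores_high_bf:
        for low in scores_not_high_bf:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5
    return wins / (len(scores_high_bf) * len(scores_not_high_bf))


# Hypothetical WHtR Z-scores (illustrative only).
high_bf = [1.2, 0.8, 2.0]       # children with %BF at or above the 75th percentile
not_high_bf = [-0.5, 0.9, 0.1]  # children below the 75th percentile
print(round(auc_by_pairs(high_bf, not_high_bf), 2))  # 0.89
```

In this toy example eight of the nine pairs are ordered correctly, so the AUC is 8/9.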

Racial-ethnic variation in adiposity
In boys, Table 2 shows that the 37% prevalence of high %BF (≥75th percentile) in Mexican Americans was higher than among whites (30%; P < 0.10) and blacks (20%; P < 0.05). The Z-score for %BF in black boys was smaller than that for whites and Mexican Americans (P < 0.05). Similarly, the median %BF across age in white boys was approximately 2.5-4 percentage points higher than in blacks and approximately 2-4 percentage points lower than in Mexican Americans (data not shown). Z-scores for anthropometric measures also varied by race-ethnicity: lower in black boys than whites (WC, WHtR, and TSF; P < 0.05); lower in blacks than Mexican Americans (BMI, WC, WHtR, and TSF; P < 0.05); and lower in whites than Mexican Americans (WHtR; P < 0.05).
In girls, the 33% prevalence of high %BF in Mexican Americans was also higher than in whites (25%; P < 0.10) and blacks (25%; P < 0.05). The Z-score for %BF in Mexican American girls was higher than in whites and blacks (P < 0.05); while the Z-score for %BF was higher in whites and blacks (P < 0.05). Similarly, median %BF in white girls was approximately 0-2.5 percentage points higher than in blacks and approximately 0-2.5 percentage points lower than in Mexican Americans (data not shown). The Z-score for BMI was higher in black girls than in whites and Mexican Americans (P < 0.05); and the Z-score for WHtR was higher in Mexican American girls than in white and black girls (P < 0.05; Table 2).

Agreement between Z-scores for %BF and anthropometric measures
Standardized regression coefficients (β) between Z-scores for %BF and anthropometric measures were approximately 0.75-0.90. In boys, β for the TSF Z-score was 0.91, higher than for the WHtR Z-score (0.86; P < 0.05), the WC Z-score (0.83; P < 0.05), and the BMI Z-score (0.79; P < 0.05), a difference driven by whites and Mexican Americans (Table 3). In girls, the β values were similar across race-ethnicity and anthropometric measures, except for being higher (P < 0.05) in Mexican Americans than in blacks for the BMI Z-score (Table 3). A similar trend was found for R² (Table 3). Table 4 shows that, among those with a lower BMI but higher %BF or a higher BMI but lower %BF, (a) β values and R² were lower than the values from all participants presented in Table 3 and (b) the β value of the TSF Z-score was higher in boys than in girls. In boys with a lower BMI but higher %BF, the β value of the TSF Z-score was higher than those of the BMI and WC Z-scores (Table 4).
Figure 1 shows that, across sex and racial-ethnic groups, anthropometric Z-scores correlated well with the Z-score for %BF, with a similar slope when anthropometric Z-scores were greater than or equal to −1. However, these slopes varied by anthropometric measure when the anthropometric Z-scores were less than −1. The slope for the association between Z-scores for TSF and %BF was more consistent than for other anthropometric Z-scores. Figure 2 shows that weighted sex-specific kappa values for the agreement between quartiles of %BF and of each of the anthropometric measures were approximately 0.60-0.75, were higher in boys than girls (WHtR and TSF; P < 0.05), and increased in the order BMI, WC, WHtR, TSF. Kappa for BMI in black boys was lower than in Mexican American and white boys (P < 0.05). In girls, kappa was lower in Mexican Americans than in whites (BMI; P < 0.05) and blacks (TSF; P < 0.05). Kappa values were higher in boys than girls for TSF (in whites, Mexican Americans, and overall; P < 0.05) and WHtR (in Mexican Americans and overall; P < 0.05).

Table 3 notes. Abbreviations: BMI, body mass index; R², coefficient of determination; TSF, triceps skinfold thickness; WC, waist circumference; WHtR, waist-to-height ratio. 1 Age- and sex-specific Z-scores were estimated using the LMS curve-fitting procedure based on the NHANES 2001-2004 population. 2 Data were weighted to represent the 40 million US children. All coefficients differ from zero (P < 0.001). *Differed from whites within sex; †differed from blacks within sex; ‡differed from boys within race-ethnicity (P < 0.05; t-test). a,b,c Values in a subgroup (e.g., by sex and racial-ethnic group) with different superscript letters (a, b, or c) were significantly different (P < 0.05; t-test); a group without the letters indicates no difference (P ≥ 0.05; t-test).

Performance of anthropometric measures for the prediction of high %BF
AUC was used to evaluate the performance of each anthropometric measure for the screening of high %BF (≥75th percentile) by sex and race-ethnicity. Sex-specific AUC values ranged from 0.89 to 0.97 and varied across anthropometric measures: WHtR and TSF provided statistically higher AUC than BMI and WC in boys (overall and in each race-ethnicity; P < 0.05) and girls (overall for WHtR; P < 0.05; Figure 3). In girls, AUC values for BMI, WC, WHtR, and TSF were lower (P < 0.05) in Mexican Americans than in blacks. AUC values for TSF were smaller in girls than in boys for overall and ethnic-specific estimates (P < 0.05). AUC values for WHtR were smaller (P < 0.05) in girls than in boys overall and in Mexican Americans (Figure 3).

Discussion
Based on nationally representative data from 8 to 19-year-old U.S. children, we found that Z-scores for anthropometric measures were highly correlated with the Z-score for DXA-measured %BF across racial-ethnic groups and that anthropometric measures predicted increased %BF well. Moderate to high agreement between anthropometric measures and %BF has been reported earlier using different approaches (12)(13)(14)(15)(16)(17). In addition, WHtR and TSF had a stronger association with %BF and provided a better prediction of increased %BF than BMI and WC, especially in thin boys.
Our study suggests that TSF and WHtR are very good predictors of DXA-measured %BF in pediatric obesity research and in clinical settings. TSF was strongly correlated with %BF and provided higher kappa and AUC values than BMI and WC, especially among boys with a lower BMI but higher %BF. TSF is an important complement to BMI, which is limited in assessing adiposity in thin children (12). TSF provides a more direct measure of adiposity than other anthropometric measures (6), predicts %BF (16,26), and is commonly used in equations for body composition estimation (5). Several reference curves (27,28) and supporting software, such as LMSgrowth (29), have been developed and used.
In addition to TSF, WHtR can be a useful measure, considering its accuracy, precision, accessibility, and acceptability as well as the availability of reference values (3). WHtR correlated well with %BF and was previously found to predict increased cardiometabolic risk in cross-sectional and longitudinal studies (30,31). Similar to previous studies (31-34), we found that WHtR was similar across age and sex (e.g., 75th percentiles were around 0.51 in boys and around 0.52 in girls) and height independent (correlation = 0.01; P = 0.40), which supports the use of a WHtR cut point of 0.50 to define central obesity in children and adults (31)(32)(33)(35).

TABLE 4 Standardized regression coefficients (±SE) and coefficient of determination (R²) for the agreement between age- and sex-specific Z-scores for percentage body fat (%BF) and individual anthropometric measures among children with lower BMI but higher %BF and higher BMI but lower %BF. Abbreviations: BMI, body mass index; R², coefficient of determination; TSF, triceps skinfold thickness; WC, waist circumference; WHtR, waist-to-height ratio. 1 Lower BMI but higher %BF: BMI < median but %BF ≥ median. 2 Higher BMI but lower %BF: BMI ≥ median but %BF < median. ‡Differed from boys within race-ethnicity; §coefficients differ from zero (P < 0.05; t-test). a,b Values in a subgroup (e.g., by sex) with different superscript letters (a or b) were significantly different (P < 0.05; t-test); a group without the letters indicates no difference (P ≥ 0.05; t-test).
Weight, height, and BMI are accurate, precise, accessible, and acceptable, with widely accepted reference values, and thus should continue to be used as proxy indicators for adiposity in pediatric research and clinical practice (2,(6)(7)(8). BMI levels (e.g., normal weight, overweight, and obesity) classified children well by adiposity levels (e.g., normal, high) (13,14,20,36). However, the definition of normal weight in these studies (13,36) covered a wide range of BMI: <85th percentile of the CDC BMI growth charts (13) or corresponding to a BMI of <25 kg/m² at 18 years of the IOTF BMI cut points (36), which might lead to an overestimate of the overall agreement, especially in thinner children (11). We evaluated the agreement based on quartiles of %BF and BMI and found a high agreement between adiposity and BMI, which supports the use of BMI to rank children by body fat.
In addition, we explored the sex-racial-ethnic variation in the agreement between anthropometric measures and %BF. Kappa for the agreement between BMI and %BF was higher in Mexican American boys than in blacks and whites, and in white girls than in Mexican Americans (P < 0.05). Because black boys tend to have a lower %BF at a given BMI than Mexican Americans (37), there would be more of a misclassification of %BF by BMI (e.g., there would be high BMI but low %BF) in blacks, and thus less agreement. In contrast, because Mexican American girls tend to have higher %BF than whites at a given BMI (37), there would be more of a misclassification of %BF by BMI (e.g., having a low BMI but a high %BF) in Mexican Americans, and thus less agreement. AUC values for all anthropometric measures were lower in Mexican Americans than in black girls (P < 0.05). Nonetheless, the absolute difference in AUC was small (0.03-0.06). The variation in body composition due to social and environmental factors or biological genetic heredity is not easily distinguished (38), especially with the use of self-identified race-ethnicity (39).
We would expect very similar regression coefficients (β) and coefficients of determination (R²) if the BMI Z-scores from the CDC or WHO growth charts had been used because these BMI Z-scores are highly correlated with our population-specific Z-scores (Pearson's correlation coefficients of 0.99; P < 0.001). However, using the 25th, 50th, and 75th percentiles of the WHO or CDC growth charts would lead to lower kappa and AUC values because the corresponding values are 0.4 to 0.8 Z-score units lower than those of our population-based reference curves. Because there is no universal agreement about Z-scores, percentiles, and cut points for WC, WHtR, TSF, and %BF, the use of population-specific Z-scores and percentiles is more feasible.

FIGURE 1 Associations between age- and sex-specific Z-scores for percentage body fat (%BF) and anthropometric measures: LOWESS smoothing procedure. BMI, body mass index; TSF, triceps skinfold thickness; WC, waist circumference; WHtR, waist-to-height ratio. Age- and sex-specific Z-scores were estimated using the LMS curve-fitting procedure based on the NHANES 2001-2004 population.
There are several strengths of this study. First, data analysis was based on adiposity measured accurately using DXA (6,7) from a large, nationally representative, multiethnic sample of U.S. children. Each sex and racial-ethnic group has a similar sample size owing to the oversampling of ethnic minority groups (18), which facilitates the comparison across groups. Second, we examined different aspects of agreement (e.g., regression coefficients, coefficient of determination, kappa, and AUC) by sex, race-ethnicity, and anthropometric measures. Third, we developed age- and sex-specific reference values for %BF and each anthropometric measure to calculate Z-scores and percentiles. Using the age- and sex-specific, normally distributed Z-score provides a more accurate linear regression coefficient than the use of skewed %BF and anthropometric measures, facilitates the comparison across groups, and increases the power of statistical tests. Using Z-scores is important because %BF and anthropometric measures in children vary by age and sex and are non-normally distributed (3,5,22,40). In addition, age- and sex-specific reference values for %BF, WC, WHtR, and TSF could be used in LMSgrowth, a freeware program, to generate the Z-scores and percentiles of other data for 8 to 19-year-old children, which can be a useful reference for future research.
Several limitations might affect the generalizability of our findings. First, the data did not allow us to provide findings for certain groups such as children younger than 8 years old (DXA data were not obtained) or other race-ethnicities (a small heterogeneous group). Second, we used a sample with a complete measure of %BF and studied anthropometric measures in NHANES 2001-2004, which could have excluded some children whose weight exceeded the equipment capacities. A very heavy child is classified as obese regardless of adiposity measures, which leads to an increase in numerators for all estimates. Thus, the agreement between %BF and anthropometric measures could be slightly stronger if there were no missing values for %BF. Third, we chose not to include subscapular skinfold thickness because there were about 500 missing values (20% of them were due to exceeding the capacity of the skinfold caliper), which is less likely to be a random situation.

FIGURE 2 Weighted kappa (95% CI) for the agreement between quartiles of percentage body fat (%BF) and anthropometric measures. BMI, body mass index; TSF, triceps skinfold thickness; WC, waist circumference; WHtR, waist-to-height ratio. *Differed from whites within sex; †differed from blacks within sex; ‡differed from boys within race-ethnicity (P < 0.05; t-test). a,b,c Values in a subgroup (e.g., by sex and racial-ethnic group) with different superscript letters were significantly different (P < 0.05; t-test); a group without the letters indicates no difference (P ≥ 0.05; t-test).
Fourth, we limited our analysis to the NHANES 2001-2004 data. In 1999, DXA was not performed in about 600 girls aged 8-17 years due to a lack of institutional review board approval; thus, DXA data for NHANES 1999-2000 were not available in the public domain for this age group (13). Missing DXA data had been imputed and provided by the National Center for Health Statistics at the CDC (18). We chose not to use imputed data for our analysis because (a) anthropometric measures, age, sex, and race-ethnicity were key predictors for imputing %BF (18); (b) with a smaller sample size, we would have a more conservative finding; and (c) several key analyses (e.g., correlation, kappa) could not be performed in the multiple imputation analysis that is required for imputed data (18).
In conclusion, the strong agreement between %BF and anthropometric measures such as BMI, WC, WHtR, and TSF demonstrated a good prediction of increased %BF by anthropometric measures; the small racial-ethnic differences in the associations between anthropometric measures and %BF revealed by the large, nationally representative sample in the U.S. support the use of anthropometric measures in children in obesity research. Because WHtR and TSF predict increased %BF well and are easily assessed, their use in pediatric research and clinical practice should be promoted, especially in thin boys. To facilitate the use of WHtR and TSF, universally accepted reference curves should be developed.

FIGURE 3 AUC (95% CI) for the prediction of high percentage body fat (%BF ≥75th percentile) by anthropometric measures. BMI, body mass index; TSF, triceps skinfold thickness; WC, waist circumference; WHtR, waist-to-height ratio. *Differed from whites within sex; †differed from blacks within sex; ‡differed from boys within race-ethnicity (P < 0.05; t-test). a,b Values in a subgroup (e.g., by sex and racial-ethnic group) with different superscript letters were significantly different (P < 0.05; t-test); a group without the letters indicates no difference (P ≥ 0.05; t-test).