Validation of predictive equations to estimate resting metabolic rate of females and males across different activity levels

Using equations to predict resting metabolic rate (RMR) has yielded different degrees of validity, particularly when sex and different physical activity levels were considered. Therefore, the purpose of the present study was to determine the validity of several different predictive equations to estimate RMR in female and male adults with varying physical activity levels.


| INTRODUCTION
Resting metabolic rate (RMR) is usually defined as the minimum energy an individual needs to sustain basic vital functions (Henry, 2005; Leonard, 2012). Comprising 60%-75% of total energy expenditure (TEE), it is considered the largest component of TEE (Donahoo et al., 2004; Speakman & Selman, 2003). Indirect calorimetry is the gold standard for measuring RMR, but it is expensive, time-consuming, and requires trained technicians to perform. As such, predictive equations are a convenient, often-used alternative to estimate RMR (Levine, 2005).
There are significant differences in body composition (fat-free mass [FFM] and fat mass [FM]) between sexes (Karastergiou et al., 2012; Klausen et al., 1997). However, some (Lindsey et al., 2021), but not all, previous studies have addressed the influence of sex on the accuracy of predictive equations. Moreover, it has recently been suggested that variability in RMR is greater in males than females, even after adjusting for differences in FFM (Halsey et al., 2022; but see Buchholz et al., 2001). In addition, physical activity tends to be lower in females compared with males (Matthews et al., 2023) and this could impact RMR (Speakman & Selman, 2003; Thompson & Manore, 1996). Therefore, sex differences in physical activity (Hands et al., 2016) may also impact the accuracy of RMR equations.
The magnitude of under- or overestimation with predictive equations has also been related to demographic characteristics of the populations under study (see Henry, 2005 and the references therein), with some of the historical equations having an overrepresentation of a specific group (e.g., Italian males in Schofield, 1985). These equations were developed in Caucasians (Reneau et al., 2019). However, the use of different populations may contribute to variability in RMR (Sabounchi et al., 2013) due, at least in part, to an effect of body composition and body fat distribution (Reneau et al., 2019). In addition, environmental adaptations have a significant effect on RMR (Froehle, 2008; Leonard et al., 2002; Ocobock, 2016, 2023; Ocobock et al., 2020). However, neither ethnicity nor environmental adaptations (such as temperature, diet, or lifestyle) were considered in historical formulas to predict energy requirements (Froehle, 2008; Leonard et al., 2005; Reneau et al., 2019; Snodgrass et al., 2005). These limitations led to the necessity of testing new equations based on recent, large, and diverse datasets, such as the equation recently developed by Pontzer et al. (2021).
Thus, the purpose of the present study was to determine the validity of predictive equations based on body mass and fat-free mass to estimate RMR in female and male adults with varying physical activity levels.

| Participants
As part of a larger study, 50 healthy adults (26 females, 24 males), aged 19-58 (35 ± 10) years were recruited. The participants were uniformly distributed across levels of self-reported physical activity, ranging from sedentary (walking and/or running 0 km per week) to ultraendurance level (regularly running more than 80 km/week) (Figure 1).
Eligible volunteers were informed about the nature of the study and both verbal and written consent were obtained.Individuals who were smokers, consumed diets with extremes of macronutrient intake (e.g., ketogenic diet), were pregnant or breastfeeding, and/or took medications or had a medical history that could impact metabolic rate or make participation unsafe were excluded.All experimental procedures were approved by the Institutional Review Board at the Virginia Polytechnic Institute and State University (Virginia Tech) (IRB #21-567).

| Procedures
Height was determined to the nearest 0.1 cm and body mass (BM) to the nearest 0.1 kg using a stadiometer and stand-on scale (Scale-Tronix 5002), respectively. Fat mass (FM) and fat-free mass (FFM) were determined by dual-energy X-ray absorptiometry (DXA scan, Lunar Digital Prodigy Advance, software enCORE version 15, GE Healthcare; Madison, WI, USA), and were expressed in total kilograms and as a percentage of BM (%). Body mass index (BMI) was calculated as BM (in kilograms) divided by the square of height (in meters).
RMR was measured using indirect calorimetry (Parvo Medics, TrueOne 2400 Metabolic Measurement System, OUSW 4.3.4; Murray, Utah, USA) with a ventilated canopy following a 12-h fast as previously described (Van Pelt et al., 1997) and at least 12 h after the last exercise training session in runners, so as not to interfere with their habitual physiological state. RMR was measured in the supine position in a dimly lit, temperature-controlled room at 22-24°C; participants wore laboratory-provided clothing and were covered with a blanket; the last 30 min of a 45-min period was used for analysis. RMR was measured on two occasions, separated by 14 days during which body mass was stable. As such, we utilized the second measurement for our analysis. We observed excellent test-retest reliability (r = .93; p < .001) for RMR; the within-person coefficient of variation was 2.92%.
We estimated the RMR of our participants using 10 different equations.Five of them relied on BM as an independent variable (FAO/WHO/UNU, 2004 [WHO]; Harris & Benedict, 1919 [H-B]; Mifflin et al., 1990 [Mifflin BM ]; Pontzer et al., 2021 Model 4 [Pontzer BM ]; Schofield, 1985 [Schofield]), and the other five relied on FFM (Cunningham, 1991 [Cunningham]; Johnstone et al., 2006 [Johnstone]; Mifflin et al., 1990 Model 1 [Mifflin FFM ]; Nelson et al., 1992 [Nelson]; Pontzer et al., 2021 Model 5 [Pontzer FFM ]).A description of the models is provided in Table S1.We selected these equations to compare widely used predictive formulas with recent models published by Pontzer et al. (2021).All the predictive equations applied were obtained from individuals of a wide range of adult ages and body sizes, and their sample size was ≥150 individuals with the data obtained via primary collection or meta-analysis.Equations exclusively developed in athletes were not included (i.e., De Lorenzo et al., 1999;Freire et al., 2021;Ten Haaf & Weijs, 2014).
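To illustrate how a BM-based and an FFM-based equation differ in their inputs, the following is a minimal sketch using the widely cited published forms of the Mifflin et al. (1990) body-mass equation and the Cunningham (1991) FFM equation. The exact models applied in this study are those listed in Table S1, which is not reproduced here, so the coefficients below should be read as an assumption with respect to the study; the function names and the example participant are hypothetical.

```python
def mifflin_bm(mass_kg, height_cm, age_yr, sex):
    """Body-mass-based RMR estimate (kcal/d), commonly cited Mifflin et al. (1990) form.

    Assumption: coefficients are the widely published ones, not copied from Table S1.
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    offset = 5.0 if sex == "male" else -161.0
    return 10.0 * mass_kg + 6.25 * height_cm - 5.0 * age_yr + offset


def cunningham_ffm(ffm_kg):
    """FFM-based RMR estimate (kcal/d), commonly cited Cunningham (1991) form."""
    return 370.0 + 21.6 * ffm_kg


# Hypothetical participant (values are illustrative, not study data).
rmr_from_bm = mifflin_bm(mass_kg=70.0, height_cm=175.0, age_yr=35.0, sex="male")
rmr_from_ffm = cunningham_ffm(ffm_kg=55.0)
print(round(rmr_from_bm, 1), round(rmr_from_ffm, 1))
```

The BM-based form needs sex, age, and height in addition to mass, whereas the FFM-based form takes a single body composition input, which is why its accuracy depends on how FFM was measured.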
Physical activity was measured using a triaxial accelerometer (ActiGraph GT3X, Actigraph Corporation, Pensacola, FL).Subjects were asked to wear the accelerometer on an elastic belt around their waist continuously for 14 days and to remove the device only for swimming, showering, bathing, or sleeping.Wear time log sheets were kept by each participant and accelerometer data were screened using standard methods (Chomistek et al., 2017;Troiano et al., 2008;Tudor-Locke et al., 2012).The data collection interval was set at 10-s epochs with a sampling rate of 30 Hz.At least 4 days over a 1-week period with 10 h/d or more wear time were included for analysis.
Mean counts per minute per day (CPM/d) of the three axes (triaxial vector magnitude) on valid monitoring days were used to objectively quantify physical activity levels. Self-reported physical activity levels (km walking and/or running per week) were correlated with objectively measured steps per day (correlation coefficient = .71; p < .001) and counts per minute/day (CPM/d; correlation coefficient = .55; p < .001) obtained from accelerometry (Figure 1).
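The wear-time rule and the vector-magnitude summary described above can be sketched as follows. This is an illustration under stated assumptions, not the processing pipeline used in the study: counts are assumed to be already aggregated per wear minute (the study recorded 10-s epochs), each day is represented as a list of (x, y, z) count tuples for worn minutes only, and the function names are hypothetical.

```python
import math
from statistics import mean

MIN_WEAR_MINUTES = 10 * 60   # at least 10 h/d of wear time
MIN_VALID_DAYS = 4           # at least 4 valid days


def day_cpm(minute_counts):
    """Mean triaxial vector magnitude (counts per minute) for one day."""
    return mean(math.sqrt(x * x + y * y + z * z) for x, y, z in minute_counts)


def mean_cpm_per_day(days):
    """Average CPM over valid days; None if too few days meet the wear-time rule."""
    valid = [d for d in days if len(d) >= MIN_WEAR_MINUTES]
    if len(valid) < MIN_VALID_DAYS:
        return None
    return mean(day_cpm(d) for d in valid)


# Hypothetical data: four valid days and one short day that is excluded.
valid_day = [(3, 4, 0)] * 600      # vector magnitude 5 for every minute
short_day = [(30, 40, 0)] * 100    # only 100 min of wear, ignored
cpm = mean_cpm_per_day([valid_day] * 4 + [short_day])
too_few = mean_cpm_per_day([valid_day] * 3 + [short_day])
```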

| Statistical analysis
T-test analyses were used to compare sample demographics, body weight, and composition by sex. A one-way repeated-measures analysis of variance (ANOVA), with Bonferroni post-hoc tests, was used to compare measured and predicted RMR means.
[Correction added after first online publication on 22 November 2023. The sentence "T-test analyses were used..." has been corrected.] The level of significance was set at p < .05. Agreement between measured and predicted RMR was analyzed by Bland-Altman plots (Bland & Altman, 1986). The association between the magnitude of the RMR and the difference between predicted and measured RMR (heteroscedasticity) was examined by regression analysis, and the slope (β) was reported in the Bland-Altman plots when the relationship was significant (p < .05) (Freire et al., 2021; Ruiz et al., 2011). This analysis was made for the entire sample and each sex separately. Bias was calculated as the mean of the difference between measured and predicted RMR, with standard deviation (SD). Lower (LLOA) and upper limits of agreement (ULOA) were calculated (Formula 1 in Supplementary Material).
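The agreement statistics described above can be sketched in a few lines. Bias is taken as mean(measured - predicted) and the limits of agreement as bias ± 1.96 SD, as stated in the text and in the Figure 2 legend. The regression step is an assumption: the slope is computed here by ordinary least squares of the difference on the mean of the two values (the usual Bland-Altman x-axis), and the significance test for the slope is omitted. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(measured, predicted):
    """Bias, SD of the differences, and 95% limits of agreement."""
    diffs = [m - p for m, p in zip(measured, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {"bias": bias, "sd": sd,
            "lloa": bias - 1.96 * sd, "uloa": bias + 1.96 * sd}


def proportional_bias_slope(measured, predicted):
    """OLS slope of (measured - predicted) on the mean of both values.

    A slope far from zero suggests the error grows with RMR magnitude.
    """
    x = [(m + p) / 2.0 for m, p in zip(measured, predicted)]
    d = [m - p for m, p in zip(measured, predicted)]
    x_bar, d_bar = mean(x), mean(d)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    return sxy / sxx


# Hypothetical RMR values in kcal/d (not study data).
measured = [1500.0, 1600.0, 1700.0, 1800.0]
predicted = [1450.0, 1500.0, 1550.0, 1600.0]
ba = bland_altman(measured, predicted)
slope = proportional_bias_slope(measured, predicted)
```

In this toy example the positive bias indicates underestimation by the equation, and the positive slope indicates that the underestimation grows with RMR.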
Mean absolute percent error (MAPE) and mean difference, as a percentage (%) (Formulas 2 and 3, respectively, in Supplementary Material), were calculated to test the accuracy of predictive equations.A positive error score in these calculations demonstrates an underestimation of the models.A mean difference (%) lower than 10% is usually indicative of adequate accuracy (Pavlidou et al., 2018).The root mean square of error (RMSE) was used to calculate the average difference between predicted and measured RMR values (Formula 4 in Supplementary Material).In addition, the percentage of RMSE (RMSE%) was assessed (Formula 5 in Supplementary Material).An RMSE% value under 10% has been considered acceptable when comparing measured and predicted RMR in previous publications (i.e., Amaro-Gahete et al., 2019;Balci et al., 2021;Freire et al., 2021).
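A minimal sketch of these error metrics follows. The exact formulas are in the Supplementary Material, which is not reproduced here, so the definitions below are assumptions: percent errors are expressed relative to measured RMR, the sign convention is measured minus predicted (so positive values indicate underestimation), and RMSE% is RMSE divided by the mean measured RMR. The function name and data are hypothetical.

```python
import math
from statistics import mean


def error_metrics(measured, predicted):
    """MAPE, signed mean difference (%), RMSE (kcal/d), and RMSE% (assumed definitions)."""
    pct = [(m - p) / m * 100.0 for m, p in zip(measured, predicted)]
    rmse = math.sqrt(mean((p - m) ** 2 for m, p in zip(measured, predicted)))
    return {
        "mape": mean(abs(e) for e in pct),
        "mean_diff_pct": mean(pct),
        "rmse": rmse,
        "rmse_pct": rmse / mean(measured) * 100.0,
    }


# Hypothetical RMR values in kcal/d (not study data).
metrics = error_metrics([1500.0, 1600.0, 1700.0, 1800.0],
                        [1450.0, 1500.0, 1550.0, 1600.0])
```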
To test the accuracy of the predictive equations at an individual level, the percentage of subjects with a predicted RMR within ±10% of measured RMR was also assessed (Frankenfield et al., 2005; Marra et al., 2019; Miller et al., 2013; Xue et al., 2019). Three criteria had to be met to be considered an accurate predictive equation: no statistical difference between measured and predicted RMR (p ≥ .05); mean difference (%) ≤10%; and RMSE% ≤10% (Freire et al., 2021).
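The individual-level accuracy rate and the three-criteria rule can be expressed as a short check. This is a sketch under assumptions: the p-value is passed in from whatever comparison test was run (it is not computed here), the mean difference criterion is applied to its absolute value, and the function names and data are hypothetical.

```python
def accuracy_rate(measured, predicted, tolerance=0.10):
    """Percentage of subjects whose predicted RMR is within ±10% of measured RMR."""
    hits = sum(1 for m, p in zip(measured, predicted)
               if abs(p - m) <= tolerance * m)
    return 100.0 * hits / len(measured)


def meets_three_criteria(p_value, mean_diff_pct, rmse_pct):
    """True only if all three accuracy criteria described in the text hold."""
    return (p_value >= 0.05
            and abs(mean_diff_pct) <= 10.0
            and rmse_pct <= 10.0)


# Hypothetical RMR values in kcal/d (not study data).
rate = accuracy_rate([1500.0, 1600.0, 1700.0, 1800.0],
                     [1450.0, 1500.0, 1550.0, 1600.0])
accurate = meets_three_criteria(p_value=0.20, mean_diff_pct=7.4, rmse_pct=8.0)
rejected = meets_three_criteria(p_value=0.01, mean_diff_pct=7.4, rmse_pct=8.0)
```

Requiring all three conditions is stricter than any single metric: an equation with a small mean difference can still fail if measured and predicted means differ significantly.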
One-way ANOVA analyses were used to test the effect of sex on equation accuracy.Absolute biases of predicted RMR were examined against age, BM, FFM, FM, %FFM, %FM, and CPM/d by multiple regression analysis with forward stepwise selection to detect if participants' characteristics and physical activity were affecting the error magnitude of the estimations.The statistical analyses were carried out with Statgraphics Centurion XIX ® software (Statgraphics Technologies, 2022).

| Sample characteristics
The main characteristics of our sample are described in Table 1. BM, height, and FFM were significantly higher for males, while the percentage of FM (%FM) was lower when compared to females (p < .001). There were no differences in age (p = .82) or BMI (p = .16) between sexes. RMR was higher in males compared with females (p < .001), but there were no significant differences in steps/day (p = .58) or CPM/d (p = .82) between the two groups (Table 1).

| Performance of predictive equations based on BM
The comparison between estimated and measured RMR, positive MAPE, and positive mean difference (%) values indicated that all predictive equations underestimated the RMR in the whole sample (Table 2) and in both males and females when considered separately (Table S2). Based on the accuracy metrics, the WHO equation performed best, followed by H-B (Tables 2 and 3). The highest percentages of individuals with a predicted RMR within ±10% of the measured value were observed with WHO and H-B equations (Table 2 and Table S2). However, all of the equations were considered inaccurate for females (Table 3 and Table S2).
There were large limits of agreement and RMSE values for all of the equations (BM equations in Table 2 and Figure 2). In the combined group of males and females, the extent of error of the predictive equations did not vary notably with the magnitude of the RMR (homoscedasticity; blue dotted line in Figure 2, p > .05). However, when each sex was considered separately, there was marked heteroscedasticity (p < .01; purple [females] and black [males] dotted lines in Figure 2), especially in females (higher slopes [β]).
Sex influenced some indicators of accuracy, so the precision of the equations was generally lower when applied to females: higher bias, MAPE, RMSE, %RMSE, lower accuracy (%), and significantly higher mean difference (%) in every equation but H-B (ANOVA test, Table S2).
Multiple stepwise regressions showed that CPM/d were weakly but positively correlated with bias for the H-B (adjusted R² = .11, p = .01) and Mifflin BM equations (adjusted R² = .07, p = .03) when applied to the whole sample (Table S3). When each sex was considered separately, FFM was the only factor (positive correlation) remaining in the models against bias in every equation applied to females (Table S3). Therefore, all equations' biases were higher for leaner females.

| Performance of predictive equations based on FFM
All the equations underestimated RMR in all our participants (Table 2) and when each sex was considered separately (Table S2).However, only the Pontzer FFM equation met the three accuracy criteria when applied to the whole sample and regardless of sex (Tables 2 and 3; Table S2).The Pontzer FFM equation demonstrated the best general performance, predicting 88% of the individuals' RMR accurately in the entire sample (Accuracy (%), Table 2), 92% of females and 83% of males (Accuracy [%], Table S2).
Equations based on FFM exhibited large limits of agreement and RMSE values (Table 2 and Figure 2). Different degrees of heteroscedasticity (p < .05) were observed in every equation except Nelson when the whole sample was analyzed (blue dotted lines in Figure 2). When each sex was evaluated separately, greater heteroscedasticity (higher β) was observed in males than females (Figure 2).
Sex did not generally influence the metrics of accuracy (ANOVA test, Table S2), except for the Nelson equation, with MAPE and mean difference % significantly lower for males.However, most of the equations showed a higher precision when applied to females (Table S2).No significant multiple regression models were found when the predictive equations' bias was evaluated against age, BM, FFM, FM, %FFM, %FM, and CPM/d (Table S3).

| Comparison of predictive equations using BM with predictive equations using FFM
Bias, MAPE, and mean difference (%) were generally higher for BM than FFM equations (Table 2; Table S2). Most models using BM as the independent variable had a higher percentage of individuals with a predicted RMR within ±10% of the measurement than equations based on FFM (Accuracy [%] in Table 2; Table S2). However, the WHO model could not be considered consistently accurate (females in Table 3 and Table S3). In contrast, the Pontzer FFM equation was accurate under each circumstance (Table 3) and consistently had the highest accuracy rate (%) (Table 2; Table S2).
BM equations were generally less accurate for females, while FFM equations were less accurate for males (Table S2). All equations presented large limits of agreement and RMSE values, with limits of agreement being slightly higher using BM equations (Figure 2 and Table 2). Heteroscedasticity was evident in all the FFM equations when the whole sample was included (slight in every equation but Nelson in Figure 2). Nevertheless, BM equations had generally marked heteroscedasticity by sex, higher than FFM equations, and with steeper slopes for females (Figure 2). Therefore, the magnitude of the RMR had a greater impact on the prediction error of BM equations, which performed worst when predicting individuals with higher RMRs, especially females. BM equations' bias was also more affected by sample attributes such as physical activity (CPM/d) and FFM (Table S3).
FIGURE 2 Bland-Altman plots for measured and predicted resting metabolic rate (RMR). H-B, Harris & Benedict (1919); Mifflin BM, Mifflin et al. (1990), Model #3; Pontzer BM, Pontzer et al. (2021), Model #4; Schofield, Schofield (1985); WHO, FAO/WHO/UNU (2004); Cunningham, Cunningham (1991); Johnstone, Johnstone et al. (2006); Mifflin FFM, Mifflin et al. (1990)
Lastly, the most accurate equation using FFM (Pontzer FFM ) performed notably better when predicting RMR than the most accurate equation using BM (WHO): lower mean RMR difference, bias, mean difference %, and %RMSE, and higher number of individuals with a predicted RMR within ±10% of measured RMR (Table 2; Table S2).

| DISCUSSION
The major finding of the present study was that all of the equations evaluated underestimated RMR in our sample, and the degree of underestimation tended to be greater with increasing RMR and, for some BM equations, with increasing levels of physical activity. Importantly, all the equations had large limits of agreement and %RMSE, reflecting sizeable errors in estimation at the individual level. The equation based on FFM developed by Pontzer et al. (2021) was the only one to meet all three accuracy metrics. In addition, the Pontzer FFM equation demonstrated agreement with measured RMR across a wide spectrum of physical activity, and independently of sex.
The underestimation of the RMR by predictive equations may suggest an effect of physical activity to increase RMR in our sample (Speakman & Selman, 2003).Our participants may have a higher RMR per kilogram of BM and FFM than the primarily sedentary populations (Jagim et al., 2018) used to develop the predictive formulas (i.e., table 4 at Schofield, 1985, or Table 2 at Nelson et al., 1992).That the equation developed by Pontzer et al. (2021) demonstrated agreement with measured RMR across a wide spectrum of physical activity may be a reflection of the more heterogeneous physical activity of the sample studied.
The results of our study indicated that the accuracy of BM equations is more dependent on the level of the RMR, particularly when sex is considered (Figure 2).However, lower levels of heteroscedasticity were detected in FFM models (Figure 2).As such, FFM models may be more robust when applied to subjects with higher than predicted RMR, particularly in females (Table S2).Our observations are consistent with Jagim et al. (2019) and Lindsey et al. (2021) but not Flack et al. (2016), who did not demonstrate clear differences in heteroscedasticity between BM and FFM models.The latter may be important when applying predictive models to specific populations.
In general, the multiple regression models showed that sample parameters (activity levels and body composition) had a greater impact on bias in BM equations than in FFM equations (Table S3). Bias was greater when the BM equations were applied to leaner females. In our sample, these females tended to have higher RMRs (Figure 2). Other authors have also found a positive correlation between bias and FFM (Flack et al., 2016; Javed et al., 2010; Wang et al., 2000; but see Li et al., 2010). In general, a higher FM is associated with a smaller difference between measured and predicted RMR. This suggests that some equations may have been developed using populations with higher values of FM than our sample, as mentioned by Nösslinger et al. (2021). At the same time, this may also reflect the effect of exercise on our participants (leaner population).
There were no sample parameters that significantly predicted the bias of FFM equations (Table S3). However, we would have expected such an effect, given the differences in body size and composition between our sample and the populations included in the predictive models, that is, the inclusion of overweight or obese individuals according to Table 1 in Cunningham (1991), Johnstone et al. (2006), and Mifflin et al. (1990); Table 2 in Nelson et al. (1992); and heavier subjects in Table S1 in Pontzer et al. (2021). Therefore, it seems that, in our study, FFM equations dealt better with the variability of our participants' characteristics.
In general, most of the equations evaluated could be considered accurate for clinical use based on an acceptable margin of error (estimation bias ±250 kcal/d) and a mean difference % of less than 10% (Hasson et al., 2011;Pavlidou et al., 2018) (see exceptions at Table 2; Table S2).However, we proposed the best-fit equations based on three accuracy metrics (Freire et al., 2021).Based on these criteria, Pontzer FFM (Pontzer et al., 2021) was the most accurate of the FFM equations and overall (88% of prediction accuracy and mean absolute bias of 51.7 ± 117.1 kcal/d when applied to the entire sample).Bendavid et al. (2021) reported that the accuracy of most predictive equations does not exceed 70% and many equations are widely used despite the low levels of performance.All equations evaluated demonstrated large limits of agreement and RMSE values (Table 2 and Figure 2).Taken together, these observations reinforce the need to apply caution when applying these prediction equations to individuals in the clinical setting.
The Pontzer FFM equation performed remarkably better than the rest in the present study. This may not be surprising given FFM is a better predictor of RMR than BM. In contrast, others have not observed superior performance of a variety of FFM equations (Balci et al., 2021; De Lorenzo et al., 2001; Flack et al., 2016; Jagim et al., 2018, 2019; Nösslinger et al., 2021). Nevertheless, the notable accuracy of the Pontzer FFM equation may be related to the sample size and better representation of individuals with characteristics similar to those in the present study. Our sample was composed primarily of Caucasians following a Western lifestyle and living in a temperate climate. Given its diversity, the Pontzer FFM population may be more generalizable to other groups exposed to similar variability in environmental adaptations, which has not systematically been considered by historical formulas (Froehle, 2008; Galloway et al., 2000; Leonard et al., 2005; Ocobock, 2016; Snodgrass et al., 2005).
Furthermore, the Pontzer FFM model was developed most recently, and Speakman et al. (2023) have reported that RMR has declined over the last three decades. As such, our sample may be more similar to that studied by Pontzer et al. (2021). In turn, we may anticipate that older equations would overestimate RMR in our participants. However, our findings suggest otherwise. Pontzer FFM may be more accurate for females in our sample because they were better represented than males in their model (Table S1). We also found a strong correlation between FFM and RMR in our sample (Table S4). In addition, the Pontzer FFM formula may be more accurate for females because the FFM-RMR relationship is stronger in females than in males in our study (Table S4). This is supported by other studies (Frings-Meuthen et al., 2021; Jagim et al., 2019; Nielsen et al., 2000; but see Buchholz et al., 2001). Future studies evaluating the effects of sexual dimorphism on metabolism after accounting for body composition differences are needed.

| Practical applications
We have tested the accuracy of several predictive models in a sample of participants varying in physical activity levels. Our analyses indicated that, if body composition is not available, the FAO/WHO/UNU (2004) model accurately predicted RMR across physical activity levels, but less so for females. However, the Pontzer et al. (2021) model using FFM performed accurately for both sexes and independently of physical activity (CPM/d). The precision of the Pontzer FFM model demonstrated less dependency on the magnitude of RMR and sample characteristics (no correlations with age, body mass and composition, and physical activity) than other RMR prediction equations. More investigations are needed to test the accuracy of the Pontzer FFM model in other, more diverse populations. Importantly, the accuracy of prediction equations likely depends, at least in part, on the method used to measure body composition. DXA and isotope dilution, used in the present study and by Pontzer et al. (2021), respectively, are well-established and accurate methods for assessing body composition. Less accurate approaches, for example, bioelectrical impedance analysis (BIA; Achamrah et al., 2018), may negatively impact accuracy.

| Strengths and limitations
There are some strengths of our study that should be highlighted.We demonstrated excellent reproducibility of our RMR measurements and utilized dual-energy X-ray absorptiometry for the assessment of body composition.In addition, a balanced distribution among sexes and physical activity levels allowed us to test differences, and we included new predictive models (Pontzer et al., 2021) in our analysis.The use of three criteria to evaluate the best-fit equations is also a strength and beyond what others have performed.Nevertheless, we are aware that this may limit the comparison of our results with those that consider fewer variables in their predictive models.
There are some limitations to our study which also should be considered. First, our sample size is relatively small and a different outcome might occur with a larger sample. Second, our sample was primarily Caucasian, young, and with normal weight. The accuracy of predictive equations may be compromised when applied to other groups or those with obesity (Fernández-Verdejo & Galgani, 2022). More diverse datasets are needed to account for RMR variability and the validity of predictive equations. Third, we did not control for menstrual cycle phase (Benton et al., 2020; Henry et al., 2003) or circulating hormones (Johnstone et al., 2005) that influence RMR variability (Compher et al., 2006). Finally, our participants were instructed to avoid vigorous physical activity at least 12 h before RMR measurements, as is consistent with others (Fullmer et al., 2015). Other authors have had participants avoid exercise for longer durations (Compher et al., 2006; Melby et al., 1993; Speakman & Selman, 2003). In our case, allowing a longer period (e.g., 24-48 h) without vigorous physical activity may have disrupted the habitual physiological state of endurance athletes. Besides, having participants in their habitual state provides us with more applicable data.

| CONCLUSION
The present study demonstrated that a new equation by Pontzer et al. (2021), with FFM as the main predictor of RMR, yielded significantly better results than classic formulas when applied to a sample varying in physical activity levels. FFM equations generally demonstrated greater independence from the magnitude of RMR, sex, activity levels, and sample characteristics (age, body mass, and body composition) than other models. Due to potential issues of applicability to general populations, more investigations are encouraged to test new models based on other, diverse populations.

FIGURE 1 Sample size (n) distribution by self-reported km walked/run per week and correspondence with counts per minute/day (CPM/d) measured by accelerometry. Data represented as mean (gray diamond) and 95% confidence intervals (Fisher LSD) (blue brackets). ♀ = Females; ♂ = Males.
FIGURE 2 (caption, continued) Mifflin FFM, Mifflin et al. (1990), Model #1; Nelson, Nelson et al. (1992); Pontzer FFM, Pontzer et al. (2021), Model #5. Purple (♀ = females), black (♂ = males), and blue (both sexes) dotted lines represent the relationship between the magnitude of the RMR and the extent of error of the predictive equations by sex (homoscedasticity or heteroscedasticity). Asterisks (*) represent significant heteroscedasticity (p-value < .01); β = slope of the line. Green solid line shows the mean difference between measured and predicted RMR for each equation. Orange dashed lines show the limits of agreement (bias ± 1.96 × standard deviation).