Objective: Accelerometers are promising tools for characterizing physical activity (PA) patterns in free-living persons. To date, validation of energy expenditure (EE) predictions from accelerometers has been restricted to short laboratory or simulated free-living protocols. This study seeks to determine the capabilities of eight previously published regression equations for three commercially available accelerometers to predict summary measures of daily EE.
Methods and Procedures: Study participants were outfitted with ActiGraph, Actical, and RT3 accelerometers, while measurements were simultaneously made during overnight stays in a room calorimeter, which provided minute-by-minute EE measurements, in a diverse subject population (n = 85). Regression equations for each device were used to predict the minute-by-minute metabolic equivalents (METs) along with the daily PA level (PAL).
Results: Two RT3 regressions and one ActiGraph regression were not significantly different from calorimeter measured PAL. When data from the entire visit were divided into four intensity categories—sedentary, light, moderate, and vigorous—significant (P < 0.001) over- and underpredictions were detected in numerous regression equations and intensity categories.
Discussion: Most EE prediction equations showed differences of <2% in the moderate and vigorous intensity categories. These differences, though small in magnitude, may limit the ability of these regressions to accurately characterize whether specific PA goals have been met in the field setting. New regression equations should be developed if more accurate prediction of the daily PAL or higher precision in determining the time spent in specific PA intensity categories is desired.
Physical activity (PA) is widely recognized as an important factor in maintaining healthy body weight. As the prevalence of obesity increases (1), increasing the daily PA level (PAL) among adults has become an important public health priority. Several specific PA guidelines have been issued in an attempt to help individuals develop appropriate exercise habits. Both the Centers for Disease Control and Prevention and the American College of Sports Medicine (ACSM) have issued recommendations that adults perform moderate intensity PA for at least 30 min per day, 5 days a week (2,3). Healthy People 2010 encourages adults to engage in at least three 20-min bouts of vigorous PA each week (4). Some researchers suggest that even these guidelines are insufficient to combat weight gain (5). Regardless of the specific PA goal adopted, it is important for researchers to measure objectively and accurately the actual daily participation in PA to understand PA patterns within the general population as well as to characterize the impact of achieving specific PA goals on overall health on an individual or group basis.
One common method for objective assessment of PA is accelerometry. Accelerometer output can be used to predict gross energy expenditure (EE) (6,7,8) or metabolic equivalents (METs) (9,10,11), which can be computed by normalizing EE by resting EE (REE). To simplify interpretation of accelerometer data, cutoff points that distinguish intensity categories have been developed with descriptive names that correspond to those used in making public health recommendations. Typical MET categories include sedentary (1–1.5 METs), light (1.5–3 METs), moderate (3–6 METs), and intense/vigorous (>6 METs) PA (2).
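The normalization and categorization described above can be sketched in a few lines. This is a minimal illustration, not the procedure used by any of the cited studies: the function names are hypothetical, and the handling of values that fall exactly on a category boundary (1.5, 3, or 6 METs) is an assumption, because the published ranges share their endpoints.

```python
def to_mets(ee, ree):
    """Normalize energy expenditure by resting EE (both in the same units)."""
    if ree <= 0:
        raise ValueError("REE must be positive")
    return ee / ree


def classify_met(met):
    """Map a MET value to an intensity category.

    Assumption: boundary values 1.5 and 3 go to the higher category,
    and vigorous is strictly greater than 6 METs, as written in the text.
    Values below 1 MET are treated as sedentary.
    """
    if met > 6.0:
        return "vigorous"
    if met >= 3.0:
        return "moderate"
    if met >= 1.5:
        return "light"
    return "sedentary"


# Example: a minute with EE of 2.4 kcal/min for a subject whose REE is 1.2 kcal/min
example_met = to_mets(2.4, 1.2)
example_label = classify_met(example_met)
```

With these illustrative inputs the minute falls in the light category, since its EE is twice the resting value.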
A number of different EE prediction equations for both METs and gross EE exist in the literature using minute-by-minute accelerometer output. Some of these equations have been further used to develop cutoff points, which serve to discriminate PA intensities without making specific EE predictions. All of these regressions are specific to a particular accelerometer device, such as the ActiGraph (9,10,11,12), Actical (7), or RT3 (6). Early equations were developed using only ambulatory activities performed at a moderate-to-vigorous intensity (9), whereas more recent approaches have incorporated lower intensity lifestyle activities, such as sweeping, house-cleaning, and gardening (7,10,11). A number of analytic approaches have been explored in an effort to attain robust prediction capabilities, including linear regression (7,9,10,11), bilinear regression (7), and a nonlinear power model (6). An extensive review of experiments designed to develop EE prediction equations and cutoff points has recently been published by Matthews (13).
With numerous accelerometer devices available on the market and multiple regression equations developed for each device, it is often difficult to select the device and regression equation that will be most appropriate for a specific study (14). Recently, a validation of three accelerometers, ActiGraph, Actical, and AMP-331, and 15 prediction equations was performed on data acquired from short, structured protocols using portable indirect calorimetry (15). It showed overestimation of the metabolic cost of walking and sedentary activities and underestimation of the cost of most other activities tested. In addition, a number of regressions for the TriTrac (a predecessor to the RT3 accelerometer) and the ActiGraph were compared using doubly labeled water as the reference criterion (16).
In that study, only two equations for the ActiGraph gave estimates of total daily EE that were comparable to doubly labeled water, the Hendelman et al. (10) and Swartz et al. (11) equations, which were developed using lifestyle activities. In the present study, we compared the predictive performance of three accelerometry-based PA monitors, ActiGraph, Actical, and RT3, and seven EE prediction equations from the literature and one provided by the device manufacturer. We used EE measured using a room calorimeter during day-long stays in a heterogeneous group of healthy adult volunteers as the reference criterion. Understanding the prediction accuracy of each monitor with respect to room calorimeter data provides an important intermediate step between EE estimates based on fully structured laboratory protocols and free-living analyses, because overnight stays in room calorimeters comprise both spontaneous and structured PA intervals.
Methods and Procedures
Eighty-five adults (37 men, 48 women) aged between 18 and 70 years participated in this study. Subjects were weight stable (<2 kg change in the past year), free of both diseases and medications known to alter EE, were nonsmokers, and were free of major orthopedic problems that would limit their ability to perform PA. The characteristics of these subjects are shown in Table 1.
Table 1. Characteristics of study participants
Volunteers were recruited from the Nashville, TN, area using flyers, email distribution lists, and personal contact. Before participation, all subjects signed an informed consent document approved by the Vanderbilt University Committee for the Protection of Human Subjects. Each subject was asked to complete one overnight stay in the room calorimeter while minute-by-minute activity data were acquired with three hip-mounted accelerometers. Subjects engaged in two structured activity intervals. The morning activity period comprised self-paced ambulatory activities (walking and jogging), whereas the afternoon activity period contained sedentary activities, such as deskwork, along with stationary biking. Because hip-worn accelerometers were used, stationary biking was ultimately eliminated from the analysis. Each prescribed activity was performed for 10 min followed by a 10-min rest period to allow the metabolic rate to return to baseline between activities. During times when no specific activity was prescribed (∼15 h), subjects were encouraged to engage in their normal daily PA patterns. REE was computed using the mean of a seated rest period during the morning of the visit. Subjects' body composition was assessed using dual-energy X-ray absorptiometry (GE Lunar Prodigy, Madison, WI) the week before their study visit. Height and weight were measured the morning the subject entered the room calorimeter.
Whole-room indirect calorimetry chamber. EE was computed on a minute-by-minute basis using the Vanderbilt University room calorimeter, which is located within the General Clinical Research Center. This system measures oxygen consumption and carbon dioxide production with high accuracy (monthly alcohol combustion tests ensure >98.5% recovery over 15 h and >95% recovery for EE during PAs over 8 min). The room calorimeter is an airtight environmental room measuring 2.5 × 3.4 × 2.4 m³. The calorimeter is equipped with a toilet and sink, desk, chair, telephone, television, DVD player, stereo system, bed, treadmill, and exercise bike. Technical details of the calorimeter have been previously reported (17).
Accelerometers. During the study visit, subjects were simultaneously outfitted with three commercially available accelerometers, the ActiGraph (formerly MTI/CSA, Fort Walton Beach, FL), the Actical (MiniMitter/Respironics, Bend, OR), and the RT3 (StayHealthy, Monrovia, CA). Both the ActiGraph and Actical are primarily sensitive to motion in one plane (vertical). The RT3 is a triaxial accelerometer, which reports activity in each of three orthogonal directions as well as the vector magnitude of the three measurements.
Each of these monitors reports activity counts, a device-specific arbitrary unit, which represents the frequency and amplitude of acceleration events occurring over a user-defined measurement epoch. Technical specifications for each type of monitor have been previously reported (7,18,19,20). For this study, all monitors were attached to a belt secured at the waist with monitors positioned on the right hip, and all data were acquired in 1-min epochs.
Regression equations. Activity count data for each monitor can be converted to measures of PA intensity (EE or METs) using a variety of both published and proprietary equations. Three regression equations were studied for the ActiGraph and RT3, whereas two equations were explored for the Actical (Table 2). These equations represent a mix of equations developed using only ambulatory data (AG 1, AG 3, RT3 3) and those developed using lifestyle activities (AC 1, AC 2, AG 2, RT3 2), as well as a mix of analysis techniques including linear regression (AC 1, AG 2, AG 3, RT3 1, RT3 3), bilinear regression (AC 2) and a generalized nonlinear model (RT3 2). Data were analyzed using both MET-based categorical predictions and using daily PAL, which was computed as the average of the minute-by-minute MET predictions during the study interval.
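As a minimal sketch of the PAL computation described above, the following applies a linear count-to-MET regression to each 1-min epoch and averages the result. The intercept, slope, and count values are hypothetical and chosen only for illustration; they are not the coefficients of any equation in Table 2.

```python
from statistics import mean


def predict_mets(counts, intercept, slope):
    """Linear count-to-MET regression applied per epoch: METs = intercept + slope * ct."""
    return [intercept + slope * ct for ct in counts]


def daily_pal(mets):
    """PAL as the average of the minute-by-minute MET values."""
    if not mets:
        raise ValueError("at least one epoch is required")
    return mean(mets)


# Hypothetical 1-min activity counts and hypothetical regression coefficients
counts = [0, 0, 500, 2000]
mets = predict_mets(counts, intercept=1.0, slope=0.001)
pal = daily_pal(mets)
```

An equation that predicts gross EE rather than METs would need an additional normalization by REE before averaging.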
Table 2. Selected regression equations for the AG, AC, and RT3 accelerometers explored in this work (ct = activity count)
For two equations, the Chen equation for the RT3 (RT3 2) and the Hendelman equation for the RT3 (RT3 3), adaptations were made to the originally published form to make them appropriate for our study data. These equations were originally developed for the TriTrac-R3D accelerometer, and thus were modified by a correction factor to account for the scaling difference in counts between the two devices (18).
Both activity monitor types and specific regression equations were compared as to their ability to accurately predict daily PAL and the time spent in four PA intensity ranges, specified in METs. As the criterion standard, the calorimetry-measured MET values were calculated on a minute-by-minute basis as the ratio between absolute EE and REE, which was the averaged EE from a 30-min period of seated rest from the first morning of the study visit. To test the null hypothesis that there is no difference in PAL and the percent of time spent in each intensity category between each regression equation and the calorimeter data, ANOVA was performed. Analyses were performed using STATA 9.1 (StataCorp, College Station, TX) and R (r-project.org).
PAL over the entire measurement period (21.7 ± 0.41 h) was computed for each subject using the room calorimeter EE as well as predicted EE using each activity monitor and regression equation (Figure 1). The mean of the measured PAL values was 1.40 ± 0.10, indicating that on average, subjects had a sedentary day (PAL < 1.5). PAL predictions for AC 1, AC 2, AG 1, AG 2, and RT3 1 were significantly different (P < 0.001) from the measurement. Differences on average were small (PAL ≥ 1.30) for all equations except the Hendelman AG regression (AG 2), which showed a large overprediction relative to the calorimeter. This overprediction is attributable to the large y-intercept in this regression equation (Table 2).
Data from each subject were also analyzed to determine the percent of the measurement period associated with each of four PA intensity categories, or MET ranges: sedentary (1–1.5 METs), light (1.5–3 METs), moderate (3–6 METs), and vigorous (>6 METs). On average 80% of the study visit was spent in the 1–1.5 MET category, 16.6% between 1.5 and 3 METs, 2.0% between 3 and 6 METs, and 1.46% in the >6 MET range (Table 3). The time spent in sedentary, light, and vigorous PA was best represented by RT3 2 (no statistically significant difference). With the exception of AG 2, all other equations underestimated the time spent in sedentary PA with a corresponding overprediction of the time spent in light PA. AG 2 did not predict any data as belonging to the sedentary category and underestimated the total time spent between 1 and 3 METs. AC 1 and AG 3 best represented the time spent in moderate PA. The differences between each regression and the calorimeter measurements for each intensity category are shown in Figure 2.
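The percent-of-visit summary described above can be sketched as follows. This is an illustrative outline rather than the analysis code used in the study: the function names and the short MET series are hypothetical, and the assignment of boundary values to the higher category (except 6 METs, which stays moderate) is an assumption.

```python
from collections import Counter

LABELS = ("sedentary", "light", "moderate", "vigorous")


def label(met):
    """Intensity category for one minute (boundary handling is an assumption)."""
    if met > 6.0:
        return "vigorous"
    if met >= 3.0:
        return "moderate"
    if met >= 1.5:
        return "light"
    return "sedentary"


def percent_time(mets):
    """Percent of epochs falling in each intensity category."""
    if not mets:
        raise ValueError("at least one epoch is required")
    tally = Counter(label(m) for m in mets)
    return {name: 100.0 * tally[name] / len(mets) for name in LABELS}


def category_differences(predicted, measured):
    """Predicted minus measured percent of time, per category."""
    p, m = percent_time(predicted), percent_time(measured)
    return {name: p[name] - m[name] for name in LABELS}


# Hypothetical minute-by-minute METs: calorimeter (measured) vs. one regression (predicted)
measured = [1.1, 1.2, 1.3, 1.4, 1.6, 2.0, 3.5, 7.0, 1.0, 1.2]
predicted = [1.6, 1.2, 1.3, 1.7, 1.6, 2.0, 3.5, 7.0, 1.0, 1.2]
diff = category_differences(predicted, measured)
```

In this toy series the regression moves two sedentary minutes into the light category, so the sedentary difference is negative and the light difference is positive by the same amount, which mirrors the pattern reported for most equations.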
Table 3. PAL and percent of the study visit spent in sedentary, light, moderate, and vigorous PA for the calorimeter (IC) and for each activity monitor and regression
Accelerometry-based portable PA monitors are a feasible and objective means of detecting PA patterns. Many studies have developed and validated models with various accelerometers to predict activity EE; however, to our knowledge, the scopes of such studies were largely limited to short protocols consisting of structured intermittent bouts of PA. These validation protocols, while in many cases similar to protocols employed in the development of the regression equations we tested, may not mimic free living because they include a limited number of PA types and assume that all are equally likely to be present in free living. This may lead to larger prediction errors when using these equations for longer and free-living studies than are observed in the laboratory. In this study using a whole-room indirect calorimeter, we validated the ability of the ActiGraph, Actical, and RT3 activity monitors to accurately report summary statistics relating to time spent in specific PA intensity categories in a heterogeneous group of healthy men and women. Previously published regression equations for each device were explored to discover their relative strengths and weaknesses. The long study duration (∼22 continuous hours) presents a bridge between short laboratory PA protocols, where all exercise intervals are explicitly specified, and free-living studies by allowing subjects to engage in both prescribed and spontaneous bouts of PA while still providing minute-by-minute EE measurements from the room calorimeter. Analyses were designed to highlight features that would be of interest to researchers examining long durations (weeks) of free-living data or data collected from a large number of subjects, where minute-by-minute prediction accuracy is less important than reliable summary measures of each day.
PAL is a measure of the mean EE relative to REE. It is an attractive daily PA outcome because it rises proportionally to the number and intensity of active minutes in each day while being comparable between subjects, because data from each subject are normalized by REE. Mathematically, accurate predictions of PAL require that intervals in which activity counts are close to zero be assigned an EE close to or equivalent to the REE. Thus, the Hendelman (AG 2) equation is a poor choice because of its high y-intercept (15). The Swartz ActiGraph regression (11) also has a large y-intercept and was not considered in this study because of its performance similarities to AG 2. The regressions that best estimated PAL contain the most physiological intercepts. In the cases of AG 3 and RT3 3, which predict EE in METs, the intercepts are slightly greater than one, whereas in RT3 2 activity EE is forced to zero when activity counts are zero. Using other regressions, PAL was, on average, underpredicted, which highlights potential limitations in the regression forms and also reflects that there are some increases in EE that were measured by the calorimeter but do not have an associated acceleration response to be detected by the accelerometers (thermic effect of food, limb movements, and isometric muscle contractions). The higher predicted PAL that was observed for most of the RT3 regressions could be due to measurement sensitivity of each device, represented by the lower proportion of measured zeros by the RT3 (0.50) relative to the Actical (0.59) and ActiGraph (0.61), or could be due to characteristics of these regressions, such as an overpredicted baseline value, higher slope, or a nonlinear model form. In the case of the proprietary RT3 regression, it is difficult to isolate the source of any potential benefits or artifacts because the form of the regression is not publicly available.
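The role of the intercept can be made explicit with a short derivation. It assumes a linear regression that predicts METs directly from counts; the symbols a (intercept), b (slope), and N (number of 1-min epochs) are illustrative and are not taken from Table 2.

```latex
% Assumption: linear count-to-MET regression with intercept a and slope b
\mathrm{MET}_t = a + b\,\mathrm{ct}_t
\quad\Longrightarrow\quad
\mathrm{PAL} = \frac{1}{N}\sum_{t=1}^{N}\mathrm{MET}_t = a + b\,\overline{\mathrm{ct}},
\qquad
\overline{\mathrm{ct}} = \frac{1}{N}\sum_{t=1}^{N}\mathrm{ct}_t
```

Every zero-count minute contributes exactly a to the average. Because roughly half or more of the recorded minutes had zero counts (0.50 to 0.61, depending on the device), an intercept well above 1 MET raises the predicted PAL almost one for one, whatever the slope.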
Time spent in MET categories is a summary metric which characterizes the intensity distribution of daily PA, and is a useful tool for assessing whether a daily PA goal has been met in the field. Although differences between predicted and measured intensity distributions were generally small in the moderate and vigorous intensity categories (<2%), discrimination between sedentary and light PA had a much higher error rate (generally around 10%). Because, on average, the total difference between the models and the calorimeter for the combined category of 1–3 METs is small (<2%), we may be able to infer that the form of the regressions we tested may not be appropriate for low intensity PA, as adjustments in the slope alone would change the amount of total time classified in this intensity region. Recently, an approach has been presented to attempt to mitigate these problems by using a different regression form for these activities (21); however, this equation was not tested here because it requires that data be acquired in 1-s epochs. As in previous work (15,22), our largest errors were observed using the Hendelman (AG 2) regression. However, because it is one of the only regression equations developed primarily using lifestyle activities, which are the most prevalent activities in our protocol, we felt inclusion of this equation was important to show the way errors observed in short protocols, those with experimental duration of 2–3 h, propagate when data are considered over the course of a day.
In addition to the regressions presented for all of our analyses, we also computed the time spent in sedentary (1–1.5 METs), light (1.5–3 METs), and moderate/vigorous PA (>3 METs) using Matthews' ActiGraph moderate/vigorous cutoff point (760 counts/min) (13). This cutoff point was developed using combined data from several subjects. We coupled this cutoff point with a sedentary/light cutoff point of 100 counts/min, which has been previously suggested for adolescents (23). Using these cutoff points there was only a small difference between the activity monitor predicted time spent in sedentary PA and the calorimeter (2.9%), although this difference was significant (P < 0.001). This difference was markedly smaller than differences observed using the other ActiGraph cutoff points and regressions tested. Although the time spent in light PA estimated using Matthews' cutoff point showed the best agreement with the calorimeter of any ActiGraph equation tested (mean time spent in light PA = 11.93%), there was still a significant underprediction in this category and a corresponding overprediction of the time spent in moderate/vigorous activity.
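A cutoff-point classification of this kind works directly on counts, without an intermediate EE prediction, and can be sketched as below. The two thresholds (100 and 760 counts/min) come from the text; the function names, the example counts, and the assignment of a count exactly equal to a cutoff to the higher category are assumptions.

```python
from bisect import bisect_right

# Cutoff points in counts/min: sedentary/light and light/(moderate/vigorous)
CUTOFFS = (100, 760)
LABELS = ("sedentary", "light", "moderate/vigorous")


def classify_counts(ct):
    """Assign a 1-min epoch to a category from its activity count.

    Assumption: a count equal to a cutoff belongs to the higher category.
    """
    return LABELS[bisect_right(CUTOFFS, ct)]


def percent_by_cutoff(counts):
    """Percent of epochs in each category for a series of 1-min counts."""
    if not counts:
        raise ValueError("at least one epoch is required")
    totals = dict.fromkeys(LABELS, 0)
    for ct in counts:
        totals[classify_counts(ct)] += 1
    return {name: 100.0 * n / len(counts) for name, n in totals.items()}


# Hypothetical counts for eight 1-min epochs
summary = percent_by_cutoff([0, 0, 0, 50, 99, 100, 500, 1200])
```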
When considering these results, it is important to remember that even small percentage differences between predicted and measured time in each intensity category can cause problems in assessing subjects' adherence to public health recommendations, which typically require between 20 and 45 min of moderate-to-vigorous PA. If only the discrimination between light (1.5–3 MET) PA and all others is considered, we can determine the potential of each regression to correctly characterize whether such a goal has been achieved. Because 1% of our average study visit corresponds to ∼13 min, even regressions that demonstrated this seemingly high mean agreement with the measured intensity distribution often had wide ranges of agreement. These errors would likely render reliable determination of the time an individual spent engaged in moderate-to-vigorous PA difficult, unless the subject exceeded the specified amount of PA by 20–30 min. This would suggest that current accelerometer regressions should be used only for assessing compliance within a population and not on an individual basis. This could be because the forms of the current regressions are too simple to account for the inter-individual variability in PA performance, suggesting that either more flexible modeling techniques or individual calibrations should be considered.
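The conversion from a percentage of the visit to minutes follows from the mean measurement period of 21.7 h reported above; the 2% line is an added illustration of the same arithmetic.

```latex
0.01 \times 21.7\ \text{h} \times 60\ \tfrac{\text{min}}{\text{h}} = 13.02\ \text{min} \approx 13\ \text{min},
\qquad
0.02 \times 21.7\ \text{h} \times 60\ \tfrac{\text{min}}{\text{h}} = 26.04\ \text{min} \approx 26\ \text{min}
```

A 2% difference in an intensity category is therefore close in size to a 30-min daily activity target.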
There are some limitations in this study. First, we did not evaluate all predictive equations available for all the monitors we tested. We restricted our search to commonly used regressions developed using 1-min epoch data. Further, whenever possible we used equations that are built into activity monitor software, thereby attempting to isolate the equations that would be most accessible to researchers in the field. A new, nonlinear regression for the ActiGraph has been recently published (21); however, because its data collection epoch was 1 s, we were unable to validate its performance using this data set. Also, although we frequently referred to our predictions in terms of METs, they are more truly PA ratios because each subject's EE was normalized by a measured REE (24). This difference could explain some discrepancies with regression equations developed using a constant 3.5 ml O2/kg/min as the normalization factor. However, there is some recent evidence that the constant normalization factor is not valid for all subjects (25) and PA ratio may be a more meaningful summary metric. We explored the impact of the value used for normalization by analyzing all of our data using an REE computed with the Harris-Benedict equation. Resulting statistical trends for PAL and percent of time spent in each intensity category were unchanged.
It should also be noted that due to our relatively large sample size, both with respect to number of subjects and duration of data collection, many statistically significant differences were detected that may result from absolute differences that are too small to be clinically relevant. Thus, each researcher must examine the magnitude of the difference between predicted and measured values to determine whether observed differences are meaningful in the context of their work. Underprediction in PAL may be important, even if absolute differences are small, because a threshold value for an active day may not be predicted even if it is achieved using these approaches. For intensity categorizations, small percentage differences can correspond to enough minutes of erroneous prediction as to restrict their ability to detect whether an exercise goal has been met, or worse, can predict that an exercise goal has been met when it has not.
In this study, we compared three commercially available accelerometry-based activity monitors and eight EE prediction equations with measured values using a whole-room indirect calorimeter. Mean PAL was underpredicted by four regressions (AC 1, AC 2, AG 1, and RT3 1), overpredicted by one (AG 2), and was not different from the criterion measure in three cases (AG 3, RT3 2, and RT3 3). Despite many performance similarities across monitor types and regressions, specific strengths and weaknesses were found for each, suggesting that no one equation or monitor is superior in all circumstances. For example, the RT3 regressions produced the PAL values most comparable to those measured, whereas the Actical single-regression model (AC 1) was generally good at estimating the time spent in moderate and vigorous PA. Consequently, researchers should consider their outcome goal in determining not only the instrument they use to collect data but also their postcollection processing method. Because data can be safely analyzed using multiple regression approaches, researchers who are interested in more than one type of outcome may determine that more than one regression approach should be employed within a study in order to produce the highest accuracy results for each measurement variable of interest.
This work was supported by grants from the National Institutes of Health: DK069465, HL082988, DK02973, and RR00095. We thank Jessica White, Ashley Béziat, and Sarah Johnson for their contributions to the data collection, Dr Brychta for assistance in preparing the manuscript, and Dr Buchowski for his continued support of this project.