INTRODUCTION
The rheumatic disease process frequently leads to physical disability (1). Researchers aiming for a better understanding of rheumatoid arthritis (RA) outcome must first quantify it in a meaningful and reliable way. However, physical disability is a hypothetical construct, i.e., it was put together by scientists to explain the decline in the ability to perform physical activities that can occur in RA and other diseases (2). This implies that physical disability cannot be directly observed or measured, and as such, it can be considered a latent variable (3). Available measurement tools to assess physical disability in RA indirectly tap into the underlying construct (3).
To measure disability in RA, researchers have a variety of instruments and scales from which to choose (4). Some of these are considered arthritis-specific because they center on outcomes more immediately relevant to arthritis (5–7). Generic scales, on the other hand, measure more global outcomes and are suitable for studying a diversity of diseases (8). Each has its own set of advantages (9–12). Some empirical studies, however, have not found major differences in performance between the 2 types of scales (13, 14). Because the choice of one type over the other is not clear-cut, some authorities reasonably advocate including both an arthritis-specific and a generic outcome measure in RA trials (9, 15). This recommendation has the added benefit that 2 or more measurement tools will provide a more reliable representation of the underlying construct (16).
However, not much attention has been given to how to report the results of studies that include 2 or more outcome measures of the same construct. The option of describing results on both scales separately in the same, or different, reports has certain disadvantages, including the need to conduct separate, parallel analyses; a greater potential for type I errors due to multiple comparisons; the added space needed to show results fully; enticement to duplicate publication; and problems of interpretation if results on the different scales diverge. When the last of these occurs, investigators may be lured into simply omitting the results that do not fit their hypotheses. These potential problems, theoretical or real, can be averted by a data-reduction process aimed at estimating the underlying latent variable, conserving or enhancing information provided by the various scales.
We have confronted some of the above dilemmas during an ongoing study of the disablement process in RA. We selected 2 self-report scales, one generic and the other disease-specific, and an observer-derived classification system to assess the extent of physical disability in RA. In this article, we describe the data-reduction process we utilized to derive a parsimonious, single variable representing the construct of physical disability. We also show evidence of its equivalence, or superiority, to the 3 primary scales.
RESULTS
As expected for a group of people with established RA visiting a rheumatologist, most were women, median disease duration was 8 years, and rheumatoid factor was present in the majority (Table 1). The median number of 8 deformed joints indicates a substantial amount of joint damage (22). In accord with this finding, only 21% of the patients were working full or part time, and 27% stated they were unable to work. Of the 756 patients on whom we had follow-up information up to 6 years later, 71 were known to have died (9%).
Table 1. Clinical characteristics of the 776 RA patients studied*

Characteristic                                      No. with data available   Distribution
Age, median (range), years                          776                       57 (19–90)
Male, no. (%)                                       776                       229 (30)
Ethnic group, no. (%)                               776
  White                                                                       272 (35)
  Black                                                                       53 (7)
  Asian                                                                       14 (2)
  Hispanic                                                                    431 (56)
  Other                                                                       6 (1)
Education, median (range), years                    772                       12 (0–17)
Currently working, no. (%)                          776                       166 (21)
Disabled for work, no. (%)                          776                       213 (27)
Time from disease onset, median (range), years      776                       8 (0–52)
Tender joint count, no. (%)                         776                       15 (13)
Swollen joint count, no. (%)                        776                       7 (7)
Deformed joint count, no. (%)                       776                       10 (11)
Nodules, no. (%)                                    776                       233 (30)
Rheumatoid factor positive, no. (%)                 770                       682 (89)
Walking velocity, mean ± SD, meters/minute          775                       59 ± 25
Grip strength, mean ± SD, lbs                       776                       14 ± 10
Button test, mean ± SD, buttons/minute              769                       7.1 ± 3.8
MHAQ, mean ± SD                                     776                       1.89 ± 0.70
SF36, mean ± SD                                     776                       35.6 ± 27.87
Steinbrocker functional class, no. (%)              776
  I                                                                           163 (21)
  II                                                                          383 (49)
  III                                                                         190 (24)
  IV                                                                          40 (5)
Latent Disability Scale, mean ± SD, lbs             776                       56 ± 23
Deaths as of March 2002, no. (%)                    756                       71 (9)

Figure 1 is a diagram of the factor analysis we used to derive the physical disability latent variable. The 3 primary variables, MHAQ, SF36PF, and Steinbrocker class, loaded strongly on a single factor, with loadings ≥0.8. This factor explained ≥75% of the primary variables' combined variance. Uniqueness values were <0.3 for each of the primary variables, indicating that these share more than two-thirds of their combined variance. We extracted the single factor without rotation, using linear regression scoring. Figure 2 shows probability distributions for the 3 primary scales and the latent variable.
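The extraction described above can be illustrated with a minimal sketch. It assumes a principal component factor solution computed from the correlation matrix, with a single unrotated factor and regression-method scores. The function name `principal_component_factor`, the simulated `scale_a`, `scale_b`, and `scale_c` columns, and the noise level are hypothetical stand-ins, not the study data or the software actually used.

```python
import numpy as np

def principal_component_factor(X):
    """One unrotated principal-component factor from the columns of X."""
    X = np.asarray(X, dtype=float)
    # Standardize each scale so the analysis runs on the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    # eigh returns eigenvalues in ascending order; keep the largest.
    eigvals, eigvecs = np.linalg.eigh(R)
    lam, v = eigvals[-1], eigvecs[:, -1]
    loadings = v * np.sqrt(lam)
    # The sign of a factor is arbitrary; orient it by the first column
    # (a hypothetical convention chosen for this sketch).
    if loadings[0] < 0:
        loadings = -loadings
    uniqueness = 1.0 - loadings ** 2      # variance not shared with the factor
    explained = lam / R.shape[0]          # proportion of total variance
    weights = np.linalg.solve(R, loadings)  # regression scoring coefficients
    scores = Z @ weights
    return loadings, uniqueness, explained, scores

# Simulated data only: three noisy indicators of one underlying trait.
rng = np.random.default_rng(0)
n = 776
true_disability = rng.normal(size=n)
scale_a = true_disability + 0.5 * rng.normal(size=n)   # rises with disability
scale_b = -true_disability + 0.5 * rng.normal(size=n)  # falls with disability
scale_c = true_disability + 0.5 * rng.normal(size=n)   # rises with disability
X = np.column_stack([scale_a, scale_b, scale_c])

loadings, uniqueness, explained, scores = principal_component_factor(X)
print("loadings:", np.round(loadings, 2))
print("uniqueness:", np.round(uniqueness, 2))
print("variance explained:", round(float(explained), 2))
```

With a single principal component factor, each uniqueness is simply 1 minus the squared loading, so loadings of 0.8 or more imply uniqueness values of 0.36 or less. The regression-method scores produced this way have unit variance and would need rescaling to match any reporting range.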
The Pearson correlation coefficients between the extracted latent variable and the primary variables, as expected, were also strong, with r values ≥0.8. Figure 3 shows scatterplots of the bivariate distribution of these variables. The correlations between the latent variable and the criterion variables (pain, joint tenderness, swelling or deformity, grip strength, walking velocity, and the timed button test) are shown in Table 2, contrasted with the correlation coefficients between the primary scales and the same criterion standards. The latent variable had a significantly stronger correlation with most of the criterion standards than did the primary variables MHAQ, SF36PF, and Steinbrocker class. Notable exceptions were the correlations with the pain and articular examination variables, for which there was no significant difference between the MHAQ and the latent variable. Also of interest, the number of deformed joints correlated more strongly with Steinbrocker class than with any of the other physical disability scales.
Table 2. Correlation between physical disability scales and variables measured as criterion standards*

            MHAQ     SF36PF   Steinbrocker class   Latent variable
Pain        0.59     −0.53†   0.41†                −0.59
Tender      0.49     −0.43    0.33†                −0.47
Swollen     0.24     −0.22    0.22                 −0.24
Deformity   0.20†    −0.25†   0.52†                −0.35
Grip        −0.54†   0.52†    0.48†                0.59
Velocity    −0.61†   0.65†    0.67†                0.72
Button      −0.54†   0.55†    0.60                 0.64

Figure 4 shows the relationship between the latent variable and selected comparison criteria. These graphs show the association between higher values in the latent variable and graded decreases in the number of deformed joints, the proportion of disabled patients, and the proportion of those who died within 6 years. Conversely, performance-based functional measures (grip strength, timed button test, and walking velocity) displayed a proportional rise with increasing values on the latent scale, as did the probability of working full or part time.
Table 3 shows the BICs of models that contained age, sex, and each of the 4 disability scales (the MHAQ, SF36PF, Steinbrocker class, and the latent variable) as independent variables for each of the criterion standards. For most of the criterion standards, the BIC was smaller, indicating better fit, in the models that included the latent variable (Table 3). Notable exceptions, again, included the Steinbrocker class, whose model fit the deformed joint count better than did the model for any of the other physical disability scales. Likewise, there was positive evidence that the SF36PF fit better in a model for disabled work status than did any of the other physical disability scales.
Table 3. Bayesian information criterion of multivariate models, according to physical disability scale used as independent variable*

                        Physical disability scale included as independent variable in multivariate model†
Dependent variable      MHAQ       SF36PF     Steinbrocker class   Latent variable‡
Currently working       −4,429§    −4,447¶    −4,432§              −4,451
Currently disabled      −4,288§    −4,308     −4,266§              −4,303
Death within 6 years    −4,782#    −4,780§    −4,775§              −4,791
Pain                    −1,588§    −1,531§    −1,414§              −1,602
Tender joint count      809#       855§       933§                 818
Swollen joint count     −8¶        −5¶        7§                   −14
Deformed joint count    662§       661§       519                  607
Grip strength           248§       261§       306§                 170
Walking velocity        1,654§     1,609§     1,660§               1,466
Timed button test       −7,501§    −7,465§    −7,478§              −7,597

DISCUSSION
One desirable characteristic of research data is parsimony, or simplicity of explanation (31). Under this principle, one variable is preferable to 2 or more, providing that the single variable is as informative as the 2 or more. We have shown evidence that a single latent variable derived from principal component factor analysis of 3 scales, the MHAQ, the SF36PF, and the Steinbrocker functional class, has equal or superior performance to the primary scales, as manifested by an equal or stronger degree of association with the criterion standards we selected. We used the disablement process as a theoretical framework to inform our selection of criterion standards (18, 32, 33), aiming to test the underlying physical disability construct from as many perspectives as possible. Thus, our comparison criteria included key RA impairments, such as the amount of pain and the number of tender, swollen, and deformed joints (33). We also used measures of functional limitation, occupational status, and death within 6 years as criteria.
The correlation between the joint impairments and the latent variable was nearly always stronger than that between the same impairments and the primary disability scales (Table 2). This likely is due to the superior reliability of the latent variable, which is a composite of the 3 primary disability scales. This approach has been referred to as incomplete principal component regression because the variable of interest is provided by the first principal component in a factor analysis (34). The composite measure's stronger correlation with most criterion standards conforms to a fundamental theorem of measurement theory, according to which the correlation between 2 variables, x and y, is limited by the square root of the product of each variable's reliability (16). However, there were 2 comparisons that did not follow this rule: the MHAQ correlated as strongly as the latent variable with the impairments, and the Steinbrocker class correlated more strongly with the number of deformities than did any of the other disability scales, including the latent variable. The reason for this may be that examiners incorporated findings from the joint exam into their judgment of the Steinbrocker class. In contrast, the MHAQ and the SF36 are self-reported scales that patients answer according to their own perceived condition.
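The measurement-theory limit invoked above can be written out explicitly. The notation below is an assumption introduced for illustration: r_xy is the observed correlation, r_xx and r_yy are the reliabilities of the 2 measures, and ρ is the correlation between their error-free true scores.

```latex
% Classical attenuation relation: the observed correlation equals the
% true-score correlation shrunk by the square root of the reliabilities.
\[
  r_{xy} \;=\; \rho_{T_x T_y}\,\sqrt{r_{xx}\, r_{yy}}
  \qquad\Longrightarrow\qquad
  \lvert r_{xy} \rvert \;\le\; \sqrt{r_{xx}\, r_{yy}}
\]
% Illustrative numbers only (not taken from the study):
% r_xx = 0.80 and r_yy = 0.90 give a ceiling of sqrt(0.72), about 0.85.
```

Under this relation, raising the reliability of the disability measure, for instance by combining several scales into a composite, raises the ceiling on its observable correlation with any criterion standard.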
We also used 3 performance-based measures of functional limitation: grip strength, walking velocity, and timed button test. Within the disablement process framework, these measurements are closer to the physical disability construct than are the joint impairments (18, 32, 33) and, consequently, their degree of correlation with the disability scales was stronger. Here, even more so than with the impairments, the latent variable's association with the performance-based measures was stronger than that of any of the 3 primary physical disability scales considered individually.
Work loss is one of the main adverse consequences of RA (35). We found that work status was strongly associated with the 4 physical disability scales, with a tendency for the association to be stronger for the latent variable. Death displayed a similar pattern of association. One of the main uses of these comparison standards, occupational and vital status, is as anchors that researchers or clinicians can use to interpret the values along the latent variable scale. As shown in Figure 4, lower values of the latent variable are strongly associated with adverse outcomes.
The physical disability scales we used in the present analyses, including the latent variable we developed, often find use in multivariate models, either as outcomes or predictors. We thus compared the fit of models that included the different physical disability scales as independent variables and each of the different criterion variables as outcomes. Because each of the criteria we used can be heavily influenced by age and sex, we included these 2 variables as covariates in all of the multivariate models. We chose the BIC as a comparative measure because it is a tool used often for model selection (30, 36). We expected that the models that included the new latent variable would have smaller BICs, indicating better fit. Indeed this was the case with nearly all of the criterion variables.
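The model comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis that produced Table 3: it uses a Gaussian linear model, a BIC computed up to an additive constant (which cancels when models share the same outcome and sample), and simulated data. The helper names `ols_bic` and `evidence_grade` are hypothetical, and the grading cut-offs follow the commonly cited Raftery convention, which may differ from the thresholds applied in the study.

```python
import numpy as np

def ols_bic(y, X):
    """BIC (up to an additive constant) of a least-squares linear model."""
    y = np.asarray(y, dtype=float)
    # Prepend an intercept column to the covariates.
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    n, k = X.shape
    return n * np.log(rss / n) + k * np.log(n)

def evidence_grade(bic_difference):
    """Verbal grade for an absolute BIC difference (assumed cut-offs)."""
    d = abs(bic_difference)
    if d < 2:
        return "weak"
    if d < 6:
        return "positive"
    if d <= 10:
        return "strong"
    return "very strong"

# Simulated data only: one true trait, three noisy scales, one outcome.
rng = np.random.default_rng(1)
n = 776
age = rng.uniform(19, 90, size=n)
sex = rng.integers(0, 2, size=n)
trait = rng.normal(size=n)
scales = trait[:, None] + 0.7 * rng.normal(size=(n, 3))
single = scales[:, 0]              # one primary scale used alone
composite = scales.mean(axis=1)    # simple stand-in for a latent score
outcome = 2.0 * trait + 0.02 * age + 0.3 * sex + rng.normal(size=n)

# Same outcome and covariates; only the disability measure changes.
bic_single = ols_bic(outcome, np.column_stack([age, sex, single]))
bic_composite = ols_bic(outcome, np.column_stack([age, sex, composite]))
print("BIC, single scale:", round(float(bic_single), 1))
print("BIC, composite:   ", round(float(bic_composite), 1))
print("evidence for composite:", evidence_grade(bic_single - bic_composite))
```

Because every model holds the outcome, the sample, and the number of parameters fixed, the BIC differences here reflect only how well each disability measure accounts for the outcome. For binary outcomes such as work status or death, a logistic likelihood would replace the residual sum of squares.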
The latent disability variable has distributional advantages over the 3 primary scales. Both the MHAQ and the SF36PF display skewed distributions (Figure 2), the former displaying a ceiling effect, the latter, a floor effect (37). The latent scale lacks skewness in either direction, more closely approximating normality than any of the primary scales. Moreover, the latent variable displays an interval or near-interval distribution, as suggested by the monotonic rise in criterion variables as the scale increases (Figure 4).
The latent variable has theoretical advantages as well: Physical disability is a hypothetical construct and claims that any one disability measurement scale is superior to others are debatable. Using more than one measurement tool may be a more accurate way to get at the underlying construct because it enables the unmeasured construct to be assessed from a variety of angles. For this same reason, the idea of using both a scale intended specifically for arthritis and one intended for unselected populations (9, 15) is quite attractive, because the arthritis-specific scale, the MHAQ in our study, will capture the arthritis-relevant outcomes whereas the generic scale, here provided by the SF36PF, will capture an overall nonspecific disease impact.
We acknowledge some limitations of our analysis. Factor analysis assumes that data are distributed on interval, multivariate normal scales, an assumption that may not be stringently met by the 3 disability scales we entered into the factor analysis. However, this assumption is a strict requirement only if statistical inference is used to determine the number of factors and can be relaxed when factor analysis is used descriptively (26, 38). The least squares factor extraction method we used is also robust to deviations from normality (39). The MHAQ and SF36PF scales we used were developed using sound psychometric theory to produce results on interval or near-interval scales, and they have each been used as such in numerous studies over many years. We used the composite scores of both these scales, scored as originally intended. It is possible, however, to select items from each of these scales and calibrate their weights so that they more closely approximate a true interval or ratio scale by using item response theory or Rasch analysis (40, 41). This may represent an alternative method to accomplish the aims we pursued here.
Data parsimony is a desirable feature in a research study; among other reasons, because it avoids the problems we mentioned in the beginning of this article. In the present analysis, we have reduced the original 3 scales into 1 single variable that in many respects outperforms the individual primary scales. A similar data reduction strategy could be used for other RA processes, such as inflammatory disease activity, disease damage, joint impairment, and functional limitation (32, 33). For example, a latent variable extracted from the disease activity measures recommended for RA clinical trials (42) could potentially lead to more efficient trials if the latent variable outperforms the primary scales, as was the case for disability measures in the present analysis.
It is important to point out that ours is a data-driven approach, and that the latent variable cannot be fully specified as an outcome measure in advance of a study. We do not advise investigators to attempt to directly apply the factor loadings we estimated here to develop a latent disability variable for use in their own studies, because data from another patient sample could be quite different. Moreover, investigators may have reasons to choose a different set of primary disability scales from those used here. We do believe, however, that researchers can apply a principal component factor analysis similar to that shown here to their own data, to extract a latent variable that will likely exceed the primary scales in reliability.
In conclusion, we have used factor analysis to derive a latent variable that measures physical disability in RA. The new variable outperforms the primary scales in a number of tests of association with comparison criterion standards. This approach may be used to develop latent variables measuring other RA disease components, such as disease activity, damage, and functional limitation.