Comprehensive assessment of clinical outcome and quality of life after resection interposition arthroplasty of the thumb saddle joint

Abstract

Objective

To explore the biometric and psychometric properties of clinical, generic, and condition-specific instruments and to assess quantitatively the outcome after resection interposition arthroplasty (RIAP) of the thumb saddle joint.

Methods

One hundred three patients requiring 112 arthroplasties were assessed in a 4.5–7.7-year cross-sectional catamnesis by means of 4 widely used questionnaires and clinical and radiographic examinations.

Results

In all dimensions of the Short Form 36 (SF-36), the outcome was equal to or significantly better than expected by the norm. The Disability of the Arm, Shoulder and Hand questionnaire (DASH) revealed some small, mainly functional limitations (mean score 78.4, norm 86.4). The SF-36, the DASH, and the Patient Related Wrist Evaluation form (PRWE) correlated highly and loaded on the same factor. The Hand Function Index was independent of the clinical measurements (range of motion, strength, etc. on the specially designed Custom Form) and of the self-rating.

Conclusion

Long-term followup of 112 RIAPs in 103 patients showed excellent health and quality of life. A questionnaire set consisting of the SF-36, the DASH (or alternatively the short PRWE), and the Custom Form is proposed for the comprehensive and specific assessment of thumb joint conditions.

INTRODUCTION

Osteoarthritis (OA) is the sixth most important cause of disability in the developed regions of the world, and an increase of 19% in its public health impact due to disability has been estimated for the period 1990–2020 (1). At least 12% of the US population had OA in the early 1980s (2). The prevalence of radiographic hand OA is estimated to be 32.5% in US adults and 33.0% in primary care settings, and the basal thumb joint is one of the most affected joints in hand OA (3–5).

The thumb is crucial for normal hand function, especially at the level of the carpometacarpal joint, and high disability results if it is affected by OA. This soon becomes apparent due to the fact that the thumb is involved in almost every activity of daily living (ADL), such as holding a tool, opening a bottle, etc. Assessment of specific functional deficits and their consequences for performance of ADL, perception of quality of life (QOL), and assessment of clinical and comprehensive QOL according to the World Health Organization's (WHO's) ICF concept (International Classification of Functioning, Disability, and Health) are particularly important in hand conditions, as outlined in various reports (6–9).

Resection interposition arthroplasty (RIAP) of the thumb saddle joint was first described by Epping and Noack and has become an established operative treatment (10–14). There are multiple reports in the literature on the mid- and long-term results of RIAP, but most of them used examiner-dependent outcome measures and only a few of them employed standardized, valid, and comprehensive assessment instruments (11–14). Hardly any study has examined the comparability of the various instruments used in relation to their relative validity, clinical utility, or specificity, especially among condition-specific tools. This is also true for other hand joint conditions, particularly in rheumatoid arthritis (15–17). Therefore, little data exist on how functional outcome affects ADL and QOL, and the data of many outcome studies cannot be compared because the assessment was not standardized, validated, uniform, or consistent.

We tested a set of standardized, validated, clinical (physician assessed), and patient self-administered health measurement instruments in a cross-sectional followup examination of patients 4–8 years after RIAP. The first aim of the study was to assess the validity of the assessment tools, the quality of the data obtained, and the practicability of the instruments' use in daily clinical routine. The resulting recommendations for an optimal set of instruments will contribute to the development of a standardized assessment tool for use in the clinical environment. The second aim was to describe health status and QOL compared with population-based normative data using a holistic comprehensive assessment approach.

PATIENTS AND METHODS

Patients and intervention.

All patients who had undergone RIAP of the thumb saddle joint at the Department of Upper Extremity and Hand Surgery, Schulthess Klinik, Zurich, Switzerland, in the years 1996–1998 were sent a written invitation to attend a followup consultation with the study physician (MJ). The patients were then contacted by telephone, which provided the opportunity to motivate them to come to the clinic, to answer any remaining questions, and, in the case of patients who did not want to attend assessment, to establish why the invitation was declined. Travel expenses were refunded, but otherwise no payment was made for participation.

All patients were operated on by 1 of 2 surgeons (DBH or BRS) by the following standard procedure. The trapeziometacarpal joint was exposed and the entire trapezium excised. Approximately two-thirds of the flexor carpi radialis tendon was released proximally in the forearm with its insertion left intact on the base of the second metacarpal. The free end was then routed through a drill hole in the base of the thumb metacarpal and sutured to itself. The remaining tendon was folded back on itself and held with suture, creating an interponate that looks like a small fish and is thus called “anchovy” in hand surgery. This interposition material was placed into the trapezium space and the capsule was closed over it. Postoperatively, a removable thumb splint or thumb-forearm cast was prescribed for 5–6 weeks.

Measures.

The assessment instruments to be used were selected using the same criteria as described in our previous shoulder study (9). All validated, clinically well-tested health measurement instruments for the upper extremity were identified by searching the literature in PubMed. They were judged for quality and rated in relation to their practical handling, suitability for use in clinical routine, and clinical–epidemiologic qualities. The resulting set comprised radiologic imaging and the following assessment tools: 1) A sociodemographic questionnaire (18), 2) the Self-Administered Comorbidity Questionnaire (SCQ) (19), 3) the Short-Form 36 (SF-36) (20–22), 4) the Disability of the Arm, Shoulder and Hand questionnaire (DASH) (23–26), 5) the Patient Related Wrist Evaluation form (PRWE) (27), 6) the Hand Functional Index (HFI) of the Keitel Function Test (KFT) (28), and 7) a specifically designed form to evaluate clinical parameters (Custom Form). Detailed descriptions of the sociodemographic questionnaire, the SCQ, the SF-36, and the DASH can be found in Angst et al (9).

To obtain an overall assessment of the arthroplasty result, patients were asked to assess their current state of health related to the operated thumb joint at the followup compared with the state before arthroplasty by the so-called transition question (29). Patients also rated their level of satisfaction on a 10-cm visual analog scale (VAS) in terms of how well the outcome met their preoperative expectations of the arthroplasty. Finally, they were asked whether, with their current knowledge of the outcome, they would again choose the operation if they found themselves in similar circumstances to those that prevailed preoperatively.

The PRWE is a short, self-administered questionnaire that uses a VAS from 0 = best to 10 = worst health for each item (27). Four items assess pain intensity under different conditions, 1 item assesses pain frequency, and 10 items assess disability during ADL that are dependent on hand function in 2 sections: specific activities (6 items) and usual activities (work, etc., 4 items). The unweighted means of the items are used to determine the pain and function scores. The unweighted average of these 2 scores gives the total PRWE score. Each score (PRWE pain, PRWE function, and total PRWE) was transformed into a scale ranging from 0 = worst to 100 = best to facilitate comparison with those of the other questionnaires (e.g., the SF-36). The English version of the PRWE was translated into German according to the guidelines of the American Academy of Orthopedic Surgeons Outcomes Committee (30) and showed good reliability and validity (31).
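The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `prwe_scores`, the example ratings, and the linear inversion 100 − 10 × mean used to reach the 0 = worst to 100 = best scale are assumptions, since the text does not state the exact transformation formula.

```python
from statistics import mean

def prwe_scores(pain_items, function_items):
    """Return (pain, function, total) on a 0 = worst to 100 = best scale.

    pain_items: 5 ratings (4 pain intensity + 1 pain frequency), each 0..10
    function_items: 10 ratings (6 specific + 4 usual activities), each 0..10
    """
    if len(pain_items) != 5 or len(function_items) != 10:
        raise ValueError("expected 5 pain items and 10 function items")
    for rating in list(pain_items) + list(function_items):
        if not 0 <= rating <= 10:
            raise ValueError("ratings must lie between 0 and 10")

    # Unweighted item means on the original 0 = best to 10 = worst scale.
    pain_raw = mean(pain_items)
    function_raw = mean(function_items)
    # The total is the unweighted average of the 2 subscores.
    total_raw = mean([pain_raw, function_raw])

    # Assumed linear inversion: 0 (best) maps to 100, 10 (worst) maps to 0.
    def to_100(raw):
        return 100.0 - 10.0 * raw

    return to_100(pain_raw), to_100(function_raw), to_100(total_raw)

# Hypothetical ratings for one patient (not study data).
pain, function, total = prwe_scores(
    [2, 3, 1, 4, 2],
    [1, 0, 2, 1, 3, 2, 0, 1, 1, 2],
)
print(pain, function, total)
```

Because the total is the average of the 2 subscores rather than of all 15 items, the 5 pain items carry the same weight as the 10 function items.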

The English version of the PRWE proved to be reliable and valid, and to have a high correlation with the SF-36 (Pearson's r up to 0.72 in pain and up to 0.48 in function) (27). In a setting of patients after distal radius fracture, the PRWE was the most responsive score (standardized response mean [SRM] 2.27) when compared with the DASH (SRM 2.07), the SF-36 bodily pain (SRM 0.92), and the SF-36 physical functioning (SRM 1.33) (32). Grip strength was most predictive for the PRWE score in a set of clinical parameters (33).
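For readers unfamiliar with the SRM, it is the mean of the individual change scores divided by the standard deviation of those change scores. The following minimal sketch uses hypothetical baseline and followup values (not data from the cited study), and the function name is chosen for illustration only.

```python
from statistics import mean, stdev

def standardized_response_mean(baseline, followup):
    """SRM = mean of the change scores / SD of the change scores."""
    if len(baseline) != len(followup) or len(baseline) < 2:
        raise ValueError("need paired scores from at least 2 patients")
    changes = [after - before for before, after in zip(baseline, followup)]
    return mean(changes) / stdev(changes)

# Hypothetical paired scores (0 = worst, 100 = best) for 5 patients.
baseline = [40.0, 55.0, 30.0, 62.0, 48.0]
followup = [70.0, 80.0, 65.0, 85.0, 66.0]
srm = standardized_response_mean(baseline, followup)
print(round(srm, 2))
```

A larger SRM means that the average improvement is large relative to its variability between patients, which is why it serves as a measure of responsiveness.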

The KFT consists of a part for the upper (11 items) and a part for the lower (14 items) extremity and was originally presented in German (28). We used only the upper-extremity section for this study, which is also known as the HFI. The KFT is a set of clinical tests performed by the physician or other health professional. The 11 items of the HFI measure hand performance and range of motion by partly complex tests/functions for the left and the right hand separately. The ability to perform the tests also depends on the function of the elbow and the shoulder, but mainly on the function of the hand joints. The scores of the items from 0 (worst) to 2 or 3 (best) per item are summed for each hand to a score from 0 (worst) to 26 (best), which was transformed for this study into 0 (worst) to 100 (best), as for all the other instruments' scores.

In the past, the KFT was mainly used in rheumatoid arthritis (RA), where one study showed that the KFT correlated well to the Health Assessment Questionnaire (HAQ; r = 0.72) and the HFI correlated well to finger joint erosion (r = 0.68) and to grip strength (r = 0.58) (34). The correlation of the HFI to the functional scales of the patient self-rated Arthritis Impact Measurement Scale was between r = 0.34 and r = 0.60 (35). The KFT was the most effective measure for the detection of functional treatment effects and had the highest interobserver agreement in a randomized controlled trial with auranofin in RA when compared with the HAQ and the Quality of Well Being Scale (16). In a multivariate regression model of various sociodemographic and clinical parameters (including the HAQ), the KFT made the greatest contribution as predictor for functional disability in RA (36).

To complete the clinical assessment of specific hand functions, a special form was designed to summarize the results of the various clinical tests, the so-called Custom Form, which is shown in Appendix A. A total score from 0 = worst to 100 = best can be computed for each hand as the arithmetic mean of the 34 item scores, where the minima of the parameters were set to scores of 0 points (e.g., 0° in range of motion, or ulnar deviation present), and the maxima (e.g., 60 kg in grip strength, or no deformity) were set as the scores of 100.
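The Custom score computation can be illustrated as follows. The sketch assumes that measured values are rescaled linearly between 0 and the stated maximum and that yes/no findings score either 0 or 100; the text gives only the end points, so the linear interpolation, the helper names, and the 5 example items (instead of the full 34) are assumptions.

```python
from statistics import mean

def scale_measurement(value, maximum):
    """Map a measurement linearly onto 0..100, assuming a minimum of 0."""
    clipped = min(max(value, 0.0), maximum)
    return 100.0 * clipped / maximum

def scale_finding(favorable):
    """Score a yes/no finding: 100 if favorable, 0 if unfavorable."""
    return 100.0 if favorable else 0.0

def custom_score(item_scores):
    """Arithmetic mean of the item scores (0 = worst, 100 = best)."""
    return mean(item_scores)

# Hypothetical findings for one hand (only 5 items for illustration).
items = [
    scale_measurement(30.0, 60.0),   # grip strength 30 kg, maximum 60 kg
    scale_measurement(6.0, 12.0),    # pinch strength 6 kg, maximum 12 kg
    scale_measurement(80.0, 100.0),  # radial abduction 80 degrees, maximum 100
    scale_finding(True),             # no ulnar deviation of the wrist
    scale_finding(False),            # swan-neck deformity present
]
score = custom_score(items)
print(score)
```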

Analysis.

Health status and QOL were quantified and characterized by descriptive statistics and comparison with normative data. Descriptive statistics for all instrument scores (except grip and pinch strength, which were in kilograms) were given for a scale ranging from 0 = worst to 100 = best health to make it possible to compare them with each other. The scores of the SF-36 and the DASH (normative data are reported for these 2 instruments) were compared with population normative data using the nonparametric Wilcoxon's test because most of the scores were not normally distributed.

The quality of the data and the properties (including validity) of the instruments were explored by the distribution characteristics of the scores, the correlations between the scores, and factor analysis. Normal (Gaussian) distribution of the scores was examined by the Kolmogorov-Smirnov test. The floor effect was quantified as the percentage of patients with a score of 0 and the ceiling effect as the percentage with a score of 100. Spearman's rank correlation coefficients quantified relationships between the scores from different scales and the construct validity of the instruments, as most of the data were not normally distributed and nonparametric analyses were required. In the factor analysis (principal component analysis with varimax rotation), every instrument's score is a vector in the multidimensional space. The procedure calculates how parallel (how strongly correlated) one vector is to all the other vectors for each score to detect common domains, like bundles of vectors. This results in a sort of multivariate correlation coefficient for each score to different factors and was used to identify the main domains being represented by the various different instruments, both important to characterize the properties and to quantify the content validity of the instruments.
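As an illustration of the nonparametric correlation used here, Spearman's coefficient can be computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below uses only the Python standard library; the study itself used SPSS, and the paired scores are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of positions start+1..end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical paired scores of 6 patients (0 = worst, 100 = best).
dash = [81.3, 95.0, 60.2, 100.0, 72.5, 100.0]
prwe = [86.5, 90.0, 55.0, 100.0, 80.0, 97.0]
rho = spearman(dash, prwe)
print(round(rho, 2))
```

Working on ranks makes the coefficient insensitive to the skewed, ceiling-affected score distributions, which is the reason given above for preferring it to a parametric coefficient.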

The examination unit was the patient for the SF-36, the DASH, and the PRWE throughout the analysis. For the HFI of the KFT, the Custom, and the other clinical data, it was the operated joint. All analyses were performed using the statistical software package SPSS 11.0 for Windows (SPSS Inc., Chicago, IL).

RESULTS

Patients.

RIAP of the thumb saddle joint as a treatment for primary OA (rhizarthrosis) was performed on 144 patients at the Schulthess Klinik, Zurich, between 1996 and 1998. Eleven (8%) could not be traced, even with the help of the general practitioner and the corresponding residents' registration office, e.g., they had changed address or moved abroad. Four (3%) patients had died and 19 (13%) declined participation in the study: 5 due to severe illness or handicap, 5 because of the great distance between their homes and the clinic (>1,000 km), and 9 refused to participate in the study. The data of 7 (5%) patients were highly incomplete for various reasons (e.g., they agreed at first to fill out the set of questionnaires but did not return it) and had to be excluded. Finally, 103 (72%) patients (112 joints; 9 had bilateral RIAP) were included in the study and were examined at the clinic between June and November 2003.
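The patient flow reported above can be checked with simple arithmetic. The sketch uses only the counts given in the text; the variable and function names are illustrative.

```python
operated = 144  # patients who underwent RIAP in 1996-1998
excluded = {
    "could not be traced": 11,
    "died": 4,
    "declined participation": 19,
    "highly incomplete data": 7,
}

def percent(count, total=operated):
    """Percentage of all operated patients, rounded to a whole number."""
    return round(100 * count / total)

included = operated - sum(excluded.values())
bilateral = 9
joints = included + bilateral  # each bilateral patient adds a second joint

for reason, count in excluded.items():
    print(f"{reason}: {count} ({percent(count)}%)")
print(f"included: {included} ({percent(included)}%), joints: {joints}")
```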

Table 1 shows the descriptive sociodemographic and disease-specific data. In Switzerland, basic schooling consists of 6 years of primary school for all children together and 2–3 years of secondary school, which is divided into different levels of difficulty. The highest level may be comparable to the US high school. Sport comprised self-reported regular physical activity, organized or individual.

Table 1. Sociodemographic, disease-specific, and disease-modifying data*

| Characteristic | Value |
|---|---|
| Age of patients, mean ± SD/median (range) years | 67.7 ± 9.8/67.8 (38.5–90.6) |
| Age of arthroplasty, mean ± SD/median (range) years | 6.2 ± 0.8/6.2 (4.5–7.7) |
| Sex | |
|   M | 18 (17) |
|   F | 85 (83) |
| Race | |
|   White | 103 (100) |
| Education | |
|   Basic school (8–9 years) | 17 (17) |
|   Vocational training | 55 (53) |
|   College/high school/university | 31 (30) |
| Living conditions | |
|   Urban | 63 (61) |
|   Rural | 40 (39) |
|   Alone | 26 (25) |
|   With partner | 77 (75) |
| Smoker | |
|   No | 86 (83) |
|   Yes | 17 (17) |
| Alcohol consumption | |
|   None | 25 (24) |
|   Occasional | 49 (48) |
|   Daily | 26 (25) |
|   Several times daily | 3 (3) |
| Sport, hours/week | |
|   0 | 33 (32) |
|   0–<1 | 12 (12) |
|   1–2 | 36 (35) |
|   >2 | 22 (21) |
| Comorbidities excluding joint disease | |
|   None | 19 (18) |
|   1 | 29 (28) |
|   2 | 26 (25) |
|   3 | 16 (16) |
|   4 or more | 13 (13) |
| Arthroplasty | |
|   Unilateral | 94 (91) |
|   Bilateral | 9 (9) |
|   Left | 48 (43) |
|   Right | 64 (57) |

* Data presented as no. (%) unless otherwise stated.

Compared with the 94 (91%) unilaterally operated patients, the 9 (9%) patients with bilateral arthroplasty were slightly younger (mean age 68.1 years unilateral and 63.8 bilateral; Wilcoxon's P = 0.118) and scored slightly worse on some of the instruments (e.g., mean DASH 78.8 unilateral and 73.7 bilateral; P = 0.510). However, none of the following differences were significant (data not shown in detail): sex distribution; age; education level; number of comorbidities; all scales of the SF-36, including the physical component summary (PCS) and mental component summary (MCS); the DASH; the PRWE; the HFI of the KFT; and the Custom score.

Administration and feasibility of the assessment instruments.

All patients could easily understand and complete all questionnaires. On average, the sociodemographic questionnaire took 5 minutes to complete, the comorbidity questionnaire 2, the SF-36 5, the DASH 4, and the PRWE 3 minutes (timed with a stopwatch for 20% of the patients). Thus, the whole set of self-rated questionnaires required, on average, 14 minutes to complete. With the brief introduction, distribution, collection of the questionnaires, and checking the completeness of responses, 20 minutes per patient were needed. The physical examination and completion of the KFT/HFI and the Custom Form by the examining physician took an additional 20 minutes (in this setting where only the basal thumb joint was affected).

Health and QOL.

Table 2 and Figure 1 show the results for each instrument's score and subscores, and their comparison with normative values (only available for the SF-36 and the DASH), using a scale of 0 = worst to 100 = best health for all scales except for the grip and pinch strengths (which are in kilograms). All scores revealed relatively high values. Except in physical functioning (equal to the norm) and in role physical (in trend above the norm), 8 of the 10 SF-36 mean scores of the patients were highly significantly better than the expected age-, sex-, and comorbidity-matched normative population values. In the DASH, the mean scores were only slightly, but nevertheless significantly, below the norm (e.g., DASH total score 90.7% of the norm). Moderate to high ceiling effects (≥10%) were obtained on the SF-36 for physical functioning, role physical, bodily pain, vitality, social functioning, role emotional, the DASH symptoms, the PRWE pain, the PRWE function, total PRWE, and the HFI/KFT. Radial abduction (mean 86.2°) revealed some hypermobility of the basal thumb joint, whereas palmar abduction (mean 51.3°) was as expected (Table 2).
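The "percent of the norm" figures can be reproduced from the mean and normative values reported in Table 2. The short check below uses only those published means and norms; the dictionary layout and function name are illustrative.

```python
# (mean score of the patients, population norm), both on the 0-100 scale.
dash_scores = {
    "DASH symptoms": (79.5, 85.6),
    "DASH function": (78.7, 86.8),
    "DASH total": (78.4, 86.4),
}

def percent_of_norm(patient_mean, norm):
    """Express a mean score as a percentage of the normative value."""
    return round(100 * patient_mean / norm, 1)

ratios = {name: percent_of_norm(*pair) for name, pair in dash_scores.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio}% of the norm")
```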

Table 2. Outcome data after RIAP of the thumb saddle joint (n = 103 patients, 112 joints)*

| Score | Median | Mean | SD | Norm† | P | n‡ | Minimum | Maximum | % Floor | % Ceiling |
|---|---|---|---|---|---|---|---|---|---|---|
| SF-36 Physical functioning | 73.5 | 66.9 | 25.5 | 65.8 | 0.418 | 100 | 0.0 | 100.0 | 1 | 10 |
| SF-36 Role physical | 100.0 | 71.6 | 38.8 | 62.2 | 0.071 | 103 | 0.0 | 100.0 | 16 | 59 |
| SF-36 Bodily pain | 62.0 | 63.3 | 26.9 | 51.0 | < 0.001 | 103 | 10.0 | 100.0 | 0 | 23 |
| SF-36 General health | 72.0 | 67.7 | 21.8 | 53.9 | < 0.001 | 101 | 0.0 | 100.0 | 1 | 2 |
| SF-36 Vitality | 60.0 | 58.1 | 19.1 | 53.1 | 0.004 | 102 | 0.0 | 100.0 | 1 | 1 |
| SF-36 Social functioning | 100.0 | 87.0 | 20.4 | 79.6 | < 0.001 | 103 | 25.0 | 100.0 | 0 | 58 |
| SF-36 Role emotional | 100.0 | 87.1 | 31.0 | 80.2 | < 0.001 | 103 | 0.0 | 100.0 | 10 | 83 |
| SF-36 Mental health | 80.0 | 74.0 | 18.7 | 67.0 | < 0.001 | 100 | 13.0 | 100.0 | 0 | 1 |
| SF-36 PCS | 46.3 | 43.3 | 10.8 | 39.9 | 0.001 | 97 | 14.6 | 59.6 | 0 | 0 |
| SF-36 MCS | 56.6 | 53.4 | 9.9 | 50.1 | < 0.001 | 97 | 18.7 | 68.8 | 0 | 0 |
| DASH Symptoms | 83.3 | 79.5 | 19.3 | 85.6 | 0.016 | 99 | 12.5 | 100.0 | 0 | 20 |
| DASH Function | 83.7 | 78.7 | 19.0 | 86.8 | 0.002 | 85 | 35.2 | 100.0 | 0 | 9 |
| DASH | 81.3 | 78.4 | 17.7 | 86.4 | < 0.001 | 91 | 41.4 | 100.0 | 0 | 5 |
| PRWE Pain | 84.0 | 77.0 | 24.0 | | | 103 | 4.0 | 100.0 | 0 | 19 |
| PRWE Function | 92.9 | 81.4 | 23.0 | | | 96 | 15.6 | 100.0 | 0 | 24 |
| PRWE | 86.5 | 79.0 | 22.5 | | | 96 | 12.0 | 100.0 | 0 | 16 |
| HFI/KFT | 100.0 | 90.6 | 15.8 | | | 104 | 28.6 | 100.0 | 0 | 49 |
| Radial abduction (°) | 90.0 | 86.2 | 14.7 | | | 109 | 30.0 | 100.0 | 0 | 1 |
| Palmar abduction (°) | 60.0 | 51.3 | 23.5 | | | 108 | 0.0 | 90.0 | 1 | 1 |
| Grip strength (kg) | 19.0 | 20.0 | 8.6 | | | 112 | 2.0 | 52.0 | 0 | |
| Pinch strength (kg) | 5.0 | 4.9 | 1.8 | | | 112 | 2.0 | 10.0 | 0 | |
| Custom | 64.9§ | 64.2 | 8.5 | | | 84 | 38.2 | 85.5 | 0 | 0 |

* The instrument scales are 0 = worst health, maximal symptoms/limitation; 100 = best health, no symptoms/limitations; exceptions: radial and palmar abduction are in degrees and grip and pinch strength are in kilograms. The examination unit was the patient for the SF-36, the DASH, and the PRWE. For the HFI, the Custom, range of motion, and strength data, it was the operated joint. RIAP = Resection Interposition Arthroplasty; SF-36 = Short-Form 36; PCS = physical component summary; MCS = mental component summary; DASH = Disability of the Arm, Shoulder and Hand questionnaire; PRWE = Patient Related Wrist Evaluation form; HFI/KFT = Hand Functional Index of the Keitel Function Test; Custom = a score covering various clinical parameters (see text).
† German population normative values (corrected for sex, age, and comorbidity).
‡ Number of patients or arthroplasty joints assessed.
§ Normally distributed (Kolmogorov-Smirnov test).

Figure 1.

Comparison of the instruments' scores of patients after resection interposition arthroplasty (RIAP) of the thumb saddle joint (n = 103 patients, 112 joints). A color marks all subscores per instrument; horizontal stripes for the function subscores, and checkered for the pain/symptoms subscores. Scaling: 0 = worst, 100 = best. Grip and pinch strength in kilograms. Horizontal black lines are the German population normative values, corrected for sex, age, and comorbidity. SF-36 = Short Form 36; PCS = physical component summary; MCS = mental component summary; DASH = Disability of the Arm, Shoulder and Hand questionnaire; PRWE = Patient Related Wrist Evaluation form; HFI = Hand Function Index; Custom = a score covering various clinical parameters (see text).

In the overall rating of the arthroplasty result, 62 (60%) of the patients felt that their expectations of the arthroplasty had been completely met (score = 10 on the VAS, median 10.0); only 5 (5%) patients were somewhat dissatisfied (score < 5). Most (n = 94; 91%) patients felt their condition was improved (79 greatly and 15 slightly) at the time of the assessment compared with before the arthroplasty; 2 (2%) felt it was unchanged, 7 (7%) felt slightly worse, and none felt much worse. Most of the patients (n = 91; 88%) declared that they would choose RIAP again if they found themselves in similar circumstances to those that prevailed preoperatively. Two patients marked “no” for that item and 10 marked “do not know.” Dorsopalmar and lateral radiographs showed no dislocation of metacarpal I in any patient. Detailed analysis of the radiographic results will be one focus of a future report.

Construct and concurrent validity of the instruments.

The highest correlation was observed between the 2 condition-specific self-rating scores, the DASH and the PRWE (r = 0.82) (Table 3). The SF-36 PCS was moderately correlated with the DASH (r = 0.68) and the PRWE (r = 0.53). This finding was supported by factor analysis (Table 4). These 3 scores provided the strongest factor 1 and explained almost half of the variance. The 2 clinical measures, the HFI/KFT and the Custom score, were weakly correlated (r = 0.30) and described 2 different clinical domains (factors): the Custom (factor 2) with moderate correlations with the self-rated DASH (r = 0.57) and the PRWE (r = 0.56), and the HFI/KFT (factor 3), which was more weakly correlated with all other scores. The psychosocial dimensions of the SF-36 formed an independent dimension (factor 4). All 4 factors together explained 89.6% of the total variance of the scores.

Table 3. Spearman's rank correlation coefficients between assessment instruments*

| | SF-36 PCS | SF-36 MCS | DASH | PRWE | KFT/HFI |
|---|---|---|---|---|---|
| SF-36 PCS | | | | | |
| SF-36 MCS | −0.18 | | | | |
| DASH | 0.68 | 0.04 | | | |
| PRWE | 0.53 | 0.04 | 0.82 | | |
| KFT/HFI | 0.32 | 0.15 | 0.44 | 0.35 | |
| Custom | 0.38 | 0.14 | 0.57 | 0.56 | 0.30 |

* For abbreviation definitions, see Table 2.

  • P ≥ 0.05.

  • P < 0.001.

Table 4. Factor loads of the instruments' main scores*

| | Factor 1 Physical QOL | Factor 2 Clinical specific | Factor 3 KFT/HFI | Factor 4 Mental QOL |
|---|---|---|---|---|
| Explained variance, % | 47.3 | 17.6 | 13.9 | 10.9 |
| SF-36 PCS | 0.90 | −0.06 | 0.21 | −0.12 |
| SF-36 MCS | −0.04 | −0.04 | 0.05 | 0.99 |
| DASH | 0.79 | 0.49 | 0.17 | 0.01 |
| PRWE | 0.72 | 0.50 | −0.06 | 0.07 |
| KFT/HFI | 0.16 | 0.21 | 0.94 | 0.06 |
| Custom | 0.17 | 0.89 | 0.25 | −0.06 |

* A factor load of 0.0 indicates no agreement and a load of 1.0 indicates perfect agreement of the scale or score with the factor. QOL = quality of life. For other abbreviation definitions, see Table 2.

  • P < 0.001.

  • P ≥ 0.05.

DISCUSSION

The 103 OA patients with 112 thumb saddle joints after RIAP revealed a very high level of health and QOL at 6.2 years (mean) postoperative. All 8 subscores and the 2 summary scores of the generic SF-36 were higher than those expected for the overall German population after correction for sex, age, and comorbidity (see Table 2). Eight of the 10 SF-36 scores were highly significantly above the norm. This finding is supported by the high rate (10–83%) of ceiling effect in 10 of the 18 subscores, which represents the percentage of patients with the highest possible score of 100 points. Most of the patients (at least 75%) felt much better than before the arthroplasty, were highly satisfied with the result, and would choose the intervention again if necessary. Post-hoc analysis showed that the proportion of unilateral–bilateral RIAP was not a source of bias.

Some slight, mainly functional limitations were revealed by the generic SF-36 physical functioning item, which was equal to the norm and not above it as might have been expected from the other SF-36 scores, and the condition-specific subscores of the DASH. However, the mean DASH subscores of 91–93% of the norm can be considered high because problems of the upper limb are very rare in the general population. Patients with bilateral RIAP had slightly but not significantly poorer health than those with unilateral RIAP. It is important to know the relative proportion of bilateral and unilateral arthroplasties in any given cohort, as it can confound the overall result for the given patient sample.

Compared with our total shoulder arthroplasty patients (mean age 65.1 years, 77% female, 77% with ≥1 comorbidities, 51% OA) (9) and the total elbow arthroplasty patients (mean age 64.1 years, 71% female, 71% ≥1 comorbidities, 25% posttraumatic) (37), the RIAP patients (mean age 67.7 years, 83% female, 81% ≥1 comorbidities, 100% OA) showed much better health and QOL. This was especially due to lower functional limitations. The mean SF-36 physical functioning was 54.9 for shoulder arthroplasty, 48.7 for elbow arthroplasty, and 66.9 for RIAP (0 = worst, 100 = best). The corresponding numbers for the DASH, in which 24 of the 30 items measure function, were 64.0 (shoulder), 55.3 (elbow), and 78.4 (hand). Mean grip strength was 12.5 kg in the elbow patients and 20.0 kg in the RIAP patients. Mental health and social functioning were almost the same in all 3 conditions (SF-36 MCS score 52.3–53.4), and overall pain also showed smaller differences: SF-36 bodily pain 55.4 (shoulder), 59.1 (elbow), and 63.3 (hand).

Our results were comparable to those of other settings with hand joint diseases. The mean DASH was 74.2 (our patients 78.4) for 70 RIAP 2.9 years postoperatively (12). Six years after scaphoid fracture (n = 35), the mean SF-36 PCS score was 48 (our patients 43.3) and the PRWE score was 78.7 (our patients 79.0) (27). Compared with OA, RA affecting multiple joints and general health revealed variable and much poorer results (with and without arthroplasty): the mean HFI of the KFT was 54.0 (our patients 90.6) in a population-based RA sample (34) and 71.7 in an outpatient RA setting (15).

The patient's questionnaire set (containing the sociodemographic questionnaire, the SCQ, the SF-36, the DASH, and the PRWE) needed on average no more than 20 minutes to complete. The evaluation of the clinical parameters was part of the physical examination, which lasted up to 20 minutes, including 5 minutes to complete the HFI/KFT and the Custom Form. For the examiner, some of the tests of the HFI were complicated and difficult to explain to the patient. For example, item 9: “Put your hands on the table with the thumbs down and the backs of the hands towards each other. The thumbs hang downwards over the edge of the table. Lean the back of the hands slightly inwards.”

The data qualities of normal score distribution and low floor and ceiling effects are positive bio- and psychometric properties of any scale, as discussed in detail in our previous article (9). Normally distributed scores allow the use of sensitive parametric significance tests. Low floor and ceiling effects allow differentiation between patients by score; for instance, 2 patients with a score of 100 (best health) cannot be further differentiated by that score, although their health state may be different. For example, DASH item 11: “Carry a heavy object (over 10 lbs),” 0 = unable, 100 = no difficulty (transformed score). Patient 1 might be able to carry up to 11 lbs and patient 2 up to 20 lbs, but both have a score of 100. Besides some scales of the SF-36 and DASH symptoms that are well known for this problem, all PRWE scores and particularly the HFI/KFT showed high ceiling effects. The Custom score was not affected in this way and was the only one with normally distributed values.
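Floor and ceiling effects as defined in this study (the percentage of patients scoring 0 and 100, respectively) are straightforward to compute. The following sketch uses hypothetical scores, and the function name is illustrative.

```python
def floor_ceiling_effects(scores, lowest=0.0, highest=100.0):
    """Return (% of scores at the scale minimum, % at the scale maximum)."""
    if not scores:
        raise ValueError("no scores given")
    n = len(scores)
    floor = 100.0 * sum(1 for s in scores if s <= lowest) / n
    ceiling = 100.0 * sum(1 for s in scores if s >= highest) / n
    return floor, ceiling

# Hypothetical scores of 10 patients on a 0 = worst to 100 = best scale.
scores = [100.0, 100.0, 87.5, 62.5, 100.0, 0.0, 75.0, 100.0, 50.0, 92.5]
floor, ceiling = floor_ceiling_effects(scores)
print(f"floor {floor:.0f}%, ceiling {ceiling:.0f}%")
```

In this example 4 of the 10 patients share the maximum score, so the scale cannot distinguish between them even if their actual abilities differ.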

The degree of agreement between the instruments in measuring a given symptom or functional ability and the construct validity for one instrument compared with another are quantified by Spearman's rank correlation coefficients (see Table 3). The condition-specific DASH and PRWE correlated highly not only with each other (r = 0.82) but also with the SF-36 PCS (r = 0.68 and r = 0.53), and all 3 loaded on the same factor in factor analysis. This means, for instance, that the construct validity of the PRWE is very similar to that of the DASH. Surprisingly, as shown by the factor analysis, the 2 instruments were not able to represent a condition-specific physical dimension that would differ from the generic physical dimension represented by the SF-36 PCS. To shorten the self-rating set, the PRWE could be omitted because the DASH was less affected by the ceiling phenomenon. Further advantages of the DASH compared with the PRWE are that the DASH is referred to more frequently in the literature and for a broader range of applications, that it is also applicable for the elbow and the shoulder, and that normative values are available for it. On the other hand, one study showed more specificity of the PRWE compared with the DASH (32). However, a condition-specific self-rating dimension for the basal thumb joint could not be found in the factor analysis (Table 4) because the PRWE and the DASH loaded on the same factor together with the generic SF-36 PCS. All 4 factors together explained the very high proportion of 89.6% of the variance of all scales, i.e., the factors represented the main 4 health domains. They were able to express almost all the information of the instruments' scales and could replace them.

Surprisingly, the 2 clinical measures, the HFI and the Custom score, correlated poorly (r = 0.30) and showed up as 2 largely independent dimensions in factor analysis. The HFI did not correlate well with any other measure used in this study, which indicates either that the HFI is not a valid measure of hand function (whereas the other measures are, because they have been validated) or that it measures something other than hand function. Together with the difficulties of application and the high ceiling effect, this makes the appropriateness and importance of the HFI for this RIAP setting questionable. The Custom score had much better properties (floor, ceiling, normal distribution) and was moderately correlated with the DASH (r = 0.57) and the PRWE (r = 0.56).

The large (n = 103) and homogeneous sample (4.5–7.7 years postoperative) of RIAP patients can be considered a strength of the present study. The participation rate was high (103 of a possible 144; 72%), so only a low participation bias (a form of selection bias) has to be expected. Participation bias is a source of possible overestimation of the outcome because the health of the nonresponders may have been worse than that of the responders. Post-hoc analysis was performed to explore other possible selection bias. This can arise if a certain patient group, e.g., the bilaterally operated, has a worse outcome than another group, e.g., the unilateral RIAP, and the proportion of the 2 groups is different in 2 study settings. The assessment was comprehensive but also specific, covering general health, QOL, subjective patient self-ratings, and objective clinical findings. Normative data were stratified by the confounders age, sex, and comorbidity when interpreting the patients' scores because these 3 cofactors potentially affect and, therefore, confound health state and QOL (9). Unfortunately, norms were only available for the SF-36 and the DASH.

The limitations of the study included its cross-sectional, uncontrolled design (including selection bias) and the inherent problems of any self-assessment requiring a certain level of psychointellectual abilities and compliance. Purely methodologically, the study's findings are restricted to conditions of the thumb saddle joint. However, some generalizability of the findings to the whole hand can be expected because all instruments also cover symptomatology and function of the wrist and the finger joints.

Long-term followup (mean 6.2 years) of 112 RIAPs showed excellent health, QOL, and satisfaction with the intervention result. Some small, mainly functional limitations were revealed by the DASH. On the basis of the findings of this study, a set consisting of the SF-36, the DASH (or the short PRWE instead of the DASH), and the Custom Form can be suggested for the assessment of basal thumb joint conditions. However, longitudinal data focusing on responsiveness, i.e., the sensitivity to change of the instruments after specific interventions, must also be considered for the final selection of instruments.

To satisfy the assessment concepts of the WHO's ICF, the final set should allow a valid, sensitive, patient-oriented, and clinically relevant assessment, within the normal clinical routine with the opportunity to compare the results across different conditions, diseases, and interventions, and with those of the general population (6, 38). A comprehensive and simultaneously specific assessment of all health dimensions is important because a study relying only on functional measures would overlook the high self-perceived quality of life and satisfaction of the patients, which may be decisive in determining the future utilization of health care resources.

Acknowledgements

We thank Roberta Schefer and Susann Drerup for the management of the patients, the questionnaires, and the database, and Joy Buchanan for help in preparing the manuscript.

APPENDIX A

Table  . CUSTOM FORM (ENGLISH TRANSLATION)*

| RIGHT HAND | LEFT HAND |
|---|---|
| Range of motion (active) MCP (°) | Range of motion (active) MCP (°) |
| Radial Abduct. (max = 100°), Palmar Abduct. (max = 90°) | Radial Abduct. (max = 100°), Palmar Abduct. (max = 90°) |
| Thumb | Thumb |
| Flexion (max = 100°), Extension (max = 30°) | Flexion (max = 100°), Extension (max = 30°) |
| 2nd Finger | 2nd Finger |
| 3rd Finger | 3rd Finger |
| 4th Finger | 4th Finger |
| 5th Finger | 5th Finger |
| Range of motion (active) PIP (°) | Range of motion (active) PIP (°) |
| Flexion (max = 100°) | Flexion (max = 100°) |
| Thumb | Thumb |
| 2nd Finger | 2nd Finger |
| 3rd Finger | 3rd Finger |
| 4th Finger | 4th Finger |
| 5th Finger | 5th Finger |
| Hand extension (from horizontal) (max = 90°) | Hand extension (from horizontal) (max = 90°) |
| Ulnar deviation wrist and PIP | Ulnar deviation wrist and PIP |
| Wrist: □ yes □ no | Wrist: □ yes □ no |
| 2nd Finger: □ yes □ no | 2nd Finger: □ yes □ no |
| 3rd Finger: □ yes □ no | 3rd Finger: □ yes □ no |
| 4th Finger: □ yes □ no | 4th Finger: □ yes □ no |
| 5th Finger: □ yes □ no | 5th Finger: □ yes □ no |
| Strength (kg) | Strength (kg) |
| Grip (max = 60 kg) | Grip (max = 60 kg) |
| Pinch (max = 12 kg) | Pinch (max = 12 kg) |
| Varia | Varia |
| Complete fist closing: □ yes □ no | Complete fist closing: □ yes □ no |
| Buttonhole deformity: □ yes □ no | Buttonhole deformity: □ yes □ no |
| Swan-neck deformity: □ yes □ no | Swan-neck deformity: □ yes □ no |
| Any other deformity: □ yes □ no | Any other deformity: □ yes □ no |

* MCP = metacarpophalangeal; Abduct. = abduction; max = maximum; PIP = proximal interphalangeal.
