- Top of page
- PATIENTS AND METHODS
- AUTHOR CONTRIBUTIONS
Osteoarthritis (OA) is a degenerative disease responsible for pain, disability, and handicap (1, 2). It most frequently affects the knee and the hip. Its economic impact is substantial (3) and will increase in the coming decades as the population ages (4). OA epidemiology data are limited in France and in other European countries, partly because case ascertainment is difficult in the absence of a consensus definition (5, 6). An accurate estimate of OA prevalence would improve the understanding of health care needs and would facilitate public health decision making and resource allocation.
To estimate the prevalence of symptomatic hip and knee OA in France, a 2-phase design using a questionnaire in the screening phase has been adopted, so that the full case ascertainment procedure (involving clinical examination and radiographs) is limited to subjects more likely to have the disease. The first phase (screening/detection phase) aims to classify the subjects as positively or negatively screened for the studied disease. In the second phase (confirmation phase), only the positively screened subjects undergo the case ascertainment procedure. Such a design has previously been applied successfully in rheumatology to assess the prevalence of inflammatory rheumatism in France, i.e., rheumatoid arthritis (7) and spondylarthropathies (8), and in other chronic conditions (e.g., neurology [9–13], pneumology, and psychiatry [15, 16]).
Because no adequate questionnaire was available for the detection of symptomatic hip and knee OA, a simple questionnaire was developed. In health surveys, telephone interviews have achieved response rates comparable to those of face-to-face interviews, with no difference in quality of response and at a lower cost (17, 18). The questionnaire was therefore designed to be administered by telephone. Its metrologic properties were first estimated in the general population in a recent study that reproduced the 2-phase design of the future prevalence survey and aimed to examine the feasibility of that design (19). However, further data were required before it could be used in the prevalence survey.
The aim of this study was to assess the performance of 3 strategies based on different combinations of questions in this new telephone questionnaire in the detection of cases of symptomatic hip and knee OA.
A total of 358 subjects were recruited consecutively from the rheumatology units of 6 French university hospitals. Among them, 16 did not complete the telephone interview (8 refused, 6 could not be contacted again, and 2 died). In all, 126 patients with hip OA, 143 with knee OA (38 had both hip and knee OA), and 111 subjects (controls) with other rheumatic disorders completed both the case ascertainment and the detection procedures. Patient characteristics are shown in Table 2. Among patients with OA according to the rheumatologist's opinion, 119 of 126 satisfied the ACR criteria for the hip and 137 of 143 satisfied the ACR criteria for the knee (Figure 2).
Table 2. Patient characteristics*
| | Hip OA (n = 126) | Knee OA (n = 141) | Controls (n = 111) |
|---|---|---|---|
| ACR criteria positive | 119 | 137 | – |
| Age, mean ± SD years | 60.5 ± 9.8 | 63.4 ± 9.0 | 55.3 ± 10.1 |
| Patients age ≤60 years | 58 | 61 | 82 |
| Patients age >60 years | 61 | 76 | 29 |
| Sex ratio (M/F) | 0.88 | 0.53 | 0.56 |
| Duration of symptoms, mean ± SD years | 5.6 ± 6.8 | 7.4 ± 7.9 | 6.1 ± 7.6 |
Controls had low back pain with radiculalgia (n = 39), rheumatoid arthritis (n = 31), spondylarthropathies (n = 11), knee algoneurodystrophy (n = 4), fibromyalgia (n = 2), OA of the ankle or foot (n = 7), and other disorders (n = 17; e.g., chondrocalcinosis, seronegative arthritis, erythema nodosum, and SAPHO syndrome [synovitis, acne, pustulosis, hyperostosis, and osteitis]). No diagnosis was reported for 5 subjects in this group.
On average, the questionnaire took only 5 minutes to administer. No particular difficulty was encountered during the interviews.
Table 3 shows the sensitivity, specificity, and positive and negative likelihood ratios of each symptom question and of the detection questionnaire according to the strategies described above for both subjects satisfying the ACR classification criteria and controls.
Table 3. Sensitivity, specificity, and positive and negative likelihood ratios of each question and each strategy*
| | No. D+ (OA+) | Sensitivity, % (95% CI) | No. D− (OA−) | Specificity, % (95% CI) | Positive likelihood ratio (95% CI) | Negative likelihood ratio (95% CI) |
|---|---|---|---|---|---|---|
| Hip OA, total | 119 | | 111 | | | |
| Q1 | 114 | 95.8 (90.5–98.6) | 50 | 45.1 (35.6–54.8) | 1.74 (1.47–2.07) | 0.09 (0.03–0.25) |
| Q2 | 81 | 68.1 (58.9–76.3) | 75 | 67.6 (58.0–76.2) | 2.10 (1.56–2.82) | 0.47 (0.35–0.63) |
| Q3 | 89 | 74.8 (66.0–82.3) | 82 | 73.9 (64.7–81.8) | 2.86 (2.06–3.98) | 0.34 (0.25–0.47) |
| Q4 | 105 | 88.2 (81.1–93.4) | 81 | 73.0 (63.7–81.0) | 3.26 (2.39–4.46) | 0.16 (0.10–0.27) |
| Strategy 1 | 117 | 98.3 (94.1–99.8) | 47 | 42.3 (33.0–52.1) | 1.71 (1.45–2.00) | 0.04 (0.01–0.16) |
| Strategy 2 | 112 | 94.1 (88.3–97.6) | 84 | 75.7 (66.6–83.3) | 3.87 (2.78–5.39) | 0.08 (0.03–0.18) |
| Strategy 3 | 111 | 93.3 (88.8–97.8) | 95 | 85.6 (79.1–92.1) | 6.47 (4.10–10.21) | 0.08 (0.04–0.15) |
| Knee OA, total | 137 | | 111 | | | |
| Q8 | 131 | 95.6 (90.7–98.4) | 51 | 46.0 (36.5–55.7) | 1.77 (1.48–2.11) | 0.10 (0.04–0.21) |
| Q9 | 111 | 81.0 (73.4–87.2) | 70 | 63.1 (53.4–72.0) | 2.19 (1.70–2.83) | 0.30 (0.21–0.44) |
| Q10 | 65 | 47.5 (38.9–56.2) | 94 | 84.7 (76.6–90.8) | 3.10 (1.93–4.96) | 0.62 (0.52–0.74) |
| Q11 | 117 | 85.4 (78.4–90.9) | 84 | 75.7 (66.6–83.3) | 3.51 (2.51–4.91) | 0.19 (0.13–0.29) |
| Strategy 1 | 132 | 96.4 (91.7–98.8) | 47 | 42.3 (33.0–52.1) | 1.67 (1.42–1.97) | 0.09 (0.04–0.21) |
| Strategy 2 | 125 | 91.2 (85.2–95.4) | 86 | 77.5 (68.6–84.9) | 4.05 (2.86–5.74) | 0.11 (0.07–0.20) |
| Strategy 3 | 124 | 90.1 (85.6–95.4) | 92 | 82.9 (75.9–89.9) | 5.29 (3.50–7.99) | 0.11 (0.07–0.19) |
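As a rough illustration of how the point estimates in Table 3 relate to the reported counts, the following Python sketch recomputes sensitivity, specificity, and both likelihood ratios for the three hip OA strategies. The function name `diagnostic_metrics` and the data layout are illustrative assumptions, not the authors' statistical code, and confidence intervals are not computed here.

```python
def diagnostic_metrics(true_pos, n_diseased, true_neg, n_non_diseased):
    """Return (sensitivity, specificity, LR+, LR-) from raw counts.

    true_pos: OA+ subjects detected positive (No. D+ among OA+)
    true_neg: OA- subjects detected negative (No. D- among OA-)
    """
    sensitivity = true_pos / n_diseased
    specificity = true_neg / n_non_diseased
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    return sensitivity, specificity, lr_pos, lr_neg


# Hip OA counts taken from Table 3: (No. D+, total OA+, No. D-, total OA-)
hip_strategies = {
    "Strategy 1": (117, 119, 47, 111),
    "Strategy 2": (112, 119, 84, 111),
    "Strategy 3": (111, 119, 95, 111),
}

for name, counts in hip_strategies.items():
    sens, spec, lr_pos, lr_neg = diagnostic_metrics(*counts)
    print(f"{name}: sensitivity {100 * sens:.1f}%, "
          f"specificity {100 * spec:.1f}%, "
          f"LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}")
```

The same function can be applied to the knee OA rows by substituting the corresponding counts and the total of 137 cases.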
The highest value of sensitivity was achieved with the pain item for both hip and knee OA. Among hip OA cases, 58 (48.7%) of 119 reported pain in the groin (26 had only pain in the groin), 69 (58%) reported pain in the hip, and 36 (30.3%) reported pain in the upper thigh. Among subjects without OA, 12 of 111 had pain in the groin (6 had pain only in the groin), 28 had pain in the hip, and 37 had pain in the upper thigh. Specificity of the pain location was 89.2%, 74.8%, and 66.7%, respectively. Limitation of motion for the hip and knee swelling had the highest values of specificity. Self-reported OA diagnosis achieved a sensitivity of 88.2% for hip OA and 85.4% for knee OA, and a specificity of 73.0% for hip OA and 75.7% for knee OA.
At least 1 symptom for the hip was reported by 236 of 342 subjects, and for the knee by 251 of 342 subjects (173 reported a symptom for both joints). Among the hip OA group, 117 subjects presented with at least 1 symptom: 90 subjects reported a medical diagnosis of hip OA (sensitivity 76.9%; 95% CI 69.3–84.6), 3 reported another diagnosis that could explain the symptoms according to the rheumatologist's opinion (e.g., radiculalgia, osteonecrosis), and 26 did not have any diagnosis. Among the knee OA group, 132 subjects presented with at least 1 symptom: 97 reported a medical diagnosis of knee OA (sensitivity 73.5%; 95% CI 66.0–81.0), 3 reported hip OA, 8 reported an alternative diagnosis that could explain the symptoms according to the rheumatologist's opinion (e.g., radiculalgia, algoneurodystrophy), and 28 did not report any diagnosis. Among the non-OA group, 64 subjects presented with at least 1 symptom of the hip and the knee. To explain hip pain, 19 reported a medical OA diagnosis (4 hip OA and 15 knee OA), 27 had a non-OA diagnosis, and 18 had no diagnosis (specificity 42.2%; 95% CI 30.1–54.3, when considering that only subjects who reported a non-OA diagnosis were true negatives, because knee OA could not explain hip symptoms according to the rheumatologist's opinion). To explain knee pain, 11 reported an OA diagnosis (2 hip OA, 9 knee OA), 30 reported a non-OA diagnosis, and 23 did not have any diagnosis (specificity 50.0%; 95% CI 37.8–62.3, when considering that subjects who reported a non-OA diagnosis or a hip OA diagnosis were true negatives).
With strategy 1, where all subjects presenting with at least 1 symptom were considered as screened positive, the sensitivity was high (>96%), but the specificity was low for both joints. With strategy 2, which takes into account the self-reported OA diagnosis, the specificity increased greatly, to 75.7% for hip OA and 77.5% for knee OA, whereas the sensitivity decreased slightly (remaining >91%). The highest values of specificity were achieved using strategy 3, which takes into account the physician diagnosis reported by patients (questions Q7 for the hip and Q14 for the knee): 85.6% for the hip and 82.9% for the knee. The sensitivity was 93.3% and 90.1% for hip and knee OA, respectively. Age (≤60 or >60 years old) did not significantly affect the performance of the detection strategies. However, strategies 2 and 3 tended to have lower specificity in the older group than in the younger group.
Assuming constant likelihood ratios across different settings and prevalences, posttest probabilities were estimated. Figure 3 shows nomograms of the relationship between pre- and posttest probabilities after a positive and a negative detection. For example, for a subject living in Greece who presents with pain in the hip, the pretest probability is 2% (the prevalence of symptomatic hip OA in Greece). Using the nomograms in Figure 3, a positive detection leads to a posttest probability of 3.4%, 7.3%, and 11.7% for strategies 1, 2, and 3, respectively. On the other hand, a negative detection produces a probability lower than 0.2%, whatever the strategy used.
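The nomogram reading in this example can be reproduced numerically by converting the pretest probability to odds, multiplying by the likelihood ratio, and converting back. The Python sketch below does this for the hip OA likelihood ratios reported in Table 3; the function name `posttest_probability` is an illustrative assumption, and the calculation relies on the same assumption of constant likelihood ratios stated in the text.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pretest probability via odds."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


pretest = 0.02  # 2% pretest probability used in the example in the text

# Hip OA likelihood ratios from Table 3: (LR+, LR-) for each strategy
hip_likelihood_ratios = {
    "Strategy 1": (1.71, 0.04),
    "Strategy 2": (3.87, 0.08),
    "Strategy 3": (6.47, 0.08),
}

for name, (lr_pos, lr_neg) in hip_likelihood_ratios.items():
    after_positive = posttest_probability(pretest, lr_pos)
    after_negative = posttest_probability(pretest, lr_neg)
    print(f"{name}: {100 * after_positive:.1f}% after a positive detection, "
          f"{100 * after_negative:.2f}% after a negative detection")
```

With a pretest probability this low, even the most specific strategy leaves most positively detected subjects without the disease, which is consistent with keeping a confirmation phase.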
Classically, epidemiologic studies have relied on typical radiographic changes to define OA. However, since large-scale radiograph screening is no longer justifiable on ethical and economic grounds, it was useful to develop questionnaires aimed at adequately identifying subjects with OA. A 2-phase prevalence survey was therefore adopted to estimate the prevalence of symptomatic hip and knee OA in France.
The short questionnaire described here is intended for use in the telephone detection phase of this survey, in which the second phase, i.e., the confirmation phase, involves clinical examination and radiographs. Its metrologic properties were first estimated in the general population during a recently conducted study that reproduced the 2-phase design of the future prevalence survey (19). However, the main objective of that earlier study was to evaluate the feasibility of such a design (sample selection method, acceptability of the detection procedure) and to adopt corrective measures for the future prevalence survey from the information obtained. Only one strategy, based on the presence or absence of symptoms, had been evaluated, and 35% of the screened-positive subjects failed to complete the ascertainment procedure. Further data were therefore required before the questionnaire could be used in the prevalence survey, particularly to evaluate other strategies (by taking into account answers that had been ignored during the pilot study) and to estimate the maximal specificity error by including subjects without hip or knee OA but with other rheumatic diseases causing lower extremity symptoms.
Reports of the development of detection questionnaires for OA have been published (24–29), often based on a simple question such as, “Do you have pain in a hip/knee?” (24–27), which was found to have good sensitivity but low specificity in our study and in previous studies (29). A postal questionnaire based on self-reported OA (without joint specificity) has been published as well (30). Algorithms based on a combination of symptoms have been evaluated, but had poor specificity (28, 29). None has been tested for use by telephone. To our knowledge, only one study evaluated different algorithms of symptoms for both joints together (29). However, that questionnaire was evaluated in a population of elderly people ages 60–90 years, and was not available when the survey was being prepared.
A 2-phase prevalence survey is only efficient if the detection procedure is inexpensive, easily administered, acceptable to subjects who participate in the survey, and accurate. The detection questionnaire therefore needs to be both highly sensitive and highly specific. Low sensitivity means that cases are missed. Low specificity (i.e., a high rate of false positives) increases the number of case ascertainment procedures required in phase 2, and therefore increases the cost of the survey. An estimate of the effect of specificity error on the radiograph prescription showed that if the questionnaire achieved 60%, 70%, 80%, 90%, or 95% specificity, the number of false-positive cases (and thus of needlessly prescribed radiographs) in a population of 5,000 with 5% prevalence would be 1,900, 1,425, 950, 475, and 240, respectively.
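The arithmetic behind these false-positive counts can be sketched as follows: the number of subjects without the disease is the population multiplied by (1 − prevalence), and the expected false positives are that number multiplied by (1 − specificity). The function name `expected_false_positives` is an illustrative assumption, not something defined in the study.

```python
def expected_false_positives(population, prevalence, specificity):
    """Expected number of disease-free subjects who screen positive."""
    non_diseased = population * (1 - prevalence)
    return non_diseased * (1 - specificity)


population = 5000   # population size used in the text
prevalence = 0.05   # 5% prevalence used in the text

for specificity in (0.60, 0.70, 0.80, 0.90, 0.95):
    false_positives = expected_false_positives(population, prevalence, specificity)
    print(f"specificity {specificity:.0%}: "
          f"about {false_positives:.1f} false positives")
```

For 95% specificity the calculation gives 237.5, which the text presumably reports rounded to 240.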
None of the evaluated strategies can identify subjects with knee and/or hip OA with complete accuracy. However, sensitivity was high, from 90% for the least sensitive strategy (strategy 3) to 96% for strategy 1. With strategy 1, which is based only on questions about symptoms, we could ideally expect 100% sensitivity. However, because symptoms vary over time, some subjects who were symptomatic during the recruitment visit were no longer so at the time of detection, and were thus wrongly considered as not having symptomatic OA. To improve the performance of the detection questionnaire, it is therefore important to limit the time between the detection and the ascertainment procedure. There is also considerable variability in how hip pain is reported. In our study, as expected, pain in the groin was found to be more specific than the other locations. Its sensitivity was poor, however, with less than 50% of cases detected. Using this single question would lead to a substantial failure to detect cases.
To estimate OA prevalence, classification criteria are usually used. We therefore only selected the subjects satisfying the ACR classification criteria among those considered as having OA according to the rheumatologists, in order to assess the sensitivity of the detection questionnaire. Had the rheumatologist judgment been taken as the reference, the sensitivity would have ranged from 90.0% (strategy 3) to 98.4% (strategy 1) for the hip, and from 87.4% (strategy 3) to 96.5% (strategy 1) for the knee.
Specificity values were lower than sensitivity values. When case identification was based only on symptoms (strategy 1), the specificity was poor. It increased greatly when the answer to the question, “Do you have hip/knee OA?” was taken into account (strategy 2), reaching 75.7% for the hip and 77.5% for the knee. The most specific strategy was strategy 3, which generated fewer than 20% false positives but required a medical opinion to determine whether the physician diagnosis reported by patients could explain the subject's symptom(s). Specificity would probably be higher in the general population, where a higher proportion of subjects will be free of rheumatic diseases than among controls deliberately selected from patients with rheumatic disease. Perhaps because of the limited number of subjects and thus a lack of statistical power, no effect of age on the performance of the strategies was found. In a previous study of knee OA detection (28), the sensitivity of detection instruments (based on symptoms) was higher among subjects age >60 years than among younger subjects, whereas specificity was lower.
The main limitation of our study is that subjects were recruited among patients from rheumatology units, whereas the questionnaire is to be applied in the general population. This may limit the generalizability of the results. Ideally, the 3 strategies should have been evaluated in a general population–based study with a prospective design (detection and then confirmation) to limit both spectrum bias and the effect of the initial rheumatologist visit. Because a first study had recently been conducted in the general population, a second one would have been too time-consuming and costly. We therefore decided to collect complementary data in a study conducted in a hospital setting. Patients who consult a rheumatologist could have more severe disease than those who do not (more advanced cases, with greater disability or handicap). Since sensitivity and specificity vary across the spectrum of disease, the results of our study are exposed to spectrum bias. If sensitivity is determined in subjects with serious disease, it will be overestimated relative to the real situation in the general population, as will specificity if tested in healthy subjects. In our sample, the mean duration of symptoms ranged from 5 to 7 years in the recruited patients. Data on disease severity (radiographic changes, disability) were not collected.
In this study, since specificity has no consequence on prevalence estimates in 2-phase surveys, but does have an impact on their logistics, controls have been deliberately recruited among patients with rheumatic disease to maximize the specificity error and estimate the maximal cost of the prevalence survey. Specificity is therefore probably underestimated relative to what we could expect in the general population.
Patients are probably more sensitized to their rheumatic condition than subjects who do not consult a rheumatologist. Moreover, because of the reverse sequence of the design (patients were recruited based on whether or not they had the disease), their true disease status was known at the beginning of the study. Even if patients were blinded to the joint of interest, and did not know that this study was focused on OA specifically, they were probably more aware of their condition than subjects in the general population. Such a situation may improve both the sensitivity and the specificity of the studied questionnaire. For the same reason, the ability of the detection questionnaire to detect undiagnosed new cases could not be evaluated. We may, however, suppose that only strategies 2 and 3 are exposed to these biases (even if self-reported musculoskeletal diseases are highly prevalent in the general population) because they both require a medical diagnosis, whereas the first strategy is based only on self-reported symptoms, which are independent of having received a diagnosis. Moreover, 22% of the symptomatic subjects (with or without OA) could not give a medical diagnosis to explain their symptoms, and only 50% of patients in the non-OA group could report the correct diagnosis, although they had attended the medical visit a few weeks before. The impact of the reverse sequence of the design was therefore smaller than expected.
In our sample, there was a sex difference between the hip OA and non-OA groups, with a more significant number of women in the hip OA group. Hip OA prevalence is known to be higher in women than in men. Symptoms are more frequently reported by women than by men (32, 33). Such a difference may affect interpretation of the results.
The questionnaire was tested in its French version using terms adapted to the French culture. Before the questionnaire is used in another European or non-European country, it should be translated and adapted to the culture of the country of interest, because some of the items may not be equivalent across cultures.
In conclusion, the simple detection questionnaire tested in this study, derived from typical symptoms of hip and knee OA, is a useful instrument to detect symptomatic hip and knee OA. If the main objective is to minimize the failure to detect cases, the first strategy seems to be the best. However, because of its low specificity, its use in the detection phase of a 2-phase prevalence survey is less cost-efficient. If the main objective is to achieve the lowest possible cost for the 2-phase prevalence survey, the second strategy, based on the self-reported OA diagnosis, seems to be the most efficient, with both high sensitivity and high specificity. The performance of strategy 3, which requires a medical opinion and is therefore more time-consuming, is very close to that of strategy 2, and it therefore seems to be of limited interest. Regardless of the strategy chosen for detection, this questionnaire fails to reach complete accuracy. Clinical examination and radiographs remain necessary to complete the ascertainment procedure.
Dr. Guillemin had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study design. Morvan, Roux, Fautrel, Euller-Ziegler, Coste, Saraux, Guillemin.
Acquisition of data. Morvan, Roux, Rat, Euller-Ziegler, Loeuille, Banal, Mazieres.
Analysis and interpretation of data. Morvan, Roux, Rat, Euller-Ziegler, Coste, Saraux, Guillemin.
Manuscript preparation. Morvan, Rat, Mazieres, Coste, Saraux, Guillemin.
Statistical analysis. Morvan, Guillemin.