The American College of Rheumatology (ACR) criteria for fibromyalgia are the de facto criteria used for research. However, ACR criteria are not generally utilized by nonrheumatologists, and rheumatologists may diagnose fibromyalgia in patients who do not satisfy the ACR criteria. We undertook this study to determine concordance between ACR criteria and clinician diagnosis and between proposed survey criteria and clinician diagnosis.
Consecutive patients in a clinical practice setting were evaluated by tender point examination, survey criteria for fibromyalgia (Regional Pain Scale score ≥8 and fatigue score ≥6), and clinical diagnosis.
Among the 206 patients, the clinician diagnosed fibromyalgia in 49.0%, while 29.1% satisfied ACR criteria and 40.3% satisfied survey criteria. Clinical and survey criteria were concordant in 74.8% of cases (κ = 0.49 [95% confidence interval 0.36, 0.60]). Clinical criteria and ACR criteria were concordant in 75.2% of cases (κ = 0.50 [95% confidence interval 0.35, 0.59]), and survey criteria and ACR criteria were concordant in 72.3% (κ = 0.40 [95% confidence interval 0.25, 0.51]). The ACR tender point criterion (≥11) was not a factor in clinical and survey criteria. However, the tender point count was useful in clinical diagnosis.
Clinical diagnosis and ACR and survey criteria are moderately concordant (72–75%) and address a common pool of symptoms and physical findings. Because there is no gold standard for fibromyalgia diagnosis and because fibromyalgia is often viewed as a trait diagnosis, all methods of diagnosis have utility. The survey method has the advantage that it does not require physical examination.
In clinical trials and observational research studies, fibromyalgia is usually diagnosed by application of the American College of Rheumatology (ACR) criteria (1). These criteria require the concurrent presence of widespread pain and tenderness on palpation in at least 11 of 18 tender point sites. Although these criteria are accepted among investigators who accept the concept of fibromyalgia, they have a number of problems. With untrained assessors, the criteria are likely not to be applied uniformly (2); in practice the diagnosis is often made without the formal tender point examination (2); patients may have the requisite tender points and yet not have fibromyalgia; and, finally, tender points and widespread pain alone do not capture the essence of fibromyalgia—a disorder of multiple symptoms which prominently include fatigue, sleep disturbance, and cognitive dysfunction (3–7).
In addition, experienced clinicians do not rely on just ACR criteria for diagnosis, and patients who do not satisfy ACR criteria may still be properly diagnosed as having fibromyalgia (8). Still another problem with the current diagnostic criteria is that they require physical examination, thereby precluding diagnosis by survey or questionnaire.
Based on observations of characteristics of patients with and without fibromyalgia, we have suggested that fibromyalgia may be diagnosed in persons who have high levels of fatigue and many painful areas (9, 10). Specifically, we proposed that a Regional Pain Scale (RPS) score of ≥8 together with fatigue score of ≥6 on a visual analog scale (VAS) constitute sufficient research criteria for the diagnosis of fibromyalgia (9, 10). These criteria have the advantage that they can be used in survey research, since they do not require physical examination. In addition, identifying patients by these criteria satisfies the objections of Crofford and Clauw regarding an overemphasis on pain and tender points in the ACR criteria (5). However, the criteria have not been subjected to external validation.
In the study described herein, we examined the interrelationship of clinical diagnosis and ACR and survey criteria in the diagnosis and understanding of fibromyalgia. More specifically, we measured the concordance of results between the various methods of diagnosis and investigated the significance of this in the context of a broader understanding of fibromyalgia characteristics and diagnosis.
PATIENTS AND METHODS
In the practice of one of the authors (RSK), consecutive patients completed a questionnaire that included the RPS and a 0–10 VAS for fatigue, underwent a tender point examination by an experienced and specially trained nurse using the method described in the ACR criteria (1), and received a clinical diagnosis by the author, a clinician with experience in fibromyalgia (11–13). The goal was to identify ∼100 patients with a diagnosis of fibromyalgia and 100 patients without such a diagnosis. The RPS is a self-administered count of the number of painful nonarticular regions. Possible scores range from 0 to 19. Survey fibromyalgia was diagnosed when the RPS score was ≥8 and the VAS score for fatigue was ≥6 (9, 10).
The diagnosis of fibromyalgia was made regardless of any other diagnosis. Therefore, no distinction was made between primary and secondary or concomitant fibromyalgia. In not making this distinction, we followed the recommendation in the report of the ACR fibromyalgia criteria committee (1), i.e., “Primary and secondary-concomitant fibromyalgia were essentially indistinguishable with the study variables, and the criteria proposed worked equally well in both groups. The committee suggests abolishing the distinction between primary and secondary-concomitant fibromyalgia at the level of diagnosis.” However, to provide further information, the clinical diagnoses among patients not diagnosed by the clinician as having fibromyalgia were recorded, and were as follows: rheumatoid arthritis (RA) (n = 29), psoriatic arthritis (n = 4), systemic lupus erythematosus (SLE) (n = 26), scleroderma (localized) (n = 2), scleroderma (CREST syndrome [calcinosis, Raynaud's phenomenon, esophageal dysmotility, sclerodactyly, telangiectasias]) (n = 1), Cogan's syndrome (n = 1), polymyalgia rheumatica (n = 2), osteoarthritis (OA) (n = 6), low back syndrome (n = 1), Sjögren's syndrome (n = 2), polyarthritis (n = 9), mixed connective tissue disease (n = 3), discoid lupus erythematosus (n = 1), osteopenia (n = 1), myofascial pain (n = 1), degenerative disease of the lumbar spine (n = 8), spondylarthritis (n = 1), spinal stenosis (n = 3), vasculitis (n = 1), other (n = 3). Among patients diagnosed by the clinician as having fibromyalgia, 16.8% had other rheumatic disease diagnoses, including degenerative disease of the lumbar spine (n = 7), OA (n = 5), SLE (n = 2), RA (n = 2), and polyarthritis (n = 1). In addition to the observations of the ACR committee concerning the equivalence of primary and secondary-concomitant fibromyalgia (1), we have recently confirmed these observations in a clinical data set (10).
Patients were classified as having clinical fibromyalgia if the clinician diagnosed fibromyalgia and were classified as having survey fibromyalgia if they satisfied the survey fibromyalgia criteria. In addition, we noted whether patients satisfied the ACR fibromyalgia criteria, based on the nurse's tender point examination. The clinician was unaware of the survey diagnosis and results of the nurse's examination. At some time during the clinical course, every patient had had a full rheumatology examination performed by RSK, which included a fibromyalgia tender point examination. The decision regarding whether the patient could be clinically diagnosed as having fibromyalgia for this study was made considering the long-term patient-clinician experience and included factors related to pain, tenderness, fatigue, sleep disturbance, comorbidity, and psychosocial variables. At the time of the study visit, a tender point examination was performed if the clinician believed it was necessary to make the study diagnosis. Patients were classified as having fibromyalgia according to the ACR criteria if they had ≥11 tender points on the nurse's examination and satisfied the ACR criteria for widespread pain (1) as determined from the RPS pain site questionnaire (9).
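The two rule-based classifications described above can be written as a minimal sketch. The `Visit` record and its field names are hypothetical and are introduced here only for illustration. The thresholds are those stated in the text. Clinical diagnosis is not modeled, because it rested on the clinician's judgment rather than on a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Measurements from one study visit (hypothetical field names)."""
    rps: int               # Regional Pain Scale score, 0-19
    fatigue_vas: float     # fatigue visual analog scale, 0-10
    tender_points: int     # nurse's tender point count, 0-18
    widespread_pain: bool  # ACR widespread pain met, per the pain site questionnaire


def survey_fibromyalgia(v: Visit) -> bool:
    """Survey criteria as stated in the text: RPS >= 8 and fatigue VAS >= 6."""
    return v.rps >= 8 and v.fatigue_vas >= 6


def acr_fibromyalgia(v: Visit) -> bool:
    """ACR criteria as applied here: >= 11 of 18 tender points plus widespread pain."""
    return v.tender_points >= 11 and v.widespread_pain


# Illustrative patient (not study data): many painful regions and high fatigue,
# but fewer than 11 tender points.
example = Visit(rps=12, fatigue_vas=7.5, tender_points=8, widespread_pain=True)
print(survey_fibromyalgia(example), acr_fibromyalgia(example))  # True False
```

A patient of this kind is positive by survey criteria and negative by ACR criteria, which is one way in which the two methods can disagree.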
All patients completed a survey which included assessments of demographic features, comorbid illness, disability status, a review of symptoms (which was used to determine a symptom count), the Health Assessment Questionnaire II (HAQ-II) (14), the Short Form 36 mood score (15), and VAS scales for pain, fatigue, sleep problems, and global severity, as previously described (10). The following symptoms were included in the symptom count: depression, diarrhea, dizziness, dry eyes, dry mouth, shortness of breath, hearing difficulties, headache, hives/welts, trouble thinking, oral ulcers, muscle pain, muscle weakness, nausea, nervousness, numbness/tingling, pain in upper abdomen, sun sensitivity, chest pain, easy bruising, heartburn, rash, Raynaud's phenomenon, seizures, loss/change in taste, ringing in ears, blurred vision, vomiting, pain/cramps in abdomen, hair loss, loss of appetite, wheezing, constipation, fever, itching, insomnia, and fatigue/tiredness. In addition, the ACR tender point examination results were recorded.
The statistical methods used for analysis of agreement/concordance included the kappa statistic (with bootstrap confidence intervals [CIs]) (16), Somers' D, and Kendall's tau-a (tau-a). Tau-a and Somers' D are related mathematically, and both are related to the area under the receiver operating characteristic curve (17). Tau-a enables understanding of the degree to which the binary fibromyalgia diagnostic methods are associated with ordinary study variables. Kendall's tau-a has a simple interpretation: across all pairs of patients, it is the difference between the probability that a demographic, clinical, or diagnostic variable and a method of diagnosis are concordant and the probability that they are discordant. For example, a value of 0.35 for pain and survey diagnosis indicates that a pair of patients is 35% more likely to be concordant (the patient with the higher pain score is the one diagnosed by survey criteria as having fibromyalgia) than to be discordant. Tau-a can also be used to compare survey and clinical criteria to determine the comparative difference in concordance or strength of association. Kernel density estimates were based on the Epanechnikov kernel. Data analysis was conducted using Stata, version 8.2 (Stata, College Station, TX). P values less than 0.05 were considered significant.
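The agreement statistics named above can be illustrated with a short sketch in plain Python. This is not the Stata code used in the study. The function names, the percentile bootstrap, the number of replicates, and the toy data are all assumptions made for illustration. Somers' D is computed here as tau-a(diagnosis, score) divided by tau-a(diagnosis, diagnosis), which for a binary diagnosis equals 2 × AUC − 1.

```python
import random
from itertools import combinations


def sign(v):
    return (v > 0) - (v < 0)


def concordance_counts(x, y):
    """Count concordant and discordant pairs over all unordered pairs."""
    conc = disc = total = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = sign(x1 - x2) * sign(y1 - y2)
        conc += s > 0
        disc += s < 0
        total += 1
    return conc, disc, total


def tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / all pairs; ties count as neither."""
    conc, disc, total = concordance_counts(x, y)
    return (conc - disc) / total


def somers_d(diagnosis, score):
    """Somers' D of score with respect to a binary diagnosis."""
    conc, disc, _ = concordance_counts(diagnosis, score)
    untied = sum(d1 != d2 for d1, d2 in combinations(diagnosis, 2))
    return (conc - disc) / untied


def cohen_kappa(a, b):
    """Cohen's kappa for two binary (0/1) classifications of the same patients."""
    n = len(a)
    p_obs = sum(u == v for u, v in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    p_exp = pa * pb + (1 - pa) * (1 - pb)
    return (p_obs - p_exp) / (1 - p_exp)


def bootstrap_kappa_ci(a, b, reps=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample patients with replacement, recompute kappa."""
    rng = random.Random(seed)
    pairs = list(zip(a, b))
    stats = sorted(
        cohen_kappa(*zip(*rng.choices(pairs, k=len(pairs)))) for _ in range(reps)
    )
    return stats[int(reps * alpha / 2)], stats[int(reps * (1 - alpha / 2)) - 1]


# Toy data (hypothetical, not study data).
diagnosis = [0, 1, 0, 1]
pain = [1, 2, 3, 4]
print(tau_a(diagnosis, pain))           # 3 concordant, 1 discordant, 6 pairs
d = somers_d(diagnosis, pain)
print(d, (d + 1) / 2)                   # 0.5 0.75 (Somers' D and the implied AUC)

method_a = [1] * 40 + [0] * 40
method_b = [1] * 30 + [0] * 10 + [1] * 5 + [0] * 35
print(cohen_kappa(method_a, method_b))  # 0.625
print(bootstrap_kappa_ci(method_a, method_b))
```

Pairs that are tied on the diagnosis count toward the denominator of tau-a but not toward that of Somers' D, which is why tau-a for a binary diagnosis is always smaller in magnitude than the corresponding Somers' D.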
RESULTS
Except for ethnicity, marital status, and education, patients with survey-diagnosed fibromyalgia differed in all diagnostic and clinical characteristics from patients without survey-diagnosed fibromyalgia (Table 1). In addition, patients meeting the survey criteria for fibromyalgia had high levels of work disability and clinical symptoms consistent with known characteristics of fibromyalgia.
Table 1. Demographic and clinical characteristics of patients meeting and patients not meeting the survey criteria for fibromyalgia*

| Characteristic | Meeting survey fibromyalgia criteria | Not meeting survey fibromyalgia criteria |
|---|---|---|
| Non-Hispanic white, % | | |
| (row label missing) | 14.46 ± 2.18 | 14.85 ± 2.17 |
| Sex, % male | | |
| Retired early or disabled, % | | |
| Comorbidity score, 0–11 | 2.60 ± 1.48 | 1.76 ± 1.57 |
| (row label missing) | 7.86 ± 1.15 | 4.05 ± 2.65 |
| (row label missing) | 1.32 ± 0.56 | 0.64 ± 0.50 |
| (row label missing) | 7.11 ± 2.10 | 3.34 ± 2.60 |
| Global severity, 0–10 | 5.98 ± 2.61 | 2.55 ± 1.99 |
| SF-36 mental health, 0–100 | 58.59 ± 19.89 | 71.97 ± 17.95 |
| Sleep disturbance, 0–10 | 6.05 ± 2.90 | 3.40 ± 2.98 |
| Symptom count, 0–32 | 15.59 ± 6.50 | 7.90 ± 5.68 |
| Tender point count, 0–18 | 10.61 ± 5.77 | 4.43 ± 5.56 |
| ≥11 tender points, % | | |
| Regional Pain Score, 0–19 | 13.93 ± 3.45 | 5.04 ± 4.19 |

* Except where indicated otherwise, values are the mean ± SD. HAQ-II = Health Assessment Questionnaire II; SF-36 = Short Form 36.
Of the 206 study cases, the clinician's diagnosis was fibromyalgia in 101 (49.0%). Sixty of the 206 patients (29.1%) satisfied ACR criteria, and 83 (40.3%) satisfied survey criteria. Clinical and survey criteria were concordant in 74.8% of cases (κ = 0.49 [95% CI 0.36, 0.60]). Clinical criteria and ACR criteria were concordant in 75.2% of cases (κ = 0.50 [95% CI 0.35, 0.59]), and survey criteria and ACR criteria were concordant in 72.3% (κ = 0.40 [95% CI 0.25, 0.51]). The interrelationships of the 3 diagnostic groups are shown graphically in Figure 1. Of patients who would be diagnosed as having fibromyalgia by at least 1 method (n = 120), only 33% would be diagnosed by all 3 methods. Isolated positive cases were noted (of the 120 cases, 17%, 12%, and 2%, respectively, would be diagnosed by clinical diagnosis only, by survey only, and by ACR criteria only).
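As a consistency check, the 2 × 2 tables behind the reported concordance figures can be recovered from the published marginal counts and percent agreement, and the kappa values recomputed. The sketch below assumes that each reported percentage is a whole number of concordant patients divided by 206 and then rounded. The function names are illustrative. The cell counts it produces are derived here; they are not reported in the text.

```python
N = 206
POSITIVE = {"clinical": 101, "acr": 60, "survey": 83}  # counts reported in the text
AGREEMENT = {                                          # percent concordant, as reported
    ("clinical", "survey"): 74.8,
    ("clinical", "acr"): 75.2,
    ("survey", "acr"): 72.3,
}


def two_by_two(m1, m2, pct):
    """Recover (both+, m1 only, m2 only, both-) from marginals and percent agreement."""
    agree = round(pct / 100 * N)  # assumed: percentage = rounded(count / N)
    both_pos = (POSITIVE[m1] + POSITIVE[m2] - (N - agree)) // 2
    m1_only = POSITIVE[m1] - both_pos
    m2_only = POSITIVE[m2] - both_pos
    both_neg = N - both_pos - m1_only - m2_only
    return both_pos, m1_only, m2_only, both_neg


def kappa(both_pos, m1_only, m2_only, both_neg):
    """Cohen's kappa from the four cells of a 2 x 2 agreement table."""
    n = both_pos + m1_only + m2_only + both_neg
    p_obs = (both_pos + both_neg) / n
    p_exp = ((both_pos + m1_only) * (both_pos + m2_only)
             + (m2_only + both_neg) * (m1_only + both_neg)) / n**2
    return (p_obs - p_exp) / (1 - p_exp)


overlap = {}
for (m1, m2), pct in AGREEMENT.items():
    cells = two_by_two(m1, m2, pct)
    overlap[(m1, m2)] = cells[0]
    print(m1, m2, cells, round(kappa(*cells), 2))
# clinical survey (66, 35, 17, 88) 0.49
# clinical acr (55, 46, 5, 100) 0.5
# survey acr (43, 40, 17, 106) 0.4

# Three-way overlap by inclusion-exclusion, using the reported n = 120
# patients who were positive by at least 1 method.
UNION = 120
all_three = UNION - sum(POSITIVE.values()) + sum(overlap.values())
clinical_only = (POSITIVE["clinical"] - overlap[("clinical", "survey")]
                 - overlap[("clinical", "acr")] + all_three)
print(all_three, clinical_only)  # 40 20
```

The recomputed kappa values (0.49, 0.50, and 0.40) and the derived overlap (40 of 120 positive by all 3 methods, or 33%; 20 of 120 positive by clinical diagnosis only, or 17%) agree with the figures reported above.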
Table 2 depicts the relationship between methods of diagnosis and diagnostic and clinical variables. Within the “survey” and “clinical” columns, greater values mean stronger association with diagnosis and with fibromyalgia symptoms. Values in the “difference” column indicate how much stronger (positive values) or weaker (negative values) the association of a variable is with survey diagnosis than with clinical diagnosis. Among the diagnostic items, the tender point count (0.40) and the RPS (0.35) were most strongly associated with diagnosis among patients diagnosed clinically. Of interest, the tender point criterion (≥11 tender points) had a weaker association (0.25). The reason for this is that the tender point count discriminated maximally at a count of ≥6, whereas the ACR criterion (≥11) had reduced sensitivity (Figure 2).
Table 2. Concordance of survey and clinical fibromyalgia criteria for demographic and clinical variables, by Kendall's tau-a*
Non-Hispanic white, %
Sex, % male
Retired early or disabled, %
Comorbidity score, 0–11
Global severity, 0–10
SF-36 mental health, 0–100
Sleep disturbance, 0–10
Symptom count, 0–32
Tender point count, 0–18
≥11 tender points, %
Regional Pain Score, 0–19
* Tau-a coefficients reflect the percent agreement, e.g., the survey criteria and fatigue were 37% more likely to be concordant than to be discordant, and clinical diagnosis and fatigue were 19% more likely to be concordant than to be discordant. 95% CI = 95% confidence interval (see Table 1 for other definitions).
The relationship between tender point count and clinical and survey diagnosis is depicted in more detail in Figure 3. For clinical diagnosis, where tender points would be expected to influence diagnosis, as well as for survey diagnosis, where tender points would not be expected to influence diagnosis because their values are not included in the diagnosis process, a level of ≥6 tender points graphically separated diagnosis and non-diagnosis. These data provide evidence of the role of tender points in fibromyalgia, but suggest that the ≥11 level is not consistent with clinical or survey diagnosis.
In contrast to the tender point count, the RPS had the highest tau-a level for clinical diagnosis as well as being a criterion for survey diagnosis. Figure 4 indicates that the survey diagnosis level for RPS (≥8) was at the appropriate sensitivity, specificity, and percent correct level. The survey criterion level for fatigue by VAS (≥6) was also at the optimum predictive level (data not shown).
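The cut point comparisons summarized in Figures 2–4 rest on sensitivity, specificity, and percent correct at each candidate threshold, with clinical diagnosis as the reference. A minimal sketch of that calculation follows. The function name and the toy data are hypothetical and are not the study data. The toy values are chosen only to show how a threshold of ≥11 can lose sensitivity relative to a threshold of ≥6.

```python
def cutpoint_table(scores, reference, cutpoints):
    """For each cut point c, treat score >= c as test-positive and compare
    with the reference diagnosis (True = fibromyalgia)."""
    rows = []
    for c in cutpoints:
        tp = sum(s >= c and r for s, r in zip(scores, reference))
        fn = sum(s < c and r for s, r in zip(scores, reference))
        tn = sum(s < c and not r for s, r in zip(scores, reference))
        fp = sum(s >= c and not r for s, r in zip(scores, reference))
        rows.append({
            "cut": c,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "percent_correct": 100 * (tp + tn) / len(scores),
        })
    return rows


# Toy tender point counts and clinical diagnoses (hypothetical, not study data).
tender = [0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18]
clinical = [False] * 5 + [True, False, True, True, True, True, True, True, True]

for row in cutpoint_table(tender, clinical, [6, 11]):
    print(row["cut"], round(row["sensitivity"], 2),
          round(row["specificity"], 2), round(row["percent_correct"], 1))
# 6 1.0 0.83 92.9
# 11 0.5 1.0 71.4
```

In these toy data, raising the threshold from 6 to 11 tender points improves specificity but halves sensitivity, so fewer patients overall are classified in agreement with the clinician.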
In addition to diagnostic elements, Table 2 shows that patients diagnosed by the survey criteria were more likely to have elevated levels of functional disability, pain, and global severity than patients diagnosed by the clinical criteria. They were also more likely to have higher levels of fatigue and regional pain; however, this would be expected because these variables are diagnostic elements in the survey criteria. In contrast, the tender point count was more likely to be higher in persons diagnosed clinically. As a whole, the survey criteria were more strongly associated with measures of clinical severity than were the clinical criteria.
DISCUSSION
When a person is diagnosed as having rheumatoid arthritis, the diagnosis is ordinarily permanent, i.e., if RA is no longer evident we say that the illness is in remission, not that the diagnosis is not RA. Remission is problematic in fibromyalgia because there are no criteria for fibromyalgia remission. If one considers that fibromyalgia might be diagnosed as being at the end of a severity spectrum of pain, fatigue, sleep disturbance, and tenderness, it is not known how far toward normality the condition would have to move before one could say the patient is in remission. In addition, if the ACR fibromyalgia criteria are used to diagnose fibromyalgia, then, depending on change in clinical status, fibromyalgia may be diagnosed on one occasion and not diagnosed on another. This diagnostic mechanism works for clinical trials because it identifies persons with severe fibromyalgia-related symptoms, a requirement for clinical trials, but it is neither appropriate nor the de facto practice in the clinic.
Instead, a diagnosis of fibromyalgia is most often permanent in the sense that it tends to represent a trait rather than a state. As such, a person may have “a little” or even no fibromyalgia for a while and much fibromyalgia during other periods. For example, fibromyalgia characteristics may become more prominent during periods of physical and/or mental stress and may relent or decrease during periods of better health or tranquility. Furthermore, it has been the authors' experience in our clinics that patients considered to have fibromyalgia may not meet formal ACR criteria on some or all occasions. This conundrum between trait and state fibromyalgia is relevant to the level of concordance noted among the 3 different diagnostic methods where, as might be expected, ACR criteria that rely on a specific number of tender points (state) were less often positive than clinician's criteria (trait) or survey criteria (symptoms).
We found moderate agreement between the clinician's diagnosis and ACR criteria (κ = 0.50, 95% CI 0.35, 0.59). This level of concordance would be expected given the trait concept of fibromyalgia diagnosis, and as clinicians we were not surprised by these results. Most studies of fibromyalgia criteria have addressed the reliability of the tender point examination (18–20); only a few have addressed the permanence of fibromyalgia symptoms (21–25). In addition, Figures 2 and 3 provide insight into discordance, showing that patients whose number of tender points was ∼5–11 were frequently classified as having fibromyalgia and that 11 tender points was not the optimum cut point in clinical practice.
There was also a moderate level of concordance between survey and clinical diagnosis (κ = 0.49, 95% CI 0.36, 0.60). Thus, the clinician was approximately as likely to be in agreement with the survey criteria as with the tender point criteria. Figure 3, while showing that patients with tender point counts of ∼5–11 were often classified as having fibromyalgia, also shows that some patients with high numbers of tender points were not identified as having fibromyalgia by the survey criteria.
We also found that the RPS cut point and the fatigue cut point proposed in the survey criteria could be validated as the optimum cut points, using clinical diagnosis as the gold standard. We used clinical diagnosis as the gold standard because the ACR criteria often fail to identify patients who have fibromyalgia. Use of the ACR criteria did, however, yield approximately the same results.
The finding of moderate concordance between 2 methods raises an issue about the ACR criteria and what their use means to fibromyalgia diagnosis. It should be noted that the 1990 ACR criteria study (1) did not find just “one” fibromyalgia. Different sets of potential criteria, including symptom criteria, performed almost as well as the official criteria. Although the use of different criteria did not appreciably change the percent of patients correctly classified, the actual patients in the positive and negative predicted categories were often different. In fact, many similar but slightly different definitions of fibromyalgia existed in the ACR study, which resulted in patients at the borderline of having/not having fibromyalgia being classified differently by different criteria. Since there is really no gold standard for diagnosis, there is also no absolutely correct method of diagnosis, although for research purposes the ACR criteria de facto have filled that role.
Table 2 shows the extent to which fibromyalgia was correlated with demographic and clinical status variables. Regardless of diagnostic method, fibromyalgia was strongly correlated with the RPS score. This indicates that both the survey method and the clinical method rely strongly on the extent of pain (painful regions). Looking further into the relationship between the clinician's diagnosis and criteria items, we note that Kendall's tau-a was 0.35 (95% CI 0.30, 0.40) for the RPS score and only 0.25 (95% CI 0.20, 0.31) for the tender point criterion (half difference −0.04, 95% CI −0.07, −0.01). This further indicates the preferential reliance on pain extent by the clinician.
Except for fatigue, which is part of the survey diagnosis, the highest level of correlation of fibromyalgia was with clinical variables, such as HAQ score, pain, and global severity, regardless of method of diagnosis (Table 2). In addition, overall, persons with fibromyalgia by the survey criteria had higher levels of disability, comorbidity, somatic symptoms, and psychological distress than did those without fibromyalgia by the survey criteria, as would be expected.
Table 2 also provides insight into the extent to which different definitions (survey and clinical fibromyalgia) yield concordant results. With regard to clinical status variables, survey criteria were somewhat more concordant with these variables than were clinical criteria. For example, pain, HAQ, and patient global status were 4% more likely to be concordant with the survey diagnosis than with clinical diagnosis. However, there were no statistically significant differences between the 2 diagnostic methods with regard to disability, comorbidity, somatic symptoms, and psychological distress. It is also of interest that the tender point count was more strongly associated with both methods of diagnosis than was the tender point criterion, as might be expected also from the data in Figure 2. The data shown in Table 2, as well as the results seen in Table 1, indicate that survey diagnostic criteria identify fibromyalgia clinical associations at least as well as does the clinician's diagnosis.
The study results raise the question of whether published reports of fibromyalgia studies based on ACR criteria actually represent the population of fibromyalgia patients seen by rheumatologists in clinical practice. The issue is complicated, since it is likely that almost all patients with fibromyalgia met the ACR criteria at one time. We suggest that outcome studies should include patients who meet ACR fibromyalgia criteria at some time during their course, but that treatment studies should require that enrolled patients currently meet these criteria.
This study was performed in a rheumatology clinic, and it is possible that the results might be different in a different setting. In addition, the clinical examination results depend on assessment by an experienced nurse-metrologist and clinician who is an expert in fibromyalgia. Once again, it is possible that different results might be obtained with different examiners. We used this specific clinic because all patients attending the clinic had undergone a fibromyalgia examination in the past that included a tender point count. Therefore, we could be sure that the diagnosis of fibromyalgia had been considered in all patients prior to study initiation as well as during the study.
The survey criteria should be studied further in settings in which tender points are counted. It should be noted that one of the criticisms of the ACR fibromyalgia criteria is that they are circular and that there is no objective gold standard; additionally, at the time those criteria were developed, tender points were considered the most important feature of fibromyalgia. Today, many observers believe that fatigue, cognitive disturbances, and other symptoms may be more important. Thus, it may be that there can be many allied, but slightly different, definitions of fibromyalgia, and that the ACR criteria provide only one definition. Regardless of diagnostic method, patients should be examined for other conditions that cause pain and fatigue.
The survey definition of fibromyalgia assesses fatigue rather than sleep disturbance, even though sleep disturbance is known to be an important feature of fibromyalgia. In the analysis of survey fibromyalgia criteria, we noted (but did not mention in previous reports [9, 10]) that fatigue was slightly but statistically significantly more effective than sleep disturbance in distinguishing fibromyalgia patients from non–fibromyalgia patients. In the interest of keeping criteria simple enough that they might be used, we wished to include a small number of variables, and we thus chose fatigue over sleep rather than using both variables. In doing this we were aware that fatigue is currently thought to be important in illnesses such as RA and SLE and is often measured, and we wished to take advantage of this since it might give the proposed criteria more general use.
Finally, we note that in choosing not to analyze primary fibromyalgia separately we not only followed the ACR criteria committee's recommendation, but also addressed the problem that there is no naturally appropriate “primary” control group. If fibromyalgia is primary, are not conditions such as back pain and OA “secondary” conditions? And how should we classify patients with localized scleroderma, osteopenia, or Sjögren's syndrome? Although the prevalence of fibromyalgia may be higher in persons with other medical conditions, available evidence does not suggest that the presence of another medical condition makes fibromyalgia diagnosis less valid.
In summary, clinical, survey, and ACR criteria for fibromyalgia diagnosis have moderate levels of concordance (72–75%), and tap into the same pool of fibromyalgia symptoms. Since there is no gold standard for fibromyalgia and because fibromyalgia is often considered to be a trait diagnosis, all methods of diagnosis appear appropriate. However, the number of tender points required by the ACR criteria is not included in clinical diagnosis or survey criteria. The survey method has the advantage that it does not require physical examination.