Immunologic laboratory tests serve a critical role in the care of patients with rheumatic diseases. Historically, such tests have helped define the immunopathophysiologic basis of various rheumatic conditions, and have allowed more precise classification of distinct diseases with similar clinical features. The results of immunologic laboratory tests have become a key part of the diagnosis for several diseases. These tests may also provide relevant information concerning disease activity, end-organ involvement, and prognosis. Repeated measurement of certain tests has become a means to monitor disease activity and response to treatment.
Although immunologic laboratory tests can have great utility in the diagnosis and management of patients with rheumatic diseases, they can be misused. Improper application of these tests can result in misdiagnosis, inappropriate therapy, and unnecessary health care expenses. In an effort to define the optimal use of immunologic laboratory tests, the American College of Rheumatology (ACR) assembled an ad hoc committee to develop evidence-based guidelines for these diagnostic tests. None of the members of the committee have any commercial interest in any laboratories that would present a potential conflict of interest to the creation of these guidelines. The primary goal of these guidelines is to improve patient care. It is hoped that these guidelines will be practical and useful to clinicians caring for patients with rheumatic diseases. These guidelines are not intended to be a substitute for clinical judgment. In atypical cases, clinical judgment may require deviation from the guidelines. Also, these guidelines should be considered dynamic. Further investigation and research may allow refinement in our understanding of the optimal use of immunologic laboratory tests.
Scope of work
The first group of immunologic laboratory tests analyzed was the antinuclear antibody (ANA) and related autoantibody tests including: anti-double stranded DNA (anti-dsDNA), anti-Ro/SSA, anti-La/SSB, anti-Sm, anti-RNP, anti-histone, and anti-Scl-70/nucleolar/centromere. The tests were divided among the committee members, each of whom primarily reviewed the literature for one or more of the tests. Different committee members provided independent secondary review of a sample of the articles for each topic.
The guidelines were formulated using an explicit, evidence-based approach. Recommendations from the Evidence-Based Medicine Working Group concerning guidelines were incorporated into the design process (1, 2). The first step in the formulation of the guidelines was the creation of a template (Table 1) that identified the key clinical questions that would need to be answered. Specific points considered for each test included laboratory methodology, indications for use of the test in diagnosis, prognosis, and longitudinal followup, and areas that require additional research. Wherever possible, consideration was given to common clinical uses for the tests. For example, it was recognized that some tests may be used to “rule-in” or “rule-out” a diagnosis. The available literature was systematically examined to determine the best answers to the key questions of interest.
|I. Definition of the test|
|A. Relevant historic considerations|
|B. Methodologic considerations|
|III. Indications for clinical use of the test|
|1. What is the prevalence of a positive test result in systemic lupus erythematosus, other rheumatic diseases, other diseases, and normal controls?|
|2. What are the population statistics (sensitivity, specificity, and positive and negative likelihood ratio) for this test in these conditions?|
|3. What effect does titer have on the population statistics?|
|4. For whom should this test be used as an aid to diagnosis?|
|5. What further testing should be done based on a negative or positive test?|
|1. Does a test result correlate with any aspect of prognosis (e.g., disease activity, end-organ involvement, outcome, survival)?|
|2. What are the population statistics relevant to prognosis?|
|3. What effect does titer have on the prognosis?|
|4. For whom should this test be used as an aid to determination of prognosis?|
|C. Longitudinal assessment|
|1. In which conditions do changes in the test correlate with clinical variables longitudinally?|
|2. What effect does treatment have on the test result?|
|3. For whom should the test be utilized as an aid to longitudinal assessment?|
|IV. Areas that require additional research|
Selection and grading of literature
A comprehensive literature review was performed for each test. Searches were conducted using electronic databases (e.g., MEDLINE, Healthstar), restricting the references to English-language articles. This was supplemented by thorough hand searching of reference lists. As a result of this search method, a literature database was assembled for each test. Review articles, editorials, individual case reports, case series with fewer than 10 patients, animal studies, and articles that did not report primary data were all excluded from analysis. Articles that did not report complete data (e.g., those showing data only from “representative patients”) were also excluded.
In order to identify studies with the highest methodologic quality, articles in the literature database were critically reviewed according to criteria developed a priori. Five criteria derived from those of the Evidence-Based Medicine Working Group (3, 4) were considered (Table 2). An article satisfying all 5 criteria was considered an “A” article; those satisfying 3 or 4 criteria were considered “B” articles; those satisfying 2 criteria were “C” articles; and those meeting only 1 or none of the criteria were “D” articles. It was decided that in order to create the guidelines, the best available literature would be considered. Thus, for those questions on which there were 10 or more grade A articles, no consideration would be given to lesser quality articles. In other situations, articles of grade B were also considered. Consensus on grading was achieved by independent secondary review. For each test, approximately 10% of the articles in the database were randomly selected and independently reviewed. Interobserver agreement rates were calculated using a kappa statistic.
|1. Are complete data provided? (e.g., likelihood ratios or the data to calculate likelihood ratios provided)|
|2. Was there an independent blind comparison with a reference standard (e.g., were American College of Rheumatology diagnostic criteria used? What measure of disease activity was used, etc)?|
|3. Did the patient sample include an appropriate spectrum of patients to whom the test will be applied in clinical practice?|
|4. What methods were used to perform the test and are they currently available?|
|5. Did the results of the test being evaluated influence the decision to perform the reference standard (i.e., was there “work-up” bias?)|
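The grading rule and the agreement check described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and because the text says only that "a kappa statistic" was used, the choice of Cohen's kappa for two reviewers is an assumption.

```python
from collections import Counter


def grade_article(criteria_met: int) -> str:
    """Map the number of Table 2 criteria satisfied (0-5) to a letter grade."""
    if not 0 <= criteria_met <= 5:
        raise ValueError("criteria_met must be between 0 and 5")
    if criteria_met == 5:
        return "A"
    if criteria_met >= 3:
        return "B"
    if criteria_met == 2:
        return "C"
    return "D"


def cohen_kappa(reviewer_1: list, reviewer_2: list) -> float:
    """Cohen's kappa for two reviewers grading the same articles.

    Assumption: the source does not name the kappa variant; Cohen's kappa
    is used here purely for illustration.
    """
    if len(reviewer_1) != len(reviewer_2) or not reviewer_1:
        raise ValueError("both reviewers must grade the same non-empty set")
    n = len(reviewer_1)
    # Observed agreement: fraction of articles given the same grade.
    observed = sum(a == b for a, b in zip(reviewer_1, reviewer_2)) / n
    # Agreement expected by chance, from each reviewer's grade frequencies.
    counts_1, counts_2 = Counter(reviewer_1), Counter(reviewer_2)
    expected = sum(counts_1[g] * counts_2[g] for g in counts_1) / n**2
    if expected == 1.0:
        return 1.0  # both reviewers used one identical grade throughout
    return (observed - expected) / (1 - expected)
```

For instance, an article meeting 4 of the 5 criteria would be graded "B" by `grade_article(4)`.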
Clinicians may consider ANA and other autoantibody testing for various rheumatologic and nonrheumatologic conditions. Performance characteristics of diagnostic tests (e.g., sensitivity, specificity, and likelihood ratios) can only be determined for specific diseases. Therefore, an initial review of the literature was performed to establish the scope of diseases for which there was available literature to analyze concerning the use of each test. For example, initial review of anti-Ro/anti-La testing revealed that there were a large number of articles addressing the use of these tests in both systemic lupus erythematosus (SLE) and Sjögren's syndrome. After the diseases to be assessed in detail were determined for each test, data were derived from the articles to provide answers to the questions in the original template (e.g., what is the prevalence of a positive test in patients with SLE?). The proportion of patients with and without particular diseases who had positive or negative results for a given test was abstracted from the articles, and 2 × 2 contingency tables were created (Table 3). From these tables, sensitivity, specificity, and positive and negative likelihood ratios were calculated across all included studies. Where data were available, separate consideration was given to the use of the test in diagnosis, prognosis, and longitudinal followup, in SLE as well as other conditions.
|Test result||Disease present||Disease absent|
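As an illustration of how the performance characteristics follow from a 2 × 2 contingency table, the standard definitions can be sketched in Python. The class name, field names, and the example counts are hypothetical and are not taken from any article in the literature database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2 x 2 table of test result versus disease status."""
    true_pos: int   # test positive, disease present
    false_pos: int  # test positive, disease absent
    false_neg: int  # test negative, disease present
    true_neg: int   # test negative, disease absent

    @property
    def sensitivity(self) -> float:
        # Proportion of patients with the disease who test positive.
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        # Proportion of persons without the disease who test negative.
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def positive_lr(self) -> float:
        # Sensitivity divided by the false-positive rate (1 - specificity).
        false_pos_rate = self.false_pos / (self.false_pos + self.true_neg)
        if false_pos_rate == 0:
            return float("inf")
        return self.sensitivity / false_pos_rate

    @property
    def negative_lr(self) -> float:
        # False-negative rate (1 - sensitivity) divided by specificity.
        false_neg_rate = self.false_neg / (self.true_pos + self.false_neg)
        if self.specificity == 0:
            return float("inf")
        return false_neg_rate / self.specificity


# Hypothetical counts, for illustration only.
table = TwoByTwo(true_pos=90, false_pos=20, false_neg=10, true_neg=80)
print(table.sensitivity, table.specificity, table.positive_lr, table.negative_lr)
```

With these made-up counts the sensitivity is 0.90 and the specificity 0.80, giving a positive likelihood ratio of about 4.5 and a negative likelihood ratio of about 0.125.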
Substantial heterogeneity was noted among the various articles addressing any given topic. Some of the important differences included 1) different methods used to perform the same test, 2) different substrates used to perform similar methods, 3) dissimilar populations from which the patients were acquired (for example, patients seen at a specialty clinic in a tertiary referral center versus those seen in the community at large), 4) various lengths of disease duration at the time of testing, and 5) the use of medications or other therapies that might affect the results of diagnostic testing. Because of this heterogeneity, formal meta-analytic techniques were not used to combine data from different articles into a single estimate. Instead, data are presented for each article used in the recommendations. However, to provide a useful guide to clinicians, weighted averages for the performance characteristics are calculated, and recommendations for using the tests in the guidelines are based upon these weighted averages.
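A weighted average of a performance characteristic across articles might be computed as in the sketch below. The weighting scheme is not specified in the text; weighting each article by its number of patients is an assumption made here for illustration, and the function name and example values are hypothetical.

```python
def weighted_average(values: list, weights: list) -> float:
    """Weighted mean of per-article estimates (e.g., sensitivities).

    Assumption: weights are per-article patient counts; the source does
    not state how the weights were chosen.
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal in length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical sensitivities from three articles with 100, 50, and 50 patients.
print(weighted_average([0.90, 0.80, 0.70], [100, 50, 50]))
```

In this made-up example the larger study pulls the summary value (about 0.825) above the unweighted mean of 0.80.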
The statistics reported are meant to provide the clinician with an estimate of the utility of these tests (4). Many clinicians have some familiarity with the sensitivity (i.e., how often are the results of a particular test positive among all patients who actually have the disease in question) and specificity (i.e., how often are the results of a test negative among persons who do not have the disease) of various tests such as the ANA (Table 3). However, the sensitivity and specificity of a test are not sufficient to calculate the probability of disease for a given patient. The likelihood ratio is meant to provide an additional measure that may be of greater value in daily practice. The positive likelihood ratio and negative likelihood ratio are measures of how the posttest probability is impacted by the test result (Figure 1). In other words, the likelihood ratio allows the clinician to calculate the posttest probability based upon the pretest probability and the test result. The important issue in deciding whether to use a diagnostic test in a particular patient is whether the posttest probability will be significantly different from the pretest probability, given a positive or negative test result.
For example, a clinician may see a 33-year-old woman with arthritis, oral ulcers, and a facial rash. Based upon her age and symptoms, the clinician believes that her chances of having SLE are substantially greater than those of the general population, and he or she estimates that the patient has a 10% chance of having SLE. The clinician then wants to order a test (such as an ANA or related autoantibody test) to help confirm this suspicion. If the test has a high positive likelihood ratio (e.g., 10), and the test result is positive, then the posttest probability of SLE will be greatly increased (Figure 2). This test will be very helpful in assisting the clinician to make the diagnosis of SLE. If the likelihood ratio of the test were slightly less (e.g., 5), there would still be a substantial increase in the posttest probability that could be clinically relevant. However, if the test had a small positive likelihood ratio (e.g., 1.2), then the posttest probability would not differ substantially from the pretest probability, and a positive test result would not help the clinician in making a diagnosis.
In a different clinical context, the same test would have different implications. A clinician may order an ANA on a 78-year-old man with fatigue and arthralgias. If there were no other findings suggestive of SLE, but an ANA is obtained to “rule-out SLE,” and the result is “positive” with a 1:40 titer, how should this result be interpreted? In this case, based on the clinical presentation, the pretest probability of SLE would be expected to be only slightly greater than the prevalence of SLE in the general population (approximately 0.1% or less). For this patient, even if the test had a large positive likelihood ratio, the posttest probability would not be expected to be significantly different from the pretest probability (Figure 2).
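The arithmetic behind these two scenarios can be made explicit. The standard approach converts the pretest probability to odds, multiplies by the likelihood ratio, and converts back to a probability. The sketch below uses a hypothetical function name; the pretest probability of 0.1% and the likelihood ratio of 10 used for the second patient are illustrative assumptions.

```python
def posttest_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Posttest probability from a pretest probability and a likelihood ratio."""
    if not 0 <= pretest_probability < 1:
        raise ValueError("pretest_probability must be in [0, 1)")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# 33-year-old woman, estimated pretest probability of SLE 10%.
for lr in (10, 5, 1.2):
    print(f"LR {lr}: {posttest_probability(0.10, lr):.1%}")

# 78-year-old man, assumed pretest probability of 0.1% and an assumed LR of 10.
print(f"LR 10: {posttest_probability(0.001, 10):.1%}")
```

With a pretest probability of 10%, a positive result raises the probability to roughly 53% for a likelihood ratio of 10, roughly 36% for 5, and only about 12% for 1.2. With a pretest probability of 0.1%, even a likelihood ratio of 10 yields a posttest probability of about 1%.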
The likelihood ratios allow the clinician to estimate whether there will be a significant change in the pretest to posttest probability of a disease as a result of obtaining the test (4). The determination of a “significant” change depends on a number of factors, including the disease in question, the therapeutic options, and the threshold for starting therapy. A likelihood ratio of 1 implies that the posttest probability of disease will be exactly the same as the pretest probability of disease (Table 4). Thus, the test is of no value in determining the presence of disease. In general, likelihood ratios >10 or <0.1 translate into very large, clinically important differences in pretest–posttest probability. Likelihood ratios between 5 and 10 or between 0.1 and 0.2 often result in more modest, but still substantial, differences in pretest–posttest probability. Ratios from 2 to 5, or from 0.2 to 0.5, generate small differences that may still be relevant in certain clinical settings. Likelihood ratios between 1 and 2, or between 0.5 and 1, generate very small differences that are seldom clinically important. In the clinic, the closer the likelihood ratio is to 1, the more important the clinical context becomes in interpreting the test result. Likelihood ratios close to 1 affect pretest probability to a small and generally insignificant degree (4).
|Clinical implications||Range of positive likelihood ratios||Range of negative likelihood ratios|
|Large, often very clinically significant differences||>10||<0.1|
|Modest clinical differences||5–10||0.1–0.2|
|Small, but potentially relevant clinical differences||2–5||0.2–0.5|
|Small, rarely clinically important differences||1–2||0.5–1|
|No difference between pretest and posttest probabilities||1||1|
An important consideration in the use of these statistics in evaluating diagnostic tests is that the clinician should first have a reasonable estimate of the pretest probability of a disease. Unfortunately, pretest probabilities can be difficult to gauge, and very little data have been collected specifically on this issue.
In forming the Immunologic Laboratory Testing Guidelines, tests will be recommended as being “very useful,” “useful,” or “not useful” based on their diagnostic accuracy and associated likelihood ratios. A test will be considered to be “very useful” in a given situation if the majority of high-quality articles addressing the question have positive likelihood ratios >5 or negative likelihood ratios <0.2. A test will be considered “useful” if the majority of positive likelihood ratios fall between 2 and 5, or negative likelihood ratios between 0.2 and 0.5. A test is considered “not useful” if the positive likelihood ratios are <2 or the negative likelihood ratios are >0.5.
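These thresholds can be expressed as a simple rule. In the sketch below, the function names are hypothetical, values falling exactly on a boundary (2, 5, 0.2, or 0.5) are assigned to the “useful” band, and no rating is returned when no single category holds a majority; these conventions are assumptions, since the text does not address them.

```python
from collections import Counter
from typing import Optional


def rate_positive_lr(lr: float) -> str:
    """Rate one article's positive likelihood ratio."""
    if lr > 5:
        return "very useful"
    if lr >= 2:  # assumption: boundary values count as "useful"
        return "useful"
    return "not useful"


def rate_negative_lr(lr: float) -> str:
    """Rate one article's negative likelihood ratio."""
    if lr < 0.2:
        return "very useful"
    if lr <= 0.5:  # assumption: boundary values count as "useful"
        return "useful"
    return "not useful"


def majority_rating(ratings: list) -> Optional[str]:
    """Return the rating held by more than half of the articles, if any."""
    if not ratings:
        return None
    rating, count = Counter(ratings).most_common(1)[0]
    return rating if count > len(ratings) / 2 else None


# Hypothetical positive likelihood ratios from three high-quality articles.
print(majority_rating([rate_positive_lr(lr) for lr in (12.0, 6.5, 3.0)]))
```

Here two of the three hypothetical articles report positive likelihood ratios above 5, so the majority rating is “very useful.”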
For a number of important questions, there was not sufficient high-quality literature to allow rigorous analysis of the utility of a test. Because these guidelines are evidence-based, no recommendation is made in those areas for which there is insufficient literature. Rather, these questions are identified as important areas that require further investigation. This is relevant, because customary practice for some clinicians may include paradigms of diagnostic testing that are neither supported nor refuted by the medical literature. In those areas, additional research is needed to provide guidance as to optimal use of diagnostic tests.
The Committee has attempted to be explicit and evidence-based in our methods, and objective in our formulation of these guidelines. As has been recently noted, determination of appropriateness of laboratory testing is quite difficult without clear methodology (5). Likelihood ratios were chosen to determine the appropriateness of laboratory testing because they were considered to be truly objective. In addition, likelihood ratios reflect the thought processes used by clinicians when using laboratory tests, although clinicians often do not formally calculate such quantitative measures.
We hope these guidelines are useful to clinicians caring for patients with rheumatic disease. In addition, it is hoped that this format serves as a model for further guideline development, and spurs additional research into the utility of immunologic laboratory tests.