Abstract

  1. Top of page
  2. Abstract
  3. INTRODUCTION
  4. MATERIALS AND METHODS
  5. RESULTS
  6. DISCUSSION
  7. AUTHOR CONTRIBUTIONS
  8. ROLE OF THE STUDY SPONSOR
  9. Acknowledgements
  10. REFERENCES

Objective

To prospectively validate the preliminary criteria for clinical inactive disease (CID) in patients with select categories of juvenile idiopathic arthritis (JIA).

Methods

We used the process for development of classification and response criteria recommended by the American College of Rheumatology Quality of Care Committee. Patient-visit profiles were extracted from the phase III randomized controlled trial of infliximab in polyarticular-course JIA (i.e., patients considered to resemble those with select categories of JIA) and sent to an international group of expert physician raters. Using the physician ratings as the gold standard, the sensitivity and specificity of the preliminary criteria were calculated. Modifications to the criteria were made, and the modified criteria were sent to a larger group of pediatric rheumatologists to determine quantitative, face, and content validity.

Results

Variables weighted heaviest by physicians when making their judgment were the number of joints with active arthritis, erythrocyte sedimentation rate (ESR), physician's global assessment, and duration of morning stiffness. Three modifications were made: the definition of uveitis, the definition of abnormal ESR, and the addition of morning stiffness. These changes did not alter the accuracy of the preliminary set.

Conclusion

The modified criteria, termed the “criteria for CID in select categories of JIA,” have excellent feasibility and face, content, criterion, and discriminant validity to detect CID in select categories of JIA. The small changes made to the preliminary criteria set did not alter the area under the receiver operating characteristic curve (0.954) or accuracy (91%), but have increased face and content validity.

This criteria set has been approved by the American College of Rheumatology (ACR) Board of Directors as Provisional. This signifies that the criteria set has been quantitatively validated using patient data, but it has not undergone validation based on an external data set. All ACR-approved criteria sets are expected to undergo intermittent updates.

As disclosed in the manuscript, these criteria were developed with partial financial support from industry sources. The industry supporters were not involved in any stage of criteria development. As a courtesy, the authors sent copies of submitted manuscripts to their industry supporters, but review and approval of the manuscripts were neither requested nor given.

Although current ACR practice is to decline requests for review of criteria that have been supported by industry, an exception was made in this case due to prior ACR project support and because the ACR policy change took place after the industry support was solicited and received by the investigators. ACR is an independent professional, medical and scientific society which does not guarantee, warrant or endorse any commercial product or service. The ACR reviewed this manuscript on its merits and found the criteria to be methodologically rigorous and clinically meaningful. The ACR received no compensation for its approval of these criteria.

INTRODUCTION

Validated, clinically useful, and reliable criteria for defining disease states are crucial for monitoring of disease status in individual patients, development of standards of care, assessment of quality care, and as potential end points in clinical trials. The availability of new, more effective therapies for children with juvenile idiopathic arthritis (JIA), with the potential for eliminating disease activity for extended periods, underscores the need for criteria defining inactive disease (ID), clinical remission on medication (CRM), and clinical remission off medications (CR) (1–5). There is as yet no indisputable gold standard or biomarker for determining whether a patient with JIA is in a state of ID. At present, physical examination and clinical laboratory criteria must be used to define this state, which is why we refer to the current effort as defining criteria for clinical inactive disease (CID) rather than ID, the latter referring to both clinically and biologically quiescent disease. In the absence of a biologic marker for active or inactive JIA, aggregated expert judgment becomes necessary to determine criteria for clinical inactive JIA (6).

Synthesis of the preliminary criteria using the literature and expert opinion based upon Delphi and nominal group consensus formation approaches (7, 8) has been described previously (9). Focus was placed on the polyarticular (rheumatoid factor positive [RF+] and RF−), extended oligoarticular, and systemic categories of JIA (Table 1). The preliminary criteria were used successfully to characterize disease patterns of activity in JIA in 2005, and retrospective validation studies were completed in 2006 using the Outcome Measures in Rheumatology Clinical Trials filter (10–12). These investigations found the criteria for ID to be quite feasible for use in the routine clinic, and because consensus formation was used to produce the preliminary set, face (clinical sensibility) and content (comprehensive) validity were considered to be high. Comparison to other criteria sets in the literature for describing remission in JIA (13–15) showed construct validity (convergent subtype) to be high. To date, the preliminary criteria have been utilized and cited in over 80 publications. Preliminary biologic evidence for the validation of the preliminary criteria has recently appeared in the literature. Knowlton and colleagues demonstrated that the disease states defined by the preliminary clinically based criteria display differentially expressed genes in peripheral blood mononuclear cells, using microarray data and hierarchical clustering analysis (16).

Table 1. Preliminary criteria for inactive disease in oligoarticular (persistent and extended), polyarticular (RF+ and RF−), and systemic JIA*

Inactive disease:
  • No joints with active arthritis
  • No fever, rash, serositis, splenomegaly, or generalized lymphadenopathy attributable to JIA
  • No active uveitis (to be defined)
  • ESR or CRP level within normal limits in the laboratory where tested. If both are tested, both must be normal
  • Physician's global assessment of disease activity score of best possible on the scale used

  * All criteria must be met. RF = rheumatoid factor; JIA = juvenile idiopathic arthritis; ESR = erythrocyte sedimentation rate; CRP = C-reactive protein.

From its inception, this project has followed the recommended process of the Classification and Response Criteria Subcommittee of the American College of Rheumatology Committee on Quality Measures (17). The subcommittee recommends the use of large, high-quality data sets for prospective validation of preliminary response criteria sets. One source of such data sets for JIA is randomized controlled trials (RCTs) conducted for submission to regulatory agencies for drug approval. We used data from the phase III RCT of infliximab in patients with polyarticular-course juvenile rheumatoid arthritis (2). As with most existing RCTs, this limited diagnostic subset of patients is confined to those who resemble patients currently classifiable into the following categories of JIA: polyarthritis (both RF+ and RF−), extended oligoarthritis, and systemic arthritis (without currently active systemic features). For this reason, this study was limited to using clinical trial data from subjects with polyarticular-course JIA without systemic features or uveitis. We aimed to prospectively validate the criteria for CID in patients with polyarticular-course JIA; the methods and results of the multistep approach used are the subjects of this report. Because results of this exercise indicated that changes should be made to the original preliminary criteria to maximize validity, this report also includes results of the effort to estimate the quantitative content validity index (CVI) of the modified criteria overall and the face validity index (FVI) of each criterion and its respective critical value.

MATERIALS AND METHODS

Terminology associated with the description of performance characteristics of criteria is not standardized across various fields of research; the terms used here are those most commonly employed in rheumatology (10, 18–24).

Approach to prospective validation of the preliminary criteria.

A summary of the multistep process used for this exercise is given in Table 2.

Table 2. Steps in the prospective validation process for criteria for defining inactive disease in select categories of juvenile idiopathic arthritis
  1. Extraction of 60 patient-visit profiles showing low or no disease activity from the 1,096 patient visits in the infliximab trial database
  2. Rating by 40 pediatric rheumatologists of disease state (active or inactive) of the 60 patient-visit profiles (referred to as survey 1)
  3. Intraphysician agreement survey in which 20 of the original 60 patient-visit profiles were re-sent to the 40 physician raters (referred to as survey 2)
  4. Regression analysis to derive a best-fit model of physician judgment to be applied to the remaining 1,036 patient profiles
  5. Application of the best-fit model to predict how the remaining 1,036 profiles would have been scored by the physician raters
  6. Calculation of agreement, across the 1,036 patient visits, between the predicted likelihood of the physicians' score and the preliminary criteria, to assess sensitivity, specificity, and area under the receiver operating characteristic curve
  7. Modified criteria sent to 60 pediatric rheumatologists to estimate quantitative content and face validity indices, and final optimization (referred to as survey 3)
  8. Final modification of the criteria (shown in Table 5)

Table 5. Criteria for defining clinical inactive disease in oligoarticular (persistent and extended), polyarticular (RF+ and RF−), and systemic JIA*

Inactive disease:
  • No joints with active arthritis†
  • No fever, rash, serositis, splenomegaly, or generalized lymphadenopathy attributable to JIA
  • No active uveitis as defined by the SUN Working Group (28)‡
  • ESR or CRP level within normal limits in the laboratory where tested or, if elevated, not attributable to JIA
  • Physician's global assessment of disease activity score of best possible on the scale used
  • Duration of morning stiffness of ≤15 minutes

  * All criteria must be met. Although this table contains criteria that refer to extraarticular manifestations of disease and uveitis, these were not part of this exercise because patients with systemic manifestations or uveitis were ineligible for enrollment into the randomized controlled trial. The uveitis and systemic criteria are shown here in order to present the entire set as it currently exists. RF = rheumatoid factor; JIA = juvenile idiopathic arthritis; ESR = erythrocyte sedimentation rate; CRP = C-reactive protein.
  † The American College of Rheumatology defines a joint with active arthritis as a joint with swelling not due to bony enlargement or, if no swelling is present, limitation of motion accompanied by either pain on motion and/or tenderness. An isolated finding of pain on motion, tenderness, or limitation of motion on joint examination may be present only if explained by either prior damage attributable to arthritis that is now considered inactive or nonrheumatologic reasons, such as trauma.
  ‡ The Standardization of Uveitis Nomenclature (SUN) Working Group defines inactive anterior uveitis as “grade zero cells,” indicating <1 cell in field sizes of 1 mm by a 1-mm slit beam.
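
Because all criteria in Table 5 must be met, the set can be read as a simple conjunction of conditions. The following is a minimal sketch of that reading, not an implementation used in this study; the `Visit` class and all of its field names are hypothetical, and the clinical judgments (e.g., whether an ESR elevation is attributable to JIA) are assumed to have been made by the physician beforehand.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One patient visit. Field names are illustrative, not from the trial database."""
    active_joints: int                   # joints with active arthritis
    systemic_features: bool              # fever, rash, serositis, splenomegaly, or
                                         # generalized lymphadenopathy attributable to JIA
    active_uveitis: bool                 # judged against the SUN Working Group definition
    acute_phase_normal: bool             # ESR or CRP within the testing laboratory's limits
    elevation_attributable_to_jia: bool  # only consulted when acute_phase_normal is False
    pga_is_best_possible: bool           # physician's global assessment at best score
    morning_stiffness_min: float         # duration of morning stiffness, in minutes


def meets_cid(v: Visit) -> bool:
    """Return True only if every criterion in Table 5 is satisfied."""
    # An elevated ESR/CRP does not exclude CID if it is not attributable to JIA.
    acute_phase_ok = v.acute_phase_normal or not v.elevation_attributable_to_jia
    return (
        v.active_joints == 0
        and not v.systemic_features
        and not v.active_uveitis
        and acute_phase_ok
        and v.pga_is_best_possible
        and v.morning_stiffness_min <= 15
    )
```

For example, a visit with no active joints, a best-possible physician's global assessment, and 10 minutes of morning stiffness would satisfy the set, whereas the same visit with 20 minutes of morning stiffness would not.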

Step 1: extraction of 60 patient profiles showing low or no disease activity from the 1,096 patient visits in the infliximab trial database.

Sixty patient profiles were extracted from the 1,096 deidentified patient study visit records from the phase III prospective RCT of infliximab in JIA reported by Ruperto et al (2). This trial, completed in 2004, enrolled a total of 122 children ages 4–17 years with active polyarticular-course JIA not adequately controlled with methotrexate. To maximize the usefulness of this step, only patient-visit profiles demonstrating low or no clinically apparent disease were extracted from the database because profiles reflecting very active disease (AD) would have been scored as AD by both physician raters and the preliminary criteria set. In order to not bias the physician raters into basing their judgments solely on those parameters that are elements of the preliminary criteria set, each profile contained 7 clinical assessments: duration of morning stiffness (DMS), visual analog scale (VAS) for pain, Childhood Health Assessment Questionnaire, parent assessment of overall well-being, the physician's global assessment of overall disease activity (PGA), the number of joints with active arthritis and the number of joints with limitation of motion, and 4 laboratory assessments (erythrocyte sedimentation rate [ESR], hematocrit, white blood cells, and platelets). The profiles contained the actual raw data from the trial; no value of any variable was imputed. Figure 1 is an example of a patient-visit profile. Because of the inclusion/exclusion criteria used for the infliximab RCT, no subject had active systemic features or uveitis and all had disease for at least 6 months.

Figure 1. Example of a patient profile: this subject has systemic juvenile idiopathic arthritis (JIA) with polyarticular course and at this visit has the clinical characteristics shown in the table below. VAS = visual analog scale; C-HAQ = Childhood Health Assessment Questionnaire; ESR = erythrocyte sedimentation rate; WBCs = white blood cells.

Step 2: rating by 40 pediatric rheumatologists of the disease state (active or inactive) of the 60 patient profiles (referred to as survey 1).

These patient-visit profiles were sent via computer link survey to 40 pediatric rheumatologists in 27 countries who were members of the Paediatric Rheumatology International Trials Organisation (PRINTO), the Childhood Arthritis and Rheumatology Research Alliance (CARRA), or the Pediatric Rheumatology Collaborative Study Group (PRCSG), who had not participated in the development of the preliminary criteria, and who were not currently using the criteria in a clinical trial. All physician raters were board certified in pediatric rheumatology and had a minimum of 10 years of postfellowship clinical experience. Physicians were asked to review each patient profile and then score it as being in a state of AD or ID, or as unable to be determined based upon the information provided.

Step 3: intraphysician agreement survey in which 20 of the original 60 patient profiles were re-sent to the 40 physician raters (referred to as survey 2).

Because the physician ratings of the profiles were to be used as the “gold standard” for criteria validation, we thought it necessary to determine if intraphysician reliability was in the acceptable range. Twenty of the original 60 patient profiles were re-sent to the physicians 2 months after receipt of the initial ratings in order to assess intraphysician reliability. Intraphysician reliability was calculated using the unweighted kappa method as described by Fleiss (25). Interrater reliability of a finalized criteria set was not part of this exercise.
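
As an illustration of the kind of statistic involved, the sketch below computes an unweighted (Cohen-type) kappa for one rater's first and repeat ratings of the same profiles. It is an assumption-laden simplification: the function name and example data are hypothetical, and the pooling across physicians and the exact estimator described by Fleiss (25) are not reproduced here.

```python
from collections import Counter


def unweighted_kappa(first, second):
    """Unweighted kappa for two sets of categorical ratings of the same items."""
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of items rated identically on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / n**2
    if expected == 1.0:
        return 1.0  # every rating falls in a single category on both occasions
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 20 profiles on two occasions.
survey_1 = ["AD"] * 10 + ["ID"] * 10
survey_2 = ["AD"] * 8 + ["ID"] * 2 + ["ID"] * 10
kappa = unweighted_kappa(survey_1, survey_2)
```

Kappa corrects the raw proportion of identical ratings for the agreement that would be expected by chance alone, which is why it is preferred over simple percent agreement for this purpose.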

Step 4: regression analysis to derive a best-fit model of physician judgment to be applied to the remaining 1,036 patient profiles.

We anticipated that very few of the 60 patient profiles would be judged by the 40 physician raters to be in a state of ID, using the 80% consensus agreement rule. Therefore, we used a series of binomial logistic regression analyses (GENMOD procedure in SAS) to develop a best-fit model of variables physicians weighted most heavily when making their judgment of disease state. In the regression analysis procedure, the physician ratings served as the dependent variable, and the 7 clinical and 4 laboratory assessments served as the independent (explanatory) variables. Because there are two analysis units, one for patient profiles and one for the physicians' ratings nested within each patient profile, we used generalized estimating equations to account for the clustering of the 40 physicians' ratings within each patient profile. All clinical and laboratory variables of the patient profiles were included in the initial multivariate logistic regression model. Stepwise and forward selection procedures were used to select the best subset of variables that predicted (showed the highest degree of correlation with) the patient state as judged by physician rating. The final logistic regression models included only those variables that remained significant at the 0.05 level. This allowed for identification of those variables that influenced most heavily the physicians' determination of the disease state of the patient.

Step 5: application of the best-fit model to predict how the remaining 1,036 profiles would have been scored by the physician raters.

In step 5, the remaining 1,036 patient-visit profiles (1,096 minus the 60 profiles actually scored by physician raters) that were not directly rated by physicians were computer scored using the best-fit regression model from step 4. This allowed us to use all of the patient visits from the trial by predicting, with a considerable amount of precision, how each of the profiles would have been scored, had the physicians rated each one. The resulting predicted probabilities of patient status were compared to the ratings of disease status by the preliminary criteria using the Kruskal-Wallis test.

Step 6: calculation of agreement between how the 1,036 patient visits were scored by the best-fit model (physician likelihood ratings) and by the preliminary criteria to assess accuracy, sensitivity and specificity, and area under the receiver operating characteristic (ROC) curve.

These metrics were calculated in the standardized manner, using the physician likelihood ratings as the gold standard.

Step 7: estimation of quantitative content and face validity and final optimization (referred to as survey 3).

Results from steps 1 through 6 suggested that changes to the preliminary criteria would be necessary to optimize their agreement with physician judgment. Therefore, after modification of the preliminary criteria, a third online survey was sent that was designed to establish the quantitative CVI of the criteria set overall and the FVI of each criterion and its respective cut point/critical value in the set using methods described by Davies et al (26). Survey 3 was sent to 60 pediatric rheumatologists, 40 of whom had participated in the exercise above and 20 of whom had participated in the original consensus conference that led to the development of the preliminary criteria. The survey contained a cover e-mail that explained the purpose of the survey and definitions of content and face validity. A computer link supplied the modified criteria set and an online questionnaire that asked that the criteria be scored as a whole for content validity using a 4-point ordinal level scale (where 1 = irrelevant and should not be used to assess ID, 2 = unable to assess relevance without revision, 3 = relevant but needs minor alterations, and 4 = very relevant and succinct). A second question asked that face validity of each of the variables and its corresponding critical value (e.g., active joint count = 0) be scored using the same 4-point scale. A free-text box allowed physician raters to make comments about their replies.

From survey 3 data, the CVI was calculated as the percentage of physician raters who scored the revised criteria set as a whole as either a 3 or 4 on the 4-point scale described above. The FVI was calculated for each item and its corresponding cut point/critical value using the identical method. A CVI or FVI of >0.80 (i.e., >80%) is considered to indicate excellent validity (26).
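
The index calculation described above amounts to a simple proportion. A minimal sketch follows; the function name is hypothetical, and the individual scores in the example are invented for illustration (only the count of raters scoring below 3 is taken from the Results).

```python
def validity_index(scores):
    """Proportion of raters scoring 3 or 4 on the 4-point relevance scale."""
    if not scores:
        raise ValueError("at least one score is required")
    if any(s not in (1, 2, 3, 4) for s in scores):
        raise ValueError("scores must be on the 4-point scale (1-4)")
    return sum(1 for s in scores if s >= 3) / len(scores)


# Illustrative data: 41 respondents, 2 of whom scored below 3.
example_scores = [4] * 30 + [3] * 9 + [2] * 2
cvi = validity_index(example_scores)  # 39 / 41
```

The same function applies to the FVI, with one list of scores per criterion rather than one for the set as a whole.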

Step 8: final modification of the criteria set.

Following the final survey, an additional modification to the criteria was made.

RESULTS

All 40 physician raters responded to the initial survey and each scored all 60 patient profiles (2,400 evaluations). Of all 2,400 physician evaluations, 1,744 (72.7%) were scored as AD, 374 (15.6%) were scored as ID, and the remaining 282 (11.7%) were scored as unable to determine. As expected, only 3 of the 60 patient-visit profiles met the 80% consensus agreement among physician raters to be classified as ID. The stepwise selection of the binomial logistic regression utilizing all 2,400 physician ratings resulted in a final best-fit model that included active joint count, PGA, ESR, and DMS. Testing different critical values for DMS (≤5 minutes and ≤15 minutes) and ESR (≤20 mm/hour and ≤25 mm/hour), the model with DMS of ≤15 minutes and ESR of ≤20 mm/hour produced the best area under the ROC curve. Odds ratios for these variables are displayed in Table 3. Highest weight was placed on the active joint count, followed by the ESR of ≤20 mm/hour, the PGA, and finally, DMS. Overall, the best-fit model (preliminary criteria with addition of DMS of ≤15 minutes) resulted in an area under the ROC curve of 0.942, indicating excellent fit with physician ratings. Therefore, we were confident about using this model to predict how the remaining 1,036 patient visits (not scored by physicians) would have been scored if physician raters had done so.
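
The 80% consensus rule applied to each profile can be sketched as follows. This is an illustration only: the function name is hypothetical, and it assumes that the threshold is inclusive (≥80%) and that "unable to determine" ratings count in the denominator, neither of which is stated explicitly in the text.

```python
def consensus_state(ratings, threshold=0.80):
    """Return 'ID' or 'AD' if that rating reaches the consensus threshold, else None.

    Assumptions (not specified in the source): the threshold is inclusive and
    every rating, including 'unable to determine', counts in the denominator.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    n = len(ratings)
    for state in ("ID", "AD"):
        if ratings.count(state) / n >= threshold:
            return state
    return None


# Hypothetical profile rated by 40 physicians: 32 ID, 8 AD.
profile_ratings = ["ID"] * 32 + ["AD"] * 8
state = consensus_state(profile_ratings)
```

Under these assumptions, a profile rated ID by 32 of 40 physicians reaches consensus, whereas one rated ID by 31 of 40 does not.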

Table 3. Generalized estimating equations estimates finding the best-fit model to physician ratings of inactive disease versus active disease*

Variable                                                                   OR (95% CI)
No joints with active arthritis                                            121.3 (59.63–247.05)
ESR up to 110% of the ULN for the test used†                               13.2 (1.32–22.57)
Physician's global assessment = best attainable score on the scale used    8.9 (4.59–17.42)
Duration of morning stiffness of ≤15 minutes                               2.7 (1.04–6.78)

  * Area under the curve = 0.942. OR = odds ratio; 95% CI = 95% confidence interval; ESR = erythrocyte sedimentation rate; ULN = upper limit of normal.
  † The ESR was used in the randomized controlled trial of infliximab in juvenile idiopathic arthritis; the C-reactive protein level was not assessed. The ESR criterion was later modified as shown in Table 5 and described in the text.

Thirty-seven (92.5%) of 40 physician raters responded to the survey designed to estimate intraphysician reliability in rating the patient-visit profiles, with a resulting kappa value of 0.7 (95% confidence interval 0.63–0.77), indicating “substantial agreement” (27).

Applying the best-fit model from the regression analysis to the remaining 1,036 patient profiles, the predicted likelihood of a physician's rating was calculated as the inverse logit function of the linear combinations of the best-fit model. Using this method, a total of 744 profiles were judged to be in AD and 113 in ID (179 patient profiles were scored as unable to determine). The abnormal items for profiles that were scored as “unable to be determined” included parent global assessment and joints with limited range of motion (70%), pain (65%), PGA and ESR (45%), and hematocrit (30%). Using the best-fit model (derived from the physician likelihood ratings) as the “gold standard,” the original preliminary criteria correctly classified 744 of the profiles as being in AD and 37 as being in ID, thus yielding an area under the curve of 0.954 and an accuracy of 91%. These results are shown in Table 4.
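
The scoring step described above can be sketched as an inverse logit applied to a linear combination of the model terms. The sketch below is illustrative only: the coefficients are taken as the natural logarithms of the odds ratios in Table 3, the predictors are assumed to be coded as yes/no indicators, and the intercept is a hypothetical placeholder because the fitted intercept is not reported. The 0.8 cutoff follows the classification rule stated in Table 4.

```python
import math

# Odds ratios as reported in Table 3; each coefficient is ln(OR).
ODDS_RATIOS = {
    "no_active_joints": 121.3,
    "esr_within_limit": 13.2,
    "pga_best_score": 8.9,
    "dms_le_15_min": 2.7,
}

# HYPOTHETICAL value: the fitted intercept is not reported in the article.
HYPOTHETICAL_INTERCEPT = -6.0


def inverse_logit(x):
    """Map a linear predictor onto a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_probability_id(indicators, intercept=HYPOTHETICAL_INTERCEPT):
    """Predicted likelihood that physicians would rate the profile as ID."""
    linear = intercept + sum(
        math.log(ODDS_RATIOS[name])
        for name, present in indicators.items()
        if present
    )
    return inverse_logit(linear)


def classify(p_id, cutoff=0.8):
    """Classify as ID or AD only when the predicted likelihood exceeds the cutoff."""
    if p_id > cutoff:
        return "ID"
    if 1.0 - p_id > cutoff:
        return "AD"
    return "unable to determine"


all_met = {name: True for name in ODDS_RATIOS}
none_met = {name: False for name in ODDS_RATIOS}
joints_only = dict(none_met, no_active_joints=True)
```

With the placeholder intercept, a profile meeting all four conditions is classified as ID, one meeting none as AD, and one meeting only the joint criterion falls between the two cutoffs; the actual boundaries depend on the fitted intercept.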

Table 4. Agreement between preliminary criteria and predicted physician ratings based on the best-fit model for disease activity among 1,036 patient profiles*

                                        Disease activity rating according to the predicted
                                        likelihood of physician rating (ID and AD are
                                        classified by likelihood of >0.8)
Disease activity rating according to
the preliminary criteria for ID         ID, no. patients     AD, no. patients
  ID                                    37                   0
  AD                                    76                   744
  Total                                 113                  744

  * One hundred seventy-nine of the 1,036 patient profiles were scored as “unable to determine” and did not enter the analysis. Therefore, the total number of patients for the table is 857. Sensitivity = 33%; specificity = 100%; area under the curve = 0.954; accuracy = 91%. Results using the modified criteria (shown in Table 5) yielded the exact same results. ID = inactive disease; AD = active disease.
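
The sensitivity, specificity, and accuracy reported with Table 4 follow directly from the four cell counts. A minimal sketch of the arithmetic, treating ID as the positive class and the predicted physician rating as the reference standard (the function name is hypothetical):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and accuracy from a 2 x 2 agreement table."""
    sensitivity = tp / (tp + fn)   # reference-ID profiles the criteria call ID
    specificity = tn / (tn + fp)   # reference-AD profiles the criteria call AD
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy


# Cell counts from Table 4 (ID is the positive class).
sens, spec, acc = diagnostic_metrics(tp=37, fp=0, fn=76, tn=744)
```

Here 37 of 113 reference-ID profiles and all 744 reference-AD profiles are classified concordantly, so 781 of the 857 profiles entering the analysis agree.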

Based on the analyses described above, 3 proposed changes were made to the criteria set and presented to physicians for their opinion in an online survey (survey 3). The proposed changes were: 1) addition of the definition of inactive uveitis as developed by the Standardization of Uveitis Nomenclature (SUN) Working Group (28), 2) addition of a new criterion of DMS not in excess of 15 minutes, and 3) clarification of abnormal ESR. Forty-one (68%) of 60 physicians replied to this survey. The CVI of the modified criteria set as a whole was 95%, with only 2 of the 41 respondents giving a score below 3, indicating extremely high content validity. The FVI (shown in parentheses as a percentage) for each criterion and its critical value was as follows: active joint count = 0, no fever or rash, and no active uveitis as defined by the SUN Working Group (all 100%), ESR up to 110% of the upper limit of normal (93%), and DMS of ≤15 minutes (95%). The majority of physicians expressed opinions along with their score.

The results of the physician rating survey and analysis, together with accommodation of opinions expressed in this final survey, yielded the modified criteria for CID shown in Table 5. Specifically, the final optimization changed two individual elements of the original preliminary criteria (uveitis is now defined according to the SUN Working Group, and more detail is given regarding the ESR) and added DMS. Physicians were accepting of the SUN Working Group's definition of inactive uveitis, which was not available when the preliminary criteria were developed. Because many different methodologies for determination of the ESR are used throughout the world, no upper limit of normal is specified. Further, ESR elevation associated with concurrent illness and not attributable to JIA should not be the sole criterion on which to exclude a patient from being classified as ID. Finally, results from the multivariate analyses and the final survey revealed that DMS in excess of 15 minutes is a clinically important indicator of AD.

DISCUSSION

To our knowledge, this is the first report of prospective validation of the preliminary criteria for defining ID in JIA using RCT data collected prospectively under Good Clinical Practice regulations. The multistep approach used in our analysis resulted in the modification of two of the criteria (definition of active uveitis and ESR) and the addition of another (DMS ≤15 minutes). This demonstrates the importance of prospective validation and underscores the fact that criteria sets continue to evolve and are never considered “final.”

The addition of DMS is the only new variable introduced to the criteria set. Comments from the final survey suggest that physicians believe that DMS of a short duration (i.e., ≤15 minutes) can represent residua of previously active disease without current active disease. However, DMS of a longer duration was considered to be sufficient cause by itself to classify the patient as being in a state of AD. Despite these modifications, the revised criteria set yielded the same area under the curve as did the preliminary criteria when using the physician ratings of patient profiles as the gold standard. Several reasons can be postulated for this finding. First, all patients classified as being in a state of ID by the preliminary criteria had a DMS ≤15 minutes. Next, the change in the criterion for inactive uveitis would not be expected to change the scoring, since no patients had uveitis and the modification represents only a further refinement of the definition of inactive eye involvement.

An important change from the preliminary criteria is that the ESR may be elevated, hence the addition of the term “or, if elevated, not attributable to JIA.” No specification of an upper limit of normal for the ESR is appropriate due to the multitude of methods for its determination, many of which have different limits of normality. Importantly, many children with new-onset active JIA or a flare of JIA do not have elevation of the ESR. In addition, comments from the surveys repeatedly emphasized the fact that elevations of acute-phase reactants frequently are caused by conditions unrelated to JIA, and that such an elevation should not, by itself, be the sole justification for classifying a patient as having active JIA. The option of finding another source for an elevated ESR may introduce bias into the definition of ID and could allow for potential misinterpretation by physicians. However, these clinical interpretation issues are the same for the joint examination and the PGA and reinforce the appropriateness of calling these criteria for CID, since a biomarker is not currently available for the state of ID. Recently, the C-reactive protein (CRP) level has gained popularity as the acute-phase reactant of choice. Therefore, clinicians and investigators should feel free to use either the ESR or CRP level as a laboratory measure of inflammation.

The approach described in this article has limitations, and additional validation in other data sets is appropriate. When the patient profile survey was conducted, we used the term ID rather than CID. Results of ratings may have been different if CID had been used in the survey rather than ID. We believe that the term “clinical inactive disease” is best to use, as these criteria have not been shown to identify biologically inactive disease. It is hoped that in the near future, translational research will develop such a definition or biomarker. Because patients with active systemic features and those with persistent oligoarthritis were not included in the RCT of infliximab in JIA, the results shown here apply most directly to those with polyarticular-course JIA (RF+ or RF−). The systemic criteria in the current set are those that had high face validity to physicians who participated in synthesis of the preliminary set. The clinical trial used for this exercise used the former classification of juvenile rheumatoid arthritis, as defined by the American College of Rheumatology (29). However, some patients included in the current study did have an oligoarticular onset that progressed to extended oligoarthritis, which would have been classified previously as pauciarticular onset–polyarticular course. Additionally, some patients in the trial had experienced a systemic onset, but followed a polyarticular course. Therefore, while results presented here are likely most applicable to patients with polyarthritis, oligoarticular and systemic-onset patients were included in the data set from the trial. Still, results of criterion and content validity and reliability may have been different if patients with other forms of JIA, particularly those with active systemic features, had been included. Patients with juvenile psoriatic arthritis, those with enthesitis-related arthritis, and patients with uveitis were not included in this data set. Additional clinical trial databases now in development will serve as the basis on which to validate the criteria in other JIA disease categories as well as permit further estimations of sensitivity and specificity.

Recently, the use of a 21-circle VAS rather than a 10-cm line or 11-circle (0–10 Likert-like scale) instrument to assess PGA has become popular. The database used in this exercise used the 11-circle scale, and therefore we were unable to assess whether the 21-circle VAS (with a gradation of 0.5) would have allowed more patients to be classified as being in ID and thereby increased the sensitivity of the criteria set. Other databases will need to be used to investigate this possibility.

The reliability exercises described in this manuscript do not estimate either inter- or intrarater reliability of the assessment of individual components of the criteria. Nonreliability is known to exist among physician raters when judging the specific clinical parameters in the current criteria set. Interphysician reliability coefficients of an established criteria set cannot be estimated using data from this exercise. Agreement among physicians from the first survey helped us establish what the criteria should be, not the reliability of an established criteria set. The analysis of intraphysician reliability is more useful in the conventional sense. Physicians who were able to make a determination of patient status on the initial survey were quite reliable in their reassessment at a later date of the same patient profile.
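
As an illustration of how intraphysician agreement of this kind can be quantified, the following is a minimal sketch that computes Cohen's kappa between one rater's initial and repeat judgments of the same patient profiles. The choice of Cohen's kappa, the function name `cohen_kappa`, and the ratings shown are assumptions made for the example; they are not the statistic or data reported for this study.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Chance-corrected agreement between two sets of ratings of the same items."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of profiles given the same rating on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal rating frequencies.
    first_counts = Counter(first)
    second_counts = Counter(second)
    expected = sum(
        first_counts[label] * second_counts[label] for label in first_counts
    ) / (n * n)
    if expected == 1.0:
        # Only one category was ever used, so agreement is trivially complete.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of eight profiles by one physician on two occasions.
initial = ["ID", "ID", "not ID", "not ID", "ID", "not ID", "ID", "ID"]
repeat = ["ID", "ID", "not ID", "not ID", "ID", "not ID", "not ID", "ID"]
kappa = cohen_kappa(initial, repeat)
print(f"kappa = {kappa:.2f}")
```

Profiles for which the physician could not make a determination would have to be excluded or treated as a third category before applying such a calculation.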

The convergent validity subtype of construct validity has been estimated in prior retrospective validation exercises using existing instruments for describing remission in JIA. Analyses of these subtypes of validity were not performed in the current effort, as other remission criteria used variables that were not collected during the infliximab trial. Therefore, further work is needed to establish both convergent and divergent validity, although preliminary results from prior retrospective work are encouraging, as published earlier (12).

Van Tuyl and colleagues recently have described the collaborative process between the American College of Rheumatology and the European League Against Rheumatism to redefine remission in adults with rheumatoid arthritis (30). Two forms of remission have been developed for use in clinical trials and in the clinic (30, 31). In contrast, the current efforts in pediatric rheumatology have been focused on a single definition of CID, CRM, and CR.

Our ultimate goal is to establish worldwide criteria for CID, CRM, and CR for JIA that can be easily used in clinical care and research settings. Criteria for CID have been prospectively validated and modified as a result of the validation process. The modified criteria, while having the same sensitivity, specificity, and accuracy as the preliminary criteria, likely have greater face and content validity. Future efforts are necessary to 1) validate these criteria in additional prospectively collected data sets (32), 2) estimate the conditional probability of CRM given CID, and of CR given CRM, and 3) establish the predictive validity of CRM and CR.
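
The quantities named in this paragraph can be sketched as follows. The example computes sensitivity, specificity, and accuracy of a criteria set against a gold-standard rating, and a conditional probability such as that of CRM given CID. The function names `diagnostic_metrics` and `conditional_probability` and all data values are hypothetical and serve only to illustrate the arithmetic; they are not results from this study.

```python
from typing import Dict, Sequence


def diagnostic_metrics(gold: Sequence[bool], criteria: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy of `criteria` against `gold`."""
    if len(gold) != len(criteria) or len(gold) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(gold, criteria))
    tp = sum(g and c for g, c in pairs)          # gold ID, criteria met
    fn = sum(g and not c for g, c in pairs)      # gold ID, criteria not met
    tn = sum(not g and not c for g, c in pairs)  # gold not ID, criteria not met
    fp = sum(not g and c for g, c in pairs)      # gold not ID, criteria met
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("gold standard must contain both classes")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def conditional_probability(given: Sequence[bool], event: Sequence[bool]) -> float:
    """Estimate P(event | given) as the fraction of `given` cases with `event`."""
    if len(given) != len(event):
        raise ValueError("inputs must be of equal length")
    n_given = sum(given)
    if n_given == 0:
        raise ValueError("no cases satisfy the conditioning state")
    return sum(g and e for g, e in zip(given, event)) / n_given


# Hypothetical gold-standard ratings (True = inactive disease) and criteria results.
gold = [True, True, True, True, False, False, False, False, False, False]
met = [True, True, True, False, False, False, False, False, False, True]
metrics = diagnostic_metrics(gold, met)

# Hypothetical follow-up: which patients reached CID, and which of them reached CRM.
reached_cid = [True, True, True, True, False]
reached_crm = [True, True, False, False, False]
p_crm_given_cid = conditional_probability(reached_cid, reached_crm)
print(metrics, p_crm_given_cid)
```

The same `conditional_probability` helper applies unchanged to CR given CRM by passing the corresponding status lists.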

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Wallace had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Wallace, Giannini, Ruperto.

Acquisition of data. Wallace, Giannini, Itert, Ruperto.

Analysis and interpretation of data. Wallace, Giannini, Huang, Ruperto.

ROLE OF THE STUDY SPONSOR

This study was supported by an unrestricted grant from Centocor. As disclosed in the manuscript, these criteria were developed with partial financial support from industry sources. The industry supporters were not involved in any stage of criteria development. As a courtesy, the authors sent copies of submitted manuscripts to their industry supporters, but review and approval of the manuscripts were neither requested nor given.

Acknowledgements

The authors wish to acknowledge the important participation by the following members of the PRCSG, CARRA, and PRINTO: L. Abramson, S. Bowyer, B. Chalom, R. Cron, M. Elder, H. Gewanter, N. Ilowite, L. Jung, Y. Kimura, D. Kingsbury, C. Lindsley, D. Lovell, D. McCurdy, R. Moore, K. O'Neil, L. Rider, C. Rose, K. Schickler, D. Sherry, J. Soep, L. Stein, R. Vehe, L. Wagner-Weiner, L. Zemel (US); B. Lang, R. Laxer, R. Schneider, L. Tucker (Canada); S. Al-Mayouf (Saudi Arabia); B. Andersson-Gare (Sweden); T. Avcin (Slovenia); B. Flato (Norway); R. Cimaz, F. De Benedetti, F. Fantini, A. Martini, A. Ravelli (Italy); C. De Cunto, S. Garay (Argentina); C. Saad-Magalhaes, S. Oliveira (Brazil); J. de Inocencio (Spain); P. Dolezalova (Czech Republic); D. Foell, G. Horneff (Germany); J. Melo-Gomes (Portugal); S. Nielsen (Denmark); H. Ozdogan, S. Ozen (Turkey); P. Quartier (France); F. Kanakoudi-Tsakalidou (Greece); Y. Uziel (Israel); R. Vesely (Slovakia); and N. Wulffraat (The Netherlands).

REFERENCES

  1. Lovell DJ, Giannini EH, Reiff A, Cawkwell GD, Silverman ED, Nocton JJ, et al, for the Pediatric Rheumatology Collaborative Study Group. Etanercept in children with polyarticular juvenile rheumatoid arthritis. N Engl J Med 2000; 342: 763–9.
  2. Ruperto N, Lovell DJ, Cuttica R, Wilkinson N, Woo P, Espada G, et al, for the Paediatric Rheumatology International Trials Organisation and the Pediatric Rheumatology Collaborative Study Group. A randomized, placebo-controlled trial of infliximab plus methotrexate for the treatment of polyarticular-course juvenile rheumatoid arthritis. Arthritis Rheum 2007; 56: 3096–106.
  3. Lovell DJ, Reiff A, Ilowite NT, Wallace CA, Chon Y, Lin SL, et al, for the Pediatric Rheumatology Collaborative Study Group. Safety and efficacy of up to eight years of continuous etanercept therapy in patients with juvenile rheumatoid arthritis. Arthritis Rheum 2008; 58: 1496–504.
  4. Lovell DJ, Ruperto N, Goodman S, Reiff A, Jung L, Jarosova K, et al, for the Paediatric Rheumatology International Trials Organisation and the Pediatric Rheumatology Collaborative Study Group. Adalimumab with or without methotrexate in juvenile rheumatoid arthritis. N Engl J Med 2008; 359: 810–20.
  5. Ruperto N, Lovell DJ, Quartier P, Paz E, Rubio-Perez N, Silva CA, et al, for the Paediatric Rheumatology International Trials Organisation and the Pediatric Rheumatology Collaborative Study Group. Abatacept in children with juvenile idiopathic arthritis: a randomised, double-blind, placebo-controlled withdrawal trial. Lancet 2008; 372: 383–91.
  6. Raine R, Sanderson C, Hutchings A, Carter S, Larkin K, Black N. An experimental study of determinants of group judgments in clinical guideline development. Lancet 2004; 364: 429–37.
  7. Bowles N. The Delphi technique. Nurs Stand 1999; 13: 32–6.
  8. Horton JN. Nominal group technique: a method of decision-making by committee. Anaesthesia 1980; 35: 811–4.
  9. Wallace CA, Ruperto N, Giannini E. Preliminary criteria for clinical remission for select categories of juvenile idiopathic arthritis. J Rheumatol 2004; 31: 2290–4.
  10. Boers M, Brooks P, Strand CV, Tugwell P. The OMERACT filter for outcome measures in rheumatology [editorial]. J Rheumatol 1998; 25: 198–9.
  11. Wallace CA, Huang B, Bandeira M, Ravelli A, Giannini EH. Patterns of clinical remission in select categories of juvenile idiopathic arthritis. Arthritis Rheum 2005; 52: 3554–62.
  12. Wallace CA, Ravelli A, Huang B, Giannini EH. Preliminary validation of clinical remission criteria using the OMERACT filter for select categories of juvenile idiopathic arthritis. J Rheumatol 2006; 33: 789–95.
  13. Oen K. Long-term outcomes and predictors of outcomes for patients with juvenile idiopathic arthritis. Best Pract Res Clin Rheumatol 2002; 16: 347–60.
  14. Fantini F, Gerloni V, Gattinara M, Cimaz R, Arnoldi C, Lupi E. Remission in juvenile chronic arthritis: a cohort study of 683 consecutive cases with a mean 10 year followup. J Rheumatol 2003; 30: 579–84.
  15. Flato B, Lien G, Smerdel A, Vinje O, Dale K, Johnston V, et al. Prognostic factors in juvenile rheumatoid arthritis: a case-control study revealing early predictors and outcome after 14.9 years. J Rheumatol 2003; 30: 386–93.
  16. Knowlton N, Jiang K, Frank MB, Aggarwal A, Wallace C, McKee R, et al. The meaning of clinical remission in polyarticular juvenile idiopathic arthritis: gene expression profiling in peripheral blood mononuclear cells identifies distinct disease states. Arthritis Rheum 2009; 60: 892–900.
  17. Classification and Response Criteria Subcommittee of the American College of Rheumatology Committee on Quality Measures. Development of classification and response criteria for rheumatic diseases [editorial]. Arthritis Rheum 2006; 55: 348–52.
  18. Bombardier C, Tugwell P. A methodological framework to develop and select indices for clinical trials: statistical and judgmental approaches. J Rheumatol 1982; 9: 753–7.
  19. Felson DT, Anderson JJ, Boers M, Bombardier C, Furst D, Goldsmith C, et al. American College of Rheumatology preliminary definition of improvement in rheumatoid arthritis. Arthritis Rheum 1995; 38: 727–35.
  20. Magni-Manzoni S, Ruperto N, Pistorio A, Sala E, Solari N, Palmisani E, et al. Development and validation of a preliminary definition of minimal disease activity in patients with juvenile idiopathic arthritis. Arthritis Rheum 2008; 59: 1120–7.
  21. Ruperto N, Ravelli A, Oliveira S, Alessio M, Mihaylova D, Pasic S, et al, for the Pediatric Rheumatology International Trials Organization (PRINTO) and the Pediatric Rheumatology Collaborative Study Group (PRCSG). The Pediatric Rheumatology International Trials Organization/American College of Rheumatology provisional criteria for the evaluation of response to therapy in juvenile systemic lupus erythematosus: prospective validation of the definition of improvement. Arthritis Rheum 2006; 55: 355–63.
  22. Giannini EH, Ruperto N, Ravelli A, Lovell DJ, Felson DT, Martini A. Preliminary definition of improvement in juvenile arthritis. Arthritis Rheum 1997; 40: 1202–9.
  23. Rider LG, Giannini EH, Brunner HI, Ruperto N, James-Newton L, Reed AM, et al, for the International Myositis Assessment and Clinical Studies Group. International consensus on preliminary definitions of improvement in adult and juvenile myositis. Arthritis Rheum 2004; 50: 2281–90.
  24. Khanna D, Lovell DJ, Giannini E, Clements PJ, Merkel PA, Seibold JR, et al. Development of a provisional core set of response measures for clinical trials of systemic sclerosis. Ann Rheum Dis 2008; 67: 703–9.
  25. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull 1971; 76: 378–82.
  26. Davies EH, Surtees R, DeVile C, Schoon I, Vellodi A. A severity scoring tool to assess the neurological features of neuronopathic Gaucher disease. J Inherit Metab Dis 2007; 30: 768–82.
  27. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33: 159–74.
  28. Jabs DA, Nussenblatt RB, Rosenbaum JT. Standardization of uveitis nomenclature for reporting clinical data: results of the First International Workshop. Am J Ophthalmol 2005; 140: 509–16.
  29. Cassidy JT, Levinson JE, Brewer EJ Jr. The development of classification criteria for children with juvenile rheumatoid arthritis. Bull Rheum Dis 1989; 38: 1–7.
  30. Van Tuyl LH, Vlad SC, Felson DT, Wells G, Boers M. Defining remission in rheumatoid arthritis: results of an initial American College of Rheumatology/European League Against Rheumatism consensus conference. Arthritis Rheum 2009; 61: 704–10.
  31. Felson DT, Smolen JS, Wells G, Zhang B, van Tuyl LH, Funovits J, et al. American College of Rheumatology/European League Against Rheumatism provisional definition of remission in rheumatoid arthritis for clinical trials. Arthritis Rheum 2011; 63: 573–86.
  32. Foell D, Wulffraat N, Wedderburn LR, Wittkowski H, Frosch M, Gerss J, et al, for the Paediatric Rheumatology International Trials Organization (PRINTO). Methotrexate withdrawal at 6 vs 12 months in juvenile idiopathic arthritis in remission: a randomized clinical trial. JAMA 2010; 303: 1266–73.