Juvenile idiopathic arthritis (JIA) is the most common rheumatic disease of childhood, affecting close to 300,000 children in the US. JIA has a broad impact on a child's physical and mental health. The heterogeneity of disease severity, the broad age range of affected individuals, and fluctuations in disease course complicate measuring disease activity and treatment effects in children with JIA. Developing instruments that accurately assess the effect of JIA on health and well-being is critical to assessing the overall impact of the disease and quantifying the impact of treatments.
Many instruments are used to assess severity of disease, disability, and quality of life in JIA. To inform clinicians, patients, and families about the current evidence regarding the management of JIA with disease-modifying antirheumatic drugs, and to help researchers identify critical gaps in knowledge, the Agency for Healthcare Research and Quality commissioned the Duke Evidence-based Practice Center to conduct a comparative effectiveness review (CER) (1), which included a key question regarding the psychometric properties of the most commonly utilized outcome measures in JIA. In the present report, we describe our findings for that key question, which was stated as follows: “What is the validity, reliability, responsiveness, and feasibility of the clinical outcome measures for childhood JIA that are commonly used in clinical trials or within the clinical practice setting?”
MATERIALS AND METHODS
We developed and followed a standard protocol for all steps of this review. The key question and the methods used were developed with input from a technical expert panel. Details are available in the full CER (1).
Search strategy and identification of relevant studies.
We conducted a comprehensive search of Medline (1966 to December 2010) and Embase (1947 to December 2010) using medical subject heading terms and key words for JIA and its older designations (i.e., juvenile rheumatoid arthritis [JRA]) and the names of the common instruments used to assess outcomes of treatment. We limited our search to English-language articles of studies in humans and identified prospective clinical studies and cross-sectional studies relevant to our question. We also manually reviewed the references from review articles and articles meeting our selection criteria for additional pertinent studies.
Two independent reviewers reviewed all abstracts, with subsequent full-text review and study selection based on predetermined selection criteria. Differences were resolved by consensus. We included peer-reviewed, English-language articles of studies that had a sample population of individuals ages 18 years or younger with JIA according to the current American College of Rheumatology (ACR) definition (2), as well as past designations such as JRA and juvenile chronic arthritis. After compiling a list of outcome measures used in the studies identified in our initial search, we prioritized and selected the measures for detailed review, using input from the technical expert panel and our assessment of measures already commonly used and those of growing relevance. We chose to focus on studies in which the instrument's psychometric characteristics were examined specifically for children with JIA (Table 1). In addition, we described several composite measures and definitions of disease state of growing importance, although they lacked full psychometric evaluations, such as the ACR 30% improvement criteria for JRA (ACR Pedi 30) (3). While the recently developed composite measures aim to more broadly reflect overall health status and disability, we did not identify studies of psychometric characteristics for the composite measures. We therefore included those measures that are key components of the composite measures, including physician global, parent/patient global, joint counts, functional ability, and quality of life measures.
Table 1. Outcome measures*
|Measures of disease activity|| || || || || || || |
|Active joint count||Full 71-joint examination||Active arthritis||Active, inactive||0–71†||Health professional||Joint count summed||Reduced joint count measures exist|
|Physician global assessment||1 item||Active disease||Most commonly 100-mm VAS||0–100†||Health professional||Measure distance from 0 anchor|| |
|Parent/patient global assessment||1 item||VAS or categorical, overall well being||Most commonly 100-mm VAS||0–100†||Self-administered||Value of VAS, no calculation||Assesses disease activity, functional status, and quality of life|
|Measures of functional status|| || || || || || || |
|C-HAQ||C-HAQ DI: 30 items; VAS: pain, overall well-being||Physical function (covering 8 domains), pain, overall well-being||0–3 and NA, 0 = no difficulty, 3 = inability to perform||Physical function: 0–3; VAS: 0–100 mm†||Self-administered, parent or patient||5 minutes to complete, highest score in each domain = score for domain, 2 minutes to score||Adapted from Stanford Health Assessment Questionnaire|
|Measures of health-related quality of life|| || || || || || || |
|CHQ||Parent form: 50 or 28 items; child form: 87 items||Physical health, pain, mental health, school, social, family||0–100, 0 = poor well-being, 100 = excellent well-being||0–100‡||Self-administered, children self-administer after age 10 years||Apply scoring formula as per manual|| |
|PedsQL||23 items||Physical, emotional, social, school functioning||5-point Likert scale (never to always)||0–100§||Self-administered||Together (generic and RM) takes 10–15 minutes|| |
|PedsQL-RM||22 items||Pain and hurt, daily activities, treatment, worry, communication||5-point Likert scale (never to always)||0–100§||Self-administered|| || |
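The C-HAQ scoring rule listed in Table 1 (the highest item score within a domain becomes the domain score) can be illustrated with a short sketch. Averaging the domain scores into a single 0–3 index is an assumption based on the usual HAQ-style convention and is not stated in the table; the adjustment for aids and devices is omitted, and the domain names and function name are hypothetical.

```python
from statistics import mean


def chaq_disability_index(domain_items):
    """Sketch of a C-HAQ-style disability index.

    domain_items maps a domain name to its item scores, each 0-3,
    with None standing for "not applicable".
    """
    domain_scores = []
    for name, items in domain_items.items():
        answered = [score for score in items if score is not None]
        if not answered:
            continue  # no applicable items: the domain is left out
        if any(score not in (0, 1, 2, 3) for score in answered):
            raise ValueError(f"item scores in {name!r} must be 0-3")
        # Table 1: highest score in each domain = score for the domain.
        domain_scores.append(max(answered))
    if not domain_scores:
        raise ValueError("no scorable domains")
    # Assumption: the index is the mean of the scored domains (range 0-3).
    return mean(domain_scores)


example = {
    "dressing": [0, 1, None],
    "walking": [2, 0],
    "hygiene": [None, None],
    "eating": [0, 0],
}
index = chaq_disability_index(example)
```

Because each domain contributes only its worst item, a single difficult task fixes the domain score, which is one reason small functional changes may not move the index.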
Psychometric properties evaluated.
Reliability addresses the consistency of the instrument in measuring the construct of interest. We examined 3 areas of reliability: reproducibility, interrater reliability, and internal consistency. Instruments with greater reproducibility and interrater reliability may be more feasible to use in clinical trials and require smaller sample sizes to detect clinically important differences between treatment groups. Internal consistency assesses whether the items purported to measure the same general construct actually produce similar results. Cronbach's alpha is usually interpreted as a measure of internal consistency, with a range from 0 (i.e., no internal consistency) to 1 (i.e., completely internally consistent). Recent research has challenged this interpretation (4). However, Cronbach's alpha is the most commonly reported method for measuring internal consistency for the measures of interest. Validity refers to how well an instrument measures what it claims to measure. Since many of the constructs assessed by the clinical outcome measures have no reference standard, we evaluated construct validity based on how well the measures correlated with other indicators of disease, including global assessments, articular counts, and scores from other validated instruments.
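For readers less familiar with the statistic, the following is a minimal sketch of Cronbach's alpha computed from item-level responses, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The data layout (one row per respondent, one column per item) and the function name are assumptions for illustration; the included studies may have used other software or handled missing items differently.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(responses) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least 2 items and equal-length rows")
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of the summed score across respondents.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 4 respondents, 3 items scored 0-3.
alpha = cronbach_alpha([[0, 1, 1], [1, 2, 1], [2, 2, 3], [3, 3, 3]])
```

With these made-up responses alpha is about 0.93, which would fall within the range reported for most C-HAQ domains.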
Responsiveness is determined by 2 properties: reproducibility and the ability to register changes in scores when a patient's symptom status shows clinically important improvement or deterioration. The effect size is a measure of responsiveness that uses the mean change score in the numerator and a measure of variability in the denominator. Responsiveness is often reported on a continuous scale, even when the scales in question are ordinal. Although this is a common practice, some researchers have cautioned that the resulting effect sizes (5) and minimum clinically important differences (MCIDs) (6) may be inflated when calculated on ordinal scales, and it is important to be aware of this when reviewing the results.
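The two responsiveness statistics that recur in this report differ only in the denominator. The sketch below assumes the common conventions that the effect size divides the mean change by the SD of baseline scores and that the standardized response mean divides it by the SD of the change scores; individual studies may have used other variability measures, and the function names and data are hypothetical.

```python
from statistics import mean, stdev


def effect_size(baseline, followup):
    """Mean change divided by the SD of baseline scores (assumed convention)."""
    changes = [after - before for before, after in zip(baseline, followup)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, followup):
    """Mean change divided by the SD of the change scores."""
    changes = [after - before for before, after in zip(baseline, followup)]
    return mean(changes) / stdev(changes)


# Hypothetical disability scores for 4 patients before and after treatment.
before = [1.0, 2.0, 3.0, 2.0]
after = [0.5, 1.0, 2.5, 1.0]
es = effect_size(before, after)
srm = standardized_response_mean(before, after)
```

On a scale where lower scores mean less disability, improvement gives negative values; studies often report the magnitude only. Both functions treat ordinal scores as continuous, which is the practice that has been questioned (5, 6).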
Data extraction and quality assessment.
Data extraction was completed by pairs of reviewers. One reviewer performed the initial abstraction, while the second read over each article to ensure accuracy and completeness. We extracted data regarding inter- and intrarater reliability, test–retest reliability, responsiveness (standardized response mean [SRM] and responsiveness index), time needed to administer, and construct validity for our selected outcome measures.
To assess study quality, we adapted pertinent criteria from the Quality Assessment of Diagnostic Accuracy Studies tool, a validated measure designed to assess the quality of diagnostic test studies (7). We evaluated the selection of study participants, independent and blind comparison of the study instrument to other outcome measures, and the appropriateness of the analytical approach.
RESULTS
Our initial search included broad search terms for studies pertaining to the treatment of JIA as well as studies of the outcome measures used to assess treatment response. We identified a total of 4,815 potentially relevant citations, of which 35 were subsequently determined to meet eligibility criteria for the key question considered in this report. Figure 1 shows the flow of literature through the selection process. The 35 publications identified described 34 unique studies involving 14,831 patients that investigated the psychometrics of the selected outcome measures or developing definitions of treatment response. Among these were 14 studies that evaluated reliability (8–21), 21 studies that evaluated validity (8, 11–13, 15–18, 21–33), and 9 that evaluated responsiveness (9, 10, 17, 26, 32, 34–37) of the selected outcome measures. Of these measures, the Childhood Health Assessment Questionnaire (C-HAQ) was the most extensively studied measure, with 23 studies (8–16, 20, 22–24, 26, 28, 30–37). The overall quality of the studies was fair, with few studies commenting on blinding, and only 1 (9) reporting sample size calculations. Results were reported as median values as well as the ranges of values from the different studies.
We identified 10 studies examining various aspects of reliability for the C-HAQ (8–16, 20); 2 studies each for the physician global assessment of disease activity (PGA), parent/patient global assessment of well being (PGW) (19, 21), and Pediatric Quality of Life Inventory (PedsQL) (18, 20); and 1 for the Child Health Questionnaire (CHQ) (17). Reproducibility, also called test–retest reliability, was assessed for the C-HAQ in 5 studies, all of which demonstrated high correlation between administrations (correlation coefficient range 0.79–0.96) (8, 11–14). The reproducibility of the PedsQL, CHQ, joint counts, PGA, and PGW has not been studied specifically in JIA populations.
Interrater reliability was most commonly explored to determine the correlation between parent and patient scores. Interrater reliability was measured for the C-HAQ, CHQ, and PedsQL, all of which demonstrated a moderate to strong correlation between parent and child when assessing functional status or disability (C-HAQ: 0.54–0.84 [9, 10, 13, 20], CHQ physical score [PhS]: 0.69–0.87 [17], PedsQL: 0.46–0.8, and PedsQL Rheumatology Module [PedsQL-RM]: 0.3–0.90 [18, 20]). The correlation between parent and child was lower for the psychosocial domain, including the PedsQL-RM worry domain (correlation coefficient 0.3) (18) and the CHQ psychosocial score (PsS; correlation coefficient range 0.38–0.53) (17).
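As an illustration of how parent-child agreement of this kind can be quantified, the sketch below computes a Pearson correlation between paired parent and child scores. The choice of Pearson's r is an assumption: the type of coefficient varied across the included studies and is not specified here, and intraclass or rank correlations are common alternatives. The scores and function name are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both lists")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired disability scores for 5 families.
parent_scores = [0.5, 1.0, 1.5, 2.0, 0.25]
child_scores = [0.25, 1.0, 1.75, 1.5, 0.5]
r = pearson_r(parent_scores, child_scores)
```

A correlation describes how closely the two raters rank patients, not whether they give the same absolute scores, so a high r can coexist with a systematic difference between parent and child ratings.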
Internal consistency, assessed most commonly using Cronbach's alpha, was evaluated in 4 studies for the C-HAQ, with all showing high internal consistency (Cronbach's alpha 0.88–0.94 for all domains except the “arising” domain [0.69]) (12, 13, 15, 16). In addition, shorter versions of the C-HAQ disability index (DI) were found to have high internal consistency, with Cronbach's alpha of 0.93 for both the 29-item and the 18-item instruments (15).
The C-HAQ was also evaluated for unidimensionality in 4 studies. Item response theory was used to examine unidimensionality in 2 studies, 1 of which showed a misfit rate of 13% with only 1 problematic area (hygiene), while the other study found 4 items that did not fit the model (16, 31). Factor analysis was used to confirm the unidimensional character of the C-HAQ and C-HAQ DI in 2 additional studies, with the full C-HAQ reported as having a very high fit of the model to a single dimension (functional disability) with P < 0.0001 (12). The analysis of the C-HAQ DI demonstrated 2 principal components: an upper extremity functional component and a lower extremity functional component.
Of the 21 articles that met our inclusion criteria, 17 explored validation of the C-HAQ (8, 11–13, 15, 16, 21–24, 26, 28–33), 4 explored validation of the CHQ (17, 25, 26, 29), and 2 explored validation of the PGA and PGW (21, 27). In addition, 1 study focused on the correlation of the PedsQL and PedsQL-RM with pain assessments (18).
Results are summarized in Table 2. The C-HAQ was most strongly correlated with the PGW, with a median correlation of 0.54 (range 0.44–0.7, 6 studies [12, 21, 24, 26, 30, 32]). Of the articular measures of disease, both the active joint count (AJC) and the joints with limited range of motion (LROM) demonstrated moderate correlations with the C-HAQ, with a median correlation of 0.45 (range 0.14–0.67, 9 studies [12, 13, 16, 21, 23, 26, 30–32]) and 0.49 (range 0.3–0.76, 7 studies [8, 22, 24, 26, 29–31]), respectively. There was considerable variability in these correlations, with the most significant variations among children categorized by disease duration. Palmisani et al reported that the C-HAQ correlated less well with AJC for children early in the course of disease than for children later in the course of disease (0.14 and 0.61, respectively) (30). Those with late disease had a strong correlation with LROM (0.76), but lower correlations with PGA (0.51) (30). Modified forms of the C-HAQ, including reduced-item and digital versions, have been validated as well, although the correlation with articular measures was slightly less than for the original C-HAQ (values of 0.34–0.59) (8, 15, 28).
Table 2. Validity: correlations of instruments with measures of diseases and other instruments*
|C-HAQ (8, 11–13, 15, 16, 21–24, 26, 28–33)||0.45 (0.2–0.67), 9 studies||0.54 (0.44–0.7), 6 studies||0.45 (0.14–0.67), 9 studies||0.49 (0.33–0.76), 6 studies||0.40 (0.22–0.65), 4 studies||PedsQL: −0.62; PedsQL-RM: −0.63; CHQ PhS: −0.63 and 0.58, 2 studies; CHQ PsS: −0.25; Steinbrocker functional class: 0.77; Disease Activity Index: 0.60; ACR functional class: 0.64; digital C-HAQ: 0.97|
|CHQ (17, 26, 29)||CHQ PhS: −0.54 (−0.52 to −0.56), 2 studies; CHQ PsS: −0.048||CHQ PhS: −0.64 (−0.63 to −0.65), 2 studies; CHQ PsS: −0.315||CHQ PhS: −0.39 (−0.36 to −0.42), 2 studies; CHQ PsS: −0.024|| || ||C-HAQ PhS: −0.54 (−0.50 to −0.57), 2 studies; C-HAQ PsS: −0.25 (−0.22 to −0.28), 2 studies|
|PGA (21, 27)||–||0.54||0.62 (0.47–0.77), 2 studies||0.49 (0.4–0.58), 2 studies||0.64 (0.51–0.76), 2 studies||0.39; CHQ PhS: −0.53; CHQ PsS: −0.13|
|PGW (21, 27)||0.54||–||0.45 (0.40–0.49), 2 studies||0.43 (0.38–0.48), 2 studies||0.43 (0.42–0.43), 2 studies||0.53; CHQ PhS: −0.7; CHQ PsS: −0.29|
While there were no strong correlations between indicators of disease activity and the C-HAQ, there were moderate correlations with measures of quality of life, including the PedsQL (−0.62) and the PedsQL-RM (−0.63) (24). Of interest, while there were moderate correlations between the C-HAQ and the CHQ PhS (−0.58), there was poor correlation with the CHQ PsS (−0.25) (17). The 2 studies reporting on validity of the CHQ found consistently higher correlations for the physical component than for the psychosocial component across all measures, from PGA and PGW to articular indices and functional status (17, 29). While the CHQ was found to differentiate healthy children from those with JIA, we did not find any results indicating discriminant validity, i.e., the ability to accurately classify children with JIA by the extent of their disease (25). While the PedsQL and PedsQL-RM have been studied in general pediatric rheumatology populations, only 1 study focused on children with JIA. That study found that child-reported pain assessments correlated with all subscales of the PedsQL and PedsQL-RM, and that parent pain assessments correlated with 3 of 4 subscales for both instruments (18).
The SRM (38) calculates an effect size that incorporates information about the response variance into the denominator. According to Cohen (39), an effect size of 0.2–0.3 is considered a small effect, ∼0.5 (0.4–0.7) a medium effect size, and ≥0.8 a large effect size.
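The bands attributed to Cohen above can be applied mechanically, as in the sketch below. The quoted ranges leave gaps (0.3–0.4 and 0.7–0.8) and say nothing about values below 0.2; how those are handled here, the "negligible" label, and the function name are assumptions made only so that every value receives a label. The sign is ignored because it depends on the direction of the scale.

```python
def cohen_label(effect):
    """Label an effect size or SRM using the bands quoted in the text."""
    magnitude = abs(effect)  # the sign depends on the scale's direction
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.4:  # assumption: 0.7-0.8 is folded into "medium"
        return "medium"
    if magnitude >= 0.2:  # assumption: 0.3-0.4 is folded into "small"
        return "small"
    return "negligible"  # hypothetical label for values below 0.2


# Values taken from this report: PGA effect size, C-HAQ median effect size,
# and CHQ PhS effect size.
labels = {value: cohen_label(value) for value in (1.59, 0.24, 0.18)}
```

Applied to the results reported here, the PGA effect size of 1.59 is labeled large, the C-HAQ median effect size of 0.24 small, and the CHQ PhS effect size of 0.18 falls below the small band.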
Responsiveness was assessed in 9 studies (9, 10, 17, 26, 32, 34–37). Results are summarized in Table 3. The responsiveness of the C-HAQ was assessed in 6 studies (9, 26, 32, 34–36). The results of the 6 studies were quite variable, with effect sizes ranging from 0 to 0.5. The 2 studies evaluating responsiveness in oligoarticular populations (35, 36) found that the C-HAQ was less responsive in patients with oligoarticular disease, with SRMs ranging from 0 to 0.25, compared to studies of polyarticular disease, where SRMs ranged from 0.48 to 0.6 (9, 26, 32, 34). This difference in responsiveness by disease category was seen even when the same definition of improvement was used (32, 36).
Table 3. Responsiveness*
|C-HAQ (9, 26, 32, 34–36)||Responders, median 0.60 (range 0.39–0.8); non-responders, median 0.08 (range 0.01–0.15)||Median 0.24 (range 0–0.5)||0.56 (95% CI 0.41–0.71)|
|PGA and PGW (34–36)||PGA, median 0.9 (range 0.82–2.07); PGW, median 0.5 (range 0.3–0.8); PGA, mean ± SD change 5.4 ± 2.6; PGW, mean ± SD change 1.5 ± 2.0||PGA, median 1.59 (95% CI 1.0–2.32); PGW, median 0.5 (range 0.33–0.97)||PGA, 0.86 (95% CI 0.72–0.95); PGW, 0.63 (95% CI 0.46–0.78)|
|Joint counts (36)||No. swollen joints 0.7; no. active joints 1.3||No. swollen joints 1.3; no. active joints 0.7|| |
|CHQ (17, 35)||CHQ PhS 0.19; CHQ PsS 0.28; CHQ overall 0.23||CHQ PhS 0.18; CHQ PsS 0.23||CHQ PhS, 0.67 (95% CI 0.5–0.81); CHQ PsS, 0.71 (95% CI 0.54–0.85)|
Three studies reported on the responsiveness of the global assessment measures and joint count indices. The most responsive measure was the PGA, with a large effect size of 1.59 (95% confidence interval [95% CI] 1.0–2.32) (34–36). However, in 2 of these studies, the patients' initial designation of improved or not improved was based on the physician's assessment, either as a categorical assessment on a 5-point scale for the first study (35), or by a definition of flare based on the addition or escalation of therapy in the second (34). Swollen joint count and AJC were also found to have moderate to high responsiveness (effect sizes 1.3 and 0.7, respectively) and may be appropriate alternative measures (36).
The responsiveness of the CHQ was evaluated in 2 studies, both of which demonstrated poor overall responsiveness, with an SRM of 0.23 and an effect size of 0.18–0.23 (17, 35). However, in the study that reported responsiveness separately based on disease state, the responsiveness was high in those designated as improved, at 0.96, indicating that the CHQ is sensitive to improvement, but the SRM was lower (−0.60) in those with worsening disease (17).
The MCID was evaluated for the C-HAQ in 2 studies. The MCID helps clinicians interpret study results by estimating the amount of change on an instrument that is associated with a clinically meaningful change in the patient's status. The first study explored the question of minimal clinically important change using a theoretical scenario and found a mean MCID for improvement of −0.13 in the C-HAQ, and 0.75 for worsening (10). The second study evaluated MCID in a JIA population and found that results differed by which external standard of disease was used (patient, parent, or physician assessment of disease). The mean MCID for improvement was −0.188 to 0 compared to child ratings, and 0 for parent and physician ratings (37). The authors concluded that changes in a patient's condition did not correlate well with the C-HAQ, and therefore that the C-HAQ is unlikely to be a useful tool when making short-term medical decisions.
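One common way to estimate an MCID against an external standard is to take the mean change in the instrument among patients whom the external rater judged to have changed only slightly. The sketch below illustrates that anchor-based idea; it is an assumption about the general approach and not a description of the computations used in the 2 studies, and the rating labels, data, and function name are hypothetical.

```python
from statistics import mean


def anchor_based_mcid(changes, anchor_ratings, target):
    """Mean score change among patients whose anchor rating equals target."""
    selected = [
        change
        for change, rating in zip(changes, anchor_ratings)
        if rating == target
    ]
    if not selected:
        raise ValueError(f"no patients rated {target!r}")
    return mean(selected)


# Hypothetical changes in a 0-3 disability score and the matching
# external ratings (for example, from the child, parent, or physician).
score_changes = [-0.25, -0.125, 0.0, 0.5, -0.125]
ratings = [
    "slightly better",
    "slightly better",
    "unchanged",
    "slightly worse",
    "slightly better",
]
mcid_improvement = anchor_based_mcid(score_changes, ratings, "slightly better")
mcid_worsening = anchor_based_mcid(score_changes, ratings, "slightly worse")
```

Because the estimate depends entirely on who supplies the anchor rating, repeating the calculation with child, parent, and physician ratings can give different values, as reported above.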
The ability of the various outcome measures to differentiate those who improved from those who did not was assessed using receiver operating characteristic (ROC) curves. The most discriminating of the instruments we examined was the PGA, with an area under the ROC curve of 0.86 (95% CI 0.72–0.95), compared to the PGW value of 0.63 (95% CI 0.46–0.78) and the C-HAQ value of 0.56 (95% CI 0.41–0.71) (35). A summary of the evidence for the measures assessed is provided in Table 4.
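The area under the ROC curve has a direct interpretation: it is the probability that a randomly chosen patient who improved shows a larger change score than a randomly chosen patient who did not, with ties counted as one-half. The sketch below computes it from that pairwise definition; the change scores and function name are hypothetical, and the confidence intervals quoted above would require an additional method not shown here.

```python
def roc_auc(improved_scores, not_improved_scores):
    """Area under the ROC curve from its pairwise (rank) definition."""
    if not improved_scores or not not_improved_scores:
        raise ValueError("both groups must contain at least one score")
    favorable = 0.0
    for improved in improved_scores:
        for not_improved in not_improved_scores:
            if improved > not_improved:
                favorable += 1.0
            elif improved == not_improved:
                favorable += 0.5  # ties count as half
    return favorable / (len(improved_scores) * len(not_improved_scores))


# Hypothetical magnitudes of change on an instrument for each group.
auc = roc_auc([2, 3, 5], [1, 3, 4])
```

A value of 0.5 means the instrument separates the two groups no better than chance, which puts the reported C-HAQ value of 0.56 in context.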
Table 4. Evidence summary table*
|Active joint count||12 (8,064)||Shows high responsiveness and moderate correlation with other measures of disease activity and functional status, but poor correlation with psychosocial aspects of quality of life. Lack of interrater reliability data.|
|PGA||12 (8,668)||Moderate correlations with measures of disease activity, C-HAQ, and quality of life measures. Responsiveness difficult to measure, as often compared to other physician measures of disease activity. No data on interrater reliability between providers.|
|PGW||8 (8,182)||Moderate correlations with other measures of disease activity, C-HAQ, and physical aspects of the quality of life measures, but poor correlation with CHQ psychosocial aspects. Moderate responsiveness and discriminative ability.|
|C-HAQ||23 (13,374)||Most commonly reported outcome measure with strong reliability, including moderate to strong interrater reliability between parent and child. Moderate correlations with other measures of disease activity, but poor responsiveness, which varies depending on how extensive the arthritis is at baseline (ceiling effect).|
|CHQ||5 (4,687)||Limited data for JIA population. Moderate to strong parent to child interrater reliability for physical components, but lower for psychosocial aspects. Similarly, moderate correlations with measures of disease activity and C-HAQ for physical component of CHQ, but poor for the psychosocial domains. Poor responsiveness.|
|PedsQL/PedsQL-RM||2 (173)||Insufficient data in JIA populations to evaluate fully. Moderate to strong parent to child interrater reliability for physical components, but lower for psychosocial aspects.|
Composite definitions of disease status or response to therapy.
Because JIA is a complex disorder, several composite definitions have been developed to categorize disease status or response to therapy. While these definitions are in various stages of validation and lack the full psychometric evaluations needed to be included above, we recognize their growing importance as a means of capturing overall health status. The most commonly used of these definitions is the ACR Pedi 30. Developed to assess response to therapy in clinical trials, it is composed of a core set of 6 variables, 5 of which were individually evaluated in our review, including PGA, PGW, C-HAQ, number of active joints, and joints with LROM (3). A recently developed composite score, the Juvenile Arthritis Disease Activity Score, was designed to better characterize the absolute level of disease activity in JIA patients and consists of 4 measures: PGA, PGW, number of joints with active arthritis, and erythrocyte sedimentation rate. Initial validation studies have been performed (40).
Several definitions deserve mention as well, including the consensus-based definition of remission (including inactive disease, remission on medications, and remission off medications) (41, 42). While these definitions have been applied retrospectively to JIA populations, further validation studies are reportedly underway. A preliminary definition of flare has also been described. This definition was derived from a cohort of patients with polyarticular JIA using the ACR Pedi 30 (43). The success of recent innovative therapies for treating JIA has shifted the goal of treatment from minimizing disease activity to achieving remission. The various definitions of remission, as well as flare, serve to clarify and standardize the terminology used and improve our ability to determine treatment responses and comparative effectiveness.
DISCUSSION
Our results indicate that no single instrument or outcome measure appears superior in describing the various aspects of JIA with high reliability, validity, and responsiveness. While composite measures are commonly used in JIA trials today, the lack of psychometric evaluations of these composite measures prevented their inclusion. We therefore examined the individual measures most commonly used in JIA trials, including 5 of the 6 measures that make up the ACR Pedi 30.
The C-HAQ was the most extensively evaluated instrument of the priority measures we considered. While it demonstrated high reproducibility and internal consistency, it had only moderate correlations with indices of disease activity and quality of life, and poor to moderate responsiveness. The C-HAQ is sensitive to the degree of disability at baseline, with higher responsiveness for patients with initially worse functional impairment, and therefore may have different utility in the various categories of JIA. Furthermore, the C-HAQ is a measure of disability and not disease activity. For JIA, the C-HAQ may fail to capture the full spectrum of physical function impairments, and some of the limitations detected in the C-HAQ reflect joint damage rather than disease activity. Both of these factors likely contribute to the poor responsiveness noted for the C-HAQ and limit its usefulness in clinical trials. Therefore, although the C-HAQ is a familiar and validated measure, our findings suggest the need for a better functional outcome measure that is responsive to change across the full spectrum of disease severity.
In general, across the measures studied, reliability was moderate to high for measures of physical function but poor to moderate for psychosocial domains. Similar findings were noted in validity and responsiveness, where measures of psychosocial function and quality of life showed less correlation with disease activity indices and less responsiveness compared to the physical aspects of JIA. The reasons for this discrepancy are likely multifactorial, although having to live with a chronic disease and taking medications that may result in nausea or may require painful injections likely negatively impact measures of psychosocial function and quality of life regardless of improvements in disease activity. These findings are important to consider when discussing risks and benefits of altering treatments since patients may have different tradeoffs based on the psychosocial aspects of the disease, which can impact treatment choices.
The psychometric methods shown in this study reflect those reported in the individual studies. It is important to note there may be limitations based on the methods chosen. In particular, more recently developed psychometric methods, such as Rasch analysis, were reported infrequently. These approaches have the potential to add to classic methods in the assessment of scale properties and ultimately in the interpretation of clinical trial results. Furthermore, the common practice of reporting responsiveness on a continuous scale, even when the scales in question are ordinal, may potentially inflate the interpretation of effect sizes and MCIDs (5, 6). Assessing the full impact of JIA is complicated by the heterogeneity in disease severity, the broad effects on both physical and psychosocial health, and the potential for both chronic and acute limitations. Efforts to develop a standardized composite measure that incorporates articular indices, severity, and a broader assessment of functional limitations and psychosocial impact would be useful to better discriminate levels of disease activity, overall impact of disease, and responsiveness to therapy. Consistent use of such outcome measures would facilitate comparative effectiveness research. However, developing one instrument that can serve as an accurate measure of all facets of disease, encompass the variable disease manifestations of the different categories of JIA, and be responsive enough to detect meaningful changes in disease status seems unlikely. The ACR Pedi 30 definition of improvement attempts to incorporate many of the clinically meaningful indices, but as our systematic review highlighted, the responsiveness of several of these measures, including functional status and PGW, is poor to moderate, and may not adequately reflect changes in disease state. 
Given that the amount of clinical change can vary depending on both the effectiveness of the intervention (e.g., nonsteroidal antiinflammatory drug versus biologic agent) and the disease severity (e.g., number of joints involved and severity of joint symptoms), responsiveness will need to be assessed across a broad spectrum of JIA severity and for treatments of varying effectiveness.
Knowing the performance characteristics of the outcome measures and standardizing the measures used in clinical trials is especially important for evaluating the comparative effectiveness of various treatments for JIA. While the selection of the most appropriate outcome measure may differ depending on the specific question being investigated (improvement in disease activity versus changes in quality of life), efforts should be made to standardize the measure chosen to evaluate a specific domain. To best assess treatment effects, the responsiveness of the instrument used is crucial. Therefore, focusing on the more responsive measures improves our ability to assess treatment effects and enhances our ability to detect promising new treatments. Reporting functional status and quality of life are also important, especially given that many of the current treatments require infusions or injection and have varying side effects that can negatively impact a child's quality of life. While these measures may be less responsive to changes than disease activity, they still provide valuable information. Examining the more responsive articular measures separately from the less responsive functional status and quality of life measures may actually improve our understanding of both the efficacy and effectiveness of treatment regimens, providing further insight into the complexities of living with JIA and its treatments.
AUTHOR CONTRIBUTIONS
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Van Mater had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Van Mater, Williams, Coeytaux, Sanders, Kemper.
Acquisition of data. Van Mater, Williams, Coeytaux, Sanders, Kemper.
Analysis and interpretation of data. Van Mater, Williams, Coeytaux, Kemper.