New interventions in inflammatory bowel disease (IBD) should be tested in the clinical trial setting, where validated multi-item measures of disease activity such as the Pediatric Crohn's Disease Activity Index (PCDAI)1–5 should be used to assess response.
During the last 20 years, the PCDAI has become the standard for measuring disease activity in pediatric Crohn's disease (CD), but it is not without limitations. First, the inclusion of laboratory results, perianal examination, and height velocity in the PCDAI reduces its feasibility, especially for retrospective review of patients' health records. Even in a prospectively collected “real-life” registry cohort, the PCDAI was scored in only 48% of eligible visits compared with 98% for the Pediatric Ulcerative Colitis Activity Index (PUCAI), which requires no laboratory values.6 A recent study found that the data needed to complete the PCDAI retrospectively were available in the charts of only 20% of 3643 clinical visits.7 Second, although the height item is undoubtedly a very important marker of disease activity in children, it is relevant only to young children who are still growing (until Tanner stages 2–3), and its calculation over many months reduces short-term responsiveness and discriminant validity. Acknowledging this, we recently determined that remission on the PCDAI should be defined as <10 points, or <7.5 points without the height item.5 Third, the inclusion of the perianal item is debated because it reflects a different concept than luminal disease activity. Finally, the PCDAI does not differentiate well between the moderate and severe ends of disease activity.3, 8
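The remission rule quoted above can be written as a small check. This is only an illustrative sketch: the function name and the `height_item_included` flag are hypothetical, and the two cutoffs (<10 points, or <7.5 points without the height item) are taken directly from the text.

```python
def pcdai_in_remission(score: float, height_item_included: bool = True) -> bool:
    """Return True if a PCDAI total is below the remission cutoff quoted in the text.

    Cutoffs: <10 points for the full index, <7.5 points when the height
    item is left out. The function and parameter names are hypothetical.
    """
    cutoff = 10.0 if height_item_included else 7.5
    return score < cutoff


# A total of 7.5 counts as remission on the full index,
# but not when the height item has been omitted.
full_index = pcdai_in_remission(7.5)
without_height = pcdai_in_remission(7.5, height_item_included=False)
```

Both cutoffs are strict inequalities, so a score exactly at the cutoff is not classified as remission.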
To address the poor feasibility of the PCDAI, two shorter versions of the index have been published, but with limited evaluation and validation. Two groups proposed an abbreviated PCDAI (abbrPCDAI), removing the height, extraintestinal manifestation, and three laboratory items (Appendix A1, Table 1) (see Supporting Information).9, 10 The remaining items of the abbrPCDAI were not reweighted. Recently, a larger study presented a short version of the PCDAI (shPCDAI), excluding items with a low frequency of completion in a patient registry (Appendix B1, Table 1) (see Supporting Information).7 The shPCDAI differs from the abbrPCDAI in that the extraintestinal manifestation item replaces the perianal item and that new weights have been mathematically assigned to each item by multivariate modeling, reflecting their relative importance to the physician global assessment (PGA) of disease activity.
Table 1. Studies of the Later PCDAI Versions

| Version | Study | Sample | Findings | Proposed cutoffs |
|---|---|---|---|---|
| Abbreviated PCDAI (abbrPCDAI) | Loonen 2003 (9) | n=71, data from previously reported prospective cohort (2) | AUC of ROC 0.93 to discriminate remission from active disease | Remission <10 points |
| | Shepanski 2004 (10) | n=40, prospective single center cohort (5-24 years) | abbrPCDAI to PCDAI r=0.85 | - |
| | | | abbrPCDAI to IMPACT r=-0.58 | |
| | Kappelman 2010 (7) | n=2815 visits of approximately 600 children from a registry | abbrPCDAI to PCDAI r=0.68 | - |
| | | | abbrPCDAI to PGA r=0.64 | |
| | | | AUC of ROC 0.82 to discriminate remission from active disease | |
| Short PCDAI (shPCDAI) | Kappelman 2010 (7) | n=4241 visits of approximately 900 children from a registry | shPCDAI to PCDAI r=0.66 | Remission <15 points |
| | | | shPCDAI to PGA r=0.60 | Mild 15-20 points |
| | | | AUC of ROC 0.80 to discriminate remission from active disease | Moderate-severe >20 points |
| Modified PCDAI (modPCDAI) | Leach 2010 (11) | n=100 visits of 62 children | modPCDAI to PCDAI r=0.66 | Remission <7.5 |
| | | | modPCDAI to PGA r=0.79 | Mild 7.5-10 |
| | | | modPCDAI to calprotectin r=0.48 | Moderate 12.5-17.5 |
| | | | | Severe >17.5 |
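As an illustration of how the cutoffs in Table 1 translate into activity categories, the sketch below maps a shPCDAI total to the bands listed for that version (remission <15, mild 15-20, moderate-severe >20 points). The function name is hypothetical; only the cutoffs come from the table.

```python
def classify_shpcdai(score: float) -> str:
    """Map a shPCDAI total to the activity band listed in Table 1.

    Bands: remission <15, mild 15-20, moderate-severe >20 points.
    The function name is hypothetical; the cutoffs are from the table.
    """
    if score < 15:
        return "remission"
    if score <= 20:
        return "mild"
    return "moderate-severe"


# Boundary scores fall into the band whose range includes them.
bands = [classify_shpcdai(s) for s in (10, 15, 20, 25)]
```

The same pattern would apply to the other versions, each with its own cutoffs from the table.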
A third version, a modified PCDAI (modPCDAI), was recently proposed by Leach et al11 based on the three laboratory items of the PCDAI (i.e., hematocrit, erythrocyte sedimentation rate [ESR], and albumin) with the addition of C-reactive protein (CRP) (Appendix C1, Table 1) (see Supporting Information). It aimed to overcome the ambiguity of the subjective and anthropometric components of the full index.
In the four aforementioned studies, only limited analyses of the clinimetric properties were performed. In addition, scarce data are available on the cutoff values that should be used to define remission, response, and other gradations of disease activity (Table 1). Finally, the PCDAI itself has never been subjected to multivariate mathematical weighting and item reduction, likely because the sample sizes of previous studies were insufficient for this purpose. We have previously shown that mathematical weighting of a disease activity index is superior to the judgmental approach.15
We therefore aimed to use the raw data from four prospectively collected datasets of pediatric CD to mathematically weight items in the PCDAI. We then systematically compared this mathematically weighted PCDAI (wPCDAI) with the original PCDAI, abbrPCDAI, shPCDAI, and modPCDAI with respect to feasibility, validity, and responsiveness as measures of disease activity in pediatric CD. Cutoffs that correspond to remission, response, and gradations of disease activity were determined for each version.
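Discriminant validity in the studies summarized in Table 1 is reported as the area under the ROC curve (AUC) for separating remission from active disease. As a minimal sketch of what that number measures, the AUC can be computed as the proportion of (active, remission) visit pairs in which the active visit has the higher index score, with ties counted as one half. The scores below are hypothetical and are not study data; the function name is also an assumption.

```python
def auc_from_scores(active_scores, remission_scores):
    """AUC as the share of (active, remission) pairs ranked correctly.

    A pair counts 1 when the active visit scores higher, 0.5 on a tie,
    and 0 otherwise. This equals the area under the ROC curve.
    """
    wins = 0.0
    for a in active_scores:
        for r in remission_scores:
            if a > r:
                wins += 1.0
            elif a == r:
                wins += 0.5
    return wins / (len(active_scores) * len(remission_scores))


# Hypothetical index scores, for illustration only.
active = [30, 25, 40]
remission = [5, 10, 25]
auc = auc_from_scores(active, remission)
```

An AUC of 1.0 would mean that every active visit scored higher than every remission visit, and 0.5 would mean the index discriminates no better than chance.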
We systematically compared, for the first time, the clinimetric properties of the different PCDAI versions and determined the best cutoff scores that correspond to remission, active disease, and response. We also weighted the PCDAI mathematically, which had not been done before, thereby excluding three statistically redundant items and producing a modified index with better performance.
Despite the several concerns related to the original PCDAI outlined above, it has performed well in multiple studies over the years. Our survey showed that 91% of experts think that the index has good to very good face validity, but it was found to be inferior to the wPCDAI in construct validity, discriminant validity, and responsiveness. Our survey is likely affected by response bias (i.e., the tendency to respond in a particular way that leads to systematic bias), in which the participants assumed that the longer the version, the more valid and the less feasible it is. The high face validity among experts who use the current PCDAI frequently may also simply express comfort with a well-known and frequently used tool. This may explain the difference between the validity obtained in the analysis and the findings of the survey.
The wPCDAI was obtained by mathematically re-evaluating the weights of the PCDAI items, which were originally determined judgmentally. We previously compared the judgmental and the mathematical strategies in weighting the items of the Pediatric Ulcerative Colitis Activity Index (PUCAI), showing that assigning weights mathematically yielded an index that performed just as well as the judgmental one but without the need for laboratory tests, a major advantage in pediatrics.15 Similarly, the newly weighted wPCDAI performed just as well as the full index but without items of low feasibility. Indeed, evidence from cognitive psychology suggests that humans perform poorly in discriminating between important and less important items.16 Two rheumatologists were asked in a clinical judgment analysis to provide a PGA of disease activity on patients, and then to state how much emphasis they placed on specific items when providing that assessment.17 Both physicians placed comparable weighting across five items, but multivariate modeling showed that, in practice, their decisions relied on only some of the items that they had stressed as important. In a different study, multivariate models derived from clinical judgments in rheumatoid arthritis explained 88% of the variance, whereas the rheumatologists' stated judgment policies could explain only 34%.18 The evidence thus supports our finding that mathematical modeling for assigning weights to the PCDAI yields a more valid index than the original judgmental weighting.
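The general idea of mathematical weighting can be sketched as a regression of the PGA on the item scores, where an item whose fitted weight is close to zero is a candidate for removal. The example below is not the model used in this study: ordinary least squares is an assumption chosen for simplicity, and the visits, items, and "true" weights are entirely synthetic.

```python
import numpy as np

# Synthetic data only: 200 hypothetical visits with 3 hypothetical items,
# each scored 0, 5, or 10.
rng = np.random.default_rng(0)
items = rng.integers(0, 3, size=(200, 3)) * 5.0

# Hypothetical "true" contributions to the PGA; the third item adds nothing.
true_weights = np.array([1.0, 0.5, 0.0])
pga = items @ true_weights + rng.normal(0.0, 0.5, size=200)

# Ordinary least squares with an intercept column (an assumed model choice).
design = np.column_stack([np.ones(len(items)), items])
coef, *_ = np.linalg.lstsq(design, pga, rcond=None)
intercept, weights = coef[0], coef[1:]

print(np.round(weights, 2))
```

In this toy setting the fitted weight of the third item is close to zero, which mirrors the notion of a statistically redundant item that can be dropped without loss of information.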
The primary aim of developing the abbrPCDAI and shPCDAI was to maximize feasibility, even at the expense of validity, and as such no blood tests are included. The results of our survey reflected that concept; both versions had the highest feasibility but with an associated low face validity. Overall, the two versions performed similarly in most of the evaluated clinimetric categories. Despite the lower overall performance of these indices compared with the PCDAI and the wPCDAI, they are uniquely useful when a more feasible index is needed (such as in retrospective chart review, when lab tests are not always available). Based on the results of this study, both the shPCDAI and the abbrPCDAI may be used, and the availability of specific items should dictate the choice between them.
The modPCDAI is a version composed only of laboratory items, developed with the aim of producing an objective measure of disease activity. Its responsiveness and discriminant validity proved significantly inferior to those of the PCDAI and wPCDAI. Indeed, laboratory tests have at most fair correlation with intestinal inflammation in CD.19 Consistent with the original study, the modPCDAI had only moderate correlation with PGA.11 Nonetheless, overall it performed well in the construct validation, likely because the constructs we chose were largely the very same blood tests that make up the index.
This highlights the most significant limitation of our study, which is the ambiguity in defining disease activity for validation. Disease activity is a concept for which no gold standard exists; it is the combined constellation of clinical, laboratory, endoscopic, and radiographic parameters that best defines it. Thus, validation is a process of continuous learning about the measure in different scenarios and using different constructs. Our study was limited to the constructs collected in the original studies, although other constructs (e.g., fecal calprotectin and ileocolonoscopy) are also important. Similarly, the PGA used to weight the wPCDAI may not necessarily reflect a “true” estimate of disease activity. Nonetheless, there are multiple precedents for using the PGA in this way, including the PCDAI, the PUCAI, and disease activity measures used in rheumatologic diseases.20, 21 Another limitation of this study is the lack of reliability testing, which is the last aspect of clinimetric evaluation. The major strengths of this study are the inclusion of several constructs, a very large sample size that allowed statistical manipulations, and robust methodological techniques, including a large survey among experts in the field.
There is no perfect tool that combines responsiveness, discriminative and construct validity, and high feasibility. The performance of the modPCDAI was inferior to that of the other versions. When a very feasible index is needed, the shPCDAI and the abbrPCDAI have sufficient and similar validity and responsiveness. However, their overall performance was inferior to that of the full indices (i.e., PCDAI and wPCDAI), which should be preferred, most certainly in prospective studies. The newly weighted wPCDAI had the highest overall performance despite, or as a consequence of, the exclusion of three items. Two of the three (height velocity and abdominal examination) have low feasibility and two (abdominal examination and hematocrit) had a low frequency of endorsement in the datasets studied. Despite these encouraging results, the wPCDAI cannot yet replace the full version, which has gained extensive credibility through 20 years of successful experience. More comparative studies in different scenarios are necessary to establish the utility of the different versions.