Introduction and Background
Measuring disease activity in ulcerative colitis is critical for determining whether new therapies are effective, but there is no gold standard for doing so. Many empirically derived indices have been developed, usually including measures of bowel movement frequency, stool blood and clinical assessment.1–4 Some indices have added blood biomarkers (haemoglobin, albumin or erythrocyte sedimentation rate)5, 6 or endoscopic assessment.7, 8 None of these indices has been rigorously validated. The validity of a measurement can be assessed in several ways, which fall into two categories: psychometric validity and performance validity. Psychometric validity is important to demonstrate that the instrument measures the correct symptom domains or disease state, but once that bar has been reached, it is not particularly valuable for quantitatively comparing two instruments that are both psychometrically valid. Performance validity measures the ability of the instrument to measure a disease state accurately and reproducibly, and to differentiate between levels of severity. It allows the quantitative evaluation and comparison of different instruments, showing how well they meet the measurement requirements of a clinical trial.
There are very limited data on the four components of psychometric validity (content validity, construct validity, criterion-convergent validity and criterion-predictive validity) in the existing disease activity indices in ulcerative colitis. Content validity is determined by the presence of all the important components (also known as domains or factors) in the measurement of the disease state. Construct validity is determined by the correlation with another validated index based on the same disease construct. Criterion-convergent validity is determined by the correlation with commonly used indices in the same field. Criterion-predictive validity is determined by the ability of the index to predict clinically important outcomes. No studies have been conducted on the performance validity (reproducibility in stable disease activity and responsiveness when disease activity changes) of the existing disease activity indices in ulcerative colitis.
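As an illustration of how a correlation-based component such as criterion-convergent validity might be quantified, the following minimal Python sketch computes a Spearman rank correlation between a candidate index and a comparator index scored in the same patients. The choice of Spearman correlation, the function names and the example scores are assumptions made for illustration only; they are not taken from the methods of this study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / (spread_x * spread_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical scores for five patients (not study data).
candidate_index = [2, 5, 9, 12, 4]
comparator_index = [1, 4, 8, 11, 5]
print(round(spearman(candidate_index, comparator_index), 2))
```

A coefficient near 1 would suggest that the two indices rank patients similarly; the threshold at which such a correlation counts as good validity is a separate judgement that this sketch does not address.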
This leaves regulatory agencies in the difficult position of making decisions about the benefits and risks of highly potent, and potentially dangerous, biological therapies for ulcerative colitis without proven measures of efficacy.9 Regulatory agencies have been forced to rely on empiric definitions of remission and response, taken from a non-validated endoscopic index, in making their decisions about new therapies.7 While psychometric validity is important to show that a scale truly measures disease activity, performance validity is equally important in selecting an instrument to evaluate clinical trial results. The stability of a disease activity index in patients whose disease severity has not changed is essential in limiting the placebo response rate, and the responsiveness of the instrument is critical for detecting meaningful clinical improvement.
This also leaves clinicians without well-established useful measures of disease activity with which to judge the outcomes of clinical trials, and to compare the efficacy of different therapies for ulcerative colitis. Without these tools, many investigators modify existing indices or invent new ones that suit the therapy in question. The choice of an instrument to show a new therapy in the best light can introduce bias, and can make it very difficult for clinicians to make informed choices about how best to treat their patients with ulcerative colitis.
Recent data published by our group showed that two non-endoscopic indices, the Simple Clinical Colitis Activity Index (SCCAI)3 and the Seo Index5 (named after its originator, Mitsuru Seo) (Table 1), were able to predict clinically meaningful endpoints in patients with ulcerative colitis. However, these non-endoscopic and less costly indices are unlikely to be widely used unless there is evidence that they have both psychometric and performance validity. We hypothesized that these may be valid instruments for the measurement of disease activity in ulcerative colitis. To determine the psychometric and performance validity of these two indices, repeat measurements of disease activity were obtained in patients who participated in our previous study. This quantitative approach can be applied to any index of disease activity in ulcerative colitis, and can be used to compare the psychometric and performance validity of different indices.
Table 1. Structure and previous validation of non-endoscopic indices for ulcerative colitis
|Index||Simple Clinical Colitis Activity Index3||Seo Index5|
|Components||Six questions: day stool frequency, night stool frequency, urgency, stool blood, general well-being, extraintestinal||Two questions: stool frequency, stool blood. Three laboratory tests: erythrocyte sedimentation rate, haemoglobin, albumin|
|Criterion convergent||Correlated with St Mark's Index and Seo Index||Correlated with both Truelove and Witts' classification and endoscopic findings22, 23|
|Criterion predictive||Predicted patient clinical relapse16||Predicted need for colectomy24|
In this study, we developed a quantitative method to measure the psychometric and performance validity of two non-endoscopic disease activity indices for ulcerative colitis. This methodology also allows investigators to identify specific weaknesses in components of psychometric or performance validity. We found that both non-endoscopic indices have fair to excellent psychometric validity, while the SCCAI has better performance validity than the Seo Index. This is the most rigorous evaluation to date of the validity of disease activity indices in ulcerative colitis. The validity of other disease activity indices for ulcerative colitis remains unknown, and at this point, the SCCAI and the Seo Index have the best documented validity of any ulcerative colitis disease activity indices. This rigorous testing justifies the use of these non-endoscopic indices in clinical trials. Based on the results of this study, the SCCAI should be favoured over the Seo Index for measurement of disease activity in longitudinal clinical trials.
An additional benefit of this study is that it specifically identifies the weaknesses of the current indices. The SCCAI is somewhat lacking in content validity and responsiveness. The content validity could be addressed by adding items to measure the missing domains (laboratory tests and temperature) to the SCCAI, and further research is needed to improve the responsiveness of the SCCAI. It may be that the current response scales to the SCCAI questions do not have enough gradations to detect small changes in disease activity, and improving these scales might improve responsiveness.
The Seo Index is lacking in content validity, construct validity and responsiveness. The content validity could be improved by adding items to address the missing domains. Additional detailed questions about bowel symptoms might improve the construct validity, as the Seo Index has only one question about bowel frequency. The responsiveness might also be improved by additional gradations in the responses to symptom questions. Despite its weaknesses, the Seo Index is remarkable for its innovative use of laboratory tests as simple biomarkers. It is probable that the addition of biomarkers, which are not found in other disease activity indices for ulcerative colitis, causes the Seo Index to have good criterion-convergent and criterion-predictive validity.
Our data support the findings of Jowett et al.,16 who found that the SCCAI has good criterion-predictive validity in identifying which patients are in relapse. They also support the findings of Seo et al.,22–24 who found that the Seo Index has good criterion-convergent validity with St Mark's Index and with endoscopic findings, and good criterion-predictive validity in identifying which patients would require colectomy.
This study is limited in that we only assessed patients at a single tertiary care centre. The results may not be generalizable to clinical study subjects in other centres, particularly to subjects whose first language is not English. In order to maximize the generalizability of these findings, we deliberately included subjects with a wide range of disease activity, such as those undergoing colonoscopic surveillance, patients having colonoscopy for symptoms, and inpatients who were quite ill. A second limitation is that the comparators chosen for determining criterion-convergent validity and construct validity have not themselves been validated. St Mark's Index has never been validated, and while the entire IBDQ has been validated, the individual subscores have not. This is an inherent limitation in the evaluation of psychometric validity. Future developments in the measurement of ulcerative colitis may identify better comparators for criterion-convergent validity and construct validity.
Another limitation is that remission was defined by measurement of disease activity at a single point in time. While this is currently the standard in clinical trials in ulcerative colitis, the approach has been criticized as inappropriate in an inherently waxing and waning disease. Some [including the Food and Drug Administration (FDA)] have advocated ‘durable clinical remission’ instead. This is a relatively new concept, and it does not have a consensus definition. Given the variability inherent in a waxing and waning disease such as ulcerative colitis, perhaps durable clinical remission should be defined as remission upon repeated measurement over an extended period of time. This would add substantially to the costs and difficulty of conducting clinical trials. If it were required, it would make endoscopic indices particularly unattractive, as repeated lower endoscopy is expensive and avoided by subjects. An alternative approach would be to identify levels of symptoms, biomarkers or other tests (perhaps including mucosal healing) that truly predict which subjects will have durable clinical remission over the next 6 months. Future research may identify factors that can be proven to be accurate predictors of durable clinical remission.
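To make the idea of remission upon repeated measurement concrete, the following minimal Python sketch checks whether a series of index scores stays at or below a remission cutoff across a minimum number of visits. The cutoff, the number of visits and the function name are hypothetical, and the sketch assumes that lower scores mean less active disease; as noted above, durable clinical remission has no consensus definition, so this is one possible operationalization and not an established rule.

```python
def in_durable_remission(scores, remission_cutoff, min_visits):
    """Return True if there are at least `min_visits` repeated scores
    and every one of them is at or below `remission_cutoff`.

    Assumes lower scores indicate less active disease.
    """
    if len(scores) < min_visits:
        return False  # too few measurements to call remission durable
    return all(score <= remission_cutoff for score in scores)


# Hypothetical repeated scores for two patients (not study data).
steady_patient = [1, 2, 1, 0]
flaring_patient = [1, 6, 2, 1]
print(in_durable_remission(steady_patient, remission_cutoff=2, min_visits=3))
print(in_durable_remission(flaring_patient, remission_cutoff=2, min_visits=3))
```

A non-endoscopic index lends itself to this kind of repeated scoring, because each additional measurement does not require another endoscopy.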
The definitions of clinical remission and significant clinical improvement used in this study were those of the subjects with ulcerative colitis themselves. These were used because there is no gold standard for clinical remission in ulcerative colitis. Alternatives have been proposed, including biomarkers of inflammation, physician assessment, imaging methods, endoscopic healing or a combination of these. While one can argue that these may be more objective, none of these outcomes matters if the patient does not feel well. Biomarkers can be manipulated with biologics, but a low C-reactive protein (CRP) does not equal health. Other manipulations (e.g. topical therapy) could improve the appearance of imaging or of the colonic mucosa, but if the patient does not feel well, we have not treated the patient, only a manifestation of the disease. From a clinical perspective, we must treat the patient, not numbers or images, so we must use the patient as the gold standard to identify which factors are valuable objective predictors of patient outcome.
For many clinicians, performance validity is more important than psychometric validity. If the psychometric validity is reasonably good, the performance validity is the critical evaluation of an instrument. In clinical trials, it is very important that the measurement tool be sensitive enough to detect clinically important changes. It is also critical that the tool be reproducible and stable enough that patients with little or no improvement have stable scores on the measurement instrument, to avoid increasing the placebo rate.
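The two requirements described here, stability in unchanged patients and sensitivity to real change, can each be expressed as a simple statistic. The following minimal Python sketch uses two commonly cited measures, the standardized response mean and Guyatt's responsiveness statistic. These particular statistics, the function names and the example scores are assumptions for illustration; the sketch does not reproduce the calculations used in this study.

```python
from statistics import mean, stdev


def change_scores(first_visit, second_visit):
    """Per-patient change in index score between two visits."""
    return [second - first for first, second in zip(first_visit, second_visit)]


def standardized_response_mean(changes):
    """Mean change divided by the standard deviation of change."""
    return mean(changes) / stdev(changes)


def guyatt_responsiveness(changes_improved, changes_stable):
    """Mean change in improved patients divided by the standard deviation
    of change in stable patients (the measurement noise)."""
    return mean(changes_improved) / stdev(changes_stable)


# Hypothetical index scores at two visits (not study data).
stable_changes = change_scores([3, 5, 2, 4], [3, 6, 1, 4])
improved_changes = change_scores([9, 11, 8, 10], [5, 6, 5, 6])

print(round(standardized_response_mean(improved_changes), 2))
print(round(guyatt_responsiveness(improved_changes, stable_changes), 2))
```

In this framing, an index whose scores drift in stable patients has a large denominator in the second statistic, so the same true improvement becomes harder to distinguish from noise. Both functions fail with a division by zero if the relevant changes are all identical.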
Presently, the SCCAI is the best validated index available for ulcerative colitis. The Seo Index demonstrates the value of biomarkers, as this index is able to predict clinical outcomes despite its demonstrated weaknesses in validity. The results of this study suggest that a better disease activity index for ulcerative colitis can be developed. A novel index that combined the questions of the SCCAI, the biomarker laboratory tests of the Seo Index, and a temperature item would have significantly improved the content validity. Adding more response levels to the questions might improve the responsiveness of the novel index further. The biomarkers of the Seo Index might be improved upon with newer biomarkers, including C-reactive protein,17–19 faecal lactoferrin20, 21 or faecal calprotectin. The role of endoscopic and histologic findings, particularly as part of the current endoscopic indices, also needs to be further defined. When the value of these potential measurement items is determined, a reduced panel of symptom items, biomarkers, and possibly endoscopy or histology will probably yield an improved, valid disease activity index for ulcerative colitis. Our group is currently developing and testing new survey questions and biomarkers for ulcerative colitis for this purpose, with the support of the Crohn's and Colitis Foundation of America.
In this report, we show that the SCCAI is the most rigorously validated index in ulcerative colitis, and that it has good psychometric and performance validity. Its use should be strongly considered for longitudinal clinical trials in ulcerative colitis until improved ulcerative colitis disease activity indices are developed with better validity scores. The quantitative methodology we have introduced in this manuscript allows the direct comparison of the validity of different indices, identifies specific weaknesses in the indices for remediation, and provides a metric for evaluating the validity of future disease activity indices in ulcerative colitis and in other disease states. This methodology can bring a rigorous approach to the development, validation and improvement of future disease indices in a wide range of disease states.