The Simple Cholestatic Complaints Score is a valid and quick patient‐reported outcome measure in primary sclerosing cholangitis

Abstract

Background: Measuring symptoms and disease burden in patients with primary sclerosing cholangitis (PSC) is increasingly important for daily practice and clinical trials. The Simple Cholestatic Complaints Score (SCCS) is a four‐item questionnaire that measures cholestatic symptoms (pruritus, fatigue, right upper quadrant (RUQ) abdominal pain and fever) in PSC patients. The aim of this study was to evaluate the reliability and validity of the SCCS in a Dutch population.

Methods: The study population consisted of 212 patients from the Dutch prospective PSC registry. Data were collected via digital surveys. Reliability was evaluated by internal consistency and reproducibility. Construct‐, criterion‐ and discriminant validity were determined. The ability to detect clinical change with the SCCS was evaluated in patients who underwent endoscopic intervention. SCCS scores collected by email and by a mobile application were compared.

Results: A total of 153 patients completed the questionnaire. Internal consistency was moderate and increased to 0.71 after removal of the fever item. Test‐retest reproducibility was high (intraclass correlation coefficient = 0.96). Criterion validity was good (all > 0.82). Construct validity was in line with a priori hypothesized correlations in 80%. The SCCS was able to differentiate between clinically different groups. There was no difference between inflammatory bowel disease (IBD) and non‐IBD patients. The SCCS was responsive to change after endoscopic intervention in successfully treated patients. SCCS measurements by digital questionnaire and by mobile application were comparable.

Conclusion: The SCCS is a valid instrument to measure cholestatic symptoms in PSC patients. Because it is quick and easy to use, it is suitable for frequent monitoring of symptoms in clinical trials and daily practice.


| Simple Cholestatic Complaints Score
The SCCS (Table 1) contains four questions about the degree of pruritus, fatigue, RUQ abdominal pain and fever in the past 7 days. The pruritus, fatigue and abdominal pain items are each scored from 0 to 4; the fever item is scored 0 or 1. The sum score, the sum of the four item scores, ranges from 0 to 13.
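The scoring rule can be illustrated with a short sketch. The function and parameter names below are hypothetical and not part of the published instrument; only the item ranges (0 to 4 for pruritus, fatigue and RUQ abdominal pain, 0 or 1 for fever) and the summation come from the description above.

```python
def sccs_sum_score(pruritus: int, fatigue: int, ruq_pain: int, fever: int) -> int:
    """Return the SCCS sum score (0-13) from the four item scores.

    Names are illustrative assumptions; the ranges follow the text:
    three graded items scored 0-4 and a fever item scored 0 or 1.
    """
    graded_items = {"pruritus": pruritus, "fatigue": fatigue, "ruq_pain": ruq_pain}
    for name, value in graded_items.items():
        if value not in range(0, 5):
            raise ValueError(f"{name} must be an integer from 0 to 4, got {value!r}")
    if fever not in (0, 1):
        raise ValueError(f"fever must be 0 or 1, got {fever!r}")
    return pruritus + fatigue + ruq_pain + fever


# Example: 2 + 3 + 1 + 0 = 6
example_score = sccs_sum_score(pruritus=2, fatigue=3, ruq_pain=1, fever=0)
```

Validating the ranges before summing keeps an out-of-range entry from silently producing a score above the maximum of 13.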

| Study population, design and data collection
A cross-sectional design was used for this study. Data were collected from August 2017 to November 2017. The study population consisted of patients of the EpiPSC 2 study, a large Dutch population-based prospective registry. 10 Diagnosis was based on EASL criteria, and all patients underwent careful case ascertainment on site. Small duct PSC was defined as clinical and histological signs of PSC without cholangiographic changes. All patients received periodic digital surveys via email and/or a mobile application. Data were collected in a web-based database (CastorEDC). A separate group of IBD patients without signs of liver disease was accrued from the WORK-IBD study. 11 Correlations between two different symptom scoring instruments were always based on data collected at the same time.
Data from the DILSTENT trial, 9 in which different interventions during endoscopic retrograde cholangiography (ERC) in PSC patients were compared, were used to assess the ability of the SCCS to detect clinical change after treatment. In the DILSTENT trial 65 PSC patients with a dominant stricture based on imaging and/or clinical profile underwent ERC with balloon dilatation or stent placement. SCCS was scored at baseline and at 3 months after endoscopic treatment.

| Validation process
The topics addressed in the FDA guideline for development and validation of PRO measures 12  correlations and correlations after one-item removal were calculated. As the SCCS measures several different complaints rather than a single symptom, internal consistency was expected to be moderate (0.70-0.85). 13 Test-retest reliability (reproducibility) was measured by issuing the SCCS twice, exactly 48 hours apart (T1 and T2). As the SCCS measures symptoms in the past 7 days, only minor changes in scores should be observed. The degree of reproducibility was determined by the intraclass correlation coefficient (ICC). An ICC ≥ 0.7 was considered good reproducibility. 14 Criterion validity, defined as the correlation between an item and the gold standard instrument for that specific item, was evaluated.
Scores of the widely accepted and validated questionnaires 5D itch scale, 15 Fatigue Impact Scale (FIS) 16 and the RUQ abdominal pain item from the Liver Disease Symptom Index (LDSI) 17 were correlated to the pruritus, fatigue and RUQ abdominal pain items of the SCCS, respectively. There is no gold standard questionnaire for fever, other than measuring body temperature. A correlation > 0.7 was considered good criterion validity. 14 Construct validity, consisting of convergent and discriminant validity, was determined using Spearman's Rho correlation.
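As an illustration of the correlation step, the following is a minimal sketch of Spearman's rho computed as the Pearson correlation of rank-transformed scores, with tied values sharing their mean rank. The function names and the `meets_criterion` helper are hypothetical; only the use of Spearman's rho and the > 0.7 threshold come from the text, and the sketch does not reproduce the statistical software used in the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def meets_criterion(rho, threshold=0.7):
    """Apply the 'correlation > 0.7' rule for good criterion validity."""
    return rho > threshold
```

A rank-based coefficient suits ordinal item scores such as the 0 to 4 SCCS items, since it depends only on the ordering of responses and not on the spacing between answer categories.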
Convergent validity refers to the degree to which two constructs Activity Index). Statistical difference was tested by a paired t test. As multiple matching options were possible for some patients, eight alternative matching scenarios were run and pooled using Rubin's rules to account for within- and between-scenario variability. 19 To assess the ability to detect clinical change (responsiveness), SCCS was measured before and 3 months after ERC in successfully treated patients (responders) and unsuccessfully treated patients (non-responders).

| Adapted SCCS (SCCS-A): specification of severity and frequency
Some of the answer options of the SCCS contain a severity and a frequency domain. For example, the question about pruritus has the answer options 'I have daily itch' and 'I have unbearable itch'. This might be confusing for patients whose itch is both daily and unbearable. To evaluate whether this impacts the validity of the SCCS, an adapted version with separate questions for severity and frequency was tested in parallel. This adapted SCCS (SCCS-A) (Table S1) scores both severity (range 0-4) and frequency (range 0-4) of itch results in a pruritus score of 6. The SCCS-A has the same time frame of 7 days as the original SCCS.
Patients' experiences with both SCCS and SCCS-A were scored. They were asked whether they could express their level of symptoms better in one of the two questionnaires or whether there was no difference.

| SCCS via a mobile application
The reproducibility of the SCCS when sent via a mobile application was evaluated. Scores of patients who completed the SCCS via both the digital questionnaire (by email) and the mobile application within 1 week were compared. The intraclass correlation coefficient (ICC) was calculated to assess reproducibility. An ICC ≥ 0.7 was considered good reproducibility. 14
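To make the ICC criterion concrete, the sketch below computes one common form of the coefficient from paired measurements. The text does not state which ICC model was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, as are the function name and data layout.

```python
def icc_absolute_agreement(scores):
    """ICC(2,1) from a list of rows, one row per patient.

    Each row holds that patient's k repeated measurements (for example
    the email score and the mobile application score). The ICC form is
    an assumption; the source does not specify the model.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two patients are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every patient needs the same number (>= 2) of measurements")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: patients (rows), occasions (columns), residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((v - grand_mean) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def is_good_reproducibility(icc, threshold=0.7):
    """Apply the 'ICC >= 0.7' rule stated in the text."""
    return icc >= threshold
```

An absolute-agreement form penalises a systematic shift between the two collection modes, which a consistency-only form would ignore. That property is relevant when the question is whether the two modes yield interchangeable scores.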

| Demographics
A total of 153 of 212 patients (72%) completed the questionnaire.
Mean age of the respondents was 54 years and median disease duration was 15 years (Table 2). Most patients had large duct PSC (LD-PSC; 90%) and/or inflammatory bowel disease (56%).

| Criterion validity
The correlations between every SCCS item and its corresponding gold standard instrument were all high. Correlation between the

| Known groups validity
Mean SCCS sum scores from different groups were compared (  (Table S3). In addition, SCCS of PSC-IBD vs PSC only were almost equal in both SD-PSC and LD-PSC (−0.2 to 0.2).
A total of 44 PSC-IBD patients were matched to 44 IBD patients (Table S4). Scores on the pruritus and fever items were significantly higher in the PSC-IBD group. The SCCS sum score was on average 0.7 points higher in the PSC-IBD group, but with a P-value of .054 this did not reach statistical significance.
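The methods describe pooling eight alternative matching scenarios with Rubin's rules. A minimal sketch of that pooling step follows, assuming each scenario yields a mean difference in SCCS score and the squared standard error of that difference. The function name and inputs are hypothetical, and the sketch does not reproduce the study's actual analysis code or data.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-scenario estimates with Rubin's rules.

    estimates: one point estimate per scenario (e.g. a mean difference).
    variances: the squared standard error of each estimate.
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching lists with at least two scenarios")
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average within-scenario variance
    between = variance(estimates)   # sample variance across scenarios (divisor m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance
```

The `(1 + 1 / m)` factor inflates the between-scenario variance to reflect that only a finite number of scenarios was run, so the pooled standard error, the square root of `total_variance`, is never smaller than the average within-scenario one.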
SCCS scores were higher in patients with clinical signs of end-stage liver disease. Patients with an impaired general quality of life, as measured by EQ-5D, had higher SCCS sum scores, as did patients who could not work because of disease (absenteeism).

| Detection of clinical change
In 41 patients who underwent ERC in the DILSTENT trial, treatment response could be rated on biochemical response only. Treatment was considered successful in 27 patients according to the predefined criteria.
The pruritus, fatigue and RUQ abdominal pain scores of these patients decreased significantly 3 months after the intervention (Table 6). The intervention had no effect on the fever item, but fever had a very low frequency (4/49) at baseline. Mean SCCS score dropped from 3.59 to 1.67 after treatment. No significant decrease of any item or the sum score was observed in the non-responder group (n = 14).

| Adapted SCCS with separate severity and frequency domains
Correlations of the SCCS-A, with separate questions for severity and frequency, were compared to those of the original SCCS (Table S2). In general, correlations were very similar. The biggest difference in favour of SCCS was seen when comparing the pruritus item to LDSI itch (0.029). On the other hand, the biggest difference (0.060) in favour of SCCS-A was seen when comparing to VAS for fatigue. The category of strength of the correlations was always the same for SCCS and SCCS-A.

TABLE 5 'Known groups' validity of SCCS

| Patients' experiences
A total of 47% of patients had no preference for SCCS or SCCS-A, 23% of patients could express their symptoms better in SCCS and 30% preferred SCCS-A.

| SCCS via a mobile application
A total of 69 patients completed the SCCS via both email and the mobile application within 7 days (Table 7). Mean scores on the different items were highly comparable. ICCs ranged from 0.65 to 0.92.

CONFLICT OF INTEREST
The authors disclosed no financial relationships relevant to this publication.

PATIENT CONSENT STATEMENT
All patients provided written consent for participation in this study.

ETHICS APPROVAL STATEMENT
This study was approved by the IRB of the UMC Utrecht.