A structured review of patient-reported outcome measures for patients with skin cancer, 2013

Authors


Funding sources: This work was commissioned by the British Association of Dermatologists and the Department of Health.

Conflicts of interest: None declared.

Correspondence

Elizabeth Gibbons.

E-mail: elizabeth.gibbons@dph.ox.ac.uk

Summary

Background

The collection of patient-reported outcome measures (PROMs) within the national PROMs programme for elective procedures is now established mandatory practice in the NHS with high response rates and completion.

Objectives

This review examines the evidence of PROMs for people with skin cancer.

Methods

Comprehensive searches were conducted using several sources and databases, using a detailed search strategy developed by the University of Oxford's PROM Group. Articles were assessed for eligibility. Data were extracted per PROM for each measurement property and appraised using an appraisal framework.

Results

A total of 3517 articles were identified in the searches, and 28 were included in the final review after assessment by two independent reviewers. Two generic instruments (SF-36 and Sickness Impact Profile) and nine condition-specific PROMs were identified.

Conclusions

Overall, there is a limited volume of published evidence for the application of generic PROMs for people with skin cancer. Evaluation of the EQ-5D may be particularly important given its widespread use in many other healthcare contexts in the U.K. The Skin Cancer Index could be considered for piloting in the NHS. For patients with nonmelanoma skin cancers, the Skindex measures may also be considered. The SCQOLIT has some evidence of applicability across both skin cancer types but more evaluations are needed. The FACT-M does have more promising characteristics for patients with malignant melanomas although no evidence of testing in the U.K. was found. The forthcoming EORTC-M may prove a useful measure given the expertise and track record of this European collaboration in cancer and quality of life.

Patient-reported outcome measures (PROMs) offer enormous potential to improve the quality and results of health services, providing validated evidence of health from the point of view of the user or patient. Lord Darzi's interim report on the future of the NHS[1] recommends that PROMs should have a greater role in the NHS. Furthermore, Lord Darzi's 2008 report High Quality Care for All[2] outlines policy regarding payments to hospitals based on quality measures, including PROMs, as well as volume. Since April 2009, the collection of PROMs has been mandatory for all acute trusts for patients undergoing selected elective procedures: primary unilateral hip or knee replacement, groin hernia surgery or varicose vein surgery. Several pilots of PROMs for specific conditions and populations are in progress (long-term conditions, mental health, elective cardiovascular revascularization, cancer survivorship).

PROMs are further endorsed in the Department of Health's Outcomes Framework: 2013/2014,[3] which outlines five domains and suggested indicators. Domain 3, ‘Helping people recover from episodes of ill health or following injury’, outlines an indicator which represents ‘Total health gain as assessed by patients’, and PROMs are featured as a mechanism for obtaining patients' perceptions of improvement and health gain. PROMs are also featured in Domain 2, ‘Improving the quality of life for people with LTCs [long-term conditions]’, and, specifically for cancer survivors, in the Department of Health's 2011 Improving Outcomes: A Strategy for Cancer.[4] A structured review was undertaken, building on methods used for other reviews of PROMs for selected cancers (http://phi.uhce.ox.ac.uk/newpubs.php), to produce guidance on PROMs for skin cancer.

Methods

The aim of this report was to identify both generic- and cancer-specific PROMs that have been evaluated with patients with skin cancer. Both types of skin cancer were included in the review: nonmelanoma skin cancers (NMSC) and malignant melanomas. Cancers specific to other soft tissues and muscles were not included, for example those cancers classified as ‘head and neck’.

Comprehensive searches were conducted using several sources and databases (Table 1), using a detailed search strategy developed by the PROM Group (available on request). Articles were assessed for eligibility by two independent reviewers, based on strict inclusion and exclusion criteria (Table 2). Data were extracted per PROM for each measurement property (Table 3) and appraised using an appraisal framework (Table 4).[5-7]

Table 1. Search strategy: the searches were conducted using three main sources
Database search
University of Oxford PROM Bibliography Database (http://phi.uhce.ox.ac.uk) using keywords, up to Dec 2005
Multi-database search using a comprehensive search strategy developed by the PROM Group (available on request). Four databases were searched from January 2006 to November 2012 using the OvidSP© search engine:
AMED (Allied and Complementary Medicine)
EMBASE
PsycInfo
MedLine
Key journals (from November 2011 to November 2012)
Health and Quality of Life Outcomes
Quality of Life Research
Journal of Clinical Oncology
British Journal of Cancer
Cancer
European Journal of Cancer
Supplementary sources
The Cochrane library (http://www.thecochranelibrary.com/)
Reference lists of included articles
‘Instrument name’ searches for commonly cited PROMs identified during the initial phase of review
EQ-5D website: reference search facility (http://www.euroqol.org/)
Websites of PROMs identified
Table 2. Inclusion and exclusion criteria
Inclusion criteria
Titles and abstracts of all articles were assessed for inclusion/exclusion by two reviewers
Included articles were retrieved in full
Published articles were included if they provided evidence of measurement and/or practical properties
Articles were retrieved, assessed for relevance and catalogued according to the PROM for which they provided evidence (note that a single paper frequently provided information on more than one measure)
Population
Patient with skin cancer (any type)
English-speaking populations
Study design selection
Studies where a principal PROM is being evaluated
Studies evaluating several PROMs concurrently
Applications of PROMs with sufficient reporting of methodological issues
Specific inclusion criteria for generic- and disease-specific instruments
The instrument is patient-reported
There is published evidence of measurement reliability, validity or responsiveness following completion in the specified patient population
The instrument will ideally be multi-dimensional
Evidence is available from English-language publications and instrument evaluations conducted in populations within the U.K., North America and Australasia
Exclusion criteria
Clinician-assessed instruments
Studies evaluating the performance of non-patient-reported measures of functioning or health status where a PROM is used as a comparator indicator
Table 3. Data extraction
Data were extracted on the psychometric performance and operational characteristics of each PROM
Assessment and evaluation of the methodological quality of PROMs was performed independently by two reviewers
The final shortlisting of promising PROMs to formulate recommendations is based on these assessments and discussion between reviewers
Evidence is reported for the following measurement criteria:
Reliability
Validity
Responsiveness
Precision
Operational characteristics, such as patient acceptability and feasibility of administration for staff, are also reported
Table 4. Appraisal criteria (adapted from Smith et al.[5] and Fitzpatrick et al.[6, 7])
Appraisal component | Definition/test | Criteria for acceptability
Reliability
Reproducibility/test–retest reliability | The stability of a measuring instrument over time; assessed by administering the instrument to respondents on two different occasions and examining the correlation between test and retest scores | Test–retest reliability correlations for summary scores ≥ 0·70 for group comparisons
Internal consistency | The extent to which items comprising a scale measure the same construct (e.g. homogeneity of items in a scale); assessed by Cronbach's α and item-total correlations | Cronbach's α for summary scores ≥ 0·70 for group comparisons; item-total correlations ≥ 0·20
Validity
Content validity | The extent to which the content of a scale is representative of the conceptual domain it is intended to cover; assessed qualitatively during the questionnaire development phase through pre-testing with patients, expert opinion and literature review | Qualitative evidence from pre-testing with patients, expert opinion and literature review that items in the scale represent the construct being measured; patients involved in the development stage and item generation
Construct validity | Evidence that the scale is correlated with other measures of the same or similar constructs in the hypothesized direction; assessed on the basis of correlations between the measure and other similar measures; the ability of the scale to differentiate known groups; assessed by comparing scores for subgroups that are expected to differ on the construct being measured (e.g. a clinical group and control group) | High correlations between the scale and relevant constructs preferably based on a priori hypothesis with predicted strength of correlation; statistically significant differences between known groups and/or a difference of expected magnitude
Responsiveness | The ability of a scale to detect significant change over time; assessed by comparing scores before and after an intervention of known efficacy (on the basis of various methods including t-tests, effect sizes (ES), standardized response means (SRM) or responsiveness statistics) | Statistically significant changes on scores from pre- to post-treatment and/or difference of expected magnitude; the recommended index of responsiveness is the effect size, calculated by subtracting the baseline score from the follow-up score and dividing by the baseline SD; effect sizes can be graded as small (< 0·3), medium (approximately 0·5) or large (> 0·8)
Floor/ceiling effects | The ability of an instrument to measure accurately across the full spectrum of a construct | Floor/ceiling effects for summary scores < 15%
Practical properties
Acceptability | Acceptability of an instrument reflects respondents' willingness to complete it and impacts on quality of data | Low levels of incomplete data or nonresponse
Feasibility/burden | The time, energy, financial resources, personnel or other resources required of respondents or those administering the instrument | Reasonable time and resources to collect, process and analyse the data
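
The responsiveness indices named in Table 4 can be illustrated with a minimal Python sketch. The effect size follows the definition given in the table (mean change divided by the baseline SD). The standardized response mean is computed here in its usual form (mean change divided by the SD of the change scores), which is an assumption because the review does not spell it out. The scores are invented for illustration, and treating every value between 0·3 and 0·8 as 'medium' is also an assumption, since the table only gives approximate anchors.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Effect size as defined in Table 4: mean change / SD of baseline scores."""
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM (usual definition, assumed here): mean change / SD of change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def grade(es):
    """Grade the magnitude using the anchors quoted in Table 4.

    Assumption: anything from 0.3 to 0.8 inclusive is labelled 'medium'.
    """
    magnitude = abs(es)
    if magnitude < 0.3:
        return "small"
    if magnitude > 0.8:
        return "large"
    return "medium"


# Hypothetical scores for six patients before and after treatment.
baseline = [40, 50, 60, 50, 45, 55]
follow_up = [48, 55, 66, 58, 50, 61]

es = effect_size(baseline, follow_up)
srm = standardized_response_mean(baseline, follow_up)
print(f"ES = {es:.2f} ({grade(es)}), SRM = {srm:.2f}")
```

The two indices can differ considerably on the same data, because the SRM depends on how consistently patients change rather than on how widely baseline scores are spread.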

Results

Searches identified 3517 potentially relevant records. When assessed against the inclusion and exclusion criteria of this review, 28 articles were included (Table 5). Within the results, two systematic reviews were identified of PROMs for nonmelanoma[8] and melanoma skin cancers.[9]

Table 5. Search results
Source | Results of search | Number of articles included in review
University of Oxford PROM Bibliography Database | 1760 | 5
Multi-database search using search engine | 1757 | 14
Hand searching | | 9
Total | 3517 | 28

Generic patient-reported outcome measures

Two generic PROMs were identified that had been evaluated with patients with skin cancer: the SF-36[10-12] and the Sickness Impact Profile (SIP).[13, 14] Tables 6 and 7 outline the number of domains, response options, scoring methodology, administration time and licensing details of these instruments.

Table 6. Summary of generic-, cancer- and skin cancer-specific instruments
Instrument name (total items) | Domains (no. items) | Response options | Scoring | Administration completion time | Licensing information
QoL, quality of life.

Generic
SF-36: MOS 36-item Short Form Health Survey (36)

Bodily pain (BP) (2); General health (GH) (5); Mental health (MH) (5); Physical functioning (PF) (10); Role limitation-emotional (RE) (3); Role limitation-physical (RP) (4); Social functioning (SF) (2); Vitality (V) (4); Global Health (GH) (1)

Categorical: 2–6 options

Recall: standard 4 weeks, acute 1 week

Algorithm

Domain profile (0–100, 100 best health); Summary: Physical (PCS), Mental (MCS) (mean 50, SD 10)

Interview (mean values 14–15)

Self (mean 12·6)

Requires a signed license agreement; license is issued to a specific project; commercial use of the instrument requires payment of a royalty fee
Sickness Impact Profile (SIP) (136)

Alertness behaviour (AB) (10); Ambulation (A) (12); Body care and movement (BCM) (23); Communication (C) (9); Eating (E) (9); Emotional behaviour (EB) (9); Home management (HM) (10); Mobility (M) (10); Recreation and pastimes (RP) (8); Sleep and rest (SR) (7); Social interaction (SI) (20); Work (W) (9)

Applicable statements checked; items weighted: higher weights indicate increased impairment

Recall: current health

Algorithm

Domain profile (0–100%, 100, worst health); Index (0–100%)

Summary: Physical (A, BCM, M); Psychosocial function (AB, C, EB, SI)

Interview (range: 21–33)

Telephone: Physical Functioning only (11·5)

Self (19·7)

Use in nonfunded academic research is free of charge; other uses require payment of royalties; all uses require a signed license agreement
Cancer-specific
European Organization for Research and Treatment of Cancer Quality of Life core Questionnaire, EORTC QLQ-C30 (30)

Physical function (5); Role activities (2); Symptoms (12); Cognitive functioning (2); Emotional well-being (4); Social well-being (2); Financial difficulties (1).

Two global questions:

Overall health (1); Overall QoL (1)

4-point Likert scales with 1 (best), 4 (worst)

7-point Likert scales for global health and QoL questions

Recall: past week (except for Physical Functioning)

Subscale scores transformed into 0–100 scores using an algorithm

Aggregation of subscale scores not recommended by developers

Under 10 min

No charge for use in academic settings, but written consent required for each study; royalty fee, based on no. of patients, payable for commercial studies
European Organization for Research and Treatment of Cancer Melanoma Cancer module, EORTC QLQ-M

In development

Evaluations have been conducted in Sweden
Functional Assessment of Cancer Therapy – General version, FACT-G (27)

Physical well-being (7); Social/family well-being (7); Emotional well-being (6); Functional well-being (7)

5-point Likert scales

Recall: past 7 days

Items are scored from 0 to 4, with negatively phrased items requiring reverse response scores

Higher scores represent better well-being on each of the dimensions or better global QoL when combined

Interview, telephone, or self-administration

5–10 min

Use of English versions of FACT/FACIT measures is free of charge, on condition of sharing data; users must complete an agreement and submit project information for each study
Functional Assessment of Cancer Therapy – Melanoma module, FACT-M (61)

Physical well-being (7); Social/family well-being (7); Emotional well-being (6); Functional well-being (7); Melanoma Subscale (MS) (26) and Melanoma Surgery Subscale (MSS) (8), collectively known as the Melanoma Combined Scale (MCS)

5-point Likert scales

Recall: past 7 days

The Trial Outcome Index (TOI) has been defined as the summed scores from all the items

Interview, telephone, or self-administration

5–10 min

As above
Functional Assessment of Cancer Therapy – Biological Response Modifier, FACT-BRM (40)

Physical well-being (7); Social/family well-being (7); Emotional well-being (6); Functional well-being (7); BRM (13)

5-point Likert scales

Recall: past 7 days

Each question is rated on a 5-point Likert scale, giving a total score for each category as well as a total overall score from 0 (worst QoL) to 135 (best QoL)

Interview, telephone, or self-administration

5–10 min

As above
Skin Cancer Index (SCI) (15)

Emotional (7); Appearance (3); Social/family (5)

5-point Likert scale of level of impact of disease

Scores are obtained for each domain

No details
Facial Skin Cancer Index (FSCI) (36)

Emotional (7); Appearance (6); Work/financial (5); Lifestyle (5); Social/family (6); Physical functioning (7)

5-point Likert scale of level of impact of disease

Scores are obtained for each domain

No details
Dermatology Life Quality Index (DLQI) (10)

General dermatology: 10 items measuring impact of skin condition

3-point Likert scale

Permission for use requested
Skindex (61)

General dermatology: Cognitive (15); Social (10); Depression (7); Fear (7); Embarrassment (4); Anger (5); Discomfort (4); Limitations (9)

5 or 6-point Likert scales

Recall period is 1 month

Standardized scores per domain

Interview, telephone, or self-administration

5–10 min

Use in nonfunded academic research is free of charge; other uses require payment of royalties; all uses require a signed license agreement
SCQOLIT (10)

Psychosocial (9); Physical (1)

4-point Likert scale

Recall 1 week

Summation

Total score 30

5 min

No details
Table 7. Summary of generic-, cancer- and skin cancer-specific instruments: health status domains
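Instrument | Physical function | Symptoms | Global judgement | Psychological well-being | Social well-being | Cognitive functioning | Role activities | Personal constructs | Treatment satisfaction
Generic
SF-36 (36) | × | × | × | × | × | | × | |
SIP (136) | × | | | × | × | × | × | |
Cancer-specific
EORTC QLQ-C30 (30) | × | × | × | × | × | × | × | × |
EORTC QLQ-M | Instrument under development
FACT-G (27) | × | | × | × | × | | × | |
FACT-M (61) | × | × | × | × | × | | | × | × (melanoma surgery)
FACT-BRM (40) | × | × | × | × | × | | × | | × (BRM effects)
SCI (15) | | | | × | × | | | × |
FSCI (36) | × | | | × | × | | × | × |
DLQI (10) | | | | | | | | |
Skindex (61) | × | × | | × | × | × | | × |
SCQOLIT (10) | × | | | × | | | | |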

SF-36: Medical Outcomes Study 36-item Short Form Health Survey

Four studies were included[15-18] that used the SF-36 for people with skin cancer. Two of these studies included patients with NMSC[15, 16] and the other two included people with skin melanoma.[17, 18] None were from the U.K.

Blanchard et al.[18] analysed the lifestyle behaviour and quality of life of 9105 cancer survivors, including 761 individuals with melanoma. The SF-36 discriminated between groups of survivors: skin melanoma survivors who followed three lifestyle behaviours (consuming five portions of fruit and vegetables every day, not smoking, physical activity) reported significantly higher health-related quality of life (HRQoL) than those who did not (P = 0·001). However, in a study assessing HRQoL in 121 patients with cervicofacial NMSC, SF-36 scores did not differ from historical norms.[16]

The SF-36 also discriminated between groups of people with melanoma.[17] At baseline, the SF-36 subscales (excluding General Health) were combined with the STAI (State-Trait Anxiety Inventory) scales and three WOC (Ways of Coping questionnaire) scales to allocate each patient to one of four clusters: ‘Physically unhealthy’, ‘Psychologically unhealthy’, ‘Physically/Psychologically unhealthy’ and ‘Healthy’. Participants completed the questionnaires at baseline and at 2, 5 and 9 months after treatment completion. Patients in the ‘Healthy’ cluster reported significantly (P < 0·001) higher HRQoL than patients in all the other clusters, regardless of time of completion. In addition, there were no significant differences in the General Health subscale of the SF-36 between the participants in the ‘Physically unhealthy’ and ‘Psychologically unhealthy’ clusters.

Discriminant validity was also reported in a cross-sectional study,[15] where SF-36 scores were strongly correlated with co-morbidities and a poorer self-reported health status. However, this study included only 19 patients with NMSC in a sample of 132 people with skin diseases.

Good internal consistency of the SF-36 is reported with people with cervicofacial carcinoma[16] with Cronbach's α ≥ 0·73 in all subscales except for General Health (α = 0·65) and Social Function (α = 0·45).
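
As a reminder of what the α values quoted throughout this review represent, the following minimal Python sketch computes Cronbach's α from item-level responses using the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The response data are invented for illustration and are not taken from any of the included studies; the 0·70 threshold is the group-comparison criterion from Table 4.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                       # number of items in the scale
    items = list(zip(*responses))               # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers from five respondents to a three-item subscale.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 1],
    [5, 4, 5],
]

alpha = cronbach_alpha(responses)
acceptable = alpha >= 0.70   # criterion for group comparisons in Table 4
print(f"alpha = {alpha:.2f}, acceptable for group comparisons: {acceptable}")
```

Because α rises with the number of items as well as with inter-item correlation, low values on short subscales (such as the two-item Social Function scale) should be interpreted with that in mind.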

Response rates of studies where SF-36 was used were variable. High response rates were reported[15, 16] (89% and 93%), whereas Blanchard et al.[18] reported that only 32% of cancer survivors completed the questionnaire.

Sickness Impact Profile

This review found only one publication from the U.K.[19] that evaluated the SIP. This study analysed the level of disability caused by NMSC in 44 people at diagnosis, 1 week after treatment, and at 3 months. The SIP demonstrated good responsiveness, detecting changes in levels of disability over time: disability scores were lower at baseline, increased slightly 1 week after treatment and then decreased below the baseline score after 3 months. Construct validity was also reported in the same study: a significant (P = 0·01) correlation was found between the SIP and the Dermatology Life Quality Index (DLQI) scores at 1 week after treatment. However, there was no correlation between the questionnaires at baseline and at 3 months, with the exception of the psychosocial dimension at 3 months (P = 0·05).

EQ-5D

The EQ-5D has been used in a limited number of studies with people with malignant melanoma to elicit utilities.[9] No evaluations were identified to report on measurement performance.

Skin cancer-specific patient-reported outcome measures

A total of nine cancer and skin cancer-specific PROMs were identified, for which adequate evidence of psychometric properties was available to enable appraisal. Table 8 outlines the condition-specific PROMs included in the review. Tables 6 and 7 provide further details.

Table 8. Condition-specific patient-reported outcome measures
Cancer-specific instruments
European Organization for Research and Treatment of Cancer Quality of Life Questionnaire (EORTC QLQ-C30)
Functional Assessment of Cancer Therapy – General version (FACT-G)
Functional Assessment of Cancer Therapy – Biological Response Modifier version (FACT-BRM)
Skin cancer-specific instruments
European Organization for Research and Treatment of Cancer Quality of Life Questionnaire – Melanoma module (EORTC QLQ-M)
Functional Assessment of Cancer Therapy – Melanoma (FACT-M)
Skin Cancer Index (SCI)
Facial Skin Cancer Index (FSCI)
Skin Cancer Quality of Life Impact Tool (SCQOLIT)
Dermatology-specific instruments
Dermatology Life Quality Index (DLQI)
Skindex

European Organization for Research and Treatment of Cancer Quality of Life Questionnaire – Core Module (EORTC QLQ-C30)[20]

Two articles provided EORTC QLQ-C30 data for this review;[21, 22] one was from the U.K.[22] Both evaluations were of patients with malignant melanoma.

The acceptability and feasibility of using computer touch screen technology in an ambulatory oncology setting was evaluated using several PROMs (EORTC QLQ-C30, Cancer Needs Questionnaire and Beck Depression Inventory).[21] Patients included in the study had different cancers (n = 451); approximately 10% (47) of these had melanoma. Results were presented for the total sample, which included 105 people over the age of 70 years. Acceptability and feasibility were evaluated using a specifically developed survey that focused on the ease or difficulty of using the system. Time to complete questionnaires was monitored by the system.

Overall there was high acceptability of the system with 99% of patients reporting it was easy to use. All items of the EORTC QLQ-C30 were completed by 99% of patients. The average time to complete was 4 min (range 1·7–20·9). Overall, patients who had never used a computer before took longer to complete the questionnaires.

Dixon et al. (2006, U.K.)[22] evaluated quality of life (QoL) and cost-effectiveness in a randomized controlled trial of interferon-α vs. placebo in 674 patients with malignant melanoma. EORTC QLQ-C30 scores were significantly higher in the placebo group, suggesting that patients receiving the intervention had significantly worse QoL. This may suggest discriminative properties of the instrument.

European Organization for Research and Treatment of Cancer Quality of Life Questionnaire – Melanoma module (EORTC QLQ-M)

Several studies report the development and evaluation of a melanoma module but most have been conducted in Sweden. No studies were identified in English-language populations, but information obtained from the EORTC Melanoma Group website outlines a programme of international research (including U.K.) in progress, developing and evaluating the Melanoma module. Field testing is expected to commence in March 2013.

Functional Assessment of Cancer Therapy – General version (FACT-G)[23]

This review identified only one article using the FACT-G in an English-speaking population with cervicofacial NMSC.[16] Good internal consistency (Cronbach's α = 0·61–0·90) was reported in 123 patients, and discriminative validity of the FACT-G was supported by significant correlations between FACT-G scores and patients' existing illnesses, medical risk factors and sun protection behaviours.

Functional Assessment of Cancer Therapy – Melanoma (FACT-M)[24]

Three studies were included in this review. None were from the U.K. Item reduction techniques were performed reducing 24 items to 18 with 402 patients. Factor analysis supported the two-domain structure following the exclusion of three items based on low variability of scores and missing values. Three further items were excluded based on Rasch analysis. Internal consistency for the revised scales was high (> 0·90) and correlations between the full version and reduced items were strong (0·98 and 0·96). Respondent burden was reduced by 25% in the shorter version.[25]

Good internal consistency was reported for all scales (α > 0·81) in 273 patients with different stages of melanoma,[26] and for the Melanoma subscale (0·79), together with high reproducibility (intraclass correlation coefficient, ICC 0·86).[27]

Significant hypothesized correlations were reported between the FACT-M subscales and other instruments measuring similar constructs [EORTC-QLQ Melanoma; Profile of Mood States (POMS)].[26] Significant differences in scores were also reported between patients receiving active treatment and those self-reporting improvement.

Discriminant validity is supported in 273 patients with significant group differences in scores between patients with different disease stages and those receiving active treatments.[28]

Longitudinal validity was supported with significant improvement in scores in QoL and patient-reported performance (KPS; ECOG-PS).[28]

Minimally important differences (MID) were examined using both distribution and anchor-based methods, with higher MIDs reported with anchor-based methods. The authors recommend the latter and the range of MIDs were reported as follows: Trial Outcome Index: 5–9 points; Melanoma Combined Scale: 4–6 points; Melanoma Subscale: 2–4 points; Melanoma Surgery Subscale 1–2 points.[28]
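
To make the distinction between the two families of methods concrete, the minimal Python sketch below shows two common distribution-based estimates (half a baseline SD, and one standard error of measurement) and one simple anchor-based estimate (the mean change among patients who rate themselves 'a little better'). These are widely used conventions chosen for illustration and are assumptions here; they are not necessarily the exact procedures used in the FACT-M study.[28] All data, the reliability value and the anchor wording are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mid_half_sd(baseline_scores):
    """Distribution-based estimate: half of the baseline standard deviation."""
    return 0.5 * stdev(baseline_scores)


def mid_sem(baseline_scores, reliability):
    """Distribution-based estimate: one standard error of measurement."""
    return stdev(baseline_scores) * sqrt(1 - reliability)


def mid_anchor(changes, anchors, target="a little better"):
    """Anchor-based estimate: mean change in patients giving the target rating."""
    selected = [c for c, a in zip(changes, anchors) if a == target]
    return mean(selected)


# Hypothetical baseline scores, change scores and global ratings of change.
baseline_scores = [40, 50, 60, 50, 45, 55]
changes = [2, 6, 4, 0, 8]
anchors = ["a little better", "much better", "a little better",
           "no change", "a little better"]

print(f"0.5 SD: {mid_half_sd(baseline_scores):.2f}")
print(f"1 SEM:  {mid_sem(baseline_scores, reliability=0.86):.2f}")
print(f"Anchor: {mid_anchor(changes, anchors):.2f}")
```

Anchor-based estimates tie the threshold to patients' own judgement of change, which is one reason they are often preferred when the two approaches disagree.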

Functional Assessment of Cancer Therapy – Biological Response Modifier version (FACT-BRM)[29]

The FACT-BRM is a 40-item BRM cancer-specific scale that supplements the general version (FACT-G). It has been reported that there are interferon treatment dose-dependent neuropsychiatric side-effects; patients with melanomas often receive high doses of such treatments and subsequently compliance with treatment may be difficult to sustain.[29]

Internal consistency of each subscale ranged from 0·91 (physical well-being) to 0·50 (social/family well-being); other domains' αs were > 0·75. Good test–retest reliability was also reported.[29]

Hypothesized significant correlations were observed between FACT-BRM subscale scores and measures of somatic complaints, depression and fatigue; correlations were > 0·75 between these scores and the Beck Depression Inventory (BDI) and Piper Fatigue Scale (PFS). The FACT-BRM detected change on all subscales except social/family well-being from before interferon administration to 1-month follow-up. In addition, change scores were significantly correlated with changes on the BDI and PFS. Of note, this was a very small study (n = 21 patients).[29]

Skin Cancer Index (SCI)[30]

One study was included.[30] Excellent internal consistency has been reported with α > 0·82. Factor analysis supports the three-domain structure.

Construct validity was supported by hypothesized correlations with measures of similar constructs (Lerman's Cancer Worry Scale; Dermatology Life Quality Index; Rosenberg Self-Esteem Scale; SF-12). Responsiveness has been reported, with statistically significant improvement in scores following surgery in 211 patients receiving Mohs surgery for NMSC.[31]

Ceiling effects were reported indicating patients with NMSC report a high level of quality of life. High levels of data completion have also been reported.[30]
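
The floor/ceiling criterion in Table 4 (fewer than 15% of respondents at either extreme of a summary score) can be checked with a short calculation. The Python sketch below is illustrative only: the scores and the 0–100 range are hypothetical and are not data from the SCI study.

```python
def floor_ceiling_effects(scores, min_score, max_score):
    """Return the percentage of respondents at the lowest and highest possible score."""
    n = len(scores)
    floor_pct = 100 * sum(1 for s in scores if s == min_score) / n
    ceiling_pct = 100 * sum(1 for s in scores if s == max_score) / n
    return floor_pct, ceiling_pct


THRESHOLD = 15  # acceptability criterion from Table 4 (per cent)

# Hypothetical summary scores on a 0-100 scale for ten respondents.
scores = [100, 100, 90, 80, 100, 0, 70, 60, 100, 95]

floor_pct, ceiling_pct = floor_ceiling_effects(scores, min_score=0, max_score=100)
has_floor_effect = floor_pct >= THRESHOLD
has_ceiling_effect = ceiling_pct >= THRESHOLD
print(f"floor {floor_pct:.0f}%, ceiling {ceiling_pct:.0f}%")
```

A marked ceiling effect limits how much improvement an instrument can register, which matters when it is used to evaluate treatment in patients who already report little impairment.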

Facial Skin Cancer Index (FSCI)[32]

Two studies were included in the review.[32, 33] Factor analysis supported the domain structure and Cronbach's α ranged from 0·78 to 0·87 for all domains except physical functioning (0·63).[32] Good reproducibility was reported with correlations > 0·75.[33] There were no floor and ceiling effects in patient scores or missing data.[32]

Dermatology Life Quality Index (DLQI)[34]

Three studies were included in the review, including the development article;[34-36] two were from the U.K.[34, 36] Evaluation during development included comparing scores of patients with dermatological conditions with those of healthy controls. Patients' scores were significantly higher than those of the control group, indicating better QoL in the healthy population and suggesting discriminative validity. Test–retest reliability was high, with correlations between testing > 0·95.[34]

The DLQI has been evaluated in patients with NMSC, but the authors suggest that it may not capture issues important to this population, as demonstrated by low scores (little impact) and little change over time.[35] A further study (U.K.) with a small sample of older people with skin cancer reported similar results.[36]

Skindex[37]

Four studies were included in the review. Further evaluation was performed during development; of 234 patients, 36 had NMSC. The distribution of scores across the scale demonstrated wide variability between skin conditions.[37]

Internal consistency was reported as high for each subscale (range 0·76–0·86), as was reproducibility (ICCs ranged from 0·68 to 0·90).[37]

Factor analysis supported the clustering of hypothesized constructs. Construct validity was supported with hypothesized correlations between Skindex and SF-36 (0·44–0·56).[15]

Responsiveness in small samples of patients with different dermatological conditions was demonstrated using anchor-based methods of patients reporting worsening or improvements in symptoms.[37]

The Skindex was revised by Chren et al.[15] in 1997 and the 61 items were reduced to 29. This study included 136 patients with NMSC from a total sample of 685 patients with different dermatological conditions. Reproducibility and internal consistency remained stable and the distributions of scores were consistent with different levels of symptomatology. Time to complete was significantly shorter: 5 min compared with 15 min for the longer version. Responsiveness was comparable with that in the previous developmental study, with changes in scores related to reports of improvement.

Further revisions were made to the Skindex by Chren et al.[38] in 2001; a three-domain (Symptoms, Emotions, Function) 16-item version was evaluated. Two underlying features underpinned the item reduction: reduced burden of completion and the measurement of ‘bother’ rather than ‘frequency’ of experience. The shorter version (16 items) was evaluated with a total of 541 patients, including 74 patients with NMSC. Comparable measurement properties were reported for this shorter version. Furthermore, the Skindex was applied in a prospective cohort study of 633 patients with NMSC receiving different excision methods. The Skindex discriminated between the effect and outcomes of each method and scores were significantly different between groups of patients with NMSC who reported better or worse quality of life.[39]

Skin Cancer Quality of Life Impact Tool (SCQOLIT)[40]

The instrument was developed and evaluated with 54 patients with malignant melanoma and 59 patients with NMSC.[40] Reproducibility was reported with ICCs > 0·72 for both groups. Internal consistency for the combined group was 0·8. Convergent validity was supported with significant correlations of change scores between the SCQOLIT and the DLQI. The SCQOLIT demonstrated some sensitivity to change, with statistically significant differences in scores from baseline to 3 months, but this was not considered to be clinically significant.[40]

Discussion

Scientific literature assessing the properties of generic PROMs for people with skin cancer in English-speaking populations was scarce and its methodological quality was limited (Table 9). Therefore, these conclusions must be read with caution.

Table 9. Appraisal of psychometric and operational performance of cancer, skin cancer and dermatology-specific patient-reported outcome measures (PROMs)

PROM (no. of studies) | Reproducibility | Internal consistency | Validity: Content | Validity: Construct | Responsiveness | Interpretability | Floor/ceiling/precision | Acceptability | Feasibility
Generic
SF-36 (4) | 0+0++0+0
SIP (1) | 0 | 0 | 0 | + | + | 0 | 0 | 0 | 0
Cancer-specific
EORTC QLQ-C30 (2) | 0 | 0 | 0 | + | 0 | 0 | 0 | + | +
FACT-G (1) | 0 | + | + | + | 0 | 0 | 0 | 0 | 0
FACT-M (3) | + | + | + | + | + | 0 | 0 | + | 0
FACT-BRM (1) | 0 | + | + | 0 | + | 0 | 0 | 0 | 0
SCI (1) | 0 | + | + | + | + | 0 | 0 | + | 0
FSCI (2) | + | + | + | + | 0 | 0 | 0 | + | 0
DLQI (3) | 0000000
Skindex-(61) (2) | + | + | + | + | + | 0 | 0 | + | 0
Skindex-(29) (1) | + | + | + | + | + | 0 | 0 | + | 0
Skindex-(16) (3) | + | + | + | + | 0 | 0 | 0 | + | 0
SCQOLIT (2) | + | + | + | + | + | 0 | 0 | 0 | 0

Psychometric and operational criteria: 0, not reported; –, no evidence in favour; +, some limited evidence in favour.

Only five publications met the inclusion criteria for evaluations of generic measures: four studies used the SF-36 (none U.K.) and one study used the SIP (U.K.). Their publication dates ranged from 1996 to 2008. Of the four studies that used the SF-36, two involved people with NMSC and two focused on people with skin melanoma. The U.K.-based study that used the SIP focused on people with NMSC. No studies were identified which used the EQ-5D. However, this instrument has been used to elicit utilities in people with melanoma in some studies.

Overall, there is a limited volume of scientific evidence for the application of generic PROMs in skin cancer: further evaluations are needed. It appears that the scores of patients with NMSCs are comparable to population norms, which suggests that the generic measures are not capturing aspects of quality of life important to these populations.

Several cancer- and dermatology-specific instruments were identified in the review. There is some psychometric evidence for the FACT-G and the FACT-M module; however, the FACT-G has been evaluated only in patients with NMSC. The FACT-M has more promising properties. Both evaluations of the EORTC QLQ-C30 were conducted with patients with malignant melanomas and those with advanced disease. Of particular interest is the ongoing developmental work on a melanoma module for the EORTC.

Two general dermatology instruments were included in the review: the DLQI and the Skindex. All of the studies evaluating these instruments were conducted with patients with NMSCs. The DLQI was developed in the U.K. and further evaluations suggest that the items do not reflect what is important to patients with skin cancer. The Skindex has more promising properties for patients with NMSCs, but most evaluations have used general dermatological populations with small subsamples of patients with NMSCs.

Attempts have been made to develop PROMs specific to NMSCs (the Skin Cancer Index and the Facial Skin Cancer Index), but there is a limited number of evaluations at present. Of the two, the Skin Cancer Index is the more promising for piloting in the NHS. The SCQOLIT has some evidence of applicability across both skin cancer types but more evaluations are needed.

The FACT-M has more promising characteristics for patients with malignant melanomas, especially those with advanced disease, and the forthcoming EORTC-M may also be an attractive option.
