Validation of measurement scores for evaluating vascular anomaly skin lesions

Abstract Vascular anomalies comprise a heterogeneous group of disorders caused by abnormal proliferation or development of vascular and/or lymphatic vessels. Vascular anomalies present with various symptoms and complications, but no standardized methods exist to evaluate their severity, and measuring treatment outcomes is difficult. To assess the responsiveness of measurement scores for evaluating vascular anomaly skin lesions, we conducted a validation study comparing these measurement scores with patients' objective data. Data were collected from treated and untreated patients. Skin lesions were photographed at baseline and after a follow-up period of 3-6 months. The volume of skin lesions, the degree of red or purple coloration, and color tone were measured objectively. Two external dermatologists evaluated patients' photographs and assigned scores based on criteria for improvements in skin lesions (size and color) as well as 6-point Physician Global Assessment scores. The correlation between these scores and patients' objective data (lesion volume and color) was assessed to validate the scores. Twenty-three cases of vascular anomaly (seven vascular tumors, five lymphatic malformations, three venous malformations, and eight lymphatic-venous malformations) were examined. Scores for improvements in vascular anomaly skin lesions (size and color) correlated with the change in lesion volume, the degree of red or purple coloration, color tone score, and 6-point Physician Global Assessment score. Our findings suggest that these measurement scores are responsive to changes in vascular anomaly skin lesions after observation.


KEYWORDS
disease scoring system, drug treatment, lymphatic malformation, vascular tumor, venous malformation

(the Kasabach-Merritt phenomenon). Slow-flow vascular malformations involve abnormal venous vessels (venous malformations [VM]), lymph vessels (lymphatic malformations [LM]), or a combination of these (combined vascular malformations). Management of these diseases requires multidisciplinary care, but no standardized treatments exist at present. 1 Most patients present with disease in the superficial, cutaneous, or subcutaneous tissues of the face, body, and limbs, which may cause major concerns about appearance. Treatments for these patients include surgical resection, laser therapy, and sclerotherapy; however, these treatment methods are not curative.
Recently, mammalian target of rapamycin (mTOR) inhibitors, such as sirolimus, have been used to treat VA with promising curative results. [3][4][5]  December 2020, were enrolled. Patients who received drug therapy for VA were included, but patients who had previously undergone surgical resection or laser therapy and patients with infected lesions were excluded.
Attending physicians collected patients' data (type of VA, age, sex, lesion location, and treatment) and evaluated the size (volume) and color of skin lesions. Photographs were taken by one photographer under comparable conditions (distance, location, and camera) at baseline (Visit 1) and after a follow-up period of 3-6 months (Visit 2). Volume was determined using the major axis, minor axis, and thickness. Color was quantified by measuring the degree of red or purple coloration with the Pantone® Color Sample (Pantone LLC, Carlstadt, NJ, USA) (Tables S1 and S2). To objectively calculate the gray value of the affected skin lesion, three regions of interest (ROI) of lesions and normal skin in each photograph were selected, and the mean gray values of the ROI were measured using image analysis software (Image J; US National Institutes of Health, Bethesda, MD, USA). The skin lesion color value was defined as the ratio of cutaneous lesion color to normal skin color.
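The color value described above can be illustrated with a minimal sketch. It assumes that the mean gray values of the ROI are simply averaged within each tissue type before the ratio is taken; the text does not state how the ROI were combined, and the function name and all numbers below are hypothetical.

```python
from statistics import mean


def lesion_color_value(lesion_roi_grays, normal_roi_grays):
    """Ratio of lesion gray value to normal-skin gray value.

    Each argument is a list of mean gray values, one per ROI.
    Averaging across ROI is an assumption, not a detail from the study.
    """
    if not lesion_roi_grays or not normal_roi_grays:
        raise ValueError("each tissue type needs at least one ROI value")
    normal_mean = mean(normal_roi_grays)
    if normal_mean == 0:
        raise ValueError("normal-skin gray value must be non-zero")
    return mean(lesion_roi_grays) / normal_mean


# Hypothetical gray values (0-255 scale) for one patient.
visit1 = lesion_color_value([92.0, 88.0, 90.0], [150.0, 148.0, 152.0])
visit2 = lesion_color_value([120.0, 118.0, 122.0], [150.0, 149.0, 151.0])

# A ratio moving toward 1 means the lesion looks more like normal skin.
change = visit2 - visit1
print(round(visit1, 3), round(visit2, 3), round(change, 3))
```

Normalizing by the patient's own normal skin is what makes values from different photographs comparable, since overall exposure differences largely cancel in the ratio.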

| Assessment
To ensure assessment score reliability, two dermatologists who were independent from other investigators performed the assessment.
These external evaluators assessed patients' photographs and assigned scores according to criteria for improvements in VA skin lesions (size and color) (Table 1), VA size, and VA color (Tables S3 and S4). These measurement scores were created based on previously reported scores for facial angiofibroma in patients with tuberous sclerosis complex (TSC). 6 These criteria are categorized on a 6-point scale: "markedly improved" (score of 3), "improved" (score of 2), "slightly improved" (score of 1), "unchanged" (score of 0), "slightly exacerbated" (score of −1), and "exacerbated" (score of −2). The 6-point Physician Global Assessment (PGA) score of VA skin lesions was determined both quantitatively (0-5) and relatively, with the baseline score (Visit 1) defined as 4 (Table S5). 7 We compared the clinical scores evaluated by external evalua-

| Statistical analysis
Continuous variables are summarized as medians with interquartile ranges. Categorical variables are summarized as numbers with percentages. Spearman's correlation coefficients were calculated to compare patients' objective data with the scores assigned by the two external evaluators. Intraclass correlation coefficients (ICC) based on the two-way random-effects model, ICC(2,1), were calculated for each score to assess inter-rater reliability. A p-value of less than 0.05 was considered statistically significant. All analyses were performed using the DescTools and irr packages in R version 4.0.3 (The R Foundation for Statistical Computing). The measurement outcomes of patients are shown in Table 3.
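For readers unfamiliar with the two statistics named above, the following is a minimal sketch of Spearman's rank correlation and ICC(2,1) in plain Python. It is not the authors' analysis code (the study used the DescTools and irr packages in R); the function names and the example data are hypothetical, and ICC(2,1) is taken here in the Shrout and Fleiss sense (two-way random effects, absolute agreement, single rater).

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def icc_2_1(ratings):
    """ICC(2,1) from a list of rows (subjects) by columns (raters)."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ms_rows = ss_rows / (n - 1)                       # between subjects
    ms_cols = ss_cols / (k - 1)                       # between raters
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical data: improvement scores vs. percentage change in volume.
scores = [2, 1, 0, 3, 1, -1]
volume_change = [-30, -10, 0, -50, -12, 8]
rho = spearman_rho(scores, volume_change)

# Hypothetical scores from two raters for three patients.
icc = icc_2_1([[1, 2], [2, 3], [3, 4]])
print(round(rho, 3), round(icc, 3))
```

A negative rho is expected in this toy example because a higher improvement score corresponds to a larger decrease in volume. The ICC example shows why absolute agreement matters: the second rater scores every patient exactly one point higher, so the ranking agrees perfectly but ICC(2,1) is well below 1.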

| RESULTS
The median skin lesion volume of all patients decreased, and the median degree of red or purple coloration (Pantone Color Sample) also decreased at Visit 2. The value for color tone calculated using image analysis software (Image J) increased, reflecting an improvement in the intensity of red or purple coloration.
The median improvement scores assigned by the two external evaluators for combined size and color, size alone, and color alone were 1, 0, and 1, respectively. Regarding the co-primary end-point, the correlation coefficients between the improvements in VA skin lesions (size and color) and the change in lesion size (volume) and the degree of red or purple coloration (Pantone Color Sample) were −0.695 and −0.869, respectively, and the 95% confidence intervals (CI) were −0.86 to −0.396 (p < 0.001) and −0.944 to −0.713 (p < 0.001), respectively (Table 4). Additionally, the correlation coefficient between the VA skin lesion improvement score (size and color) and the change in color tone score (Image J) was 0.442, and the 95% CI was 0.037 to 0.723 (p = 0.035). As for the divided scores, the VA skin lesion improvement score (size) correlated with the change in lesion size, and the improvement score (color) correlated with the change in the degree of red or purple coloration (Pantone Color Sample) and with the change in color (Image J). The change in quantitative 6-point PGA score and the relative 6-point PGA score correlated with changes in objective patient data, excluding the correlation between the change in quantitative 6-point PGA score and the change in color (Image J). When comparing treated with untreated patients, skin lesion volume and the degree of red or purple coloration decreased significantly in treated patients, and the lesion color tone improved more markedly in treated patients than in untreated patients (Table S6).

TABLE 1 Improvements in vascular anomaly skin lesions (size and color)

| Score | Improvements | Criteria |
| --- | --- | --- |
| 3 | Markedly improved | Overall shrinkage, flattening, or disappearance of tumors is observed. A nearly overall large decrease in the intensity of reddishness/purplish or a nearly overall change in reddishness/purplish to the level equal to that of the normal region is observed. |
| 2 | Improved | Nearly overall shrinkage or flattening of tumors or a nearly overall decrease in the intensity of reddishness/purplish is observed. Or, partial disappearance of tumors or a partial large decrease in the intensity of reddishness/purplish is observed. |
| 1 | Slightly improved | Partial shrinkage or flattening of tumors or a partial decrease in the intensity of reddishness/purplish is observed. Or, a nearly overall slight decrease in the intensity of reddishness/purplish is observed. |
| 0 | Unchanged | There is no definite change in the size or the reddishness/purplish of tumors. |
| −1 | Slightly exacerbated | Partial enlargement or new formation of tumors or a partial increase in the intensity of reddishness/purplish is observed. Or, a nearly overall slight increase in the intensity of reddishness/purplish is observed. |
| −2 | Exacerbated | A nearly overall enlargement or new formation of tumors or a partial large enlargement of tumors and a partial large increase in the intensity of reddishness/purplish are observed. Or, more severe exacerbation is observed. |

Note: Overall, no less than approximately 75% of the extent of the lesion at baseline; nearly overall, approximately 50-75% of the extent of the lesion at baseline; partial, approximately 25-50% of the extent of the lesion at baseline; (the color intensity is) largely decreased, change of three levels or more in red/purple coloration in terms of Pantone Color Sample (Tables S1 and S2); (the color intensity is) decreased/increased, change of two levels or more in red/purple coloration in terms of Pantone Color Sample; (the color intensity is) slightly decreased/increased, change of one level in red/purple coloration in terms of Pantone Color Sample.
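As a plausibility check on confidence intervals of this kind, the sketch below computes an approximate 95% CI for a correlation coefficient using the Fisher z-transformation with standard error 1/sqrt(n − 3). This is a common approximation; the text does not state which interval method was used, so the method and the function name `fisher_ci` are assumptions made for illustration.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate CI for a correlation coefficient via Fisher's z.

    r: correlation coefficient, n: number of paired observations.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_crit / sqrt(n - 3)
    z = atanh(r)
    return tanh(z - half_width), tanh(z + half_width)


# Reported coefficient for lesion volume, with the 23 cases in this study.
lo, hi = fisher_ci(-0.695, 23)
print(round(lo, 3), round(hi, 3))
```

With r = −0.695 and n = 23, the approximation gives an interval close to the reported −0.86 to −0.396. An interval computed this way always contains the point estimate and has the same sign pattern, which makes it a quick way to spot transcription errors in reported bounds.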

| DISCUSSION
We conducted a validation study to establish methods to evaluate treatment outcomes in patients with VA skin lesions. We used three measurement scores (improvements in VA skin lesions) and 6-point PGA scores of VA skin lesions and analyzed the correlation between these scores and actual objective data. The data suggest that measurement scores evaluated from patients' photographs were effective at expressing changes in lesion size and color. Additionally, ICC values between external evaluators 1 and 2 showed high reliability.
To our knowledge, this is the first study to assess the responsiveness of VA skin lesion measurement scores.
In the current literature, no uniform assessment methods evaluate the effectiveness of treatment for VA skin lesions. mTOR plays a key role in the pathogenesis of various VA, and the mTOR inhibitor sirolimus has been identified as a promising treatment for VA. 3 In a recent systematic review of sirolimus treatment for VA, Freixo et al. 5 noted that the definition and assessment of outcomes were heterogeneous between studies and were thus difficult to compare systematically. In the first large-sample study, Adams et al. 3 showed a high response rate to sirolimus in patients with LA. They used three distinct assessments: radiological examination, functional impairment score, and QOL score. Triana et al. 4 also reported 41 patients with VA who displayed volume reduction on radiological imaging with sirolimus treatment. In patients with LM, our clinical trial attempted to assess not only lesion volume using magnetic resonance imaging, but also the severity of VA to assess the degree of impairment of affected organs. 8 These clinical scores are very useful for assessing organ function in patients with VA, but would be unsuitable for evaluating VA skin lesions. At present, no validated measurement scores for VA skin lesions are available.

TABLE 3 Measurement outcome and patient assessment by external evaluators
Clinical assessments using photographs of skin lesions have been widely used in dermatology. In clinical trials of infantile hemangioma (IH) treated with propranolol, digital photographs of skin lesions were taken before and after treatment. 9 The photographs included a color chart to calibrate color and size, and were assessed by two independent and trained readers using a photograph-based centralized assessment. 10,11 In these studies, photographs will be qualitatively assessed by two blinded independent trained readers without informative data. However, the details of the evaluation methods and criteria of these ongoing trials have not yet been officially announced.
In this study, we used a novel measurement score to evaluate the efficacy of drugs in patients with VA, which was referenced to the outcome score of improvements in facial angiofibroma in patients with TSC. This score is useful and easy to assess. 6 The outcome measurement score of angiofibroma was also assessed by independent evaluators using photographs. We created three distinct scores, including a combined score for size and color, size alone, and color alone. Our results reveal that all measurement scores evaluated by external evaluators correlated with actual objective data in all patients, including treated patients. This suggests that our VA scoring system could be useful for evaluating treatments for skin lesions.
Of note, the 6-point PGA scoring system is a frequently used primary end-point in dermatology clinical trials. 7 It is known that evaluating this score using clinical photographs is useful. 12  evaluate cutaneous microcystic LM as a primary outcome. 11 Although the score is not specified in skin VA, our results show excellent correlation between scores evaluated by two external evaluators and patients' objective data. This suggests that 6-point PGA may also be a useful measurement scoring system as an alternative end-point.
Our study had several limitations. This was a retrospective study, and the number of patients was small. The study included patients with a variety of VA and heterogeneous diseases. Thus, we should consider differences between these diseases in future research.
In conclusion, we validated measurement scores to evaluate treatment for VA skin lesions. External assessments using our modified scores were significantly correlated with actual objective changes in patient data. Our results provide novel measurement scores to evaluate the efficacy of treatments for VA skin lesions.