Clinicopathologic models predicting non‐sentinel lymph node metastasis in cutaneous melanoma patients: Are they useful for patients with a single positive sentinel node?

Abstract
Background and Objectives: Of clinically node‐negative (cN0) cutaneous melanoma patients with sentinel lymph node (SLN) metastasis, between 10% and 30% harbor additional metastases in non‐sentinel lymph nodes (NSLNs). Approximately 80% of SLN‐positive patients have a single positive SLN.
Methods: To assess whether state‐of‐the‐art clinicopathologic models predicting NSLN metastasis had adequate performance, we studied a single‐institution cohort of 143 patients with cN0 SLN‐positive primary melanoma who underwent subsequent completion lymph node dissection. We used sensitivity (SE) and positive predictive value (PPV) to characterize the ability of the models to identify patients at high risk for NSLN disease.
Results: Across Stage III patients, all clinicopathologic models tested had comparable performances. The best performing model identified 52% of NSLN‐positive patients (SE = 52%, PPV = 37%). However, for the single SLN‐positive subgroup (78% of cohort), none of the models identified high‐risk patients (SE > 20%, PPV > 20%) irrespective of the chosen probability threshold used to define the binary risk labels. Thus, we designed a new model to identify high‐risk patients with a single positive SLN, which achieved a sensitivity of 49% (PPV = 26%).
Conclusion: For the largest SLN‐positive subgroup, those with a single positive SLN, current model performance is inadequate. New approaches are needed to better estimate nodal disease burden of these patients.


| INTRODUCTION
Clinically node-negative cutaneous melanoma patients with a positive sentinel lymph node (SLN) are no longer routinely treated with immediate completion lymph node dissection (CLND). Both the large MSLT-II and smaller DeCOG studies demonstrated that immediate CLND in SLN-positive patients enrolled in these trials did not significantly improve melanoma-specific survival compared to patients in the active surveillance group, who were offered CLND only when regional recurrence was identified. [1][2][3] Consequently, and following a trend that pre-dated the publication of these trial results, most SLN-positive patients are managed with active surveillance, [2] as it has been clear for decades that only 10%-30% of clinically node-negative (cN0) SLN-positive melanoma patients will have additional metastatic disease in non-sentinel lymph nodes (NSLNs) at CLND. [3] As there is no widely accepted algorithm to predict NSLN status, there is a growing unmet clinical need to stratify cN0 SLN-positive patients into a high-risk group, which might be managed with active surveillance and/or systemic therapies, and a low-risk group, which might forgo not only additional surgery and systemic therapy but nodal basin surveillance altogether.
Among the cN0 SLN-positive patients, approximately 80% have only a single positive SLN. [4] These patients are considered the lowest-risk group among Stage III melanoma patients. Nevertheless, one out of six of these patients will have additional metastases in the NSLNs and will relapse in the at-risk basin over time. [5] Therefore, a reliable tool that identifies the high-risk patients within the single SLN-positive group would be of clinical value.
In recent years, several clinicopathologic models have been developed to predict NSLN positivity in SLN-positive patients, [6][7][8][9] and some have been validated in external independent cohorts. [10] However, only one of these models was designed explicitly for the subset of patients with a single positive SLN. The performance of the other models has not yet been assessed for this patient group. As a result, the utility of these other models for the majority of cN0 SLN-positive melanoma patients remains unknown.
In this study, we sought to assess the ability of existing clinicopathologic models, [6][7][8][9] and of a newly developed clinicopathologic model to identify SLN-positive melanoma patients with an increased risk for NSLN positivity. Specifically, we investigated the performance of these clinicopathologic models in predicting NSLN positivity for patients with a single positive SLN.

| Patient cohort
We retrospectively assembled a cohort of 200 patients treated at Mayo Clinic tertiary care centers in Minnesota, Arizona, or Florida between 2004 and 2017. This cohort represents a Stage III patient subset of a larger cohort previously described. [11] All patients underwent lymphatic mapping with imaging and had a sentinel lymph node biopsy (SLNB) within 90 days of their primary melanoma diagnosis.
Of these SLN-positive patients, 153 underwent a CLND also within 90 days of their primary diagnostic biopsy. Ten patients were excluded for lack of available biopsy material. Thus, the total number of patients included was 143. The human investigations performed in this study were completed after approval by the Mayo Clinic Institutional Review Board and in accordance with the requirements of the Department of Health and Human Services, where appropriate.

| Statistical methods
The probability of NSLN positivity was estimated using a logistic regression model. Specifically, we used LASSO regression, [12] a regularized logistic regression that reduces the number of predictors for increased model interpretability. Models were built in R version 3.6.1 (R Foundation for Statistical Computing) [13] using the glmnet package (v3.0.2). [14] Given the limited size of our cohort, we reduced the number of candidate predictors by including only continuous clinicopathologic (CP) variables; because CP variables are highly correlated with one another, not all of them are needed. The variables included were age, number of positive SLNs, Breslow thickness, mitotic rate, and the diameter of the largest SLN metastasis. We log-transformed the values of the last three variables, using a pseudocount of 0.01, to decrease the influence of outlier observations. To avoid potential differences between clinicopathologic practices, the number of positive SLNs was used in the model as a binary variable (one or more than one SLN). Missing values were replaced by the median value of the corresponding variable across all patients, assuming that the values are missing at random. The diameter of the largest lymph node metastasis was set to 0.01 mm for patients with isolated tumor cells and to 0.099 mm for patients with a diameter less than 0.1 mm.
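The preprocessing steps described above can be illustrated with a minimal Python sketch (the study itself used R). The dictionary keys, the function names, the use of the natural logarithm, and the order of operations (small-deposit conventions first, then median imputation, then the log transform) are assumptions made for illustration; the text does not specify them.

```python
import math
from statistics import median

PSEUDOCOUNT = 0.01  # pseudocount given in the text

def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def adjust_sln_diameter(diameter_mm, isolated_tumor_cells):
    """Conventions from the text for very small SLN deposits (in mm)."""
    if isolated_tumor_cells:
        return 0.01
    if diameter_mm is not None and diameter_mm < 0.1:
        return 0.099
    return diameter_mm

def log_pseudo(x):
    """Natural log with pseudocount (the log base is an assumption)."""
    return math.log(x + PSEUDOCOUNT)

def preprocess(patients):
    """Return one feature row per patient:
    [age, multiple_sln, log_breslow, log_mitotic_rate, log_sln_diameter]."""
    age = impute_median([p["age"] for p in patients])
    n_sln = impute_median([p["n_positive_sln"] for p in patients])
    breslow = impute_median([p["breslow_mm"] for p in patients])
    mitoses = impute_median([p["mitotic_rate"] for p in patients])
    diameter = impute_median(
        [adjust_sln_diameter(p["sln_diameter_mm"], p["itc"]) for p in patients]
    )
    # The number of positive SLNs enters as a binary variable (one vs. more than one).
    return [
        [a, 1 if n > 1 else 0, log_pseudo(b), log_pseudo(m), log_pseudo(d)]
        for a, n, b, m, d in zip(age, n_sln, breslow, mitoses, diameter)
    ]
```

Each returned row could then be passed to any regularized logistic regression routine; the sketch stops before model fitting.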

| Performance evaluation
Our new model was designed and evaluated using a repeated cross-validation training/validation scheme, namely double loop cross-validation (DLCV), which provides a form of internal validation. [15] DLCV separates feature selection and model optimization, which occur in the inner loop, from model evaluation, which occurs in the outer loop. As a result, reliable estimates of model performance on unseen patients can be obtained. We used ten inner folds to optimize the model parameters and five outer folds to evaluate the models. The procedure was repeated ten times. We optimized the clinicopathologic model by finding the value of the λ parameter that minimizes deviance across the inner test folds, exploring all λ values on glmnet's default regularization path.
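The structure of the double loop procedure can be sketched as follows. This is a schematic Python illustration, not the authors' R implementation: `fit`, `deviance`, and `score` are hypothetical callables supplied by the user (for example, a LASSO fit at a given λ, the binomial deviance on a fold, and the AUC on a fold), and the summing of deviance over inner folds is one plausible reading of "minimizes deviance across the inner test folds".

```python
import random

def k_folds(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def double_loop_cv(n, lambdas, fit, deviance, score,
                   outer_k=5, inner_k=10, repeats=10, seed=0):
    """Average outer-fold score over repeated double loop cross-validation.

    fit(train_idx, lam) -> model          (hypothetical callable)
    deviance(model, test_idx) -> float    (hypothetical callable)
    score(model, test_idx) -> float       (hypothetical callable, e.g. AUC)
    """
    rng = random.Random(seed)
    outer_scores = []
    for _ in range(repeats):
        for test_idx in k_folds(list(range(n)), outer_k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(n) if i not in held_out]
            inner_folds = k_folds(train_idx, inner_k, rng)

            # Inner loop: total deviance of a candidate lambda on the inner test folds.
            def inner_deviance(lam):
                total = 0.0
                for fold in inner_folds:
                    fold_set = set(fold)
                    fit_idx = [i for i in train_idx if i not in fold_set]
                    total += deviance(fit(fit_idx, lam), fold)
                return total

            best_lam = min(lambdas, key=inner_deviance)
            # Outer loop: refit on all training patients, evaluate on unseen patients.
            model = fit(train_idx, best_lam)
            outer_scores.append(score(model, test_idx))
    return sum(outer_scores) / len(outer_scores)
```

Because the outer test fold never enters the inner loop, the choice of λ cannot leak information from the patients used for evaluation. With equal numbers of outer folds per repetition, the flat average returned here equals the average of per-repetition averages.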
Cross-validated AUCs of our clinicopathologic model were determined by averaging outer loop AUCs using the R package cvAUC (v1.1.0) and then averaging these estimates over ten repetitions. The discriminative ability of all models was evaluated using widely used measures: sensitivity (SE), specificity (SP), negative predictive value (NPV), and positive predictive value (PPV), each with a corresponding 95% Clopper-Pearson CI, [16] as well as the area under the receiver operating characteristic curve (AUC). We aimed to optimize a model for sensitivity and PPV.
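As an illustration of these measures, the following Python sketch computes SE, SP, PPV, and NPV from confusion-matrix counts together with exact (Clopper-Pearson) 95% confidence intervals. The function names are hypothetical, and the interval bounds are found by bisection on the binomial distribution function rather than by a beta quantile routine, which is a choice made here to keep the example dependency-free. All denominators are assumed to be nonzero.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of a function f that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: binom_cdf(k - 1, n, p) - (1 - alpha / 2))
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper

def diagnostic_metrics(tp, fp, tn, fn):
    """Point estimates and 95% CIs; each metric is a binomial proportion k/n."""
    counts = {
        "SE": (tp, tp + fn),    # NSLN-positive patients labeled high risk
        "SP": (tn, tn + fp),    # NSLN-negative patients labeled low risk
        "PPV": (tp, tp + fp),   # high-risk labels that are NSLN positive
        "NPV": (tn, tn + fn),   # low-risk labels that are NSLN negative
    }
    return {name: (k / n, clopper_pearson(k, n)) for name, (k, n) in counts.items()}
```

For example, `diagnostic_metrics(tp=10, fp=20, tn=60, fn=10)` (made-up counts, not study data) returns each estimate paired with its interval.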

| Probability threshold for binary risk labels
Our primary goal was to correctly identify patients at high risk of NSLN positivity. Therefore, the output probabilities of the newly developed clinicopathologic model were converted into risk labels by setting a probability threshold so that patients with probabilities higher than the threshold are deemed high risk, and those with probabilities lower than the threshold are deemed low risk. This threshold was defined to maximize the F1-measure during training.
The F1-measure [17] is the harmonic mean of our target metrics (SE and PPV) and is expressed as $F_1 = \frac{2 \times \mathrm{SE} \times \mathrm{PPV}}{\mathrm{SE} + \mathrm{PPV}}$. By maximizing this measure, we aimed to find a probability threshold that would lead to a good tradeoff between our target metrics. We defined two different thresholds during DLCV: one optimized in the entire cohort and another optimized in patients with a single positive node only.
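A minimal Python sketch of this threshold selection is given below. The function names and the candidate grid (zero plus every distinct predicted probability) are assumptions made for illustration; the text does not state how candidate thresholds were enumerated.

```python
def se_ppv(probs, labels, threshold):
    """Sensitivity and PPV when patients with prob > threshold are labeled high risk."""
    tp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p <= threshold and y == 1)
    se = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return se, ppv

def f1(se, ppv):
    """Harmonic mean of sensitivity and PPV (0 when both are 0)."""
    return 2 * se * ppv / (se + ppv) if se + ppv else 0.0

def best_threshold(probs, labels):
    """Threshold maximizing F1 on the training data.

    Candidates are 0 and each distinct predicted probability (an assumption);
    ties are resolved in favor of the lowest threshold.
    """
    candidates = sorted(set(probs) | {0.0})
    return max(candidates, key=lambda t: f1(*se_ppv(probs, labels, t)))
```

To obtain the two thresholds described above, `best_threshold` would be called once on the training probabilities of the entire cohort and once on those of patients with a single positive SLN only. As a point of reference, the reported SE of 49% and PPV of 26% correspond to an F1 of about 0.34.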

| Comparison with publicly available models
The newly developed model was compared with four publicly available models that predict NSLN positivity via a nomogram or a scoring system: the models developed by Gershenwald et al. [8] and Bhutiani et al., [9] the N-SNORE scoring system, [6] and the nomogram developed by Bertolli et al. [7] (Table 1). These models were selected because they were externally validated, and their input variables were available in our cohort. The Bhutiani model [9] is the only model specifically designed for patients with a single positive SLN. The added value of all models was further assessed relative to two simple rules based on SLN tumor burden variables: number of positive SLNs (single vs. more than one) or diameter of largest SLN metastasis (≤1 mm vs. >1 mm). These are referred to as "Positive SLN rule" and "1 mm rule," respectively. These models were implemented in R to calculate the risk score for each [...]

TABLE 1 Characteristics of published models assessed in our cohort. Columns: Model; Variables used in the model; Type.

| Performance of clinicopathologic models in the prediction of NSLN positivity
Our newly developed clinicopathologic model for NSLN positivity prediction included age, Breslow thickness (log), mitotic rate (log), largest SLN metastasis diameter (log), and the number of positive SLNs (single vs. more than one node; see Table 3 for coefficients). The [...] Table 5). The poor sensitivity (<20%) achieved by most existing models illustrates that the recommended probability or score thresholds for defining high risk missed most NSLN-positive patients with a single positive SLN. Moreover, varying the probability threshold for these models did not lead to a better tradeoff between sensitivity and PPV (Figure 2). Our newly developed clinicopathologic model with the probability threshold optimized for the entire cohort also achieved a low sensitivity in the single positive SLN patient group (23%). Therefore, we assessed a probability threshold optimized for patients with a single positive SLN during training. Our clinicopathologic model with an adjusted threshold led to a better tradeoff between the two metrics: PPV increased to 26% and sensitivity to 49%.

| DISCUSSION
In this study, we assessed the ability of state-of-the-art clinicopathologic models, designed to predict NSLN positivity, to identify high-risk clinically node-negative SLN-positive melanoma patients who are most likely to benefit from additional therapeutic and/or intensive active surveillance strategies. We validated the performance of these clinicopathologic models in our cohort, and we found that the [...] 49%. Remarkably, changing the probability or score thresholds of all clinicopathologic models also failed to improve the tradeoff between SE and PPV. This suggests that the limited performance is intrinsic to the models themselves rather than a consequence of the chosen threshold.

Note: The performance of our newly developed clinicopathologic model, two heuristic rules, and three publicly available clinicopathologic models has been characterized by the area under the receiver operating characteristic curve (AUC), specificity (SP), sensitivity (SE), positive predictive value (PPV), and negative predictive value (NPV). Confidence intervals of publicly available models for AUC and other metrics were calculated using DeLong and Clopper-Pearson methods, respectively. Confidence intervals of the newly developed clinicopathologic model were computed in the double loop procedure. (*) One patient was excluded due to the unknown largest SLN diameter, (**) 22 patients were excluded due to unknown regression, and one due to unknown largest diameter.
The use of clinicopathologic models to predict NSLN positivity for clinically node-negative SLN-positive melanoma patients is promising; however, current models showed limited performance in our analysis. The success of future clinicopathologic models will depend on their ability to complement the current staging system (N and T stages).

| CONCLUSIONS
More work is needed to design models that can accurately predict [...] cutaneous melanoma, SLN metastasis, liquid biopsy, or afferent lymphatic channel fluid, in combination with clinicopathologic features, is a fertile avenue of exploration to improve melanoma risk stratification and tailor patient care. [11]

Note: The performance of our newly developed clinicopathologic model, one heuristic rule, and four publicly available clinicopathologic models has been characterized by the area under the receiver operating characteristic curve (AUC), specificity (SP), sensitivity (SE), positive predictive value (PPV), and negative predictive value (NPV). Confidence intervals of publicly available models for AUC and other metrics were calculated using DeLong and Clopper-Pearson methods, respectively. Confidence intervals of the newly developed clinicopathologic model were computed in the double loop procedure. (*) One patient was excluded due to the unknown largest SLN diameter, (**) 19 patients were excluded due to unknown regression, and one due to unknown largest diameter.

FIGURE 2 Positive predictive value (PPV) and sensitivity (SE) achieved by available clinicopathologic models among patients with a single positive SLN. The models evaluated, each assigning patients to high-risk or low-risk groups, were a heuristic rule ("1 mm rule") and the Gershenwald, N-SNORE, Bertolli, and Bhutiani models. Curves are generated by varying the probability or score cutoff of the models and obtaining the corresponding PPV and sensitivity. SLN, sentinel lymph node.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.