Risk prediction for metastasis of clear cell renal cell carcinoma using digital multiplex ligation‐dependent probe amplification

Abstract Precise quantification of copy‐number alterations (CNAs) in a tumor genome is difficult. We have applied a comprehensive copy‐number analysis method, digital multiplex ligation‐dependent probe amplification (digitalMLPA), for targeted gene copy‐number analysis in clear cell renal cell carcinoma (ccRCC). Copy‐number status of all chromosomal arms and 11 genes was determined in 60 ccRCC samples. Chromosome 3p loss and 5q gain, known as early changes in ccRCC development, as well as losses at 9p and 14q were detected in 56/60 (93.3%), 31/60 (51.7%), 11/60 (18.3%), and 33/60 (55.0%) samples, respectively. Through gene expression analysis, a significant positive correlation was detected between 14q loss determined using digitalMLPA and downregulation of mRNA expression ratios of HIF1A and L2HGDH (P = .0253 and .0117, respectively). Patients with early metastasis (<1 y) (n = 18) showed CNAs in a median of 6 arms, whereas metastasis‐free patients (n = 34) showed CNAs in significantly fewer arms (median, 3 arms) (P = .0289). In particular, biallelic deletion of CDKN2A/2B was associated with multiple CNAs (≥7 arms) in 3 tumors. Combining these data with sequence‐level mutations in the genes VHL, PBRM1, SETD2, and BAP1, we performed multiple correspondence analysis, which identified an association of 9p loss and 4q loss with early metastasis (both P < .05). This analysis also indicated an association of 4p loss and 1p loss with poor survival (both P < .05). These findings suggest that CNAs have essential roles in the aggressiveness of ccRCC. Our approach of measuring CNAs through digitalMLPA may facilitate the selection of patients who are likely to develop metastasis.


| INTRODUCTION
Next-generation sequencing (NGS) has revolutionized the analysis of nucleic acids and has become an essential tool for the detection of somatic mutations in tumor tissues, for both research and clinical use. 1 NGS is optimal for detecting single nucleotide variants and small insertions and deletions, but its application remains challenging for larger alterations such as single gene/exon copy-number (CN) variants. The precise quantification of CNAs in tumor genomes using NGS is difficult, especially when the tumor content of the tissue is low and/or when a target NGS approach is used.
Currently, tumor genome research with NGS is biased toward sequence-level mutations that are easily detected and tends to overlook CNAs, which also have an essential role in tumorigenesis. 2 The standard tools for CN analysis are multiplex ligation-dependent probe amplification (MLPA) and array comparative genomic hybridization (aCGH). [3][4][5] However, the former approach can assess only a limited number of targets simultaneously, and the latter is labor intensive and therefore suboptimal for clinical use.
digitalMLPA is a novel technique for CN detection that combines MLPA and NGS (Figure 1A), 6 in which amplicons are generated for each ligated digitalMLPA probe, and NGS is used to determine the read numbers of each amplicon. The CN status of the target regions is then determined by quantification of the relative read counts of the various amplicons.
Among human carcinomas, clear cell renal cell carcinomas (ccRCCs) are known for substantial genetic heterogeneity, ie, involvement of CN losses and gains of large chromosomal regions, [7][8][9][10][11] and low rates of base-substitution mutations. 12 In this study, we used a digitalMLPA assay containing 384 probes to determine the key CNAs in ccRCCs.

| Patient samples
Samples were obtained from patients at the Hospital of Hyogo Fresh ccRCC tissues were resected during surgery, and snap frozen at −80°C. Genomic DNA was extracted from frozen tissues using AllPrep DNA/RNA Mini kits (Qiagen, Hilden, Germany) and purified using NucleoSpin gDNA Clean-up reagents (Macherey-Nagel, Düren, Germany). Clinicopathological data for the patients included in the study are described in Table 1.
The digitalMLPA protocol has previously been described in detail 6 and uses 20-40 ng of genomic DNA. For each probe, at least 600 reads were generated. Data analysis was performed using in-house software by MRC Holland. In each experiment batch, the read number generated for each probe was compared with that of the reference samples, which consisted of genomic DNA from 20 healthy young Japanese men (each from Epstein-Barr virus-transformed B cells) obtained from the Riken DNA bank (Ibaraki, Japan) and then pooled. Samples with insufficient DNA quality, or with suboptimal digitalMLPA reaction quality as indicated by quality control probes, were removed. In the same experiment batch used for tumor analysis, 22 blood DNA samples from patients with ccRCC, purified using the same procedure, were analyzed. The standard deviation (SD) values of the CN ratios were calculated, and the average SD value was 0.054 among all probes of autosomal genes (Figure S1A). We also confirmed that blood DNA samples from 16 healthy Japanese individuals showed a similar distribution of SD values for all probes of autosomal genes, indicating minimal variability.
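The comparison of probe read counts against a reference, and the per-probe SD used as a variability check, can be sketched as follows. This is only an illustration and not the MRC Holland in-house software: the total-read normalization and the function names `cn_ratios` and `per_probe_sd` are assumptions introduced here, and the read counts in the usage note are hypothetical.

```python
from statistics import stdev


def cn_ratios(sample_reads, reference_reads):
    """Return one CN ratio per probe for a sample versus a reference.

    Assumption (not from the paper): each probe's reads are first divided
    by the total reads of its own library, so that sequencing depth cancels.
    """
    if len(sample_reads) != len(reference_reads):
        raise ValueError("sample and reference must cover the same probes")
    if any(r == 0 for r in reference_reads):
        raise ValueError("reference probe with zero reads")
    sample_total = sum(sample_reads)
    reference_total = sum(reference_reads)
    return [
        (s / sample_total) / (r / reference_total)
        for s, r in zip(sample_reads, reference_reads)
    ]


def per_probe_sd(ratio_table):
    """Sample SD of the CN ratio of each probe across several samples.

    ratio_table is a list of per-sample ratio lists (same probe order).
    """
    return [stdev(column) for column in zip(*ratio_table)]
```

For example, `cn_ratios([200, 400, 600], [100, 200, 300])` gives ratios close to 1 for every probe, because the sample differs from the reference only in sequencing depth. Averaging the output of `per_probe_sd` over all autosomal probes would give a summary comparable in spirit to the average SD reported above.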
CN ratios from 2 experiment batches were averaged to determine the CNA status of each locus. A region containing at least 3 consecutive probes on the same chromosome with an average CN ratio outside the range 0.9 to 1.1 was regarded as a possible gene loss or gain, or chromosomal arm loss or gain. When ratios hovered around the cut-off values, the region was judged to be a CNA if >60% of the probes in the chromosomal arm had CN ratios outside 0.9 to 1.1. We applied this particular cut-off range in the present experiments because ccRCCs generally show CNAs of large chromosomal regions. However, it is not intended as a general cut-off for other solid tumor tissues.
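The calling rules in this paragraph can be expressed as a minimal sketch. The 0.9 to 1.1 range, the 3-probe window, and the 60% threshold come from the text; the function names, the loss/gain labelling by mean ratio, and all probe values in the usage note are hypothetical and are not taken from the analysis software used in the study.

```python
from statistics import mean

LOW, HIGH = 0.9, 1.1  # cut-off range stated in the text


def is_outside(ratio):
    """True if a CN ratio lies outside the 0.9 to 1.1 range."""
    return ratio < LOW or ratio > HIGH


def candidate_regions(ratios, window=3):
    """Start indices of runs of `window` consecutive probes whose
    average CN ratio is outside the cut-off range."""
    return [
        i
        for i in range(len(ratios) - window + 1)
        if is_outside(mean(ratios[i:i + window]))
    ]


def call_arm(ratios, min_fraction=0.6):
    """Classify one chromosomal arm as 'loss', 'gain', or 'no CNA'.

    The arm is called altered only if more than `min_fraction` of its
    probes are outside the range; the direction is taken from the mean
    of those probes (an assumption made for this sketch).
    """
    outside = [r for r in ratios if is_outside(r)]
    if len(outside) <= min_fraction * len(ratios):
        return "no CNA"
    return "loss" if mean(outside) < 1.0 else "gain"
```

With hypothetical probe ratios, `call_arm([0.55, 0.60, 0.58, 0.62, 1.00])` returns `"loss"` because 4 of 5 probes (80%) fall below 0.9, whereas `call_arm([1.00, 1.02, 0.98, 0.85, 1.05])` returns `"no CNA"` because only 1 of 5 probes is outside the range.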
Titration experiments were conducted using the mesothelioma cell lines HMMME and H2452, which showed segmental CNAs as detected using single nucleotide polymorphism arrays and target NGS in our previous experiments. DNA from HMMME or H2452 was mixed into DNA derived from normal human neonatal male epidermal keratinocytes (HEKn) (Cascade Biologics, Carlsbad, CA) (n = 3). Tumor cell line DNA and HEKn DNA were mixed in different proportions to produce 7 samples with the following tumor DNA contents: 0%, 5%, 10%, 20%, 30%, 50%, and 100%. For each digitalMLPA reaction, a total of 40 ng of DNA was used to identify the tumor cell DNA proportion at which monoallelic loss in the spiked cancer cell line DNA could be detected reproducibly.
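The arithmetic behind such a dilution series can be sketched under an idealized model. The assumptions here, which are not stated in the paper, are that the normal DNA carries 2 copies of the locus, that every tumor cell carries exactly 1 copy (a clonal monoallelic loss), and that the measured ratio is free of noise; the function name `expected_ratio` is hypothetical.

```python
def expected_ratio(tumor_fraction, tumor_copies=1, normal_copies=2):
    """Expected CN ratio of a tumor/normal DNA mixture at one locus.

    tumor_fraction is the proportion of tumor DNA in the mixture (0 to 1).
    The result is the average copy number of the mixture divided by the
    copy number of the normal reference.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    mixed_copies = (
        tumor_fraction * tumor_copies
        + (1.0 - tumor_fraction) * normal_copies
    )
    return mixed_copies / normal_copies


if __name__ == "__main__":
    # Tumor DNA contents used in the titration series described above.
    for fraction in (0.0, 0.05, 0.10, 0.20, 0.30, 0.50, 1.00):
        print(f"{fraction:.0%} tumor DNA: ratio {expected_ratio(fraction):.3f}")
```

Under these assumptions the expected ratio for a monoallelic loss is 1 − f/2 for tumor fraction f, so a mixture with 30% tumor DNA would be expected near 0.85, while 20% sits at the 0.9 boundary. Real measurements will deviate from this because of probe noise and because cell lines may not match the assumed copy numbers.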

| Real-time RT-PCR
Total RNA was isolated from ccRCC tissues using an AllPrep DNA/RNA Mini kit following the manufacturer's protocol. However, the RNA isolated from 15 of the 60 ccRCCs was degraded. Subsequently, for the remaining 45 cases (Table S2)

| Target NGS
Libraries were prepared from 100 ng of genomic DNA from tumors,

| Statistical analyses
Gene expression ratios were compared between ccRCCs with 14q loss and those without 14q loss using the chi-squared test. Multiple correspondence analysis was used to evaluate the associations among the variables and to identify the risk factors associated with metastasis or poor survival.
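For readers who want to reproduce this kind of comparison, a chi-squared test of independence on a 2 × 2 table can be sketched in a few lines. This is a generic Pearson chi-squared test without continuity correction, written with the standard library only; it is not the authors' analysis script, and the function names are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2 x 2 table [[a, b], [c, d]].

    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


def p_value_1df(chi2):
    """Upper-tail P value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

A statistic of 3.841 corresponds to P of approximately .05 with 1 degree of freedom, so `p_value_1df(chi2_2x2(a, b, c, d)) < 0.05` is equivalent to the statistic exceeding that critical value.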

| Sensitivity of digitalMLPA to detect CN loss
To estimate the sensitivity and reproducibility of CNA detection with digitalMLPA, titration DNA experiments were conducted.
Tumor cell line DNA from HMMME or H2452, both with losses in
We confirmed the precision of the digitalMLPA assay by comparing CNA data at BAP1 with those obtained by conventional MLPA (P417-B1 BAP1 probemix), which we previously reported for 34 ccRCCs. 14 The results of both technologies were essentially identical: 31 of 34 ccRCCs showed loss of 1 allele, and the remaining 3 had no CNA (data not shown).
In ccRCCs from male patients, loss of chromosome Y was often detected, whereas gain of chromosome X was frequently detected in ccRCCs with high complexity showing frequent CNAs: cases R06, R09, R38, R44, and R56 (Table 3).

| Frequent 14q loss and the genes with downregulated expression
digitalMLPA results detected 14q losses at a higher frequency (55%

| Sequence-level mutations and chromosomal complexity
The identification of genomic risk factors for metastasis should be as simple as possible for clinical use. We analyzed the sequences of only

| Multiple correspondence analysis
We carried out multiple correspondence analysis to assess the relationship between genome alterations, prognosis, and metastasis status (Figure 5). Early metastasis (<1 y) was associated with 9p loss and 4q loss (both P < .05). This analysis also indicated an association of poor survival with 4p loss and 1p loss (both P < .05).

reported that the TP53 mutation had clinical significance for prognosis, and deletions on 17p13, carrying TP53, were added as a risk factor because this region is associated with an aggressive tumor phenotype. 24,25 Loss of chromosome 4 is not as well known as losses of chromosome 9 and 14q, but has been reported to be positively correlated with high tumor grade and candidate tumor suppressor genes. [26][27][28]
In our ccRCCs, 3p loss was detected in 56/60 (93.3%) and 5q gain in 31/60 (51.7%). These alteration rates were similar to those in previous reports, 22,23 showing that CN analysis using digitalMLPA worked well. Losses of chromosome 9p (18.3%), 14q (55.0%), and 17p13 (5%), which are known to be associated with poor prognosis, were detected. To detect genome alterations with clinical significance, we carried out multiple correspondence analysis (Figure 5). In the analysis, chromosome losses of 1p, 4p, 4q, and 9p were associated with poor survival or early metastasis (P < .05). In 3 of 13 patients having metastasis at diagnosis, biallelic deletion of the CDKN2A/2B genes was detected.
We detected loss of chromosome 14q at a higher frequency (55%) than previously described (~40%), whereas the frequencies of CNAs on other chromosomes and of sequence-level mutations were equivalent to those previously described. 23 Therefore, to validate the significance of the high frequency of 14q loss from a different angle, we conducted gene expression experiments for 3 representative genes located on 14q: HIF1A (14q23.2), L2HGDH (14q21.3), and KLHL33 (14q11.2). A significant positive correlation between 14q loss determined using digitalMLPA and downregulation of mRNA expression ratios was detected for HIF1A and L2HGDH, but not for KLHL33 (Figure 4). Therefore, we showed through gene expression analysis that digitalMLPA can detect fine CNAs and that HIF1A and L2HGDH were genes downregulated by 14q deletion.

FIGURE 3 Association between metastasis status, ie, metastasis-free, late metastasis (≥1 y), and early metastasis (<1 y), and the number of chromosome alterations (*P = .0289). R44 had the largest number of alterations, showing an extreme deviation from the mean. One year after tumorectomy of ccRCC, this patient developed primary lung cancer and mediastinal lymph node metastasis, which showed a complete response to anti-PD-L1 antibody therapy

FIGURE 4 Association between 14q deletion and downregulated gene expression. The gene expression ratios of (A) HIF1A and (B) L2HGDH were significantly associated with 14q loss (HIF1A: *P = .0253; L2HGDH: **P = .0117; chi-squared test). In contrast, (C) the gene expression ratios of KLHL33 did not show a significant association with 14q loss (P = .0536). The numbers of ccRCCs analyzed with and without this loss were 22 and 23, respectively. Gene expression ratios were normalized to GAPDH
Chromosomal complexity increases as chromosomal alterations accumulate over several decades before diagnosis, and is associated with a poor prognosis. 22 The number of chromosome alterations in ccRCC tumors from patients who developed metastasis several months to 3.5 y after tumorectomy tended to be higher than 6 (in 6/9 patients), whereas the number in tumors from the 3 patients whose metastasis occurred between 5 and 9 y was lower than 3. Frequent losses of the Y chromosome have been described in normal blood and bone marrow of elderly men, 29,30 and in many types of tumors, including ccRCC. 31 In our ccRCCs, loss of chromosome Y was detected independently of chromosomal complexity, whereas gain of chromosome X was frequently detected in ccRCCs with high complexity.
ccRCC has a low rate of base-substitution mutations, 12 but mutations in several genes are risk factors for metastasis. Although tumors, higher Fuhrman grade, higher stage, and a higher incidence of metastases. 33,34 BAP1 mutations were mutually exclusive with PBRM1 mutations, and were significantly associated with shorter overall survival time. 35 SETD2 mutations were also associated with a high relapse rate. 8 In our samples, BAP1 mutation carriers tended to have more frequent CNAs (median 5 arms). There were 2 cases with mutations in both BAP1 and PBRM1, and 2 cases with mutations in both BAP1 and SETD2. All 4 of these cases belonged to the group of 13 patients who had metastasis at diagnosis. Our multiple correspondence analysis indicated that chromosome losses of 1p, 4p, 4q, and 9p had more important roles than sequence-level mutations in judging malignant tumor behavior.
Overall survival of patients with metastasis has improved significantly following the administration of immune checkpoint inhibitors and antiangiogenic targeted therapies, including anti-vascular endothelial growth factor agents and multi-target tyrosine kinase inhibitors. 36 We hope that identification of novel characteristic CNAs and mutations using digitalMLPA and target NGS will lead to an understanding of the molecular mechanisms of ccRCCs and to improvement in patient care.

FIGURE 5 Multiple correspondence analysis among the metastatic status (1: metastasis-free; 2: late metastasis, ≥1 y; 3: early metastasis, <1 y after tumorectomy) and each characteristic (1: CN alteration or nucleotide-level mutation detected; 0: not detected). The chi-squared test for independence on 2 categorical variables was conducted, and the χ² values associated with early metastasis or with poor survival are shown in the figure. The critical value at the .05 level (P = .05) with 1 degree of freedom is 3.841