Comprehensive characterization of Alu‐mediated breakpoints in germline VHL gene deletions and rearrangements in patients from 71 VHL families

Abstract Von Hippel‐Lindau (VHL) is a hereditary multisystem disorder caused by germline alterations in the VHL gene. VHL patients are at risk for benign as well as malignant lesions in multiple organs including kidney, adrenal, pancreas, the central nervous system, retina, endolymphatic sac of the ear, epididymis, and broad ligament. An estimated 30%–35% of all families with VHL inherit a germline deletion of one, two, or all three exons. In this study, we have extensively characterized germline deletions identified in patients from 71 VHL families managed at the National Cancer Institute, including 59 partial (PD) and 12 complete VHL deletions (CD). Deletions ranged in size from 1.09 to 355 kb. Fifty‐eight deletions (55 PD and 3 CD) have been mapped to the exact breakpoints. Ninety‐five percent (55 of 58) of mapped deletions involve Alu repeats at both breakpoints. Several novel classes of deletions were identified in this cohort, including two cases that have complex rearrangements involving both deletion and inversion, two cases with inserted extra Alu‐like sequences, six cases that involve breakpoints in Alu repeats situated in opposite orientations, and a “hotspot” PD of Exon 3 observed in 12 families that involves the same pair of Alu repeats.

Germline deletions of 380, 200, and 100 kb in VHL patients were first identified by pulsed-field gel electrophoresis and used to localize the genomic region in which the VHL gene was subsequently identified.
While patients with germline partial deletions (PD), in which one or two exons of VHL are deleted, have been reported to be at risk for the development of an aggressive form of renal cell carcinoma (RCC) and mild incidence of pheochromocytoma, patients with complete deletions (CD) are more likely to exhibit mild kidney disease and virtually no incidence of pheochromocytoma (Chen et al., 1995; Franke et al., 2009; Maranchie et al., 2004). Several previous reports have described the sizes, locations, and nature of VHL germline deletions (Franke et al., 2009; Maranchie et al., 2004). In 2004, we characterized the deletions of 55 VHL families by fluorescence in situ hybridization (FISH) (Maranchie et al., 2004), and Franke et al. (2009) characterized deletions of 54 families, including 33 that were mapped to the exact nucleotide. Maranchie et al. (2004) were the first to observe that the presence or absence of the adjacent upstream gene, BRK1 (previously also known as C3orf10 and HSPC300), influences the severity of the RCC phenotype. BRK1 is a subunit of the suppressor of cyclic adenosine monophosphate receptor/Wiskott-Aldrich syndrome protein family verprolin-homologous protein actin nucleating complex and is involved in actin and microtubule organization. Depletion of BRK1 by small interfering RNA results in cytoskeleton abnormalities and cytokinesis arrest in cell lines, including clear cell RCC lines (Cascón et al., 2007). In the cohort described by Maranchie et al., the frequency of RCC was 52.3% in VHL deletion families that retained BRK1 versus 18.9% in those that lost BRK1. In a cohort of 18 VHL deletion probands reported by Cascón et al. (2007), 10 patients who presented with RCC inherited deletions that retained BRK1, whereas six of eight who did not develop RCC carried deletions that included the BRK1 gene. In addition, Franke et al. (2009)

| Patients
Patients were seen at the Urologic Oncology Branch (UOB) of the National Cancer Institute (NCI), National Institutes of Health (NIH) for clinical assessment on institutional review board-approved protocols and provided written informed consent.

| Array-based comparative genomic hybridization (CGH)
An Agilent custom high-definition CGH array (Agilent) had been previously designed to assess copy number aberrations in several selected kidney cancer-associated genes (Benhammou et al., 2011; Vocke et al., 2017). Included within this array were 21 probes selected from the Agilent HD-CGH database from within the 10.2 kb genomic region containing VHL that were computationally preselected to provide an average probe density of ∼2 probes per kb.
Within the 50 kb flanking regions 5′ and 3′ to VHL, a fade-out design achieved an average density of ∼1 probe per kb diminishing to an average of ∼1 probe per 40 kb over the entire genome. The custom-designed arrays were printed on an Agilent 4x44K customer array and processed according to the manufacturer's protocol. Representative patients from 67 VHL germline deletion families were analyzed in this manner; 0.5 μg of patient genomic DNA and 0.5 μg of normal human reference DNA (Promega) were fragmented by AluI/RsaI digestion, labeled with Cy3/Cy5 fluorescent dyes, and hybridized at 65°C for 24 h. Following hybridization and washing, the arrays were scanned using an Agilent Microarray Scanner. Data were extracted with Agilent Feature Extraction Software (v10.7.1.1) and analyzed with Agilent DNA Analytics 4.0 software (v4.0.85). Deletion sizes were calculated as the distance between the first and last probes that lost ~50% of their signal in comparison with the normal signal.
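The probe-based size estimate described above can be illustrated with a minimal sketch. It is not the Agilent DNA Analytics procedure; the function name, the signal-ratio cutoff of 0.65 used to approximate "~50% loss," and the probe positions are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Each probe is (genomic_position_bp, patient_to_reference_signal_ratio).
Probe = Tuple[int, float]


def minimal_deletion_extent(
    probes: List[Probe], max_ratio: float = 0.65
) -> Optional[Tuple[int, int, int]]:
    """Return (first_lost_probe, last_lost_probe, size_bp), or None if no loss.

    A probe counts as lost when its signal ratio is at or below max_ratio.
    The 0.65 cutoff is an assumed stand-in for "~50% signal loss".
    This simple version does not check that lost probes are contiguous.
    """
    lost_positions = sorted(pos for pos, ratio in probes if ratio <= max_ratio)
    if not lost_positions:
        return None
    start, end = lost_positions[0], lost_positions[-1]
    return start, end, end - start


# Hypothetical probe data; positions and ratios are invented.
example_probes = [
    (1000, 1.02),
    (2000, 0.98),
    (3000, 0.52),
    (4000, 0.47),
    (5000, 0.55),
    (6000, 1.01),
]

print(minimal_deletion_extent(example_probes))
```

Because the extent is bounded by the outermost probes showing loss, the value is a minimal deletion size; the true breakpoints may lie anywhere between a lost probe and its nearest retained neighbor.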

| VHL deletion/duplication analysis
All patients from the families with germline deletions identified by the Agilent custom high-definition CGH array were confirmed by Clinical Laboratory Improvement Amendments (CLIA)-approved VHL deletion/duplication analysis provided by either GeneDx, the Children's Hospital of Philadelphia, or Invitae. Four additional patients were directly evaluated using CLIA-approved VHL deletion/duplication analysis provided by the same companies.

| RESULTS
3.1 | Germline VHL gene deletion mapping in 71 families
A custom CGH array was used to assess germline copy numbers in representative individuals from 67 unrelated VHL families, who possess germline PD or CD (Benhammou et al., 2011; Vocke et al., 2017). Thirteen of the deletions were mapped by CGH array only; the sizes and ranges are shown in Figure 1 and Table S1. The minimal deletion coordinates are based on the GRCh37/hg19 genome assembly and indicate the first and last CGH probes found to have copy loss for each deletion, and the minimal deletion size is calculated accordingly. Nine of these deletions are CDs and the other four exhibit loss of one or two exons of VHL. All 13 of these deletions feature deletion or PD of one or more additional genes upstream or downstream of VHL.
Fifty-four deletions initially defined by the CGH array were then successfully mapped to the exact nucleotide. Combinations of PCR primers situated in the potentially retained chromosomal regions were used to generate novel amplicons that spanned the deletion, and these amplicons were sequenced to identify the deletion breakpoints. An additional four deletions were mapped in a similar manner based on germline results received from a CLIA-approved VHL deletion/duplication genotyping service, as opposed to the CGH array. The chromosomal coordinates of the breakpoints were assigned based on the first divergent nucleotide that was observed on the strand that was sequenced; due to the polyA tails on many Alu repeats, the breakpoints often could only be sequenced in one direction. These 58 precisely mapped deletions are shown in Figure 2 and Table S2.

FIGURE 1 Sizes and ranges of VHL germline deletions mapped by CGH. Coordinates are based on the GRCh37/hg19 genome. The extents of deletions were defined by the first and last probe with ~50% loss in the patient compared with normal control. Deletions are grouped based on which exon(s) are deleted. Additional genes that are included in one or more deletions are shown. VHL is indicated in red, and the three exons in relation to each deletion are shown by the dotted lines. CGH, comparative genomic hybridization; VHL, Von Hippel-Lindau.

there are about one million copies in the human genome. They are about 300 bp in length and are thus categorized as short interspersed nuclear elements (SINEs) (Hwu et al., 1986). Among the   homology (Shen et al., 1991), these two particular Alu repeats are extremely homologous, having 92.7% identity including 96% identity in the first 175 bp, suggesting that recombination involving this pair may be particularly favorable.
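Identity figures such as those quoted above, overall and within an initial window, can be computed from a pairwise alignment. The following is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length with "-" marking gaps, gap columns count as mismatches (identity definitions vary), and the toy sequences are invented and are not the actual Alu repeats from this study.

```python
from typing import Optional


def percent_identity(
    aligned_a: str, aligned_b: str, first_n: Optional[int] = None
) -> float:
    """Percent of aligned columns with identical bases.

    Assumes pre-aligned sequences of equal length with '-' as the gap symbol.
    Gap columns are counted as mismatches (an assumption of this sketch).
    If first_n is given, only the first first_n alignment columns are used.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = list(zip(aligned_a.upper(), aligned_b.upper()))
    if first_n is not None:
        columns = columns[:first_n]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Invented toy alignment (20 columns, one substitution, one gap).
toy_a = "GGCCGGGCGCGGTGGCTCAC"
toy_b = "GGCCGGGCGTGG-GGCTCAC"

print(percent_identity(toy_a, toy_b))             # whole alignment
print(percent_identity(toy_a, toy_b, first_n=8))  # initial window only
```

For real Alu pairs the alignment itself would come from a dedicated alignment tool; this sketch only covers the counting step.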
To the best of our knowledge, these 12 families are all unrelated.
Although it is possible that a founder effect may be responsible for some of the 12 deletions, in at least some cases these deletions appear to be distinct events. Although all 12 deletions involve the same two Alu repeats, the actual breakpoints within the Alu repeats vary as can be shown by the slight sequence differences in the reconstituted breakpoint region (Figure 5). Seven families (UOB-1147, 1699, 3019, 3465, 3488, 3355, and 3550) (Table S2). The remaining deletions all involve unique combinations of Alu or LINE repeat pairs or were not associated with repeat elements.

| DISCUSSION
Germline deletions of VHL represent a significant proportion of all genetic alterations associated with this disorder, and the mapping of these deletions provides important data concerning the spectrum of presentation of VHL and for genetic screening.
The predilection for deletions within this region is inherently linked to the significantly increased presence of Alu repeats within the region of the VHL gene. When the first VHL deletion was mapped by Casarin et al. (2006), they demonstrated that both breakpoints occurred within Alu repeats. They observed that Alu repeats occur approximately once in every 1 kb in the VHL region, representing 20%-25% of the VHL gene sequence, with the frequency as high as one in every 500-600 bp in some parts of the gene (Casarin et al., 2006). This is much higher than the average Alu repeat frequency of one in every 4 kb seen across the genome (Hwu et al., 1986). Alu-mediated recombination has been implicated in many human diseases, and it was theorized that Alu-mediated recombination could be a common mechanism for VHL gene deletion (Casarin et al., 2006; Deininger & Batzer, 1999). This theory was supported by several additional studies of germline VHL deletion (Cascón et al., 2007; Franke et al., 2009; Maranchie et al., 2004). This study confirms that the vast majority of germline deletions are directly associated with the Alu repeats within the gene region and identifies a previously unreported hotspot involving two Alu repeats with very high sequence homology.
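The density comparison above (about one Alu per 1 kb in the VHL region versus about one per 4 kb genome-wide) can be reproduced from any list of annotated repeat intervals. The sketch below is illustrative only: the function name is hypothetical, and the interval coordinates are invented to give round numbers rather than taken from a real annotation.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in bp, region-relative


def repeat_density(region_len_bp: int, repeats: List[Interval]) -> Tuple[float, float]:
    """Return (repeats per kb, fraction of the region covered by repeats)."""
    if region_len_bp <= 0:
        raise ValueError("region length must be positive")
    covered = 0
    last_end = 0
    for start, end in sorted(repeats):
        start = max(start, last_end)  # clip so overlapping bases count once
        if end > start:
            covered += end - start
            last_end = end
    per_kb = len(repeats) / (region_len_bp / 1000)
    return per_kb, covered / region_len_bp


# Invented example: a 10 kb region with ten 250 bp repeats, one per kb.
toy_repeats = [(i * 1000, i * 1000 + 250) for i in range(10)]
per_kb, covered_fraction = repeat_density(10_000, toy_repeats)

GENOME_WIDE_PER_KB = 1 / 4  # one Alu per 4 kb, the genome-wide figure cited in the text
print(per_kb, covered_fraction, per_kb / GENOME_WIDE_PER_KB)
```

In practice the intervals would come from a repeat annotation for the chosen genome assembly, and the result depends on which element families are counted.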
We have previously reported the presence of germline deletions in two other familial forms of kidney cancer, Birt-Hogg-Dubé (BHD), caused by germline mutations in the folliculin (FLCN) gene, and hereditary leiomyomatosis and renal cell carcinoma (HLRCC), resulting from germline fumarate hydratase (FH) gene mutations (Benhammou et al., 2011; Vocke et al., 2017). Among FLCN deletions in four BHD families that were mapped by sequencing, two involved Alu repeats in both breakpoints, one had a deletion/inversion event involving Alu repeats in two of the four breakpoints, and one had no Alu involvement. A germline duplication in FLCN likewise did not have Alu involvement (Benhammou et al., 2011). Among three HLRCC families with FH deletions that were mapped, none had Alu involvement (Vocke et al., 2017). In contrast, Alu involvement in VHL deletion breakpoints is nearly universal, likely due to the unusually high density of Alu repeats in the VHL genomic region. This may explain why such a high proportion (30%-35%) of VHL patients possess a germline deletion, in contrast to the minority of patients with germline gene deletions in other hereditary kidney cancer syndromes (Schmidt et al., 2005; Toro et al., 2003; Wei et al., 2006). Franke et al. (2009) conducted an analysis of 54 VHL germline deletions, 33 of which were precisely mapped by sequencing. They observed deletions ranging in size from 568 bp to 250 kb, and among the 33 sequenced deletions, 90% involved Alu repeats. They found that the single AluYa5 repeat in the region, which they report as evolutionarily the youngest, was involved in 7 of 33 deletions. In our cohort, this AluYa5 is involved in an even higher proportion (26/58, 45%) of breakpoints.
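The proportions compared above follow directly from the reported counts. As a minimal arithmetic sketch, using only counts stated in this article (the labels and helper name are illustrative):

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts as reported in the text; no additional data are assumed.
reported_counts = {
    "AluYa5 involvement, Franke et al. (2009)": (7, 33),
    "AluYa5 involvement, this cohort": (26, 58),
    "Alu repeats at both breakpoints, this cohort": (55, 58),
}

for label, (count, total) in reported_counts.items():
    print(f"{label}: {count}/{total} = {percent(count, total):.1f}%")
```

This only restates the published counts as percentages; it makes no claim about whether the difference between cohorts is statistically significant.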
Although our results share many similarities with the above study, we observe several novel features. This study identified the first VHL deletion hotspot, which was observed in 12 families involving an AluYa5 and AluY pair. Notably, this is the same AluYa5 element that was involved in 45% of breakpoints in our cohort. This report also provides the first evidence of ~260 bp insertions of Alu-like sequences into deletion breakpoints and identifies two novel VHL deletion/inversion events, a type of genetic alteration that had not been previously described in VHL.
We had previously observed a deletion/inversion event in FLCN in a BHD family (Benhammou et al., 2011). The inversions that we observed in both cases were discovered fortuitously; in sequencing PCR products that would represent an expected deletion, additional breakpoints representing additional inversion events were detected. Due to the high frequency of Alu repeats in the vicinity of the VHL gene and the resultant potential for genomic instability, it is possible that deletion/inversion events are underappreciated and that more such events would be detected by other methods such as sequencing long-range PCR products or by whole genome sequencing. Furthermore, the possibility exists that an inversion or other complex rearrangement in the absence of a deletion could take place. Such an event would be difficult to detect by conventional CLIA genotyping, which only looks for point mutations or copy number variations.
Therefore, for a patient who exhibits clinical manifestations of VHL with no detectable germline alteration, whole-genome sequencing should be considered to investigate the possibility of an inversion.
In summary, this report describes the largest known cohort to date of VHL deletions with extensive characterization. The diverse spectrum of sizes, breakpoints, and Alu pair involvement in our study and the studies of others demonstrates the broad range of independent recombination events involving combinations of different Alu repeats and other sequences that may contribute to generating germline VHL deletions leading to the multisystem phenotype of VHL syndrome.