Frequency and spectrum of actionable pathogenic secondary findings in Taiwanese exomes

Abstract

Background: Exome sequencing has recently become more readily available, and more information about incidental findings has been disclosed. However, data from East Asia are scarce. We studied the application of exome sequencing to the identification of pathogenic/likely pathogenic variants in the ACMG 59 gene list and the frequency of these variants in the Taiwanese population.

Methods: This study screened 161 Taiwanese exomes for variants from the ACMG 59 gene list. The identified variants were reviewed based on information from different databases and the available literature and classified according to the ACMG standard guidelines.

Results: We identified seven pathogenic/likely pathogenic variants in eight individuals, with five participants with autosomal recessive variants in one allele and three participants with autosomal dominant variants. Approximately 1.86% (3/161) of the Taiwanese individuals had a reportable pathogenic/likely pathogenic variant as determined by whole-exome sequencing (WES), which was comparable to the proportions published previously in other countries. We further investigated the high carrier rate of rare variants in the ATP7B gene, which might indicate a founder effect in our population.

Conclusion: This study was the first to provide Taiwanese population data of incidental findings and emphasized a high carrier rate of candidate pathogenic/likely pathogenic variants in the ATP7B gene.

The American College of Medical Genetics (ACMG) recommended that a list of 56 genes should be reported by a laboratory to the ordering clinician, regardless of the indication for ordering sequencing, and that for most genes, only variants reported previously or predicted to be pathogenic should be reported (Green et al., 2013). The minimal 56-gene list was revised to contain 59 genes in 2016 (Kalia et al., 2017). Because identifying pathogenic variants is itself challenging, the ACMG, the Association for Molecular Pathology (AMP) and the College of American Pathologists (CAP) jointly recommended the classification of variants into five categories based on set criteria: "pathogenic," "likely pathogenic," "uncertain significance," "likely benign," and "benign." These criteria included typical types of evidence, such as population data, computational data, functional data, and segregation data (Richards et al., 2015). Recently, to improve the ACMG interpretation framework, the removal of criteria PP5 and BP6 (Biesecker & Harrison, 2018) and the development of a four-step framework for criteria PS3 and BS3 (Brnich et al., 2020) were suggested.
Studies have proposed that the prevalence of pathogenic variants in actionable genes varies among different ethnic backgrounds. One study evaluated actionable pathogenic single-nucleotide variants in 500 European- and 500 African-descent participants and found frequencies of 3.4% and 1.2%, respectively (Dorschner et al., 2013). Another population-based study including 196 Korean exomes revealed 11 pathogenic or likely pathogenic variants in 13 individuals (Jang, Lee, Kim, & Ki, 2015), while another Korean study involving 1303 exomes revealed 13 pathogenic and 13 likely pathogenic variants on the ACMG 59 gene list with a carrier rate of 2.46% (Kwak et al., 2017). A Japanese study of 2049 individuals undergoing whole-genome sequencing identified 143 previously reported pathogenic variants in the 57 autosomal ACMG-recommended genes, with 21% of the individuals carrying at least one reported pathogenic allele according to public databases of pathogenic variation (Yamaguchi-Kabata et al., 2018). Subsequently, data from 1005 whole exomes and genomes in Qatar disclosed a frequency of 0.59% for actionable pathogenic or likely pathogenic variants in the population, which was lower than the frequencies previously reported in European and African populations (Jain, Gandhi, Koshy, & Scaria, 2018).
Although increasing attention is being paid to reporting pathogenic variants in actionable genes, no published study has evaluated the prevalence of these variants in the Taiwanese population. Hence, this study analyzed WES data from 161 Taiwanese individuals and discussed the reporting of pathogenic/likely pathogenic variants on the ACMG 59 gene list in the Taiwanese population.

| Ethical compliance
This study was approved by the institutional review board of National Taiwan University Hospital (IRB NTUH 201703073RINB and 201505135RINA).

| Patient enrollment
From June 2017 to June 2019, 166 unrelated patients suspected of having genetic disease who underwent exome sequencing were retrospectively analyzed. Of them, 80 underwent rapid trio-exome sequencing (TruSeq Exome Kit), and 86 underwent single-exome sequencing (Agilent V6) by a third-party company; the results were analyzed by the Biomedical Genetic Laboratory of National Taiwan University Hospital. Five patients were excluded because the variants involved in their suspected diseases were in genes on the ACMG 59 gene list. Therefore, a total of 161 cases without parental data were further analyzed for incidental findings.
Three milliliters of whole blood was collected in an EDTA tube for DNA extraction using a Puregene DNA extraction system (Qiagen) after obtaining informed written consent. According to the informed consent, secondary findings were not disclosed to the participants.

| Exome sequencing
Exome capture was performed using either the TruSeq Exome Capture Kit (Illumina) or the SureSelect Human All Exon V6 Kit (Agilent). Sequencing was performed on a NextSeq 500 or HiSeq 4000 instrument (both Illumina) with 75-bp paired-end runs. A mean raw coverage of over 100-fold was obtained for each sample. Sequence alignment to the human reference genome (GRCh37) was performed using the Burrows-Wheeler Aligner (BWA), and variant calling was performed using the Genome Analysis Toolkit (GATK v3.5, Broad Institute) (Wang, Li, & Hakonarson, 2010).

| Criteria for actionable gene list variants
As shown in the algorithm in Figure 1, incidental findings listed as disease-causing in the HGMD or pathogenic/likely pathogenic in ClinVar were included in the reported group. The remaining variants were included in the candidate group if they met all of the following criteria: (1) a maximal minor allele frequency <0.05; (2) a variant calling quality >300; and (3) a location at an exon or splice site.
Afterward, the variants in both groups were reviewed according to all available lines of evidence, such as the primary literature cited in the HGMD and in PubMed. The ACMG standard guidelines (Richards et al., 2015) were applied to classify the variants into three categories: pathogenic/likely pathogenic, uncertain significance, and benign/likely benign.
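The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the pipeline used in the study: the `Variant` record, its field names, the region labels, and the example variants are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    hgvs: str
    max_maf: float        # maximal minor allele frequency across databases
    qual: float           # variant calling quality
    region: str           # e.g. "exonic", "splicing", "intronic"
    hgmd_class: str = ""  # "DM" if HGMD lists the variant as disease-causing
    clinvar: str = ""     # ClinVar clinical significance, if any


# Assumed label spellings; real annotation files may differ.
REPORTED_CLINVAR = {"pathogenic", "likely pathogenic",
                    "pathogenic/likely pathogenic"}
EXON_OR_SPLICE = {"exonic", "splicing"}


def assign_group(v: Variant) -> Optional[str]:
    """Return "reported", "candidate", or None (excluded from review)."""
    # Reported group: disease-causing in HGMD or P/LP in ClinVar.
    if v.hgmd_class == "DM" or v.clinvar.lower() in REPORTED_CLINVAR:
        return "reported"
    # Candidate group: rare, well-called, exonic or splice-site variants.
    if v.max_maf < 0.05 and v.qual > 300 and v.region in EXON_OR_SPLICE:
        return "candidate"
    return None


# Hypothetical examples, not variants from the study.
examples = [
    Variant("GENE_A", "c.100C>T", 0.001, 950.0, "exonic", clinvar="Pathogenic"),
    Variant("GENE_B", "c.200G>A", 0.010, 450.0, "splicing"),
    Variant("GENE_C", "c.300A>G", 0.200, 800.0, "exonic"),
]
for v in examples:
    print(v.gene, v.hgvs, assign_group(v))
```

Variants placed in either group would still go through the manual literature review and ACMG classification described in the text; the sketch covers only the initial triage.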

| Prevalence of variants from actionable gene list in the Taiwanese population
The data set analyzed in this research was composed of 161 whole exomes. As shown in Table 1 and Figure 1, a total of 3122 distinct variants of the 59 ACMG reportable genes were identified in the 161 exomes. Of these, 76 potentially pathogenic variants were classified as disease-causing in the HGMD or pathogenic/likely pathogenic in ClinVar (reported group). In addition, 275 variants were identified based on the above criteria (candidate group).
In total, all 351 variants were reviewed according to all available lines of evidence, such as the primary literature cited in the HGMD or PubMed and the ACMG standard guidelines. The analysis revealed seven potentially pathogenic variants with four in the reported group and three in the candidate group, and ATP7B c.2804C>T was identified in two individuals. Detailed information on these seven potentially pathogenic variants is listed in Table 2.
In total, eight participants (4.97% of the participants analyzed) were identified as having pathogenic or likely pathogenic variants listed as incidental findings in the ACMG 59 gene list (Table 3). Of these participants, five (3.11% of the participants analyzed) had autosomal recessive variants in one allele, and three (1.86% of the participants analyzed) had autosomal dominant variants. Only pathogenic (known or expected) variants for dominant disease or biallelic variants for recessive disease met the ACMG criteria for reporting. Further investigation revealed that all the autosomal recessive variants were in the ATP7B gene, which highlighted the high carrier rate of Wilson disease in our population.

FIGURE 1 Algorithm of the selection of pathogenic or likely pathogenic variants. The exomes were first screened for variants from the ACMG 59 gene list. Then, the identified variants were included into the reported group or candidate group if they met the criteria. After application of all available lines of evidence and the ACMG standard guidelines, the identified variants were classified into three categories: pathogenic/likely pathogenic, uncertain significance, and benign/likely benign. Flowchart steps: (1) ACMG 59 incidental finding panel; (2) reported group (76 variants; listed as DM in HGMD or pathogenic/likely pathogenic in ClinVar) and candidate group (275 variants; maximal minor allele frequency <0.05, variant calling quality >300, located at an exon/splice site, and not previously reported as DM in HGMD or pathogenic/likely pathogenic in ClinVar); (3) manual literature review (primary literature cited in HGMD or in PubMed and application of the ACMG standard guidelines); (4) categorized as pathogenic/likely pathogenic (7), uncertain significance (202), or benign/likely benign (142).
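As a quick consistency check, the counts and percentages reported in this section can be recomputed from the figures given in the text and in Figure 1. The variable and function names below are illustrative only.

```python
# Counts taken from the text and Figure 1.
N_EXOMES = 161
reported_group, candidate_group = 76, 275
classified = {
    "pathogenic/likely pathogenic": 7,
    "uncertain significance": 202,
    "benign/likely benign": 142,
}
participants = {
    "any pathogenic/likely pathogenic variant": 8,
    "autosomal recessive variant in one allele": 5,
    "autosomal dominant variant": 3,
}


def percent(count: int, total: int = N_EXOMES) -> float:
    """Percentage of the cohort, rounded to two decimals."""
    return round(100 * count / total, 2)


# Both groupings should account for the same 351 reviewed variants.
reviewed = reported_group + candidate_group
assert reviewed == sum(classified.values()) == 351

for label, count in participants.items():
    print(f"{label}: {count}/{N_EXOMES} = {percent(count)}%")
```

The three percentages agree with the 4.97%, 3.11%, and 1.86% stated above, and the carrier and dominant groups sum to the eight participants.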

| Stop-gain and frameshift variants
This study additionally evaluated stop-gain and frameshift variants in the reported and candidate groups. One stop-gain and two frameshift variants were identified using the initial filtering criteria, which accounted for a lower percentage than missense variants. The stop-gain variant in the reported group was MSH6 c.1444C>T (p.R482X), which was predicted to be pathogenic. As shown in Table 2, six of seven databases reported this variant as pathogenic/likely pathogenic. The maximum minor allele frequency was 0.006. The MSH6 c.1444C>T variant has been reported in several individuals with Lynch syndrome-associated cancers (Baglietto et al., 2010; Hendriks et al., 2004; Okkels et al., 2012; Sjursen et al., 2010).
Of the two frameshift variants, ATP7B NM_000053.4:c.3775_3776insAAAG p.(G1259Efs*14) in one participant was predicted to be likely pathogenic (Table 2). In contrast, one frameshift variant, MSH6 NM_000179.3:c.4068_4071dupGATT p.(K1358Dfs*2) (rs55740729), was identified in nine participants. The maximum minor allele frequency of this variant is 3.90%, which is not consistent with the disease presentation. The HGMD also reports this as a functional polymorphism. Therefore, by applying the PVS1, PP3, BS1, and BS2 criteria of the ACMG guidelines, this study classified this variant as a variant of uncertain significance, although several submitters classified it as benign/likely benign in ClinVar (Variation ID: 89518).

| DISCUSSION
This study is the first to search for actionable pathogenic variants in the Taiwanese population. In this study, approximately 1.86% of the Taiwanese individuals had reportable pathogenic or likely pathogenic variants, that is, two AR alleles or one AD allele, as determined by WES. According to ACMG, pathogenic (known or expected) variants for dominant disease or biallelic variants for recessive disease should be reported.
In the published literature regarding incidental findings in WES and whole-genome sequencing (WGS), the frequency of pathogenic/likely pathogenic variants has been variable, but approximately half are related to cardiovascular diseases (Amendola et al., 2015; Jain et al., 2018; Jang et al., 2015; Kwak et al., 2017; Olfson et al., 2015; Tang et al., 2018). By using the ACMG 59 gene list, a study involving 954 East Asian genomes found the frequency of pathogenic or likely pathogenic variants to be 2.5% (Tang et al., 2018). By using the ACMG 56 gene list, a study in Korea found a carrier frequency of actionable variants of 2.46% in 1303 individuals (Kwak et al., 2017). Another population-based study including 196 Korean exomes identified 11 pathogenic or likely pathogenic variants in 13 individuals (Jang et al., 2015). In addition, a study involving 6503 individuals reported frequencies of actionable variants of 2% and 1.1% in European and African groups, respectively (Amendola et al., 2015). This study enrolled only 161 individuals, a smaller sample than those of the above studies, which may have affected the results. Additionally, there could be some rare polymorphisms that were identified as pathogenic/likely pathogenic variants in this study.
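To give a rough sense of how much a sample of 161 limits precision, the sketch below computes a 95% Wilson score interval for the reportable frequency of 3/161. This interval is not reported in the study; it is an illustrative calculation under the usual binomial sampling assumption, and the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(3, 161)
print(f"point estimate: {3 / 161:.2%}")
print(f"approximate 95% interval: {low:.2%} to {high:.2%}")
```

Under these assumptions the interval is wide enough to include the frequencies quoted from the larger cohorts above, which is consistent with the caution expressed about sample size.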
This study reported four of seven distinct pathogenic or likely pathogenic variants in the ATP7B gene, and the carrier rate was 3.11%. Previous studies showed variable prevalence rates across different ethnicities. One study in France estimated a prevalence rate of 1.5 cases per 100,000. A study in Taiwan found a prevalence rate of 1.81 cases per 100,000 (Tai et al., 2018); therefore, the heterozygous carrier rate was 0.85% based on Hardy-Weinberg equilibrium. In our study, we identified five out of 161 individuals who were carriers of an ATP7B pathogenic variant. Thus, the carrier frequency of WD-related mutations was one in 32 (3.11%). This frequency is lower than that in France (3.2%) (Collet et al., 2018) but higher than those in Hong Kong (1.36%) (Mak et al., 2008), Korea (2%) (Park, Ki, Lee, & Kim, 2019), and the USA (1.1%) (Gao, Brackley, & Mann, 2019) according to molecular studies. This finding is consistent with Gao et al.'s observation that the East Asian population has the highest prevalence of Wilson disease, and the genetic prevalence of Wilson disease is greater than the epidemiological estimates (Gao et al., 2019). This finding is further supported by the fact that most of the pathogenic/likely pathogenic variants of ATP7B are found in the East Asian population in the Genome Aggregation Database (gnomAD), in which all eight individuals with variant c.2828G>A, 46 of 49 individuals with variant c.2804C>T and all 37 individuals with variant c.2333G>T were from the East Asian population. Considering those published studies and databases, the prevalence rate of Wilson disease and carrier rate of ATP7B gene mutations vary across different ethnicities. In particular, the prevalence rate and carrier rate are higher in East Asian countries, such as Taiwan, Korea, and China, and, surprisingly, in France. Therefore, given the history of the surrounding region, the higher carrier rate of ATP7B variants could be due to a founder effect.
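The Hardy-Weinberg step used above to convert the Taiwanese prevalence of 1.81 cases per 100,000 into a carrier rate can be reproduced with a short calculation. The sketch assumes a fully penetrant autosomal recessive disease and random mating, so that prevalence equals q² and the carrier rate equals 2pq; the function name is illustrative.

```python
import math


def carrier_rate_from_prevalence(prevalence: float) -> float:
    """Heterozygous carrier rate (2pq) implied by a recessive disease prevalence (q^2)."""
    q = math.sqrt(prevalence)  # frequency of the disease allele
    p = 1.0 - q                # frequency of the normal allele
    return 2.0 * p * q


# Epidemiological prevalence in Taiwan quoted in the text (Tai et al., 2018).
expected = carrier_rate_from_prevalence(1.81 / 100_000)

# Carrier rate observed in this cohort: 5 carriers among 161 exomes.
observed = 5 / 161

print(f"expected carrier rate from prevalence: {expected:.2%}")
print(f"observed carrier rate: {observed:.2%} (about 1 in {1 / observed:.0f})")
```

The expected value comes out at roughly 0.85% and the observed value at 3.11% (about 1 in 32), matching the figures in the text and making the gap between epidemiological and genetic estimates explicit.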
Additionally, there could be some rare polymorphisms in the ATP7B gene. During the investigation, we found correctly classifying ATP7B variants to be challenging. Many rare variants still lack functional analysis, and some variants might be rare polymorphisms, especially in East Asia. Classifying those variants and deciding whether further reporting is warranted remain debatable issues among clinicians, and those decisions might influence future medical decisions for individuals. Thus, further studies on these rare variants in the ATP7B gene will be needed to answer these questions and guide laboratories with regard to the need to report those variants.
This study had limitations. Most notably, it included only 161 Taiwanese participants, which may not fully represent the population. In the future, a multicenter study involving a larger number of participants will be necessary to evaluate the epidemiology of these pathogenic or likely pathogenic variants, especially those in the ATP7B gene.
In conclusion, we provided the first Taiwanese population data of incidental findings and highlighted a high carrier rate of candidate pathogenic/likely pathogenic variants in the ATP7B gene. This helps delineate the challenge of variant classification and benefits further interpretation.

ACKNOWLEDGMENTS
This work was funded by a grant from the Ministry of Science and Technology (107-2314-B-002-164-MY3 and MOST 108-2321-B-002-050) of Taiwan. We would like to express our sincere gratitude to the patients and their families who participated in this study.