Familial adenomatous polyposis (FAP; MIM# 175100) is an autosomal-dominant precancerous condition caused by germline mutations in the tumor suppressor gene adenomatous polyposis coli (APC; MIM# 611731) [Miyoshi et al., 1992a, 1992b]. APC consists of 15 exons and encodes a protein of 2,843 amino acids [Aretz et al., 2004]. To date, more than 900 different APC germline mutations have been registered in the Human Gene Mutation Database (www.hgmd.cf.ac.uk/ac) [Han et al., 2011].
Most of the mutations in APC are small insertions, deletions, or single-base substitutions that cause premature truncation of the APC protein. More than 60% of these mutations have been localized within exon 15 [Han et al., 2011]. Indeed, more than half of germline mutations are localized in the 5′ portion of exon 15, specifically in the area between codon 712 and codon 1579 [Nagase et al., 1992]. Similarly, two thirds of somatic mutations found in cancers are concentrated in less than 10% of exon 15, codon 1286 to codon 1513 [Miyoshi et al., 1992a, 1992b].
In contrast, a small number of mutations occur in introns, where they affect the consensus splice-site signals and are predicted to induce skipping of the neighboring exons [Aretz et al., 2004]. One of the first intronic sequence variants identified, in both somatic and germline DNA, was described in a poly(A) tract at the APC intron 3–exon 4 junction [Miyoshi et al., 1992a, 1992b]. This region is characterized by a repetitive AT-rich sequence, T7A13 (intron 3, nucleotides −23 to −4; Fig. 1). While performing diagnostic tests based on direct sequence analysis, we found that this region was recalcitrant to conventional sequencing techniques. Therefore, the aim of the present study was to adopt a sequence analysis technique based on mutagenesis to obtain sequence data from this difficult region. We subsequently determined the nucleotide sequence of this region in 100 healthy donors.
From January 2005 to January 2010, we performed APC mutational analysis on patients enrolled in the Interinstitutional Multidisciplinary Biobank (BioBIM) of the Department of Laboratory Medicine and Advanced Biotechnologies, IRCCS San Raffaele Pisana, Rome, Italy, using single-strand conformation polymorphism (SSCP). DNA samples showing an abnormal SSCP electrophoretic profile were further analyzed by direct sequencing as previously described [Palmirotta et al., 1995]. Beginning in 2011, we changed the laboratory operating procedures for genetic screening of FAP patients. Since then, the coding sequence and intron–exon borders of all 15 exons of APC have been amplified by polymerase chain reaction (PCR) and analyzed by direct sequencing without prescreening by SSCP. Automated DNA extraction was performed using a MagNA Pure LC instrument with the MagNA Pure LC total DNA isolation kit I (Roche Diagnostics, GmbH, Mannheim, Germany) in accordance with the manufacturer's instructions. PCRs were performed in a Veriti 96-well thermal cycler (Applied Biosystems, Foster City, CA) using the HotStarTaq Master Mix Kit (Qiagen Inc., Hilden, Germany). Sequencing reactions were performed on both forward and reverse strands with the appropriate primers using the Big Dye Terminator v 3.1 Cycle Sequencing Kit (Applied Biosystems), and run on an ABI 3130 Genetic Analyzer (Applied Biosystems).
Exon 4 Analysis Procedure 1: For exon 4, we carried out a 264 bp PCR amplification using primers 4F and 4R (Fig. 1), complementary to the two intronic regions adjacent to the exon, on the basis of the APC Ensembl (Ensembl accession number ENST00000257430) and GenBank sequences (GenBank accession numbers NG_008481.1 and NM_000038.5). The PCR proceeded for 32 cycles of 30 sec denaturation at 94°C, 30 sec annealing at 56°C, and 1 min extension at 72°C, preceded by an initial denaturation step of 15 min at 95°C and followed by a final extension step of 10 min at 72°C. The same pair of primers was employed for sequence analysis.
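As a quick check of the cycling program in procedure 1, the programmed hold times can be tabulated in a few lines of Python. This is only an illustrative sketch: the names `Step` and `total_hold_seconds` are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Temperatures and times are taken from the procedure 1 description;
# the data structure and all names are illustrative assumptions.
INITIAL = Step("initial denaturation", 95, 15 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 56, 30),
    Step("extension", 72, 60),
)
FINAL = Step("final extension", 72, 10 * 60)
N_CYCLES = 32


def total_hold_seconds(initial, cycle, final, n_cycles):
    """Sum the programmed hold times, ignoring ramping between steps."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


total = total_hold_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"Programmed hold time: {total / 60:.0f} min")
```

Under these assumptions the programmed hold time is 15 + 32 × 2 + 10 = 89 min, excluding ramping.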
Exon 4 Analysis Procedure 2: Suspecting a sequencing artifact induced by the repetitive sequence, we selected another pair of primers, 4F1 and 4R1 (186 bp product), complementary to different sites of exon 4 and intron 3 (Fig. 1). We tested several different PCR amplification conditions using both pairs of primers in combination with high-fidelity polymerases, Phusion Hot Start (Finnzymes, Keilaranta, Finland) and KAPA HiFi (Kapa Biosystems, Boston, MA). All the amplified products were also cloned using the pCR4-TOPO vector (TOPO TA cloning kit; Invitrogen, Carlsbad, California), and then purified and sequenced.
Exon 4 Analysis Procedure 3: We later developed a new PCR protocol using the HotStarTaq Master Mix and included the nucleoside analogue dPTP (Jena Bioscience, Jena, Germany) in the reaction mix [Keith et al., 2004; Zaccolo et al., 1996]. The PCR was performed in 30 μl of a mixture containing 50 ng of DNA, 0.5 μM of each oligonucleotide primer, a variable amount of distilled water, 15 μl of HotStarTaq Master Mix, and 400 μM of dPTP, so that the final reaction contained equimolar amounts of dNTPs and dPTP. PCR conditions, cloning, and sequencing were performed as described above. Sequences including mutations induced by dPTP were compared with reference sequences using the alignment tools of MacVector 10.6.0 software (MacVector Inc., Cary, North Carolina, USA). In interpreting these sequences, we took into account that, during PCR amplification, equimolar mixtures of the four normal dNTPs and dPTP most frequently produce the nucleotide changes A > G, G > A, T > C, and C > T [Zaccolo et al., 1996]. The study was performed under the appropriate institutional ethics approvals and in accordance with the principles embodied in the Declaration of Helsinki. Written informed consent was obtained from each participating individual.
Using conventional sequencing analysis (procedure 1) performed on APC exon 4, we observed, in both forward and reverse reads, an apparent frameshift involving the T7A13 sequence in all examined samples (Fig. 2A). The various attempts to avoid this “frameshift artifact” in the repetitive region upstream of APC exon 4 did not give any useful results (procedure 2). Despite these efforts, the sequencing reactions always showed an apparent A or T deletion, depending on whether forward or reverse primers were used, respectively (Fig. 2A and B). In contrast, when the product of a PCR carried out with dPTP was cloned and sequenced (procedure 3), a clear and readable electropherogram without any double peak or frameshift artifact was obtained (Fig. 2C).
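The orientation dependence of the apparent deletion can be illustrated with a toy model. The sketch below assumes, as one possible explanation that this study does not itself establish, that the sequenced product is a mixture of templates differing by one A in the tract (for example through polymerase slippage). The flanking sequences `UPSTREAM` and `DOWNSTREAM` are hypothetical placeholders, not the real APC sequence, and all function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overlay(reads):
    """Set of bases seen at each position when reads are superimposed."""
    length = min(len(read) for read in reads)
    return [{read[i] for read in reads} for i in range(length)]


def first_mixed_position(reads):
    """Index of the first position with more than one base, or None."""
    for i, bases in enumerate(overlay(reads)):
        if len(bases) > 1:
            return i
    return None


# Hypothetical flanks; only the T7A13 tract itself comes from the text.
UPSTREAM = "GCATC"
DOWNSTREAM = "CTGACGTACG"
full = UPSTREAM + "T" * 7 + "A" * 13 + DOWNSTREAM
short = UPSTREAM + "T" * 7 + "A" * 12 + DOWNSTREAM  # one A fewer

fwd = first_mixed_position([full, short])
rev = first_mixed_position(
    [reverse_complement(full), reverse_complement(short)]
)
print("forward read: extra base at divergence =", full[fwd])
print("reverse read: extra base at divergence =", reverse_complement(full)[rev])
```

In this toy mixture the two traces stay in phase until the end of the homopolymer run and are superimposed out of phase afterwards. The extra base at the point of divergence is an A in the forward orientation and a T in the reverse orientation, which is consistent with the pattern described above.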
Subsequently, to determine the relative frequency of the T7A13 polymorphism in the general population, we assayed 100 selected healthy individuals (200 alleles) enrolled in our BioBIM. For this study, the exclusion criteria were an age of less than 50 years, a personal history of cancer, or a family history of cancer in a first-degree relative at enrollment.
A (T)7 polymorphism is reported as a deletion/insertion polymorphism (rs35031194:-/T) in the NCBI SNP database (http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp), whereas the (A)13 variant is reported (rs71828958:-/A) without clinical significance or frequency data.
None of the tested samples showed a sequence variant in the poly-T region. In contrast, in three cases (3%), we observed a polymorphism involving a single-nucleotide variation of the (A)13 repetitive tract: all three cases carried an A deletion, giving (A)12 (Fig. 2D). Most of the commonly used web-based splice-site prediction tools did not recognize the normal splice acceptor site of APC exon 4; the exception was SplicePort (http://spliceport.cs.umd.edu/SplicingAnalyser2.html), which, however, did not identify any alternative splice site in the mutated sequence. RNA analysis followed by PCR amplification and sequencing revealed no splicing alterations involving APC exons 3–5 (Supp. Fig. S1).
In the present study, we report our experience with a technical sequencing issue that occurred during the mutational analysis of APC exons 3–5. We observed that direct sequence analysis of the repetitive AT-rich region T7A13 at the splice acceptor site of intron 3 can be misread as a frameshift mutation, leading to misinterpretation of the molecular results.
The A deletion in this region has been reported as a pathogenic variant [Miyoshi et al., 1992a, 1992b] and is listed in the Human Gene Mutation Database (www.hgmd.cf.ac.uk/ac, accession number CD921029) and the Leiden Open Variation Database (http://chromium.liacs.nl/LOVD2/colon_cancer/home.php; accession number APC_00744), where it is described as c.423-4delA, a splice mutation.
To bypass this potential methodological bias, we adopted a strategy involving locus-specific mutagenesis, based on the introduction of random mutations into the amplified DNA by PCR, followed by cloning and sequence analysis. This experimental protocol generates multiple DNA variants that are similar to the original sequence. Because these variants are randomly mutated copies in which the repetitive AT-rich tract is interrupted, they can be sequenced correctly [Keith et al., 2004; Zaccolo et al., 1996].
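The principle can be sketched with a deliberately simplified simulation. It assumes that each base of the tract is independently converted to its transition partner with a fixed per-base probability; the rate of 0.15, the number of clones, and all function names are hypothetical choices for illustration and are not parameters measured in this study. Real dPTP mutagenesis accumulates changes over successive PCR cycles, which this model does not attempt to reproduce.

```python
import random

# Transitions associated with dPTP in the text: A>G, G>A, T>C, C>T.
TRANSITION = {"A": "G", "G": "A", "T": "C", "C": "T"}


def mutagenize(seq, rate, rng):
    """Replace each base by its transition partner with probability `rate`."""
    return "".join(
        TRANSITION[base] if rng.random() < rate else base for base in seq
    )


def longest_run(seq):
    """Length of the longest homopolymer run in `seq`."""
    best, run, prev = 0, 0, None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        best = max(best, run)
    return best


def non_transition_differences(clone, reference):
    """Differences from the reference that a transition cannot explain."""
    if len(clone) != len(reference):
        raise ValueError("length change: not explained by transitions alone")
    return [
        (i, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference, clone))
        if alt != ref and TRANSITION[ref] != alt
    ]


TRACT = "T" * 7 + "A" * 13  # the T7A13 tract described in the text
rng = random.Random(0)  # fixed seed so the sketch is reproducible
clones = [mutagenize(TRACT, 0.15, rng) for _ in range(5)]  # hypothetical rate
for clone in clones:
    print(clone, longest_run(clone), non_transition_differences(clone, TRACT))
```

Each simulated clone has the same length as the reference, so a genuine one-base deletion would remain visible as a length difference, while the scattered transitions break up the homopolymer runs that hamper direct sequencing.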
Specifically, we designed and performed this series of experimental tests to evaluate the intron–exon border beyond the acceptor site. Indeed, this region of APC appears to be particularly prone to intronic mutations that lead to RNA splicing defects [Nasioulas et al., 2001].
Mutations at the APC-IVS3-1 and APC-IVS3-2 acceptor (AG) sites that cause skipping of APC exon 4 (c.423-1G>A, c.423-1G>C, c.423-1G>T, c.423-2A>G, c.423-2A>T) have been identified by several authors [Aretz et al., 2004; Friedl et al., 2001; Kraus et al., 1998; Kurahashi et al., 1995; Rivera et al., 2011; Spirio et al., 1999]. Similarly, a c.423-5A>G variant in the A13 region and a c.423G>T (p.Arg141Ser) variant leading to the same effect have been described [Aretz et al., 2004; Friedl et al., 2001; Murphy et al., 2007]. A nonpathogenic T deletion in the repetitive T7 tract has been reported as a deletion/insertion polymorphism and identified in both colorectal cancer patients and healthy controls [Boardman et al., 2001; Neklason et al., 2004]. By contrast, the deletion in the repetitive A13 tract has been described as a pathogenic mutation in both germline and somatic DNA [Miyoshi et al., 1992a, 1992b]. These reports, which are also reflected in web-based mutation databases, could easily lead to misinterpretation of molecular results and, in turn, to erroneous information being given to patients.
In the present study, we did not find the reported T deletion/insertion polymorphism. However, we observed the A deletion in the heterozygous state in 3% of unrelated samples. We also confirmed the nonpathogenic nature of the variant by analyzing the products of complementary DNA amplification spanning APC exons 3 through 5.
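For orientation, the carrier count can be converted into an allele frequency with an approximate confidence interval. The sketch below uses the Wilson score interval, which is our choice for illustration and not a method used in the study. It assumes that each of the three heterozygous carriers contributes exactly one variant allele out of the 200 alleles examined.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


carriers, individuals = 3, 100      # figures reported in the text
alleles = 2 * individuals           # 200 alleles
variant_alleles = carriers          # assumption: all carriers heterozygous
allele_freq = variant_alleles / alleles
low, high = wilson_interval(variant_alleles, alleles)
print(f"allele frequency: {allele_freq:.3f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The point estimate is 3/200 = 1.5% of alleles (3% of individuals), and the interval is wide, as expected for a sample of this size.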
To our knowledge, although the presence of extremely AT-rich sequences in several APC intronic regions has been demonstrated [Cowie et al., 2004], it has never previously been reported that this particular region is recalcitrant to conventional sequencing techniques. Similarly, the manual sequencing techniques adopted in many early mutational studies of APC did not reveal the problem of an erroneous frameshift. In fact, in data presented in previously published studies, manual sequencing of this APC region did not show any problem and was perfectly readable [Kraus et al., 1998; Kurahashi et al., 1995]. It is noteworthy that numerous diagnostic and research centers currently perform APC analysis using screening methods such as SSCP or denaturing high-performance liquid chromatography, as suggested by international guidelines [Aretz et al., 2011], and have never addressed this possible issue. Very likely, in most cases, the interpretation of sequence analysis has focused on the splice site, considering the upstream frameshift as a polymorphic T7 tract of little significance.
Strengths of this study include that, for the first time, a potential bias involving APC exon 4 has been observed and analyzed. Moreover, the mutagenesis-based method used in this study is relatively simple, is easily applicable in most molecular diagnostic laboratories, and can be useful for analyzing other APC repetitive regions [Cowie et al., 2004]. It is applicable as a second-level analysis in all cases in which conventional direct sequencing shows a frameshift variant that is suspected to be a technical artifact.
Limitations of the present study include the small number of subjects, which may explain at least in part why we did not observe any deletion in the poly T7 tract, and the possible laboratory bias inherent in performing a pilot technique not yet tested for diagnostic purposes.
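The effect of sample size on the chance of missing the T7 deletion can be made concrete with a short calculation. This is a sketch under the assumption that the 200 alleles are independent draws; the allele frequency of 1% used in the example is hypothetical and is not taken from this study or from the cited reports.

```python
def prob_zero_observed(allele_freq, n_alleles):
    """Probability of seeing no variant allele among n independent alleles."""
    return (1 - allele_freq) ** n_alleles


def upper_bound_given_zero(n_alleles, confidence=0.95):
    """One-sided upper bound on allele frequency when none is observed."""
    return 1 - (1 - confidence) ** (1 / n_alleles)


N_ALLELES = 200  # 100 individuals, as in the text
example_freq = 0.01  # hypothetical allele frequency, for illustration only
print(f"P(no variant seen | freq={example_freq}): "
      f"{prob_zero_observed(example_freq, N_ALLELES):.2f}")
print(f"95% upper bound with zero observed: "
      f"{upper_bound_given_zero(N_ALLELES):.4f}")
```

Under these assumptions, observing no deletion in 200 alleles is compatible with allele frequencies up to roughly 1.5%, so the absence of the T7 variant in this series does not exclude it as a low-frequency polymorphism.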
We applied this novel approach to analyze the repetitive AT-rich tract at the splice acceptor site of APC intron 3. Furthermore, we demonstrated that a variant previously described as a pathogenic mutation is a polymorphism that is hard to detect with conventional analytical methods. Because this is a pilot study, further studies are necessary to corroborate our results, to update the web-based mutation databases, and to better define the implications for the design of mutation-detection strategies in FAP patients.