Significance of CpG Methylation for Solar UV-Induced Mutagenesis and Carcinogenesis in Skin


  • Hironobu Ikehata,

    Corresponding author
    1. Department of Cell Biology, Graduate School of Medicine, Tohoku University, Sendai, Japan
    • Current address: Division of Genome and Radiation Biology, Department of Cell Biology, Graduate School of Medicine, Tohoku University, Aoba-ku, Sendai, Japan.

  • Tetsuya Ono

    1. Department of Cell Biology, Graduate School of Medicine, Tohoku University, Sendai, Japan
    • Current address: Division of Genome and Radiation Biology, Department of Cell Biology, Graduate School of Medicine, Tohoku University, Aoba-ku, Sendai, Japan.

  • This invited paper is part of the Symposium-in-Print: Photobiology in Asia.

*email: (Hironobu Ikehata)


Mutations detected in the p53 gene in human nonmelanoma skin cancers show a highly UV-specific mutation pattern, a dominance of C→T base substitutions at dipyrimidine sites plus frequent CC→TT tandem substitutions, indicating a major involvement of solar UV in skin carcinogenesis. These mutations have another important characteristic: they occur frequently at CpG dinucleotide sites, some of which are prominent hotspots in the p53 gene. Although mammalian solar UV-induced mutation spectra were studied intensively in the aprt gene using cultured rodent cells and the UV-specific mutation pattern was confirmed, the second characteristic of the p53 mutations in human skin cancers had not been reproduced. However, studies with transgenic mouse systems developed thereafter for mutation research, which harbor methyl CpG-abundant transgenes as mutation markers, fully reproduced the human skin cancer mutations in terms of both the UV-specific pattern and the frequent occurrence at CpG sites. In this review, we evaluate the significance of CpG methylation for solar UV mutagenesis in the mammalian genome, which would lead to skin carcinogenesis. We propose that the UV-specific mutations at methylated CpG sites, C→T transitions at methyl CpG-associated dipyrimidine sites, are a solar UV-specific mutation signature, and we estimate the wavelength range effective for the solar UV-specific mutation as 310–340 nm. We also recommend the use of methyl CpG-enriched sequences as mutational targets for studies on solar UV genotoxicity in humans, rather than conventional mammalian mutational marker genes such as the aprt and hprt genes.


Human nonmelanoma skin cancers (NMSC), whose major constituents are squamous cell carcinoma and basal cell carcinoma, are thought to result from repetitive exposures of the skin to sunlight, especially to the ultraviolet (UV) component, which consists of UVB (290–320 nm) and UVA (320–400 nm) wavelength-band regions (1,2). One part of evidence for the relevance of UV to human skin cancers is the mutation spectrum observed in the p53 gene, which plays a key role in maintaining the genome stability against genotoxic insults (3) and is known to mutate at a remarkably high frequency in various kinds of human cancers including NMSC (4–6). The p53 mutation spectrum observed in NMSC was highly UV-specific: a dominance of C→T base substitutions at dipyrimidine sites, where two pyrimidine bases directly neighbor each other in the DNA sequence, plus a small but considerable fraction of CC→TT tandem substitutions, which are considered a signature of DNA damage induced by UV (5–10).

Another important feature of the p53 mutations in NMSC is their preference for the 5′-CG-3′ dinucleotide (CpG) sequence, which is the consensus target motif for epigenetic DNA methylation in the vertebrate genome, through which the cytosine residues of CpG dinucleotides are methylated at position 5 of the pyrimidine ring (11). The CpG preference of the UV-specific mutations in NMSC leads to the appearance of mutation hotspots at several of the CpG sites in the p53 gene (9). Whereas mutation hotspots in the p53 gene have been detected in various types of human cancers and often correlate with CpG sites, the hotspots in NMSC are unique (Fig. 1): four of the six most frequent hotspots occur at CpG-associated dipyrimidine sites, and the second most frequent of them (at codon 196) has scarcely been observed as a major hotspot in other types of tumors (6,9,12,13).

Figure 1.

 Distribution in the p53 gene of point mutations (base substitutions and frameshifts, total number 322) detected in human nonmelanoma skin cancers from non-XP patients (12). Data of non-XP basal cell and squamous cell carcinomas were retrieved from a human p53 somatic mutation database (R8). The abscissa indicates codon positions in the p53 coding sequence. Frequencies of the mutations at each codon are shown by vertical lines. The codons associated with the CpG motif are shown by lollipops, in which closed and open circles indicate the CpG sites with or without coincidental dipyrimidine sites, respectively. The lollipops with a hatched circle indicate codons with CpG sites that include multiple CpG-associated cytosines, one or two of which belong to a dipyrimidine, whereas the other(s) belongs to a non-dipyrimidine dinucleotide. Codon numbers are given for relatively frequent recurrent sites including hotspots. The open rectangles on the top of the graph indicate arrangement of exons of the p53 gene along its coding sequence. The thick line under the rectangles shows CpG methylation status of the p53 coding region: an unbroken portion indicates a methylated region and broken portions indicate regions the methylation status of which is unknown.

Until the 1990s, research on mammalian mutation spectra induced by UVB, UVA, and solar UV had been performed with cultured cell systems, mainly using the endogenous aprt gene and exogenous bacterial genes on a shuttle vector as mutational targets (14,15). In these studies, the spectra obtained reproduced reasonably well one of the features of the p53 mutations in human NMSC: the dominance of the UV-specific mutation (C→T or CC→TT at a dipyrimidine site). However, the distribution of mutation hotspots in the genes was quite different: in the aprt gene, no hotspots appeared at CpG sites, whereas several hotspots were detected at non-CpG sites (14). Recently, this failure to reproduce the sequence preference observed for the NMSC p53 mutations has been remedied by the advent of transgenic mouse systems developed for mutation research (16–20). The successful outcome in transgenic systems comes from the difference in the methylation status of their mutational target sequences from that of small-sized endogenous genes. We review studies on the effect of DNA methylation on UV-induced mutagenesis, and evaluate the significance of genome methylation for sunlight-induced skin carcinogenesis. We also discuss the benefits of using CpG-methylated sequences as a mutational target for studies on the genotoxicity of sunlight UV.

Significance of CpG methylation for UV mutagenesis

Effects of CpG methylation on UV damage formation

UV induces specific base damage of cyclobutane pyrimidine dimer (CPD) and pyrimidine(6–4)pyrimidone photoproduct (64PP) at dipyrimidine sites in DNA (21). Pfeifer’s group studied the distribution of CPD and 64PP along the p53 gene by irradiating human cultured cells with UVC (254 nm) and found that the hotspots of the photolesions did not always correspond to those of the mutations detected in NMSC (22). They changed the UV source to natural sunlight, repeated the experiments and found that the formation of CPDs was enhanced up to 15-fold at the CpG-associated dipyrimidine sites that produced mutation hotspots in human NMSC (23). A similar enhancement of CPD formation was observed after UVB irradiation by Drouin and Therrien (24). The UVB/solar UV-promoted CPD formation at CpG sites requires methylation of the cytosine residue (23), whereas CpG methylation does not enhance the CPD production by UVC (25). Pfeifer’s group also determined the methylation status in the human p53 gene and confirmed complete CpG methylation at least along exons 5–8 in all the tissues and cells they examined, including keratinocytes (26). On the other hand, cytosine methylation is known to inhibit UVC-induced 64PP formation in DNA (27,28). Although it is unclear whether the same inhibitory effect occurs in the UVB-UVA range, the steady-state amount of 64PP formed on naked or cellular DNA is reduced and saturated at a low level, at least partly by conversion to Dewar valence isomers, in the solar UV-wavelength range (29–31).

Effect of CpG methylation on UV mutagenesis

To confirm that the p53 mutations in human NMSC are caused directly by exposure to solar UV, Monshinsky and Wogan analyzed mutations induced in human p53 cDNA using an expression vector, which was UVC irradiated and replicated in yeast (32). Although the obtained mutation spectrum was UV-specific and similar to that of the NMSC p53 mutations, the distributions of mutations along the gene were different, failing to reproduce the NMSC-specific hotspot profile of preference for some of the CpG-associated dipyrimidine sites (see Fig. 1). Methylation of some of these CpG sites of the p53 cDNA also had no effect on recovering the mutation hotspots at those methylated sites after UVC irradiation (25).

The inability of UVC to cause mutational hotspots at dipyrimidine sites associated with methylated CpG (mCpG) was also reported by Pfeifer’s group (16,17), using fibroblasts from BigBlue™ mice, which harbor transgenic bacterial lacI genes on a λ phage shuttle vector as a mutational target (33). They used two different target sequences for the mutation analysis: the lacI transgene and the phage cII gene, both of which were shown to retain heavy methylation at CpG sites (16,17). As they found previously that solar UV produced more CPDs at mCpG sites than UVC (23), they analyzed the effect of UV from a solar UV simulator on the mutation spectra and distributions in the lacI and cII genes, and succeeded in recovering UV-specific C→T transitions frequently at some of the mCpG-associated dipyrimidine sites (16,17). One of those recurrent mutation sites actually formed a hotspot in either gene, confirming that the high frequency of CPD formation at mCpG sites leads to high recovery of UV-specific mutations at some of those sites.

We have studied the mutation spectra induced in skin epidermis following exposure to UVB, UVA, and sunlight (18–20), using another transgenic mouse line, Muta™, which possesses the bacterial lacZ gene as a mutational marker transgene (34). The lacZ transgenes were shown to be heavily methylated at every examined CpG site in various tissues including the skin (18,35). Although similar and highly UV-specific mutation spectra were observed among those three UV sources (Table 1), we found a wavelength-dependent promotion of mutation occurrences at mCpG sites in the lacZ transgene. The longer the wavelength range of the UV source shifted, namely from UVB to solar UV (which consists of a part of UVB and the whole UVA) and further to UVA, the more frequently UV-specific mutations occurred at CpG sites (33% with UVB, 43% with sunlight and 66% with UVA; see Table 1). Moreover, the distributions of the recovered mutations along the lacZ gene showed clearly that the number of CpG-associated dipyrimidine sites where mutations were recovered increased as the wavelength range of UV source shifted longer (from 8 sites for UVB to 14 and 15 sites for sunlight and UVA, respectively; Fig. 2). Furthermore, mutational hotspots appeared for sunlight (position 1187) and UVA (positions 1187 and 1627), and one of the UVA hotspots was remarkably strong (position 1187, 19 mutations recovered; see Fig. 2). These results indicated that the longer wavelength components of UV contribute more to the induction of mutations and the appearance of hotspots at mCpG-associated dipyrimidine sites.
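The CpG percentages quoted above can be re-derived from the counts in Table 1. The following is a minimal Python sketch, assuming that the number of C→T transitions falling at dipyrimidine sites can be back-calculated from the rounded "% (Py–Py)" column and that percentages are rounded half up; the function and variable names are illustrative and do not come from the original studies.

```python
# Counts transcribed from Table 1: C->T transitions at CpG sites, the percentage
# of those located at dipyrimidine (Py-Py) sites, and total base substitutions.
TABLE1 = {
    "UVB":      {"ct_cpg": 26, "pct_pypy": 96,  "total": 75},
    "Sunlight": {"ct_cpg": 34, "pct_pypy": 100, "total": 80},
    "UVA":      {"ct_cpg": 55, "pct_pypy": 96,  "total": 80},
}


def round_half_up_ratio(numerator, denominator):
    """Integer ratio rounded half up (avoids float and banker's rounding)."""
    return (2 * numerator + denominator) // (2 * denominator)


def uv_specific_cpg_percent(ct_cpg, pct_pypy, total):
    # Assumption: recover the dipyrimidine-site count from the rounded percentage.
    at_dipyrimidine = round_half_up_ratio(ct_cpg * pct_pypy, 100)
    return round_half_up_ratio(100 * at_dipyrimidine, total)


percentages = {src: uv_specific_cpg_percent(**row) for src, row in TABLE1.items()}
for source, pct in percentages.items():
    print(f"{source}: {pct}% of base substitutions are UV-specific C->T at CpG")
```

Under these assumptions the sketch returns 33, 43 and 66% for UVB, sunlight and UVA, in agreement with the values given in the text.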

Table 1.   Base substitutions in skin epidermis of mice irradiated with different UV sources.

                 UVB*                      Sunlight†                 UVA‡
Mutation         Number (%)   % (Py–Py)§   Number (%)   % (Py–Py)§   Number (%)   % (Py–Py)§
C→T (CpG)        26 (35)      96%          34 (43)      100%         55 (69)      96%
C→T (nonCpG)     39 (52)      100%         38 (48)      100%         18 (23)      100%
T→C              1 (1)        100%         0                         0
C→G              2 (3)        100%         0                         0
C→A              1 (1)        100%         5 (6)        100%         2 (3)        50%
T→G              0                         0                         0
T→A              4 (5)        100%         2 (3)        100%         1 (1)        100%
CC→TT            2 (3)        100%         1 (1)        100%         4 (5)        100%
Others           0                         0                         0
Total            75                        80                        80

  1. *Data from Ikehata et al. (18). †Data from Ikehata et al. (20). ‡Data from Ikehata et al. (19). §Percentage of the mutation number occurring at dipyrimidine sites.

Figure 2.

 Distribution in the lacZ transgene of mutations detected in UVB-, sunlight- and UVA-irradiated Muta™ mouse skin epidermis. The horizontal open bar on the bottom of each graph indicates nucleotide positions in the lacZ coding sequence. Numbering of the positions begins from 1 at the first nucleotide of the start codon. Frequencies at each position of UVB-induced (total number 77), sunlight-induced (total 81) and UVA-induced (total 83) mutations are shown by vertical lines above the horizontal bar in the left, center and right graphs, respectively. The positions associated with the CpG motif are shown by lollipops, in which closed and open circles indicate the CpG sites with or without coincidental dipyrimidine sites, respectively. Position numbers are given for relatively frequent recurrent sites including hotspots.

The UVA source we used was black-light tubes, which emit the longer portion of UVB (310–320 nm) and UVA2 (320–340 nm) as well as UVA1 (340–400 nm), whereas the UV components from our UVB source and sunlight were 270–370 nm and 290–400 nm, respectively (20). As UVA1 is much less effective at inducing cytosine-containing CPDs than shorter UV (30), our results on the mutation distribution along the transgene suggest that the most effective UV wavelength for mutation induction at mCpG sites should reside in the range of longer UVB and UVA2 (310–340 nm), which is consistent with the results of Pfeifer’s group, who reported greater effectiveness of solar UV in enhancing CPD formation and mutation occurrence at mCpG sites, compared with UVC/UVB (16,17,23). As the wavelength range of 310–340 nm is the main solar-UV fraction that is abundant in sunlight and is dermatologically effective (for skin epidermis penetrability, DNA photolesion production, mutation induction, etc.), its influence would be reflected in the skin genotoxicity induced by solar UV. The accordance in the mutational hotspot preference of our experimental data with the NMSC p53 mutations supports this idea. Thus, the mCpG-preferred mutation occurrence specifies the genotoxicity of solar UV, especially for mammals. This idea may also be applicable to other vertebrates and plants, because their genome too is methylated at cytosine residues (11). We propose here that the C→T transition at a mCpG-associated dipyrimidine (Py-mCpG) site is a signature of solar UV.

In Pfeifer’s studies, the preference of UV-specific mutations for mCpG sites was less evident (16,17) than that observed for the NMSC p53 mutations and our lacZ transgene mutations (Figs. 1 and 2). This difference might result from the influence of the shorter wavelength components in solar UV that could be blocked by the cornified layer of the skin. The UV components below ∼300 nm, which are largely attenuated by the stratum corneum (36), may be less effective in producing CPDs preferentially at mCpG sites, and instead form them randomly at dipyrimidine sites irrespective of their methylation status. Pfeifer’s study on differences in CPD formation at mCpG sites among UVC, UVB and solar UV (23) partly supports this idea. The difference in hotspot appearances in our studies between UVB and sunlight/UVA (18–20) might also reflect this situation. Pfeifer’s group used cultured cells, which cannot prevent the shorter UV components from penetrating to the genomic DNA. Mutations induced by the shorter UV would have obscured the influence of the longer UV components effective in inducing mCpG-associated mutations by producing many more mutations at non-CpG sites.

Mechanism of mCpG-favored mutagenesis by solar UV

The mechanism for the induction of the UV-specific mutation of C→T transition at dipyrimidine sites has not been determined conclusively yet, but a persuasive model has been presented by Tessman and Kennedy (37,38). Although UV produces photolesions specifically at dipyrimidine sequences in DNA, it is known that UV-induced mutations prefer cytosine to thymine of dipyrimidines as a target residue, resulting in C→T base substitutions (39–43). The amino residue at position 4 of cytosine is a key to the mutation, although it is relatively stable in native state, deaminating at a half-life of about 30 000 years in double-stranded DNA at 37°C (44). Deaminated cytosines result in uracils, which are premutagenic to cause a C→T transition if not repaired. CPD formation at cytosine-containing dipyrimidine sites drastically enhances the instability of the amino residue of the cytosine and promotes deamination with a half-life of 2–100 h (45–48), resulting in uracil-containing CPDs. The uracil-containing CPDs could be bypassed ‘correctly’ according to the Watson–Crick’s base pairing rule with some specific translesion DNA synthesis (TLS) (37,38), inducing G→A base substitution opposite the uracil residue of the deaminated CPDs. Importantly, CPD formation at 5′-CC-3′ dipyrimidines could lead to double deamination from both the cytosine residues with a half-life of 3 h to 5 days, and the two deamination reactions seem to be sequential events and to occur not independently, but synergistically (46,48). The resultant products could lead to CC→TT tandem base substitutions, which are known as the UV signature mutation, through TLS.
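To give a feel for how strongly CPD formation shifts the deamination kinetics, the half-lives quoted above can be converted into the fraction of cytosines deaminated within a given time. The following is a minimal sketch, assuming simple first-order (exponential) kinetics, a 365.25-day year, and a purely hypothetical 24 h window between CPD formation and replication; none of these choices comes from the cited studies.

```python
import math

HOURS_PER_YEAR = 365.25 * 24  # assumption: Julian year


def fraction_deaminated(elapsed_hours, half_life_hours):
    """Fraction of cytosines deaminated after elapsed_hours (first-order decay)."""
    # expm1 keeps precision when the deaminated fraction is extremely small.
    return -math.expm1(-math.log(2) * elapsed_hours / half_life_hours)


WINDOW_HOURS = 24  # hypothetical interval, chosen only for illustration

# Half-lives taken from the text above, converted to hours.
scenarios = {
    "native cytosine (30 000 years)": 30_000 * HOURS_PER_YEAR,
    "cytosine in a CPD, slow end (100 h)": 100,
    "cytosine in a CPD, fast end (2 h)": 2,
}
for label, half_life in scenarios.items():
    fraction = fraction_deaminated(WINDOW_HOURS, half_life)
    print(f"{label}: {fraction:.3g}")
```

With these inputs the deaminated fraction of native cytosine stays below one in ten million, whereas for a cytosine within a CPD it ranges from roughly 15% to nearly 100%, which illustrates why deamination inside the dimer is regarded as the premutagenic step in this model.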

Methylation of cytosine at position 5, which occurs widely in vertebrate and plant genomes at CpG sites (11), also stimulates the deamination of cytosines in DNA, which results in thymine residues, shortening the half-life to about 2000 years (49). This chemical instability of 5-methylated cytosine (5mC) is considered to lead to mutational hotspots of spontaneous origin, which have been detected in genetic mutation assay systems (35,50) and in genes affected in human genetic diseases (51) and cancers (52). The methylation shifts the UV absorption peak of cytosine from 267 to 273.5 nm at pH 7.2, and the molar absorption coefficient of 5mC is about five-fold that of cytosine at 290 nm, which is the shortest wavelength of UV included in sunlight at the Earth’s surface (16,53). This absorption spectrum shift could, at least partly, contribute to the enhancement of CPD formation at mCpG sites by solar UV. Moreover, the presence of 5mC in CPDs seems to accelerate the deamination of the cytosine residue beyond the acceleration caused by CPD formation itself (54), although the mechanism of the methylation-dependent deamination enhancement has been controversial (55–57). In addition, CpG methylation could also promote the double deamination of CC-CPD (47). Thus, CpG methylation enhances not only the solar UV-induced CPD formation but also the acceleration of cytosine deamination by the CPD formation itself, leading to the promotion of the UV-specific C→T and CC→TT mutations at mCpG sites through TLS in solar UV mutagenesis.

TLS is mediated by specialized DNA polymerases, most of which belong to the Y family (58,59). The TLS polymerase specific for the CPD bypass is considered to be DNA polymerase η (pol η) (60,61), which performs an efficient and error-free bypass DNA synthesis specifically for CPDs (60–63). The high fidelity of pol η for DNA photolesions has also been supported by genetic data. Defects in the gene XPV, which encodes pol η, cause the variant type of xeroderma pigmentosum (XP), a human genetic disease associated with a high frequency of sunlight-induced skin cancers (64,65). Cells from XP variant patients are hypermutable after UV irradiation (66). In addition, studies with genetic mutation assay systems showed that TC-CPDs were not mutagenic unless they were deaminated (67), and that UV photolesions at 5′-CC-3′ or 5′-TC-3′ dinucleotides could be bypassed error-free in the presence of pol η (68). The error-free property of pol η fits well with the Tessman-Kennedy model of TLS (37,38), which predicts an accurate bypass DNA synthesis opposite CPDs, ‘unlucky’ deamination of which happens to lead to a C→T or CC→TT mutation. However, in the absence or shortage of pol η, these UV-specific mutations still appeared as a major type of mutation after UV irradiation (69,70), which might suggest that other TLS polymerases could take over the role of pol η in the error-free CPD bypass or simply perform error-prone TLS against UV damage by the ‘A-rule’ (71,72).

Why traditional mutation assay systems are not useful for solar UV study

Mammalian aprt gene, hprt gene and non-mammalian mutation assay systems

Traditionally, the hprt, aprt and tk genes have been utilized widely as mutational targets for mammalian mutation research. As these genes are endogenous, their output should have been expected to reflect the events that generally occur in vivo. The well-established methods to select forward and reversion mutants also benefit them as mutational markers more than the other mammalian genes (73,74). Furthermore, the small sizes of the aprt gene (∼2.7 kb) and the hprt cDNA (∼1.4 kb) facilitate molecular analysis of mutational changes in their DNA sequence (75–77). Wavelength-dependent UV-induced mammalian mutation spectra were studied intensively in the aprt gene using Chinese hamster ovary cell lines by Drobetsky et al. (14). They observed UV-specific mutation spectra after UVC, UVB and solar UV irradiations and also detected a few mutational hotspots. Those hotspots, however, did not appear at CpG sites as they did for the human NMSC p53 mutations (18,20). As mentioned before, most of the frequent mutation hotspots for human NMSC p53 mutations emerged at CpG-associated dipyrimidine sites (see Fig. 1). The CpG-favored hotspot appearance of UV-specific mutations in the p53 gene was confirmed in the studies of UVB/solar UV-induced experimental skin cancers using hairless mice (78,79).

The human p53 gene has 42 CpG sites in its coding region, which are fully methylated at least within exons 5–8 (26; see Fig. 1), where almost all mutation hotspots are detected in human cancers (80). Among the 84 cytosines of the 42 CpG sites (double-stranded DNA includes two cytosine residues at a single CpG site), at least 46 should be methylated because 23 of the CpG sites are located within exons 5–8. Twenty-one of those 46 cytosines reside within dipyrimidine sites and 19 of them are mutable by UV, producing amino acid changes (including chain termination) in gene products by C→T mutations (Table 2). If all the 42 CpG sites in the human p53 gene are methylated, 32 cytosines in them could be mutable through UV-specific mutagenesis (Table 2). On the other hand, although the hamster aprt gene has 20 CpG sites in its coding region, 15 of them reside in an unmethylated CpG island (81,82), resulting in only five methylated CpG sites, in which only five dipyrimidine sites coincide and four of them are mutable (Table 2). Thus, the number of UV-mutable Py-mCpG sites in the aprt gene is considerably smaller than that in the human p53 gene (4 vs. 19–32), whereas the coding region sizes of these genes differ by only about two-fold (543 bp vs. 1188 bp; see Table 2). The rarity of the Py-mCpG sites in the aprt gene would have precluded detection of UVB/solar UV-induced CpG-associated mutational hotspots in Drobetsky’s study (14). To evaluate without bias the sensitivity of each gene to a UVB/solar UV-specific mutation (C→T transition at Py-mCpG), we introduced an index called ‘solar UV-mutagenic sensitivity’ (Table 2), calculated by dividing the number of mutable cytosines in Py-mCpG sites in the gene by the length of the coding region (bp). With this index, the hamster aprt gene is estimated to be 2–3-fold less sensitive to the solar UV-specific mutagenesis than the human p53 gene (0.74% vs. 1.6–2.7%; see Table 2). Additionally, because only four or five hotspots have been detected among the 19–32 mutable Py-mCpG sites in the p53 gene (Table 2), the number of the mutable sites in the aprt gene (four sites; see Table 2) seems too small to bring about an outstanding mutation recurrence at those sites. Judging from the situation of the human p53 gene, at least five or six mutable Py-mCpG sites should be required in a given sequence to observe one hotspot for the solar UV-specific mutations. Thus, the aprt gene may be unsuitable for the detection of mutations occurring preferentially at mCpG sites such as the solar UV-specific mutations.
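The site counting described above (two cytosines per CpG site, each checked for a dipyrimidine context on its own strand) can be sketched in a few lines of Python. This is an illustrative sketch only: the input sequence is a made-up toy string rather than any real gene, every CpG is assumed to be methylated, and the further step of deciding whether a C→T change is "mutable" (i.e. alters the encoded amino acid) is deliberately left out because it requires the reading frame.

```python
PYRIMIDINES = {"C", "T"}
PURINES = {"A", "G"}


def count_py_cpg(seq):
    """Return (CpG sites, CpG cytosines, CpG cytosines within a dipyrimidine).

    seq is the top strand written 5'->3'; both strands are evaluated.
    """
    seq = seq.upper()
    cpg_sites = 0
    py_cpg_cytosines = 0
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "CG":
            continue
        cpg_sites += 1
        # Top-strand C: its 3' neighbour is the G of the CpG, so only the
        # 5' neighbour can complete a dipyrimidine (5'-PyC-3').
        if i > 0 and seq[i - 1] in PYRIMIDINES:
            py_cpg_cytosines += 1
        # Bottom-strand C pairs with the G at i+1. Its 5' neighbour on the
        # bottom strand is the complement of top-strand base i+2, which is a
        # pyrimidine exactly when that top-strand base is a purine.
        if i + 2 < len(seq) and seq[i + 2] in PURINES:
            py_cpg_cytosines += 1
    return cpg_sites, 2 * cpg_sites, py_cpg_cytosines


toy_sequence = "ATCGAACGTTCGG"  # hypothetical sequence for illustration
sites, cytosines, py_cytosines = count_py_cpg(toy_sequence)
print(sites, cytosines, py_cytosines)
```

For the toy sequence, the first and third CpG sites each contribute two cytosines in a dipyrimidine context and the second contributes none. Bases flanking the ends of the supplied sequence are unknown to the function, so cytosines at the very edges are counted conservatively.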

Table 2.   CpG methylation status and solar UV-mutagenic sensitivity of genes used for mutation research.

                                      Number of sites (number of cytosine residues included in the sites)
Gene             Coding region (bp)   CpG          mCpG             Py-mCpG    Mutable*    Solar UV-mutagenic sensitivity (%)†
Human p53        1,188                42 (84)      23–42 (46–84)    (21–39)    (19–32)     1.6–2.7
Hamster aprt     543                  20 (40)      5 (10)           (5)        (4)         0.74
Human hprt       657                  8 (16)       4 (8)            (4)        (3)         0.46
Muta™ lacZ       3,090                291 (582)    291 (582)        (265)      (173)       5.6
BigBlue™ lacI    1,083                95 (190)     95 (190)         (77)       (41)‡       3.8
λ phage cII      294                  22 (44)      22 (44)          (20)       (12)        4.1

  1. *Py-mCpG sites mutable to a missense or nonsense mutation by C→T transition of the methylated cytosine residue.

  2. †Calculated by dividing the number of cytosines included in the ‘Mutable’ sites by the length (bp) of the coding region of each gene.

  3. ‡Sixteen of them have been demonstrated to actually produce detectable mutations by a C→T transition (91).
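The ‘solar UV-mutagenic sensitivity’ index in Table 2 is a simple ratio and can be recomputed from the table's own columns. The sketch below is a minimal illustration in Python; the dictionary layout and names are assumptions made for this example, and the p53 entry carries a low and a high value because the number of methylated CpG sites outside exons 5–8 is not known.

```python
# Values transcribed from Table 2:
# gene -> (coding-region length in bp, (min, max) mutable Py-mCpG cytosines)
GENES = {
    "Human p53":    (1188, (19, 32)),
    "Hamster aprt": (543,  (4, 4)),
    "Human hprt":   (657,  (3, 3)),
    "Muta lacZ":    (3090, (173, 173)),
    "BigBlue lacI": (1083, (41, 41)),
    "lambda cII":   (294,  (12, 12)),
}


def sensitivity_percent(mutable_cytosines, coding_bp):
    """Mutable Py-mCpG cytosines per base pair of coding region, in percent."""
    return 100.0 * mutable_cytosines / coding_bp


sensitivities = {
    gene: tuple(sensitivity_percent(n, bp) for n in mutable)
    for gene, (bp, mutable) in GENES.items()
}

for gene, (low, high) in sensitivities.items():
    shown = f"{low:.2g}" if low == high else f"{low:.2g}-{high:.2g}"
    print(f"{gene}: {shown}%")
```

Rounded to two significant figures, the computed values agree with the last column of Table 2, which makes the index easy to apply to any other candidate target once its mutable Py-mCpG cytosines have been counted.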

Although the hprt gene is about 45 kb long, it has a relatively short coding region (657 bp) which is divided into nine exons, the first of which is in a CpG island and deduced to be unmethylated (82,83). The coding region of the human hprt gene has eight CpG sites, four of which reside in exon 1 and are assumed to be hypomethylated on an active X chromosome (83). Among eight cytosine residues of the other four downstream CpG sites, which are thought to be methylated on active X chromosomes (83), only three coincide in mutable dipyrimidine sites (Table 2). The solar UV-mutagenic sensitivity of the human hprt gene is 0.46%, which is even lower than that for the aprt gene (0.74%; see Table 2), suggesting that the hprt gene too would not be useful for studies on solar UV-induced mutagenesis, although it has not been utilized so much for such a purpose until now. It is known that the CpG frequency is usually suppressed to about one fifth of that which is expected in mammalian genome (11,82), and that most of those rare CpG sites are in CpG islands and are methylation-free (82). On the other hand, solar UV-specific mutations prefer those rare mCpG sites as shown in this review. Therefore, it should be cautioned that genes with a small-sized coding region, such as the aprt and hprt genes, are generally inappropriate as targets for UVB/solar UV-induced mutations.

In genetic studies on mutation, nonmammalian genetic systems like bacteria and fly are also often used as a substitute for mammals, with the expectation that the obtained results would be applicable for mammals. However, many of the genetic systems widely used in laboratories, like Escherichia coli, yeast and Drosophila, have cytosine-methylation-poor genomes (11). Hence, these nonmammalian systems would also not seem to be suitable for the detection of UVB/solar UV-induced mutations.


The advent of technology to produce transgenic cells and animals has brought great benefits to a broad region of research, including mutation research, in which transgenes are usually utilized as reporter sequences to detect mutations (84,85). The transgenes used for mutation assays are mainly of bacterial origin and not expressed in their host unless specially designed to do so, and most of them are subject to full CpG methylation, probably because of their genomic insertion in a multiple-copy tandem array and the abundance of CpG motifs (86,87). For example, it was actually shown for Muta™ and BigBlue™ rodents that their transgenes were heavily methylated in all organs and tissues examined (18,35,88,89). Although the inability to express the transgenes in hosts would be disadvantageous for studies on the effect of the transcription status on mutation induction, the abundance of mCpG may facilitate the efficient detection of methylated cytosine-mediated mutational events in those transgenes. In fact, this idea has been demonstrated most clearly in our studies of UVB/solar UV-induced epidermal mutations with Muta™ mice (18–20), where we succeeded in reproducing the spectrum and mutation-site preference of the human NMSC p53 mutations as mentioned above.

The lacZ transgene in Muta™ mice has 291 CpG sites, which is roughly three-fold more frequent in appearance per base compared with mammalian endogenous genes (Table 2). As all those CpG sites should be methylated, the transgene possesses 582 methylated cytosine residues, 265 of which coincide with dipyrimidine sites. Among these Py-mCpG sites, 173 sites are mutable (Table 2). Thus, the lacZ transgene is remarkably more abundant in Py-mCpG sites sensitive to solar UV-induced mutations than mammalian endogenous genes, and shows a solar UV-mutagenic sensitivity 2–3-fold higher than the p53 gene (5.6% vs. 1.6–2.7%; see Table 2). The abundance of solar UV-mutation sensitive sites in the lacZ gene also comes from the relatively large size of its coding region (∼3 kb). Until recently, large-sized genes had been considered undesirable for mutational targets, because long sequences would hinder efficient molecular analysis of mutational changes in the target. However, recent innovations in DNA sequencing technology have changed the situation. A size of 3 kb is no longer an obstacle to the analysis when a capillary DNA sequencer and dye-terminator sequencing chemistry are used, as shown in our research (90). The large size of the gene is now rather beneficial to mutation research, not only because it supplies a multitude of mCpG sites, but also because it provides a variety of sequence motifs in abundance, which would enable us to find the preference of some types of mutations for some specific, relatively longer sequence contexts that have not been identified yet.

For comparison, two more transgenes, the BigBlue™ lacI gene and the λ phage cII gene, which have also been widely used in in vivo mammalian mutation studies, are listed in Table 2. Similar to the Muta™ lacZ gene, these transgenes are thought to retain a high level of CpG methylation throughout the coding sequences (16,17,88,89), and are relatively rich in mutable Py-mCpG sites compared with the aprt and hprt genes. Furthermore, their mutagenic sensitivities to solar UV are also higher than those of endogenous mammalian genes and nearly equal to that of the lacZ gene (see Table 2). Hence, these two transgenes may also be suitable for studies of solar UV mutagenesis, as shown in Pfeifer’s studies (16,17).

Significance of genome methylation for mammalian skin cancers

As shown in this review, Py-mCpG sites in DNA are highly sensitive to solar UV-induced C→T mutations. This fact suggests that CpG-methylated genes in the mammalian skin genome should be vulnerable to the sunlight genotoxicity, and mutations in some of those genes could eventually lead to skin carcinogenesis. The p53 gene is the best known representative case, undergoing mutations with a preference for Py-mCpG sites in more than half of human NMSC (5,7,9; see Fig. 1). In general, the coding region of mammalian genes is divided into exons and separated by introns of various lengths. The first exon and the 5′-flanking region of the genes are often located in a CpG island, which is enriched with CpG dinucleotides and usually nonmethylated (82,92). In contrast, the regions of the genes outside the CpG island, including the downstream exons, are usually highly methylated at CpG sites although those sites are rare. Therefore, genes with a coding sequence long enough to include many CpG sites should be sensitive to the solar UV-specific genotoxicity, if most of those CpG sites reside in the exons outside CpG islands and are methylated. Actually, the first exon of the human p53 gene is separated from the other exons by ∼11 kb and contains no coding sequence. The downstream exons, which are located together within an 8.3-kb region, constitute the 1188 bp-long coding sequence for p53, which is abundant in the CpG motif (42 CpG sites), at least half of which are methylated (see Fig. 1 and Table 2). In addition to the p53 gene, another gene sensitive to solar UV-specific genotoxicity might be found among the genes involved in the induction of NMSC. The INK4a/ARF, PTCH, SMO and SHH genes might be such candidates (93). Further molecular analyses of mutations in those genes from NMSC will be needed to evaluate their sensitivities to solar UV-specific mutation.

Acknowledgements— We thank Ms. S. Kikuchi and Mr. B. Bell for manuscript preparation, and Dr. Takeshi Todo for providing the opportunity to summarize our recent studies and to clarify our proposition on solar UV mutation by reviewing the studies in the relevant fields.