A functional DNA methylation system in the pea aphid, Acyrthosiphon pisum


Thomas K. Walsh, CSIRO Entomology, Centre for Environment and Life Sciences (CELS), Floreat Park, WA 6914, Australia. Tel.: +61 (0)8 9333 6126; fax: +61 (0)8 9333 6646; e-mail: tom.walsh@csiro.au


Methylation of cytosine is one of the main epigenetic mechanisms involved in controlling gene expression. Here we show that the pea aphid (Acyrthosiphon pisum) genome possesses homologues to all the DNA methyltransferases found in vertebrates, and that 0.69% (±0.25%) of all cytosines are methylated. Identified methylation sites are predominantly restricted to the coding sequence of genes at CpG sites. We identify twelve methylated genes, including genes that interact with juvenile hormone, a key endocrine signal in insects. Bioinformatic prediction using CpG ratios for all predicted genes suggests that a large proportion of genes are methylated within the pea aphid.


Methylation of DNA at cytosine residues and the modification of chromatin through post-translational modifications of histone proteins are the main epigenetic mechanisms in eukaryotes. This ‘epigenetic code’ regulates expression without changing the DNA sequence by altering the accessibility of genes to the transcription machinery. Methylation of cytosine residues at CpG dinucleotides in DNA is common in many eukaryotic genomes and is usually a suppressive mechanism acting on specific genes, selfish genetic elements, or even entire chromosomes in the case of X chromosome inactivation (Delcuve et al., 2009).

Epigenetic mechanisms play key roles in adaptation to a changing environment and have been implicated in the regulation of phenotypic plasticity (Moczek & Snell-Rood, 2008). In insects, many species use phenotypic plasticity to adapt rapidly to new environmental conditions, and aphids display an extreme example of this. A single female aphid genotype can self-propagate by viviparous parthenogenesis or, in response to environmental signals, can produce up to seven additional distinct morphs (Le Trionnaire et al., 2008). In some species this polyphenism can include soldier morphs, but more commonly involves the production of unwinged reproductive morphs (apterae), winged dispersal morphs (alatae), and sexual morphs of either gender. Winged individuals develop as a result of over-crowding or poor nutritional conditions (Sutherland, 1969; Muller et al., 2001). Sexual morphs are produced in autumn, upon day-length decrease. How external stimuli such as changing photoperiod or crowding are perceived by an aphid, and how that stimulus results in the production of alternative morphologies remain unclear. Epigenetic control of gene expression is one potential mechanism. Clearly, endocrine signals are involved: Juvenile hormone (JH) regulates the switch from parthenogenesis to sexual reproduction in aphids, and may be involved with the development of wings (Corbitt & Hardie, 1985; Braendle et al., 2006).

For many years it was thought that very little if any methylation occurred in insect genomes. In Drosophila melanogaster DNA methylation is reported at very low levels, though the actual amount and significance await further clarification (Goll & Bestor, 2005). Where it has been observed, methylation is found in intergenic regions at asymmetric CpT and CpA dinucleotides (i.e. non-CpG dinucleotides), and does not appear to be important for proper development (Lyko et al., 2000; Kunert et al., 2003; Field et al., 2004; Marhold et al., 2004; Phalke et al., 2009). Recent evidence from several insect species has suggested a greater than hitherto suspected role for DNA methylation in insects (Field et al., 2004; Wang et al., 2006; Mandrioli & Borsatti, 2007; Kronforst et al., 2008; Kucharski et al., 2008). An excellent example can be seen during caste differentiation in the honey bee, Apis mellifera. In this species, the development of queens is induced by the addition of royal jelly to the diet of genetically identical larvae destined otherwise to be workers (Haydak, 1943; Barchuk et al., 2007). Kucharski et al. recently demonstrated that silencing the expression of DNA methyltransferase 3 (AmDnmt3), a key component of de novo DNA methylation, results in all larvae phenotypically resembling queens (Kucharski et al., 2008). This work suggests that DNA methylation has a key role in regulating queen determination in honey bees and that epigenetic mechanisms could regulate morph transitions in other insect species.

Previous studies have shown that aphids exhibit DNA methylation. Specifically, organophosphate-resistant green peach aphids, Myzus persicae, are methylated within the coding sequence of the gene responsible for the resistance phenotype, the E4 esterase gene (Field et al., 1996, 1999; Hick et al., 1996). Interestingly, unlike the general suppressive effect of methylation seen in higher plants and animals, this aphid E4 esterase appears to be only expressed if methylated (Field, 2000). More recently, Mandrioli and Borsatti used an immuno-histochemistry approach to show that DNA methylation is also present in the pea aphid, Acyrthosiphon pisum (Mandrioli & Borsatti, 2007). They suggest that the pattern of DNA methylation in the pea aphid appeared different to that of other eukaryotes. Ribosomal RNA genes, methylated in many eukaryotic genomes (Tweedie et al., 1997), showed no evidence of DNA methylation in the pea aphid (Mandrioli & Borsatti, 2007). However, this study did not identify genes or regions of the genome where methylation occurred.

The sequencing of the pea aphid genome provides the ideal opportunity to examine the role of DNA methylation in aphid development. In this paper we show that A. pisum has the full repertoire of DNA methylation enzymes. We also demonstrate that methylcytosine is present within the genome, and we identify twelve genes that are methylated in the pea aphid. Some of these methylated genes are associated with JH regulation, a key endocrine signal in all insects.


Identification of pea aphid DNA methyltransferases

Two DNA methyltransferase 1 (Dnmt1) genes, known as maintenance methyltransferases, were identified: ApDnmt1a (ACYPI004389) and ApDnmt1b (ACYPI006318). They encode ∼1300 amino acid proteins that are 77% identical, and which have 49% identity and 64% similarity to the AmDnmt1a and AmDnmt1b honey bee orthologues (46% and 64% to human Dnmt1). Both predicted Dnmt1 proteins contain BAH (cd04760) conserved domains and the cytosine-C5 specific DNA methylase conserved domain (cd00315) characteristic of Dnmt1 genes. The pea aphid ApDnmt2 (ACYPI007944) shows 46% identity and 64% similarity to AmDnmt2 (47% and 64% to Drosophila melanogaster; 44% and 62% to human Dnmt2). Two potential Dnmt3 genes, the de novo methyltransferases, were also identified. ApDnmt3 (ACYPI633911) appears to be partial in the genome because of a large gap in the genome assembly at the 5′ end of the gene, and our attempts to extend the sequence have so far been unsuccessful. Over the available length (330 amino acids out of ∼800) ApDnmt3 has 38% identity and 55% similarity to that of Ap. mellifera (41% and 62% to human Dnmt3b isoform 1) and contains cd00315. In addition, another potential Dnmt3 (ApDnmt3X, ACYPI677434, 569 amino acids) was observed that is only distantly related to other known Dnmt3 genes and does not contain a recognized cd00315 domain (24% identity and 40% similarity to human Dnmt3a).

Alignment of the predicted amino acid sequences of the Dnmts with ClustalX shows that they all cluster with their homologues in vertebrates and the honey bee (Fig. 1). Most of the Dnmts so far discovered possess conserved residues in the catalytic domains and A. pisum Dnmt sequences align well with the conserved residues identified in the honey bee genes (Fig. 2). ApDnmt3X seems to be missing many of the conserved amino acids thought to be necessary for Dnmt activity. The functional significance of ApDnmt3X is not clear; it may be a non-functional pseudogene or it may function in a similar manner to the vertebrate Dnmt3L.

Figure 1.

Clustal neighbour-joining analysis of the amino acid sequences of the DNA methyltransferases (Dnmts). Dnmts from the pea aphid Acyrthosiphon pisum (Ap), the honey bee Apis mellifera (Am), the fruit fly Drosophila melanogaster (Dm), the human Homo sapiens (Hs) and the mouse, Mus musculus (Mm) were aligned with ClustalX (ApDnmt1a ACYPI192698, ApDnmt1b ACYPI495020, AmDnmt1a GB15130, AmDnmt1b GB19865, HsDnmt1 NP001370, MmDnmt1 NP034196, ApDnmt3 ACYPI633911, ApDnmt3X ACYPI677434, AmDnmt3 GB14232, HsDnmt3a NP787046, HsDnmt3b NP072046, MmDnmt3a NP031898, MmDnmt3b NP001003961, AmDnmt2 GB1076, ApDnmt2 ACYPI416799, DmDnmt2 AAF03835, HsTrdmt1 NP004403, MmTrdmt1 NP034197). Only 400 amino acids at the C terminus of the Dnmt3s were included due to the truncated nature of the ApDnmt3 predicted gene.

Figure 2.

Pea aphid homologues of vertebrate DNA methyltransferases (Dnmts). The predicted amino acid sequences of the Dnmts for the cytosine-C5 specific DNA methylase conserved domain (cd00315) from the fruit fly, the honey bee and the human were aligned with the predicted pea aphid Dnmts using ClustalX. Roman numerals indicate the conserved regions within cd00315. Conserved amino acids are identified by shading. The arrow indicates an amino acid change uniquely associated with Acyrthosiphon pisum.

In addition to the Dnmts, the A. pisum genome contains many of the other genes that are involved in DNA methylation. Methyl-CpG-binding protein (ApMBP, ACYPI004592; 50% identity and 68% similarity to AmMBP) and Dnmt1-associated protein (ApDmap1, ACYPI004738; 55% identity and 71% similarity to AmDmap1) are present. The presence of these genes and chromatin modification proteins such as histone deacetylases confirms that all the components for epigenetic regulation of gene expression exist in the pea aphid genome (Rider et al., 2010).

Methylcytosine quantification in the pea aphid genome

The total level of methylcytosine as a percentage of cytosine in the A. pisum genome, calculated by LC-ESI-MS/MS, was 0.69% (±0.25%, n = 18) of all the cytosines, with a range from 0.43% to 1.27%. gDNA samples extracted from whole parthenogenic aphids (n = 9), oviparous females (n = 3), males (n = 3) and sexuparae (viviparous aphids that produce sexual offspring, n = 3) were used. Males possessed a greater amount of methylcytosine (1.16% ± 0.10) compared with the females (0.60% ± 0.14), potentially due to the absence of embryos or eggs. A positive control of calf thymus DNA (10.78% ± 1.58, n = 6) and a pGEM-T Easy vector plasmid DNA sample (0.010% ± 0.009, n = 6) were included. Overall the pea aphid genome is very AT rich with a GC content of only 29.6%, while the transcripts show a higher GC content of 38.8% (International Aphid Genomics Consortium, 2010). While we have identified some non-CpG methylation, if most of the methylation occurred at CpG sites, approximately 2.76% of all CpGs in the pea aphid genome would be methylated.
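As a rough consistency check, the overall mean can be recovered from the sex-specific means. The Python sketch below assumes that the female figure pools all 15 non-male samples (nine parthenogenic, three oviparous and three sexuparae); this grouping is our reading of the sample list and is not stated explicitly. Variable names are illustrative.

```python
# Weighted mean of the two reported group means.
# Assumption: "females" covers all 15 non-male samples.
male_mean, male_n = 1.16, 3
female_mean, female_n = 0.60, 15

overall = (male_mean * male_n + female_mean * female_n) / (male_n + female_n)
print(f"{overall:.2f}% of cytosines methylated")
```

Under that assumption the result rounds to 0.69%, in line with the overall mean reported for the 18 samples.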

Methylated genes

The first approach we used to identify methylated genes in the pea aphid was to generate genomic DNA libraries putatively enriched for methylated genome sequence. These were created using two different methods: amplification of inter-methylated sites (AIMS) and methylated DNA immunoprecipitation (MeDIP). A subset of 40 randomly selected clones from each library was screened for methylation with bisulphite sequencing, covering a total of about 20 000 base pairs. No methylation was found in the AIMS-generated library, but four cloned genomic regions from the MeDIP library displayed methylation and all of these regions encoded potential genes (Table 1). All of the methylated cytosines were found at CpG sites within predicted coding sequence.

Table 1. Genes containing methylated cytosine identified by bisulphite sequencing in Acyrthosiphon pisum

Methylated gene ID | Method of identification | Annotation | Previously identified | Methylated CpGs | Base pairs examined | EST | CpG ratio
ACYPI000089 | Direct bisulphite sequencing | Casein kinase | Yes – Apis mellifera | 1 | 191 | Yes | 0.36
ACYPI004010 | Direct bisulphite sequencing | Dynactin p62 | Yes – Ap. mellifera | 15 | 762 | Yes | 0.69
hmm239374 | Direct bisulphite sequencing |  | Yes – Ap. mellifera | 4 | 790 | No | 0.68
ACYPI007883 | Direct bisulphite sequencing | Nucleoporin Nup205 | Yes – Ap. mellifera | 5 | 1210 | No | 0.45
ACYPI009632 | Direct bisulphite sequencing | Similar to Nup205 | No | 7 | 647 | No | 0.55
ACYPI008731 | Immunoprecipitation |  | No | 2 | 401 | Yes | 1.11
ACYPI009563 | Immunoprecipitation |  | No | 4 | 455 | Yes | 1.12
ACYPI001529 | Immunoprecipitation | Similar to cadherin | No | 6 | 390 | Yes | 0.64
LOC100169634 | Immunoprecipitation | T-complex subunit epsilon | No | 3 | 263 | Yes | 0.94
ACYPI563350 | Direct bisulphite sequencing | JHEBP | No | 11 | 1074 | Yes | 1.05
ACYPI154871 | Direct bisulphite sequencing | Cytosolic JHBP | No | 10 | 1219 | Yes | 0.76
ACYPI307696 | Direct bisulphite sequencing | JH epoxide hydrolase | No | 5 | 1503 | Yes | 1.36

Our second approach to identify methylated genes was to analyse genes known to be methylated in other insects, which included six from the honey bee and one from the green peach aphid M. persicae (Hick et al., 1996; Wang et al., 2006). Homologues to five of the methylated honey bee genes were identified in the pea aphid genome and bisulphite sequencing of three clones per gene was performed. We observed methylation in four of the five pea aphid homologues to the honey bee genes in the same regions of the genes (Table 1). One of the targeted Ap. mellifera genes had two A. pisum homologues and both were methylated. All the methylated cytosines were found at CpG sites and all were in the predicted coding sequence of these genes. The closest A. pisum homologue (ACYPI623066) to the methylated E4 esterase gene in M. persicae (Field et al., 1996) was found not to be methylated in the pea aphid.

Our last approach was to examine the methylation state of specific target genes of interest. Because previous work has implicated JH in regulating development and caste determination in many insect species (Verma, 2007) as well as phenotypic plasticity in aphids (Le Trionnaire et al., 2008), we analysed several pea aphid genes identified by screening the genome for homologues to known JH-regulating genes in other insects (International Aphid Genomics Consortium, 2010). These genes were subjected to bisulphite sequencing from gDNA of whole individuals. A predicted cytoplasmic JH-binding protein (ApJHBP, ACYPI006428), JH-esterase-binding protein (ApJHEBP, ACYPI563350) and a JH epoxide hydrolase (ApJHEH3, ACYPI307696) were found to be methylated in the regions examined, with 10, 11, and 5 methylated CpG sites, respectively (Table 1). We found no evidence for DNA methylation in JH acid methyltransferase 1 and 2 (ACYPI007696, ACYPI001588) or in JH epoxide hydrolase 1 and 2 (ACYPI006263, ACYPI008135).

We then examined the methylation pattern of JHBP and JHEBP in more detail by comparing the frequency of methylation in at least 10 clones randomly selected from PCR product libraries generated from winged and unwinged parthenogenic morphs (Fig. 3). For these experiments, genomic DNA was extracted from adult heads to ensure that no embryonic DNA was analysed. Considerable variation among clones was observed in the methylation state at individual sites, perhaps indicating tissue-specific variation in methylation patterns within the heads. At some methylated sites, methylation was detected in all the clones sequenced while, at others, the methylation frequency was as low as 16% (Fig. 3). Overall, there was no clear difference in methylation pattern associated with morph. However, at one CpG site in ApJHBP the frequency of methylation was 85.7% in apterae compared with 54.5% in alatae (two-sample test of proportions, Z = 1.76, P = 0.08) (Fig. 3). MassARRAY analysis of three pooled PCR reactions at this site confirmed that methylation was reduced by approximately 50% in alate compared with apterous morphs.
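The reported Z statistic for the differentially methylated ApJHBP site can be reproduced with a two-sample comparison of proportions. The sketch below is illustrative only: the clone counts (12 of 14 apterous clones, 6 of 11 alate clones) are hypothetical values chosen because they match the reported 85.7% and 54.5%, and the unpooled standard error is an assumption about how the statistic was computed.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic and two-sided P for a difference in proportions (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

# Hypothetical clone counts consistent with 85.7% (apterae) and 54.5% (alatae).
z, p = two_proportion_z(12, 14, 6, 11)
print(f"Z = {z:.2f}, P = {p:.2f}")
```

With these assumed counts the sketch gives Z = 1.76 and P = 0.08, in agreement with the values quoted above. The function does not guard against proportions of exactly 0 or 1 in both groups, where the standard error is zero.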

Figure 3.

Methylation patterns in juvenile hormone associated genes. Juvenile hormone binding protein (ApJHBP) and juvenile hormone esterase binding protein (ApJHEBP). The location of amplicons is indicated by grey shaded boxes, which also show the number of methylated cytosines within the region. The frequency of methylation of each identified methylated cytosine is shown in the graphs beneath each gene, expressed as a percentage of all independent clones examined. The black dot indicates a differentially methylated CpG site between alate and apterous aphids. The black box indicates where CpA methylation was identified.

The methylated cytosines associated with these genes were found almost exclusively at canonical CpG sites within the coding regions of the genes. The only exception was in ApJHBP, where CpA methylation was observed at three CpA dinucleotides in an intron, located just after the end of the exon. The frequency of this pattern varied, but all three cytosines were methylated in 72% of the sequenced clones (n= 47).
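The per-site frequencies described here follow from the logic of bisulphite sequencing: unmethylated cytosines are converted and read as T, whereas methylated cytosines are protected and still read as C. A minimal Python sketch of that tally is given below; the function name, the toy reference and the toy clone reads are hypothetical, and clones are assumed to be already aligned to the reference with no gaps.

```python
def cpg_methylation_frequency(reference, clones):
    """Fraction of clones retaining C at each CpG cytosine of the reference.

    Assumes every clone is aligned to, and the same length as, the reference.
    Returns {0-based position of the CpG cytosine: methylated fraction}.
    """
    reference = reference.upper()
    sites = [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]
    freqs = {}
    for pos in sites:
        # Keep only informative calls: C (methylated) or T (converted).
        calls = [c[pos].upper() for c in clones if c[pos].upper() in ("C", "T")]
        if calls:
            freqs[pos] = calls.count("C") / len(calls)
    return freqs

# Toy data: reference with CpGs at positions 1 and 5, four converted clone reads.
reference = "ACGTTCGA"
clones = ["ACGTTTGA", "ACGTTCGA", "ATGTTTGA", "ACGTTTGA"]
freqs = cpg_methylation_frequency(reference, clones)
print(freqs)
```

In this toy example three of four clones retain C at the first CpG and one of four at the second. The same tally restricted to CpA positions would give the corresponding frequencies for non-CpG sites.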

DNA methyltransferase expression

Any change in methylation state associated with the development of the alate/apterous morphs could necessitate a change in expression of one or more Dnmts. Environmental signals could effect a change in Dnmt expression in the generation detecting the signal (mothers), or in the generation whose development is altered (developing embryos). Unfortunately, it is difficult to test for differential Dnmt expression in developing embryos because one cannot differentiate between alatae and apterae at this developmental stage. However, when we measured Dnmt expression in the heads of crowded (which induces alate formation) vs. uncrowded mothers, we found that ApDnmt2 exhibited significantly higher levels in the crowded treatment (1-sided t-test, P= 0.038) (Fig. 4). There was also a trend towards higher expression of both Dnmt1 enzymes in the crowded treatment, but neither was significant (P > 0.10).

Figure 4.

Real-time quantitative PCR analysis of the DNA methyltransferases (Dnmts). The relative expression levels of each ApDnmt in the heads of crowded (grey) and uncrowded (white) unwinged asexual females. ApGAPDH was used as a control for normalization. Each mean is given plus or minus one standard error resulting from six biological replicates of each treatment.

CpG ratio calculation

Methylated cytosines are prone to spontaneous deamination, which converts 5-methylcytosine directly to thymine, producing a C-to-T transition if the resulting mismatch is not repaired. As a consequence, methylated CpGs are likely to decrease in abundance over evolutionary time, and the ratio of observed to expected CpGs can be used to predict historically methylated genes (Suzuki et al., 2007). We calculated the frequencies of CpGs in all predicted A. pisum genes, comparing these frequencies with the expected occurrence based on the actual GC content within the gene.
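A minimal Python sketch of an observed/expected CpG ratio for a single sequence follows. The normalization used here, the CpG frequency divided by the product of the C and G frequencies, is a commonly used form and is an assumption; the exact formula applied to the pea aphid gene set is not given in this section. The function name and the example sequence are hypothetical.

```python
def cpg_ratio(seq):
    """Observed/expected CpG ratio for one sequence.

    Assumed normalization: (CpG count / length) / (C fraction * G fraction).
    Returns None when the sequence contains no C or no G.
    """
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return None
    observed = seq.count("CG") / n
    expected = (c / n) * (g / n)
    return observed / expected

example = "ACGTCGGA"  # 2 CpGs, 2 C, 3 G, length 8
ratio = cpg_ratio(example)
print(round(ratio, 2))
```

A ratio well below 1 indicates CpG depletion, as expected for sequence that has been methylated in the germline over evolutionary time; a ratio near or above 1 indicates no such depletion.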

The histogram of CpG ratios calculated for the coding sequence of each predicted gene of the pea aphid showed a bimodal distribution (Fig. 5A) (International Aphid Genomics Consortium, 2010). This suggests that there are two broad groups of genes within the pea aphid, those that show evidence of historical patterns of DNA methylation and those that do not. This bimodal pattern in the CpG ratio data is similar to that observed in another phenotypically plastic insect, the honey bee, in contrast to the unimodal peaks observed for D. melanogaster and Tribolium castaneum (Elango et al., 2009; International Aphid Genomics Consortium, 2010). Both the pea aphid and the honey bee CpG ratio distributions have negative kurtosis values (−0.60 and −0.58, respectively), which is one indicator of a bimodal distribution.
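To illustrate why negative kurtosis points towards bimodality, the sketch below computes excess kurtosis (fourth standardized moment minus 3) for two toy samples. It uses the simple population-moment estimator without small-sample correction, which is an assumption; the estimator used for the values quoted above is not specified. The data are invented purely for illustration.

```python
def excess_kurtosis(values):
    """Excess kurtosis from population moments (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0

two_peaks = [0, 0, 1, 1]              # mass split between two separated values
one_peak = [0, 1, 1, 1, 1, 1, 1, 2]   # mass concentrated at a single value

print(excess_kurtosis(two_peaks), excess_kurtosis(one_peak))
```

The two-peaked sample gives a negative value and the single-peaked sample a positive one. Negative kurtosis alone does not prove bimodality (a flat distribution is also negative), which is why it is described as only one indicator.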

Figure 5.

CpG ratios in the coding sequence of Acyrthosiphon pisum (observed/expected). The observed CpG frequencies for each gene were extracted from available genome sequence data, while the expected values were calculated based on the GC content of each of the 10 465 sequences. The best fit unimodal peaks predicted to represent low CpG vs. high CpG genes were calculated as described in the methods.

We assume that the observed bimodal distribution arises from two ‘overlapping’ distributions, representing, in this case, genes that show evidence of historical methylation and genes that do not. Assuming also that the two underlying peaks are symmetrical, we predicted the actual distributions for the ‘low CpG’ and ‘high CpG’ distributions based on the best fit, when combined, to the observed distribution (χ2 = 51.0, df = 37, P > 0.05) (Fig. 5B).


We have demonstrated that the pea aphid, A. pisum, has homologues to all Dnmts present in vertebrates, and that DNA methylation is found in multiple genes. All of the DNA methylation we observed was within genes – the majority at CpG sites within the coding sequence. Our analyses of total cytosine methylation and of CpG ratios in predicted pea aphid genes both suggest that gene body DNA methylation occurs widely in this species. Our results are the first to identify the DNA methylation machinery in a hemimetabolous insect, and show that Hemiptera is the second identified insect order after Hymenoptera with a fully functional vertebrate-style methylation system. The Hymenoptera are the basal group of holometabolous insects, suggesting that other holometabolous insects such as the Diptera have lost aspects of the methylation system (Tweedie et al., 1999; Savard et al., 2006).

DNA methylation machinery in Acyrthosiphon pisum

All known examples of cytosine methylation rely on the DNA methyltransferase (Dnmt) family of enzymes that catalyse the transfer of a methyl group to DNA using S-adenosyl methionine as the methyl donor. The A. pisum genome contains homologues to all the Dnmt genes found in vertebrates, and all were actively expressed (Dnmt3X was not examined). Dnmt1 is the maintenance methyltransferase and is responsible for passing on the methylation signals during cell replication, while Dnmt3 is involved in establishing new methylation patterns. In mammals, Dnmt2 has been identified as an RNA cytosine methyltransferase and is not thought to methylate DNA (Goll et al., 2006). However, it is the only Dnmt expressed in D. melanogaster, a species that does exhibit limited intergenic DNA methylation including CpA methylation (Kunert et al., 2003).

In addition to the Dnmts, A. pisum has many of the other genes that have been shown to be crucial to the maintenance and de novo methylation of DNA. Genes such as methyl-CpG-binding protein and Dnmt1 associated protein are thought to be involved in the recruitment of histone modification proteins such as histone deacetylases (Kass et al., 1997; Jones et al., 1998). It is clear from our work that A. pisum has the full complement of methyltransferase and methylation associated genes required to have a functional DNA methylation system.

Location and extent of Acyrthosiphon pisum DNA methylation

LC-ESI-MS/MS analysis of A. pisum genomic DNA indicated that only a small percentage of cytosines are methylated (0.69% of all cytosines). This is more than observed in D. melanogaster, where 0% and 0.3% were detected in adults and embryos, respectively, but considerably less than observed in humans, where 2–10% of all cytosines are methylated (Razin & Riggs, 1980; Ehrlich & Wang, 1981; Lyko et al., 2000). Combined across all our approaches, we analysed over 50 000 bp of pea aphid genomic DNA for the presence of methylated cytosines. Although around half of this targeted sequence was intergenic (primarily from the genomic libraries), we failed to identify any methylated intergenic regions. This agrees with earlier evidence that gene body DNA methylation occurs in aphids (Field, 2000). Our results add to a growing body of evidence that most methylated CpGs in A. pisum and other invertebrates are located in the coding regions of genes. Gene body methylation has also been demonstrated in the honey bee (Kucharski et al., 2008), as well as in other invertebrate species (Simmen et al., 1999). Interestingly, though our bisulphite sequencing did not identify any methylated CpG sites outside of coding sequences, we did identify three methylated CpA sites in one intronic sequence. To our knowledge, this is the first example of CpA methylation in insects outside of Drosophila (Kunert et al., 2003).

Additional evidence for gene body methylation comes from immunofluorescence studies on A. pisum, which suggested that aphid heterochromatin (where transcription is limited) lacks DNA methylation, and that methylated CpGs are localized in euchromatin (Mandrioli & Borsatti, 2007). In addition, DNA methylation is absent in ribosomal DNA genes (Mandrioli & Borsatti, 2007), a group of genes typically methylated in vertebrates. Indeed, methylation within coding sequences appears to be associated with an ancestral function, perhaps acting to enhance expression by preventing aberrant transcription initiations (Field et al., 2004; Mandrioli, 2004). The recent discovery of extensive DNA methylation within coding sequences in Arabidopsis thaliana and humans suggests that this function may be retained in plants and animals (Hellman & Chess, 2007; Zilberman et al., 2007; Backdahl et al., 2009; Ball et al., 2009). Indeed, the primary role now associated with DNA methylation in vertebrate promoter regions – gene silencing – appears to be a derived function (Mandrioli, 2007). That said, the regulation of E4 esterase in the aphid Myzus persicae by intragenic CpG methylation does have the on-off switching characteristic of the vertebrate methylation system, except that it is the presence, not the absence of methylation that is associated with active expression (Field, 2000).

While our overall estimate of methylated cytosine within the genome is low compared to vertebrates, coding sequence is estimated to make up less than 1% of the A. pisum genome. Assuming that methylation in A. pisum is indeed found exclusively or predominantly at CpG sites within coding sequence, our estimate of 2.76% genomic CpG methylation would suggest that a large number of pea aphid genes could be influenced by DNA methylation. This conclusion is supported by the CpG ratio data, which indicate that 40% of predicted genes show evidence of historical methylation (Fig. 5). However, while methylated cytosine is more susceptible to mutation to thymine, any sequence change would only be passed on if the mutation occurred in the germ cells; mutations elsewhere would have no effect on the DNA sequence of the offspring. If DNA methylation is common in the coding sequence of genes then this could explain why some of the genes with methylation sites do not have a low CpG ratio (Table 1). The implication of this result is that DNA methylation may influence a greater number of genes than the 40% of genes that we predict show evidence of historical methylation. It is also interesting to note that many of the genes in the predicted ‘high CpG ratio’ peak have a greater than expected relative abundance of CpGs (CpG ratio >1, Fig. 5). Why there are a large number of genes with a high CpG ratio is not clear, though it has been speculated that a mechanism may exist to maintain or select for CpG enrichment in honey bee genes (Elango et al., 2009).

Methylated genes in Acyrthosiphon pisum

In this study we have identified twelve methylated genes in A. pisum. This number includes genes that have previously been identified as methylated in Ap. mellifera, as well as some novel genes identified using whole genome screening approaches and targeted bisulphite sequencing. Four of the five methylated Ap. mellifera gene homologues we identified were also methylated in A. pisum, suggesting that there may be a conserved subset of methylated genes among some insects. The closest homologue in A. pisum to the only methylated gene previously identified in aphids, the E4 esterase gene in M. persicae (Field, 2000), was not methylated in the fragments we analysed. However, methylation of E4 in M. persicae is uniquely associated with amplification of up to 80 identical copies of the gene, leading to insecticide resistance (Hick et al., 1996), a situation not observed with the E4 homologue we identified in the A. pisum genome. The carboxylcholinesterase group as a whole in A. pisum has 30 members, and further work would be warranted to investigate methylation in other esterases (Ramsey et al., 2010). We identified a further three genes with CpG methylation by targeting genes associated with JH regulation. These are all highly conserved genes in insects (Riddiford, 2008), but we do not yet know whether methylation of these genes is widespread in insects.

Four genes were identified as methylated using MeDIP and none were identified using AIMS. It is not known whether the four novel genes identified by immuno-precipitation are also methylated in other species of insects. The poor performance of both the AIMS and the MeDIP assays is worthy of note. Both methods are designed to target mammalian DNA methylation patterns, characterized by dense CpG methylation in regions with a high GC content. The more dispersed pattern of DNA methylation in insects could interfere with the ability of the anti-methyl cytosine antibody to bind in sufficient numbers to enrich adequately for methylated DNA. Similarly, the AIMS protocol relies on methylation sensitive and insensitive restriction enzymes that recognize the CCCGGG motif commonly found in vertebrate CpG islands, but which is rare in coding sequence. In fact, none of the methylated CpG sites found in A. pisum were found to be part of such a motif. It may be possible to improve the AIMS technique in insects by using a combination of enzymes that target a recognition site more frequently observed in coding regions. Ultimately, new techniques such as whole genome sequencing after bisulphite treatment may be required to comprehensively identify methylated regions of insect and other invertebrate genomes.

Phenotypic plasticity and DNA methylation

Of the methylated genes identified in this study, perhaps the most interesting are those whose products are involved in the metabolism and transport of JH. Many developmental processes in insects are governed by JH, and it also contributes to the regulation of morph determination in aphids, including parthenogenic vs. sexual morphs and possibly apterous vs. winged morphs (Corbitt & Hardie, 1985; Le Trionnaire et al., 2008). JH is also known to be involved in the development and maintenance of caste differences in Ap. mellifera (Capella & Hartfelder, 1998; Barchuk et al., 2002). In addition, queen bee differentiation is affected by Dnmt3 expression (Kucharski et al., 2008), and methylation appears to be involved in the development of other castes as well (Elango et al., 2009). It is therefore reasonable to hypothesize that aphid polyphenism may be regulated in part by methylation, potentially including the methylation of genes involved in the synthesis, transport, or catabolism of JH. The fact that Dnmt2 expression varied significantly between crowded and uncrowded aphids is puzzling, because it is not known to methylate cytosines at CpG sites. However, we have demonstrated CpA methylation and it is possible that there are yet more motifs within the pea aphid that would be affected by Dnmt2. Further research on the role of CpA methylation in the pea aphid and other insects is certainly warranted.

We did not see any consistent changes in DNA methylation patterns in JH-associated genes in head gDNA extracts between winged and unwinged morphs. JH titre in the haemolymph is modulated on an organism-wide scale, primarily by the regulation of JH synthesis (Shinoda & Itoyama, 2003). Neither of the two identified JH synthesis genes, the JH acid methyltransferases (JHAMT), displayed evidence of methylation in A. pisum. It is more likely that any JH regulation of aphid wing polyphenism would be achieved locally through tissue-specific gene expression of enzymes involved in JH transport or catabolism. We found juvenile hormone esterase binding protein (ApJHEBP) and juvenile hormone binding protein (ApJHBP) to be extensively methylated, though no differences were detected between winged and unwinged morphs in the DNA samples we examined. JH catabolism has been proposed as a mechanism for localised control of JH, usually through degradation by JH epoxide hydrolase (JHEH) or JH esterase (JHE) (Anand et al., 2008). We detected low levels of DNA methylation in ApJHEH3, but we saw no obvious variation in methylation patterns between morphs.

The fact that we did not detect differences between morphs in this study does not preclude the possibility that methylation of these genes is contributing to the regulation of morph development. Given the tissue-specific nature of some DNA methylation patterns in insects and vertebrates (Wang et al., 2006; Nagase & Ghosh, 2008), it is possible that changes in methylation are localized to specific tissue types, and a much more targeted approach would be necessary to detect them all.


DNA methylation in A. pisum appears to be concentrated at CpG sites in coding regions, which seems to be the norm in many invertebrates (Mandrioli, 2004, 2007; Elango et al., 2009). Interestingly, there is no obvious difference in the methylation machinery or the chemistry of DNA methylation between invertebrates and vertebrates, yet there appears to be a clear difference in function (Mandrioli, 2007). Comparing and contrasting DNA methylation in invertebrates and vertebrates should present an exciting opportunity to understand the evolution of DNA methylation, and the pea aphid is an excellent invertebrate model to include in these studies.

Experimental procedures

Biological material

Where possible, LSR1-A1-G1, the aphid strain used for the genome project, was used; however, a local Western Australian line and an American line were also used. All of the aphids used in this work were originally obtained from alfalfa (Medicago sativa), which should minimize the confounding effects of host adaptation. To maintain cultures of primarily wingless, viviparous, parthenogenetic individuals (apterae), aphids were reared at low densities (two to three aphids per plate) in an incubator at 19 °C with a 16 h light: 8 h dark cycle, in 15 mm Petri dishes, each with a leaf of Medicago arborea inserted into 3 ml of 1% agar containing 1 g l−1 Miracle-Gro (Brisson et al., 2007). To induce winged individuals (alatae), aphids were placed in groups of 10 in a 55 mm Petri dish containing a moistened piece of Whatman paper for 24 h (Sutherland, 1969; Braendle et al., 2005; Brisson et al., 2007). To ensure controlled comparisons, aphids were also placed singly in similar 10 mm Petri dishes to generate apterae for analysis. Sexual morphs were induced in an A. pisum clone (International Aphid Genomics Consortium, 2010) maintained on broad bean (Vicia faba) at 18 °C by decreasing the day-length from 16 h to 12 h light. After one generation, parthenogenetic females producing sexuals (sexuparae) were removed (Le Trionnaire et al., 2007). gDNA from all morphs was extracted from either whole bodies or individually dissected heads (without antennae). gDNA was extracted in 300 µl of Tri Reagent (Molecular Research Center, Inc., Cincinnati, OH, USA) and frozen at −20 °C.

Gene annotation

Homologues of different insect genes were identified by mining the genomic data for the A. pisum genome (Acyr 1.0 version of the assembly) at AphidBase (http://www.aphidbase.com). This was done using the corresponding D. melanogaster or Ap. mellifera sequences as bait and the collection of predicted proteins for A. pisum as targets (program blastp) (Altschul et al., 1990). In order to ensure all the potential DNA methyltransferases were identified we also searched the genome using all DNA methyltransferases from Arabidopsis thaliana, Danio rerio and the model filamentous fungus Neurospora crassa.

Amplification of inter-methylated sites

Regions between methylated sites were selectively amplified from the pea aphid genome using the AIMS protocol (Toyota et al., 1999; Frigola et al., 2002). Briefly, genomic DNA was isolated using the DNeasy genomic DNA extraction kit (Qiagen, Valencia, CA, USA) and 2 µg of gDNA was digested with 20 units of the methylation-sensitive restriction enzyme SmaI (New England Biolabs, Boston, MA, USA) for 3 h at 25 °C. This enzyme generates blunt ends. The sample was then digested overnight at 37 °C with 20 units of XmaI (New England Biolabs), which is methylation insensitive and generates sticky ends (C/CCGGG). Adaptors were prepared by incubating the oligonucleotides (Table 2) at 65 °C for 2 min followed by cooling at room temperature for 1 h. Digested DNA (1 µg) was ligated to 2 nmol of adaptor using T4 DNA ligase (Fermentas, Glen Burnie, MD, USA) overnight at 16 °C. The ligated DNA was purified using the HighYield Gel/PCR DNA extraction kit (RBC Bioscience, Chung Ho City, Taiwan) and eluted in 100 µl. PCR amplification was done with 1 µl of the purified, digested, and adaptor-ligated DNA using primer 1 and Platinum Taq (Invitrogen, Carlsbad, CA, USA) in a 50 µl reaction, with one cycle of 94 °C for 2 min followed by 30 cycles of 94 °C for 30 s, 58 °C for 30 s and 72 °C for 3 min, on a PTC-100 thermal cycler (MJ Research, Watertown, MA, USA). PCR products were purified with a PCR purification kit (RBC Bioscience) and ligated into the pGEM-T Easy vector (Promega, Madison, WI, USA). DNA sequencing was carried out by AGRF (Australian Genome Research Facility, St Lucia, QLD, Australia).
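
The logic of the SmaI/XmaI double digest can be illustrated in silico. The following Python sketch is not part of the original protocol: it assumes a hypothetical input (the set of CCCGGG sites that are methylated) and lists the fragments expected to carry XmaI sticky ends at both termini, which are the fragments to which adaptors can ligate on both sides. Function names and the toy sequence are illustrative only.

```python
import re

SITE = "CCCGGG"  # recognition sequence shared by SmaI and XmaI


def find_sites(sequence):
    """Return the 0-based start position of every CCCGGG site."""
    return [m.start() for m in re.finditer(SITE, sequence.upper())]


def amplifiable_fragments(sequence, methylated_sites):
    """Return (left_site, right_site) pairs bounding fragments with two sticky ends.

    Assumption: SmaI cuts every unmethylated site (blunt ends) and XmaI then
    cuts the remaining, methylated sites (sticky ends). Only a fragment
    bounded by two adjacent methylated sites has adaptor-compatible ends on
    both sides.
    """
    sites = find_sites(sequence)
    methylated = set(methylated_sites)
    return [
        (left, right)
        for left, right in zip(sites, sites[1:])
        if left in methylated and right in methylated
    ]


# Toy sequence with three sites; the methylation status is hypothetical.
toy = "AAACCCGGGTTTTCCCGGGAAAACCCGGGTT"
print(find_sites(toy))
print(amplifiable_fragments(toy, {3, 13}))
```

In this toy case only the fragment between the first two sites would be recovered; if the middle site were unmethylated, SmaI would leave a blunt end on both neighbouring fragments and neither would amplify.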

Table 2. Primers used in this work. Only those bisulphite primers that were successful in identifying methylated sites are included.
Bisulphite PCR primers
  Pea aphid juvenile hormone related
  Pea aphid honey bee homologs
Amplification of intermethylated sites (AIMS)
  Adapt: CCGGTCAGAGCTTTGCGAAT


A method based on immunoprecipitation was used to generate a gDNA library enriched for methylated sequences (Salzberg et al., 2004). Briefly, genomic DNA was extracted using the Qiagen DNeasy kit (Qiagen), and 2 µg of gDNA was digested for 3 h at 37 °C with 20 units of the restriction enzyme DpnII (New England Biolabs), leaving a GATC overhang. Adaptors (DpnII-adapt1 agcactctccagcctctcaccgca and DpnII-adapt2 gatctgcggtga) were ligated overnight at 16 °C with T4 DNA ligase (New England Biolabs). The 12-mer adaptor was removed by heating the reaction to 72 °C for 3 min, and the ends were filled in with Taq (5 U) for 5 min at 72 °C. The DNA was then purified using a HighYield Gel/PCR DNA extraction kit (RBC Bioscience).

The reaction volume was made up to 150 µl with phosphate-buffered saline (PBS) containing 0.05% Triton X100 and incubated at 4 °C overnight on a rotating platform with 5 µg of the anti-5′-methylcytosine antibody (Aviva Systems Biology, San Diego, CA, USA). Protein G sepharose beads (40 µl) were added and incubated on a rotating platform for 1 h at room temperature. The beads were washed three times with 5 ml of PBS, followed by a final wash with 1 ml TE. The DNA was eluted by incubating at 65 °C for 15 min (vortexing every 2 min) in 50 µl TE, and then spun for 30 s at maximum speed. The supernatant was transferred to another tube. This elution was repeated twice and the supernatants were pooled. The DNA was then purified using a PCR clean-up kit (RBC Bioscience).

PCR amplification was done with 1 µl of the purified, digested and adaptor-ligated DNA using DpnII-adapt1 and Platinum Taq (Invitrogen) in a 50 µl reaction, with one cycle of 94 °C for 2 min followed by 30 cycles of 94 °C for 30 s, 58 °C for 30 s and 72 °C for 3 min, on a PTC-100 thermal cycler (MJ Research, Watertown, MA, USA). PCR products were purified with a PCR purification kit (RBC Bioscience) and ligated into the pGEM-T Easy vector (Promega). DNA sequencing was carried out by AGRF (Australian Genome Research Facility, St Lucia, QLD, Australia).

Bisulphite sequencing

Once potential targets for methylation were identified, they were validated by bisulphite sequencing. Genomic DNA was extracted using the Qiagen DNeasy kit (Qiagen). Bisulphite primers were then designed using the Methprimer web-based primer design tool (Li & Dahiya, 2002) (Table 2). Bisulphite conversion of gDNA was done using the Zymo Research EZ bisulphite conversion kit. PCR was performed using Platinum Taq (Invitrogen) in a 50 µl reaction, with one cycle of 94 °C for 2 min followed by 44 cycles of 94 °C for 30 s, 48 °C for 30 s and 72 °C for 1 min, on a PTC-100 thermal cycler (MJ Research), and the product was either cloned or sequenced directly. If the PCR product was cloned, between three and ten clones were sequenced.

The bisulphite conversion process is never 100% efficient, and to avoid the possibility of unconverted cytosines being identified as methylated we performed a control conversion using the Universal methylated DNA Standard (Zymo Research Corp., Orange, CA, USA). Complete conversion of the unmethylated cytosines was observed after PCR and sequencing of eight clones. In addition, repeated sequencing of the same fragments (up to n = 50) from different bisulphite treatments and different PCR products consistently showed unconverted cytosines at the same sites. While there were occasionally other unconverted cytosines in the sequences, these were almost always in a single cloned PCR product and they represented 0.37% of all the cytosines in the sequenced DNA, a conversion rate consistent with the quality control data provided by the manufacturer (>99% conversion using the Zymo Research EZ bisulphite conversion kit).
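
The distinction drawn here, between cytosines that remain unconverted in most clones and sporadic non-conversion confined to a single clone, can be expressed as a per-site tally. The Python sketch below is illustrative only and was not used in this study. It assumes that clone reads have already been aligned to the unconverted reference without gaps and are of equal length; the function names and the 0.5 threshold are hypothetical choices.

```python
def unconverted_fraction(reference, clones):
    """Map each reference cytosine position to the fraction of clones still reading C.

    After bisulphite treatment and PCR, an unmethylated C reads as T, whereas
    a methylated (or unconverted) C still reads as C.
    """
    ref = reference.upper()
    for clone in clones:
        if len(clone) != len(ref):
            raise ValueError("clones must be aligned to the reference (equal length)")
    fractions = {}
    for pos, base in enumerate(ref):
        if base != "C":
            continue
        calls = [clone[pos].upper() for clone in clones]
        informative = [b for b in calls if b in ("C", "T")]
        if informative:
            fractions[pos] = sum(b == "C" for b in informative) / len(informative)
    return fractions


def candidate_methylated_sites(reference, clones, threshold=0.5):
    """Return positions where more than `threshold` of clones retain a C."""
    fractions = unconverted_fraction(reference, clones)
    return sorted(pos for pos, frac in fractions.items() if frac > threshold)


# Hypothetical toy data: three clones of an 8 bp amplicon.
reference = "ACGTCATC"
clones = ["ACGTTATT", "ACGTTATT", "ACGTTATC"]
print(unconverted_fraction(reference, clones))
print(candidate_methylated_sites(reference, clones))
```

In the toy data the cytosine at position 1 (a CpG) is retained in every clone, whereas the cytosine at position 7 is retained in only one clone, the pattern expected for background non-conversion.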

Total DNA methylation analysis

The total amount of methylcytosine was quantified using a liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS) method for determination of cytosine and 5-methylcytosine in DNA (Kok et al., 2007). Briefly, 1 µg of RNase-treated gDNA from each of three separate phenol-chloroform gDNA extractions was hydrolysed at 150 °C for 3 h in formic acid with aqueous internal standards of [13C2, 15N3]-cytosine and [2H4]-5-methylcytosine. A positive control of calf thymus DNA and a negative control of plasmid DNA were also run. All analyses were run in triplicate and performed at Proteomics International (University of Western Australia, Australia) on an Applied Biosystems 4000Q-trap mass spectrometer (Applied Biosystems, Foster City, CA, USA).

Calculation of CpG ratios

Genome-wide analysis of CpG frequencies has been used in invertebrates as an in silico method of predicting levels of intragenic methylation (Suzuki et al., 2007). Methylated cytosine tends to deaminate spontaneously to thymine, so methylated CpG sites are gradually converted to TpG. As methylation is most commonly found at CpG sites, the frequency of CpG sites within methylated genes will decrease over time unless under positive selection.

CpG frequencies were calculated for every predicted gene in the aphid genome using the reference sequence data (International Aphid Genomics Consortium, 2010). For each gene, the number of CpG dinucleotides within the coding sequence was determined to give the observed CpG frequency, while the expected CpG frequency was calculated based on the GC content of the gene and assuming random assortment of nucleotides. For example, if all the nucleotides were present at the same frequency then the expected frequency of CpG within the sequence would be 1/16th (6.25%). A CpG ratio (observed/expected) was then calculated for each gene; a histogram was plotted showing the frequency distribution of genes within a range of CpG ratios.
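
As an illustration of this calculation, the following Python sketch computes an observed/expected CpG ratio for a single coding sequence. It is a minimal sketch rather than the script used in this study, and it assumes that the expected frequency is the product of the C and G frequencies of the sequence, which is one common way of expressing random assortment of nucleotides. Function names are hypothetical.

```python
def expected_cpg_frequency(seq):
    """Expected CpG frequency under random assortment: freq(C) * freq(G)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    return (seq.count("C") / len(seq)) * (seq.count("G") / len(seq))


def observed_cpg_frequency(seq):
    """Observed CpG frequency: CG dinucleotides over all dinucleotide positions."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    return seq.count("CG") / (len(seq) - 1)


def cpg_ratio(seq):
    """Observed/expected CpG ratio; raises if the sequence lacks C or G."""
    expected = expected_cpg_frequency(seq)
    if expected == 0:
        raise ValueError("expected CpG frequency is zero (no C or no G)")
    return observed_cpg_frequency(seq) / expected


# With all four nucleotides at equal frequency the expectation is 1/16.
print(expected_cpg_frequency("ACGT" * 4))
print(cpg_ratio("ACGT" * 4))
```

Genes with a ratio well below 1 are depleted in CpG, which is the signature expected of historically methylated sequence; the per-gene ratios can then be binned to produce the histogram described above.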

To provide support for bimodality, the kurtosis value (β2) of the distribution was calculated using the formula:

$$\beta_2 = \frac{\sum (X - \mu)^4}{n\,\sigma^4} - 3$$

where X is each observed value and µ and σ are the mean and standard deviation, respectively, of the n observed values (Pearson, 1905). Normal distributions have a kurtosis value of ‘0’, while bimodal distributions will tend to have strongly negative values.
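
The kurtosis statistic can be computed directly from the list of per-gene CpG ratios. The Python sketch below assumes the population form of excess kurtosis (fourth central moment divided by the squared variance, minus 3), which is consistent with a normal distribution scoring 0; whether any small-sample correction was applied in the original analysis is not stated, so this should be read as an approximation. The function name is hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / variance**2 - 3."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mu = sum(values) / n
    variance = sum((x - mu) ** 2 for x in values) / n
    if variance == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    fourth_moment = sum((x - mu) ** 4 for x in values) / n
    return fourth_moment / variance ** 2 - 3.0


# An extreme two-point (fully bimodal) sample gives the minimum possible value.
print(excess_kurtosis([0, 0, 0, 1, 1, 1]))
```

The two-point sample in the example returns -2, the lower bound for excess kurtosis, which illustrates why strongly negative values are taken as support for bimodality.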

The distributions of the two curves (for low CpG and high CpG genes) underlying the bimodal distribution were then predicted based on the assumption that the two underlying distributions are both symmetrical, which is true for the unimodal CpG distributions we have so far observed in insects (International Aphid Genomics Consortium, 2010). Possible distributions for each curve were generated by predicting the internal shape of the curve from the external shape at each data point up to the peak. A population of possible bimodal distributions was then generated by summing all combinations of ‘low CpG’ and ‘high CpG’ distributions. Chi-square analysis was used to determine the combination of underlying distributions that best fit the observed data.
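
One possible reading of this procedure is sketched below in Python. It is an interpretation, not the code used for the analysis: it assumes the input is a list of histogram bin counts, treats each bin in turn as a candidate peak, builds each symmetrical component by mirroring its outer flank about that peak, and scores every low/high combination with a chi-square statistic. All function names are hypothetical.

```python
import math


def mirror_from_left(counts, peak):
    """Symmetrical 'low CpG' curve: left flank up to `peak`, mirrored to the right."""
    curve = [0.0] * len(counts)
    for i in range(peak + 1):
        curve[i] = float(counts[i])
        j = 2 * peak - i  # mirror image of bin i about the peak
        if j < len(counts):
            curve[j] = float(counts[i])
    return curve


def mirror_from_right(counts, peak):
    """Symmetrical 'high CpG' curve: right flank from `peak`, mirrored to the left."""
    curve = [0.0] * len(counts)
    for i in range(peak, len(counts)):
        curve[i] = float(counts[i])
        j = 2 * peak - i
        if j >= 0:
            curve[j] = float(counts[i])
    return curve


def chi_square(observed, predicted):
    """Chi-square statistic; infinite if the model predicts 0 where data exist."""
    total = 0.0
    for o, e in zip(observed, predicted):
        if e > 0:
            total += (o - e) ** 2 / e
        elif o > 0:
            return math.inf
    return total


def best_decomposition(counts):
    """Return (chi_square, low_peak_bin, high_peak_bin) for the best-fitting pair."""
    best = (math.inf, None, None)
    for low_peak in range(len(counts)):
        low = mirror_from_left(counts, low_peak)
        for high_peak in range(low_peak + 1, len(counts)):
            high = mirror_from_right(counts, high_peak)
            predicted = [a + b for a, b in zip(low, high)]
            score = chi_square(counts, predicted)
            if score < best[0]:
                best = (score, low_peak, high_peak)
    return best


# Toy histogram: two well-separated symmetrical components.
print(best_decomposition([2, 6, 2, 0, 3, 9, 3]))
```

With real data the two components overlap, so the outer flank of each curve contains a small contribution from the other; the sketch ignores this and is intended only to make the mirroring and scoring steps concrete.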

RNA extraction and expression analysis

RNA was isolated from each sample using a phenol/chloroform extraction. Each of the six biological replicates for each treatment (crowded and uncrowded) contained 20 heads. We treated each sample with rDNaseI (Ambion, Austin, TX, USA) at 37 °C for 30 min. We synthesized cDNA using the High Capacity cDNA Reverse Transcription kit (Applied Biosystems). We used real-time quantitative PCR (qPCR) to quantify the relative transcript accumulation between crowded and uncrowded samples. Primers to target transcripts were designed with Primer3 (Rozen & Skaletsky, 2000) and are listed in Table 1. Primers were tested initially on genomic DNA. For each reaction, 5 µl of cDNA or water (for negative control reactions) was added to a total PCR reaction volume of 25 µl containing 2× Power SYBR Green Master Mix (Applied Biosystems). Samples were run on a DNA Engine Opticon 2 Continuous Fluorescence Detection System (Bio-Rad) for 10 min at 95 °C and then 40 cycles of 15 s at 95 °C followed by 1 min at 60 °C. Primer specificity was verified using dissociation curve analysis. Two technical replicates per biological sample were run and their resulting Ct values were averaged prior to analyses. We initially considered both Apactin and ApGlyceraldehyde-3-phosphate dehydrogenase (GAPDH) as endogenous controls. We used the program GeNorm (Vandesompele et al., 2002) to assess their stability and thus their utility as endogenous controls. Based on this assessment, we used ApGAPDH as our control. Each gene of interest was therefore quantified in each sample relative to the expression level of ApGAPDH. Crowded vs. uncrowded samples were contrasted using the comparative Ct method.
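
The comparative Ct calculation can be summarised in a few lines of Python. The sketch below is illustrative: it assumes the standard 2^(-ΔΔCt) form of the method with approximately 100% amplification efficiency for both target and reference, and the Ct values in the example are invented. Function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Delta Ct for one biological sample.

    Technical replicates are averaged first, then the reference gene
    (here ApGAPDH) is subtracted from the gene of interest.
    """
    return mean(target_cts) - mean(reference_cts)


def fold_change(treated_delta_cts, control_delta_cts):
    """Relative expression of treated vs. control samples as 2 ** (-ddCt)."""
    dd_ct = mean(treated_delta_cts) - mean(control_delta_cts)
    return 2.0 ** (-dd_ct)


# Invented Ct values: one crowded and one uncrowded sample, two technical replicates each.
crowded = [delta_ct([24.0, 24.0], [20.0, 20.0])]
uncrowded = [delta_ct([25.0, 25.0], [20.0, 20.0])]
print(fold_change(crowded, uncrowded))
```

In the example the target crosses threshold one cycle earlier, relative to the reference, in the crowded sample, which corresponds to a two-fold higher transcript level under the stated efficiency assumption.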


This work has been funded by the CSIRO OCE postdoctoral fellow program, Award Number K99ES017367 from the National Institute of Environmental Health Sciences to JAB and ANR Holocentrism (France). The authors would like to thank the staff at the Human Genome Sequencing Centre at the Baylor College of Medicine for their work in sequencing the pea aphid genome. The authors would also like to thank Dr Don Gilbert for his input into the identification of the methyltransferases, Paul Yeoh for statistical input and Jenny Reidy-Crofts for maintaining the aphids. In addition we would like to thank Dr Robyn Russell, Dr Stephen Cameron and Prof Ross Crozier for their comments on an earlier version of the manuscript.