The histone proteins H2A, H2B, H3, and H4 represent the core particle in the nucleosome. The histones H3 and H4 represent some of the most highly conserved proteins in eukaryotes, but the other histones (H1, H2A and H2B) are often highly divergent between different taxa. The linker histone (H1) is an important component of chromatin that is bound outside of the nucleosome in the internucleosomal space and confers additional stability to the nucleosomes by impeding their mobility (Pennings et al., 1994).
Despite a relatively steady rate of gene duplication in the pea aphid genome, our analysis indicates that the pea aphid has relatively few histone-coding loci. Over half of the histone-coding genes identified occur in uniquely organized clusters, as evidenced by six assembled scaffolds that contain combinations of different histone genes (Supplemental Fig. S2). An additional 13 scaffolds were identified that each encode a single histone gene. The distributed organization of histone-coding genes in the pea aphid contrasts sharply with the large arrays of histone genes seen in the genomes of D. melanogaster and other invertebrates.
Unfortunately, the genomic organization and expression of histone genes has been examined in detail for only a few taxa and there are large gaps in the phylogenetic samplings. Therefore, it remains unclear if there are truly any biological consequences to changes in histone gene copy number variation, and histone gene organization among taxa. Nonetheless, each of the anticipated core histone coding genes was identified in the pea aphid genome, demonstrating conservation in the fundamental structural components of chromatin.
In addition to the above-mentioned proteins, histone variants that have specialized functions have been shown to exist in many species, and may be required e.g., during meiosis, DNA replication, recombination, or DNA damage and repair (Ausio, 2006). In the pea aphid genome, a single copy locus was identified that encodes H2A.Z, an evolutionarily conserved histone variant involved in many processes including chromosome segregation, gene regulation, and nucleosome positioning. Another evolutionarily conserved histone gene was identified, the H3.3 histone, or so-called ‘replacement histone’. Histone H3.3 is associated with actively transcribed genes where it apparently replaces histone H3 during gene transcription.
A clear homologue of the centromere-specific H3-like protein-coding gene (cid/cenp-a) was not found. CENP-A in D. melanogaster is an essential component of the centromere and is required for proper centromere segregation. Similar proteins have been identified in many other species, but it is clear that this protein is fairly rapidly evolving among different taxa. However, one predicted protein (XP_001945390.1) was discovered that has some homology to HCP-3, a Caenorhabditis elegans holocentric chromosome-binding protein that is analogous to centromeric H3. The predicted aphid protein contains a highly conserved domain of unknown function and is also related to a human protein (MAD2L1-binding protein) that interacts with components of the kinetochore, suggesting a link to centromere function.
We were also unable to clearly identify aphid orthologues encoding the protamine proteins (D. melanogaster mst35Ba and mst35Bb or mammalian prm1 and prm2) that are involved in spermatid chromosome condensation (Lewis et al., 2003). Likewise, an orthologue of the D. melanogaster male-specific transcript mst77F, which encodes a sperm-specific linker histone (Jayaramaiah Raja & Renkawitz-Pohl, 2005), could not be identified among the scaffolds or predicted proteins. A homologue of the Drosophila loquacious protamine RNA-binding protein is probably present (ACYPI002395; XP_001942593), but in other organisms, these proteins are not restricted to binding with protamine RNAs. An orthologue of HIRA was also identified (ACYPI005751; XP_001947756). In D. melanogaster HIRA is essential for the re-packaging of the sperm pronucleus following protamine removal and fertilization (Bonnefoy et al., 2007).
The pea aphid appears to be dependent on single loci for the production of highly conserved variant histones involved in several essential processes. However, a number of possible histone-like protein coding loci present in D. melanogaster and humans were not identified in the draft pea aphid genome sequence. Our inability to identify centromeric H3, protamines or the sperm-specific H1-like protein may be due to the poorly conserved nature of these protein orthologues from divergent taxa or gaps in the aphid genome assembly. On the other hand, aphids possess holocentric chromosomes, and some invertebrates package sperm chromatin using histones. Thus it is possible that the pea aphid possesses either highly divergent genes for these histone-like proteins, or utilizes a system different from what might be anticipated.
Histone acetyltransferases (HATs) catalyse the addition of acetyl groups to available lysine groups on core histone components, a process that is thought to create a chromatin architecture that is accessible to the transcriptional machinery. Known HATs belong to one of two superfamilies: MYST-type acetyltransferases and GCN5-related N-acetyltransferases (GNAT) (Marmorstein, 2001; Utley & Cote, 2003). MYST HATs are named for the founding members: MOZ, YBF2/SAS3, SAS2, and TIP60. D. melanogaster possesses six GCN5-related HATs and four MYST-type HATs (Supplemental Table S1). MYST family members from D. melanogaster include Males absent on the first (MOF), Enoki mushroom (ENOK) and Chameau (CHM). MOF is capable of histone H4 lysine 16 (H4K16) acetylation and is partly responsible for the phenomenon of dosage compensation (Hilfiker et al., 1997; Rea et al., 2007). ENOK promotes cell proliferation in the mushroom body of the brain, wing and ovaries (Scott et al., 2001), while CHM activity has been demonstrated to suppress position effect variegation (Grienenberger et al., 2002). Thus, insect MYST family members are involved in gene regulation and development, including aspects associated with aphid polyphenisms (e.g. reproduction and wing development). Interestingly, the pea aphid genome not only contains homologues of the D. melanogaster MYST family members, but also duplications of mof and enok. Such duplications appear to be unique to the pea aphid when compared with other arthropod genomes (see Supplemental Table S2).
GCN5 in D. melanogaster (also known as PCAF) functions as a histone H3 acetyltransferase that regulates oogenesis and morphogenesis (Carre et al., 2005). In mammals, alternative splicing may produce a second protein that lacks the N-terminal PCAF domain, but that retains both the HAT and the bromodomains (Xu et al., 1998). The complement of HATs from the GNAT superfamily and the MYST superfamily are similar between the pea aphid and D. melanogaster. However, while the D. melanogaster genome encodes a single PCAF/GCN5 orthologue, the pea aphid possesses at least two PCAF/GCN5 paralogues that are about 78% identical, making the number of GNAT family members greater in the pea aphid than in D. melanogaster. The number may be even greater if, like in mammals, alternative splicing produces HAT enzymes lacking the PCAF domain. There also exists an unusual PCAF domain-only protein that shares 39–42% identity with the other two pea aphid PCAF domains. Whether this PCAF represents a functional protein remains to be determined. If so, it may represent a member of a protein family that has not been studied previously.
The histone deacetylases remove the post-translational modifications generated by HATs and are represented by three structurally unrelated enzyme superfamilies. HDACs are generally associated with a chromatin architecture that represses gene expression. The HDAC superfamilies present in animals include those related to the yeast reduced potassium dependency 3 (RPD3), and the Silent information regulator 2 (SIR2; the family is also known as sirtuins) (Gray & Ekstrom, 2001). The third family (HD2, originally discovered in Zea mays) appears to be present only in plants and some protozoa (Lusser et al., 1997; Aravind & Koonin, 1998). All three families have been demonstrated to regulate gene expression and developmental identity. The D. melanogaster genome encodes 5 RPD3-type and 5 SIR2-type histone deacetylases (Frye, 2000; Foglietti et al., 2006). Members of the RPD3 superfamily include diverse proteins that can be grouped into related families based on domain architecture and are further lumped into broader categories (designated as classes) based on the sequence motifs present in the HDAC domains (Gregoretti et al., 2004). The human genome includes a few families not present in the D. melanogaster genome but that do appear in some other arthropods. Thus, a comparison of the RPD3 family members in the pea aphid would not be complete without considering the presence of types that may have been lost from the Drosophila lineage. A comparison of the HDAC proteins from humans, D. melanogaster and the pea aphid is presented in Fig. 3.
Figure 3. RPD3-type HDAC proteins. The pea aphid (Ap), Drosophila melanogaster (Dm), and humans (Hs) are represented. The phylogram is a neighbour-joining tree based on ClustalX alignments. Numbers at nodes indicate bootstrap support (per cent). Unique proteins from the pea aphid include 3 or 4 RPD3 paralogues, 2 HDAC8 paralogues, and an HDAC10. Additional information on the aphid proteins can be found in Supplemental Table S1. HDAC classes are highlighted with alternating grey or white boxes.
The pea aphid genome encodes nine RPD3-type HDAC proteins and five sirtuins. Three of these proteins are from families not represented in D. melanogaster, but that are represented in the human genome (HDAC8 and HDAC10). On the other hand, the pea aphid appears to lack a class IV HDAC, while Dipterans and Coleopterans appear to possess this unusual type of HDAC (refer to Supplemental Table S2). Pea aphid HDAC duplications include three paralogues related to D. melanogaster RPD3, and two HDAC8 paralogues. Interestingly, HDAC8 appears to have been lost among a number of arthropod lineages, and currently the only other arthropod with more than one HDAC8 appears to be Aedes aegypti (Supplemental Table S2). The putative aphid HDAC10 protein has greater homology to the aphid HDAC6 (which apparently lacks a zinc finger) than to other HDAC10 proteins, but the domain architecture is much more indicative of an HDAC10 (see Supplemental Fig. S3). These aphid-specific additions to the histone acetylation pathway may have important consequences for the regulation of chromatin structure and gene regulation.
The known consequences of histone methylation may be either repression or activation of gene expression depending on the residue and extent of methylation. We assessed methyltransferase activity in the pea aphid using indirect immunofluorescence of sexual and asexual germaria. Histone H3 methylation of lysine 4, a marker for active transcription, is abundant in somatic follicle cells, trophocyte/nurse cell nuclei of asexual and sexual germaria, and young sexual and asexual oocytes (Figs 4A–F). This signal is greatly reduced in asexual syncytial blastoderm nuclei, suggesting that these cells may be largely transcriptionally silent. However, the H3K4 trimethylation signal reappears later in development (Figs 4M–P). Interestingly, the H3K4 trimethylation signal is absent or reduced in specific chromatin regions (arrows in Figs 4C,F). The significance of this remains unclear. However, trimethylation of lysine 9 of histone H3, a marker for constitutive heterochromatin (transcriptionally silent regions), has been detected at specific locations on several pea aphid chromosomes, including the sex chromosome (Mandrioli & Borsatti, 2007). We were also able to observe H3K9 trimethylation signal in specific chromatin regions that appear more condensed (Figs 4G–L). These trimethylated H3K9 chromatin regions may be mutually exclusive with regions of H3K4 trimethylation, confirming the expected antagonistic functions of these chromatin modifications.
Figure 4. Histone H3K4 and H3K9 methylation in the pea aphid. A-F. A marker for actively transcribed regions, H3K4 trimethylation (arrowheads) is detected by indirect immunofluorescence in trophocyte/nurse cell nuclei of asexual (A-C) and sexual germaria (D-F) but is absent from some chromatin regions (arrows in C, F). Insets in D-F show enlarged region encompassing the sexual germ cell and oocyte nucleus. G-L. A marker for constitutive heterochromatin, H3K9 trimethylation is shown here in an asexual germarium (G-I) and sexual germarium (J-L). Insets in G-I (asexual oocyte nucleus) and J-L (sexual oocyte nucleus) show staining in a different focal plane. H3K9 trimethylation appears localized to only certain chromosomes or chromatin regions (arrows). M-N. H3K4me3 signal is greatly reduced in asexual blastoderm, indicating that these cells may have generally suppressed transcriptional activity. Note that H3K4me3 staining is concentrated in the periphery of trophocyte nuclei. O-P. H3K4 trimethylation is detected in a later stage asexual embryo, suggesting a resumption of transcriptional activity. Scale bars and stains are indicated.
Three broad categories of histone methylating enzymes exist: SET proteins, DOT1-like proteins and protein arginine methyltransferases (PRMT). SET proteins derive their name from a conserved motif observed in the Suppressor of variegation 3-9 [SU(VAR)3-9], Enhancer of zeste [E(Z)] and Trithorax [TRX] genes of D. melanogaster that are important regulators of development and cellular identity (Dillon et al., 2005). SET domain proteins often possess additional chromatin-binding domains such as PHD fingers, chromodomains, the HMG box, etc. (refer to Supplemental Fig. S4). The pea aphid may have as many as 25 SET-domain containing histone methyltransferases, possibly the greatest number among the sequenced arthropod genomes. Many of the aphid SET proteins appear to be related to an orthologue that can be found in D. melanogaster. Phylogenetic analyses of whole protein sequences produced a number of inconsistent results. Therefore, domain architecture provided a more reliable basis on which to assign orthology relationships. Three of the 25 aphid loci encode partial SET domains, possibly indicating the presence of a few pseudogenes. Each of the remaining pea aphid SET domain loci that lack a clear orthologue in D. melanogaster appears to be a duplicated locus that has undergone some divergence or that has lost specific domains related to chromatin function. Among the duplicated loci are 5 loci related to eggless, three of which do not encode an expected methyl DNA binding domain; 4 loci related to Su(var)3-9, including one previously cloned but that is not present in the pea aphid draft genome sequence (one lacks chromodomains); and 4 additional loci that encode proteins whose domain architecture is related to CG4565, a poorly characterized protein comprised of a single SET domain. Duplications in the Eggless orthologues are potentially interesting because loss of the single eggless gene in D. melanogaster causes arrest of oogenesis, implicating a role for this gene in female fertility (Clough et al., 2007). The additional eggless-related loci may also play roles in the alternate reproductive pathways in female aphids (e.g. parthenogenesis and oogenesis). The predicted aphid Set2 protein lacks an expected Sri-2 domain, but one of the two duplicated proteins (with partial SET domains) that are related to Set2 possesses an Sri-2 domain. Thus, the Set2-related ‘duplicated’ proteins may represent portions of a mis-assembled Set2 orthologue. Interestingly, the putative trithorax-related orthologue has an N-terminus that contains five plant homeodomain (PHD) fingers and a high mobility group (HMG) box, making it more similar to vertebrate Mixed lineage leukemia 2 (MLL) orthologues than to the D. melanogaster trithorax-related protein with respect to domain architecture. Despite the presence of as many as nine duplications of the SET-type histone methyltransferases, clear orthologues of PR-Set7 (also known as dSET8) and SU(VAR)4-20 were not found. Both proteins are involved in histone H4K20 methylation, a mark that plays important roles in heterochromatin formation in other species (Ebert et al., 2006). Although ASH1 is capable of H4K20 methylation in D. melanogaster, it remains unclear how methylation of histone H4 at lysine 20 may be carried out in the pea aphid (Beisel et al., 2002). The remaining histone lysine methyltransferases in the pea aphid are related to Disruptor of telomeric silencing-1 (DOT1), an H3K79 methyltransferase from yeast (Feng et al., 2002; van Leeuwen et al., 2002). Two DOT1-related proteins are present in D. melanogaster as well as the pea aphid.
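The domain-architecture comparison described above can be illustrated with a minimal sketch. It assumes each protein is represented as a list of domain names (for example, as reported by a domain-annotation search) and reports which domains of a reference protein are missing from, or additional in, a candidate. The function name and the domain lists in the usage line are hypothetical placeholders, not the annotations or the procedure used in this study.

```python
from collections import Counter


def architecture_difference(reference, candidate):
    """Compare two domain architectures given as lists of domain names.

    Returns the domains present in the reference but absent from the
    candidate ("missing") and those present only in the candidate
    ("extra"). Repeated domains are counted, so a protein with one PHD
    finger differs from one with two. Domain order is ignored, which is
    a simplification.
    """
    ref = Counter(reference)
    cand = Counter(candidate)
    return {
        "missing": sorted((ref - cand).elements()),
        "extra": sorted((cand - ref).elements()),
    }


# Hypothetical example: a duplicated locus that retains the SET domain
# but lacks a methyl-DNA-binding domain found in the reference protein.
diff = architecture_difference(["methyl-DNA-binding", "SET"], ["SET"])
print(diff)
```

A comparison of this kind only flags candidate differences; a domain that appears to be missing may also reflect an incomplete gene model or an assembly gap.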
Histones may also be methylated at arginine residues through protein arginine methyltransferases (PRMT). In D. melanogaster, 9 PRMT homologues have been identified (designated DART1-DART9), including Capsuleen which is also known as DART5 (Boulanger et al., 2004). Capsuleen is the D. melanogaster orthologue of human PRMT5, a histone arginine methyltransferase associated with gene repression. However, DART5 also methylates other proteins, and plays a significant role in germ cell specification (Gonsalvez et al., 2006; Anne et al., 2007). The pea aphid possesses 11 PRMT-related loci, including orthologues of DART1, DART4 and DART7, all three of which are likely to dimethylate arginine residues on histones (see Supplemental Fig. S4). DART1 and DART4 (CARMER) were recently demonstrated to play roles in ecdysone-mediated responses in D. melanogaster (Cakouros et al., 2004; Kimura et al., 2008). The pea aphid genome encodes three proteins related to Capsuleen, and two proteins related to DART8, although one may represent a pseudogene. Two of the three remaining putative PRMTs are related to one another, but none are closely related to any of the DART proteins. These may be novel PRMTs or another type of methyltransferase. The last of the putative PRMTs may be a distant orthologue of DART3. The number of PRMT-like proteins encoded among arthropod genomes is relatively unchanged (see Supplemental Table S2). However, the similarities between PRMTs in D. melanogaster and the pea aphid suggest interesting possibilities for several aphid genes that may be associated with reproductive development, as seen in D. melanogaster.
Histone demethylation, like histone methylation, may be associated with gene activation or repression. Histone arginine demethylation in mammals is known to occur through peptidyl arginine deiminase (PADI) enzymes (Wang et al., 2004), but no clear orthologues for these enzymes appear to be present in D. melanogaster or any other arthropods for which DNA sequence is currently available. Consistent with these observations, no PADI homologues were identified in the draft pea aphid sequence. Two families of histone lysine demethylases are known – those related to flavin-containing monooxygenases and those containing a motif (JmjC) that was first identified in several chromatin-related proteins, including the mouse Jumonji protein (Takeuchi et al., 2006; Forneris et al., 2008). JmjC-domain containing proteins are further categorized by the presence of other domains, and those possessing an AT-rich DNA interacting domain (ARID) have sometimes been designated as JARID family proteins. D. melanogaster possesses two flavin-type histone demethylases, which have orthologues present in the pea aphid, and two JARID proteins: little imaginal discs (LID) and the D. melanogaster orthologue of Jumonji (Sasai et al., 2007; Lee et al., 2009). Two additional JmjC proteins are encoded by the D. melanogaster genome that also carry a JmjN domain, which has been implicated as playing an essential role in the demethylation reaction for those enzymes (Lloret-Llinares et al., 2008). The D. melanogaster genome also encodes roughly nine other JmjC domain proteins that lack JmjN domains, but these proteins have not been well characterized. Several of these proteins have orthologues in the pea aphid (Fig. 5). Two contain additional functional domains – either tetratricopeptide repeats (CG5640) or a zinc finger (CG11033). The pea aphid may have as many as 19 JmjC domain proteins, including homologues related to each of the six D. melanogaster JmjN domain proteins (Fig. 5).
While the draft pea aphid genome sequence is predicted to encode at least three proteins that only possess the JmjC domain, all of the six remaining putatively complete JmjC domain-containing proteins possess a JmjN domain. The presence of the JmjN domain implies that each of these represents a bona-fide histone demethylase. Four of the JmjN+JmjC loci in the pea aphid genome also encode a conserved region related to PHD zinc fingers that bind specifically to methylated histone H3. Interestingly, these are represented by a clade that is unrelated to any of the proteins present in the D. melanogaster genome. None of the aphid JmjC proteins appears to be the result of very recent duplications as they share between 19% and 84% identity in the regions that can be aligned (Supplemental Table S3). With a complete, yet expanded set of enzymes modulating histone methylation, the pea aphid may have a complex means of controlling chromatin architecture and gene expression.
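Percent identity "in the regions that can be aligned" can be computed in several ways. The sketch below shows one common convention, assuming two sequences that have already been aligned (gaps written as "-") and counting only columns in which both sequences have a residue. The text does not state which denominator was used for Supplemental Table S3, so this convention is an assumption, and the sequences in the usage line are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignable columns of a pairwise alignment.

    Columns where either sequence has a gap are skipped, so the
    denominator is the number of residue-to-residue columns only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # not an alignable column
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no alignable columns")
    return 100.0 * matches / compared


# Made-up example: five alignable columns, four identical.
print(percent_identity("MKT-LV", "MRTALV"))  # 80.0
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which generally gives lower values for gapped alignments.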
Figure 5. Jumonji domain containing proteins. The phylogram is a neighbour-joining tree based on ClustalX alignments of Drosophila melanogaster and pea aphid sequences. Node labels indicate bootstrap support (per cent) for nodes with support greater than 50 per cent. All aphid sequences begin with ‘ACYPI’, while the other sequences are from D. melanogaster. Known specificities of each of the four D. melanogaster proteins that have been examined for histone demethylase activity are indicated. The four aphid JmjC+JmjN domain proteins that possess a region related to the plant homeodomain (PHD) are in bold. Additional information on the aphid proteins can be found in supplemental Table S1. Alternating grey and white boxes indicate putatively orthologous groups.
Ubiquitin (UB) and the Small UB-like modifier (SUMO) represent post-translational modifications that occur on a number of proteins including transcription factors, chromatin remodelling proteins and histones (Gill, 2004; Hilgarth et al., 2004). Notably, UB and SUMO have been demonstrated to function as transcriptional repressors when attached to histone proteins, possibly through prevention of acetylation of lysine residues that function in transcriptional activation (Sun & Allis, 2002; Gill, 2004; Weake & Workman, 2008). The pea aphid possesses several UB-like proteins (Fig. 6), as well as proteins that contain UB-like domains. Although D. melanogaster possesses a single SUMO protein (SMT3), it also possesses UB and several other UB-like protein-coding genes. The UB-like proteins, however, are better characterized from humans and, like humans, the pea aphid possesses several SUMO-related protein-coding genes (an SMT3 orthologue and two additional proteins related to human SUMO-1). The other UB-like proteins identified in the pea aphid include two polyubiquitins of different lengths, a bi-ubiquitin, and orthologues of at least 5 other UB-like proteins that are present in humans and D. melanogaster. Although the human FAT10 protein possesses two UB-like domains, it is phylogenetically distant from the structurally similar bi-ubiquitin in the pea aphid. While the human HDAC6 protein possesses a zinc finger on the C-terminus that interacts with FAT10 (Boyault et al., 2007; Kalveram et al., 2008), the pea aphid HDAC6 may lack a zinc finger. It is possible that if the aphid HDAC6 truly lacks the zinc finger domain, then this would be consistent with the apparent lack of a clear FAT10 orthologue in the aphid and may suggest some novel features in the regulation of HDAC6 function in the pea aphid.
Figure 6. Relationships among selected Ubiquitin-like proteins. Pea aphid (Ap), human (Hs) and Drosophila melanogaster (Dm) sequences are included. The phylogram is a neighbour-joining tree based on ClustalX alignments. Node labels indicate bootstrap support (per cent) for nodes with support greater than 50 per cent. There are many 1:1:1 orthologies with aphid, D. melanogaster and human ubiquitin-like proteins. There is a single SMT3 orthologue and two additional SUMO-like proteins in the pea aphid. If the aphid Bi-ubiquitin protein is functional, it does not appear to be orthologous to the human FAT10 protein that is structurally similar (each may have two ubiquitin-like domains). Additional information on the aphid proteins can be found in supplemental Table S1. Alternating grey and white boxes indicate putatively orthologous groups.
The attachment of UB-like proteins to their targets is a multi-step process mediated after protein maturation by several enzymes (E1, E2 and E3 proteins; Fig. 2) (Hilgarth et al., 2004). These enzymes appear to be present in the pea aphid, based upon high scoring BLAST results for known E1, E2 and E3 proteins (Supplemental Table S1). We found that the pea aphid genome possesses genes related to the maturation protease SENP1. We identified individual loci that encode the AOS1 and UBA2 proteins that form the heterodimeric E1 protein, an orthologue of UBC9 (the E2 protein known as Lesswright in D. melanogaster), as well as a second protein related to Lesswright. E3 proteins confer substrate specificity and enhance the SUMO ligation process. We found seven loci encoding proteins that may have E3 ligase activity and that are related to protein inhibitor of activated STAT (PIAS), which, in humans, helps target SUMO onto the methyl-CpG-binding protein (Lyst et al., 2006). Another E3 protein was also found that is an orthologue of RAN binding protein 2 (RANBP2), which targets SUMO onto HDAC4 (Kirsh et al., 2002). Several proteins were also identified that may be involved in the proteolytic removal of SUMO from modified substrates (see Supplemental Table S1). Although the E1, E2 and E3 proteins that are associated with other UB-like modifications were not examined in greater detail, complete SUMO modification pathways appear to be present in the pea aphid.
The smt3 gene from D. melanogaster has been demonstrated to function in the larval-pupal transition and has been implicated in the packaging of sperm chromatin while the SUMO proteins from humans appear to have partially overlapping, but distinct, functions based upon identified protein targets. The presence of multiple SUMO-like proteins in the pea aphid suggests the possibility of a similar diversification of SUMO protein functions within the pea aphid.
Phosphorylation and ribosylation
Additional histone modifications of potential interest are phosphorylation and ribosylation. Although each of these modifications can be produced on histones by multiple different enzymes, we focused on identifying genes encoding specific enzymes of interest. One gene was identified in the A. pisum genome encoding the nuclear histone kinase (NHK-1). This is of particular interest because phosphorylation of H2A by NHK-1 occurs on nucleosomal histones and is required for the deposition of specific downstream marks of acetylation onto histone H3 and H4 during meiosis (Ito, 2007). Thus, NHK-1 provides a connection among histone phosphorylation, chromatin modifications associated with gene expression and reproductive physiology. Evidence is also emerging that mono- and poly-ADP ribosylation of histones are important aspects of chromatin remodelling (Tulin et al., 2002; Tulin & Spradling, 2003). A family of proteins related to poly-ADP ribose polymerase (PARP) was identified in the pea aphid (see Supplemental Table S1).
The bulk of our efforts were focused on the post-translational modifications of histones. However, chromatin conformations rely heavily on the relative positions of nucleosomes within the DNA. Nucleosome positioning can be modified by a number of distantly related proteins that possess an ATPase domain related to that from the yeast Mating type switching/Sucrose non-fermenting (SWI/SNF) chromatin remodelling protein (Becker & Horz, 2002; Mohrmann & Verrijzer, 2005; Bouazoune & Brehm, 2006). These proteins may function as motor proteins for the mobilization of nucleosomes along the DNA strand. The pea aphid genome possesses many of the same chromatin-remodelling ATPases as D. melanogaster (Supplemental Table S1). However, the pea aphid sequence lacks a clear orthologue of CHD3 but may possess an additional Imitation switch (ISWI) locus. One ISWI encompasses nearly half of the scaffold to which it belongs, while the other is well within a scaffold of about 89 kb. Beyond the ISWI region, neither scaffold can be readily aligned. The large size of the ISWI locus (∼11 kb) and the reasonably high conservation suggest that the two loci could be allelic, but the lack of homology in the surrounding regions and the relatively high number of sequences present in the trace archive are compelling pieces of information that indicate that these loci are paralogues. In addition, it is possible that the aphid orthologue of Mi-2 may be non-functional because the predicted mRNA appears to contain a frame-shift that results in the loss of a conserved C-terminal domain. It should be noted that this shift occurs within a region that is not well conserved or represented by expressed sequence tag coverage, and it is possible that the prediction represents a mis-annotation. The possible loss of Mi-2 function is intriguing because loss of function mutants in D. melanogaster are usually embryonic lethal, perhaps due to inappropriate expression of hox genes during embryogenesis (Kehle et al., 1998; Khattak et al., 2002). In other species such as Arabidopsis thaliana (a plant), however, null mutants of the Mi-2 orthologue (Pickle; PKL) are viable and fail to suppress pathways that lead to totipotency (Ogas et al., 1999). Sexual reproduction in pkl mutants is nevertheless relatively normal. Thus, in A. thaliana, PKL functions to help suppress certain aspects of embryo development that may lead to asexual reproduction. If Mi-2 in aphids is truly non-functional, it may provide a partial explanation for their increased capacity for asexual reproduction.
Relative to other arthropods, the pea aphid possesses a more diverse repertoire of histone modification machinery that remains to be examined experimentally. Numerous examples from other species indicate that the chromatin enzymes encoded by the pea aphid should be involved in developmental fate, perhaps serving vital functions in the differentiation of aphid polyphenisms, particularly reproduction. Several instances of gene duplication among the chromatin modifying enzymes were discovered, most notably among the enzymes involved in histone acetylation and methylation. It is interesting that there appears to be a rough balance in pea aphids among the histone-modifying enzymes that perform antagonistic biochemical reactions. For example, increases in the number of HAT loci correspond roughly with increases in the number of RPD3-type HDAC loci and increases in the number of SET domain methyltransferases coincide with increased numbers of JmjC domain demethylase genes.
Characterization of the expression profiles of these expanded chromatin-modifying genes may yield important clues to their roles in different aphid morphs. In other systems, the ectopic expression of histone-modifying enzymes can have dramatic effects on cell differentiation and development, and certain chromatin modifying proteins are required to be in a relative balance for normal cellular function. In yeast, loss of the HAT GCN5 causes a transcription defect that can be ameliorated by elimination of the HDAC RPD3, indicating a general need for balancing the activities of the two enzymes (Perez-Martin & Johnson, 1998). A similar phenomenon is observed in D. melanogaster with the HAT Chameau and the HDAC dRPD3 (Miotto et al., 2006). In D. melanogaster, disruption of normal levels of HAT activity in non-lethal eye models results in cell death, indicating that major changes in relative HAT activity are deleterious (Taylor et al., 2003; Lee et al., 2008). In addition, the Male specific lethal complex, which includes the histone acetyltransferase MOF, requires tight quantitative regulation for normal targeting to the appropriate regions in D. melanogaster chromosomes (Gu et al., 1998). Thus, it is becoming clear that balance is required in the levels or activities of histone-modifying enzymes. Could an evolutionary scenario analogous to the gene-for-gene interactions that exist, e.g., between plants and their pathogens, be shaping pea aphid chromatin gene duplications? If additional copies of individual chromatin loci cause deleterious effects that can be squelched by additional copies of their antagonists, this could represent a previously undescribed phenomenon that has been shaping the evolution of the pea aphid genome.
However, given the intricacies of chromatin remodelling, the idea that extra copies of a specific chromatin locus could simply be offset by extra copies of its antagonist is overly simplistic. For example, in D. melanogaster, the histone demethylase LID functions in a complex that can inhibit the activity of the dRPD3 HDAC (Lee et al., 2009), and genetic interactions between dSETDB1 and SU(VAR)3-9 indicate that these genes must be present in a balanced state for normal genome function (Brower-Toland et al., 2009). Thus, there are a number of different control mechanisms in place to regulate specific chromatin modification activities. It is possible, then, that the extra copies of various chromatin loci could have no net negative effect and that the presence of increased copies of these genes, along with their antagonists, is merely a coincidence. In fact, the pea aphid genome appears to have a relatively steady rate of gene duplication across many different types of loci, including the chromatin-modifying enzymes. In the absence of any deleterious effects due to gene duplication, paralogous chromatin loci in the pea aphid may still be undergoing divergence in function and may have adopted unique and essential roles. This is an exciting area that merits further investigation and could provide insight into the roles of chromatin modifications in pea aphid development and the regulatory systems involved in aphid polyphenisms.