The pea aphid genome


H. Charles J. Godfray, Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3PS, UK. Fax: +44 1865 310447; e-mail:

It is an exciting time for insect biology. The genomic juggernaut rolls on and for the first time has sequenced the genome of an insect in the order Hemiptera: the pea aphid, Acyrthosiphon pisum. The papers collected in this volume provide the fascinating details behind the more telegraphic highlights in the article announcing the sequence (International Aphid Genomics Consortium, 2010). In this brief introduction I provide a little background to the pea aphid and its natural history and biology, try to explain its significance and hence why it was chosen for sequencing, and explore what the genome sequence may mean for further research on the pea aphid.

First, a little natural history. The most basal insects are the Apterygota, which never had wings and include the springtails, silverfish, etc. From these sprang the Pterygota, all other insects. The least derived pterygotes are termed hemimetabolous because they have incomplete metamorphosis: they grow up through a series of nymphal stages and become adults without the need for a pupal stage. This contrasts with the holometabolous insects, which include the true flies, butterflies and moths, ants, bees and wasps, and beetles, where the juvenile stage is radically different from the adult and a pupal stage occurs between larva and adult. Until now, all insect genomes that have been sequenced have been from holometabolous species, and hence the pea aphid's is the most basal insect genome we currently possess.

The largest of the orders of hemimetabolous insects is the Hemiptera. This is an economically very important order, containing a host of agricultural and forestry pests as well as some medically important species, such as the bedbug and the vector of Chagas disease. Perhaps the most noxious pests occur in a group called the Sternorrhyncha, which contains scale insects, whiteflies and aphids. The last is a largely, though not exclusively, temperate family of insects containing pests of global importance such as the peach-potato aphid, Myzus persicae, the Russian wheat aphid, Diuraphis noxia, and many more. The pea aphid is a member of the subfamily Macrosiphoninae, one of the more derived groups. It is a pest of peas and other legumes, though it does not cause the very major economic damage of related species. The insight given by the pea aphid genome into the structure of aphid ion channels (Dale et al., 2010) and other insecticide targets is very likely to repay the costs of sequencing in the not too distant future.

Aphids feed on juices that they suck from plant phloem using long piercing stylets. The majority of aphids are moderately to strongly host specific, and the need to locate particular host plants may account for the abundance of odorant-binding proteins, olfactory receptors and gustatory receptors in the genome, and for the evidence of recent rapid evolution in some of these genes (Smadja et al., 2009; Zhou et al., 2010). The pea aphid is interesting in that populations have become specialized on different food plants (Peccoud et al., 2009). It seems that selection associated with host plant use is balanced by gene flow, which prevents further divergence and speciation from occurring (Hawthorne & Via, 2001). The pea aphid has become a model organism for evolutionists studying specialization and ecological speciation. The possibility of using high-throughput sequencing technology to obtain genome sequences rapidly and inexpensively from a variety of host-adapted clones will allow novel tests of speciation theory. Though different pea aphid clones can feed on different host plants, this is not polyphagy in the sense found in M. persicae, which is recorded from hundreds of host plants. A comparative analysis of detoxification enzymes in A. pisum and M. persicae showed that the latter has relatively more detoxifying cytochrome P450s, and it will be interesting to see if the correlation holds up as more aphids are examined (Ramsey et al., 2010b).

Phloem sap is rich in carbohydrates but low in nitrogenous compounds and is not something on which a eukaryote can survive unaided (Douglas, 1998). Aphids have highly modified gut morphology and the genome, perhaps not surprisingly, reveals many sugar transporter proteins (Price, 2009). They also possess primary symbiotic bacteria called Buchnera, which provide substances missing in the diet. The Buchnera genome had been sequenced previously (Shigenobu et al., 2000), and comparison of host and symbiont genomes reveals the exquisite ways in which both partners cooperate in different stages of processes such as purine and amino acid synthesis and metabolism (Ramsey et al., 2010a; Wilson et al., 2010). Buchnera is transmitted purely vertically and its phylogenetic tree is the same as that of its host – essentially it is now an organelle. But, unlike the case for mitochondria, very few genes have been transferred to the host genome (in fact only a single pseudogene), though bioinformatic studies do reveal a few aphid genes of bacterial origin (Nikoh & Nakabachi, 2009).

Aphids have unusual life cycles in that nearly all species are cyclical parthenogens, reproducing asexually during the summer with a single episode of sexual reproduction in the autumn, which produces overwintering eggs. Cyclical parthenogenesis is rare in nature, but the genome of one other cyclical parthenogen, the water flea Daphnia (a crustacean), has also been sequenced. Curiously, both show highly elevated levels of gene duplication compared with other sequenced arthropods (Gilbert, 2009), and it will be fascinating to see whether this occurs in other species with the same biology. The switch between asexual and sexual reproduction occurs in the autumn, triggered by day-length changes, and the study of homologues to Drosophila circadian rhythm genes that have been identified in the pea aphid therefore offers the prospect of a better understanding of this process (Cortés et al., 2010). A consequence of this life history is that individuals of the same genotype may be asexual females, or sexual males or females. In addition there is a wing polymorphism – some individuals are wingless and others winged. All this diversity must be generated phenotypically, and a number of studies are beginning to explore differential expression or miRNA activity in different morphs, often using our understanding of homologous genes in Drosophila to suggest candidates for investigation (Brisson et al., 2010). Perhaps relevant to the need for flexible developmental control, the proteins involved in miRNA processing in aphids have, over evolutionary time, been duplicated more than in any other species whose genome has so far been sequenced (Jaubert-Possamai et al., 2009). The pea aphid is already used as a model species for studies of gene regulation and development, and this work will be much helped by the availability of the sequence (Huang et al., 2010; Rider et al., 2010; Shigenobu et al., 2010a; Walsh et al., 2010).

In the field, pea aphids are attacked by a range of natural enemies, from viruses and fungal diseases, through parasitoid wasps whose larvae develop in the bodies of aphids, to ladybirds, hoverflies and many other predators (Van Veen et al., 2008). It is thus perhaps surprising that many of the immune systems identified in Drosophila and other insects seem to be absent in the pea aphid (Shigenobu et al., 2010a). A possible explanation is that they are missing because they would interfere with aphid symbionts, though I think this unlikely. In my opinion a more likely explanation is that aphids are selected for extremely high rates of reproduction – they have to colonize a plant, form a colony and produce winged individuals before their natural enemies find and exterminate them. We know there are trade-offs between insect defences against natural enemies and other fitness components, and in aphids natural selection may have favoured reproduction at the expense of defence. In addition to being attacked by pathogens, aphids are also major vectors of viral plant diseases, which is one of the ways they cause major economic damage. Some viruses have complex adaptations to facilitate their transmission by aphids, and the use of the genome is already helping us to understand how the virus colonizes its vector (Tamborindeguy et al., 2010).

In addition to its obligate, primary symbiont, the pea aphid forms symbiotic associations with a range of other micro-organisms. The three most important are bacteria: Hamiltonella defensa, Regiella insecticola and Serratia symbiotica. These symbionts affect aphid biology in a variety of fascinating ways, influencing responses to biotic challenges such as parasitoids and fungal pathogens, and to abiotic challenges such as heat shock (Chen et al., 2000; Oliver et al., 2003; Scarborough et al., 2005). They also have some effect on host-plant use, though the full picture here is not yet clear. The genome sequence of Hamiltonella was known previously (Degnan et al., 2009b) and now we also have that of Regiella (Degnan et al., 2009a). Though Hamiltonella and Regiella are sister species amongst known bacteria, they are remarkably unalike, with major differences in the structure of the genome and only 55% of genes in common. Evidence from the presence of a common type III secretion system suggests that the common ancestor was also a symbiont and that it existed many millions of years ago, though it is hard to be more precise. The bacterial genomes have many transposable elements, though curiously, despite the fact that Hamiltonella and Regiella can occur in the same host, they have few in common (Degnan et al., 2009a).

Biologists working on the pea aphid now have a valuable new set of tools to attack novel questions. In addition to the raw genome sequence we have the functional annotation (Huybrechts et al., 2010; Shigenobu et al., 2009a) and information about gene family evolution (Huerta-Cepas et al., 2010), collections of cDNAs, ESTs and miRNA resources (Legeai et al., 2009a; Shigenobu et al., 2010b), and novel or expanded gene families to investigate (Nakabachi & Miyagishima, 2010). AphidBase, the centralized bioinformatic platform for annotation of the pea aphid genome (Legeai et al., 2010b), will become increasingly valuable. We can combine genomic sequence-based techniques with proteomics, for example to identify the proteins injected into the plant with aphid salivary fluid (Carolan et al., 2009). Studies on the pea aphid will inform our understanding of aphid biology and of insects more generally, with clear economic benefits at a time of increasing concern about food security. We also have an even better model organism, possessing a rich and fascinating natural history, with which to explore challenging questions in ecology and evolution. To mangle Wordsworth: Bliss was it in that dawn to be alive, but to be a pea aphid biologist was very heaven!