Neuropeptide and neurohormone precursors in the pea aphid, Acyrthosiphon pisum


Denis Tagu, UMR 1099 INRA/Agrocampus Ouest/Université Rennes 1, BiO3P, Domaine de la Motte, F-35653 Le Rheu, France. Tel.: +33 223 48 51 65; fax: +33 223 48 51 50; e-mail:
Jan Veenstra, Université de Bordeaux, CNRS, UMR 5228, CNIC, Talence, France. Tel: +33 540 008 751; Fax: +33 540 008 743; e-mail:


Aphids respond to environmental changes by developing alternative phenotypes with differing reproductive modes. Parthenogenetic reproduction occurs in spring and summer, whereas decreasing day lengths in autumn provoke the production of sexual forms. Changing environmental signals are relayed by brain neuroendocrine signals to the ovarioles. We combined bioinformatic analyses with brain peptidomics and cDNA analyses to establish a catalogue of pea aphid neuropeptides and neurohormones. Forty-two genes encoding neuropeptides and neurohormones were identified, of which several were supported by expressed sequence tags and/or peptide mass analyses. Interesting features of the pea aphid peptidome are the absence of genes coding for corazonin, vasopressin and sulfakinin and the presence of 10 different genes coding for insulin-related peptides, one of which appears to be very abundantly expressed.


Aphids are insects that are in the centre of a network of biotic and abiotic interactions (Tagu et al., 2008). As plant pests, they live in interaction with host plants by sucking phloem sap. They transmit phytoviruses, have symbiotic bacteria, cope with pathogens, and sense the presence of congenerics, enemies and parasites. Moreover, aphids respond quickly to changing environmental cues by developing alternative phenotypes. This phenotypic plasticity is governed by seasons, over-crowding and/or alarm pheromone sensing. For instance, shortening of photoperiod triggers a switch of reproductive mode from viviparous parthenogenesis to sexual reproduction (Le Trionnaire et al., 2008). Day length as well as circadian rhythm sensing is critical for this switch. The early transduction of the input signal required for phenotypic plasticity takes place in the nervous tissues and the transmission of the photoperiod signal involves neuro-endocrine regulation (review by Hardie & Lees, 1985). Whereas, normally, short day conditions are needed to induce the production of sexual morphs, the ablation of two groups of five neurosecretory cells (group I) located in the anterio-dorsal part of the protocerebrum leads to their production even under long day conditions (Steel & Lees, 1977; Hardie, 1987). Whereas in all other insect species the axons of such neuroendocrine cells in the pars intercerebralis project to the corpora cardiaca, the axonal projections of the aphid group I cells pass ventrally, follow the dorsal neuropile tracts to the thoracic ganglion mass and leave through the median abdominal nerve (Steel, 1977). These axonal projections may terminate in the vicinity of the ovarioles, but it is not clear whether they directly connect to the ovaries (as discussed by Steel, 1976) to release their putative neurohormone onto the target cells in the ovarioles. 
Apart from a single paper describing allatostatin A and adipokinetic hormone immunoreactivity (Tilley et al., 2000) very little is known with regard to the identity of aphid neuropeptides. Although a summary of publicly available aphid neuropeptide expressed sequence tags (ESTs) was recently published (Christie, 2008), the availability of the genome of the pea aphid is an opportunity to establish a more complete repertoire of these molecules (International Aphid Genomics Consortium, 2010).

In this paper, we describe the neurohormone and neuropeptide precursor genes for the pea aphid, Acyrthosiphon pisum. We identified in silico 42 such genes that can generate potentially more than 70 biologically active neuropeptides and neurohormones. In parallel, we extracted peptides from dissected central nervous systems, which were analysed by MALDI-TOF mass spectrometry. Thirty ion peaks could be attributed to the theoretical molecular mass of predicted peptides. We also prepared cDNA libraries and performed EST sequencing from pea aphid brain. ESTs confirmed the expression of 15 of the predicted neuropeptide precursor genes, while the results of the MALDI-TOF analysis suggest the expression of another seven genes. The results represent a useful catalogue for identification of the role of neuropeptides in phenotypic plasticity of the pea aphid.

Results and discussion

Neuropeptide and neurohormone catalogue

Using the BLAST program on the pea aphid genome, 42 genes encoding neuropeptides and neurohormones were identified (Table 1 and Table S1). It is not always easy to predict all exons of a gene correctly. On several occasions, initial predictions based exclusively on genomic sequences had to be revised, sometimes substantially, once EST sequences were identified from AphidBase and GenBank (Table S2), implying that genes for which no EST sequences are available may contain similar errors. To obtain more neuropeptide-specific EST sequences, two brain cDNA libraries were constructed. After cleaning of the raw data, 1413 usable sequences were obtained from these libraries. Nearly 39% of these correspond to new ESTs not found in the initial 167 000 ESTs available for the pea aphid in GenBank and AphidBase, which shows the value of focusing on specific tissues for the discovery of new transcripts. Nearly 43% of these brain ESTs had no hit or corresponded to hypothetical proteins (Fig. 1). A large number of the other ESTs code for proteins used in development, signal transduction, ion channels or receptors that are known to be involved in brain physiology, with more than 10% of the identified ESTs corresponding to proteins known to be involved in neurobiological processes in Drosophila melanogaster. Thus, the brain EST catalogue is a rich resource for further studies on the nervous system in the pea aphid. However, only five ESTs coding for neuropeptides were found (Table S2). They represent CCAP, EH, myosuppressin, NPLP1 and orcokinin. Of the 42 neuropeptide/neurohormone genes discovered, ESTs were identified for 21, while for six additional genes ESTs from other aphid species were found in the databases. Several of these ESTs have previously been reported (Christie, 2008).
It is noteworthy that the predicted neurohormone and neuropeptide precursors are very similar between the different aphid species, which was useful in some cases for the determination of exon-intron boundaries in A. pisum. Masses corresponding to the predicted neuropeptides were observed for a total of 15 neuropeptide genes, including six for which no A. pisum ESTs were found (Fig. 2 and supporting information Table S1). This is a fairly large number considering that the size of the larger neurohormones like the insulins, bursicon, eclosion hormone, and the glycoprotein hormone GPA2/GPB5 makes the chances of successfully observing their masses rather small.

Table 1. List and sequence of pea aphid neuropeptides and neurohormones

Name              Predicted biologically active peptides
Allatostatin C    SYWKQCAFNAVSCFamide
Leucokinin        QKTVFSSWGamide, QSTYPYGamide, PAFSSWGamide, ASDKHamide, PKQTFSSWGamide, SSDFFPWGamide

A complete description is provided in Table S2.

Figure 1.

Distribution by functional categories of non-redundant expressed sequence tags from Acyrthosiphon pisum brain. Categories were assigned after a sequence similarity analysis using BlastX against proteins with known function in public databases (UniProt).

Figure 2.

MALDI-TOF MS spectrum generated from a nervous tissue peptide extract from Acyrthosiphon pisum. The upper panel gives an overview of the measured mass range (500–5000 Th). The lower panels (A–C) are details of the following mass ranges, 980–1300 Th, 1300–1450 Th and 1470–1900 Th, respectively. If the theoretical mass of a predicted peptide corresponded to an ion signal in the spectrum (mass + 1H+) its amino acid sequence was added.

Several insect neuropeptide precursors yield a number of structurally similar peptides, such as the FMRFamides, allatostatins A and B, the leucokinins, tachykinins and pyrokinins (Table 1). The structures of the predicted peptides are generally similar to those known from other insect species, with the exception of allatostatin B. Allatostatins B normally contain the sequence W(X)6Wamide, where W is a tryptophan residue and X can be any amino acid residue. The A. pisum allatostatin B precursor is unusual in that it codes for two copies that have seven instead of six amino acid residues between the two tryptophans. As in the honeybee (Hummon et al., 2006), the ETH precursor of A. pisum encodes only ETH (Table 1), while in all other arthropods this precursor produces both ETH and pETH, structurally similar neuropeptides with slightly different biological activities (Zitnan et al., 1996; Park et al., 1999; Riehle et al., 2002; Li et al., 2008; Gard et al., 2009). Other noteworthy observations include the presence of three different genes for eclosion hormone and the observation of a mass that matches the predicted mass for allatostatin CC. This recently identified peptide has been predicted on the basis of genomic and cDNA sequences in various insect species (Veenstra, 2009) and, although a matching mass is not definitive proof, it is the best physical evidence so far for the existence of this peptide.
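The W(X)6Wamide spacing rule described above can be checked with a short regular expression. The following is only a minimal sketch: the function name and the two example peptides are made up for illustration and are not sequences from Table 1, and the C-terminal amide is assumed to have been stripped from the string beforehand.

```python
import re

def has_ast_b_motif(peptide: str, gap: int = 6) -> bool:
    """Return True if the peptide ends in W(X)gapW.

    `peptide` is a one-letter amino acid string without the 'amide' suffix.
    `gap` is the number of residues between the two tryptophans
    (6 for the usual allatostatin B pattern, 7 for the unusual spacing).
    """
    pattern = rf"W.{{{gap}}}W$"  # e.g. 'W.{6}W$'
    return re.search(pattern, peptide) is not None

# Hypothetical peptides, used only to exercise the function.
usual = "AWQDLNSAW"      # six residues between the tryptophans
unusual = "GWNKLQGSAW"   # seven residues between the tryptophans

print(has_ast_b_motif(usual, gap=6), has_ast_b_motif(unusual, gap=7))
```

Running the same check with gap=6 and gap=7 over every peptide predicted from a precursor would flag copies with the longer spacing.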

Failure in detecting some neuropeptides

Although most of the known insect neuropeptide genes are present in A. pisum, some genes were not found. Not finding a particular gene may be attributable to different reasons. First, due to limited structural homology with the known peptides it may not be possible to identify the pea aphid homologue unequivocally. Such is likely to be the case for prothoracicotropic hormone, and possibly also for neuroparsin. Second, the particular neuropeptide gene may be encoded by part of the genome which has not been sequenced as an estimated 15% remains uncovered. No sequences with high similarity to the ∼170 000 available ESTs were found in the unassembled reads suggesting a low number of protein coding genes in the unassembled fraction of the dataset (International Aphid Genomics Consortium, 2010). However, this may be the case for PDF, an arthropod neuropeptide essential for maintaining the circadian rhythm in Drosophila (Renn et al., 1999) and likely also in other insect species. We were unable to identify the PDF gene, but its receptor is encoded by the genome (data not shown). Third, it may be a result of the genuine absence of the neuropeptides in question. The latter appears to be the case for corazonin, sulfakinin and vasopressin, where neither the genes coding the neuropeptides nor those coding their receptors were found.

The most fascinating aspect of aphid physiology is no doubt their capacity to adopt different phenotypes. Migratory locusts can also change their phenotype depending on the environmental conditions, and it is known that corazonin, a neuropeptide initially discovered as a cockroach cardioacceleratory peptide (Veenstra, 1989), is responsible for some of these phenotypic changes (Tawfik et al., 1999). Nevertheless, the absence of both corazonin and its receptor from the aphid genome demonstrates that it cannot play a similar role in aphids. Corazonin is also lacking from the Tribolium genome, and may well be absent from all Coleoptera (Tanaka, 2006).

Insulin-related peptides

Together with the absence of the three neuropeptide genes mentioned above, the presence of 10 different insulin genes is probably the most salient neurohormone feature of the pea aphid genome. Insulins typically consist of two subunits, the A and B chains, which in the mature peptide are connected by two disulphide bridges, while the A chain has a third, intrachain disulphide bridge. The two cysteine residues of the B chain are separated by 11 amino acid residues, while the four cysteine residues of the A chain are classically ordered as CC(X)3C(X)8C, in which C is a cysteine residue and X any other amino acid residue. In the prohormone the A and B chains are linked by the C (connecting) peptide, which is cut from the precursor by a prohormone convertase present in classical endocrine cells and peptidergic neurons. The pea aphid genome has four such classical insulin genes, which we have called IRP-1 through IRP-4 (Fig. 3, Table S1). For one of these genes (IRP-4) an EST was identified from a head library (Sabater-Muñoz et al., 2006). From Myzus persicae two ESTs from head and whole body, corresponding to the homologues of IRP-1 and either the IRP-2 or the IRP-3 gene of the pea aphid, are available in the databases (Fig. 3).

Figure 3.

Sequence alignment of insulin-related peptides (IRPs) from aphids as predicted from genomic and expressed sequence tag (EST) sequences. Potential translated sequences from three pseudogenes are also indicated. Conserved cysteines are marked by an asterisk on top, while dibasic amino acid residues from canonical prohormone convertase cleavage sites (group 1 and group 3) or from furin cleavage sites (group 2) are indicated by vertical bars. For comparison, IRPs predicted from ESTs obtained from aphid species other than Acyrthosiphon pisum are also shown. Myzus: Myzus persicae, Toxoptera: Toxoptera citricida, X: stop codon. For GenBank ID numbers see Table 1.

Not all insulin-related hormones are produced by specialized endocrine cells. Insects synthesize some insulin-like hormones in non-endocrine cells, notably in midgut muscle cells (Ikeya et al., 2002; Veenstra et al., 2008), while others are secreted by the fat body (Okamoto et al., 2009). In such cells there is no prohormone convertase and the hormones are not cleaved by this enzyme (e.g. Okamoto et al., 2009), which typically cleaves behind the dibasic Lys-Arg pair in hormone and neuropeptide precursors (Veenstra, 2000). However, although some insect insulin-related hormones, such as the honeybee insulin AmILP-1 (Wheeler et al., 2006) and the silkmoth insulin-like growth factor (Okamoto et al., 2009), may consist of a single protein chain, others appear destined to be cleaved by the general cell protein convertase furin, as they contain a consensus processing site for this enzyme.
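The distinction between a dibasic Lys-Arg convertase site and a furin site can be illustrated by scanning a precursor sequence for both patterns. The sketch below rests on assumptions that are not stated in the text: it uses the commonly cited minimal furin consensus Arg-X-(Lys/Arg)-Arg, it assumes cleavage directly C-terminal to each motif, and the example sequence is invented.

```python
import re

# Lookaheads are used so that overlapping motifs are all reported.
DIBASIC_KR = re.compile(r"(?=KR)")        # Lys-Arg pair
FURIN_RXKR = re.compile(r"(?=R.[KR]R)")   # assumed consensus R-X-(K/R)-R

def cleavage_sites(precursor: str) -> dict:
    """Return 0-based indices of the first residue after each candidate site."""
    return {
        "dibasic_KR": [m.start() + 2 for m in DIBASIC_KR.finditer(precursor)],
        "furin_RXKR": [m.start() + 4 for m in FURIN_RXKR.finditer(precursor)],
    }

# Invented precursor fragment: one plain KR site and one furin-like RSKR site.
toy_precursor = "AAAKRGGGRSKRTTT"
print(cleavage_sites(toy_precursor))
```

A furin-like site that ends in Lys-Arg is also reported as a dibasic site, so the two lists have to be read together; sequence alone does not say which enzyme acts in a given cell type.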

The pea aphid genome contains three genes in which the insulin precursor is likely cleaved by furin. All three, the IRP-5, -6 and -7 genes, have a single furin-like cleavage site between the A and B chains; the mature hormones are therefore predicted to consist of an A and a B chain, and their precursors lack a C peptide. Two or three pseudogenes also form part of this group. Two of them can be found using BLAST by homology at the nucleotide level to IRP-6; for the third, it is the putative translation that shows homology with an insulin A chain (Fig. 3). The latter pseudogene is in the same contig as IRP-5, suggesting that it likely originated from a gene rearrangement. There are a total of 16 independent EST sequences for IRP-5, suggesting that this gene is very abundantly expressed. The databases contain ESTs encoding a homologue of this hormone from both M. persicae and Toxoptera citricida, reinforcing the notion that IRP-5 is abundantly expressed in aphids. It is worth noting that the different insulin precursors encoded by these three aphid species are highly homologous, indicating that aphids in general may have very similar insulin genes (Fig. 3).

As insulin is essential for growth in Drosophila (Brogiolo et al., 2001; Ikeya et al., 2002) and aphids grow very fast, it seems rather likely that IRP-5 stimulates growth. Aphids are r-strategists, for which fast growth rates are important. The large majority of the IRP-5 ESTs come from whole insect libraries, while its M. persicae homologue is derived from the digestive tract. Although the intestine contains endocrine cells producing regulatory peptides, in Drosophila the most abundantly expressed insulin peptide is not expressed in endocrine cells, but in midgut muscle (Ikeya et al., 2002; Veenstra et al., 2008). This suggests that IRP-5 may be predominantly expressed by non-endocrine cells.

The third group of aphid IRPs consists of IRP-8, -9 and -10 (Fig. 3). Whereas IRP-1 through IRP-7 all have the classical spacing of cysteine residues in the A chain, in these IRPs the cysteine spacing is CC(X)3C(X)9C. This group also has a single convertase cleavage site, which suggests that, like IRP-5, -6 and -7, the mature IRP-8, -9 and -10 consist of an A and a B chain. No ESTs were found for these genes, either in A. pisum or in other aphid species. Nor did we find the masses of these peptides, but this was not unexpected as these peptides are too big to be reliably detected under the conditions used.
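The cysteine spacings used to sort the IRPs, CC(X)3C(X)8C for the classical A chain, CC(X)3C(X)9C for the third group, and two cysteines 11 residues apart in the B chain, translate directly into regular expressions. This is a minimal sketch: the function names are illustrative, X is treated as any residue, and the test strings are made up rather than taken from Fig. 3.

```python
import re
from typing import Optional

def a_chain_pattern(second_gap: int) -> re.Pattern:
    """Build CC(X)3C(X)nC, where n is the spacing before the last cysteine."""
    return re.compile(rf"CC.{{3}}C.{{{second_gap}}}C")

CLASSICAL_A = a_chain_pattern(8)      # CC(X)3C(X)8C
THIRD_GROUP_A = a_chain_pattern(9)    # CC(X)3C(X)9C
B_CHAIN = re.compile(r"C.{11}C")      # two cysteines separated by 11 residues

def classify_a_chain(seq: str) -> Optional[str]:
    """Label an A-chain candidate by its cysteine spacing, or return None."""
    if CLASSICAL_A.search(seq):
        return "classical"
    if THIRD_GROUP_A.search(seq):
        return "third-group"
    return None

# Made-up sequences that only demonstrate the two spacings.
print(classify_a_chain("GIVEECCHRPCSLYDLENYCN"))    # 8 residues before last C
print(classify_a_chain("GIVEECCHRPCSLYDLENAYCN"))   # 9 residues before last C
print(bool(B_CHAIN.search("AAQHLCGSHLVEALYLVCGE")))
```

Because the classical pattern is tested first, a sequence that happens to satisfy both spacings is reported as classical.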

A recent transcriptomic analysis performed on pea aphid genes showed that the insulin receptor gene was downregulated and an insulin-degrading enzyme gene was upregulated in short-day reared insects (Le Trionnaire et al., 2009). This suggests that the insulin pathway is inhibited under short day conditions during the switch to the sexual reproductive mode, but the identity of the insulin receptor ligand is still unknown.

Neuropeptide and photoperiod signalling

As noted in the Introduction, the most fascinating aspect of aphid neuroendocrinology concerns two groups of five neuroendocrine cells in the pars intercerebralis, which have been shown to be necessary to maintain the production of asexuals (Steel & Lees, 1977). These cells were identified by staining with paraldehyde fuchsin, a classical staining method for neurosecretory cells that is sensitive to disulphide bridges. It is likely that cauterization of the group I cells removed not only those paraldehyde fuchsin positive cells but also other neuroendocrine cells in the pars intercerebralis, and therefore the possibility cannot be excluded that it is these other neuroendocrine cells that are necessary for the production of asexuals. Nevertheless, the very unusual axonal projections of the group I cells, whose counterparts in all other insect species project only to the corpora cardiaca, reinforce the hypothesis that it is in fact those paraldehyde fuchsin positive cells, and not some other cells nearby, that are responsible for maintaining the production of asexuals.

The question is which of the insect neurohormones and neuropeptide precursors have a sufficient number of disulphide bridges to be responsible for this staining. Neuroparsin and its mosquito homologue are both produced by strongly paraldehyde fuchsin positive cells in the pars intercerebralis (Tamarelle & Girardie, 1989; Brown et al., 1998). This makes neuroparsin a strong candidate, but unfortunately we were unable to identify unambiguously an A. pisum gene encoding a neuroparsin homologue. As no neuroparsin receptor has been identified, and the primary sequence of neuroparsin within insects is not very well conserved, it is impossible to exclude the possibility that such a gene is present in the aphid genome but was missed. Other insect neurohormones containing a number of disulphide bridges are the insulins, eclosion hormone, ITP and the two glycoprotein A/B dimers, bursicon and GPA2/GPB5. Hence, all of these hormones are potential candidates. Having the genomic as well as the predicted amino acid sequences of these aphid hormones available makes it relatively easy to test whether any of the candidate hormones are produced by the group I cells, either by in situ hybridization or by raising antisera to specific peptide sequences. Thus, whereas the presence of a neuropeptide and its receptor does not tell us anything about how it is used, the present study provides the tools to probe the functional significance of aphid hormones.

Experimental procedures

Biological experiments

The holocyclic clone LSR1.AC.G1 of A. pisum (the pea aphid) was reared on Vicia faba in controlled conditions at a constant temperature of 18 °C and with 16 h of light. This photoperiod regime allows clonal parthenogenetic viviparous reproduction. The LSR1.AC.G1 clone is the reference clone for which the genome is available (International Aphid Genomics Consortium, 2010). Sexual oviparous females were obtained by rearing the pea aphid LSR1.AC.G1 with 12 h of light at a constant temperature of 18 °C (Le Trionnaire et al., 2007).

MALDI-TOF mass spectrometry

For mass spectrometry, neural tissues (the brain and retrocerebral complex, the suboesophageal ganglion and thoracic ganglion mass) from adult viviparous parthenogenetic females were dissected in Ringer solution and transferred to a 0.5-ml Eppendorf tube, containing 50 µl methanol/water/acetic acid (90/9/1) (Ringer solution: 9.82 g/l NaCl, 0.32 g/l CaCl2, 0.48 g/l KCl, 0.73 g/l MgCl2, 0.25 g/l NaHCO3, 0.19 g/l NaH2PO4, pH 6.5). Approximately 50 dissected tissues were used for one mass spectrometry analysis. The samples were sonicated (on ice) three times for 1 min and the debris was discarded by centrifugation (10 min at 9500 g). Prior to the mass spectrometry analysis, the supernatants were dried in a vacuum centrifuge and reconstituted in 50 µl of 0.1% aqueous TFA. Desalting columns ZipTipC18 (Millipore, 15 µm) were pre-equilibrated for sample binding using 0.1% aqueous TFA containing 50% CH3CN, followed by 0.1% aqueous TFA. The reconstituted supernatants were loaded and, after flushing with 0.1% aqueous TFA to remove salts and other impurities, eluted directly onto a MALDI target plate in 1 µl 0.1% TFA containing 70% CH3CN.

Matrix-assisted laser desorption/ionisation (MALDI) time-of-flight (TOF) mass spectrometry was performed on a Reflex IV (Bruker Daltonic GmbH, Berlin, Germany), equipped with an N2 laser and a pulsed ion extraction accessory. One µl of the sample solution was transferred to a ground steel target plate, mixed with 0.5 µl of a saturated solution of α-cyano-4-hydroxycinnamic acid in acetone and air-dried. The instrument was calibrated using a standard peptide mixture (Bruker Daltonic GmbH). Spectra were recorded in the reflectron mode within a mass range from m/z 500 to m/z 5000.

All spectra were manually processed (background subtraction, smoothing and peak picking) using the FlexAnalysis software (Bruker Daltonic GmbH) in order to generate a peak list file. This peak list was compared with the theoretical masses of predicted neuropeptides from A. pisum (Suppl data, Table S1).
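The comparison of a peak list with theoretical masses can be sketched as a simple tolerance match. This is an illustration only: the matching tolerance is an assumption (none is given in the text), the function name is hypothetical, and the peak values and peptide names below are invented rather than taken from Fig. 2 or Table S1.

```python
def match_peaks(observed_mz, theoretical_mh, tol=0.5):
    """Pair observed ion signals with predicted [M+H]+ values.

    observed_mz    -- list of measured m/z values from the peak list
    theoretical_mh -- dict mapping peptide name to theoretical [M+H]+
    tol            -- absolute tolerance in m/z units (an assumed value)
    Returns a list of (peptide name, theoretical m/z, observed m/z) tuples.
    """
    hits = []
    for name, mz in theoretical_mh.items():
        for peak in observed_mz:
            if abs(peak - mz) <= tol:
                hits.append((name, mz, peak))
    return hits

# Invented numbers, for demonstration only.
peaks = [1000.52, 1234.60, 1500.10]
predicted = {"peptide_A": 1000.50, "peptide_B": 1300.00}
print(match_peaks(peaks, predicted))
```

A mass match obtained this way is suggestive rather than conclusive, since unrelated peptides can share a mass within the tolerance.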

cDNA library construction and expressed sequence tag sequencing

For RNA extraction, brain tissue was dissected free from the suboesophageal and thoracic ganglia on ice in NaCl solution. Samples were kept in RNA extraction buffer (see below). Two separate brain cDNA libraries were made, one from wingless viviparous adult parthenogenetic female brains (Ap-BrVp library), the other from wingless adult oviparous female brains (Ap-BrOvp library), using about 50 brains for each extraction of total RNA. Total RNA was extracted immediately after dissection using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany). Brains were crushed in the RLT extraction buffer, following the manufacturer's instructions. Complementary DNA synthesis and cloning were performed with the Creator™ SMART™ cDNA Library Construction Kit (BD Biosciences Clontech, Palo Alto, CA, USA) as described in Tagu et al. (2004), using 450 ng and 190 ng of total RNA for Ap-BrVp and Ap-BrOvp, respectively. Ligation products were electroporated into electrocompetent Escherichia coli TOP10 cells (Invitrogen, Cergy-Pontoise, France). Bacterial colonies (n = 1056 for each library) were inoculated into 96-well plates containing selective LB medium and 10% (v/v) glycerol, grown overnight in standing culture at 37 °C, and stored at −80 °C. Backup plates were also produced. Polymerase chain reaction (PCR) of cDNA inserts was performed as described by Tagu et al. (2004), using defrosted bacterial glycerol stock as a template. Excess primers and nucleotides were removed by filtration on Sephadex plates. The resulting purified PCR products were used as templates (30–50 ng/µl) for a sequencing reaction at the sequencing facilities of Biogenouest® (Roscoff, France). The name given to each EST corresponds to the name of the cDNA library (Ap-BrVp for A. pisum, Brain Viviparous and Ap-BrOvp for A. pisum, Brain Oviparous), followed by the Roman number of the microplate, the letter of the row in the microplate and the Arabic number of the column in the microplate (e.g. II-G10). Sequences (n = 1413) have been deposited in the dbEST database under the accession numbers GH706950-GH708362.
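The plate, row and column part of an EST name (e.g. II-G10) can be decoded mechanically. The sketch below makes several assumptions: the function names are hypothetical, a standard 96-well layout of rows A to H and columns 1 to 12 is assumed, only Roman numerals built from I, V and X are handled, and the way the library prefix is joined to the well label is not specified in the text, so only the well label is parsed.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}
WELL_LABEL = re.compile(r"^([IVX]+)-([A-H])(\d{1,2})$")

def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral made of I, V and X to an integer."""
    total = 0
    for current, following in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[current]
        # A smaller value placed before a larger one is subtracted (IV, IX).
        if following in ROMAN_VALUES and ROMAN_VALUES[following] > value:
            total -= value
        else:
            total += value
    return total

def parse_well_label(label: str):
    """Split a label such as 'II-G10' into (plate number, row letter, column)."""
    match = WELL_LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised well label: {label!r}")
    plate = roman_to_int(match.group(1))
    row = match.group(2)
    column = int(match.group(3))
    if not 1 <= column <= 12:
        raise ValueError(f"column out of range for a 96-well plate: {column}")
    return plate, row, column

print(parse_well_label("II-G10"))
```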

Sequence processing was performed by using the Staden package to clean vector and adaptor sequences and poly-A tails using default parameters. Sequences were mapped to the pea aphid genome using Sim-4 and BLAST. ESTs mapping to a predicted gene were annotated by using the predicted function of the gene model. ESTs not mapping to a predicted gene were annotated through the UniProt database as described in Sabater-Muñoz et al. (2006).

Gene identification by database searches

Sequences of neuropeptides as well as their precursors from D. melanogaster (Vanden Broeck, 2001), A. mellifera (Hummon et al., 2006), T. castaneum (Li et al., 2008) and the newly identified insect neuropeptides allatostatin CC (Veenstra, 2009) and the two CCHamides (Roller et al., 2008) were used in homology searches against the genomic database of A. pisum (scaffolds from v1.0 release of the assembled genome, December 2007), available at AphidBase, using tblastn (Altschul et al., 1997) with default values for all parameters. Alternatively, sequences were recovered by using a downloaded copy of the blast package (blast-2.2.18), which allowed adjusting parameters to increase the chances of finding neuropeptide genes. Each retrieved sequence was further analysed and manually curated by using the Apollo tool (Lewis et al., 2002) available online at AphidBase.

Prediction of potential signal peptide cleavage sites was performed with SignalP 3.0 (Bendtsen et al., 2004). Neuropeptides within the precursor sequences were located by similarities with known neuropeptides and the presence of mono- and dibasic proteolytic cleavage sites (Veenstra, 2000). The theoretical molecular masses (all monoisotopic) of these predicted peptides were calculated by using the Biolynx Protein/Peptide Editor in MassLynx V3.5 (Micromass Ltd, Manchester, UK).
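A theoretical monoisotopic mass of the kind described above can be approximated by summing standard residue masses. The following sketch is not the Biolynx calculation used in this study; it uses commonly tabulated monoisotopic residue masses, hypothetical function names, and handles only C-terminal amidation (not N-terminal pyroglutamate or disulphide bridges).

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056        # H2O added for the free N- and C-termini
PROTON = 1.00728        # mass of H+ for the singly charged ion
AMIDATION = -0.98402    # C-terminal OH replaced by NH2

def monoisotopic_mass(sequence: str, amidated: bool = False) -> float:
    """Neutral monoisotopic mass of a linear peptide."""
    mass = sum(RESIDUE_MASS[residue] for residue in sequence) + WATER
    if amidated:
        mass += AMIDATION
    return mass

def mh_plus(sequence: str, amidated: bool = False) -> float:
    """Theoretical m/z of the singly protonated ion, [M+H]+."""
    return monoisotopic_mass(sequence, amidated) + PROTON

# Example call for one amidated leucokinin listed in Table 1.
print(round(mh_plus("PAFSSWG", amidated=True), 3))
```

Values computed this way can then be compared with the ion signals in a peak list, keeping in mind that unmodelled modifications will shift the expected mass.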


The authors would like to thank Fabrice Legeai and Jean-Pierre Gauthier (AphidBase, INRA Rennes, BiO3P) for their support in bioinformatics. We appreciated the discussion with Stéphanie Jaubert-Possamai (INRA Rennes, BiO3P). Sequencing was performed at the sequencing facilities of Biogenouest® at Roscoff, France. This work was funded by ANR Exdisum. JH is a postdoctoral researcher of the Fund for Scientific Research-Flanders, Belgium (F.W.O.-Vlaanderen).