C. Tamborindeguy, Department of Entomology, Texas A & M University TAMU 2475, College Station, TX 77843-2475, USA. Tel.:+1 979 845 7072; fax:+1 979 845 6305; e-mail: firstname.lastname@example.org
Aphids are the primary vectors of plant viruses. Transmission can occur via attachment to the cuticle lining of the insect (non-circulative transmission) or after internalization in the insect cells with or without replication (circulative transmission). In this paper, we have focused on the circulative and non-propagative mode during which virions enter the cell following receptor-mediated endocytosis, are transported across the cell in vesicles and released by exocytosis without replicating. The correct uptake, transport and delivery of the vesicle cargo rely on the participation of proteins from different families, which have been identified in the Acyrthosiphon pisum genome. Assembly of this annotated dataset provides a useful basis to improve our understanding of the molecules and mechanisms involved in virus transmission by A. pisum and other aphid species.
Aphids are responsible for the transmission of 28% of all plant viruses (Hogenhout et al., 2008). In some cases, the virus is transmitted from plant to plant simply attached to the cuticle of the mouthparts or the foregut (non-circulative transmission) and, in other instances, the virus is internalized by vector cells (circulative transmission). In this last mode of virus transmission, virions, acquired by the insect while feeding on an infected plant, are internalized into insect cells and circulate through the insect body prior to being injected into a new plant with the saliva during a subsequent feeding. The circulative mode of transmission without replication in insect cells is exploited by members of the Luteoviridae family, which contains single-stranded RNA plant viruses specifically transmitted by aphids. Luteovirids have non-enveloped icosahedral capsids composed of two proteins: the major coat protein (or CP) and a minor component, the readthrough protein (or RT), which is produced by a translational readthrough mechanism of the CP stop codon (Brown et al., 1996). Luteovirids display a high level of vector-specificity; each virus is efficiently transmitted by a limited number of aphid species (Herrbach, 1999). In this family, Pea enation mosaic virus (PEMV) and Soybean dwarf virus (SbDV) are of particular interest because they are efficiently transmitted by the pea aphid, Acyrthosiphon pisum, whose genome has been sequenced. Over the last few years, remarkable progress has been made in defining the circulative mode of transmission of luteovirids. During transport throughout the aphid body, virions cross two aphid cell layers (the midgut and/or the hindgut and the accessory salivary glands) (Gildow, 1999; Reinbold et al., 2003). Because non-transmissible viruses fail to cross one of these layers, they are referred to as ‘transmission barriers’.
Transcytosis of these cell layers relies on the presence of specific aphid components able to recognize (receptors) and then to sustain virion transport (endocytic components) from one pole to the other in the cell (Gildow, 1999; Brault et al., 2007). Transmissible virions enter the cells following a clathrin-mediated endocytosis (CME) process and are transported across the cells enclosed in different types of vesicles (Gildow, 1993). It is believed that the virions hijack a naturally occurring mechanism which most likely enables transport of essential aphid macromolecules (Marsh & Helenius, 2006).
Luteovirid transmission has been shown to depend on both virus and aphid factors: both capsid proteins are required for efficient virus transmission (Jolly & Mayo, 1994; Brault et al., 1995, 2000; Chay et al., 1996; Gildow et al., 2000; Reinbold et al., 2001) and several aphid genes acting in an additive manner regulate the transmission process (Burrows et al., 2007). In addition, different aphid genes might be involved in the transmission of different virus strains, with specific genes acting at each barrier for virion recognition and transport. Several aphid proteins potentially involved in luteovirid transmission have already been identified in the green peach aphid (Myzus persicae) and the greenbug aphid (Schizaphis graminum), including actin, GAPDH, Rack-1, cyclophilin and luciferase (Seddas et al., 2004; Yang et al., 2008). However, because luteovirids are transported by membrane trafficking mechanisms, proteins involved in endocytosis, vesicle transport or exocytosis must be considered as key elements that regulate the luteovirid transmission process, as they might control virion transport across the transmission barriers.
Membrane trafficking mechanisms allow the transport of substances between different compartments in eukaryotic cells, and distinct families of proteins have been identified as regulators of this process. For example, coat protein complexes (CPCs) are involved in the biogenesis of cargo vesicles from donor membranes, whereas Rab GTPases, exocyst proteins, synaptotagmins and soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptors (SNAREs) mediate targeting, docking and fusion of these vesicles to membranes. In addition, transfer of cargo vesicles from the origin to the destination point relies on a network of cellular projections composed of actin filaments.
The completion of the A. pisum genome provides a unique opportunity to catalogue genes potentially involved in membrane trafficking in this species and allows comparative genomic analyses with similar protein families within other insect species, in particular with the dipteran Drosophila melanogaster. The identification and annotation of these genes in aphids may be a key to unravelling the precise steps that occur during phytovirus transcytosis in insect cells.
Results and discussion
Conservation of the clathrin-mediated endocytosis pathway in Acyrthosiphon pisum
Clathrin-mediated endocytosis is an important vesicle biogenesis pathway. It is the chief means for internalization of transmembrane proteins, bound cargos and lipids from the plasma membrane and may also occur in the trans-Golgi network and on some endosomes (Bonifacino & Lippincott-Schwartz, 2003). The primary function of CME is nutrient and growth factor uptake into the cell; however, despite its specificity and tight regulation, this process is susceptible to hijacking by toxins and viruses. CME commences with the formation of clathrin-coated pits (CCP) from which cargos are packed into vesicles that are surrounded by a coat predominantly made of clathrin and adaptor proteins (Benmerah & Lamaze, 2007). Concurrently, a number of endocytic accessory proteins and alternative adaptors are recruited to the cell surface (Fig. 1). The accessory proteins can simultaneously or sequentially become engaged in interactions with other components of this endocytic pathway, such as lipids and transmembrane proteins anchored in plasma membranes. This series of inter-connecting events has recently been described in several organisms as a dynamic network in which adaptor protein 2 complex (AP2) and clathrin hold a central position (Schmid et al., 2006; Schmid & McMahon, 2007). While accessory proteins primarily play a regulatory role in determining the contents and initiating the formation of coated pits, AP2 and clathrin maintain additional functions as structural components of coated vesicles. Forty-five endocytic genes were annotated in the A. pisum genome (Table 1). This finding represents a near-perfect correlation with D. melanogaster (Table S1); the only differences identified between these two species are the absence of the ced-6 gene from the A. pisum genome and the additional presence of the low-density lipoprotein receptor adaptor protein 1 gene (discussed below).
This result is in good agreement with previous reports affirming the conservation of the CME network across Animalia (Schmid & McMahon, 2007) and might be viewed as a hallmark of the central role of this pathway in the correct performance of the cell.
Table 1. Genes in Acyrthosiphon pisum potentially involved in transcytosis
Candidate: proteins interacting with luteovirids; CME, Clathrin-mediated endocytosis; EST, expressed sequence tag.
Clathrin heavy chain
Clathrin light chain
Adaptor-related protein complex 2, alpha subunit
Adaptor-related protein complex 2, mu subunit
Adaptor-related protein complex 2, sigma subunit
adaptor-related protein complex 1, gamma subunit
adaptor-related protein complex 1, mu subunit
adaptor-related protein complex 3, beta subunit
adaptor-related protein complex 3, delta subunit
adaptor-related protein complex 3, mu subunit
adaptor-related protein complex 3, sigma subunit
Epsin-related protein (enthoprotin)
Phosphatidylinositol binding clathrin assembly protein
Low density lipoprotein receptor adaptor protein 1
Target of myb1
Auxilin (cyclin G associated kinase)
syntaxin binding protein 1.1 (Ras opposite)
syntaxin binding protein 1.2 (Ras opposite)
ADP ribosylation factor 79F
ADP ribosylation factor 102F
ADP ribosylation factor 84F
Arflike at 72A
Dynamin related protein 1
Dynamin-like protein 1
Dynamin-like protein 2
Dynamin-like protein 3
Dynamin-like protein 4
Dynamin-like protein 5
Dynamin-like protein 6
Dynamin-like protein 7
Dynamin-like protein 8
Dynamin-like protein 9
Dynamin-like protein 10
Dynamin-like protein 11
Dynamin-like protein 12
Actin-related protein A
Actin-related protein B
Actin-related protein 1
Actin-related protein 2
Actin-related protein 3
Actin-related protein 4
Actin-related protein 4
Actin-related protein 4
Actin-related protein 4
Actin-related protein 5
Actin-related protein 6
Actin-related protein 8
Actin-related protein 11
Guanine nucleotide binding protein (G protein), beta polypeptide 2-like 1
Glyceraldehyde 3 phosphate dehydrogenase 2
AMP dependent CoA ligase 2
peptidylprolyl isomerase B (cyclophilin B)
Rab-related protein 4
Rab-related protein 3
Synapse protein 25A
Synapse protein 25B
lethal (1) G0155B
lethal (1) G0155
Coat components. Two genes encoding the main components of the coat, the clathrin heavy chain (Chc) and the clathrin light chain (Clc), were noted in the A. pisum genome. These proteins form three-legged trimers, called triskelions, that can oligomerize to produce polygonal arrays (Kirchhausen & Harrison, 1981). Although clathrin is targeted to the plasma membrane, it has no affinity for plasma membrane components and relies on the recruitment of endocytic adaptors for its targeting.
Adaptor protein complexes and accessory proteins. Endocytic adaptors bridge interactions between the lipid phosphatidylinositol (4,5)-bisphosphate [PtdIns(4,5)P2] anchored in the plasma membrane, clathrin and specific signals located within the tails of transmembrane receptors. Depending on the number of polypeptides in the adaptor complexes, two classes have been defined: (1) multimeric adaptor proteins which contain four polypeptides (α, β, γ, δ, µ or σ); and (2) the monomeric clathrin-associated sorting proteins (CLASPs) which are often referred to as accessory proteins (Maldonado-Baez & Wendland, 2006). A complete set of the genes encoding the different subunits of the three multimeric adaptor proteins (AP1-AP3) is present in the pea aphid genome (the AP4 complex is lacking in both the D. melanogaster and A. pisum genomes). While AP2 is the principal adaptor of the endocytic process occurring at the plasma membrane (Slepnev & De Camilli, 2000), the other adaptor complexes are involved in vesicular trafficking from the trans-Golgi network, the endosomal compartment or in the basolateral pathway (Owen et al., 2004).
Accessory proteins are involved in multiple steps of this pathway including membrane binding or bending (Wendland, 2002). For example, AP180 has a clathrin-cage assembly activity and promotes the formation of uniformly sized clathrin-coated vesicles (Zhang et al., 1998; Traub, 2003). The gene that encodes this protein could not be located within the A. pisum and the D. melanogaster genomes. However, a homologous protein (PiCalm) shown to perform a similar function in D. melanogaster was successfully identified (Table S1). Other accessory proteins, like epidermal growth factor pathway substrate 15 (Eps15) and Intersectin, act as scaffolding proteins and, thus, represent organizing proteins (Tebar et al., 1996). As in D. melanogaster, only one member of each family is present in the A. pisum genome (Table S1). A number of accessory molecules functioning as alternative adaptors for specific transmembrane receptors (Wendland, 2002) were conserved in A. pisum as compared with D. melanogaster: these include the Numb and Arrestin genes, which encode alternative adaptors for Notch and activated G-protein-coupled receptors (Table S1). In contrast, disparities between D. melanogaster and A. pisum were found for the low-density lipoprotein receptor adaptor protein 1 (Ldlrap1, also known as ARH) and the ced-6 (homologue of human GULP1) genes, which act as alternative adaptors for the low-density lipoprotein receptor. To ensure proper annotation, sequence homology was confirmed by multiple sequence alignment with similar sequences derived from other insects (Fig. 2). An Ldlrap1 homologue was identified in A. pisum but not in D. melanogaster. Conversely, a homologue of the D. melanogaster ced-6 gene was not detected in the pea aphid genome. The presence of the Ldlrap1 gene and absence of the ced-6 gene have also been documented in one other insect genome (Tribolium castaneum), but not in the other insect genomes that have been described thus far (Table S2).
These genes encode proteins containing the phosphotyrosine-binding domain (PTB) that attaches to the cytoplasmic tails of the LDL receptor and PtdIns(4,5)P2 (Eden et al., 2007). The redundancy of the PTB-binding domain-containing proteins may explain the partial conservation of these proteins in different insect species. Ced-6 proteins have also been shown to be essential for engulfment of apoptotic cells (Liu & Hengartner, 1998). These proteins may participate in lipid transport along with two other proteins found within the A. pisum genome, Dab2 and Numb.
Finally, amphiphysins and dynamins aid in the separation of the vesicle from the plasma membrane and Hsc70, Auxilin and endophilins promote uncoating (Brodsky et al., 2001; Conner & Schmid, 2003). With the exception of dynamins (see below), these genes display a high degree of conservation with those found in D. melanogaster (Table S1). Annotation of Hsc70 is described elsewhere (Gerardo et al., 2009).
Dynamins: a novel type only present in aphids. Dynamins are large GTPases involved in various processes including endocytosis and budding of transport vesicles, which are crucial for the circulative mode of virus transmission (Cherry & Perrimon, 2004; Praefcke & McMahon, 2004). Their basic function appears to be to constrain the shape of the lipid membrane to perform fission or fusion. Animals including insects typically have three types of dynamins (Dyn, Drp1, and Opa1) (Miyagishima et al., 2008). In many cases, a single gene corresponds to each type.
Screening of the A. pisum genome revealed 15 putative dynamin genes (Nakabachi & Miyagishima, 2010) (Table 1). In addition to a single orthologue for each of Dyn, Drp1, and Opa1, which are common in metazoa, 12 genes encoding a novel type of dynamin, not yet identified in any other organism, were found. Expressed sequence tag (EST) analyses and reverse transcription (RT)-PCRs showed that at least 11 of these novel type genes as well as the three canonical genes are transcribed, suggesting that they are functional. Real-time quantitative RT-PCR further demonstrated that expression of four of the 12 novel type dynamin genes is highly upregulated in the midgut, through which aphids take in phloem sap diets and plant viruses (Nakabachi & Miyagishima, 2010). As this type of dynamin is absent from all other fully sequenced organisms, the products of these genes may function in processes that are unique to aphids.
Actins. Actin is one of the most highly conserved and abundant proteins in the cell. It is ubiquitous in eukaryotes, with high sequence similarity among species (Kaksonen et al., 2006). Actin was first described in muscle cells (Halliburton, 1887) but today actin is known to participate in a large array of functions. More importantly, the actin cytoskeleton interacts with various endocytic components (Apodaca, 2001), resulting in inter- or intra-cellular transport or relocation of endogenous macromolecules as well as viruses (Ploubidou & Way, 2001). For example, actin interacts with dynamin at the neck region of the budding vesicle after plasma membrane invagination (Merrifield et al., 2002). Moreover, M. persicae actin was shown to interact with the luteovirid Beet western yellows virus (BWYV) (Seddas et al., 2004).
Actin also has an essential role in exocytosis. Actin filaments create a physical barrier that prevents trafficking or docking of secretory materials at the plasma membrane. Therefore, actin filaments have to be removed or relaxed from the plasma membrane to allow exocytosis (Miyake et al., 2001). Disruption of the actin cytoskeleton results in increased exocytosis (Jog et al., 2007), but complete actin depolymerization inhibits exocytosis (Muallem et al., 1995).
Six actin genes have been identified in D. melanogaster (Tobin et al., 1980): Act88F and Act79B are adult muscle-specific, Act57B and Act87E are larval muscle-specific and Act5C and Act42A are cytoplasmic. In addition, nine actin-related protein (ARP) genes are described in the D. melanogaster genome. Four potential actin genes and 13 actin-related genes were identified in the A. pisum genome (Table 1). No orthologous genes for the muscle-specific Act79B, Act88F, Act57B or Act87E were found within the A. pisum genome; instead, two different genes were identified, Act2 and Act3 (Fig. S1). The difference observed between A. pisum and D. melanogaster may be related to insect development (hemimetabolism for A. pisum and holometabolism for D. melanogaster). Two genes, Act1 and Act4, are similar to the cytoplasmic actin genes of D. melanogaster, Act5C and Act42A (Fig. S1). Cytoplasmic actin is believed to be involved in several functions, in particular virus transcytosis. In spite of its name, the D. melanogaster gene Arp53D encodes a conventional actin (Muller et al., 2005). Arp53D orthologues are only found in other Drosophila species. The other 13 A. pisum genes encode ARPs.
ARPs were discovered in the 1990s in eukaryotic cells. Presently, 11 ARP subfamilies (ARP1 to ARP11) have been defined according to their similarity to conventional actin sequences, with ARP1 being the most similar and ARP10 the least similar. ARP11 was discovered after this classification was established and has higher similarity with conventional actin than ARP8. ARP1-ARP3, ARP10 and ARP11 are localized in the cytoplasm where they participate in actin assembly and movement of vesicles along microtubules (Schafer & Schroer, 1999). The other ARPs are predominantly localized in the nucleus where they participate in chromatin remodelling, DNA repair and regulation of transcription (Blessing et al., 2004). Other orphan ARPs have been identified in some organisms (Muller et al., 2005). Alignment of ARPs with conventional actins identified, for each ARP subfamily, the location of hotspots for insertions and deletions; for eight ARP subfamilies (ARP1, ARP2, ARP3 and ARP5 to ARP9) discriminating motifs and single residues were found (Muller et al., 2005). Potential A. pisum ARP genes were assigned to different ARP subfamilies using the identified discriminating motifs. As in D. melanogaster, ARP1 to ARP6, ARP8 and ARP11 genes were identified, whereas no ARP7, ARP9 or ARP10 genes were found (Fig. S1). The results obtained for both insects are in accordance with the phylogenetic distribution of ARP genes. The absence of ARP7, ARP9 and ARP10 is not surprising because these genes are restricted to fungi (Muller et al., 2005). Four potential ARP4 genes were found in A. pisum, and partial EST coverage was identified for only one of them. This gene is also duplicated in Apis mellifera and T. castaneum. Two additional ARP genes were found within the A. pisum genome, ARPA and ARPB (Fig. S1). These genes appear to be unique to A. pisum; no homologues were identified in any other organism. They could not be assigned to any subfamily using the discriminating motifs and are probably orphan ARPs.
Seventeen key reference residues involved in nucleotide binding have been identified (Muller et al., 2005). Conventional actins and ARP1–ARP3 (cytoplasmic ARPs) have more than 60% identical residues and 90% similar residues, and are able to bind ATP, whereas the rest of the ARPs (ARP4–ARP11), with fewer identical and similar residues, might not bind ATP, or might bind it with lower affinity or through other residues. The two orphan ARPs have a low percentage of identical and conserved amino acids for the 17 key residues; they might be unable to bind ATP.
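The comparison at the key nucleotide-binding residues can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the function name, the amino-acid similarity groups and the toy sequences are hypothetical, and the real key positions would have to be taken from the alignment of Muller et al. (2005).

```python
# Hypothetical similarity groups (an assumption; a substitution matrix
# could be used instead to decide which residues count as "similar").
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FWY"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("STNQ"),  # polar, uncharged
    set("AG"),    # small
]


def key_residue_scores(reference, query, key_positions):
    """Return (% identical, % similar) at the given 0-based alignment columns.

    `reference` and `query` must be two rows of the same alignment.
    Identical residues are also counted as similar; a gap at a key
    position counts as neither.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    identical = 0
    similar = 0
    for pos in key_positions:
        ref, qry = reference[pos], query[pos]
        if ref == "-" or qry == "-":
            continue
        if ref == qry:
            identical += 1
            similar += 1
        elif any(ref in group and qry in group for group in SIMILAR_GROUPS):
            similar += 1
    total = len(key_positions)
    return 100.0 * identical / total, 100.0 * similar / total


# Toy aligned fragments and toy key columns (not real actin data).
toy_scores = key_residue_scores("DGSGKVLE", "DGTGRALD", [0, 2, 4, 5, 7])
```

With real data, the same function would be applied once per ARP against a conventional actin, using the 17 key columns of the alignment.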
No transcriptomic data are available verifying the expression of these genes in A. pisum or in other aphids. Therefore, more experiments are needed to verify whether these genes are functional or not, and whether they are ‘aphid-exclusive’ or are more extensively widespread.
Conservation of intracellular traffic and exocytosis pathways
Common steps in membrane targeting include the following: (1) SNARE activation and Rab recruitment to proper organelle sites; (2) membrane attachment; and (3) membrane fusion and bilayer mixing (Tuma & Hubbard, 2003).
Rab GTPases. Rab GTPases are small proteins anchored to membranes by geranylgeranyl hydrocarbon chains (Alory & Balch, 2000). They act as molecular switches by alternately binding GDP (inactive form) and GTP (active form). Only the GTP-bound form of Rab is able to interact with effector proteins that participate in the coupling of endomembranes to motors, vesicle docking and tethering (Zerial & McBride, 2001; Deneka et al., 2003). Additionally, Rab proteins take part in cargo selection, vesicle budding, movement along actin and tubulin networks, and targeting (Pfeffer, 2001; Ali et al., 2004; Pfeffer, 2005). Eleven members have been documented in yeast, 33 in D. melanogaster and 60 in humans (Bock et al., 2001; Zhang et al., 2007). The phylogenetic analysis of Rab proteins revealed a phylogeny of function: Rab proteins of similar function in different organisms co-segregate (Pereira-Leal & Seabra, 2001).
A total of 26 different potential Rab genes were identified in the A. pisum genome (Table 1). With few exceptions, one orthologue of each of the D. melanogaster Rab genes was identified (Fig. S2). Ten Rab genes identified in D. melanogaster are missing in A. pisum. Six of these genes (RabX2, Rab9D, Rab9Db, Rab9E, Rab9Fa and Rab9Fb) are very similar. These genes cluster on the X chromosome of D. melanogaster and orthologues have only been found in other Drosophila species; they may therefore have evolved recently (Zhang et al., 2007). Despite being identified in other insects, the following genes are also missing from the A. pisum genome: Rab4, Rab27, RabX5 and RabX6. In mammals, receptor recycling following internalization can occur via the ‘short-loop’ pathway controlled by Rab4 or the ‘long-loop’ pathway regulated by Rab11. Rab4 is absent from A. pisum and several other insect genomes (e.g. A. mellifera, Toxoptera citricida, Nasonia vitripennis), while Rab11 is highly conserved in all the sequenced insect genomes. In the species lacking Rab4, the recycling of receptors might therefore be performed exclusively by Rab11, or the Rab4 function might be performed by another protein. A similar situation is encountered with Rab3 and Rab27, which are involved in regulated secretion in many types of secretory cells. Rab27 is absent from the A. pisum genome (as well as from other insect genomes) whereas Rab3 is conserved among all the sequenced insect genomes; it is therefore possible that Rab3 fulfils the Rab27 function in insects lacking Rab27. RabX5 has been identified in Drosophila species and mosquitoes, whereas RabX6 has only been identified within genomes of the Drosophila species. These genes might have appeared after the divergence of the aphid lineage.
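The presence/absence comparison described above amounts to a set comparison between two gene repertoires, which can be sketched as follows. The function name is hypothetical and the gene lists are deliberately partial: they contain only genes named in the text, not the complete Rab repertoires of either species.

```python
def compare_repertoires(reference, target):
    """Compare two collections of gene names.

    Returns a dict with the genes shared by both, the genes present in
    `reference` but missing from `target`, and the genes found only in
    `target`. Each value is a sorted list.
    """
    reference, target = set(reference), set(target)
    return {
        "shared": sorted(reference & target),
        "missing_in_target": sorted(reference - target),
        "only_in_target": sorted(target - reference),
    }


# Partial, illustrative lists built from genes named in the text
# (an assumption for the example; not the full repertoires).
dmel_rabs = {
    "Rab3", "Rab11", "Rab32", "RabX4",              # conserved in A. pisum
    "RabX2", "Rab9D", "Rab9Db", "Rab9E", "Rab9Fa", "Rab9Fb",  # X-chromosome cluster
    "Rab4", "Rab27", "RabX5", "RabX6",              # found in some other insects
}
apisum_rabs = {"Rab3", "Rab11", "Rab32", "RabX4", "Rab2b"}

rab_comparison = compare_repertoires(dmel_rabs, apisum_rabs)
```

Here `rab_comparison["missing_in_target"]` lists the ten D. melanogaster genes reported as absent from A. pisum, and `rab_comparison["only_in_target"]` contains Rab2b.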
The RabX4 and Rab32 genes seem to have been duplicated in the A. pisum genome (Fig. S2). These duplications were not identified in any other insect genome available to date. In both cases, the two duplicated A. pisum genes seem to evolve at different rates. However, the estimation of the numbers of synonymous and nonsynonymous nucleotide substitutions showed that both sets of genes were under purifying selection (Table S3).
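Purifying selection is conventionally inferred when the rate of nonsynonymous substitutions (dN) is lower than the rate of synonymous substitutions (dS), that is, when the ratio dN/dS is below 1. A minimal sketch of that classification is given below; the function name, the tolerance around 1 and the input values are assumptions chosen for illustration and are not the values reported in Table S3.

```python
def selection_regime(dn, ds, tolerance=0.05):
    """Classify a gene pair from its dN and dS estimates.

    Returns (omega, label) where omega = dN / dS. The `tolerance` band
    around 1 is an arbitrary choice for this sketch; in practice a
    statistical test would be used to decide whether omega differs from 1.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if omega < 1 - tolerance:
        label = "purifying"
    elif omega > 1 + tolerance:
        label = "positive"
    else:
        label = "neutral"
    return omega, label


# Hypothetical estimates for one duplicated gene pair.
omega, label = selection_regime(dn=0.02, ds=0.40)
```

For these made-up inputs the ratio is about 0.05, so the pair would be labelled as evolving under purifying selection.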
The Rab2b gene was identified in A. pisum. This gene is absent from all other sequenced insects but present in mammals.
Synaptotagmins. Synaptotagmins were originally identified as synaptic vesicle proteins involved in membrane fusion events. They contain a small N-terminal intravesicular domain, a single transmembrane domain and a large cytoplasmic region with two tandemly-arranged, distinct C2 domains that are calcium-dependent phospholipid-binding motifs (Perin et al., 1990). This family of proteins has drawn attention for its potential role as a calcium sensor in synaptic exocytosis of neurotransmitters (Littleton & Bellen, 1995). To date, 16 genes have been described in humans but only seven were identified in D. melanogaster based on DNA sequence similarity. Analyses of the expression pattern of different synaptotagmin members have shown that some are primarily expressed in the nervous system and other endocrine organs (e.g. synaptotagmins 1–5 and 10–12) (Mizuta et al., 1994; Babity et al., 1997; Berton et al., 1997), while others are more ubiquitous (synaptotagmins 6–9) (Li et al., 1995). Moreover, some members are not associated with vesicles but with other cellular compartments such as the plasma membrane or lysosomes (synaptotagmins 3, 4, 6 and 7) (Butz et al., 1999; Martinez et al., 2000). The release of plant genome sequences has allowed the identification of synaptotagmins in plants (six in Arabidopsis thaliana and at least eight in rice) (Craxton, 2004), suggesting that their function is not restricted to neurotransmitter release (Marqueze et al., 2000) but probably extends to a more global role in exocytosis.
Based on sequence similarity, only four genes putatively encoding synaptotagmins were identified in A. pisum (Table 1). These genes represent three orthologues of D. melanogaster: Syt, Syt7 and SytIV. Syt and SytIV are present in many vertebrate and invertebrate genomes and are thought to mediate evolutionarily conserved functions required in animals. Syt7 was identified in the A. pisum genome but is missing in two invertebrate genomes (Aedes aegypti and Anopheles gambiae, Table S4). No orthologues of the four other D. melanogaster synaptotagmin genes or of one additional gene identified in N. vitripennis were noted in A. pisum, although they are present in other insect genomes (Table S4). D. melanogaster synaptotagmin proteins were localized in different subcellular compartments, which suggested that they cannot substitute for each other (Adolfsen et al., 2004). Knowledge of the precise function in insects of the five synaptotagmins absent from A. pisum is still limited; their function is therefore either dispensable in A. pisum or provided by as yet unidentified proteins.
The SNARE superfamily. SNARE proteins form a protein bridge between an incoming vesicle and the acceptor compartment (the SNARE complex) which is required for the fusion of two lipid bilayers. Using previously characterized members of the SNARE superfamily from D. melanogaster, 23 putative SNARE genes were identified in A. pisum (Table 1). In general, the A. pisum SNAREs share the conserved structural features of these proteins: a SNARE motif ∼60–75 amino acids arranged in a series of heptad repeats followed by either a transmembrane domain or a site for post-translational modification (e.g. farnesylation/geranylgeranylation).
The A. pisum candidate genes were categorized as either Qa-, Qb-, Qc-, Qbc-, or R-SNAREs in accordance with the currently accepted classification scheme (Fasshauer et al., 1998). The SNARE seed region designated by Pfam (http://pfam.sanger.ac.uk) was identified for each sequence and used for the phylogenetic analysis (Fig. S3). Although the Qa-SNARE syntaxin genes 4, 13 and 18 are present in the D. melanogaster genome (Littleton, 2000), they were not found within the A. pisum genome. This finding is not surprising because functional redundancy has been observed among these proteins (Jahn & Scheller, 2006). SNAP24 was not found within the A. pisum genome; this gene is only present within genomes of the Drosophila species.
Potential sites for post-translational modifications were found in some A. pisum SNARE proteins, for example in both variants encoded by the lethal (1) G0155 gene, which possess putative prenylation sites. Such a modification is thought to facilitate attachment to cell membranes. However, other proteins such as SNAP25A and SNAP25B are devoid of such predicted sites as well as of transmembrane domains. Targeting of these proteins to membranes is probably accomplished through association with other SNAREs, as observed with human SNAP25 and Syntaxin (Vogel et al., 2000).
The majority of A. pisum SNAREs contain a single SNARE motif located at the C-terminal end of the protein. However, SNAP25A and SNAP25B possess dual SNARE motifs, a feature characteristic of animals, higher plants, and fungi (Weimbs et al., 1997; Besteiro et al., 2006).
Additional structural features were identified in the A. pisum Qa-SNAREs. With the exception of Syntaxin 17, all the proteins were found to carry a three-α-helix bundle with a left-handed twist at their N-terminal ends. This structure, denoted Habc, may fold back onto the SNARE domain and enable the molecule to adopt a ‘closed’ conformation that prevents assembly of the core fusion complex (i.e. Habc could act as an autoinhibitor of SNARE function) (Lerman et al., 2000; Teng et al., 2001). A second type of autonomous N-terminal structure, a profilin-like domain, is present within some R-SNAREs as well as putatively unrelated proteins (Rossi et al., 2004). This structure may also fold back onto its SNARE domain and thus inhibit binding of other molecules.
The exocyst complex. Originally identified in yeast, the exocyst is a complex of proteins that is involved in vesicle trafficking (Novick & Schekman, 1979). These proteins are either located on the plasma membrane, attached to Rho GTPases (e.g. EXOC1 and EXOC7) or on the vesicle membrane (e.g. EXOC6) (Boyd et al., 2004). They are involved in an array of cellular processes including exocytosis where they mediate the targeting and tethering of post-Golgi secretory vesicles for subsequent membrane fusion. In yeast and mammals, eight subunits are recognized (EXOC1 to EXOC8, formerly denoted as Sec3, Sec5, Sec6, Sec8, Sec10, Sec15, Exo70 and Exo84). Based on sequence similarity to D. melanogaster, nine putative exocyst genes were noted in the A. pisum genome (Table 1). A single putative orthologue for each one of the eight exocyst genes was identified with a potential duplication of EXOC3 (Sec6). One of the predicted genes appears to encode the first 482 amino acids of a Sec6 protein, but is interrupted due to absence of sequence data in this region. In the other Sec6 gene analysed, the entire coding sequence is located in the same exon. Therefore, it is unlikely that this feature represents a pseudogene as transcriptome data support the existence and expression of these two genes.
Aphid receptors of luteovirids are still unknown but aphid proteins able to bind in vitro purified virions have been observed and, in some cases, identified (Li et al., 2001; Seddas et al., 2004; Yang et al., 2008). Some of these proteins are believed to drive endocytosis of BWYV or the RPV strain of Cereal yellow dwarf virus (CYDV-RPV) in their respective vector M. persicae and S. graminum.
Receptor for activated C kinase 1 (Rack-1) also called guanine nucleotide binding protein (G protein) beta polypeptide 2-like 1 (Gnb2l1), glyceraldehyde-3-phosphate dehydrogenase 2 (GAPDH), actin and a cuticular protein were identified as potential candidates involved in BWYV transmission by M. persicae by a virus-overlay assay (Seddas et al., 2004). Rack-1 is known to regulate cell surface receptors and intracellular protein kinases (Choi et al., 2003). This protein is probably an intracellular signalling molecule in the endocytosis pathway because an extracellular localization for Rack-1 has not been reported to date. GAPDH is an important enzyme of the glycolysis pathway that has also been shown to regulate endocytosis when phosphorylated (Tisdale, 2002). Actin, as already mentioned, is a crucial determinant of intracellular trafficking, whereas the function of the cuticular protein in luteovirid transcytosis remains unclear. Another technique, based on proteome comparisons between S. graminum populations exhibiting contrasting transmission efficiency for CYDV-RPV, allowed the identification of four additional proteins potentially involved in transmission of this virus. Two of these proteins showed similarities with hypothetical proteins, whereas the other two exhibited similarities with isomerase cyclophilin B and acyl-coenzyme A ligase proteins (homologous to firefly Luciferase) (Yang et al., 2008). The latter two proteins are of particular interest because they contain specific domains for addressing to vesicles similar to those enclosing luteovirid virions in insect cells.
We have annotated the sequence corresponding to Rack-1 in the A. pisum genome (Table 1) and observed that this gene is highly similar to its M. persicae homologue (96.56% nucleotide identity between coding sequences). Six genes encoding GAPDH are predicted in the A. pisum genome (Table 1, Table S5), whereas only three are found in D. melanogaster. Two of the predicted GAPDH genes in A. pisum exhibit greater sequence similarity to yeast (e.g. Candida albicans) or plant (e.g. Pinus sylvestris) homologues than to insect ones and have been annotated as GAPDH-like (Fig. S4). Expression of both proteins is still hypothetical because no EST has been reported for those two genes. The other four genes encoding GAPDH display high similarity to one another, but only two of them bear the enzyme active site. Evidence of gene expression was reported for only one of them, suggesting that the GAPDH gene could have been subjected to gene duplication. Actin is another candidate for BWYV transcellular transport, and genes encoding the protein have been annotated in A. pisum (see above). The remaining protein interacting with BWYV was identified with very low confidence (Seddas et al., 2004) as a M. persicae cuticular protein (gi|16798648). This M. persicae cuticular protein exhibits 95% sequence similarity with its A. pisum counterpart (NP_001127758). Annotation of this gene and other A. pisum cuticle proteins will be reported elsewhere. The A. pisum gene encoding acyl-coenzyme A ligase has been located in the genome (Table 1) together with 12 additional genes encoding proteins of the superfamily composed of acyl-coenzyme A ligases, peptide synthetases and firefly luciferase (Table S5). Genes homologous to cyclophilin B and other peptidylprolyl isomerases were also identified within the A. pisum genome (Table 1 and Table S5).
Conclusions and outlook
Our data on the identification in the A. pisum genome of the major genes involved in transcytosis point to a well-developed vesicle internalization, fusion and trafficking machinery that is conserved throughout the various Metazoan lineages. However, some interesting features have been uncovered, such as the existence of multiple copies of a new dynamin class gene as compared with D. melanogaster. The functional characterization of these genes will help to understand their role in aphid biology and their function, if any, in luteovirid transmission. Conversely, the synaptotagmin family was found to be significantly reduced, because only three of the eight known proteins seem to be encoded by the pea aphid genome.
The absence in A. pisum of several other proteins involved in vesicle transcytosis has also been reported in other insect species, suggesting the existence, in this pathway, of proteins with overlapping function. The overall analysis of the A. pisum phylome revealed the existence of large gene family expansions as well as the loss of several well-conserved gene families (International Aphid Genomics Consortium, 2010). Our analysis did not reveal the loss of any gene family involved in transcytosis, which underscores the importance of these families in this ubiquitous pathway.
We have identified over 144 genes in the genome of A. pisum that potentially play a role in endocytosis, trafficking and exocytosis of vesicular content. Each of the encoded proteins may participate in virus transcytosis through A. pisum cells. Nonetheless, additional biological experiments will need to be conducted to confirm or refute their involvement in virus transport in aphids. In particular, the RNA interference strategy recently applied to inhibit A. pisum gene expression could be used to analyse the effect of the identified proteins on virus transmission by aphids (Mutti et al., 2006; Jaubert-Possamai et al., 2007).
The identification and annotation of the dataset described herein will enable us to design targeted functional studies and ultimately lead to a better understanding of the transcytosis pathway which underlies virus transmission by A. pisum as well as by other aphid vectors.
Previously identified members of the CME pathway, Rab, synaptotagmin, exocyst, actin and SNARE families were retrieved from D. melanogaster, Homo sapiens and Rattus norvegicus using FlyBase (http://www.flybase.org) and the NCBI protein database (http://www.ncbi.nlm.nih.gov). These sequences were used as TBLASTN queries with A. pisum specified as the search set. High probability hits were compared with the Gnomon predictions in AphidBase (http://www.aphidbase.com, 2008 release). When available, EST data were used to verify the coding sequence of each candidate. To control for possible erroneous annotation, multiple sequence alignments were performed and clustering patterns on individual phylogenetic trees were noted. Gene names were assigned following the closest D. melanogaster or human orthologue, and the genes were annotated using the Apollo annotation tool.
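The hit-selection step described above can be illustrated with a minimal Python sketch. It assumes the TBLASTN results were saved in the standard BLAST+ tabular layout (12 tab-separated columns) and keeps the highest-scoring hit per query below an E-value cutoff. The cutoff value, the function name best_hits and the example rows are hypothetical; the text does not state which thresholds defined a "high probability" hit.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-10):
    """Return, for each query, the hit with the highest bit score.

    Hits with an E-value above max_evalue are ignored. The default
    cutoff is an assumption for illustration, not a value from the paper.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if len(row) < len(FIELDS) or row[0].startswith("#"):
            continue  # skip blank, truncated or comment lines
        hit = dict(zip(FIELDS, row))
        if float(hit["evalue"]) > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
            best[hit["qseqid"]] = hit
    return best


# Invented rows for illustration only (query and scaffold names are made up).
example = "\n".join([
    "query_A\tscaffold_12\t78.5\t200\t43\t0\t1\t200\t1500\t2099\t1e-90\t310",
    "query_A\tscaffold_40\t35.0\t120\t78\t2\t5\t124\t300\t659\t2e-12\t68.2",
    "query_B\tscaffold_07\t28.0\t90\t65\t1\t10\t99\t100\t369\t0.5\t30.1",
])

hits = best_hits(example)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["evalue"], hit["bitscore"])
```

Candidates retained this way would still need the manual checks described above (comparison with gene predictions, EST support and phylogenetic clustering) before annotation.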
Sequence alignments and phylogenetic analysis
Amino acid sequences were aligned with the Alignment Explorer/CLUSTAL algorithm using either the entire coding sequence or the appropriate seed region (as determined by InterProScan, http://www.ebi.ac.uk/Tools/InterProScan/). The data were then exported as a multiple sequence file (MSF) and used to construct a phylogenetic tree with MEGA 4.1 freeware (Tamura et al., 2007). Each tree was built with the Neighbor-Joining (NJ) method using Poisson-corrected amino acid distances. The reliability of each tree was then tested by bootstrapping (1000 pseudoreplicates).
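For readers unfamiliar with the distance measure, the Poisson correction converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = -ln(1 - p). The Python sketch below illustrates this calculation and the column resampling that underlies a bootstrap pseudoreplicate. It is not MEGA's implementation; the function names, the toy alignment and the choice to skip gapped sites pair by pair are assumptions made for illustration.

```python
import math
import random

MISSING = "-?"  # characters treated as gaps or missing data (assumption)


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among sites present in both sequences."""
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in MISSING or b in MISSING:
            continue  # skip this site for this pair only
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance undefined when all sites differ")
    return -math.log(1.0 - p)


def bootstrap_columns(alignment, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Toy alignment, invented for illustration.
toy = {"seq1": "MKVLAG", "seq2": "MKVIAG", "seq3": "MK-IAG"}

d12 = poisson_distance(toy["seq1"], toy["seq2"])
replicate = bootstrap_columns(toy, random.Random(0))
print(round(d12, 4))
```

In a full analysis, a distance matrix would be computed for each of the 1000 pseudoreplicates, a tree built from each matrix, and the support for a clade reported as the fraction of replicate trees that contain it.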
Tests of purifying selection were conducted using the Nei-Gojobori method in MEGA4 (Nei & Gojobori, 1986). All positions containing alignment gaps and missing data were eliminated only in pairwise sequence comparisons (pairwise deletion option).
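The quantities behind this test can be sketched in Python as follows. The code estimates the number of synonymous (dS) and nonsynonymous (dN) substitutions per site for one pair of aligned coding sequences in the manner of Nei & Gojobori (1986), with purifying selection indicated when dN is smaller than dS. It is a simplified illustration rather than MEGA's implementation: it assumes the standard genetic code, counts mutations to stop codons as nonsynonymous when tallying sites, skips codons containing gaps or ambiguous bases for the pair being compared, and does not compute the variance needed for a formal significance test. All function names and the example sequences are hypothetical.

```python
import math
from itertools import permutations

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                total += 1 / 3
    return total


def codon_differences(c1, c2):
    """Average (synonymous, nonsynonymous) changes over all mutational paths."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    if not diff:
        return 0.0, 0.0
    syn = non = 0.0
    paths = 0
    for order in permutations(diff):
        cur, s, n, valid = c1, 0, 0, True
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODE[nxt] == "*":  # discard paths that pass through a stop codon
                valid = False
                break
            if CODE[nxt] == CODE[cur]:
                s += 1
            else:
                n += 1
            cur = nxt
        if valid:
            syn, non, paths = syn + s, non + n, paths + 1
    if paths == 0:
        raise ValueError("no stop-free path between %s and %s" % (c1, c2))
    return syn / paths, non / paths


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return -0.75 * math.log(1 - 4 * p / 3)


def nei_gojobori(seq1, seq2):
    """Return (dN, dS) for two aligned, in-frame coding sequences."""
    s_sites = s_diff = n_diff = 0.0
    codons = 0
    for i in range(0, min(len(seq1), len(seq2)) - 2, 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if c1 not in CODE or c2 not in CODE:
            continue  # gap or ambiguous base: skip this codon for this pair
        s, n = codon_differences(c1, c2)
        s_diff += s
        n_diff += n
        s_sites += (syn_sites(c1) + syn_sites(c2)) / 2
        codons += 1
    n_sites = 3 * codons - s_sites
    return jukes_cantor(n_diff / n_sites), jukes_cantor(s_diff / s_sites)


# Invented sequences: one synonymous and one nonsynonymous difference.
seq_a = "GCTGGTGTTCCTACTGCTGGTGTT"
seq_b = "GCCGCTGTTCCTACTGCTGGTGTT"
dn, ds = nei_gojobori(seq_a, seq_b)
print(round(dn, 4), round(ds, 4))
```

For this toy pair dN is smaller than dS, the pattern expected under purifying selection; a formal test would additionally require a variance estimate for the difference between the two values.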
We thank the International Aphid Genomics Consortium and the Baylor College of Medicine Human Genome Sequencing Centre for making the A. pisum genome sequences publicly available prior to publication. Part of this work was funded by USDA grant 2005-35604-15446.