In all examined species of the family Aphididae, the bacterial endosymbiont Buchnera aphidicola carries a plasmid encoding the genes leuABCD (involved in leucine biosynthesis) along with repA1, repA2 and ORF1. The gene organisation of the leucine plasmids was conserved, except in Buchnera isolated from Pterocomma populeum, where ORF1 was located in a different position. An inverted repeat (LIR1) located between repA2 and leuA is found in all of the Buchnera leucine plasmids examined. The predicted secondary structure of the LIR1 transcript conforms to a long hairpin loop, suggesting an involvement in transcription termination or messenger stability. Phylogenetic reconstruction based on repA2 sequences suggests that horizontal transfer of Buchnera leucine plasmids has not occurred.
Aphids are plant sap-feeding insects that maintain an endosymbiotic association with Buchnera aphidicola, a bacterium closely related to Escherichia coli. The association is considered obligate for both partners: Buchnera cannot be cultured outside the aphid host, whereas aphids treated with antibiotics grow slowly and are unable to reproduce. Phylogenetic analyses of B. aphidicola 16S rDNA sequences, together with host morphology, have shown that the association is the result of a single infection of an aphid ancestor by a bacterium about 200–250 million years ago. Host and symbiont lineages have subsequently diverged strictly in parallel. The association is believed to have a nutritional basis. It has been shown that Buchnera provides its host with essential amino acids, nutrients required by aphids but in short supply in their plant phloem sap diet. More recently, genetic studies of Buchnera from several aphid species have provided evidence for a bacterial capacity to overproduce the essential amino acids tryptophan and leucine. Genes encoding rate-limiting enzymes in each of the biosynthetic pathways were found to be amplified and relocated to plasmids [4, 5]. In the case of tryptophan, the genes for anthranilate synthase (trpEG) are carried in several copies, arranged as tandem repeats on a low-copy-number plasmid in Buchnera from species of the family Aphididae. The remaining genes of the pathway are chromosomal and present in a single copy [4, 6]. In the case of leucine, the genes (leuABCD) that encode the three main enzymes of the pathway were found to be carried on a 7.8-kb plasmid (pRPE) in Buchnera from the aphid Rhopalosiphum padi, a species also belonging to the Aphididae. The plasmid contained two additional genes, repA1 and repA2, encoding putative plasmid replication initiation proteins, and an open reading frame (ORF1) of as yet unknown function.
Recently, van Ham et al. revealed the presence of a leucine plasmid (pBTs1) in Buchnera from the distantly related aphid Thelaxes suberi (Thelaxidae). This plasmid contained all of the genes found on pRPE plus an additional open reading frame encoding a small heat shock protein, but further differed from pRPE in the gene order of both the leucine operon and the two copies of repA. Finally, Buchnera from the aphid Tetraneura caerulescens, belonging to the divergent family Pemphigidae, was found to harbour a small plasmid (pBTc1) that contained only one copy of repA plus ORF1. Based on the differences in organisation and the phylogenetic distribution of the plasmids, van Ham et al. proposed independent evolutionary origins for the two leucine plasmids from an ancestral replicon resembling pBTc1. In the present paper, we extend the study of the leucine plasmids to representatives of the three tribes of the family Aphididae in order to assess their phylogenetic relationships and to identify candidate regulatory signals for gene expression.
2 Materials and methods
2.1 Nomenclature and aphids
B. aphidicola strains from different species of aphids are designated by B followed by the name of the aphid host, e.g. B(Pterocomma populeum) refers to B. aphidicola from the aphid Pterocomma populeum. The following species were obtained from Dr. J.C. Simon: Aulacorthum solani, Diuraphis noxia, Metopolophium dirhodum, Rhopalosiphum cerasifoliae and Rhopalosiphum insertum. Acyrthosiphon pisum was obtained from Dr. Y. Rahbé, and the rest were collected in the field.
DNA was extracted following Latorre et al. The leucine plasmid of B(P. populeum) was cloned by pUC18/HindIII shotgun cloning of an mtDNA preparation of the host aphid. Two clones, pp29 and pp51, cover the entire plasmid.
The following primers from a previously designed structural PCR assay for the Buchnera leucine gene cluster were used. For inverse long PCR: leuA.dl3 and leuA.du2. For amplification of the region from ORF1 to leuA: ORF1.up2 (5′-GTWATGGTWATGTTTTCWGGWTA-3′), repA.lo1 (5′-TCATWGCACAWGCWCKWTG-3′), RepAp1 (5′-GTATCAGAAATAGACGTTGC-3′) and leuA.dl3. PCR was performed with a GeneAmp PCR System 2400 thermal cycler (Perkin Elmer) using standard cycling conditions. PCR conditions for inverse long PCR were as described previously. PCR products were either sequenced directly or first cloned in T-pBluescript II SK+/−. The sequence of B(Pterocomma populeum) was obtained from the clones pp29 and pp51. Nucleotide sequencing was performed with the AmpliTaqF Dye Deoxy Terminator Cycle Sequencing kit (Perkin Elmer), using the PCR primers described above, internal primers, or T7 and T3 primers, and was carried out with an ABI 373 automated DNA sequencer. The accession number of the B(R. padi) leucine plasmid is X71612. Sequence data from this article have been deposited in the EMBL Data Library under accession numbers AJ006872–AJ006881.
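Two of the primers listed above contain IUPAC ambiguity codes (W = A or T; K = G or T), so each represents a pool of oligonucleotides. The following Python sketch is an illustration only, not part of the original methods: it counts and enumerates the concrete sequences implied by each primer. The function names `degeneracy` and `expand` are hypothetical helpers introduced here; the primer sequences are those given in the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "ORF1.up2": "GTWATGGTWATGTTTTCWGGWTA",
    "repA.lo1": "TCATWGCACAWGCWCKWTG",
    "RepAp1": "GTATCAGAAATAGACGTTGC",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Under these definitions, a primer with four W positions corresponds to 2^4 = 16 sequences, and each additional two-fold position doubles the pool.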
2.3 Computer and phylogenetic analyses
Computer analyses were performed with the Genetics Computer Group (GCG) program package 8.1 for the VAX/VMS. Amino acid and nucleotide sequence alignments were obtained with the CLUSTAL W program. An alignment of the deduced amino acid sequences from repA genes was used to obtain the corresponding nucleotide sequence alignment, which served for the estimation of the nucleotide distances with the method of Tajima and Nei. The phylogenetic tree was obtained using the neighbour-joining method. The significance of the nodes was determined by 500 bootstrap replicates. All these analyses were performed with the program TREECONW.
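As an illustration of the distance step, the sketch below computes a pairwise Tajima and Nei distance from two aligned nucleotide sequences, using the commonly cited form d = −b ln(1 − p/b), with b = ½[1 − Σqᵢ² + p²/h] and h = Σᵢ<ⱼ xᵢⱼ²/(2qᵢqⱼ). It is a minimal sketch and not the TREECONW implementation; the function name and the decision to skip alignment columns containing gaps or ambiguous bases are assumptions made here.

```python
import math
from itertools import combinations

BASES = "ACGT"


def tajima_nei_distance(seq1, seq2):
    """Tajima-Nei distance between two aligned nucleotide sequences.

    Columns in which either sequence has a gap or ambiguous base are
    skipped (an assumption of this sketch).
    """
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in BASES and b in BASES]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")

    # p: proportion of sites that differ.
    p = sum(a != b for a, b in pairs) / n
    if p == 0:
        return 0.0

    # q[x]: average frequency of base x over both sequences.
    q = {x: sum((a == x) + (b == x) for a, b in pairs) / (2 * n)
         for x in BASES}

    # h: sum over unordered base pairs of x_ij^2 / (2 q_i q_j), where
    # x_ij is the fraction of sites showing the pair {i, j}.
    h = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = sum((a == i and b == j) or (a == j and b == i)
                   for a, b in pairs) / n
        if x_ij > 0:
            h += x_ij ** 2 / (2 * q[i] * q[j])

    b = 0.5 * (1 - sum(v * v for v in q.values()) + p * p / h)
    if p >= b:
        raise ValueError("distance undefined: sequences too divergent")
    return -b * math.log(1 - p / b)
```

With equal base frequencies and equally frequent substitution types, b reduces to 3/4 and the estimate coincides with the Jukes-Cantor distance. A matrix of such pairwise values is the input that a neighbour-joining routine would use.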
3.1 Structure of the leucine plasmid in the family Aphididae
A structural PCR assay for the rapid screening and characterisation of the leucine gene cluster in diverse lineages of Buchnera was designed previously. The inverse long PCR from this assay, performed with degenerate, outwardly oriented primers complementary to leuA, demonstrated that a plasmid of about 7.5 kb was ubiquitous among Buchnera from aphids of the two main tribes (Aphidini and Macrosiphini) of the family Aphididae (Acyrthosiphon pisum, Aphis fabae, Aphis sambuci, Cavariella aegopodii, Dysaphis plantaginea, Hyalopterus pruni, Hyperomyzus lactucae, Macrosiphoniella helichrysi, Macrosiphum rosae, Myzus persicae, Schizaphis graminum, Sitobion avenae, Toxoptera aurantii and Uroleucon sonchi) (data not shown). Further structural information on these plasmids was obtained by regular PCR using a primer complementary to leuA in combination with primers specific for either ORF1 or repA2. The sizes of the products obtained (data not shown) were fully in accordance with the expectation based on the configuration of the completely sequenced plasmid from B(R. padi) (Fig. 1), suggesting a conserved organisation of this region of the plasmids.
The aphid Pterocomma populeum belongs to the third and evolutionarily least advanced tribe (Pterocommatini) of those that constitute the family Aphididae. Most taxonomic schemes presume that the divergence of this tribe predates the split between Aphidini and Macrosiphini. The potential implications of these phylogenetic relationships for the origin of the leucine plasmids prompted us to investigate the presence of a leucine plasmid in B(P. populeum). The clones pp29 and pp51 carried insert sequences homologous to repA2 and leuA. Subsequent Southern hybridisations, using cloned pRPE as a probe, showed that the two inserts together cover a plasmid of about 7.5 kb. A major part of the plasmid was sequenced and its organisation was compared to that of pRPE (Fig. 1). The two plasmids had an identical gene content, but showed a single difference in gene order: ORF1, which encodes a putative integral membrane protein, occurs immediately downstream of leuD in B(P. populeum), whereas it is located between the two copies of repA on pRPE. In order to assess which of the two plasmid configurations was present in Buchnera from other aphids of the family, we completely sequenced the region from ORF1 to leuA in strains of nine species from the subfamily Aphidinae. In all species analysed, the order and orientation of the genes were as observed for pRPE from B(R. padi) (Fig. 1).
3.2 Intergenic sequence alignment
The intergenic sequences between ORF1, repA2 and leuA were compared among species in a search for potentially conserved regions that might be involved in the regulation of gene expression. The intergenic region between repA2 and leuA contained an almost perfect long inverted repeat (LIR1) of about 50 bp in each of the sequences. The stop codon of repA2 was invariably located at the beginning of this inverted repeat. Putative Shine-Dalgarno sequences and the start codon of leuA were always located outside the inverted repeat and separated from it by a stretch of 4–31 bp, depending on the species. A multiple alignment of the sequences revealed that only the nucleotide sequence of the central part of the repeat unit was conserved among species (Fig. 2A). The function of this inverted repeat is unknown, but if an mRNA were transcribed from this region, it would have the potential to form a very large hairpin, around 100 nt long. Predicted free energy values of the different hairpins range from −49.8 to −65.1 kcal mol⁻¹ (Fig. 2B shows the B(P. populeum) hairpin as an example). In addition to this long inverted repeat, the B(P. populeum) plasmid contains a second long inverted repeat (LIR2; putative hairpin with a free energy value of −44.2 kcal mol⁻¹), which is located 167 bp downstream of the stop codon of ORF1 and immediately before the putative origin of replication (Figs. 1 and 2B). As on pRPE, a third, relatively short inverted repeat (SIR), which resembles a rho-independent terminator of transcription, was found immediately downstream of leuD.
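To make the notion of an inverted repeat concrete, the following Python sketch scans a DNA sequence for the longest perfect inverted repeat, that is, two arms that are reverse complements of each other separated by a short loop. It is a simplified illustration: the repeats described above are only almost perfect, and the sketch does not estimate free energies, which would require a thermodynamic folding program. The function name, the parameters `min_arm` and `max_loop`, and the toy sequence are all assumptions introduced here.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def longest_inverted_repeat(seq, min_arm=4, max_loop=10):
    """Find the longest perfect inverted repeat in a DNA sequence.

    Returns (start, arm_length, loop_length), where start is the 0-based
    index of the first base of the left arm, or None if no repeat with
    arms of at least min_arm bases is found.
    """
    seq = seq.upper()
    n = len(seq)
    best = None
    for left_end in range(n):              # last base of the left arm
        for loop in range(max_loop + 1):   # unpaired bases between the arms
            i, j = left_end, left_end + loop + 1
            arm = 0
            # Extend outwards while the two positions are complementary.
            while i >= 0 and j < n and COMPLEMENT.get(seq[i]) == seq[j]:
                arm += 1
                i -= 1
                j += 1
            if arm >= min_arm and (best is None or arm > best[1]):
                best = (i + 1, arm, loop)
    return best


if __name__ == "__main__":
    # Toy sequence (hypothetical): 8-nt arms flanking a 4-nt loop.
    toy = "CC" + "GATTACAG" + "TTTT" + "CTGTAATC" + "CC"
    print(longest_inverted_repeat(toy))
```

Tolerating mismatches in the arms, as would be needed for an almost perfect repeat such as LIR1, would require relaxing the extension condition in the inner loop.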
The comparative analysis of the intergenic region between ORF1 and repA2 did not reveal regions significantly conserved among all species. However, analyses restricted to Buchnera from either the Aphidini or the Macrosiphini revealed that sequences from the Aphidini share two conserved elements, TTGAAA and TWWWWWW (Fig. 3), closely resembling the −35 and −10 boxes of E. coli and Bacillus subtilis promoters. A search for these elements in the entire B(R. padi) plasmid yielded only two further occurrences, one located 15 bp upstream of ORF1 and, on the opposite strand, one located 156 bp upstream of repA1. A similar analysis of the B(P. populeum) sequence revealed one such element 68 bp upstream of ORF1 (Fig. 1).
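A search of this kind can be sketched with a regular expression. The Python example below looks for the −35-like element TTGAAA followed by the −10-like element TWWWWWW (W = A or T) on both strands of a sequence. The 16–19 nt spacing between the two boxes is taken from the Discussion; the function names and the toy sequence are hypothetical, and the sketch is not the search procedure actually used in this study.

```python
import re

# -35-like box, a 16-19 nt spacer, then the -10-like box TWWWWWW.
# The lookahead lets matches that overlap each other be reported.
PROMOTER = re.compile(r"(?=(TTGAAA[ACGT]{16,19}T[AT]{6}))")


def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_promoter_like(seq):
    """Return (strand, start, match) for each promoter-like element.

    For the "-" strand, start is a 0-based index into the reverse
    complement of seq, not into seq itself.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", revcomp(seq))):
        for m in PROMOTER.finditer(s):
            hits.append((strand, m.start(1), m.group(1)))
    return hits


if __name__ == "__main__":
    # Toy sequence (hypothetical) with a 17-nt spacer between the boxes.
    toy = "GGCC" + "TTGAAA" + "C" * 17 + "TATAAAT" + "GGCC"
    for hit in find_promoter_like(toy):
        print(hit)
```

In AT-rich DNA such as that of Buchnera, the −10-like pattern alone matches very often, which is why the sketch requires both boxes at the stated spacing.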
3.3 Phylogenetic reconstruction of the endosymbiont lineage
Phylogenetic relationships among Buchnera of the Aphididae were estimated from the sequences of the repA2 genes sampled in this and previous studies. repA1 sequences from B(P. populeum) and B(R. padi) were used as outgroups. The repA2 phylogeny (Fig. 4) is congruent with the classical taxonomic divisions of the family into two subfamilies (Pterocommatinae and Aphidinae), and three tribes (Pterocommatini, Macrosiphini and Aphidini). At a lower taxonomic level, relationships are relatively well resolved, particularly within the Aphidini. This analysis suggests that repA, which is the most variable of the genes commonly found on Buchnera plasmids, represents an adequate tool for phylogenetic studies of endosymbiont lineages within the family Aphididae.
Gene amplification of the leucine biosynthetic genes through their transfer to plasmids in Buchnera reflects a capacity of the endosymbiont to overproduce leucine for its aphid host [5, 7]. Here, we have shown that these plasmids are ubiquitous among Buchnera from aphids of the family Aphididae, which may further illustrate the importance of bacterial essential amino acid provisioning to the symbiotic association. Moreover, the finding of a leucine plasmid in Buchnera from the aphid P. populeum, a species representing the most basal lineage of the family, suggests that the origin of these plasmids predates the diversification of the Aphididae. The main difference between the plasmids from B(P. populeum) and those from all other Buchnera of the Aphididae lies in the position of ORF1. Whereas, in the absence of clear termination signals, it is possible that transcriptional coupling of ORF1 and repA2 occurs on the latter group of plasmids, the configuration found in B(P. populeum) precludes such a possibility.
A conspicuous feature of leucine plasmids is the presence of a long inverted repeat (LIR1; Figs. 1 and 2) that encompasses nearly the entire intergenic spacer between repA2 and leuA. The conservation of this inverted repeat among all plasmids studied suggests that it is subject to strong functional constraints. Although its exact role is unknown, it might be involved in the control of expression of the leucine operon through antitermination of transcription or stabilisation of mRNA. Both 5′- and 3′-terminal stem-loop structures can stabilise mRNA [17, 18]. The potential function of an additional long inverted repeat, which was only detected on the B(P. populeum) plasmid (LIR2; Figs. 1 and 2B), is also unknown. However, the short inverted repeat followed by a stretch of Ts found at the 3′-end of the leucine operon in both B(P. populeum) and B(R. padi) probably constitutes a rho-independent terminator of transcription. No such elements have been found elsewhere on either of these plasmids.
Identification of potential promoter regions in Buchnera is often hampered by the high AT content of its DNA. However, our comparative sequence analysis of intergenic spacers of the leucine plasmids revealed the presence of a well-conserved region upstream of repA2 (Fig. 3) that bears similarity to the −35 and −10 sequences of eubacterial σ70 promoters. Similar potential −35 boxes in conjunction with a −10 box at a spacing of 16–19 nt have only two further occurrences in the B(R. padi) sequence, both located, in opposite orientations, between repA1 and ORF1. Similar −35 box sequences were previously proposed for the Buchnera 16S and 23S rRNA operons, and the Buchnera trp genes. Evidently, the growing body of inferences on potential Buchnera promoter sequences now awaits verification through the experimental determination of transcription initiation sites.
The repA2 phylogeny and the clustering of the two repA1 sequences as an outgroup clearly indicate that the corresponding genes are paralogous and that the gene duplication must have occurred prior to the diversification of Buchnera from the Aphididae. The congruence of the tree with taxonomic subdivisions of aphids further supports the absence of horizontal transfer of Buchnera or its plasmids and justifies the use of endosymbiont phylogenies to resolve aphid host phylogenetic relationships.
We are indebted to Dr F. González-Candelas and Dr A. Moya and the Servei de Bioinformàtica (Universitat de València) for providing computing facilities and support; to Dr J.C. Simon and Dr Y. Rahbé for providing insect samples; to the S.C.S.I.E. (Universitat de València) for sequencing facilities. This work has been supported by Grant PB96-0793 C04-01 from DGES.