The yellow fever mosquito Aedes aegypti is an important human health pest that vectors yellow fever and dengue viruses. Olfaction plays a crucial role in its attraction to hosts and, although the molecular basis of this is not well understood, it is likely that odorant-binding proteins (OBPs) are involved in the first step of molecular recognition. Using the OBPs of Drosophila melanogaster and Anopheles gambiae, we have defined sequence motifs based on the conserved cysteines of OBPs and developed an algorithm that has allowed us to identify 66 genes encoding putative OBPs from the genome sequence and expressed sequence tags (ESTs) of Ae. aegypti. We have also identified 11 new OBP genes for An. gambiae. We have examined all of the corresponding peptide sequences for the properties of OBPs. The predicted molecular weights fall within the expected range but the predicted isoelectric points are spread over a wider range than found previously. Comparative analyses of the 66 OBP sequences of Ae. aegypti with those of other dipteran species reveal some mosquito-specific genes as well as conserved homologues. The genomic organisation of Ae. aegypti OBPs suggests that a rapid expansion of OBPs has occurred, probably by gene duplication. The analyses of OBP-containing regions for microsynteny indicate a very high synteny between Ae. aegypti and An. gambiae.
It is well established that insects use a wide range of chemical signals or semiochemicals such as pheromones, plant volatiles or animal odours to detect each other and to locate suitable plant or animal hosts (Zwiebel & Takken, 2004). These semiochemicals are small hydrophobic molecules which enter the antennae and other sensory organs via pores and pass across the hydrophilic sensillum lymph surrounding the olfactory neuronal dendrites. There is evidence that this passage is facilitated by odorant-binding proteins (OBPs), including the pheromone-binding proteins (PBPs) and the so-called general odorant-binding proteins (GOBPs), which solubilize and transport the hydrophobic molecules to the insect olfactory receptors (ORs) (Xu et al., 2005; Syed et al., 2006). In the Lepidoptera it has been suggested that the two families of OBP genes originated from a common ancestral gene by gene duplication (Vogt et al., 2002).
OBPs are small (15–20 kDa, ca. 120–150 amino acids), water soluble, globular proteins with a signal peptide. They were first reported in the Lepidoptera (Vogt & Riddiford, 1981), but now many studies have described such proteins and their associated genes in a wide range of insect species. OBPs are highly concentrated in the lymph of chemosensilla (up to 10 mM) (Vogt et al., 1985; Klein, 1987) and many have been shown to bind pheromones and other odorants (for review see Pelosi et al., 2006), supporting a role in insect molecular recognition. This is further supported by the finding that most OBPs from Lepidoptera as well as some from Drosophila melanogaster are expressed specifically in the antennae (Shanbhag et al., 2001; also reviews Steinbrecht, 1998; Vogt et al., 2002; Leal, 2003; Pelosi et al., 2006). Furthermore, cells expressing particular ORs have been shown to be closely associated with cells expressing the corresponding PBPs (Steinbrecht, 1998; Krieger et al., 2005). It has been reported that the OBP LUSH of D. melanogaster is required for pheromone detection (Xu et al., 2005) and recently, Syed et al. (2006) demonstrated the involvement of a Bombyx mori PBP (BmPBP1) in facilitating pheromone activation of the corresponding OR. Although it is not clear exactly how mosquitoes locate their hosts at the molecular level and hence transmit diseases to humans, there is some evidence for an involvement of OBPs in odour recognition in Anopheles gambiae (Justice et al., 2003; Li et al., 2005).
With the publication of the genome sequences of D. melanogaster (Adams et al., 2000), An. gambiae (Holt et al., 2002), Apis mellifera (The Honeybee Genome Sequencing Consortium, 2006) and Aedes aegypti (Nene et al., 2007), it has become possible to look at how many genes encoding putative OBPs are present. There are 51 in D. melanogaster (Galindo & Smith, 2001; Graham & Davies, 2002; Hekmat-Scafe et al., 2002); for An. gambiae there are three independent genome annotations, reporting up to 57 (Vogt, 2002; Xu et al., 2003; Zhou et al., 2004); and a recent report suggests 21 in the honeybee A. mellifera (Forêt & Maleszka, 2006). A genome-wide analysis of chemosensory proteins (CSPs), another subfamily of putative OBPs (Pelosi et al., 2006), from many species has also been reported (Zhou et al., 2006). However, for Ae. aegypti only one OBP (Aaeg-OBP10) has been found by conventional molecular techniques (Bohbot & Vogt, 2005), and no genome-wide analysis has been carried out. We have collated the insect OBP genes identified previously and aligned them into their distinct subgroups according to the number of, and spacing between, the conserved cysteines. This has allowed us to develop a motif search algorithm (MotifSearch) to search insect genomes and expressed sequence tags (ESTs) for all predicted transcript sequences that contain an OBP motif. This search found 66 genes encoding putative OBPs in Ae. aegypti and an additional 11 OBPs in An. gambiae. We have identified four Ae. aegypti-specific OBPs and three groups of OBPs that are present in all three dipteran species, i.e. Ae. aegypti, An. gambiae and D. melanogaster. We have examined the genomic organisation of the genes and can speculate on their possible evolutionary origins.
Results and discussion
Determination of motifs for the identification of genes encoding putative OBPs
An alignment of 33 D. melanogaster and 29 An. gambiae Classic OBPs showed that they all fitted the motif C1-X15–39-C2-X3-C3-X21–44-C4-X7–12-C5-X8-C6 and we designated this as the ‘Classic Motif’. Similarly, alignment of the 16 previously annotated Atypical OBPs in An. gambiae (Xu et al., 2003; Zhou et al., 2004) produced an ‘Atypical Motif’ of C1-X26–27-C2-X3-C3-X36–38-C4-X11–15-C5-X8-C6, which has a less flexible spacing between C3 and C4. A ‘Plus-C Motif’ of C1-X8–41-C2-X3-C3-X39–47-C4-X17–29-C4a-X9-C5-X8-C6-P-X9–11-C6a was derived from the previously reported Plus-C OBP sequences of An. gambiae (Xu et al., 2003), D. melanogaster and D. pseudoobscura (Zhou et al., 2004). In this motif the number of amino acids between C4 and C5 is increased from 7–12 to 27–39, and there are two additional conserved cysteines (C4a, C6a) and a highly conserved proline immediately after C6. These motifs were then used in an algorithm, ‘MotifSearch’, that expands each motif into all of its possible spacing combinations and uses each combination to search peptide sequences for the occurrence of the conserved cysteines and the spacing between them. It then retrieves the motif-containing fragments and their sequence IDs and finally compares them with known OBPs using Blastp with the scoring matrix BLOSUM62 (Fig. 1).
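The cysteine-spacing search can be illustrated with a minimal sketch. It is not the authors' MotifSearch implementation: instead of enumerating every spacing combination, it encodes each motif as a regular expression with bounded repeats, which should accept the same spacings. It also assumes that X stands for any residue (including cysteine), and the names `MOTIFS` and `find_motifs` are hypothetical.

```python
import re

# Cysteine-spacing motifs transcribed from the text as regular expressions.
# Assumption: "X" is any residue, so "." is used; ".{15,39}" means 15 to 39 residues.
MOTIFS = {
    "Classic": re.compile(r"C.{15,39}C.{3}C.{21,44}C.{7,12}C.{8}C"),
    "Atypical": re.compile(r"C.{26,27}C.{3}C.{36,38}C.{11,15}C.{8}C"),
    "Plus-C": re.compile(r"C.{8,41}C.{3}C.{39,47}C.{17,29}C.{9}C.{8}CP.{9,11}C"),
}


def find_motifs(peptide):
    """Return (motif name, start index, fragment) for the first hit of each motif."""
    hits = []
    for name, pattern in MOTIFS.items():
        match = pattern.search(peptide.upper())
        if match:
            hits.append((name, match.start(), match.group(0)))
    return hits


# Synthetic peptide (not a real OBP) with Classic spacing: gaps of 20, 3, 30, 10 and 8.
example = "M" + "C" + "A" * 20 + "C" + "A" * 3 + "C" + "A" * 30 + "C" + "A" * 10 + "C" + "A" * 8 + "C" + "K"
for name, start, fragment in find_motifs(example):
    print(name, start, len(fragment))
```

In a pipeline of the kind described, the retrieved fragments would then be compared with known OBPs by Blastp; that step is outside this sketch.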
Identification of OBPs using MotifSearch
The MotifSearch algorithm was used to search peptide sequences from the Ae. aegypti genome data. To verify the use of MotifSearch, we also analysed the genomes of An. gambiae and D. melanogaster. Five previously reported D. melanogaster OBPs (OBP8a, OBP44a, OBP99c, OBP99d and OBP59a) and three An. gambiae OBPs (AgamOBP16, AgamOBP22 and AgamOBP42) have no OBP motif as defined by our algorithm because they lack one or more of the conserved cysteines; some have shorter 5′- or 3′-terminals and so are missing the first cysteine, or the fifth and sixth cysteines, respectively. However, some of them have a significant similarity to other OBPs, with E-values from 10⁻² to 10⁻⁹⁰ (data not shown), so they were included in our data analysis.
For An. gambiae we found 33 Classic OBPs, 19 Plus-C OBPs and 14 Atypical OBPs (Table 1). Of the 33 Classic OBPs, four had not been annotated previously and we have named these new OBPs AgamOBP65, AgamOBP66, AgamOBP67 and AgamOBP68 (for an alignment with other Ae. aegypti Classic OBPs see Supplementary Material Fig. S1). We also identified seven new Plus-C OBPs and named them AgamOBP58, AgamOBP59, AgamOBP60, AgamOBP61, AgamOBP62, AgamOBP63 and AgamOBP64. For D. melanogaster, MotifSearch found 49 putative OBPs; 37 had the Classic Motif and all of these had been annotated previously as OBPs. The Plus-C Motif found all of the 12 known Plus-C OBPs and there were no peptide sequences with the Atypical Motif, in agreement with the previous comparison between D. melanogaster and An. gambiae (Xu et al., 2003). The total of 49 peptide sequences in the D. melanogaster genome is lower than the 51 reported previously, which included ‘C-minus OBPs’ (Hekmat-Scafe et al., 2002) that do not have any sequence features of the Classic OBPs.
Table 1. Odorant-binding proteins (OBPs) identified from insect genomes using MotifSearch and Blastp
The number of peptide sequences including alternately spliced transcripts downloaded from genome databases and searched with MotifSearch.
The numbers in parentheses indicate the number of OBPs reported previously.
Two dimers (DmelOBP83cd and DmelOBP83ef) are included.
For Ae. aegypti, we identified 34 Classic OBPs, 17 Plus-C OBPs and 15 Atypical OBPs (Table 1). Their identities and names are listed in Table 2 and the alignments are presented in Supplementary Material Figs S1 (Classic OBPs), S2 (Atypical OBPs) and S3 (Plus-C OBPs). The previously reported Ae. aegypti OBP (Aaeg-OBP10) (Bohbot & Vogt, 2005) was identified by our algorithm and named as AaegOBP10 (Table 2). Our finding of Atypical OBPs in Ae. aegypti, along with the previous reports in An. gambiae but not D. melanogaster (Xu et al., 2003; Li et al., 2005) suggests that this type of OBP may be unique to mosquito species.
Table 2. List of identified Aedes aegypti odorant-binding proteins (OBPs) and comparison to OBPs of Anopheles gambiae and Drosophila melanogaster
We did not find any irregular transcript lengths among the Classic OBPs of Ae. aegypti when we compared each sequence to OBPs of An. gambiae and D. melanogaster (Supplementary Material Fig. S1). However, some of the Ae. aegypti transcripts encoding Atypical and Plus-C OBPs predict proteins that differ from other OBPs. For example, transcript AAEL009599 (AaegOBP41), an Atypical OBP, has a predicted protein with an extra 29 amino acid residues after C6 (Supplementary Material Fig. S2) and we were able to manually assemble a contig from seven ESTs in NCBI dbEST using the ContigExpress programme of Vector NTI software (InforMax, Inc., Frederick, MD) to confirm the integrity of the annotated sequence. Another transcript, AAEL010718 (AaegOBP44), an Atypical OBP, has a shorter 3′-terminal (∼60 amino acids) but nine ESTs allowed us to extend this terminal to obtain a full-length transcript named EST010718N (AaegOBP44B). Two transcripts of Plus-C OBPs, AAEL006106 (AaegOBP26) and AAEL006109 (AaegOBP23), have very long 5′-terminals (∼106 amino acids) (Supplementary Material Fig. S3) and they encode the same protein, which corresponds to seven ESTs (EB101959, DV260479, DV408951, DW217440, EB101753, BQ789643 and BQ789654). The assembly of these seven ESTs has a shorter 5′-end without changing the rest of the coding region and contains a Plus-C motif and a signal peptide; we named it EST006106N (AaegOBP23B). The transcript AAEL000139 (AaegOBP5) has a very long 3′-terminal (Supplementary Material Fig. S3) and in this case there is no matching EST. It has a homologue (AgamOBP58) in An. gambiae (similarity of 41.3% and identity of 30.5%), which again has no EST.
Comparison of Ae. aegypti OBPs with OBPs from D. melanogaster and An. gambiae
The identified Classic OBPs of Ae. aegypti have an overall amino acid sequence similarity of 21.7%, which is higher than the 13.3 and 16.6% found for D. melanogaster and An. gambiae Classic OBPs, respectively. The overall 25.5% similarity of the Plus-C OBPs of Ae. aegypti is much higher than the 9.4% for An. gambiae and slightly higher than the 22.1% for D. melanogaster. The similarity of Ae. aegypti Atypical OBPs is 36.2%, which is comparable with 38.3% for An. gambiae. When the Classic OBPs for all three dipteran species are compared, the overall amino acid identity is less than 5%. Most of the 66 Ae. aegypti OBPs have a homologue in An. gambiae with E-values better than 10⁻²² (Table 2) and similarities ranging from 16% (AaegOBP48 and AgamOBP46) to 63% (AaegOBP2 and AgamOBP6). However, there are four Classic OBPs, AaegOBP17, AaegOBP18, AaegOBP19 and AaegOBP64, specific to Ae. aegypti (E-values worse than 10⁻⁷ when compared to An. gambiae OBPs and worse than 10⁻³ compared to D. melanogaster OBPs). These have mature proteins of molecular weights 13.5, 13.7, 13.7 and 14.1 kDa and pIs of 4.7, 5.4, 5.4 and 5.1, respectively. It is likely that these OBP genes evolved after the divergence of the two mosquito species Ae. aegypti and An. gambiae ∼150 million years ago (Mya; Krzywinski et al., 2006), possibly by gene duplication events.
Sequence comparisons (Table 2) and phylogenetic analyses (Fig. 3) of OBPs of the three dipteran species also revealed nine Ae. aegypti OBPs which have homologues in both An. gambiae and D. melanogaster. These OBPs can be clustered into three groups that we have named after the D. melanogaster OBP in the group. Thus there is an OS-E/OS-F group, a LUSH group and an OBP19a group (Fig. 2). The OS-E/OS-F homologues of An. gambiae have been reported previously (Vogt, 2002). The similarities and identities of the mature proteins are 89.8 and 34.3%, respectively, for the OS-E/OS-F group, 85.6 and 28.0% for the LUSH group and 83.5 and 15.8% for the OBP19a group. These values are much higher than the overall similarity and identity of insect OBPs. It is therefore likely that these OBPs have functional roles common to dipteran insects and may have evolved from an ancestral gene before the divergence of mosquitoes and fruit flies ∼250 Mya (Gaunt & Miles, 2002). Alignment of the OS-E/OS-F OBPs of the three dipteran species (Fig. 2A) reveals a highly conserved region, LKCYMNC, around the second and third conserved cysteines and there are other regions with highly conserved residues. Similarly, there are residues conserved in the other two groups of OBPs (Fig. 2B–D). The finding of conserved amino acids in the OBPs of all three dipteran species suggests that these regions may have an important function in the proteins.
Genomic organisation of OBP genes
There is no detailed genomic map for Ae. aegypti, so the genomic organisation of the OBP genes was estimated from their positions on supercontigs, obtained from the NCBI GenBank and the genome project (AaegL1.41) using the sequence ID retrieved by MotifSearch (Fig. 1) and visualized using MGAlign (Fig. 4). Many of the OBP genes are clustered together, with eight pairs being less than 1 kb apart (AaegOBP3 and AaegOBP4, AaegOBP17 and AaegOBP18, AaegOBP49 and AaegOBP50, AaegOBP52 and AaegOBP53, AaegOBP62 and AaegOBP63, AaegOBP42 and AaegOBP43, AaegOBP32 and AaegOBP33, AaegOBP40 and AaegOBP41). For each of the three OBP subgroups there is a large cluster of genes. Nine Classic OBPs are clustered on supercontig 1.61 within a 169 406 bp region (Fig. 4A), there are eight Plus-C OBP genes on supercontig 1.584 within a 182 245 bp region (Fig. 4B) and five Atypical OBP genes on supercontig 1.203 within a 54 422 bp region (Fig. 4C). The adjacent nine Classic OBP genes have a similarity of 64.6% and the six clustered Atypical OBPs are 93.1% similar. The similarity value is lower for the clustered Plus-C OBPs at only 42.1%. The OBP genes within each cluster have similar intron-exon structures and the size of most introns is about 60 bp (see below). Most Classic OBPs have one intron, most Atypical OBPs have no intron and most Plus-C OBPs have two introns. These data provide good evidence that the Ae. aegypti OBP genes evolved rapidly by gene duplication, as has been reported for An. gambiae (Xu et al., 2003) and D. melanogaster (Hekmat-Scafe et al., 2002). The Ae. aegypti-specific OBPs (AaegOBP17, AaegOBP18, AaegOBP19 and AaegOBP64) are clustered on the same supercontig 1.115 within a 61 200 bp region (Fig. 4A).
We combined EST assembly and manual annotation to check the transcripts with introns larger than 60 bp using ContigExpress of Vector NTI software (InforMax, Inc.) and GenScan (http://genes.mit.edu/GENSCAN.html). We were able to predict an alternative AaegOBP37 (AAEL008009) so that it has the same intron/exon structure as the adjacent AaegOBP36 without changing the original coding sequence (as annotated in the genome project). EST data support the presence of the large introns in the other Classic OBPs, AaegOBP15 (AAEL002598), AaegOBP3 (AAEL000051), AaegOBP4 (AAEL000073), AaegOBP23 (AAEL006109), AaegOBP24 (AAEL006108), AaegOBP25 (AAEL006103) and AaegOBP26 (AAEL006106) (Fig. 4). Both EST data and manual GenScan prediction showed that AaegOBP28 (AAEL006393) is the same sequence as AaegOBP29 (AAEL006387) and encodes the same protein. GenScan also predicted new Ae. aegypti transcripts AaegOBP30 and AaegOBP64 from the supercontigs 1.203 and 1.115, respectively (Table 2 and Fig. 4). Interestingly, each of two pairs (AaegOBP23 and AaegOBP24, AaegOBP25 and AaegOBP26) of OBP genes with large introns has the same sequence and encodes the same protein. This clearly demonstrates that one pair of OBPs was derived by gene duplication from the other pair on the same supercontig (separated by 1 791 782 bp; Fig. 4C).
Isoelectric points and molecular weights of predicted proteins with OBP motifs
When the predicted isoelectric points (pIs) of mature OBPs of Ae. aegypti are plotted against their predicted molecular weights (MWs) (Supplementary Material Fig. S4A), the pIs and MWs are similar to those of An. gambiae (Supplementary Material Fig. S4B) and D. melanogaster (Supplementary Material Fig. S4C). The range of pIs for the dipteran OBPs is between 4 and 10, a wider range than that reported for the acidic pIs of lepidopteran OBPs. Thus the OBPs in the dipteran species can be positively or negatively charged at the physiological pH in insect antennae. The MWs of the Ae. aegypti Classic OBPs are less than 15.5 kDa, in agreement with the MWs of other insect OBPs. Most of the Plus-C OBPs have MWs between 17 and 25 kDa (except AaegOBP5 and AaegOBP23) and the Atypical OBPs have MWs between 25 and 35 kDa (except AaegOBP44). For the Atypical OBPs of both Ae. aegypti and An. gambiae there is a range of MWs between 27 and 38 kDa (Supplementary Material Fig. S4A,B); this is mainly because of the long C-terminal present after the sixth conserved cysteine. It has been suggested that this C-terminal may occupy the binding pocket of the proteins at low pHs, as demonstrated for AgamOBP1 of An. gambiae (Wogulis et al., 2006), BmPBP1 of B. mori (Wojtasek & Leal, 1999; Horst et al., 2001; Lee et al., 2002) and D. melanogaster OBP LUSH (Kruse et al., 2003).
Synteny among genomic regions of An. gambiae, D. melanogaster and Ae. aegypti containing OBP genes
We used the longest OBP-containing regions (microsyntenic regions) in Ae. aegypti to analyse the presence and nature of the microsynteny among An. gambiae, D. melanogaster and Ae. aegypti for Classic OBPs (supercont1.61:1448800-1618500), Plus-C OBPs (supercont1.584:253000-437000) and Atypical OBPs (supercont1.203:1428000-1597500) (Fig. 4 and Table 3). We compared the amino acid sequences of each OBP-containing region with the syntenic region in An. gambiae and D. melanogaster obtained from the VectorBase and Ensembl precomputed tBLAT DNA-DNA comparison for these three insect species (Lawson et al., 2007). The sequence characteristics of the Ae. aegypti microsyntenic regions and comparison with syntenic regions in An. gambiae and D. melanogaster are given in Tables 3 and 4. Gene density in these regions is generally higher in An. gambiae and D. melanogaster than in Ae. aegypti. However, only three homologues out of a total of 83 genes were found in the syntenic regions of D. melanogaster with synteny qualities (see Experimental procedures) of 5.9, 6.1 and 3.9%, respectively, for the Ae. aegypti microsyntenic regions containing Classic, Plus-C and Atypical OBPs (Table 4). There is a substantial synteny quality of 72.0% in the microsyntenic region containing Classic OBPs between Ae. aegypti and An. gambiae, with all nine Ae. aegypti Classic OBPs having homologues in the An. gambiae syntenic region. There are five OBP homologues (AgamOBP46, AgamOBP47, AgamOBP48, OBPjj16 and AGAP007283) in the syntenic region of An. gambiae with a synteny quality of 30.0% to the Ae. aegypti microsyntenic region containing Plus-C OBPs. Some of these An. gambiae OBPs (AgamOBP47 and AgamOBP48) are clustered within an Ae. aegypti OBP expansion clade (Fig. 3B). These An. gambiae Plus-C OBPs might have evolved along the same monophyletic lineage with the Ae. aegypti OBPs in the clade which duplicated within Ae. aegypti. 
There are two Atypical OBPs (AgamOBP31 and AgamOBP44) out of three homologues in the An. gambiae genomic region syntenic to the Ae. aegypti microsyntenic region containing Atypical OBPs, with a synteny quality of 40.0% (Table 4). They are clustered into a separate clade forming an An. gambiae OBP expansion group, although closely related to one Ae. aegypti expansion group (Fig. 3C), suggesting that the duplications occurred after the speciation of Ae. aegypti and An. gambiae. The Classic OBPs are more divergent than the Atypical and the Plus-C OBPs and do not form a large expansion group (Fig. 3A), but show high synteny between the two mosquito species. This may be the result of the duplications of the Classic OBPs occurring before the divergence of mosquitoes and the fruit fly.
Table 3. Sequence characteristics of Aedes aegypti odorant-binding protein (OBP) microsyntenic regions
The total number of genes and homologous genes were obtained in Ensembl MulticontigView.
Table 4. Relative synteny quality of Anopheles gambiae and Drosophila melanogaster regions syntenic to the longest odorant-binding protein (OBP)-containing regions in each OBP subgroup of Aedes aegypti
Classic (supercont1.61: 1448800-1618500)
Plus-C (supercont1.584: 253000-437000)
Atypical (supercont1.203: 1428000-1597500)
The syntenic region was obtained from the VectorBase and Ensembl precomputed tBLAT (translated BLAT) DNA-DNA comparisons for these three species.
The total number of genes and homologous genes were obtained in Ensembl MulticontigView.
Values (in %) calculated as described in Cannon et al. (2006), with collapsing tandem duplications (two homologous genes within tandem duplicated regions counted as one) and by excluding transposable elements.
Experimental procedures
The Drosophila database FlyBase (http://FlyBase.bio.indiana.edu/genes/fbgquery.hform) was searched using ‘odorant binding’ and ‘pheromone binding’ as text inputs and both DNA and peptide sequences were retrieved. For An. gambiae OBP sequences the gene accession numbers of previous studies (Vogt, 2002; Xu et al., 2003; Zhou et al., 2004) were used to obtain the full length peptide sequences. Sequences were stored and maintained using Vector NTI software (InforMax, Inc.).
Identification of OBP motifs
The peptide sequences of each distinct subgroup of all annotated OBPs were aligned using ClustalX (8.1) (Thompson et al., 1997) with default gap-penalty parameters of gap opening 10.0 and extension 0.2. These alignments were then used to construct OBP sequence ‘motifs’, where the number of amino acid residues between the conserved cysteines in each peptide sequence was counted. In some cases manual adjustment was necessary to align the cysteines. This was then used to produce OBP motifs and used in the ‘MotifSearch’ algorithm (see Results).
Identification of sequences encoding peptides with OBP motifs in the Ae. aegypti genome
The FASTA files of predicted peptide sequences of the whole genome projects of An. gambiae (AgamP3.41) and Ae. aegypti (AaegL1.41) were downloaded from the Ensembl mosquito database (http://www.ensembl.org/info/data/download.html) and VectorBase (http://aaegypti.vectorbase.org/index.php). EST sequences were downloaded from the EST database (http://www.ncbi.nlm.nih.gov/dbEST/). The genome sequences were searched with an in-house algorithm, MotifSearch (Fig. 1). Briefly, the downloaded nucleotide sequences (genome sequences or ESTs) were translated in the six possible frames into peptide sequences and combined with the predicted peptide sequences of the genome projects into a single FASTA file, which was used as the input for MotifSearch to obtain the motif-containing sequences. Only two steps are manual: downloading the sequences from the databases and compiling a FASTA file of all published OBP sequences for comparison. The built-in functions of MotifSearch then automatically retrieve the motif-containing sequence fragments and their sequence IDs from the downloaded genome sequences, compare them with the published OBP sequences and identify putative OBP sequences with E-values better than a threshold value of 0.005. MotifSearch is simpler than homology searching by PSI-Blast or PHI-Blast and is an alternative annotation approach for very diverse protein families. It can be applied to any raw peptide data produced by genome sequencing projects to find small sequence motifs that contain conserved residues with constant spacings between them.
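The six-frame translation step mentioned above can be sketched as follows. This is an illustrative sketch only, assuming the standard genetic code; the function names are hypothetical and are not part of MotifSearch. Codons containing ambiguous bases are translated as X.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Translate a nucleotide sequence in the three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, peptide in six_frames("ATGTGCTAA").items():
    print(frame, peptide)
```

Each translated frame could then be written to the combined FASTA input file alongside the predicted peptides before the motif search is run.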
The annotated OBP sequences were blasted against the GenBank entries to identify Ae. aegypti-specific OBP genes and Ae. aegypti homologues in An. gambiae and D. melanogaster (Table 2). The predicted MWs and isoelectric points of the putative OBPs were determined using Vector NTI software (InforMax, Inc.) and the ExPASy ProtParam tool (http://us.expasy.org/tools/protparam.html).
The mature peptide sequences identified in the previous section and by previous studies were aligned using ClustalX (8.1) (Thompson et al., 1997) with default gap-penalty parameters of gap opening 10.0 and extension 0.2. The alignments were used to construct phylogenetic trees using MEGA4 software (Tamura et al., 2007). The final unrooted consensus trees were generated with 1000 bootstrap trials using the neighbor-joining method (Saitou & Nei, 1987) with the p-distance model and a cut-off bootstrap value of 50.
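For readers unfamiliar with the p-distance model, the sketch below computes it for aligned sequences: the proportion of compared sites at which two sequences differ. It is only an illustration, not the MEGA4 implementation; skipping any column where either sequence has a gap (pairwise deletion) is an assumption, and the function names and toy alignment are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(alignment):
    """Pairwise p-distances for a dict of {name: aligned sequence}."""
    names = list(alignment)
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Toy alignment (not real OBP sequences).
toy = {"seq1": "ACDE-F", "seq2": "ACDQGF", "seq3": "ACDEGF"}
print(distance_matrix(toy))
```

A matrix of this kind is the input that a neighbor-joining procedure would cluster into a tree.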
The identified Ae. aegypti OBP sequences were used as queries to blast search the Ae. aegypti genome in a reiterative manner for further OBP-like proteins which could be missed by MotifSearch. Syntenic regions of An. gambiae and D. melanogaster to the largest OBP-containing regions in each OBP subgroup of Ae. aegypti (Fig. 4) were analysed using VectorBase and Ensembl precomputed tBLAT DNA-DNA comparison for these three insect species (http://aaegypti.vectorbase.org/Genome/ContigView/) (Lawson et al., 2007). The syntenic regions were investigated within the ‘Aedes MultiView’ section at the VectorBase to look at all of the genes. The relative synteny quality in a region, expressed as a percentage, was calculated by dividing the sum of the conserved genes in both syntenic regions by the sum of the total number of genes in both regions, excluding retroelements and transposons and collapsing tandem duplications (Cannon et al., 2006).
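The relative synteny quality described above reduces to a short calculation, sketched here. The function name is hypothetical and the gene counts in the example are invented for illustration; they are not taken from Table 4. The counts are assumed to have already had transposable elements excluded and tandem duplications collapsed.

```python
def synteny_quality(conserved_a, conserved_b, total_a, total_b):
    """Relative synteny quality (%) between two syntenic regions.

    Sum of conserved genes in both regions divided by the sum of all genes
    in both regions. Inputs are assumed to be pre-filtered gene counts.
    """
    total = total_a + total_b
    if total == 0:
        raise ValueError("regions contain no genes")
    if conserved_a > total_a or conserved_b > total_b:
        raise ValueError("conserved genes cannot exceed total genes")
    return 100.0 * (conserved_a + conserved_b) / total


# Hypothetical counts: 6 conserved genes in each region, 10 and 14 genes in total.
print(synteny_quality(6, 6, 10, 14))
```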
Rothamsted Research receives grant-aided support from the Biotechnology and Biological Sciences Research Council of the United Kingdom. Xiao Li He was funded by a BBSRC initiative ‘Selective Chemical Intervention in Biological Systems’.