Species‐complex diversification and host‐plant associations in Bemisia tabaci: A plant‐defence, detoxification perspective revealed by RNA‐Seq analyses

Abstract Insect–plant associations and their role in diversification are mostly studied in specialists. Here, we aimed to identify macroevolution patterns in the relationships between generalists and their host plants that have the potential to promote diversification. We focused on the Bemisia tabaci species complex containing more than 35 cryptic species. Mechanisms for explaining this impressive diversification have focused so far on allopatric forces that assume a common, broad, host range. We conducted a literature survey which indicated that species in the complex differ in their host range, with only few showing a truly broad one. We then selected six species, representing different phylogenetic groups and documented host ranges. We tested whether differences in the species expression profiles of detoxification genes are shaped more by their phylogenetic relationships or by their ability to successfully utilize multiple hosts, including novel ones. Performance assays divided the six species into two groups of three, one showing higher performance on various hosts than the other (the lower performance group). The same grouping pattern appeared when the species were clustered according to their expression profiles. Only species placed in the lower performance group showed a tendency to lower the expression of multiple genes. Taken together, these findings bring evidence for the existence of a common detoxification “machinery,” shared between species that can perform well on multiple hosts. We raise the possibility that this “machinery” might have played a passive role in the diversification of the complex, by allowing successful migration to new/novel environments, leading, in some cases, to fragmentation and speciation.


| INTRODUCTION
Species complexes, including cryptic ones, are present in a wide range of taxonomic groups and are being discovered at an increasing rate (Bickford et al., 2007). They provide an excellent tool for connecting the study of taxonomy and phylogenetic patterns with ecosystems functioning and evolutionary processes such as speciation (Struck et al., 2017), giving insights not only into the establishment of reproductive barriers between the diverging species (Rundle & Nosil, 2005), but also into functional changes occurring in their genomes (Simon et al., 2015).
Species complexes of herbivorous insects are of special interest, because host specialization is the favoured and dominant evolutionary strategy in this insect group (Forister et al., 2015). It is to be expected, therefore, that within species complexes of herbivorous insects, most species will be specialists or oligophages, with the exception of a few "true" generalists (Loxdale & Harvey, 2016). The traditional assumptions argue that specialism predominates because it allows host-specialized herbivores to become optimally adapted to the nutritional and secondary defensive chemistry of their host plants (Cornell & Hawkins, 2003), while generalism is adaptive mostly when the availability of high-quality host plants is unpredictably variable (Forister, Dyer, Singer, Stireman, & Lill, 2012).
Recently, however, Forister and Jenkins (2017) demonstrated that communities enriched in specialized species (relative to generalized taxa) can evolve in the absence of genotype-by-environment interactions that confer a direct advantage to the specialists over the generalists. Alternatively, random diversification forces acting on fragmented populations over geographical ranges are sufficient for producing multiple speciation events. In most cases, examination of such closely related species groups is considered to be the best experimental approach for studying the evolution of different degrees of generalism or specialism and their putative derived effects on divergence and speciation processes (Forister et al., 2012;Nylin & Janz, 2009;Nyman, 2010). This is because species complexes tend to share a common genetic background, with biological differences mainly associated with their feeding ecology, thus avoiding the "phylogenetic noise" present when comparing phylogenetically distant taxa that have accumulated more diverse adaptations (Roy et al., 2016).
We focus here on the whitefly, Bemisia tabaci (Hemiptera: Aleyrodidae) species complex, which is a cosmopolitan complex widely distributed throughout tropical and subtropical regions. Until about 20 years ago, B. tabaci was considered as a highly variable species comprising a series of morphologically indistinguishable biotypes that differ mostly in fecundity, insecticide resistance and the capability of transmission of viruses (Brown, Frohlich, & Rosell, 1995). However, multiple studies (mostly phylogenetic analyses and crossing experiments) came to the conclusion that it might be more accurate to regard B. tabaci as a cryptic species complex, rather than a highly variable species (De Barro, Liu, Boykin, & Dinsdale, 2011;Dinsdale, Cook, Riginos, Buckley, & De Barro, 2010;Liu, Colvin, & De Barro, 2012). Dinsdale et al. (2010) proposed a general threshold of 3.5% for mitochondrial cytochrome oxidase I (mtCOI) DNA sequence divergence for species delimitation, which currently leads to the identification of at least 35 distinct species assigned to~11 major clades (Barbosa et al., 2014;Hu et al., 2014). With few minor nonconclusive exceptions, reproductive compatibilities' experiments confirm so far the accuracy of the cryptic species concept adding credibility to the mtCOI approach (Liu et al., 2012;Qin, Pan, & Liu, 2016).
The understanding of the mechanism(s) driving the extreme and unusual diversification of the B. tabaci species complex is limited.
Explanations provided so far largely point towards geographical (allopatric) divergence as the key driving force, associated with the separation of continental landmasses, which overlapped with a period of global diversification across the plant and animal kingdoms (Boykin, Bell, Evans, Small, & De Barro, 2013;De Barro, Trueman, & Frohlich, 2005). These arguments were based on the association of different species with particular geographical (continental) regions, the lack/minimal of gene flow between species and the assumption that most species share a similar and broad host range (De Barro, 2005). However, our knowledge of the host-plant range of the different species is patchy. Moreover, the allopatric divergence model by itself might fail to explain prominent within continental expansions in the complex, as observed, for example, in the Asia II (~12 species) or sub-Saharan Africa (~13 species) major genetic groups (Lee, Park, Lee, Lee, & Akimoto, 2013;Mugerwa et al., 2018).
Here, we explored the possibility that evolved differences in host utilization could have played a role in the diversification of the B. tabaci species complex. Our major goal was to identify macroevolution changes in the relationship between B. tabaci species and their host plants that have the potential to promote diversification. At first, we tested the hypothesis that the species complex is a more host-specific taxon than commonly thought. Specifically, we hypothesized that species in the complex will vary not only in the hosts they actually use, as this might reflect the hosts that are present in a specific geographical area, but also in their host range (Brown et al., 1995;Perring, 2001). We expected, however, that the latter will show more of a quantitative effect (significant differences in performance on hosts that have been lost by some species) than a qualitative one (cases in which some hosts cannot be utilized at all by some species), as previously demonstrated in few studies (De Barro & Bourne, 2010;Iida, Kitamura, & Honda, 2009;Xu, Lin, & Liu, 2011). Host-associated differentiation was shown to play an important role in speciation of various nongeneralist phytophagous insects (Berlocher & Feder, 2002;Stireman, Nason, & Heard, 2005). However, evidence for the putative importance of the phenomenon in generalist species is also accumulating. For example, the polyphagous aphid Myzus persicae comprises a specialized form on tobacco that is formally designated as the subspecies M. persicae nicotianae (Margaritopoulos, Malarky, Tsitsipis, & Blackman, 2007). Other examples include grasshoppers and green mirids, which feed on multiple hosts from different families, yet were shown to exhibit host-associated differentiation (Antwi, Sword, & Medina, 2015). These cases of hostassociated differentiation were shown to maintain gene flow across large distances (Dermauw, Pym, Bass, Van Leeuwen, & Feyereisen, 2018;Hereward, Walter, DeBarro, Lowe, & Riginos, 2013).
We then focused on one important component (in herbivory) of successful plant feeding, the ability to counteract (detoxify) the toxic effect of plant-chemical defences (Celorio-Mancera et al., 2013;Dermauw et al., 2013;Heckel, 2014;Ragland et al., 2015;Wybouw et al., 2015). The detoxification system works in three phases: The first phase involves oxidation, hydrolysis and/or reduction by enzymes such as P450 monooxygenases (P450s) and carboxylesterases (COEs); the second phase involves conjugation with hydrophilic groups such as glutathione, sulphate or sugars by glutathione Stransferases (GSTs), sulfotransferases or UDP-glucosyltransferases (UDPGTs) to increase polarity and facilitate excretion; and the third phase involves active export of the conjugated toxins out of the cell by ATP-binding cassette transporters (ABC transporters; Després, David, & Gallet, 2007). Multiple studies have indicated that, in insect herbivores, an extensive re-arrangement of detoxification gene expression (transcriptional plasticity) takes place shortly after host shifts, clearly suggesting that the detoxification system plays an important role in the insects' survival, when first encountering a novel or adverse host plant (Celorio-Mancera et al., 2013;Grbić et al., 2011;Matzkin, 2012;Yu, Fang, Zhang, & Jiggins, 2016). Moreover, it was argued that selection on the mode of expression of these genes could be disproportionately strong (Wybouw et al., 2015) as insect herbivores may use the transcriptional plasticity displayed upon the first exposure to the novel host, not only to secure their initial survival (phenotypic accommodation) but also to facilitate subsequent adaptation (Nylin & Janz, 2009;Schlichting & Wund, 2014). If inclusion of the novel host in the normal range of hosts is important enough for fitness, plastic traits may undergo genetic accommodation by acquiring quantitative genetic changes that can either increase or decrease their environmental responsiveness (Levis & Pfennig, 2016). In some cases, selection can cause a plastic trait to lose its environmental responsiveness which can result in the constitutive expression of genes (Schneider & Meyer, 2017). Constitutive expression levels can allow the refinement of the expression levels (via selection), for obtaining optimal performance that outcompetes that obtained by the ancestral plastic trait (Levis & Pfennig, 2016;Pfennig & Ehrenreich, 2014;Wang et al., 2017). We therefore expected that both plastic and/or constitutive expression differences in detoxification genes will be present between closely related species that differ in their host range (Ragland et al., 2015;Roy et al., 2016;Wybouw et al., 2015).
We hypothesized that, during evolution, various species in the B.
tabaci complex have acquired genetic changes that relate to their ability or inability to utilize plant hosts. Therefore, we asked if the expression profile of the insect's detoxification system is shaped more by phylogenetic constraints or by differences in the ability to perform well in heterogeneous environments containing multiple suitable and novel plant hosts. We hypothesized that, if genetically more similar species would show more similar expression patterns, it would suggest that random genetic drift and/or phylogenetic constraints have shaped the detoxification gene expression evolution within each clade. This finding would preclude us from drawing a link between differences in the species performance on various host and speciation as the former might only be a consequence of reproductive isolation rather than causing it (Peccoud et al., 2010). Our alternative hypothesis was that the detoxification gene expression patterns will discriminate between species showing high performance on multiple hosts and species not capable of performing well on multiple hosts (more or less specialized species, respectively).
More precisely, we hypothesized that species that might be only distantly related in the phylogenetic tree, but are capable of utilizing successfully multiple hosts, will be found to share common detoxification patterns (mechanisms) either by maintaining an ancestral expression state or by convergent evolution that allow to cope with a wide variety of plant-chemical defences. This can support a species diversification and expansion model in which the general expression pattern facilitates larger geographical distributions and/or survival in new and/or novel environments (plant hosts), which might lead to fragmentation and eventually speciation due to both random (drift) and adaptive (selection) processes (Forister et al., 2012;Janz & Nylin, 2008;Nylin & Janz, 2009;Nyman, 2010). Evidence that such processes could occur was recently presented in three polyphagous and widespread lepidopteran species: the brown tail moth Euproctis chrysorrhoea (Marques, Wang, Svensson, Frago, & Anderbrant, 2014), the gypsy moth Lymantria dispar (Lazarević et al., 2017)   with a specific mtCOI barcode were maintained (papers are listed in Supporting Information Table S1). Due to too sporadic sampling reports, data of species within the Mediterranean (MED), New-World (NW), Sub-Saharan Africa (SSA), Italy, Asia II and China genetic groups were collapsed. The combined data set (the absence/ presence matrix) was used as input for the HEATMAP.2 function (Euclidean distance with a complete linkage method) of the R GPLOTS package (R Core Team, 2017).
For ancestral host-range reconstruction (order, family and genus levels), a maximum likelihood (ML) inferred tree, for all major genetic groups of B. tabaci and two related outgroups (Bemisia afer and Dialeurodes citri), was first produced (Supporting Information Fig-ure S1a). The mtCOI nucleotide sequences were downloaded from the GENBANK database, clustered with CD-HIT (Fu, Niu, Zhu, Wu, & Li, 2012) at 98% identity, and a cluster representative for each major group was selected (in case where more than one cluster was obtained, a single representative was selected). A codon-based alignment was performed with the REVTRANS2.0 web server (Wernersson & Pedersen, 2003), and IQ-TREE (Nguyen, Schmidt, von Haeseler, & Minh, 2014) was used to calculate the best codon model (MGK+F3X4+R2) and ML tree (5,000 ultrafast bootstraps and 5,000 SH-aLRT). The host plant ranges of the last common ancestors (LCAs) of the different B. tabaci species were estimated with the ace function (type = "discrete," method = "ML," CI = T, marginal = F) in the APE package (Paradis et al., 2004) from R using the information from Supporting Information Table S1. The resulting tree, together with the corresponding presence/absence matrix of order or genus host usage, was used for the ML (marginal) and maximum parsimony (MP) ancestral states reconstruction, using the ace (equal rate) and MPR functions of the APE package, respectively (R software).

| Bemisia tabaci and host plant species
Six species of B. tabaci, representing different geographical and dietbreadth groups, were selected for analyses: SSA1-SG3 (Sub-Saharan Africa genetic group 1, sub-group 3, collected in Tanzania  The Solanaceae/Solanales seems to be one of the ancient host families/orders of B. tabaci, common to many species in the complex (Supporting Information Figure S1b), but this observation relates mostly to the Solanum genus. Moreover, differences in the probability of being part of the ancestral host repertoire of B.
tabaci were observed between the Solanum and Capsicum genera (p = 0.98 and p = 0.5, respectively; Supporting Information Figure S1c). Toxic defensive secondary metabolites that are known to be present in pepper include flavonoids, phenols and capsaicinoids (Mokhtar et al., 2015). Like the Solanaceae/Solanales, the Euphorbiaceae/Malpighiales seem to be one of the ancient plant host families/orders of B. tabaci, common to many species in the complex (Supporting Information Figure S1d). However, cassava is considered to be a well-defended plant and a suitable host only for some SSA and Asia II species of B. tabaci (Colvin, Omongo, Maruthi, Otim-Nape, & Thresh, 2004;Ellango et al., 2015). Important defensive metabolites present in cassava include cyanogenic glucosides (Alves, 2002) and flavonoids (Prawat et al., 1995). Unlike the Solanaceae/Solanales and the Euphorbiaceae/Malpighiales, Brassicaceae/Brassicales plants are not utilized by many species in the B. tabaci complex. Moreover, the probability of the family/order to be part of the ancestral host repertoire of B. tabaci was estimated to be only 0.5 (Supporting Information Figure S1e). According to the literature survey, species within the B. tabaci complex mainly utilize host plants from one tribe, the Brassicaceae. Important toxic metabolites present in kale include glucosinolates and flavonoids (Schmidt et al., 2010). All experimental plants were grown in rearing rooms maintained at 28 ± 2°C, 60% humidity and a14:10 -hr light:dark cycle.

| Performance assay
Groups of 50 B. tabaci adults, from each of the six species reared on eggplant, were transferred 1-3 days after emergence, to one of the four host plants (eggplant, pepper, kale and cassava), at the 5to 8-leaf stage. The adults fed on the four host plants for 24 hr, after which the proportion of survivors was recorded. Proportional data were arcsin-square root transformed. Two-way ANOVA followed by sequential Bonferroni comparisons, using the conservative Dunn-Sidak method (Sokal & Rohlf, 1995), was carried out to compare the mean survival rate of the six species on the four plant hosts.

B. tabaci
Raw data from several B. tabaci transcriptomes were downloaded: MEAM1: SRX022878, SRA036954 and SRR835757; MED-Q1: SRX018661, SRR316271, SRR835756 and PRJNA293094; and Asia II-3: SRR062575. Transcriptomes were assembled with TRINITY V2.0.6 adapted to pair-end libraries (with the exception of SRA036954, which was assembled as single end) and with the following options "trimmomatic" and "normalize_reads" activated. Re-assembled transcriptomes were deposited in an in-house Galaxy server to perform the following steps. The sequences were clustered by CD-HIT-EST at 95% identity, considering the sequences belonging to the same cluster as allelic variants. For sequence annotation, a blastx similarity search against the NCBI protein database nr (e-value threshold 10 −6 ) was performed, keeping the accession number of the top hit in the insect model species: Drosophila melanogaster, Helicoverpa armigera and Acyrthosiphon pisum (Supporting Information   Table S2). This process allowed us to produce a non-redundant detoxification consensus gene data set containing 104 P450s, 25 GSTs, 24 COEs, 71 UDPGTs, 20 sulfotransferases and 54 ABC transporters.

| RNA isolation and Illumina sequencing
Groups of 200 newly emerged adults from each species, grown on eggplant, were subjected to a feeding period of 72 hr on 10% sucrose diet, to obtain a standardized detoxification gene expression pattern, which was host plant-independent. The groups were then transferred for a feeding period of 24 hr, to the four experimental host plants: eggplant, pepper, kale and cassava. Next, the surviving adults were collected for RNA extraction. About 50 adults were pooled for each RNA sample to obtain sufficient RNA. Three feeding experiments per host plant were performed for each B. tabaci species. Total RNA was extracted according to the manufacturer's instructions (Isolate II mini kit, Bioline). Library construction and sequencing were performed by the Centre for Genomic Technologies at the Hebrew University of Jerusalem, using a NextSeq 500 desktop sequencer, which produced approximately 27 million 75-bp single-end reads per sample.

| Gene expression analysis
The reads obtained were then subjected to quality control using the FASTQC software (http://www.bioinformatics.babraham.ac.uk/projec ts/fastqc/). For mapping and expression analysis, a reference backbone of 46,898 genes data set, established for MED-Q1, was used (provided by Prof. Xiao-Wei Wang, Zhejiang University, China). The data set was manually curated to include the consensus detoxification gene data set described above. The reads were mapped using RSEM (reads per kilobase of transcript per million mapped reads) v1.2.18. The transcript reference was first prepared (rsem-preparereference), followed by rsem-calculate-expression with the parameter Bowtie2. The percentage of mapped reads ranged from 41 to 78.
Genes that did not have at least 10 reads in 4% of the samples were filtered out. The RSEM gene quantification for all the remaining genes was used as input for the DESEQ2 R package, version 1.10.1. The gene counts were normalized using DESEQ2 defaults, taking into account the different read mapping percentage of the different samples. Differential expression analysis was performed using a full two-factorial model. Pairwise comparisons were performed between plants within species and between species between plants.
All pairwise comparisons were applied with the parameter "cooksCutoff = FALSE." False discovery rate (FDR) was corrected for all the 30,012 genes that were not filtered out. The 95% log 2converted fold-change range (2.5%-97.5% quantiles) was −3.65 to 4.14 between species and −1 to 1.2 within species. Log 2 -converted fold-change differences and corrected p-values were considered at 1 and 0.05, respectively, for the constitutive expression comparisons (between species feeding on eggplant) and 0.58 (1.5-fold expression change) and 0.05, respectively, for the plastic comparisons (within species after transfer from eggplant to cassava, kale or pepper).
Two tests were performed to show that DNA sequence differences between the six B. tabaci species did not bias our results due to differences in mapping efficiency. We first performed an analysis of sequence similarity, focusing on the set of 298 detoxification genes. We produced one assembled transcriptome for each of the six analysed species, using RNA-Seq data from all the species' RNA samples. Next, we used a "blast reciprocal best hit" approach to check the identity of each gene/contig (of the relevant detoxification gene) in the species' transcriptome to its putative orthologous gene/ contig in the manually curated data set (see above). All reported alignments include genes/contigs that had at least 70% of their sequence aligned with a cut-off of at least 50% identity. As can be seen in Supporting Information Figure S2, the mean identity for all six species was higher than 95%, meaning that the mean number of mismatches in the mapping process of reads of 75-bp long was up to ≈3 (1.22-3.31), which is less than that allowed by the default option of Bowtie2. In addition, arcsin-square root transformed pro-  The expression levels of nine detoxification genes that were differentially or equally expressed by the RNA-Seq approach were validated using quantitative reverse transcription-PCR (qRT-PCR).
Comparisons were made between the SSA1-SG3 and MEAM1 species, which showed the highest (98.33%) and lowest (95.58%) gene identity to the curated data set. The technical details of the qRT-PCR analyses appear in Supporting Information Table S3. A perfect match was observed between the RNA-Seq and qRT-PCR analyses (all qRT-PCR results are summarized in Supporting Information Table S4).

| Do all the species of the B. tabaci complex share a common host range?
We first conducted a literature survey in which we documented all of the major sampling efforts of B. tabaci. Clustering analysis, at the botanical family level (Figure 1a), indicated that one species in the complex (MEAM1) can be considered as a true "generalist," four species can be considered to be species with "extended" host ranges  Table S5). Further analysis by an MP algorithm showed that the Malpighiales may also be considered as ancestral hosts of B. tabaci (Supporting Information Figure S3).

| Selected species performance on the various plant hosts
Adult survival was monitored 24 hr after subjecting newly emerged adults from the six selected species to eggplant, cassava, kale and pepper plants. Both main treatments (species and plant host) were found to affect adult survival significantly (p < 0.0001 and p = 0.0001, respectively), but their interaction was not significant

| General expression profiles of detoxification genes on the different host plants
We first used ordination and statistical methods to analyse coexpression between the candidate genes, comparing gene expression values from the six analysed species on the four host plants. PCA showed that the samples group together mainly according to their plant species association (Supporting Information Figure S4). This None of these 500 trials clustered the species according to their host-performance groups.

B. tabaci species
The transcriptomic profile of each species on eggplant (a suitable host plant for all six species, Figure 2a) was used as a baseline, and the constitutive expression differences in detoxification genes were compared. Only genes significantly overexpressed or underexpressed in one species compared to all others were considered.
Overall, from the 298 genes analysed, 105 were significantly constitutively overexpressed or underexpressed in one species (compared to all others), with a slightly higher percentage of overexpressed ones (62%; Figure 4). SSA1-SG3 showed the highest number of genes that are constitutively overexpressed (29), followed by Asia II-1 (14) and NW2 (13). On the other hand, MEAM1, MED-Q1 and Uganda-MED-ASL had the lowest number of constitutively overexpressed genes (7, 4 and 2, respectively). A different pattern appeared when genes significantly underexpressed in one species (compared to all others) were considered. Here, there was a remarkable difference between the three species in the "low performance" group, NW2, MED-Q1 and Uganda-MED-ASL, with 10, 12 and 17, underexpressed genes, respectively, to the three "high performance" species, MEAM1, Asia II-1 and SSA1-SG3, with 2, 2 and 0, underexpressed genes, respectively. F I G U R E 4 Constitutive expression differences, in detoxification genes, between the six analysed Bemisia tabaci species after feeding for 24 hr on the common host plant, eggplant. Genes significantly overexpressed or underexpressed in one species compared to all others are indicated by asterisks. Expression values were plotted as standardized rld values for each gene across the six species. For each listed gene, the Bta number or accession number in the B. tabaci MEAM1 genome database (Chen et al., 2016) is provided in Supporting Information Table S2. The circular plot was made with circos [Colour figure can be viewed at wileyonlinelibrary.com] 3.6 | Do B. tabaci species share a common "essential detoxification machinery"?

| Plastic expression differences within and between B. tabaci species
It has been predicted that more specialized species retain an essential "machinery" that allows some level of utilization of host plants that have been lost from the species repertoire (Nylin & Janz, 2009).
We asked, therefore, whether a common response in detoxification gene expression exists in B. tabaci. To avoid considering nonconsistent changes from which a clear pattern cannot be obtained, only genes differentially expressed in more than one species and showing the same expression pattern (upregulation or downregulation) in at least two species were considered. This reduced the original list of 56 genes, differentially expressed in more than one species, to 44.
The list of 44 genes ( Figure 6) was also significantly enriched in genes belonging to the P450 (24 genes) and UDPGT (13 genes) detoxification families (χ 2 conditions. Third, hierarchical clustering analysis of their differentially expressed detoxification genes clustered the six species according to their aforementioned host-performance groups and not according to their phylogenetic relationship, bringing evidence for the existence of a common detoxification "machinery," shared between species that can perform well on multiple hosts. The hypothesis that the different species in the B. tabaci complex differ in their "actual" host range is definitely not a new one (Brown et al., 1995), although it was rightly argued that this is more a hypothesis than a "solid" fact, as there are only few experimental studies that compared performance across different hosts (De Barro et al., 2011;Xu et al., 2011). What is new here is the possible link we draw between divergence in this group and the documented dif- Our transcriptomic findings add a new dimension to these previous data. As stated above, the hierarchical clustering analysis clustered the six species according to their ability to perform on multiple hosts. This suggests the existence of a shared detoxification gene expression pattern that can be associated with a more general feeding nature of some species, regardless of the estimated time of their separation. Moreover, it allows us to speculate that these species have likely retained an ancestral pattern of expression that was already present in the common ancestor of the B. tabaci species complex, which likely displayed a feeding habit towards the less specialized side of the B. tabaci spectrum. It is important to note that our data do not allow us to exclude the possibility that convergent evolution to similar detoxification patterns has taken place (Ujvari et al., 2015) in species displaying a more general feeding nature.
We would like to turn back now to our main research question: "Could evolved differences in host utilization play a role in the diversification of the B. tabaci species complex?" Our literature survey data clearly indicated that the B. tabaci species complex should be recognized as a group of more or less specialized species. In parallel, we found evidence for the maintenance of a common (ancestral or converged) expression pattern of the detoxification "machinery" that F I G U R E 6 The putative common "essential detoxification machinery" of the Bemisia tabaci species complex. The ideogram presents transcripts that were significantly plastically-expressed (indicated by asterisks) in more than one species and showed the same expression pattern (upregulation or downregulation) in at least two species. For each listed gene, the Bta number or accession number in the B. tabaci MEAM1 genome database (Chen et al., 2016) is provided in Supporting Information is shared among species that can perform well on multiple common and novel hosts. As species with expanded host ranges tend to show larger geographical distributions, they are more susceptible to fragmentation due to both neutral and adaptive processes (Forister & Jenkins, 2017;Forister et al., 2012;Janz, Braga, Wahlberg, & Nylin, 2016;Nyman, 2010). Therefore, it is possible that the ancestral ability to perform well on multiple hosts might have played a passive role in the evolution of the B. tabaci species complex, by enhancing the probability for geographical separation between populations.
Under this scenario, most of the subsequent divergence, including some adaptation/specialization events, could have occurred in allopatry, fitting the observations and predictions made by De Barro (2005).
Few additional interesting and likely important observations were made while exploring our transcriptomic data. For example, some common detoxification capabilities were found to be lost only by species that cannot perform well on multiple hosts, although it seems that these species did not converge to one "other" pattern.
This suggests that genetic drift or selection pressure might cause the loss of some detoxification genes that are not required when species become more specialized and lose their ability to perform well on some host plants. It has been previously argued that production of detoxification proteins might be energetically costly or capable of endangering the organism by producing modified/bio-activated deleterious molecules (Feyereisen, 1999). For example, β-asarone bioactivation was mediated via insect P450 activity in Peridroma saueiaas (Koul, Smirle, Isman, & Szeto, 1990) and P450 activity in Helicoverpa zea, was detrimental in the presence of a plant pathogen that produces aflatoxin, a toxin that can be bio-activated by P450s activity (Zeng, Wen, Niu, Schuler, & Berenbaum, 2009). Other examples include mitochondrion-associated transcripts or chaperonin responses that can mitigate the effect of stressful or foreign environments but are also associated with major energetic costs (Feder & Hofmann, 1999;Ragland et al., 2015).
Our transcriptomic data also provide new insights into the mode  (Mathers et al., 2017). Although the outlined experimental system differed from ours, the common findings in the two systems highlight the possibility that successful short-or long-term host shifts of generalist phloem feeders do not necessarily require significant plastic or constitutive changes in gene expression.
Another striking similarity between our study and the one described by Mathers et al. (2017) relates to the enrichment, in both systems, of differentially expressed genes from to the P450 and UDPGT families responding to host changes. Interestingly, Mathers et al. (2017) noticed that these genes have a tendency to be arranged as tandem repeats in the M. persicae genome, drawing a putative link between gene duplication/family expansion events, expression plasticity and macro-evolutionary processes involved in host adaptation (Edger et al., 2015).
Before concluding, we would like to highlight few important noncompleted matters that require further investigation. First, although our transcriptomic study is the most extensive one conducted so far on the B. tabaci complex, it is clear that analyses of more combinations of species/plant hosts are required for further solidification of our findings and insights. Second, our literature survey included multiple sampling attempts from every relevant continent spanning more than 20 years. Therefore, we consider the effect of sampling bias negligible. Still, for some specific species, especially in the Asia II and China genetic groups, more data are required in order to determine the species "true" host range. In addition, it is also possible that some species in the complex, identified in the literature survey as having an "extended" host range, are in fact a mixture of individual genotypes that are adapted to certain plant types (Loxdale & Harvey, 2016). This did not affect our study, as each species was represented by one genotype (specific mtCOI barcode), but should be taken into consideration when comparisons to other studies are made in the future. Third, due to the lack of reliable annotated data, our work (so far) focused only on the detoxification system. Other systems that insects require for feeding successfully on their host plants should be targeted in the future.
These include among others: systems that allow host-plant perception and continued feeding through olfactory and/or gustatory cues, digestion and plant nutrient uptake systems, and the management of parasites present in the insect diet (Koenig et al., 2015). Fourth, it should be noted that "gene expression" could be also regulated at the translational and post-translational levels. Detoxification genes that were identified as being expressed differentially across species ("constitutive" genes) and those with induced or suppressed RNA levels after host shift ("plastic" genes) may have an added layer of regulatory complexity that were not revealed in this study and should be addressed in the future. For instance, amino acid residue polymorphisms across the six B. tabaci species or possible changes in the levels of the detoxification proteins (due to altered translation or turnover).
We would like to conclude by highlighting few of our findings that might apply across other phytophagous insect systems. First, common detoxification "machineries" that allow the successful utilization of multiple plant hosts and the exploration of new/novel environments are likely to exist in other generalist species complexes. Second, our findings provide an insight into how more generalized and more specialized genotypes evolve. They raise the possibility that successful short-or long-term host shifts of generalist phloem feeders do not necessarily require significant plastic or constitutive changes in gene expression, which makes this feeding guild quite different from other studied systems (Celorio-Mancera et al., 2013;Ragland et al., 2015;Wybouw et al., 2015). Third, they also provide a mechanistic platform for explaining why specialization should not be considered as a dead end, as even more specialized species retained parts of the environmental-responsive detoxification "machinery." We hypothesize that this essential "machinery" is in general adaptive, as selection has had an opportunity to act on the genetic variation for plasticity throughout most of the host range of the species. It might even allow populations of more specialized species to survive to some extent on novel hosts (Nylin & Janz, 2009).
It is important to note, however, that the way forward, especially in nonmodel organisms, requires the integration of high-quality genomic and epigenomic data, which will allow to accurately define the interplay between genotypes and phenotypes and between ecological and evolutionary processes (Forister et al., 2012;Schneider & Meyer, 2017).

ACKNOWLEDGEMENTS
We thank Drs. Pnina Moshitzky and Rita Mozes-koch, Galit Eakteiman, Ofer Aidlin Harari and Ksenia Juravel for their excellent assistance. We also thank four anonymous reviewers for their excellent comments on earlier versions of this manuscript and the sub-

DATA ACCESSIBI LITY
Raw data files from the RNA sequencing project described above have been deposited in the NCBI Short Read Archive SRP127757.