Structural features of the mitogenome of the leafhopper genus Cladolidia (Hemiptera: Cicadellidae: Coelidiinae) and phylogenetic implications in Cicadellidae

Abstract The first two complete mitogenomes of the leafhopper genus Cladolidia (C. biungulata and C. robusta) were sequenced and annotated to further explore the phylogeny of Cladolidia. Both the newly sequenced mitogenomes have a typical circular structure, with lengths of 15,247 and 15,376 bp and A + T contents of 78.2% and 78%, respectively. We identified a highly conserved genome organization in the two Cladolidia spp. through comparative analysis that included the following assessments: genome content, gene order, nucleotide composition, codon usage, amino acid composition, and tRNA secondary structure. Moreover, we detected the base heterogeneity of Cicadellidae mitogenomic data and constructed phylogenetic trees using the nucleotide alignments of 12 subfamilies of 58 leafhopper species. We noted a weak heterogeneity in the base composition among the Cicadellidae mitogenomes. Phylogenetic analyses showed that the monophyly of each subfamily was generally well supported in the family Cicadellidae; the main topology was as follows: (Deltocephalinae + (Treehoppers + ((Megophthalminae + (Macropsinae + (Hylicinae + (Coelidiinae +Iassinae)) + (Idiocerinae + (Cicadellinae + (Typhlocybinae + (Mileewinae + (Evacanthinae +Ledrinae)))))))))). Within Coelidiinae, phylogenetic analyses revealed that C. biungulata and C. robusta belong to Coelidiinae and the monophyly of Cladolidia is well supported. In addition, on the basis of complete mitogenome phylogenetic analysis and the comparison of morphological characteristics, we further confirm the genus Olidiana as a paraphyletic group, suggesting that the genus may need taxonomic revisions.

These characteristics have been widely used for species identification as well as phylogenetic, phylogeographic, and genomic evolution studies (Anderson et al., 1981;Chuan et al., 2012;Nelson et al., 2012).
Most previous studies on Coelidiinae relationships have focused on morphological characteristics. However, the phylogeny of Coelidiinae remains to be explored using mitogenomic data. The lack of mitogenome sequences has limited the expansion of knowledge regarding the molecular evolution and population genetic diversity of this subfamily. Nielson (2015) removed C. biungulata, C. robusta, and five other species from Calodia and created the genus Cladolidia based primarily on the differences in the processes of aedeagus between these groupings. However, the position of the genus Cladolidia within the subfamily is yet to be ascertained (Nielson, 2015).
In the present study, we sequenced two complete mitogenomes of the genus Cladolidia (C. biungulata and C. robusta) using high-throughput sequencing; C. biungulata and C. robusta are the first and second species, respectively, that have been described for this genus. In addition, we described their molecular phylogenetic relationships with 58 leafhopper and 5 treehopper species.
Furthermore, this study provides an insight into the identification, phylogeny, conservation genetics, and evolution of Cladolidia and its related species.

| Sample collection and DNA extraction
Detailed information on the specimens collected is presented in Table   S1. The collected specimens were identified based on their morphological characteristics, as described previously (Li & Fan, 2017;Zhang, 1990). After the species were accurately identified, the specimens were preserved in absolute ethanol and stored at −20°C until genomic DNA extraction. Genomic DNA was extracted from the whole body of adult males after removing the abdomen using DNeasy ® Blood & Tissue Kit. In brief, the samples were incubated at 56°C for 6 hr to lyse the cells completely and the total genomic DNA was eluted in 100μl double-distilled water. The subsequent steps were performed according to the manufacturer's instructions. After evaluating the extracted genomic DNA quality using 1% agarose gel electrophoresis, it was stored at −20°C until further use. Both the voucher specimens with male genitalia and DNA samples have been deposited at the Institute of Entomology, Guizhou University, Guiyang, China.

| Sequence analysis
The two complete mitogenomes of C. biungulata and C. robusta were sequenced by Berry Genomics on the HiSeq 2500 platform  (Wang et al., 2017;, which were retrieved from GenBank and identified via BLAST searches on NCBI to confirm sequence accuracy. We used the MITOS web server and BLAST searches on NCBI (https://blast.ncbi.nlm.nih. gov/Blast.cgi) to annotate the assembled sequences using invertebrate genetic codes (Altschul et al., 1997;Bernt et al., 2013) as well as the search server tRNAscan-SE 1.21 to identify the locations and predict the secondary structure of 22 typical tRNAs (Laslett & Canbäck, 2008;Schattner et al., 2005;Tamura et al., 2013). All rRNA genes were identified based on the locations of adjacent tRNA genes and comparisons with sequences of other leafhopper mitogenomes deposited in NCBI. ORF Finder in Geneious Prime was used to predict 13 protein-coding gene (PCG) locations using invertebrate genetic codes. The mitogenomic map and comparative analysis were performed using CGView comparison tool (Stothard et al., 2017). Furthermore, the relative synonymous codon usage (RSCU) values and codon numbers were calculated using the MEGA version 7.0 program (Sudhir et al., 2016). Finally, chain asymmetry was calculated using the following formulas: AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) (Perna & Kocher, 1995).

| Phylogenetic analysis
In total, 58 leafhopper and 5 treehopper species were selected to construct the phylogenetic tree after the removal of sequences that were unverified, lacked an accurate scientific name, and were repetitive. Phylogenetic analysis was performed using alignments  (Table S2). Each PCG was aligned using the TranslatorX online tool, employing MAFFT to perform protein alignment (Abascal et al., 2010;Castresana, 2000;Katoh et al., 2017). Then, the resulting 13 alignments were assessed and manually corrected using the MEGA version 7.0 program (Sudhir et al., 2016). The best schemes for partition and substitution models (Table S3) were determined in PartitionFinder version 2.1.1 using the Akaike information criterion and the greedy search algorithm (Lanfear et al., 2017). For phylogenetic analyses, the maximum likelihood (ML) and the Bayesian inference (BI) methods were used to construct the ML and BI trees based on two datasets (13PCG12, first and second codons of 13 PCGs these datasets were deposited in Dryad: https://doi.org/10.5061/ dryad.zkh18 93b3). The third codon positions may suffer from mutation saturation, which can lead to noise in the phylogenetic analysis (Blouin et al., 1998;Breinholt & Kawahara, 2013). Hence, the third codons were discarded from the phylogenetic analysis. The heterogeneity of sequence divergence within the two datasets was analyzed using AliGROOVE, with the default sliding window size (Kück et al., 2014).
ML analysis was performed with 1,000 rapid bootstrapping replicates using iqtree (Suchard & Huelsenbeck, 2012), whereas BI analysis was performed in MrBayes 3.2.7a with 4 chains and sampling of the chains every 1,000 generations (Nguyen et al., 2014). Two independent runs of 10 million generations were performed. After the average standard deviation of split frequencies fell to <0.001, the initial 25% of the samples was discarded as burn-in and the remaining trees were used to generate a consensus tree and calculate the posterior probabilities. The BI and ML analyses were performed on the CIPRES Science Gateway (https://www.phylo.org) website.
The phylogenetic trees were visualized using FigTree 1.4.2.

| General features of Cladolidia mitogenome
The annotations of the mitogenomes of the two Cladolidia species and the circular maps are shown in Table 1 Table 1). The gene order of these two mitogenomes is identical to that of all previously published mitogenomes of Cicadellidae and the ancestral Drosophila yakuba (Clary & Wolstenholme, 1985).
These two mitogenomes of Cladolidia contain 10 nucleotides that are dispersed among six intergenic spacers (ranging from 1 to 4 bp), and the longest spacer sequence (4 bp) is located between trnH and nad5, trnA, and trnR. There are a total of 14 overlapping regions (ranging from 1 to 11 bp), and the conserved 11-bp overlapping nucleotide sequence between trnW and trnC is extremely common in Cicadellidae (Du, Zhang, et al., 2017;Du et al., 2019;Wang et al., 2017Wang et al., , 2018Wang et al., , 2020Wang et al., , 2021. The nucleotide composition of the two Cladolidia species reveals a strong A + T bias in the entire mitogenome, and the A + T contents between C. biungulata and C. robusta are nearly equal (78.2% in C. biungulata and 78.4% in C. robusta). As with other Coelidiinae, the nucleotide composition of the two mitogenomes is clearly biased toward A/T nucleotides, with 13 PCGs, 22 tRNAs, <2 rRNAs, and a control region. This phenomenon to some extent is due to the damage or accumulation of mutations in the mitochondrial DNA (Martin, 1995).

| PCGs and codon usage of Cladolidia mitogenome
A total of 13 PCGs were identified in each of the two Cladolidia mitogenomes. In both mitogenomes, all PCGs use the canonical initiation codon ATN and the canonical stop codon TAA/TAG, except for cox2 and cox3. C. biungulata also harbors nad2, which uses an incomplete stop codon T--. This phenomenon has also been noted in other Coelidiinae insects (Wang et al., 2017;. The incomplete stop codons are modified into complete TAA codons via posttranscriptional polyadenylation during mRNA maturation (Perna & Kocher, 1995). Of note, cox1, cox3, and atp6 in each species have the same start and stop codons. The longest PCG is nad5 (1,674 bp), and the shortest is atp8 (150 bp). Only four genes (nad5, nad4, nad4l, and nad1) are present on the N-strand. The other nine genes (cox1, cox2, cox3, atp8, atp6, nad2, nad3, nad6, and cob) are located on the J-strand ( Figure 1, Table 1), which is similar to the mitogenome structure of most other Coelidiinae insects (Wang et al., 2017(Wang et al., , 2021. The RSCU values and codon number for C. robusta (very similar to C. biungulata) are shown in Figure 2. The most frequently used codon is AUA (Met, N = 367), followed by AUU (Ile, N = 340), UUA (Leu, N = 333), and UUU (Phe, N = 290). However, in previous studies, the most frequently used codon was UUU (Phe) (Wang et al., 2017(Wang et al., , 2021. Moreover, the majority of frequently used codons end with A or U (Figure 2). These two factors appear to contribute to the high A + T content of PCGs and the AT bias of the whole mitogenome.

Comparative analysis revealed that the mitogenome of
Coelidiinae is a conservative poly-T (with 28-31 bp) structure ( Figure 3). Such a large poly-T structure is not found in the mitogenomes of other leafhoppers; hence, we hypothesized that this particular structure serves as a DNA barcode for the subfamily.

| tRNAs and rRNAs of Cladolidia mitogenome
All 22 tRNAs of C. biungulata and C. robusta mitogenomes were identified; they ranged from 57 to 68 bp in length. Among the tRNA genes, 14 are located on the J-strand and 8 on the N-strand, which is the coding pattern observed in almost all Cicadellidae mitogenomes (Du, Zhang, et al., 2017;Du et al., 2019;Wang et al., 2017Wang et al., , 2018Wang et al., , 2020Wang et al., , 2021. The 22 tRNA genes in the two Cladolidia species were identified, and their secondary structures are shown in Figure 4. All these gene products are folded into the typical cloverleaf secondary structure, except trnS1, which lacks the dihydrouridine (DHU) arm; the loss of the DHU arm in trnS1 is  (Wang et al., 2017(Wang et al., , 2018. The combined length of tRNA genes of C. biungulata and C. robusta is 1,411 bp and 1,394 bp, with A + T contents of 79.4% and 78.9%, respectively. rrnS is located between trnL2 (CUN) and trnV, whereas rrnL is flanked by trnV and the control region ( Figure 1, Table 2). Two rRNA genes, rrnS and rrnL, in C. biungulata and C. robusta have the same total length (2,222 bp).
In Cladolidia, the A + T (81.8%) contents are the same and AT skews can be either positive or negative. The 22 tRNA and 2 rRNA genes are highly conserved, particularly trnI, trnA, trnR, and trnE, and the secondary structures are exactly the same between C. biungulata and C. robusta.

| Control region of Cladolidia mitogenome
The control regions are located between rrnS and trnI, with lengths of 1,016 (C. biungulata) and 1,199 bp (C. robusta), respectively. The control region has the highest A + T content (83% and 82.5%) among the two complete C. biungulata and C. robusta mitogenomes (Table 2).
Comparative analysis of the base composition of every component of the Coelidiinae mitogenomes indicated that the control regions have the highest A + T content, ranging from 82.5% (C. robusta) to 85.9% (O. obliqua). In the control region, both AT and GC skew are negative, indicating that T and C are more abundant than A and G.
The GC content was the most significant factor in determining the F I G U R E 1 Mitogenome map of Cladolidia spp codon bias among organisms, which is consistent with the general tendency of the complete mitogenome.

| Phylogenetic relationship
By detecting the base heterogeneity of mitogenome datasets used for constructing a phylogenetic tree, we can determine whether the base heterogeneity of each dataset will cause a major error in the tree construction process (Li et al., 2015;Liua et al., 2018;Morgan et al., 2013;Sheffield et al., 2009;Song et al., 2016;Timmermans et al., 2015).
On the basis of the calculation results obtained from the AliGROOVE (Kück et al., 2014) software, the heterogeneity of PCG12 and AA datasets in the mitogenomic data of Cicadellidae is weak ( Figure 5). Hence, the two datasets could be used to construct a phylogenetic tree.
BI and ML analyses using 13PCGs12 and the AA datasets generated phylogenic trees with two topologies (Figures 6, 7, S8, and S9). The monophyly of each subfamily was generally well supported in the family Cicadellidae, which is consistent with the findings of some previous molecular phylogenetic studies (Du, Zhang, et al., 2017;Du et al., 2019;Wang et al., 2017Wang et al., , 2018Wang et al., , 2020Wang et al., , 2021. However, this finding is different from that of the studies by (Xue et al., 2020) and . F I G U R E 5 Heterogeneity of amino acids (left) and PCG12 (right) in the mitogenome of Cicadellidae. Differences in heterogeneity between sequences are represented by color, with dark red (−1) to dark blue (+1) representing differences from heavy to light. PCG, protein-coding gene
However, the ML tree showed the following phylogenetic relationships: (Ledrinae + (Evacanthinae + (Mileewinae + (Typhlocybinae + (Cicadellinae + (Idiocerinae + ((Macropsinae + (Megophthalminae +Hylicinae)+ (Deltocephalinae + ((Treehoppers + (Coelidiinae +Iassinae) +))))))))))) ( Figure 7). In all BI analyses with higher approval ratings than ML analyses, this phenomenon is commonly noted in the analyses performed in previous studies; other recent analyses of relationships among some leafhopper subfamilies have yielded trees with low support for many deep internal branches (Wang et al., 2018(Wang et al., , 2020. These two relationships of BI analyses and ML analyses differ primarily in the positions of Deltocephalinae and Ledrinae. In ML-AA analysis, Ledrinae occupied the basal branch of leafhopper species in all phylogenetic analyses. This further confirms that the subfamily Ledrinae is an ancient group of leafhoppers, which is consistent with the findings of previous molecular phylogenetic studies (Du, Zhang, et al., 2017;Du et al., 2019;Wang et al., 2017Wang et al., , 2018Wang et al., , 2020Wang et al., , 2021. However, Deltocephalinae, rather than Ledrinae, occupied the basal branch of leafhopper species in other (BI-AA, BI/ML-PCG12) phylogenetic analyses. Our analyses confirm that Iassinae and Coelidiinae are assigned to the sister groups of treehoppers, Macropsinae, and Megophthalminae with high approval ratings (ML, BS = 100; BI, PP = 1.00); this result is different from that observed in previous studies (Du et al., 2019;Wang et al., 2017Wang et al., , 2018Wang et al., , 2020Wang et al., , 2021. In the present study, phylogenetic relationships showed that the subfamily Megophthalminae is a sister group of Macropsinae instead of treehoppers. In all analyses, the two species of the genus Cladolidia also clustered closely with the genus Taharana; the results showed that the genus Cladolidia is a monophylic group. However, the genus tongmaiensis). This conclusion was further confirmed based on significant differences in their morphological characteristics, which were characterized by body color, shape, and the position of the processes on the aedeagus shaft. Therefore, on the basis of the complete mitogenome phylogenetic analysis and the comparison of morphological characteristics, we propose that Olidiana is not monophyletic; hence, this genus may need taxonomic revisions.
Future studies on both the morphological and molecular characteristics of additional species are warranted to reveal phylogenetic relationships within Coelidiinae.