Mitochondrial DNA genomes of five major Helicoverpa pest species from the Old and New Worlds (Lepidoptera: Noctuidae)

Abstract Five species of noctuid moths, Helicoverpa armigera, H. punctigera, H. assulta, H. zea, and H. gelotopoeon, are major agricultural pests with various and often overlapping global distributions. Visual identification of these species requires considerable expertise, and misidentification can have repercussions for pest management and agricultural biosecurity. Here, we report the complete mitochondrial genomes of H. assulta assulta and H. assulta afra, H. gelotopoeon, H. punctigera, H. zea, and H. armigera armigera and H. armigera conferta, assembled from high‐throughput sequencing data. This study significantly increases the mitogenome resources for these five agricultural pests with sequences assembled from across different continents, including an H. armigera individual collected from an invasive population in Brazil. We infer the phylogenetic relationships of these five Helicoverpa species based on the 13 mitochondrial DNA protein‐coding genes (PCGs) and show that two publicly available mitogenomes of H. assulta (KP015198 and KR149448) have been misidentified or incorrectly assembled. We further consolidate existing PCR‐RFLP methods to cover all five Helicoverpa pest species, providing an updated method that will contribute to species differentiation and to future monitoring of Helicoverpa pest species across different continents. We discuss the value of Helicoverpa mitogenomes for species identification in the context of the rapid spread of H. armigera in the New World. With this work, we provide the molecular resources necessary for future studies of the evolutionary history and ecology of these species.


| INTRODUCTION
Accurate species identification is the foundation for all biological research; however, the scientific community is often distracted by polarized support for either traditional morphological or molecular identification of species (e.g., Hebert, Penton, Burns, Janzen, & Hallwachs, 2004; Rubinoff, 2006). It is nevertheless becoming increasingly clear that both methods contribute value and should be better integrated to provide stronger support for defining species status (e.g., Desalle, 2006). Confusion in the scientific literature, especially relating to visually similar organisms, can lead to substantial difficulty in formulating and developing management, trade, and economic policies. Furthermore, high-throughput sequencing data are revealing that hybridization between so-called species is perhaps more common than was previously thought (Anderson, Tay, McGaughran, Gordon, & Walsh, 2016; Elfekih et al., 2018).
Examples of this conflict between molecular and morphological identification include the stored grain beetles, Cryptolestes spp., where confusion remains despite recent studies combining molecular data and morphology (Tay, Beckett, & De Barro, 2016; Wang et al., 2014). In contrast, a successful example of integrating DNA data with morphological and phenotypic characters to differentiate species is the differentiation of the Asian and European honeybee mite species, Varroa jacobsoni and V. destructor, respectively (Anderson & Trueman, 2000).
Confident and unambiguous identification of invasive organisms, especially those with agricultural and economic significance, is becoming increasingly important in a highly mobile world. This can be seen with the recent incursion of the Old World cotton bollworm, Helicoverpa armigera, into the New World (e.g., Czepak, Albernaz, Vivan, Guimarães, & Carvalhais, 2013; Tay et al., 2013), and the detection of both sister species of the fall armyworm (FAW), Spodoptera frugiperda, in Africa (Cock, Beseh, Buddie, Cafa, & Crozier, 2017; Goergen, Kumar, Sankung, Togola, & Tamò, 2016; Nagoshi et al., 2017; Otim et al., 2018). Although the timing of S. frugiperda's arrival on the African continent is as yet unknown, the arrival of H. armigera in Brazil occurred sometime before the cropping season of 2012/13, when it was first identified from historical sampling efforts (Sosa-Gómez et al., 2015). The morphological similarity between H. armigera and the New World H. zea was likely an important factor in the delay in detection. Various studies (Anderson et al., 2016; Arnemann, 2015; Arnemann et al., 2016; Arneodo, Balbi, Flores, & Sciocco-Cap, 2015; Mastrangelo et al., 2014; Tay et al., 2013) have shown that H. armigera populations in Brazil and neighboring countries had wide potential geographic origins from Asia, Africa, and Europe, with their introductions strongly associated with global agricultural and horticultural trade movements into South America (Tay, Walsh et al., 2017).
Co-occurring with H. armigera across the Old World is the Solanaceae specialist H. assulta, while H. punctigera, a major agricultural pest in itself, is endemic to Australia (for a review see Hardwick, 1965). Helicoverpa armigera, H. punctigera, and H. zea are morphologically similar, and identifying them has traditionally relied on dissecting the adult male and female genitalia (e.g., Hardwick, 1965; Pogue, 2004), which is both time consuming and technically challenging. Studies by Behere, Tay, Russell, and Batterham (2008) and Fang et al. (1997) have previously assessed mtDNA and nuclear DNA genes to distinguish between the major Helicoverpa pest species. Behere et al. (2008) developed a PCR-RFLP method based on partial mtDNA COI and Cyt b genes. Arneodo et al. (2015) applied the concept of Behere et al. (2008) and developed an RFLP method to assist with the rapid differentiation between New World H. zea and H. gelotopoeon and H. armigera. However, Behere et al. (2008) and Arneodo et al. (2015) used different mtDNA COI gene regions, and identification by PCR-RFLP between these five Helicoverpa pest species would therefore require different PCR amplicons.
Recent studies characterizing complete mitochondrial DNA genomes (mitogenomes) have used high-throughput sequencing technology, which enables rapid mitogenome assembly for a wide range of insect species. High-throughput sequencing platforms with improved bioinformatic pipelines for assembling mitogenomes have also been shown to be an ideal option for studying historical specimens, in vertebrates (e.g., Anmarkrud & Lifjeld, 2017) as well as insects (e.g., Tay, Elfekih et al., 2017), where genomic DNA is typically fragmented due to the age of samples and/or poor preservation conditions. These factors represent a significant challenge to the Sanger method (Sanger & Coulson, 1975) of sequencing PCR amplicons. Furthermore, high-throughput sequencing methods bypass potential primer annealing issues and gDNA template limitations, and reduce the chances and impact of contamination.
Currently, there are published mitogenomes of H. armigera (Yin, Hong, Wang, Cao, & Wei, 2010) from China, H. punctigera (Walsh, 2016) from Australia, H. zea (Perera, Walsh, & Luttrell, 2016) (Hardwick, 1965), H. armigera conferta (present in Australia), and H. armigera armigera (present in Asia, Europe, and Africa; Hardwick, 1965; and in South America; Anderson et al., 2016), and increase the mitogenome resources of the Australian endemic H. punctigera (Table 1). We show that the available H. assulta mitogenomes are affected by misidentification (KP015198) and sequencing errors (KR149448). We also consolidate the current PCR-RFLP methods for species identification to include all five Helicoverpa pest species. Furthermore, we discuss the biosecurity implications of our study with respect to pest species identification and the importance of accurately characterized mitogenomes, while providing the molecular resources necessary for future studies of the evolutionary history and ecology of these Helicoverpa pest species.

| Helicoverpa species DNA library construction and sequencing
Fifteen mitogenomes were sequenced in this work: H. assulta assulta  (Table 1).
With the exception of the pinned historical H. assulta afra from Tanzania and H. assulta assulta from Thailand, all Helicoverpa specimens were stored in ≥95% ethanol. DNA was purified using the Blood and Tissue DNA extraction kit (Qiagen), prior to quantification using Qubit (Life Technologies). Sequencing libraries were constructed as reported in Walsh (2016). DNA extraction from H. assulta assulta (KT626655) was as reported in Perera et al. (2015), and the DNA library was constructed as described in Perera et al. (2016). Initial identification of adult H. gelotopoeon specimens from H. zea/H. armigera was as described by Hardwick (1965).
This Helicoverpa assulta individual (KP015198) is highly likely to be a misidentified H. armigera armigera individual from China based on nucleotide sequence identity and phylogenetic analysis as presented in this current study.

| Mitogenome assembly
For the assembly of the Helicoverpa mitogenomes (Table 1)

| Molecular characterization of Helicoverpa draft mitogenomes
To characterize the assembled draft mitogenomes of all Helicoverpa species, we used the program MITOS (Bernt et al., 2013), specifying the invertebrate genetic code (code #5) to identify all tRNAs, rRNAs, and the start of protein-coding genes (PCGs). The origin of replication in our assembled mitogenomes was putatively identified and, because of its low-complexity nature, we inserted a string of five Ns to indicate potential assembly difficulty across this region. The characterized mitogenomes were manually adjusted for stop codons to indicate the end of the PCGs, using the published H. zea mitogenome PCGs as reference (Perera et al., 2016, KJ930516), although we note that identifying the most likely stop codon would require sequencing of cDNA reverse transcribed from RNA (Gissi & Pesole, 2003).
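The manual stop-codon check described above can be illustrated with a minimal sketch. It assumes the coding sequence is supplied in frame from its start codon and relies on two general properties of animal mitogenomes rather than on anything specific to this study: under the invertebrate mitochondrial code (translation table 5) the complete stop codons are TAA and TAG, and some genes end in an incomplete T or TA that is completed by polyadenylation of the transcript. The function name `classify_pcg_end` is hypothetical, and this is not the procedure implemented in MITOS.

```python
# Complete stop codons under the invertebrate mitochondrial code
# (translation table 5); AGA and AGG encode serine in this code.
COMPLETE_STOPS = {"TAA", "TAG"}


def classify_pcg_end(cds: str) -> str:
    """Classify the 3' end of a protein-coding sequence.

    Assumes `cds` starts at the first base of the start codon, so the
    reading frame is given by position alone (a simplifying assumption).
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        last = cds[-3:]
        if last in COMPLETE_STOPS:
            return "complete stop (" + last + ")"
        return "no stop codon"
    # One or two trailing bases: check for an incomplete T or TA stop.
    tail = cds[-remainder:]
    if tail in ("T", "TA"):
        return "incomplete stop (" + tail + ")"
    return "no stop codon"


if __name__ == "__main__":
    # Toy sequences for illustration only, not taken from any mitogenome.
    for toy in ("ATGAAATAA", "ATGAAAT", "ATGAAAAGA"):
        print(toy, "->", classify_pcg_end(toy))
```

A sequence flagged as "incomplete stop" is exactly the case where, as noted above, transcript-level evidence would be needed to confirm the true gene boundary.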

| Confirmation of species identity
We examined all mitogenome PCGs using BlastN (Altschul et al., 1997)

| Phylogenetic analysis
We performed a phylogenetic analysis using all 13 protein-coding genes (PCGs) found in the publicly available mitogenomes of

| PCR-RFLP analysis of all five Helicoverpa pest species
Two previous studies have reported methods for distinguishing be-  (Arnemann et al., 2016;Arneodo et al., 2015;Behere et al., 2008;Leite et al., 2016;Tay, Walsh et al., 2017) were also included in the current study (Supporting Information Data S2 and S3-GenBank accession numbers and sequences used for RFLP).

| Molecular characterization of the mitogenomes of Helicoverpa (sub)species
The assembled mitogenomes of the five Helicoverpa species were estimated to be between 15,226 bp and 15,403 bp in length (Table 1).
Note. Shaded regions indicate higher than expected nucleotide sequence identity at the interspecific level between H. armigera (GU188273) and the presumed H. assulta (KP015198; KR149448).
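The nucleotide sequence identity mentioned in the note above can be computed from a pairwise alignment along the following lines. This is a sketch under stated assumptions, not the calculation used in the study: the two sequences are taken to be already aligned to equal length, and columns containing a gap (`-`) or an ambiguous base (`N`) in either sequence are skipped. The function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned nucleotide sequences.

    Columns with a gap ('-') or 'N' in either sequence are ignored
    (an assumption; other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "-N" or base_b in "-N":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment for illustration only.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Unexpectedly high identity between sequences labeled as different species, computed gene by gene, is the kind of signal that would prompt a closer check of a mitogenome's species assignment.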
TABLE 3 SNP alignment at the mtDNA Cyt b gene between Helicoverpa armigera (GU188273), two putative H. assulta individuals (KP015198, KR149448), and two H. assulta (MG437197; KT626655.1) as reported in this study
approximately Cyt b gene to the rrnS (12S rRNA) gene region (ca.  Table 4). For all mitogenomes generated in this study, corresponding partial COI genes were also found to match the partial mtCOI genes used in the study of (KP015198, Li et al., 2016; i.e  with the same tree topology as previously reported by Anderson et al. (2016). The phylogenetic relationship between H. zea and H. armigera suggested that these two species diverged from a most recent common ancestor approximately 1.5-2 million years ago (Behere et al., 2007; Pearce et al., 2017a, 2017b).

| RFLP analysis
The PCR-RFLP methods of Behere et al. (2008) and Arneodo et al. (2015), which used 511 bp of the C-terminal COI gene and 698 bp (trimmed from 732 bp) of the N-terminal COI gene, respectively, were revised such that either the N-terminal or the C-terminal region of the partial COI gene could be used to differentiate all five pest Helicoverpa species through a combination of four restriction enzymes (Table 5). The restriction enzymes were selected on the basis of a more comprehensive in silico analysis of publicly available partial COI sequences, which avoided the need to include the partial Cyt b gene RFLP analysis originally designed by Behere et al. (2008). The revised RFLP method does not include the restriction enzymes identified by Arneodo et al. (2015) and Behere et al. (2008) because of novel mtCOI haplotypes identified from global populations of H. armigera, including from South America (e.g., Arnemann, 2015; Arnemann et al., 2016; Leite et al., 2016; Tay, Walsh et al., 2017).
Species misidentification may also have resulted in the incorrect reporting of the H. assulta mitogenome (KP015198, Li et al., 2016). This misidentification is supported by the COI phylogeny (data not shown), by BlastN searches involving regions of the COI and Cyt b genes (e.g., Behere et al., 2007, 2008), and by the concatenated mitogenome PCG phylogeny (Figure 1).
Hardwick (1965), Laster and Hardee (1995), and Laster and Sheng (1995)
Note. Differentiation between all five Helicoverpa pest species for the N-terminal (698 bp) COI gene region required the restriction endonucleases BseJI, BsaL, BasBI, and BpuEI. For the PCR-RFLP of the 511 bp C-terminal region of the partial COI gene, the restriction endonucleases Bco5I, AquVI, BseRI, and Eco130I were identified. The presence and absence of restriction enzyme cut sites within amplicons are indicated by ✓ and ✗, respectively.
Expected restriction fragment lengths from either the 698 bp or 511 bp partial COI gene regions are indicated accordingly.
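The in silico step behind a table of cut sites and expected fragment lengths can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline used in the study: the amplicon is treated as linear, only the given strand is searched (adequate for a palindromic recognition site), and the example uses the well-known EcoRI site GAATTC, cut after the first base, purely for demonstration. EcoRI is not one of the enzymes listed in the note above, and the function name `digest` is hypothetical.

```python
def digest(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths after cutting a linear amplicon at every site.

    `cut_offset` is the number of bases from the start of the recognition
    site to the cut position on the searched strand. Only the given strand
    is searched, so a non-palindromic site would also need a search with
    its reverse complement (not handled in this sketch).
    """
    seq = amplicon.upper()
    site = site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    # Fragment boundaries are the amplicon ends plus every cut position.
    boundaries = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


if __name__ == "__main__":
    # Toy amplicon for illustration only: one GAATTC site.
    fragments = digest("AAAAGAATTCTTTT", "GAATTC", cut_offset=1)
    has_site = len(fragments) > 1  # corresponds to a ✓ or ✗ table entry
    print(fragments, has_site)
```

An amplicon with no site returns a single fragment equal to its full length, which corresponds to the ✗ entries; species are then distinguished by the pattern of presence, absence, and fragment sizes across the enzyme set.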

| DISCUSSION
for potential H. armigera-H. gelotopoeon, H. assulta-H. gelotopoeon, H. assulta-H. zea, and H. armigera-H. assulta hybrids remained to be tested. Failure to monitor for these interspecific hybrids may lead to invasive genotypes, such as those conferring enhanced resistance to insecticides, spreading unchecked. The identification of naturally occurring hybrids will be difficult and will require significant coordination between governmental departments (e.g., quarantine services, molecular detection, and identification facilities), as well as the development and adoption of new biosecurity policies.
However, most of these are not yet recognized by policy makers as novel, potential and/or imminent national biosecurity threats.
Regardless of the shortcomings of mitochondrial genes in assisting with species confirmation, it is nevertheless desirable to obtain well-characterized mtDNA genes to bolster biosecurity and pest management practices. The importance of rechecking assembled mtDNA against public DNA databases (e.g., NCBI GenBank, BOLD) is often not emphasized, and this has, on occasion, led to the misidentification of species and mitogenomes (Tay, Elfekih, Court, Gordon, & Barro, 2016; Walsh, 2016). While providing much-needed mitogenome resources for the Helicoverpa pest species across the Old and New Worlds, our study is not aimed at criticizing mistakes and oversights, but is rather a cautionary reminder of the need to check sequence data against readily available public DNA databases.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTIONS
WTT, TW, KG, and AM conceived and designed the study. WTT, TW, OP, CA, and AM contributed to the work. All authors contributed material, data, and analysis, and contributed to the writing of the manuscript.

DATA ACCESSIBILITY
All assembled mitogenomes are available in NCBI GenBank (Table 1).