Nepticulidae represent one of the early diverging Lepidoptera lineages, and the family currently comprises over 850 described species. The larvae of the vast majority of the species are leaf miners on Angiosperms and highly monophagous, which has led to persistent ideas on coevolution with their plant hosts. We present here a molecular phylogeny based on eight gene fragments from 355 species, representing 20 out of 22 extant Nepticulidae genera. Using two fossil calibration points, we performed molecular dating to place the origin of the family in the Early Cretaceous, before the main Angiosperm diversification. Based on our results we propose a new classification, abandoning all ranks between family and genus, as well as subgenera to allow for a stable classification. The position of Enteucha Meyrick within Nepticulidae remains somewhat ambiguous, and the species-rich cosmopolitan genus Stigmella Schrank, with nearly half of all described Nepticulidae, requires further study. Ectoedemia Busck, Zimmermannia Hering, Acalyptris Meyrick, Etainia Beirne, Parafomoria Borkowski, Muhabbetana Koçak & Kemal and Fomoria Beirne appear to have diversified in a relatively short evolutionary period, leading to short branches in the molecular phylogeny and unclear suprageneric relations. Otherwise support values throughout the phylogeny are mostly high and the species groups, genera and higher clades are discussed in respect of their supporting morphological and life-history characters. Wing venation characters are confirmed to be mostly reliable and relevant for Nepticulidae classification, but some other previously used characters require reinterpretation. The species groups of most genera are recovered, but only partly so in the large genus Stigmella. 
The molecular dating results are compared with existing knowledge on the timing of the Angiosperm radiation and reveal that the diversification of Nepticulidae could largely have been contemporaneous with their hosts, although some of the genera restricted to a single plant family appear to have begun to diversify before their hosts.
How the timing and pattern of herbivorous insect diversification relate to Angiosperm plant diversification in the Early Cretaceous remains one of the great evolutionary questions of today (Grimaldi & Engel, 2005; Tilmon, 2008; Wahlberg et al., 2013; Wiens et al., 2015). Lepidoptera, in particular, are striking in their plant-dependent diversity; the order is one of the four largest insect orders and the only one that is almost entirely associated with Angiosperms (Powell et al., 1998; Wiens et al., 2015). To understand the evolutionary history of early Lepidoptera, it is worthwhile envisioning the scene of the rapidly evolving mid-Cretaceous environment (Vakhrameev & Hughes, 1991; Skelton, 2003; Magallón et al., 2013). Imagine standing on the flood plain of the shallow Western Interior Seaway that covered much of the mid-West United States during the late Albian, c. 104 Ma (Graham, 1999). The higher country is still covered with forests composed of conifers, cycads and other early seed plants, and herds of dinosaurs roam freely. The flood plain, however, is covered with an already diverse pioneer vegetation with hundreds of species of a young group of fast-growing plants: the Angiosperms (Lidgard & Crane, 1988). The softer and more palatable leaves of angiosperms offer a new biome of resources ready to be exploited by the insects. Indeed, as an observer you will soon see that many leaves are damaged in various ways, and some show tracks inside: leaf mines (Labandeira et al., 1994, 2007).
Albian fossil leaf mines closely resemble modern-day Nepticulidae leaf mines, and the variation in morphology suggests that these insects had already diversified to some extent (Doorenweerd et al., 2015a). Higher lepidopteran groups have been identified as well, including the likely presence of the ditrysian leaf blotch mining moths Gracillariidae (Labandeira et al., 1994), but butterflies and most larger moths were probably scarce or absent (Wahlberg et al., 2013). Fossil evidence provides crucial information on the early evolution of Lepidoptera, but integration with time-calibrated molecular phylogenetic studies will be essential for a full understanding of the timing of diversification of different groups. Initial studies integrating both have mostly focused on discrepancies (Sohn et al., 2015) and many fossil records probably require reinterpretation following modern insights in Lepidoptera classification (Heikkilä et al., 2015). However, a revisionary study on Nepticulidae fossils has shown that there is huge potential in the fossil record to provide additional calibration points in studies that employ molecular dating (Doorenweerd et al., 2015a).
Knowledge of the phylogeny of Lepidoptera has matured in recent years (Regier et al., 2013, 2015; Timmermans et al., 2014; Heikkilä et al., 2015; Bazinet et al., 2013, 2016) and Nepticulidae are consistently placed among the earliest Heteroneurans. Nepticulidae and Opostegidae together form the superfamily Nepticuloidea, a sister-group relationship that is well supported morphologically and molecularly. A basal division of Heteroneura between Nepticuloidea and all other Lepidoptera is also well supported (Regier et al., 2015; Bazinet et al., 2016). This means that in the so-called Angiospermivora, the Nepticuloidea evolved after four or five other clades, together comprising c. 700 described species, had split off (van Nieukerken et al., 2011; Regier et al., 2015). More than 600 of these species belong to the Hepialidae, a group that consists of often polyphagous root feeders, wood borers and leaf feeders rather than specialist (monophagous or oligophagous) angiosperm feeders. Nepticulidae are commonly called pygmy moths, because the adult moths are among the smallest of all Lepidoptera (Fig. 1A–D), but in terms of diversity they comprise the largest group of early diverging Lepidoptera, with approximately 850 named species and an estimated 2000–2500 species globally. They are specialized endophytophagous insects, mostly leaf miners (Fig. 1E–I), but a few groups are bark or stem miners, shoot borers, or feed in green fruits of Acer L., and a handful are gall makers. Species generally feed in ‘core Eudicot’ angiosperms (APG III, 2009), with a preference for woody plants.
Ever since the erection of the genus Nepticula Heyden (Zeller, 1848; Stainton, 1849) – a junior synonym of Stigmella Schrank (Wilkinson, 1978) – the group has been recognized as a unity, and as the family Nepticulidae since 1854 (Stainton, 1854). Soon after Nepticula was erected, two more genera were recognized on the basis of venation: Trifurcula Zeller and Bohemannia Stainton (Zeller, 1848; Stainton, 1859). In the early 20th century, American authors erected four more genera with venation as leading characters: Ectoedemia Busck (also based on the galling habit), Obrussa Braun, Glaucolepis Braun and Microcalyptris Braun (Busck, 1907; Braun, 1915, 1917, 1925). Gradually the study of genitalia became standard in Lepidoptera and the first of such studies involving nepticulids divided Nepticula into a number of species groups (Petersen, 1930). Later, Beirne (1945), when describing the genitalia of the British Lepidoptera, erected five new genera, Dechtiria Beirne, Levarchama Beirne, Fedalmia Beirne, Etainia Beirne and Fomoria Beirne, and split Stigmella into Stigmella and Nepticula.
The first cladistic analysis and classification resulted from the PhD studies on the South African Nepticulidae fauna by Scoble (1983), here redrawn in Fig. 2A. The family was divided into two subfamilies: the Australian Pectinivalvinae and the global Nepticulinae, the latter subdivided into Nepticulini and Trifurculini. The use of subgenera was introduced in the genus Ectoedemia: Ectoedemia, Fomoria and Laqueus Scoble. In 1986 this classification was refined and extended on the basis of the Holarctic fauna during the PhD studies by van Nieukerken, and included larval and new adult characters in the character matrix (van Nieukerken, 1986b) (Fig. 2B). The main divisions were maintained, and two additional subgenera were included in Ectoedemia: Etainia and Zimmermannia Hering. Synapomorphies were defined for all clades, except one, the subgenus Ectoedemia (Fomoria). Both phylogenies were derived manually using Hennigian principles, by the comparison of character states and outgroup argumentation (Hennig, 1966). The next published phylogeny followed in 1994 (Puplesis, 1994) (Fig. 2C), and did not use Hennigian cladistics, but still recognized apomorphies, and divided the family into the subfamilies Nepticulinae and Trifurculinae. Pectinivalva Scoble was treated within Nepticulinae, and the previously recognized subgenera were raised to full genus, except for Laqueus, which was synonymized with Fomoria.
Four years later, in his unpublished PhD thesis on Australian Nepticulidae, Hoare prepared the first cladistic analysis of Nepticulidae using maximum parsimony algorithms employed in computer software (Hoare, 1998) (Fig. 2D). The weakness of this analysis was the low number of species included, but its strength was the inclusion of characters of all life stages and this resulted in a refined version of the divisions by Scoble (1983) and van Nieukerken (1986b). Also arising from Hoare's PhD studies, analyses of the Australian Pectinivalvinae (Hoare, 2000b; Hoare & van Nieukerken, 2013) resulted in the recognition of the new genus Roscidotoga Hoare, and division of the genus Pectinivalva into three subgenera: Pectinivalva Hoare, Casanovula Hoare and Menurella Hoare. In all cladistic analyses it was clear that some clades were supported by a whole array of characters, whereas others were hardly supported at all, and a bootstrap analysis of Hoare's dataset collapsed most of the branches in the tree into a polytomy and failed to support suprageneric groupings (Hoare, 1998).
Studies on Nepticulidae involving DNA sequence data started in the early 21st century with three gene fragments. The initial family-level results have been presented at several conferences (e.g. van Nieukerken et al., 2004a) but were difficult to reconcile with morphological findings and remained unpublished. During the PhD thesis studies of Doorenweerd, there was an opportunity to sequence up to eight genes for a comprehensive set of taxa with representatives from all polytypic nepticulid genera. Here we present the resulting molecular phylogeny and suggest a sustainable new classification for Nepticulidae. Furthermore, we use two fossil calibration points to estimate divergence times that provide insight into how one of the earliest lineages of Lepidoptera diversified alongside Angiosperm host plants. Simultaneously with this publication, a revised catalogue for Nepticulidae will be published that formalizes the new classification (van Nieukerken et al., 2016a), as well as a publication describing three new Neotropical genera and several species, which are also included in the present study (van Nieukerken et al., 2016b).
Material and methods
DNA barcoding has been systematically included in our taxonomic studies of Nepticulidae since 2005 and DNA analyses in general since 2000. This has thus far yielded over 2800 DNA extracts and DNA barcodes, with either verified species names, or temporary species names for undescribed material or material that cannot yet be linked to a described species. All DNA barcodes are available through the Barcode of Life Data System (BOLD) (Ratnasingham & Hebert, 2007). From these DNA extracts we made a selection to be analysed for multiple genes, covering all available genera and species groups. As outgroup we selected 11 exemplars of Opostegidae, the family that joins Nepticulidae in Nepticuloidea and is its undisputed sister (van Nieukerken, 1986b; Regier et al., 2015). The final dataset used for the phylogenetic analysis includes 344 ingroup specimens from 20 genera with data from at least three genes. Except for two monotypic South African genera, Varius Scoble and Areticulata Scoble, all known extant Nepticulidae genera are represented, including three new genera from the Neotropics, which are being published simultaneously (van Nieukerken et al., 2016b). Collecting details and photographs of specimens may be found in the BOLD dataset DS-NepPhylo (doi: dx.doi.org/10.5883/DS-NEPPHYLO).
DNA extraction, amplification and sequencing
The source specimens for DNA extraction were stored as dried pinned adults, as larvae frozen in >95% ethanol, or occasionally as larvae that had been dried inside their leaf mines. For many samples, DNA extraction was performed nondestructively by recovering either the abdomen with genitalia or the larval pelt from the lysis buffer after incubation (slightly adapted from Knölke et al., 2005). The genomic DNA extraction continued from the lysis step using a Macherey-Nagel NucleoMag 96 Tissue magnetic bead kit (Germany) on a Thermo Fisher KingFisher flex system (Waltham, MA). Polymerase chain reaction (PCR) was used to amplify target DNA sections of eight genes: cytochrome oxidase subunit I (COI), cytochrome oxidase subunit II (COII), translation elongation factor 1-alpha (EF1-alpha), carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and dihydroorotase (CAD), isocitrate dehydrogenase (IDH), cytosolic malate dehydrogenase (MDH), histone 3 (H3) and 28S ribosomal RNA (28S). Amplification of the genes Wingless and GAPDH was attempted using published primers and conditions (Wahlberg & Wheat, 2008). For Wingless, the resulting fragments varied in length from ∼350 to 2000 bp, judged from gel electrophoresis. Several single copy fragments of <1000 bp were sequenced, but the results were so riddled with introns that they could not be aligned at all and we did not continue with this gene. For GAPDH, the amplification success was low (19% of 95 test samples). From the successfully sequenced samples we deduced that this was probably due to sequence regions near the primer sites with high similarity to the primers. Also, in GAPDH a single intron was encountered; other introns may have contributed to difficulties during amplification. For the final eight fragments used in this study, PCR chemicals and cycling conditions follow those of Doorenweerd et al. (2015b).
Two genes that have additionally been used in this study are CAD and MDH, for which the primer pairs CADmidF and CAD1028R and hybMDF and MDHmidR, respectively, were used (Wahlberg & Wheat, 2008). PCR conditions were identical to the other nuclear markers and the annealing temperature was set at 55°C. Bidirectional Sanger sequencing was outsourced to BaseClear (Leiden, the Netherlands). The resulting chromatograms were checked for quality and congruence in geneious R6.1.8 and the resulting sequences were managed using voseq 1.7.4 (Peña & Malm, 2012).
The alignment of 28S was prepared with mafft 7 (Katoh & Standley, 2013). The sequenced fragments of COI, H3 and MDH contained no insertions or deletions (indels) or introns and were straightforward to align. In the sequenced CAD fragment we found one triplet insertion in Ectoedemia angulifasciella (Stainton), specimen RMNH.INS.12764, and in the COII fragment we found a single triplet insertion in an Ectoedemia quadrinotata (Braun) specimen, RMNH.INS.18557. In the IDH fragment there was a single position with up to four triplets inserted, which occurred in many samples throughout Acalyptris Meyrick (0–2 triplets), Etainia (always three triplets) and Stigmella (0–4 triplets). In EF1-alpha, several introns were encountered which are presented in the results, but which were removed for the phylogenetic analyses. COI sequences were available for all specimens in the final dataset, and the success rate of the remaining genes was 92% for 28S, 84% for EF1-alpha, 82% for COII, 66% for CAD, H3 and IDH, and 35% for MDH. A list of the available sequences per specimen and Genbank accession numbers can be found in Table S1. The final aligned length of the dataset used for analyses was 4557 bp.
Maximum likelihood (ML) and Bayesian approaches were used for phylogenetic inference on the concatenated dataset. The appropriate substitution model and optimal partitioning were determined using partitionfinder 1.1 (Lanfear et al., 2012). For all markers individually, as well as the combined dataset, the GTR + Gamma model proved most suitable according to the Bayesian information criterion. We initially reconstructed ML trees using the PhyML plugin in geneious (Guindon & Gascuel, 2003) for each genetic marker individually and assessed those for contamination issues or conflicting signal; we then repeated that approach for the mitochondrial markers combined versus the nuclear markers combined. Although strongly differing in resolution, there was no incongruence between the phylogenetic signal of different datasets. In all subsequent analyses, the dataset was analysed following the partitioning from partitionfinder: eight partitions mostly following the division of gene fragments, except that IDH and MDH were combined, as well as the second and third codon positions of COI and COII, and the first codon position of COI and COII. Bayesian analyses were run with the Linux MPI version of exabayes 1.4.1 (Aberer et al., 2014), and ML analyses were done using garli 2.01 (Zwickl, 2006). exabayes was set to run until 1.5% convergence between four sets of four heated chains was reached, after which the sampled trees were examined with tracer 1.6 (Rambaut et al., 2014) and revealed stable convergence and sufficient sampling [all effective sample size (ESS) values ≫ 200]. Eight best trees were searched with garli to see if a single best topology could be found consistently with the used settings, and subsequently four independent runs with 100 bootstrap replicates were performed and averaged to obtain bootstrap supports. Complete consistency in the best ML topology could not be reached, which we take into account when interpreting the results.
When interpreting the results, we considered branches with posterior probabilities (PP) over 0.95 and bootstrap values (BS) over 60 as well supported.
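The support criterion stated above can be written as a small predicate. This is a minimal sketch: the function and constant names are illustrative and not part of any phylogenetics package, and "over" is read here as a strict inequality, which is an assumption.

```python
# Thresholds as stated in the text; the names are illustrative.
PP_THRESHOLD = 0.95  # Bayesian posterior probability
BS_THRESHOLD = 60    # ML bootstrap percentage


def is_well_supported(pp: float, bs: float) -> bool:
    """Return True when both support values exceed their thresholds."""
    return pp > PP_THRESHOLD and bs > BS_THRESHOLD


# Support values reported in the Results for two nodes.
print(is_well_supported(1.0, 86))   # position of Enteucha
print(is_well_supported(0.95, 45))  # Ectoedemia + Zimmermannia
```

Under this reading the first node counts as well supported and the second does not, which matches how the two nodes are described in the Results.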
Divergence time estimation
Two fossil calibration points were selected to estimate timing of divergences with the software package beast 2.3.2 (Bouckaert et al., 2014). There is a regular occurrence of Nepticulidae-like leaf mine fossils in the fossil record since the earliest finds, representing multiple species, in the Dakota Formation of the Early Cretaceous, dated at 102 Ma (Doorenweerd et al., 2015a). We used this to calibrate the Nepticulidae crown with a log-normal distribution. The second calibration is from adult Baltic amber entombments identified as two species of Bohemannia (Fischer, 2013; Doorenweerd et al., 2015a), from a formation dated between 43 and 45.2 Ma. We used this to calibrate the crown Bohemannia clade, with a log-normal distribution. The site models were divided in eight partitions. The clock models were separated for 28S, the protein coding nuclear genes and the mitochondrial genes. For each clock set the substitution rate was constrained for a single partition within the set according to values that are likely to approach realistic biological values (Papadopoulou et al., 2010): for 28S at 6E-4, for the nuclear genes 0.0017 and for the mitochondrial genes 0.0168. With multiple partitions in one set, only one gene was constrained, allowing the results to be tested for convergence from these estimates. The length of the single chain Bayesian analysis was set to 120 million generations and results were checked for convergence and sufficient sampling using tracer 1.6. The resulting trees were combined using treeannotator (included in the beast package) and visualized with figtree 1.4.2 (Rambaut, 2014). The beast runs were repeated multiple times to verify a consistent outcome, and the reliability of the calibration points was assessed by repeating the analyses with each point left out. This increased the confidence intervals of the age estimates of the left-out calibration, but there was no incongruence between the mean of the two calibration points.
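To illustrate what a log-normal calibration implies, the sketch below computes quantiles of an offset log-normal prior, in which the fossil age acts as a hard minimum and the log-normal density describes how much older the node may be. Only the 102 Ma offset comes from the text; the shape parameters `MU` and `SIGMA` are hypothetical placeholders, since the values used in the analysis are not given here, and the function is a plain-Python illustration rather than a call into beast.

```python
import math
from statistics import NormalDist


def offset_lognormal_quantile(p, offset, mu, sigma):
    """Age (Ma) at cumulative probability p for an offset log-normal prior.

    The log-normal density applies to (age - offset), so no age
    younger than the fossil is permitted.
    """
    z = NormalDist().inv_cdf(p)  # standard normal quantile
    return offset + math.exp(mu + sigma * z)


# Offset: Dakota Formation leaf mines, 102 Ma (from the text).
OFFSET_MA = 102.0
# Hypothetical shape parameters, NOT taken from the study.
MU, SIGMA = 2.0, 1.0

for p in (0.025, 0.5, 0.975):
    age = offset_lognormal_quantile(p, OFFSET_MA, MU, SIGMA)
    print(f"p = {p}: {age:.1f} Ma")
```

The long right tail of such a prior allows the calibrated node to be considerably older than the fossil while keeping most of the prior mass close to the fossil age.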
When discussing the results, we mostly indicate the 95% highest posterior density (HPD) as the estimated age range, to prevent overconfident conclusions that may result from focusing on the mean estimated age.
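For readers unfamiliar with the 95% HPD, the following sketch shows one common way to estimate it from posterior samples of a node age: take the narrowest window that contains 95% of the sorted samples. The function name and the toy ages are illustrative assumptions; tracer and treeannotator report these intervals directly, and this is not their code.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    k = max(1, math.ceil(mass * n))  # samples the interval must cover
    # Slide a window of k consecutive sorted samples; keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Toy posterior sample of node ages in Ma (made-up numbers).
ages = [104, 107, 109, 110, 112, 113, 115, 118, 121, 140]
print(hpd_interval(ages, mass=0.8))
```

Unlike a mean, the interval makes the width of the posterior explicit, which is the reason given above for reporting it.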
Morphological terms largely follow our earlier treatments (van Nieukerken, 1986b; van Nieukerken et al., 1990; Hoare, 2000b), but we replace the previously used term aedeagus for the male intromittent organ with phallus, following the general trend in lepidopterological literature as suggested by Kristensen (2003), and we adopt Wootton's (1979) wing venation nomenclature, meaning that R2–5 become Rs1–4 and Cu changes into CuA. In the discussion we largely refer to published morphological information, but we also include unpublished data from Hoare's (1998) PhD thesis and from van Nieukerken's study, and we also checked several characters on our own material when working on this manuscript.
In the 482 bp fragment of EF1-alpha that we amplified, we encountered four positions with introns (Fig. 3). Stigmella intronia van Nieukerken & Nishida from Costa Rica (RMNH.INS.24036) (van Nieukerken et al., 2016b) contains three positions with introns; at the second of these positions (position 250), all specimens of Simplimorpha promissa (Staudinger) have a much shorter intron. A fourth intron position was encountered in larvae of an undescribed Fomoria (Fomoria RhododendronKorea, RMNH.INS.30123) and of an undescribed Stigmella (Stigmella FagaceaeGnLumut, RMNH.INS.11966) with introns of different length and composition. After removing the introns, the amino acid translations closely matched those of other Nepticulidae, indicating that there were no pseudogenes involved.
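The check described above (excise the introns, translate, and compare the protein) can be sketched as follows. The helper names, the toy sequence and the intron coordinates are hypothetical; the sketch assumes the standard nuclear genetic code and a fragment that starts in reading frame 1, neither of which is stated for the real alignment.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order (assumed; EF1-alpha is nuclear).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def remove_introns(seq, introns):
    """Cut out introns given as 0-based, half-open (start, end) pairs."""
    kept, pos = [], 0
    for start, end in sorted(introns):
        kept.append(seq[pos:start])
        pos = end
    kept.append(seq[pos:])
    return "".join(kept)


def translate(seq):
    """Translate in frame 1; a trailing partial codon is ignored."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        for i in range(0, len(seq) - 2, 3)
    )


# Toy example: two exons interrupted by a 9 bp intron.
raw = "ATGGCT" + "GTGAGTTAG" + "AAAGGT"
spliced = remove_introns(raw, [(6, 15)])
print(translate(raw))      # MAVS*KG: the intron introduces a stop codon
print(translate(spliced))  # MAKG: open reading frame restored
```

A spliced sequence that translates without internal stop codons and matches the protein of related taxa is what one expects from a functional copy rather than a pseudogene.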
The results of the phylogenetic analyses of the 355-taxon data set are summarized in Fig. 4, which presents the eight-gene Bayesian topology of the 344 ingroup taxa along with ML bootstrap support values for all species groups and higher branching (see Figure S1 for the full beast Bayesian tree with all taxon names). Although we did not sample the Opostegidae as extensively as the Nepticulidae, with only three out of seven genera, our results show an interesting, albeit poorly supported, sister-group relationship between Opostega Zeller and Opostegoides Kozlov, never seen in the earlier morphologically based phylogenies (Davis, 1989; Davis & Stonis, 2007). Nepticulidae are always monophyletic with highest possible support (PP 1, BS 100). Twenty monophyletic clades are interpreted as full genera. The thorough sampling allowed us to confirm many existing species groups within many of the genera and designate some new ones. Only some of the relationships between genera are well supported, but close examination of the topologies of intermediate as well as the final phylogenetic analyses allowed us to distinguish a limited number of possible distinct topologies for most cases, which we discuss and which are indicated by grey dotted lines in Fig. 4.
The first division of the Nepticulidae is between the genus Enteucha Meyrick and all remaining Nepticulidae (Fig. 4 node 1). Although the position of Enteucha is well supported (PP 1, BS 86), upon examining the results of individual analyses we encountered several bootstrap ML outcomes, as well as a beast Bayesian analysis that converged on a suboptimal PP, where Enteucha is joined on a clade with Stigmella and Simplimorpha Scoble (Fig. 4 node 2), which would be the previously morphologically recognized group ‘Nepticulini’ (Fig. 1B, D). The grouping of Stigmella and Simplimorpha together receives good support (PP 1, BS 75): lack of full support is due to the variable position of Enteucha. Stigmella is by far the largest genus within Nepticulidae, with almost half of all species. From our phylogeny it is possible to subdivide Stigmella into two or three clades, but morphological evidence for this is incomplete at this stage. We therefore tentatively name two clades ‘core Stigmella’ (the clade including the type species), and ‘non-core Stigmella’. There are currently many species groups within Stigmella, some of which are partially or wholly recognized in our phylogeny. The well-supported groups are indicated in Fig. 4.
The next group that splits off in the remaining part of the tree (Fig. 4 node 3) comprises the new Neotropical genus Ozadelpha van Nieukerken and the (chiefly) Australian genera: Roscidotoga, Casanovula, Pectinivalva and Menurella (Fig. 4 node 4). As far as is currently known, the latter four genera only include species from Australia, except for one species known from the island of Borneo, Menurella xenadelpha (van Nieukerken & Hoare). In most analyses, Ozadelpha is sister to these remaining genera, and occasionally Ozadelpha is grouped with Roscidotoga. The division between Roscidotoga and the remaining Australian genera is well supported, but the remaining suprageneric supports are low. Although the exabayes Bayesian analysis supports three monophyletic genera, in the ML bootstrap analyses as well as repeated beast analyses, Casanovula minotaurus (Hoare) is placed among Menurella, making Menurella paraphyletic.
The sister-group relation between a clade of the typically Australian genera + Ozadelpha (Fig. 4, node 4) and the trifurculine genera (Fig. 4, node 6) is always recovered. The phylogenetic placement of the two newly discovered Neotropical genera Hesperolyra van Nieukerken and Neotrifurcula van Nieukerken remains uncertain. In the current results, they are grouped together, but supports are lacking, probably due to a combination of long branch attraction and poor taxon sampling. Also their phylogenetic origin relative to Bohemannia remains uncertain (Fig. 4 node 5). The position of Bohemannia relative to all other genera has been stable in all analyses. Glaucolepis and Trifurcula (Fig. 4 node 7), both formerly in Trifurcula, form a monophylum with high support, and the support for both genera and the species groups within these genera is high. The previously recognized subgenus Levarchama is now redefined as the Trifurcula cryptella species group.
Suprageneric relationships in the remaining part of the tree (Fig. 4 node 8) are uncertain from our current data. However, in any case, there is no support for a monophyletic Ectoedemia sensu lato with subgenera Ectoedemia, Zimmermannia, Etainia, Muhabbetana Koçak & Kemal (=Laqueus) and Fomoria (van Nieukerken, 1986b; Hoare, 1998, Fig. 2B, D). In the presented phylogeny as well as all examined bootstrap topologies, an Ectoedemia sensu lato is always paraphyletic with regard to Acalyptris and Parafomoria Borkowski. The redefined genera are all recovered individually with high support, except for Fomoria, where the analysis possibly suffered from a relatively large amount of missing data for the sampled taxa (Figure S1). Ectoedemia and Zimmermannia are joined without high support (PP = 0.95, BS = 45) (Fig. 4, node 9); the only observed alternative topology groups Ectoedemia with Etainia as sister. Acalyptris is in the presented phylogeny joined with Etainia, but in intermediate results it has also been grouped with Parafomoria or Muhabbetana.
The estimated age of origin of Nepticulidae falls completely within the Early Cretaceous (Fig. 5). Almost all of the currently recognized genera are estimated to have originated before the end of the Cretaceous, 65 Ma. In an analysis in which the beast run converged at a topology with suboptimal PP, with Enteucha grouped with Stigmella and Simplimorpha, this only affected the age estimates for Enteucha and Simplimorpha, dating the split between the two at an average of 78 Ma (data not shown). There are large differences in the number of known species per genus and we aimed with our sampling to cover the diversity across the genera. The percentage of species included per genus ranges from 4 to 100%, mean 38% (Fig. 5). However, it should be noted that our estimates of diversity are minimum values, even though they include undescribed diversity that is known from collections. Several large regions such as the Neotropics, Tropical Africa and Australia are, to a great extent, under-studied. Nonetheless, our current data suggest that Acalyptris, Fomoria, Stigmella and possibly Enteucha (the latter somewhat dependent on its systematic position) had probably already begun to diversify in the Cretaceous.
Introns in Nepticulidae
During our DNA sequencing efforts we encountered more introns than would have been expected from previous Lepidoptera-wide sequencing efforts (Mutanen et al., 2010). Introns in EF1-alpha have previously not been reported for Lepidoptera, but are present in most other insect orders (Djernæs & Damgaard, 2006). Recently, phylogenomic data mining has revealed that there are probably two copies of EF1-alpha in Lepidoptera, where one copy is highly fragmented by introns (Niklas Wahlberg, in litt.). The phylogenetic signal we found in EF1-alpha for Nepticulidae is congruent with other genes, leading us to believe either that we have been consistently targeting only one of the two copies with our primers, or that the two copies evolve in parallel. Either case should have no negative effect on our results, but care should be taken when our data are reused in other studies. In the vast majority of specimens (98.7%) that we successfully sequenced for EF1-alpha, we did not encounter introns, but part of the 16% that failed to amplify may have failed due to introns, rather than primer site mismatches. During the exploration of suitable gene fragments to be sequenced, we also found an intron in GAPDH, and many introns in different positions in Wingless. For these genes there are no indications that there are multiple copies. The reasons for a higher abundance of introns in Nepticulidae remain unclear. The introns were found in different unrelated genera and appear to contain no phylogenetic information.
Plausibility of the divergence time estimations
There are many studies that urge caution in interpreting molecular dating results (Warnock et al., 2012; Wheat & Wahlberg, 2013; Wilf & Escapa, 2014), and show that the results largely depend on the number and quality of the calibration points (Magallón et al., 2013). In our analysis we used two calibration points, and we are fairly confident of their reliability and placement. With only two calibrated nodes it is no surprise that the 95% HPD often exceeds 20 Ma, but we feel that such large confidence ranges provide a realistic view of the current knowledge. This view holds when we compare our results with previously published molecular dating studies of Lepidoptera. Wahlberg et al. (2013) used the Mutanen et al. (2010) eight-gene genetic dataset covering all Lepidoptera, and used seven calibration points. Within Nepticuloidea they included one Ectoedemia, E. occultella (Linnaeus), and one Opostega, O. salaciella (Treitschke), and estimated the split between the two at between 101 and 136 Ma. This corresponds to the split between Nepticulidae and Opostegidae, which in our analysis is estimated at between 103 and 145 Ma. Their study also included the Dakota Formation fossil leaf mines (Labandeira et al., 1994) as one of the calibrations. However, instead of using it to calibrate the crown Nepticulidae, as in our study, it was used to calibrate the stem. A second, mitigating, issue is that the date of the Dakota Formation fossils was mistakenly given as 120 Ma. The Dakota Formation was dated at 99 Ma at the time of the publication of the fossils (Labandeira et al., 1994), but has now been readjusted to 102 Ma (see Doorenweerd et al., 2015a). These two departures by Wahlberg et al. from our own assumptions here presumably cancelled each other out and led to an age estimate for Nepticulidae comparable to our own.
There are currently no other molecular dating studies that have included Nepticuloidea, but from two other recent studies that included Lepidoptera it is clear that we are in the early stages of understanding the timing of Lepidoptera diversification. In an insect-wide transcriptome-based study (Misof et al., 2014), genetic data from 1478 single copy nuclear genes of 144 taxa, of which 10 were Lepidoptera, were calibrated with 37 fossils. The crown age for Lepidoptera was estimated at between 118 and 180 Ma, significantly younger than estimated in the study by Wahlberg et al. (2013), where the crown age for Lepidoptera was estimated at between 200 and 230 Ma. In another recent study (Condamine et al., 2016), 874 taxa, of which 114 were Lepidoptera, eight genes and 89 fossils were used and the Lepidoptera crown age was estimated even earlier, at ca. 250 Ma. Increasing the reliability of the fossil calibration points will most likely bring important progress in the coming years and will place these findings into perspective. Nonetheless, our findings for Nepticulidae appear plausible within a range of estimates of the current studies.
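The comparisons of age ranges made in this section can be made explicit with a short calculation. The sketch below is an illustration only and was not part of the original analyses: the function name `overlap` and the representation of each range as a (younger bound, older bound) pair in Ma are assumptions made for this example, and the ranges themselves are those quoted in the text above.

```python
from typing import Optional, Tuple

# Hypothetical representation: (younger bound, older bound) in Ma
Interval = Tuple[float, float]


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the age range shared by two intervals, or None if they are disjoint."""
    young = max(a[0], b[0])
    old = min(a[1], b[1])
    return (young, old) if young <= old else None


# Nepticulidae-Opostegidae split, ranges as quoted in the text
wahlberg_split = (101.0, 136.0)    # Wahlberg et al. (2013)
this_study_split = (103.0, 145.0)  # present analysis

# Lepidoptera crown age, ranges as quoted in the text
misof_crown = (118.0, 180.0)       # Misof et al. (2014)
wahlberg_crown = (200.0, 230.0)    # Wahlberg et al. (2013)

split_overlap = overlap(wahlberg_split, this_study_split)
crown_overlap = overlap(misof_crown, wahlberg_crown)

print(split_overlap)  # (103.0, 136.0)
print(crown_overlap)  # None
```

Under these assumptions, the two estimates for the Nepticulidae–Opostegidae split share most of their range, whereas the two quoted Lepidoptera crown-age ranges do not overlap at all.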
On the use of systematic ranks
Our analyses recover the previously defined subfamily Pectinivalvinae and tribe Trifurculini as monophyletic clades (as sister groups), but we never recover a monophyletic Nepticulinae as in previous classifications (Fig. 2) and rarely a monophyletic Nepticulini. Based on our current knowledge we therefore discontinue the use of subfamily and tribe within Nepticulidae (and Opostegidae) for the time being. We also abandon the use of subgenera, which were introduced in the 1980s (Scoble, 1983; van Nieukerken, 1986b), mainly because the former large genus Ectoedemia (Ectoedemia + Zimmermannia + Etainia + Muhabbetana + Fomoria) is shown to be polyphyletic, which can only be addressed by either raising all subgenera to full genus, or including even more genera in this large polytypic entity. We reject the latter solution because that would leave no reliable or practical apomorphies. Instead, the newly assigned genera are almost all – except Fomoria – readily defined by apomorphies and well recognisable in adult and larval morphology and biology. Because the use of subgenera is considered awkward by many users and is impractical in many databases, we decided to extend this policy throughout the family and also abandon the use of subgenera in Trifurcula (Trifurcula + Glaucolepis) and Pectinivalva (Pectinivalva + Casanovula + Menurella). We discussed the new classification and the arguments extensively with other lepidopterists, who overall endorsed our views.
A new Nepticulidae classification
In this section, we review the agreement and disagreement of our molecular results with previous hypotheses based on morphology. We proceed sequentially through the tree in Fig. 4 using named genera or node numbers for higher clades, discuss how our findings correspond to the new classification – which is formally established in van Nieukerken et al. (2016a) – and include discussion on the molecular dating results presented in Fig. 5.
Nepticulidae are morphologically well supported by at least nine apomorphies (Scoble, 1983; van Nieukerken, 1986b), including the unique sensillum type, sensillum vesiculocladum (van Nieukerken & Dop, 1987), and in the larvae the reduction of abdominal setae to six pairs per segment and the larval antenna with only two basiconic sensilla, an apomorphy not previously published (listed by Hoare, 1998). It is therefore no surprise that the monophyly of Nepticulidae is also molecularly well supported. Overall it is striking that clades that were previously well supported in the morphological cladograms, typically by at least five apomorphies, are also supported in molecular analyses and most differences are in previously poorly supported clades (Scoble, 1983; van Nieukerken, 1986b; Hoare, 1998). The reliable morphological characters often include wing venation characters, confirming their relevance for the Nepticulidae generic classification. However, the thickening of vein 1 + 2A in the forewing was formerly considered an apomorphy of Nepticulinae (Enteucha + nodes 2 + 6), with the Pectinivalvinae (node 4 minus Ozadelpha) showing the plesiomorphic, normal condition, but the character must now be considered homoplasious. The cathrema, a usually striate thickening around the opening of the ductus ejaculatorius into the vesica of the phallus in the male genitalia, was previously considered as a synapomorphy for the Nepticulinae, and considered absent in the Pectinivalvinae (van Nieukerken, 1986b; Hoare et al., 1997) and in Enteucha considered to be replaced by a smooth thickening (van Nieukerken, 1986b). More recent studies proved that the cathrema was present in Pectinivalvinae as well (Hoare, 2000b; Hoare & van Nieukerken, 2013) and close examination of genitalia slides of Enteucha has convinced us that a striate thickening is, in fact, always or nearly always present, although it is usually narrow and only weakly striate, and is often obscured by cornuti. 
The thickening itself should be termed the cathrema, whilst the interconnected sclerites are considered to be associated structures and not part of the cathrema. The presence of a cathrema is thus an additional apomorphy for the Nepticulidae. It has been suggested (John Dugdale, in litt.) that the cathrema may represent a modified or internalized bulbus ejaculatorius, a structure that is absent in Nepticulidae. A major previous argument for a sister-group relationship between Nepticulinae and Pectinivalvinae has been the reduction of the number of antennal segments in the larva: whereas Pectinivalvinae have a two- or three-segmented antenna, all Nepticulinae share the reduced antenna with a single segment (van Nieukerken, 1986b; Hoare, 2000b; Hoare & van Nieukerken, 2013). The current phylogeny suggests either that this reduction happened more than once, or that segments were regained in node 4. Unfortunately we do not yet know the condition in Ozadelpha.
Enteucha is a small genus with currently 11 species described and about 8 undescribed species known from collections. We were able to include eight species of Enteucha in our phylogeny, about 42% of the known diversity. In all morphology-based classifications, Enteucha has been placed in a monophyletic group with Stigmella and Simplimorpha (Fig. 2). In our analyses this relationship is not recovered; instead, Enteucha is the first clade to split off. In a study on nonditrysian Lepidoptera based on data from 19 genes (Regier et al., 2015), three representatives of Opostegidae and seven representatives of Nepticulidae were included. In their results, as in our results, Ectoedemia and Trifurcula group together, but contrary to our results, they showed high support for a clade with Enteucha, Pectinivalva and Stigmella, which was also the result from the morphological treatment by Puplesis (1994; Fig. 2B). Although the most likely outcomes from our data suggest otherwise, we did find indications in the bootstrap trees as well as suboptimal Bayesian topologies that there is some support in the molecular data for grouping Enteucha with Stigmella and Simplimorpha. The condition of the collar was previously used to characterise this grouping (former Nepticulini): comprising lamellar scales versus the plesiomorphic condition of piliform scales. Lamellar scales are known from all species in the genera Enteucha and Stigmella (although sometimes less clearly so) and have now also been found in Ozadelpha and in subgroups of Bohemannia and Acalyptris. It is still a useful character to recognize genera and subgroups, but does not define any larger clades in a cladistic sense. Other morphological synapomorphies that support this grouping include the larval labrum without lateral setae, pupa with only a single row of spines per segment (van Nieukerken, 1986b), and the presence of a subdorsal retinaculum, which is paralleled in Acalyptris (Hoare, 1998).
It will probably require a combination of many more genes as well as taxa to fully resolve this issue. By abandoning intermediate ranks between family and genus it currently does not affect the classification. Enteucha receives the oldest estimated stem age, between 102 and 135 Ma, and a crown age between 51 and 84 Ma. This age is puzzling in the light of the single host plant family: the Polygonaceae are considered a relatively young family (34–41 Ma: Forest & Chase, 2009). It seems unlikely that the various species colonized the Polygonaceae independently. Either the species evolved on the ancestor or extinct stem group species of Polygonaceae, or the age of the host family requires further study. Within Enteucha, E. basidactyla (Davis) from Florida is sometimes classified in a separate genus, Manoneura Davis (Puplesis & Robinson, 2000), but it consistently groups with the other species from Florida, E. gilvafascia (Davis) (both feeding on sea grape, Coccoloba uvifera L.), and both are always subordinate in a larger clade with unnamed Asian species and the European E. acetosae (Stainton). This supports our view that Manoneura has been correctly synonymized (van Nieukerken, 1986b). The rather unique characters of the male genitalia of E. basidactyla should therefore be considered as highly autapomorphic, and without value for generic classification. The monotypic Varius from South Africa, feeding on Ochnaceae, is potentially a synonym of Enteucha based on morphology, but no recent material is available for DNA analyses. Without molecular evidence and because of its isolated occurrence and aberrant host-plant choice, we prefer to keep it separate for the time being. Morphologically Enteucha can be recognized by the reduced venation (absence of Rs1+2), the absence of a transverse bar in the transtilla (paralleled in Pectinivalva, Glaucolepis, part of Acalyptris) and the anterior apophyses lacking anterior apodemes (van Nieukerken, 1986b). 
For the condition of the cathrema, see earlier. van Nieukerken (1986b) also listed unmelanized patches in the anterior sclerotization of tergum 2, but this is doubtful as the amount of melanization can vary and an insufficient number of species of Nepticulidae has been screened for this character.
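The sampling coverage quoted for Enteucha (eight sequenced species out of 11 described plus about 8 undescribed species) follows from a simple ratio. The sketch below reproduces that arithmetic; the function name `sampled_fraction` and its parameters are hypothetical and serve only as an illustration, and the count of undescribed species is itself approximate.

```python
def sampled_fraction(sampled: int, described: int, undescribed: int = 0) -> float:
    """Fraction of the known diversity (described + known undescribed) that was sampled."""
    known = described + undescribed
    if known <= 0:
        raise ValueError("known diversity must be positive")
    return sampled / known


# Enteucha: counts as given in the text (undescribed count is approximate)
enteucha = sampled_fraction(sampled=8, described=11, undescribed=8)
print(f"{enteucha:.1%}")  # 42.1%
```

The same ratio can be applied to any other genus for which counts of described and known undescribed species are available.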
Node 2: Stigmella and Simplimorpha
In all analyses Simplimorpha is consistently placed as sister to Stigmella, with an estimated divergence time between the two in the Cretaceous, between 82 and 118 Ma. Stigmella and Simplimorpha share the venation, with curved main trunk of R + Rs + M usually with four terminal branches and a separate CuA. Otherwise there are no obvious morphological apomorphies, unless Enteucha is included in this clade (van Nieukerken, 1986b).
There are only two species of Simplimorpha known, with clear allopatric distributions, and both have been sampled in our phylogeny. There are no undescribed species known from collections, but given the poor knowledge on the African fauna there is potential for undiscovered diversity. Both known species are oligophagous on Anacardiaceae. Simplimorpha promissa (Staudinger) feeds on Pistacia L. species, Rhus L. species and Cotinus coggygria (Scop.) in the Mediterranean area, and S. lanceifoliella (Vári) feeds on many Searsia F.A.Barkley species (formerly in Rhus), Protorhus longifolia (Bernh. ex C.Krauss) Engl. and introduced Schinus molle L. in Southern Africa (Scoble, 1983). All specimens of S. promissa that we sequenced for EF1-alpha contained an 80 bp intron (Fig. 3), which was not present in S. lanceifoliella. The estimated divergence time between the two species is in the Paleogene, between 27 and 57 Ma. This is the oldest estimated divergence time between two sister-species in the phylogeny (see Figure S1).
Stigmella is by far the largest nepticulid genus, containing nearly half of all the known species (currently 428) and can be found on all continents except Antarctica. From our current sampling and estimations, the crown Stigmella age is estimated to be Late Cretaceous, 68–99 Ma, making it one of the earliest diverging genera. For such a large and well-supported genus, it has remarkably few morphological apomorphies. van Nieukerken (1986b) only listed two: uncus bilobed (bifid) and larval antenna with sensilla placed cross-wise. To these we can add collar with lamellar scales, although sometimes appearing hairy, and this is paralleled in Enteucha, Ozadelpha, part of Bohemannia and Acalyptris. There are several exceptions to the bilobed uncus, e.g. Stigmella naturnella (Klimesch). The previously used character ‘gnathos with single posterior process versus two processes’, e.g. to separate the lapponica group as basal from the rest, appears to be very homoplasious; the previously considered plesiomorphic condition, a single process, similar to most other Nepticulidae, occurs in many groups throughout the genus and is possibly not always the plesiomorphic condition, but may have originated from fusion of the two processes. This is supported by New Zealand species, which exhibit a range of degrees of fusion, with S. tricentra (Meyrick) showing an intermediate state with processes fused basally and contiguous apically (Donner & Wilkinson, 1989: fig. 97, an accurate representation). From the molecular phylogeny it is clear that Stigmella can be subdivided into two (or three) large clades, which may warrant generic status. However, without sufficient morphological evidence to support this, it seems impractical to do so at this point and instead we have here termed them ‘core Stigmella’ (including the type species) and ‘non-core Stigmella’. 
One character that seems to follow the division into two clades is the feeding position of the larva: only larvae of core Stigmella feed with their dorsum upwards, whereas all other Nepticulidae feed with their venter upwards. This is probably a strong apomorphy for the core group, but impractical for classification purposes when larvae are unknown. Our results recover the following species groups with sufficient support: in non-core Stigmella: the S. prunifoliella, S. ultima, S. ulmivora, S. saginella and S. betulicola groups; and in core Stigmella: the S. sanguisorbae, S. lapponica, S. ogygia, S. epicosma, S. salicis, S. quercipulchella, S. anomalella, S. hybnerella, S. oxyacanthella, S. aurella, S. lemniscella, S. floslactella and S. ruficapitella groups (Fig. 4). In the genitalia, non-core Stigmella males usually have a wide uncus, sometimes shallowly bilobed, a juxta is often present, a manica never; the phallus often has only few cornuti. In core Stigmella the uncus is either bifurcate or sometimes deeply split into four lobes, and a juxta is rarely present. In a large clade including the S. aurella and S. ruficapitella groups there is often a manica (phallocrypt) around the phallus, which usually has many cornuti. In non-core Stigmella, female genitalia often have spiny signa, paired (e.g. S. betulicola group, some in S. ultima group) or unpaired (S. saginella group, some species in the S. paliurella group), accessory sacs are rare and, when present, rather small. Core Stigmella females usually have a strong accessory sac; in part of the S. ruficapitella group this sac clearly takes over the function of the reduced corpus bursae (van Nieukerken & Johansson, 2003). Several species groups in both clades are specialized on host plant families such as Fagaceae, Betulaceae, Rosaceae and Rhamnaceae.
Most Old World tropical species are placed in non-core Stigmella and feed on such host families as Fabaceae, Moraceae, Euphorbiaceae, Phyllanthaceae, Meliaceae, Rutaceae, Dipterocarpaceae and Malvaceae.
Notable is the well-supported grouping of all examined New Zealand species (the S. ogygia group), the South American members formerly placed in an extended S. salicis group (Puplesis & Robinson, 2000), for which we propose the name ‘S. epicosma group’ and a strict S. salicis group (those feeding on Salicaceae). Overall the males in this clade have rather similar genitalia, and females have a noncoiled ductus spermathecae as potential synapomorphy. This clade splits first between the New Zealand S. ogygia group and the S. epicosma + S. salicis groups; the latter could be suggested to have a Neotropical origin, as the Neotropical Salix L. feeder Stigmella molinensis van Nieukerken & Snyers appears to be sister to all other S. salicis group members. This geographic distribution can only be explained by long-distance dispersal, as at the average estimated age of this split (Eocene, 41.5 Ma), New Zealand and Latin America were even further apart than today. It is also notable that all Asteraceae feeding Stigmella species are in this clade: both in New Zealand and in Latin America, even though both groups also use several other host plant families (Donner & Wilkinson, 1989).
Other new findings include an expanded S. lapponica group, containing several Rosaceae feeders, including S. malella (Stainton) and S. slingerlandella (Kearfott) (Figure S1) next to the originally included Betulaceae feeders, and another assemblage of groups feeding on Rosaceae: the S. hybnerella, S. paradoxa and S. irregularis groups, which we combine here as the S. hybnerella group, even though genitalia characters are rather diverse. Several other species groups that have previously been recognized were not or only partly recovered or we sampled only one species or none at all. Where applicable, they are discussed further in the catalogue (van Nieukerken et al., 2016a). Future studies on Stigmella that include more taxa globally, particularly from tropical regions, will be needed to get a better grip on the evolutionary history of this genus.
This new grouping, which unites the former Pectinivalvinae and Trifurculini, is better supported in the Bayesian analysis than in the ML analysis (PP = 1, BS = 55). As earlier studies did not recognize this grouping, no morphological apomorphies have been noted previously, nor can we easily point out any now.
Node 4: Ozadelpha and the Australian genera
Ozadelpha has been newly described from the Neotropics (van Nieukerken et al., 2016b) and at present contains at least four species, of which one is included in our phylogeny: Ozadelpha specimen EvN4680, which is closely related to the type species Ozadelpha conostegiae van Nieukerken and Nishida. Both species feed on Conostegia D. Don (Melastomataceae) in Costa Rica. Another species feeds on Myrtaceae, and has been recombined by us as Ozadelpha guajavae (Puplesis & Diškus) (van Nieukerken et al., 2016b). The stem age for Ozadelpha is estimated at 51–86 Ma. We find it likely that more species of this genus are to be discovered in the Neotropics. Currently all known species have two fasciae on the forewing, and a collar comprising lamellar scales. In the venation, Ozadelpha resembles Stigmella, but CuA is usually very long, and as the condition of the collar is also similar to Stigmella an external diagnosis is difficult. The long CuA resembles that in Roscidotoga and, also in the genitalia, both genera share several characters: the large vinculum, bilobed uncus, and broadened anterior apophyses. However, Ozadelpha does not share any of the listed adult or pupal synapomorphies of the Australian genera (Hoare, 2000b). Larvae have not yet been studied. The grouping of Ozadelpha with all typically Australian nepticulid genera, Roscidotoga, Pectinivalva, Menurella and Casanovula, is well supported (PP = 1, BS = 97). If the hosts for the Ozadelpha species we currently know are representative of the genus, the majority of species in node 4 are associated with Myrtales.
A major character that gave its name to the genus Pectinivalva, including Casanovula and Menurella, is the valval pecten that was considered homologous to the stalked pectens in Opostegidae and various Adeloidea (Scoble, 1983; van Nieukerken, 1986b). In the light of our findings it is more likely that the pecten in most Casanovula, Menurella and Pectinivalva species is a neoformation, similar to a type of pecten found in several Acalyptris species (see also Regier et al., 2015). The morphological characters supporting a grouping of Pectinivalva, Menurella, Casanovula and Roscidotoga, and those supporting each of these genera individually, are detailed by Hoare (2000b) and Hoare & van Nieukerken (2013). Although morphological characters appear reliable and support a generic status for all groups, instead of the previous subgeneric status, monophyly for all genera is not always recovered in the molecular results. This is probably a sampling issue: the estimated sampled diversity for these genera is between 4 and 38%, and especially low for Pectinivalva (4%) and Menurella (7%). Both are expected to have a diversity in the order of 70–80 species based on counts in the Australian National Insect Collection (Hoare, 1998). In the molecular dating analyses, one species of Casanovula is consistently recovered among Menurella, and Roscidotoga is grouped with Ozadelpha. The age estimates for these clades are therefore combined: the stem age for Pectinivalva + Menurella + Casanovula is estimated at between 64 and 97 Ma. The three genera probably diverged after the K-Pg boundary, in the Paleogene, somewhat later than most other genera, which originated in the Cretaceous. Hoare & van Nieukerken (2013) stressed the fact that some species of Menurella and Casanovula are still specialized on rainforest-dwelling Myrtaceae, whereas all Pectinivalva species, where known, feed on drought-resistant Eucalyptus, a genus that diversified particularly after Australia's Miocene aridification (5–24 Ma).
Still, we see that splits between these genera in our estimate predate this drying out, with the estimate for the stem-group of Pectinivalva between 27 and 50 Ma. For Roscidotoga the stem age is estimated here at between 51 and 86 Ma, but there is some added uncertainty due to its variable position relative to Ozadelpha. While Hoare (2000b) still suggested a split between Pectinivalva sensu lato and Roscidotoga in the Early Cretaceous, later Hoare & van Nieukerken (2013) assumed that this split dates closer to the Miocene aridification; the latter is supported by our data.
Node 6: trifurculine genera
The remaining 12 genera form a well-supported monophyletic group (PP = 1, BS = 63). Previously this grouping has been classified as the Trifurculini or Trifurculinae (Fig. 2) and is well supported morphologically, amongst others, by the character ‘veins R + Rs and M of forewing separate basally, forming closed cell’ (van Nieukerken, 1986b). This character is also valid for the clade Hesperolyra + Neotrifurcula + Bohemannia, where it is apparent in Neotrifurcula but reduced in the other two genera. The character ‘paired reticulate signa in female bursa copulatrix’ is another character that is almost always present in this node, but probably secondarily reduced in groups with a reduced bursa (Hesperolyra, Parafomoria and the Acalyptris staticis group). The origin of this group is securely estimated in the Cretaceous.
The grouping of Bohemannia with the two new Neotropical genera Hesperolyra and Neotrifurcula is not well supported, and these genera also have little in common morphologically (van Nieukerken et al., 2016b). The position of the new genera remains unclear, and these taxa often acted as ‘rogue taxa’ in our analyses and could be recovered in very different parts of the tree. Both genera probably have many more species in the Neotropics. Age estimates for these remain uncertain, but they are probably of Cretaceous origin.
Bohemannia is a small Palearctic genus comprising one leaf-mining species [Bohemannia pulverosella (Stainton) on Malus Mill.], whereas the remaining six species are probably shoot or bud miners, although rearing records are mostly absent (van Nieukerken & Johansson, 1990). Hosts are known through association and at least one reared specimen from Betulaceae (Alnus Mill., Betula L.), without knowledge of the larval feeding (van Nieukerken, 1986a). Recently, two fossil species were recognized from Baltic Amber (Fischer, 2013), which allowed us to use these as a calibration point for divergence time estimations. The stem age is estimated as 58–94 Ma, but this is also somewhat uncertain due to the problematic placing of Hesperolyra and Neotrifurcula. The phylogenetic placement of Bohemannia relative to the other genera in node 6 as found here is similar to that in Puplesis (1994), whereas van Nieukerken (1986b) regarded it as sister group to Ectoedemia sensu lato on the basis of two larval characters: the shape of frontoclypeus and length of tentorial arms. Given that Ectoedemia is no longer recovered as monophyletic in a sensu lato composition, these characters are probably unreliable for classification.
Node 7: Trifurcula and Glaucolepis
The new classification matches that of Puplesis (1994; Fig. 2C) in assigning generic status to Trifurcula and Glaucolepis. Other classifications have treated a larger Trifurcula, with subgenera Trifurcula, Glaucolepis and Levarchama (Fig. 2B, D). The previously recognized subgenus Levarchama is not raised to full genus, but treated here as the T. cryptella species group. Phylogenetically this entails no change, and the group is strongly supported by the combination of three probably interrelated characters: Rs + M with three branches in hindwing, male abdomen with paired tufts on T6–8, and the underside of the male hindwing with a velvet patch of androconial scales.
The genus Glaucolepis currently comprises c. 40 named species, of which the majority are known from the Mediterranean region. Most are leaf or stem miners in various trees and shrubs, a few feed in herbs and probably some make galls. Three apomorphies were given by van Nieukerken (1986b), and a fourth was suggested by van Nieukerken & Puplesis (1991), while adding some doubt to two of the previous apomorphies in the male genitalia. Our results recognize three well-supported species groups: the G. raikhonae group comprises three Palearctic species for which feeding habits are unknown, except for the gall-making G. oishiella (Matsumura) [=Sinopticula sinica (Yang)], which feeds on Prunus L. (Yang, 1989; van Nieukerken & Puplesis, 1991). The group has a wider distribution, as we already know unnamed species in this group from Australia and North America. The G. saccharella group contains the type species G. saccharella (Braun) from eastern North America, a leaf miner of Acer (Sapindaceae) and an unnamed species from Japan that makes leaf mines on various woody Fabaceae. The G. headleyella group is the most diverse and is confined to Europe, the Mediterranean and adjacent areas. Many species in this group mine in more than one leaf and continue to the next leaf via the petiole and stem, or sometimes only mine in the stem. Most of these feed on Lamiaceae, several on Plantaginaceae (Globularia L.) and Apiaceae (Bupleurum L.) and a few other families (Laštůvka & Laštůvka, 2000, 2007; Ivinskis et al., 2012; Laštůvka et al., 2013). The stem age for Glaucolepis is estimated as 59–88 Ma, and the crown age as 45–70 Ma. For the G. raikhonae group (three species) the crown age is 16–40 Ma, for the G. saccharella group (two species) it is 19–39 Ma and for the G. headleyella group (12 species) it is 25–40 Ma. 
The proliferation of the last group in the Mediterranean region, where species specialize on small drought-resistant shrubs, now common in Mediterranean habitats such as maquis and garrigue, is probably partly explained by the Oligocene–Miocene aridification of the region (Dong et al., 2013). We were unable to study substantial DNA data of the two Neotropical species assigned to Glaucolepis, but morphologically they are so different from other Glaucolepis that it is very unlikely they are placed correctly. At least G. argentosa Puplesis & Robinson, of which we examined one male morphologically and have a partial DNA barcode, lacks the three apomorphies of Trifurcula + Glaucolepis and genitalia only show some superficial similarities. It is not unlikely that these species represent another new genus.
The genus Trifurcula as defined here is largely a western Palearctic genus, with 36 named species and several unnamed species. The genus and its three species groups are all recovered with highest possible support, congruent with earlier morphological analyses (van Nieukerken, 1986b, 1990, 2007). The best morphological apomorphy is the loss of the connection between Rs1+2 and Rs3+4 in the forewing and another apomorphy is the host-plant family, Fabaceae, for which no exceptions are known. The T. cryptella group (seven species) comprises leaf miners on herbaceous and shrubby Fabaceae: Loteae, all in Europe and North Africa, including Macaronesia. The group was revised and its phylogeny analysed by van Nieukerken (2007). The T. subnitidella group (eight species) comprises European and Mediterranean species that make stem mines, occasionally starting in a leaf, also in herbaceous and shrubby Fabaceae, tribes Loteae and Hedysareae (van Nieukerken, 1990). The remaining c. 20 named species and several unnamed species belong to the T. pallidella group, and all are stem miners of Fabaceae belonging to the tribe Genisteae in the Mediterranean region (Laštůvka & Laštůvka, 1994, 2005; van Nieukerken et al., 2004c, 2010). Our phylogeny supports the evolutionary scenario sketched by van Nieukerken (2007): the ancestor of Trifurcula in the present sense was a leaf miner on Loteae, and in the clade of the T. subnitidella + pallidella groups the stem-mining habit evolved; in the T. pallidella group a shift to Genisteae as host plant occurred. Scoble (1980) added two South African species to Trifurcula, of which the position is unknown. We doubt whether they are genuine Trifurcula, but a further molecular analysis is required to answer this doubt. The stem age of Trifurcula is 59–88 Ma, and the crown age is 29–47 Ma. 
As in Glaucolepis, the proliferation of species in this genus can possibly be partly attributed to the Oligocene–Miocene aridification of the region (Dong et al., 2013).
Node 8: seven genera with unclear suprageneric relationships
This clade comprises the genera Parafomoria, Acalyptris and the former subgenera of Ectoedemia: Fomoria, Muhabbetana, Etainia, Zimmermannia and Ectoedemia. The estimated origin is Late Cretaceous (72–102 Ma), and they probably diverged in a relatively short time, leading to short branches and low support values. Alternative topology hypotheses are indicated with grey dotted lines in Fig. 4. Most of the genera are well defined by morphology and life-history characters, but no morphological apomorphies can be indicated for the whole clade.
Fomoria is the genus that stands out as being not well defined either molecularly or morphologically, an issue noted previously (Scoble, 1983; van Nieukerken, 1986b). The absence of distinct apomorphies has probably also led to the previous inclusion of species that are now in Hesperolyra, on grounds of some similarity in male genitalia (Puplesis & Robinson, 2000). In total, 48 named species belong to Fomoria, with the largest number, 22 species, in Africa. In our analysis, we incorporated three species groups, although all receive poor support. The F. weaveri group with 14 named and several unnamed species comprises mainly leaf miners on Hypericum L., Ericaceae and Rutaceae, and most species pupate inside the leaf mine. The species are characterized by a number of apomorphic genitalia characters (Hoare, 2000a; van Nieukerken, 2008). The Hypericum and Ericaceae feeding species form a clade, sister to the Rutaceae feeding species. As sister to these groups, we see an Australian species reared from Scolopia Schreb. (Salicaceae), the very species that one of us discussed as the first potential southern hemisphere member of the F. weaveri group, based on a single specimen, before the mine and larva were discovered by us at the same locality (Hoare, 2000a). The F. groschkei group (three named species from Europe and Africa, several unnamed Asian ones) comes as poorly supported sister in some of our analyses or as paraphyletic. It comprises several species feeding on woody Lamiaceae that were previously placed in Verbenaceae (Vitex L., Callicarpa L.), but we also found one species on Bignoniaceae (Radermachera Zoll. & Moritzi) in Taiwan. Unfortunately we were only able to sequence one species of the morphologically and biologically well-supported F. vannifera group (Hoare, 2000a), F. vannifera (Meyrick, 1914) itself. Members of this group are leaf miners on Brassicaceae: Capparis L. and relatives. Here F. vannifera groups, without support, with the isolated Apiaceae feeding species F. viridissimella (Caradja). The stem age for Fomoria is estimated as 66–96 Ma, and the crown age as 57–86 Ma, showing overlapping estimates. The genus originated already in the Cretaceous.
Muhabbetana, known as Ectoedemia (Laqueus) in the majority of the literature, is an essentially African clade with 32 named species, with a small group of species occurring in the Mediterranean region and Macaronesia. The Afrotropical species feed on Ebenaceae and Celastraceae, whereas the Mediterranean species feed on Euphorbiaceae (genus Euphorbia L.) and Apocynaceae. The genus was previously synonymized with Fomoria (Puplesis, 1994; van Nieukerken et al., 2004b), but in our analyses the species ascribed to it always group separately. Morphologically it is characterized by the anal loop in the forewing and the larval stipes with two setae (Scoble, 1983; van Nieukerken, 1986b), but the last character has only been checked for Mediterranean species. Unfortunately, we have not been able to include the type species M. grandinosa (Meyrick) or a close relative in our analyses, but we include one unidentified African species (RMNH.INS.24076), and African material studied after finishing these analyses (not included here) also confirms this placement. The stem age is estimated as 63–92 Ma, and the crown age as 50–78 Ma.
The genus Parafomoria is a small group with eight named, and around seven unnamed species, occurring only in the Mediterranean region with one species going north into Central Europe. Most species occur in the Iberian Peninsula. All species make leaf mines on shrubby Cistaceae, often in winter. Morphologically the genus can be recognized by the reduced venation (loss of Rs1+2), expansion of the lateral arms of the vinculum, reduction of female bursa and development of male hair pencil. There is also an apomorphy in the 28S gene: a large gap in the D2 region occurs in all species and can be ascribed to the shortening of one loop in the secondary structure. The two groups recognized in the molecular analysis conform to the morphological division into two groups (van Nieukerken, 1983). The age for crown-group Cistaceae is around (18.5–)14.2(–10.2) Ma (Guzmán & Vargas, 2009; Vargas et al., 2014). Our estimates for the crown group Parafomoria are much older, between 38 and 61 Ma. Whether this is due to unknown host relationships in Parafomoria with ancestors of the Cistaceae and its sister group Dipterocarpaceae, or to inaccuracy in the estimated age of Cistaceae is unclear. The age estimate for the stem of Parafomoria is 71–102 Ma.
The genus Etainia, with 16 named and five unnamed species, is both molecularly and morphologically well supported (Puplesis & Diškus, 1996; van Nieukerken & Laštůvka, 2002). As far as is known none of the species are leaf miners, but they feed in various ways in organs other than leaves. Several Holarctic species are associated with Acer (Sapindaceae) and either feed in the winged fruits and seeds (in the summer generation) or in the buds and shoots (winter generation or univoltine generation) (Kulman, 1967; Johnson, 1982; Emmet, 1984), but E. albibimaculella (Larsen) and an unnamed North American species feed on Ericaceae, and the possibility that E. obtusa Puplesis & Diškus feeds on Fraxinus L. (Oleaceae) cannot be excluded (van Nieukerken & Laštůvka, 2002). Hosts for the African species are unknown, but Acer does not occur there. The included African species is sister to the clade of Holarctic species, at some distance, suggesting an African origin. The estimated stem age is 66–96 Ma, and the crown age is 26–58 Ma.
The genus Acalyptris is the second largest genus after Stigmella, with 93 named and more than 60 unnamed species. It is widespread especially in the (sub)tropical and desert regions of the world, but in North America species can be found at higher latitudes in bogs and wetlands, feeding on Cyperaceae (E.J. van Nieukerken, personal observation). Most species are leaf miners, some mine stems, and hosts are varied. There are records from at least 21 plant families, mostly Eudicots, but also the monocot family Cyperaceae. There is one morphological apomorphy for the genus: closed cell in forewing shifted towards base, vestigial. The 19 species in our analysis group in four relatively well-supported species groups: the New World A. scirpi group; the West Palearctic A. staticis group, specialized on Plumbaginaceae; the Palearctic A. psammophricta group – formerly also known as the repeteki group – that occurs primarily in desert and steppe areas of Central Asia, the Middle East and North Africa; and the A. platani species group that is confined to the Old World. The estimated stem age is 66–96 Ma, and the crown age is 62–91 Ma. Like Fomoria, Acalyptris shows a large overlap in the estimates of the crown and stem groups, suggesting an early diverging genus.
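The overlap between stem and crown estimates noted for Fomoria and Acalyptris can be made explicit with a small calculation. The following is only an illustrative sketch, not part of the dating analysis: the `AgeRange` type and the `overlap_fraction` function are hypothetical names introduced here, and the intervals are the stem and crown ranges (in Ma) quoted in the genus accounts, treated simply as closed intervals.

```python
from typing import NamedTuple


class AgeRange(NamedTuple):
    """An age interval in Ma (hypothetical helper type for this sketch)."""
    young: float
    old: float


def overlap_fraction(stem: AgeRange, crown: AgeRange) -> float:
    """Return the fraction of the crown interval that lies inside the stem interval."""
    shared = min(stem.old, crown.old) - max(stem.young, crown.young)
    return max(shared, 0.0) / (crown.old - crown.young)


# Stem and crown ranges as quoted in the text (Ma).
estimates = {
    "Fomoria": (AgeRange(66, 96), AgeRange(57, 86)),
    "Acalyptris": (AgeRange(66, 96), AgeRange(62, 91)),
    "Etainia": (AgeRange(66, 96), AgeRange(26, 58)),
    "Zimmermannia": (AgeRange(54, 80), AgeRange(19, 35)),
}

for genus, (stem, crown) in estimates.items():
    print(f"{genus}: {overlap_fraction(stem, crown):.2f} of crown range overlaps stem range")
```

Under this simple measure, the crown ranges of Fomoria and Acalyptris fall largely within their stem ranges, whereas those of Etainia and Zimmermannia do not overlap their stem ranges at all. The measure ignores the shape of the underlying posterior distributions, so it should be read as a rough summary only.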
The small genus Zimmermannia is restricted to the Holarctic region with 17 named species and around eight unnamed ones, of which nine species are included in the analysis. The species are bark miners where known, with most species associated with Fagaceae, but a few with Ulmaceae, Salicaceae and possibly Betulaceae (E. J. van Nieukerken, unpublished data). Whereas old mines are usually easily seen, finding and collecting larvae present a challenge and most species are only known from light-collected adults. Morphologically the sister-group relationship with Ectoedemia is well supported (van Nieukerken, 1986b), which is also the best-supported molecular scenario. The stem age is estimated at 54–80 Ma, and the crown age at 19–35 Ma.
Ectoedemia is the third largest genus in Nepticulidae, with 90 named and more than 50 unnamed species, especially in East Asia. It has a wide distribution in the northern hemisphere, and is particularly diverse in the Palearctic region. The phylogeny of Ectoedemia has already been treated in detail in a previous study (Doorenweerd et al., 2015b) and will not be repeated here. The present analysis differs from the published phylogeny in that the clade with the African species, the E. commiphorella species group, is not the first to split off, but instead splits off after the clade with the E. subbimaculella and E. populella species groups. Considering the low support for this, the real position of the African clade remains uncertain. The stem age for Ectoedemia is estimated to be between 54 and 80 Ma, either just before or just after the K-Pg boundary. The crown age is more securely estimated, between 33 and 49 Ma, Eocene. Because the centre of diversity for Ectoedemia is in the Palearctic and we have a thorough sampling coverage of all species groups, we believe this is one of the most reliable estimates in our dataset. All species in the E. populella group feed on Salicaceae, and started diversifying between 14.3 and 24.4 Ma, Miocene (see Figure S1). The crown age for Salicaceae is generally estimated to be much older, with the most recent estimates in the Cretaceous, between 73 and 87 Ma (Xi et al., 2012). The crown age for the E. angulifasciella group, for which the ancestral host was most probably a Rosaceae (Doorenweerd et al., 2015b), is estimated between 25.1 and 36.5 Ma. The estimated crown age for Rosaceae is just after the K-Pg boundary, around 61 Ma (Hohmann et al., 2015), or in the Cretaceous, around 88 Ma (Chin et al., 2014), in either case, long before the E. angulifasciella group started to diversify. The other species groups in Ectoedemia receive similar crown age estimates, ranging from 12 to 38 Ma, covering the late Eocene, Oligocene and Miocene.
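The comparisons between insect and host ages made in these accounts follow a simple interval logic, which can be sketched as below. This is a hedged illustration rather than the method used for the dating: the function name `compare_ages` is hypothetical, the ranges are those quoted in the text (in Ma), and the Rosaceae "range" is an assumption that merely spans the two published point estimates of around 61 and 88 Ma.

```python
def compare_ages(insect, host):
    """Compare two (young, old) age ranges in Ma and describe their relation."""
    insect_young, insect_old = insect
    host_young, host_old = host
    if insect_young > host_old:
        return "insect clade older than host"
    if insect_old < host_young:
        return "host older than insect clade"
    return "ranges overlap"


# (insect crown range, host crown range), values as quoted in the text.
comparisons = {
    "E. populella group vs Salicaceae": ((14.3, 24.4), (73.0, 87.0)),
    # Assumption: Rosaceae range spans two point estimates (c. 61 and c. 88 Ma).
    "E. angulifasciella group vs Rosaceae": ((25.1, 36.5), (61.0, 88.0)),
    "Parafomoria vs Cistaceae": ((38.0, 61.0), (10.2, 18.5)),
}

for label, (insect, host) in comparisons.items():
    print(f"{label}: {compare_ages(insect, host)}")
```

With these inputs the two Ectoedemia species groups come out as younger than their host families, while crown-group Parafomoria comes out as older than crown-group Cistaceae, matching the discrepancy discussed for that genus.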
Most genera and higher clades are distributed throughout several biogeographical regions without clear evolutionary patterns, but there are some splits that may correlate with the shifting of tectonic plates (Fig. 5). In particular the strongly supported relationship between Ozadelpha, which appears restricted to the Neotropics, and all endemic/near-endemic Australian genera (Fig. 4 node 4) suggests ancient vicariance. The split between these two is estimated to have occurred in the Late Cretaceous, when all southern continents were practically merged and slowly broke up. However, other splits in the phylogeny indicate that long-distance dispersal has probably occurred multiple times throughout the evolutionary history of Nepticulidae and cannot be ruled out for the Ozadelpha–Australian relationships. For example, the age of the monophyletic clade of Stigmella species from New Zealand (S. ogygia group) is estimated at 29–46 Ma, whereas New Zealand has been isolated for c. 80 Ma (Waters & Craw, 2006). To further understand the role of Gondwanan vicariance it would be important to target future sampling on southern hemisphere areas such as New Caledonia, which has an extremely diverse flora, including many ancient angiosperm groups. Other general biogeographic trends observed in the phylogeny include the repeated apparent dispersal from Africa to different parts of the northern hemisphere: in Simplimorpha, Muhabbetana, Etainia and Ectoedemia. To further understand this pattern it will be vital to include further sampling of African taxa. In terms of species numbers, two of the three largest genera, Stigmella and Ectoedemia, representing around two-thirds of Nepticulidae diversity, appear to have diversified primarily in the temperate regions. Within Stigmella, the predominantly tropical clades include fewer species and older splits than the clades mostly distributed in the temperate region. 
Overall the biogeographical region producing the most spectacular finds at present is the Neotropics, with three new genera already included in this paper from very fragmented and incidental sampling, undoubtedly leaving more to be found. Other groups of leaf miners have also been shown to have a large undiscovered Neotropical diversity, including at generic levels (Lees et al., 2014; Gilson R. P. Moreira, in litt.).
Contemporaneous evolution with Angiosperms
The molecular chronogram presented in Fig. 5 allows us to make a first comparison of the timing of diversification of Nepticulidae with respect to that of the Angiosperms. The main Angiosperm radiation has been estimated to have occurred between 90 and 110 Ma, during which most plant lineages split into the groups we now mostly refer to as families (Schneider et al., 2004; Magallón et al., 2013; Silvestro et al., 2015), and this has previously been suggested as an important driver of speciation in Lepidoptera (Wahlberg et al., 2013). Our molecular dating results show that it is likely that Nepticulidae were already present before this time, meaning that there was ample opportunity for coevolution in a broad sense during the main Angiosperm diversification in the Cretaceous and thereafter. The pattern of host use we observe in Nepticulidae does not mirror the phylogeny of Angiosperms in any way. At most we can say that the majority of species feed on plants in the fabid (Eurosid I) clade, but there are many exceptions to that rule. We find six genera that are (almost) restricted to a single host plant family: Enteucha on Polygonaceae, Casanovula on Myrtaceae, Menurella on Myrtaceae (one exception), Pectinivalva on Myrtaceae, Trifurcula on Fabaceae and Parafomoria on Cistaceae. With two of those, Enteucha and Parafomoria, the estimated stem age of the leaf-mining genus is older than the estimated crown age of the host family, indicating that there has probably been a different ancestral host, or that the dating estimates of either the insects or the hosts are inaccurate. Recent studies on angiosperms have focused more on the diversification, rather than just stem and crown ages (Xing et al., 2014; Bouchenak-Khelladi et al., 2015). This reveals that in Fagales and Rosales, predominant host orders for Nepticulidae, diversification intensified during the Miocene. 
Combined with the observation that most nepticulid species specialize on just one or several closely related host species, it is likely that many of the host relations we observe at species or species group level were established during the Miocene. It will be particularly interesting in future studies to include fine-scale host-use data and compare this with the diversification estimates of the hosts.
First and foremost, we are much indebted to everyone who has contributed specimens for our analyses, including (in alphabetical order): Leif Aarvik, James Adams, David Agassiz, Val Albu, Kees Alders, Giorgio Baldizzone, Bengt Å. Bengtsson, Kees van den Berg, Oleksiy V. Bidzilya, Willy Biesenbaum, Richard Brown, Sifra Corver, Don Davis, Jurate and Willy De Prins, Charley Eiseman, Willem Ellis, Cees Gielis, Robert Heckford, Douglas Hilton, Martin Honey, Michael Hull, Povilas Ivinskis, Jari Junnilainen, Lauri Kaila, Ole Karsholt, Sjaak Koster, Mikhail Kozlov, John Langmaid, Aleš and Zdeněk Laštůvka, Houhun Li, Anna Mazurkiewicz, Wolfram Mey, Marko Mutanen, Kenji Nishida, Greg Pohl, Lynn Raulerson, Jadranka Rota, Rudi Seliger, Chris Snyers, Kazuhiro Sugisima, Reinhard Sutter, Rumen Tomov, Paolo Triberti, Hector Vargas, David L. Wagner, Andreas Werno, Hugo van der Wolf, K. Yazamaki, Shen Horn Yen, Michael Zerafa and Vadim Zolotuhin. We would like to thank our colleagues in the molecular laboratory of the Naturalis Biodiversity Center, Dick Groenenberg, Frank Stokvis, Kevin Beentjes and Marcel Eurlings, for their help and hints. We are indebted to Kees van den Berg for the many ways he assisted us, e.g. with rearing, mounting of moths, as well as optimizing the methods for preparing genitalia and larvae. We very much appreciate that Roland Johansson and the Swedish Species Information Centre permitted us to use the accurate watercolour drawings of Nepticulidae for Fig. 5, and we appreciate George Sinnema allowing us to use his excellent photographs of live Nepticulidae in Fig. 1. We would like to thank Niklas Wahlberg for making us aware of the copy of the elongation factor 1-alpha gene and discussing this matter, and we thank Marie Djernaes for discussing the relevance of introns in elongation factor 1-alpha and her help with determining the exact position of these. 
We thank John Dugdale for sharing his insightful ideas on the cathrema and Gilson Moreira for sharing his information on newly discovered Gracillariidae genera in the Neotropics. Hugo van Duijn and Atze de Vries of the IT department of Naturalis were very helpful with compiling multi-threaded versions of phylogenetic analysis software in an OpenStack cloud computing environment, which allowed us to perform computationally demanding analyses. Our work was made possible by financial support from the Dutch government Economic Restructuring Fund (FES) as well as multiple collecting expeditions funded in part by the Uyttenboogaart Eliasen Foundation. A Naturalis Temminck Fellowship to RJBH in 2009, to finalize publications arising out of his PhD thesis, led the authors to their discussions on the nepticulid phylogeny that eventually led to this collaborative work. We thank two anonymous reviewers and the editor for their comments, which helped to improve the manuscript.