Genome streamlining via complete loss of introns has occurred multiple times in lichenized fungal mitochondria

Abstract Reductions in genome size and complexity are a hallmark of obligate symbioses. The mitochondrial genome displays clear examples of these reductions, with the ancestral alpha‐proteobacterial genome size and gene number having been reduced by orders of magnitude in most descendent modern mitochondrial genomes. Here, we examine patterns of mitochondrial evolution specifically looking at intron size, number, and position across 58 species from 21 genera of lichenized Ascomycete fungi, representing a broad range of fungal diversity and niches. Our results show that the cox1gene always contained the highest number of introns out of all the mitochondrial protein‐coding genes, that high intron sequence similarity (>90%) can be maintained between different genera, and that lichens have undergone at least two instances of complete, genome‐wide intron loss consistent with evidence for genome streamlining via loss of parasitic, noncoding DNA, in Phlyctis boliviensisand Graphis lineola. Notably, however, lichenized fungi have not only undergone intron loss but in some instances have expanded considerably in size due to intron proliferation (e.g., Alectoria fallacina and Parmotrema neotropicum), even between closely related sister species (e.g., Cladonia). These results shed light on the highly dynamic mitochondrial evolution that is occurring in lichens and suggest that these obligate symbiotic organisms are in some cases undergoing recent, broad‐scale genome streamlining via loss of protein‐coding genes as well as noncoding, parasitic DNA elements.

. The modern mitochondrial genome is derived from an ancient alpha-proteobacterium, which, since its endosymbiosis with ancestral eukaryotes roughly 1.45 BYA (Martin & Mentel, 2010), has undergone significant reductions in genome complexity and size via loss of both protein-coding genes and intronic sequences and intergenic spacers (Adams & Palmer, 2003;Gray et al., 1999;Khachane et al., 2007).
The extent of mitochondrial genome reduction varies substantially among taxa and can even vary between closely related sister species (Dibb, 1993;Jo & Choi, 2015;Lynch, Koskella, & Schaack, 2006;Signorovitch, Buss, & Dellaporta, 2007;Simmons et al., 2015;Wang, Zhang, Li, & Zhang, 2018). Bilateral metazoan mitochondrial genomes are highly consistent in size (16-20 kbp in length), usually contain the same 37 coding features, and lack introns or retrotransposable elements (Beagley, Okada, & Wolstenholme, 1996;Saccone, Giorgi, Gissi, Pesole, & Reyes, 1999). In contrast, other lineages of life, such as plants, have mitochondrial genomes that vary in content and size by up to three orders of magnitude (Alverson, Rice, Dickinson, Barry, & Palmer, 2011). Variations in content and size can be partially explained due to dynamic gains and losses of repetitive noncoding DNA (intergenic spacers) and selfish genetic elements (introns and transposable elements) that have parasitized portions of these genomes (Feschotte, Jiang, & Wessler, 2002;Paquin et al., 1997;Pogoda, Keepers, Lendemer, Kane, & Tripp, 2018). The differences in the presence/absence of these selfish genetic elements within the powerhouse organelle of eukaryotes are a major distinction between different broad evolutionary lineages. There are two types of self-splicing introns that are present in the mitochondrial genomes of most eukaryotic lineages, group I and group II, both of which are partial ribozymes and have the capability of moving themselves within the genome (Saldanha, Mohr, Belfort, & Lambowitz, 1993). In addition, both types of introns contain internal open reading frames (ORFs) that encode for intron-encoded proteins (IEPs) that additionally help to promote the mobility of the introns that they occupy (Belfort, 2003;Belfort & Bonocora, 2014;Belfort, Derbyshire, Parker, Cousineau, & Lambowitz, 2002). Group I introns typically encode for homing endonucleases (HEGs) types LAGLIDADG and GIY-YIG, while group II introns usually encode for reverse transcriptase genes (RT) (Lang, Laforest, & Burger, 2007).
These genetic elements and other retrotransposable elements are often considered selfish as they pose no obvious value to their host genome (Edgell, Chalamcharla, & Belfort, 2011). However, because of their frequent replication and transposition throughout the genome, these genetic elements have the capability of introducing mutations within the host genome upon their insertion (Cambareri, Foss, Rowtree, Selker, & Kinsey, 1996;Nagy & Chandler, 2004). As such, these genetic elements have developed strategies that minimize mutation during insertion by avoiding initial disruption of the host exon-intron structure (Edgell et al., 2011). The HEG element can then function to spread both itself and its host intron throughout the genome (Burt & Koufopanou, 2004;Thiéry, Börstler, Ineichen, & Redecker, 2010) unless it is lost because of mutational events or host repression mechanisms (Brookfield, 2005;Chevalier & Stoddard, 2001). In addition, these elements are known to be able to move horizontally (Goddard & Burt, 1999;Wu & Hau, 2014) between different species genomes which helps to maintain their persistence.
Prior work characterizing mitochondrial evolution in lichens is limited but has revealed a highly variable landscape of introns across mycobionts (Brigham et al., 2018;Funk et al., 2018;Pogoda et al., 2018). Here, we employ data from 58 lichen mycobionts to examine broad-scale patterns of intron gains, losses, and genome streamlining in seven different lineages of lichens: Lecanorales, Peltigerales, Telochistales, Ostropales, Pertusariales, Mycocaliciales, and Arthoniales ( Figure 1). Specifically, we (a) record genome-wide intron presence and sequence similarity in an evolutionary framework by inferring gains and losses through ancestral state reconstructions; (b) test the number of times complete or partial intron loss has occurred across the evolutionary history of the studied taxa; (c) examine intron sequence similarity and position in the cox1 gene; and (d) quantify instances of genome streamlining via loss of selfish parasitic genetic elements, such as introns and homing endonucleases.

| Sample collection
To analyze intron gain and loss across lichenized fungal (i.e., mycobiont) mitochondrial genomes, we selected 58 Ascomycete species that span seven lineages of Lecanoromycetes. These 58 species represent 21 different genera. Twenty-two mitochondrial genomes were previously sequenced and annotated (Pogoda et al., 2018); the remaining 36 genomes were newly assembled for the present study, and all are available on GenBank (Supporting Information Table S1).
All 58 species are native to the southern Appalachian Mountain biodiversity hotspot of eastern US area and were collected in the wild during fieldwork between 2016 and 2017. All specimens are deposited in the herbaria of the New York Botanical Garden (NY) and University of Colorado, Boulder (COLO) (Supporting Information   Table S1). Efforts were made to sample only single thallus for both macro-and microlichens; however, due to the physically small size of microlichens, more than one individual was sometimes included.
For macrolichens, ca. 1 × 1 cm of tissue was removed, targeting the thallus margins and lobes. For microlichens, tissue was scraped from rock or tree substrates using a sterile razor blade. Tissue samples were air-dried in a laminar flow hood for 24 hr and then frozen at −20°C until transport to the University of Colorado for DNA extraction and sequencing.

| DNA extraction and sequencing
Dried samples were first pulverized using tungsten carbide bearings in a Qiagen 96-well plate shaker. Genomic DNA (gDNA) was extracted from tissues using a Qiagen DNeasy 96 plant kit. The manufacturer's protocol was modified to include a 10 min of 65°C incubation step for the ground material in lysis buffer, as well as a 100% ethanol wash, before final drying of the membrane prior to elution, which has been shown to improve DNA concentration and purity (Pogoda et al., 2018). Extracted samples were stored at −20°C prior to library preparation.
F I G U R E 1 Fifty percent majority rule consensus tree from Bayesian analysis, with posterior probabilities mapped at each node. Tree rooted using Artonia susa, A. ruana, and Opegrapha vulgata. Genera for which multiple species were sampled are demarcated with colored boxes. To the right of each species is a cartoon representation of intron presence and location within the cox1 gene. Sequence similarity between introns is represented by unique colors. A black-colored intron indicates a unique, likely derived intron for that species. Introns are colored to indicate sequence similarity within a single genus (i.e., blue in one genus is not the same intron as blue in other genera) Genomic libraries were prepared following standard protocols using Nextera® XT DNA library prep kits (Illumina®), with 1 ng input DNA. Samples were barcoded using unique dual index adapters Nextera® i5 and i7. Libraries were cleaned using solid-phase reversible immobilization (SPRI) to remove fragment sizes <300 base pairs. Quality control (QC) for pooled samples was conducted to ensure appropriate sample concentration and fragment size using a Qubit 3.0 fluorometer and an Agilent 2,100 Bioanalyzer.
All wet laboratory work was performed in the Department of Ecology and Evolutionary Biology at the University of Colorado, Boulder. Sequencing was conducted at the University of Colorado BioFrontiers Institute Next-Generation Sequencing Facility in Boulder, Colorado.

| Mycobiont genome assembly
Raw demultiplexed sequences were trimmed to exclude adaptor sequences using Trimmomatic-0.36 using the parameters "ILLUMINACLIP:NexteraPE-PE.fa:2:20:10MINLEN:140 LEADING:20 TRAILING:20" (Bolger, Lohse, & Usadel, 2014), with the file "NexteraPE-PE.fa" containing the standard set of Nextera adapters to be trimmed from reads. Resulting fastq files were de novo assembled using SPAdes version 3.9.0 with the following parameters: careful -k 35,55,85 (Bankevich et al., 2012). The resulting assemblies included genomic representatives of all taxa (e.g., primary mycobiont, secondary fungal partners such as endolichenic and surficial fungi, bacterial symbionts, and photobionts) present in the metacommunity at the time of tissue sampling. Depth of the assembly was roughly proportional to the amount of input DNA such that the primary fungal and photobiont partners have the highest coverage in contrast to other symbionts.
We conducted several steps to ensure the mitochondrial sequences presented in this study belonged to the desired mycobiont rather than the photobiont or any other symbiont (such as endolichenic fungi) present in the metacommunity at time of sampling.
First, we used command-line BLAST to a representative lichenized Ascomycete mitochondrion (Usnea ceratina: NCBI accession NC_035940) to identify candidate contigs as mitochondrial, and these contigs typically had coverage of about 10-20 times that of nuclear genome contigs. Second, these contigs were then web BLASTed to the NCBI nonredundant database. In every taxon examined, the longest and highest coverage contigs identified with the command-line BLAST had very high % identity (>95%) web-BLAST hits to the expected lichenized fungus at common barcoding loci.
Third, contigs were circularized using the raw genomic reads and error-corrected using SAMtools tview (Li et al., 2009), and tview was used to ensure that no contigs assembled as chimeras between the mycobiont mitochondrion and another mitochondrion present in the meta-assembly. Chimeric junctions appear as abrupt changes in alignment depth and sharp cutoffs in read alignments; tview revealed no chimerism in the assemblies.
Annotations were conducted using DOGMA (Wyman, Jansen, & Boore, 2004) and then prepared for submission in Sequin 15.10

| Genomic content
To assess gene and intron content for each mycobiont mitochondrion, gene boundaries and intron boundaries were identified using BLAST to determine exon/intron boundaries. The cox1 gene was focused on in the analyses because it contained the greatest number of introns of any gene within each genome. Gene length, intron length, and sequence with homology to homing endonucleases (LAGLIDADG and GIY-YIG) for the cox1 gene were summed to determine overall length. For example, if there were eight ORFs with homology to a HEG (either full length or degenerated), these were summed to yield a total number of base pairs for that feature in each genome.

| Genome correlations
In order to examine the drivers of genome size variation, we tested for correlation between genome size and (a) the summed cox1 gene length, (b) the summed cox1 intron length, (c) total number of introns in the cox1 gene, (d) total number of introns present throughout the genome, (e) number of HEG elements present in the cox1 gene, and (f) total number of base pairs of HEG elements in the cox1 gene. Each test was conducted before and after correcting for phylogenetic relatedness using a phylogenetic generalized least squares (PGLS) approach under a Brownian motion model of trait evolution. PGLS tests were conducted using the R packages ade4 (Dray & Dufour, 2007), ape (Paradis, Claude, & Strimmer, 2004), nlme (Pinheiro, Bates, DebRoy, & Sarkar, 2017), and geiger (Harmon, Weir, Brock, Glor, & Challenger, 2008). To explore whether there exists a signal of evolutionary relatedness in each of our datasets relating to key genome features (Felsenstein, 1981), we tested for phylogenetic signal using Pagel's lambda and Blomberg's K (Blomberg, Garland, & Ives, 2003;Pagel, 1999). Analyses were conducted using the R package phytools (Revell, 2012), assuming a Brownian motion model of trait evolution.

| Correlation between categorical data and intron number
To determine whether lichen growth form (macrolichen or microlichen; Supporting Information Table S1), photobiont partner (cyanobacterium, green coccoid alga, or green chain-forming trentepohlioid alga; Supporting Information Table S1), or mode of reproduction (asexual or sexual; Supporting Information Table S1) was correlated with genome-wide intron number and/or number of cox1 introns, we conducted a one-factor ANOVA test using the R package dplyr (Wickham, Francois, Henry, & Müller, 2016). Data were square roottransformed prior to analysis to adjust for non-normality of initial values. Character states were assigned to each species as follows: (a) All foliose and fruticose lichens were classified as macrolichens, and crustose lichens were classified as microlichens; (b) photobiont partners were assigned based on the primary photobiont present based on examination of the voucher specimen by JL and ET (note that no known tripartite lichens were included in this study); (c) reproductive mode was assigned based on the dominant reproductive mode present in both the specimen and the species (i.e., thalli and species that produced lichenized diaspores were assumed to reproduce asexually, even rare individuals in nature may also produce sexual reproductive structures; thalli and species that did not produce lichenized diaspores were treated as sexually reproducing because sexual reproductive structures were nearly always present and these were inferred to produce sexual spores).

| Phylogenetic comparative analyses
To reconstruct a phylogeny to enable downstream analyses on intron evolution, we utilized data from the complete rDNA contig. First, fulllength or near full-length nuclear rDNA contigs, which included sequences representing 18S, ITS1, 5.8S, ITS2, and 26S, were extracted from the 58 metagenomic assemblies by performing a BLASTn of the meta-assemblies against a representative rDNA contig (Cladonia rangiferina: accession KY119381). Because prior studies have shown that six of our study genera for which multiple representatives were sampled (Cladonia, Heterodermia, Lecanora, Parmotrema, Pertusaria, and Usnea) form strongly supported, reciprocally monophyletic lineages (Mower, Stefanović, Young, & Palmer, 2004), and to minimize potential impacts of paralogous introns at shared sites across different genera, we first aligned only the coding sequences for all 58 species ( i.e., 18S, 5.8S, and 26S). Second, the hypervariable regions (i.e., introns, ITS1 and ITS2) were aligned separately within each of these six genera and appended to the end of the coding sequence alignment. Intronic and noncoding data from other lineages (those with only one species per genus) were thus not considered in our alignment. Base positions for which more than one taxon was missing data were excluded from the alignment prior to phylogenetic analysis. The alignments were then combined into a single, joint matrix which was aligned using MUSCLE (Edgar, 2004) and then manually adjusted to correct for machine errors. The GTR + Γ+I model of sequenced was applied to all phylogenetic analyses as a result of model selection using the Akaike information criterion (AIC) implemented in ModelFinder (Kalyaanamoorthy, Minh, Wong, Haeseler, & Jermiin, 2017). Bayesian topologies were inferred in MrBayes (Huelsenbeck & Ronquist, 2001;Ronquist & Huelsenbeck, 2003), sampling trees over 1,000,000 MCMC generations (Nei & Kumar, 2000) and treating gaps as missing data. The default first 25% of trees representing the burn-in were excluded from further consideration. The sampling temperature was set to temp = 0.002, and eight chains were implemented in the tree search. The posterior distribution of trees was used to calculate a 50% majority rule consensus tree, upon which we mapped Bayesian posterior probabilities (Tamura & Nei, 1993

| Intron positions and sequence similarity within a genus
To assess how conserved introns were within and across species, These intron sequences were compared for nucleotide similarity using BLAST and then colored based upon intron similarity (i.e., the "red" intron in Cladonia has high sequence similarity only within that genus and is not the same intron as "red" in another genus (Figure 1).

| Mycobiont intron search in metagenomic assemblies
To assess whether the introns that were present in the mitochondrial genomes of the mycobiont were present in other genomes (e.g., the mitochondrial genome of the photobiont or the nuclear mycobiont genome), a command-line BLASTn was performed using a concatenated file containing all the sequences from the introns extracted from each of the mycobiont mitochondrial genomes against the meta-assemblies of each of the 58 species. The resulting BLAST tables were parsed, and each hit was assessed for bit score. We determined the species from which the contig came by using BLASTn searches against the NCBI nonredundant database.

| Intron clustering
Intronic DNA sequences for the cox1 gene were extracted from each annotation to compare sequence similarity for the gene between all 58 species. An all-versus-all BLASTn was conducted, and the resulting table was parsed to include only hits >100 bp in length and with a bit score >100. A pairwise similarity matrix was generated in which the bit score of the comparison between two introns was used to produce grayscale weighting for the cell representing the comparison (i.e., black indicates higher sequence similarity than light gray; Figure 2).
Introns were clustered using the R program iGraph (Csardi & Nepusz, 2006). The function cluster_optimal was employed to calculate the optimal community structure for the intron sequences that resulted from the all-versus-all command-line BLAST. A bipartite graph was constructed with vertices representing introns and edges between vertices representing BLAST similarity weighted by bit score. Each intron was color-coded to identify the genus from which it originated (Figure 3).

| Ancestral state reconstruction
To assess the evolution of intron sequence similarity, as well as broad-scale gain and loss events, ancestral state reconstructions (Ekman, Andersen, & Wedin, 2008) were conducted using Mesquite (Maddison & Maddison, 2018). The Bayesian consensus tree was imported and trimmed to only include species that contained cox1 introns and had more than one representative per genus. A character matrix of the 19 cox1 intron clusters (see Intron Clustering) was built for these 45 species; for each species, we scored whether the cluster was (1) present or (0) absent. The history of each character was reconstructed using maximum-likelihood methods to estimate ancestral states, with default probability models in effect. Nodes (internal and external) were colored (black or white) to indicate the presence or absence of a given character (i.e., intron; Figure 4).
A character matrix for total intron length, total intron number in the cox1 gene, and genome-wide total intron number was imported to assess overall ancestral intron gain and loss. The steps outlined above were repeated to reconstruct the ancestral states of these characters. Nodes (internal and external) were color-coded to indicate the range of cox1 intron lengths (Supporting Information Figure S2a), cox1 intron number (Supporting Information Figure   S2b), and genome-wide intron number (Supporting Information Figure S3).

| Mycobiont genome content
Each of the 58 lichen mitochondrial genomes contained a conserved set of 14 protein-coding genes: cob, cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad4L, nad5, nad6, atp6, atp8, and rps3. Another proteincoding gene present in some but not all the genomes was atp9, which we showed previously to be absent in some members of Arthoniales  (Table 1). The total number of introns was correlated with overall genome size (R 2 = 0.49, p = 0.0001) and remained significant after correcting for phylogenetic relatedness (p = 0.0001). Eleven of the 15 genes were parasitized by introns, but F I G U R E 2 Pairwise similarity matrix of the resulting bit score from comparison between introns in the cox1 gene of each species that was present within a genus here represented by two or more species (rows and column each represent unique introns). Matrix represents a nucleotide all-versus-all BLASTn (diagonal values representing identical comparisons omitted). Gray scale is weighted by bit score (measure of sequence similarity and number of bp that are similar); darker colors indicate higher bit score. Withinspecies comparisons are demarcated by boxes, and genus is noted on right-hand side of figure. four (atp8, atp9, nad4L, and nad6) did not contain any introns in the species examined.
On average, the cox1 gene contained the greatest number of introns within each genome (Table 1). As was the case overall, the number of introns (R 2 = 0.38, p = 0.004) and length of introns (R 2 = 0.53, p < 0.00001) within this gene were strongly correlated with genome size. However, after correcting for phylogenetic relatedness, the number of introns within the cox1 gene was not significantly correlated with genome size (p = 0.69), suggesting phylogenetic signal in the number of cox1 introns that was further evidenced by Blomberg's K (p = 0.00003) and Pagel's lambda (p = 0.014) values.
The coding DNA sequence of the cox1 gene was consistent in size across all the species examined and was not significantly correlated to overall genome size (Table 1; R 2 = 0.02, p = 0.885).

| Synteny
The order of gene features was not consistent across all 58 genomes, suggesting some degree of gene-block inversions and translocations. We examined six sets of congeners (i.e., members of a genus) and found conservation of gene order varied considerably even within genera. At one extreme, gene order was conserved for (a) all eight species of Usnea, (b) all but one of the 11 species of Cladonia, and (c) all but one of the seven species of Heterodermia (Supporting Information Figure S1). The exception in Cladonia was C. uncialis, which had an inversion of the block of genes containing "nad6-cox3-mtLSU-nad2-nad3." The exception in Heterodermia was H. echinata, which featured a translocated nad3. In contrast, the two Lecanora species examined, which are closely related sister taxa (Lendemer & Harris, 2014), were markedly variable in both genome size and feature order (Supporting Information Figure S1; L. cinereofusca was 32,357 bp in length and L. saxigena was 56,579 bp in length). The ten Parmotrema species examined were syntenic with the exception of their nad1 and atp6 genes (Supporting Information Figure S1). In addition, two species (P. austrosinense and P. stuppeum) each contained two copies of atp6, one truncated and one full length; furthermore, these were the only mitochondrial genomes in this sample set to contain any duplication within the core set of protein-coding genes (see Mycobiont Genome Content; Supporting Information Figure S1).

| Phylogenetic relationships
Our alignment of rDNA and introns totaled 16,352 bp in length, and analyses of these data recovered the same overall genus level relationships found in prior large-scale phylogenetic studies of the Lecanoromycetes (Miadlikowska et al., 2014). Phylogenetic relationships were in general well-supported (PP = 1.0); however, seven nodes were not strongly supported (i.e., PP < 0.95 Figure 1). Percent pairwise divergence is reported for all 58 species (Supporting Information Table S2).

F I G U R E 3
Clustering of cox1 gene introns with high sequence similarity between species. Clusters with more than one genus (more than one color dot) indicate ancestral introns, while clusters with introns from only one genus (one color dot) are more recent gains. Clusters of greater than two introns are numbered, and vertices (nodes) are colored to represent the species that the intron originated from | 4253 POGODA et Al.

| Homing endonucleases
Substantial numbers of ORFs with homology to homing endonucleases were present in the mitochondrial genomes we examined.
Specifically, we identified two types of HEGs: LAGLIDADG and GIY-YIG. These HEG elements, either full length or degenerated, were especially abundant in the introns of the cox1 gene. The number of HEGs (Table 1; R 2 = 0.25, p = 0.06) and summed length of ORFs containing homing endonucleases (R 2 = 0.30, p = 0.021) were marginally correlated with genome size, and both remained significant after phylogenetic correction with PGLS (p = 0.04 and p = 0.04 respectively). The HEGs were present as either freestanding within an intron (identified by having unique start and stop codons) or fused/ within the same reading frame as the intron it parasitized (identified as sharing a start or stop codon;

| Intron gain and loss
Intron gain and loss were examined genome-wide as well as specifically within the cox1 gene. Ancestral state reconstruction indicated that, genome-wide, the ancestral mitochondria of the species examined contained five to ten introns, with both subsequent gains and losses across the sample set. In the cox1 gene, there were also an intermediate number of introns (3-5) that later underwent genusand species-specific gains and losses. Species of Heterodermia, Parmotrema, and Usnea showed overall trends toward intron gain (Table 1), with species of Usnea representing the most extreme case.

| Transmission of intron sequences
Group I and group II introns can be transmitted both vertically and horizontally (Belfort & Bonocora, 2014;Cho, Qiu, Kuhlman, & Palmer, 1998;Goddard & Burt, 1999). Using ancestral state reconstructions, we inferred that the intron sequences which were represented more than once in the data set are vertically transmitted ( Figure 4). However, for the unique introns in some species (introns colored black; Figure 1), we wished to determine where they had originated from (i.e., the nuclear mycobiont genome or the photobiont mitochondrial genome). To explore this further, we searched each of the 58 species meta-assemblies for sequences with high similarity (>80%) to the introns extracted from the mycobiont mitochondrial genomes. We observed that the best hits were to contigs that had low sequence coverage (1-3×, which was the average coverage of the contigs associated with the mycobiont nuclear contigs in the assembly) and had sequence matches to fungal/lichen species in NCBI's nonredundant database (>80% identity and >60% coverage). This suggests that in these cases, the nuclear genome of the mycobiont may be acting as a potential reservoir from which mitochondrial introns can arise. The intron sequences were distributed in 19 clusters of two or more introns (Figure 3). Clusters 1, 3, 5-9, 11, and 13 were present in two or more genera suggesting a relatively early origin among sampled taxa in our tree, while clusters 2, 4, 10, 12, and 14-19 were present only within a single genus, suggesting more recent gains (Figure 4). Introns were more similar within a given genus (always >90% similarity) than between genera (>80% similarity, sometimes >90%), again suggesting ancestral gains and losses followed by subsequent mutations within a genus (Figure 2). Species of Usnea contained the highest number of introns that contained high (>80%) sequence similarity (n = 10; Figure 1) and accounted for four of the 19 clusters (Figure 3). While Parmotrema contained a large number of introns that were similar between species (n = 8), it also contained six unique introns found in only a subset of species and these were relatively derived within the genus (Figure 1).

| Intron correlation to categorical data
Genome-wide intron number was significantly and positively correlated with lichens that were cyanobacterial (p-value = 0.00616), were macrolichens (p-value = 0.0000873), and reproduce asexually (p-value = 0.0306). Additionally, the number of introns present in cox1 was significantly correlated with the macrolichen growth form (p-value = 0.00034).

| Divergence among cox1 introns in Usnea
The cox1 introns among Usnea were highly divergent in comparison with species in the other five genera for which multiple species were sampled. Species of Usnea also had on average the highest number of introns within the cox1 gene, and these introns were generally short in length in comparison with other genera (Supporting Information Figure S2a). In addition, species of Usnea had the fewest number of parasitic homing endonucleases (Table 2). F I G U R E 4 Ancestral state reconstruction for four of the nineteen intron clusters (these clusters were chosen to demonstrate early and late intron gains): clusters 1 (a), 3 (b), 6 (c), and 15 (d; see Figure 3 for cluster identification). Pies at nodes represent likelihoods that a given intron cluster was (black) or was not (white) present at ancestral node. Tree shows only species having introns with sequence homology to other species (see text for further explanation)

| D ISCUSS I ON
In this study, we documented differences in the number and variability of introns within 21 genera of lichens (six of which we sampled more than one representative species) that are on par with the total variation present among major subdomains of the tree of life, such as metazoa, fungi, and plants. Previous research has demonstrated that intron number is variable between different species of nonlichenized Ascomycete fungi (e.g., S. cerevisiae is relatively intron-poor in comparison with Aspergillus nidulans; Paquin et al., 1997;Nielsen et al., 2004) and can drive major differences in genome size in these organisms (Sandor, Zhang, & Xu, 2018). Our study recapitulates in lichenized fungi the pattern of dynamic intron gains and losses, even between sister species, and differences in genome size observed in other nonlichenized fungi (Figure 1) as well as comparing mitochondrial intron number and location among groups of closely related lichenized species. In these lichenized fungi, we recovered evidence for both genome size proliferation via intron gain and streamlining via loss of mitochondrial introns over a short evolutionary timescale.
The striking examples in our dataset include sister species within a genus that in some cases differed by fivefold in intron number.
Across the mitochondrial genes present in lichen mycobionts, we found evidence for HEG element parasitism in 11 genes. Among these, cox1 was by far the most heavily parasitized by LAGLIDADG and GIY-YIG homing endonucleases, with 49 of the 58 species (~85%) containing at least one intron. Twenty-one species contained two or more HEG elements in a single intron. This nested HEG arrangement has the potential to drive alternative splicing (Guha et al., 2017), which in some lineages may foster gene regulatory divergence under variable environmental conditions, as has been demonstrated in diatoms (Rastogi et al., 2018).
Ancestral state reconstruction revealed that cox1 has undergone both intron gains and losses, the latter of which appear to be a derived feature, unique to multiple individual species in our dataset.
The nine species for which no introns within cox1 were detected (Arthonia ruana, Cladonia peziziformis, Graphis lineola, Hypogymnia vittata, Icmadophila ericetorum, Imshaugia aleurites, Lecanora cinereofusca, Lepraria oxybapha, and Phlyctis boliviensis) are characterized by substantial reductions in overall genome size and/or low overall numbers of introns across all mitochondrial genes (Table 1) and differ strikingly in these characteristics even compared to close congeners. These instances mark losses rather than gains and can be taken as evidence of parallel evolution across multiple, distantly related lichens. This evidence for parallel streamlining of mitochondrial genomes via loss of parasitic introns and HEG elements in these symbiotic organisms has been similarly documented at the level of coding genes (Pogoda et al., 2018). Curiously, the fact that these derived features were recovered only toward the tips of phylogenetic branches and never observed deeper in our phylogenetic tree suggests that complete intron loss is not evolutionarily stable in lichenized fungi.
The data presented here thus extend some evidence of genome streamlining in symbiomes (sensu Tripp et al., 2017) from protein-coding genes to repetitive, noncoding elements (Andersson & Andersson, 1999;Hansen & Moran, 2014;Moran & Bennett, 2014;Nikoh et al., 2014;Pogoda et al., 2018), suggesting action of parallel selection throughout coding and noncoding portions of the mitochondrial genome. However, genome reduction has been accompanied by gains in genome size in several lineages (Heterodermia, Parmotrema, and Usnea), and reductions are neither ubiquitous nor the only mode of evolution across symbiotic lichenized fungi. This is similar to other fungal species (Paquin et al., 1997;Santamaria et al., 2009) and suggests that the lichen mycobiont mitochondrial genome is not stably undergoing genome streamlining via loss of intronic sequences.
Notably, some traits and lifestyle attributes of lichens sampled in this study correlate with intron number. Separately, macrolichens, lichens that have cyanobacterial photobionts, and/or lichens that reproduce asexually have significantly more introns than other species (see Intron Correlation to Categorical Data & Supporting Information Table S1). Macrolichen morphology is strongly correlated with asexuality (Tripp & Lendemer, ms in prep.). In asexually reproducing lichens, selection should be less effective at removing mildly deleterious mutations owing to processes such as Muller's ratchet (Haigh, 1978). Introns and other retrotransposable elements are expected to be slightly harmful, on average, due to the replication costs of their DNA and encoded RNA and proteins, and because by virtue of frequent replication and transposition throughout the genome, they have the capability of introducing harmful mutations within the host genome upon insertion (Cambareri et al., 1996;Nagy & Chandler, 2004). If the nuclear genome is indeed acting as a reservoir for these introns, asexual lichens will have a larger nuclear intron reservoir, due to the lack of recombination, than sexual lichens explaining why on average asexual lichens have more mitochondrial introns. Species that reproduce largely asexually also may have shorter overall generations times (Charlesworth & Charlesworth, 1997) and thus have more opportunities for selfish, parasitic elements such as introns to proliferate throughout their genomes. However, other studies have found that uniparental mitochondrial inheritance and the spread of HEGs may be influenced by certain mating type loci (Yan et al., 2018), thus suggesting that there are possible underlying genetic mechanisms that influence the spread of HEGs within the mitochondrial genome. Future work examining the presence/absence of HEGs, their spread throughout the genome, and the associated lichen mating types will help to further elucidate the underlying drivers of differences in intron number.
While introns do add noncoding length to genes, thus incurring costs during cell division and transcription, they offer the potential benefit of alternative splicing, contributing valuable flexibility in gene expression and regulation (Dibb, 1993;Jo & Choi, 2015;Lynch et al., 2006;Smith et al., 2018). Alternative splicing may in fact confer greater genetic flexibility to the mitochondrial genomes of plants and fungi compared to those of the relatively intron-poor bilateral animals (Dibb, 1993;Jo & Choi, 2015;Kazan, 2003;Keren, Lev-Maor, & Ast, 2010;Lynch et al., 2006). Future research exploring the transcription of mitochondrial genes in lichenized fungi may determine whether alternative splicing is occurring or whether the introns simply propagate because of faster generation times and/or reduced ability to eliminate these elements from genomes.
Our study recapitulates many of the patterns observed in nonlichenized fungi. We see relatively stable gene content with the notable exception of loss of mitochondrial atp9 in some genera of lichenized fungi (Arthoniales; Bailey et al., ms in prep., Lecanorales, Ostropales and Teloschistales; Pogoda et al., 2018) and duplication of atp6 in two species of Parmotrema (P. austrosinense and P. stuppeum). Nonlichenized fungi also maintain relatively stable gene content, for example, sometimes losing nad1 (Sandor et al., 2018). Gene synteny is both maintained in some species and highly variable between others in both lichenized and nonlichenized fungi (Pogoda et al., 2018;Sandor et al., 2018). Additionally, genome size in both can vary widely, even between sister species, and is driven by often major differences in intron number, variable lengths of intergenic regions, and differences in the presence/absence of homing endonucleases (Pogoda et al., 2018). Our study adds to the growing literature on fungal mitochondria and demonstrates that lichenized fungi have many of the same polymorphisms of nonlichenized fungi.

| A unique case of divergence within Usnea
Usnea (Old Man's Beard) is a morphologically distinctive and species-rich lineage represented on every continent (Crespo et al., 2007). Speciation rates within Usnea have been estimated to be two to three times higher than rates in other members of Parmeliaceae (Kraichak et al., 2015). In this study, we found that species of Usnea harbored more variable intron sequences (i.e., sequence dissimilarity) compared to any other sampled genus (Figures 1 and 3). These data suggest a potential link between speciation rate and rate of intron evolution, potentially as a function of faster rates of mutation and/or faster generation times within Usnea.
Of further interest is our documentation that species of Usnea contained the fewest homing endonucleases parasitizing cox1 introns despite containing a higher average number of shorter length introns (average summed intron sequence for Usnea = 4,200 bp, Parmotrema = 9,300 bp, Heterodermia = 7,700 bp) compared to any other genus in this study. The leading hypothesis to explain mechanisms of intron loss involves reverse transcription in which mRNAs are intermediately converted into cDNAs and the cDNAs, lacking some or all of the intronic sequences, participate in recombination to produce a gene sequence without introns (Roy & Gilbert, 2006;Zhang, Yang, & Niu, 2010). This process requires reverse transcription machinery such as reverse transcriptase, maturase, and homing endonucleases to be present (Roy & Gilbert, 2006;Zhang et al., 2010). Reconstruction of ancestral intron states in this study suggests that species of Usnea are marked by relatively recent gains of short intron sequences that have undergone species-level losses.
We suggest that these mitochondrial genomes have yet to be highly parasitized by HEG elements via vertical transmission and therefore lack some of the required reverse transcription machinery to excise introns, as other genera have likely acquired.

| CON CLUS ION
In this study, we explored both the genome-wide intron landscape and dynamic evolution within cox1 among numerous lichenized fungal mitochondrial genomes, demonstrating a high degree of parasitism of introns. These intronic elements are shared among varying levels of phylogenetic diversity: Some reflect sharing among different orders or classes separated by ~418 Ma years of evolution (e.g., Cladonia and Arthonia; Prieto & Wedin, 2013;Kumar, Stecher, Suleski, & Hedges, 2017; Cluster 3 in Figure 3), whereas others reflect sharing between only sister species. Our data show that intron gains and losses have occurred multiple times across the evolutionary history of the Lecanoromycetes, with substantial variability across the species examined.
Our data yielded evidence for nine instances of complete loss of introns within cox1 and most other genes as well as two instances of complete, genome-wide intron loss. This suggests that some (