Long‐read direct RNA sequencing of the mitochondrial transcriptome of Saccharomyces cerevisiae reveals condition‐dependent intron abundance

Mitochondria fulfil many essential roles and have their own genome, which is expressed as polycistronic transcripts that undergo co‐ or posttranscriptional processing and splicing. Due to the inherent complexity and limited technical accessibility of the mitochondrial transcriptome, fundamental questions regarding mitochondrial gene expression and splicing remain unresolved, even in the model eukaryote Saccharomyces cerevisiae. Long‐read sequencing could address these fundamental questions. Therefore, a method for the enrichment of mitochondrial RNA and sequencing using Nanopore technology was developed, enabling the resolution of splicing of polycistronic genes and the quantification of spliced RNA. This method successfully captured the full mitochondrial transcriptome and resolved RNA splicing patterns with single‐base resolution and was applied to explore the transcriptome of S. cerevisiae grown with glucose or ethanol as the sole carbon source, revealing the impact of growth conditions on mitochondrial RNA expression and splicing. This study uncovered a remarkable difference in the turnover of Group II introns between yeast grown in either mostly fermentative or fully respiratory conditions. Whether this accumulation of introns in glucose medium has an impact on mitochondrial functions remains to be explored. Combined with the high tractability of the model yeast S. cerevisiae, the developed method enables to monitor mitochondrial transcriptome responses in a broad range of relevant contexts, including oxidative stress, apoptosis and mitochondrial diseases.

Mitochondria are one of the hallmarks of eukaryotic cells.Commonly known as 'the powerhouse of the cell', they perform several key cellular processes, such as the biosynthesis of iron-sulphur clusters, branched amino acids, haeme, and lipids, and play an important role in cellular ageing and programmed cell death (reviewed by Braun & Westermann, 2011;Malina et al., 2018).To date, about 150 mitochondrial disorders have been reported in humans, the genetic origin of which lies in mitochondrial (mtDNA) or nuclear (nDNA) DNA.As a remnant of their bacterial origin, mitochondria harbour their own DNA; however, most of the 1000-1500 proteins composing mitochondria are nuclear-encoded and imported from the cytosol to mitochondria (Timmis et al., 2004).Remarkably, mitochondria have retained specific genes and require their own complete set of proteins for gene expression, RNA splicing and degradation and DNA maintenance (Contamine & Picard, 2000).
Mitochondrial functions are largely conserved across eukaryotes, including the eukaryotic model Saccharomyces cerevisiae, which is a preferred model for mitochondrial biology for two main reasons.
First, S. cerevisiae can grow in the complete absence of respiration and with partial or complete loss of mtDNA (Claypool, 2013).Second, S. cerevisiae is still, to date, one of the rare organisms harbouring genetic tools to modify mtDNA beyond base editing (e.g., gene deletion and integration; Larosa & Remacle, 2013).Using these features, mtDNA-free yeast cells (ρ 0 mutants) were shown to stably maintain mouse mtDNA (Yoon & Koob, 2019), a first step towards the potential humanization and engineering of mtDNA in yeast.S. cerevisiae has therefore an important role to play in deepening our understanding of mitochondrial processes.
Conserved dodecamer sequences separate the transcripts within the polycistrons and enable modulation of gene expression, while transcription is initiated at conserved nonanucleotide promoters (Ojala et al., 1981;Osinga et al., 1984;Turk et al., 2013).While splicing is rare in S. cerevisiae nDNA (Schreiber et al., 2015), three mitochondrial genes (COX1, COB and 21S rRNA) contain intronic sequences (Foury et al., 1998;Pel & Grivell, 1993).Q0255 requires the omission of two GC clusters to be fully translated (Bordonné et al., 1988;Michel, 1984).Several of these introns encode proteins; for example, the I-SceI homing endonuclease is encoded in the intron of the 21S rRNA gene (Colleaux et al., 1986), while introns of COX1 and COB encode maturases that assist in splicing of their respective intron sequences (Lipinski et al., 2010).Some intron-encoded proteins are the result of alternative splicing, where parts of an exon combined with different intronic sequences of the same primary transcript yield different proteins, next to the exon-encoded protein.
These splicing events are catalysed by nDNA-and mtDNA-encoded enzymes (Lambowitz & Belfort, 1993).Despite its small size, the mtDNA results in a complex RNA landscape involving mechanisms that, despite decades of study, are not fully elucidated.
Since the first gene arrays in the 1990s, the cytosolic transcriptome of S. cerevisiae has been intensively investigated (Jenjaroenpun et al., 2018;Nookaew et al., 2012).The mitochondrial transcriptome is generally not detectable in these classical transcriptome studies, which is largely explained by the fact that mRNA extraction methods rely on the selection of poly(A)-tailed mRNA, while yeast mtRNA is not polyadenylated (Gagliardi et al., 2004).
Additionally, the mitochondrial transcriptome represents a small fraction of the total yeast transcriptome (ca.5%), which is largely dominated by (ribosomal) cytosolic RNA species (Mueller & Getz, 1986;Osorio & Cai, 2021).Capturing the mitochondrial transcriptome therefore requires a dedicated methodology.To date, a single study reported the yeast mitochondrial transcriptome, using cellular subfractionation to enrich mitochondria and short-read RNA sequencing (Turk et al., 2013).While this study brought new insight into regulatory elements such as promoter sequences, dodecamer sequences, UTRs and processing sites, the use of short reads, even with paired ends, cannot capture the full complexity of the mtRNA transcriptome.This technical limitation can be overcome by implementing long-read RNA sequencing, as recently demonstrated by the successful resolution of various complex and spliced transcriptomes ranging from viruses to plant, mammalian and even a full poly(A)-human transcriptome (Depledge et al., 2019;Gao et al., 2021;Jenjaroenpun et al., 2018;Ono & Yoshida, 2020;Roach et al., 2020;Workman et al., 2019;Zhao et al., 2019).
The goal of this study is to provide a comprehensive description of the mitochondrial transcriptome of Saccharomyces cerevisiae, including identification of processing of polycistronic transcripts and splicing of the different transcript isoforms.To this end, we developed a robust protocol for mitochondrial RNA isolation and combined it with Nanopore long-read direct RNA sequencing technology.This method was used to investigate the response of the mitochondrial transcriptome to different carbon sources (glucose and ethanol) chosen for their ability to tune yeast physiology and respiratory activity.

Take away
• An RNA-seq method to resolve the yeast mitochondrial transcriptome was developed.
• The full mitochondrial transcriptome was recovered by direct long-read RNA-seq.
• Splicing patterns were successfully captured with a single-base resolution.
• This work identified a yet-unreported impact of growth conditions on Group II intron turnover.
Yeast strains were grown aerobically at 30°C in 500-mL shake flasks containing 100 mL synthetic medium with ammonium as nitrogen source (SM) supplied with vitamins and trace elements, prepared and sterilized as described previously (Verduyn et al., 1992), in Innova incubator shakers (Eppendorf) set at 200 rpm.For respiro-fermentative growth, sterilized media were supplemented with a D-glucose solution to a final concentration of 20 g L −1 glucose (SMD, 'glucose media').For respiratory growth, media were supplemented with absolute ethanol to a final concentration of 2% (v/v) and with a glycerol solution to a final concentration of 2% (v/v) (SMEG, 'ethanol media').Glucose solutions (50% w/v) and glycerol solutions (99% w/v) were autoclaved separately for 10 min at 110°C.When required, the medium was supplemented with separately sterilized solutions of uracil (Ura) to a final concentration of 150 mg L −1 , lysine (Lys) to a final concentration of 300 mg L −1 and/or adenine (Ade) to a final concentration of 200 mg L −1 (Pronk, 2002).Alternatively, YPD medium, containing 10 g L −1 Bacto Yeast extract, 20 g L −1 Bacto Peptone and 20 g L −1 glucose, was used.
For plate cultivation, 2% (w/v) agar was added to the medium before heat sterilization.
Frozen stocks were prepared by the addition of sterile glycerol (30% v/v) to late exponential phase shake-flask cultures of S. cerevisiae, and 1 mL aliquots were stored aseptically at −80°C.

| Isolation of mitochondria by differential centrifugation
Crude mitochondria were isolated by a combination of protocols described by Liao et al. (2018) and Luttik et al. (1998), with a few modifications.The protocol was initially tested using a strain (IMC173) with fluorescent mitochondria before isolating mitochondria for RNA-seq (Supporting Information: Figure S1 and Methods).
Pre-cultures of CEN.PK113-7D were grown overnight on SMD and were used to inoculate duplicate 500 mL shake flasks containing 100 mL SMD or SMEG at an optical density at 660 nm (OD 660 ) of 0.1-0.2.The mitochondrial volume of a cell is dependent on respiratory activity (Di Bartolomeo et al., 2020); therefore, to yield enough mitochondrial mass, 300 mL of culture was used when grown on respiratory SMEG media and 800 mL of culture was used for respiro-fermentative glucose media.SMD cultures were grown until the mid-exponential phase (OD 660 of 5) to prevent cells from going through diauxic shift.SMEG cultures were grown until mid-to lateexponential phase (OD 660 of 5-19), before harvesting by centrifugation.All centrifugation steps were performed at 4°C in an Avanti J-E high-speed fixed-angle centrifuge (Beckman Coulter Life Sciences) with a JA-10 rotor for volumes larger than 30 mL and a JA-25.50T A B L E 1 Saccharomyces cerevisiae strains used in this study.

| RNA extraction from mitochondrial fraction
RNA extraction and handling were done in an RNase-free workspace.
When possible, materials and workspaces were cleaned with RNAseZAP spray and towels (Sigma Aldrich) and consumables and working solutions were autoclaved for 45 min at 121°C to denature any RNases.In vitro-synthesized RNA was spiked throughout the protocol to assess throughput and efficiency of the protocol.The synthesis of control RNA and timing of spike-ins are described in the Supporting Information: Methods.
RNA from the mitochondria-enriched fraction was isolated using the miRNeasy Mini kit (Qiagen).Mitochondrial fractions were spun down for 10 min at 10,000g at 4°C, the buffer was removed and the mitochondria were thoroughly resuspended in 700 μL QIAzol reagent at RT by vortexing.This was sufficient to lyse the mitochondria, and RNA was extracted following the manufacturer's instructions.Oncolumn DNase I treatment using the RNase-free DNase set (Qiagen) was included in the RNA extraction to remove any DNA contaminants.RNA was eluted in 30 μL nuclease-free water at RT. RNA integrity (RIN) was assessed using an RNA ScreenTape assay for the TapeStation (Agilent), and RNA quantity was assessed using a Qubit fluorometer with the RNA broad-range assay kit (Thermo Fisher).
Samples with an RIN>7 and a yield of 3 μg or more were used for enzymatic polyadenylation.

| In vitro polyadenylation and cleanup of isolated RNA
In vitro polyadenylation was performed using E. coli Poly(A) Polymerase (New England Biolabs), according to the manufacturer's instructions, with a few modifications: 3-10 μg of RNA was added to each reaction to meet Nanopore input requirements.Additionally, 20 U murine RNase inhibitor (New England Biolabs) was added to each reaction.After polyadenylation, samples were cleaned up using miRNeasy Mini kit with a few modifications.The polyadenylation reaction was filled up to 100 μL using nuclease-free water.S2 and S4).All DNA

| Library preparation and sequencing
Poly(A)-tailed RNA samples were used for library preparation using the Direct RNA Sequencing Kit (SQK-RNA002, Oxford Nanopore Technologies (ONT)), following the manufacturer's instructions.
Library preparation yield was measured with a Qubit ® 2.0 fluorometer and corresponding Qubit™ dsDNA BR Assay Kit.Before library loading, the remaining active pores were measured by a flow cell check with MinKNOW software (ONT).Library was loaded onto a R.9.4.1 flow cell with FLO-MIN106D chemistry using the Flow Cell Priming Kit (EXP-FLP002, ONT) according to the manufacturer's instructions.Sequencing runs were started via MinKNOW with default parameters for 16 h.

| Basecalling
Raw FAST5 files were basecalled on a high-performance computing cluster (GPU) using Guppy 4.4.1 (ONT) with default parameters for flow cell type and sequencing kit.Basecalled reads were categorized as PASSED for Guppy-determined quality score >7 and as FAILED for reads <7 (Supporting Information: Table S5).All PASSED FASTQ files were concatenated into a single FASTQ file each and filtered by length to cut-off at 40 bp.Basecalled reads were processed according to standard EPI2ME workflow (Metrichor Ltd., ONT) for FASTQ RNA control reads to assess sequencing accuracy and coverage of the RNA control strand (RCS).

| Mapping
To account for control sequences, the FASTA file of the genome reference sequence was amended with FASTA files of all in vitro synthesized control sequences, including the sequence of RCS (ENO2 gene as provided by ONT).The resulting FASTA file and corresponding annotation file were used for mapping with minimap2 (Li, 2018) (parameters: -ax splice -uf -k14 --secondary=no) to obtain a Sequence Alignment/Map (SAM) file (Li et al., 2009).All PASSED reads longer  S5).Quantitative analysis of expression levels therefore required library normalization.RPKM (reads per kilobase million) is the standard normalization method for RNA-seq, developed for short-read RNA sequencing in which read length (ca. 150-200 bp) is substantially shorter than gene length (e.g., several kb in yeast).RPKM normalizes to library size, but also to gene length, as to compensate for this direct correlation between read number and gene length.However, normalization by gene length is not applicable to long-read RNA sequencing as read length often approaches or even exceeds the gene length.The data sets were therefore only normalized over library depth in counts-per-million (CPM) in R (script: readcounts.rmd)using edgeR (Robinson et al., 2009).For all quantification analysis, reads mapping to both the S. cerevisiae native ENO2 feature and RCS feature were considered to be control reads, as the two could not be distinguished based on sequence identity.
Normalization between ethanol and glucose datasets was done by normalization over the mitochondrial dry weight (mtDW, R-script: normalization_etoh-gluc.rmd).To normalize, the input amount of cells in OD units was first normalized to cell dry weight (CDW) using the following relations (experimentally determined as described by Verduyn et al., 1992 using 30 samples of different OD 660 ): Glucose −1 660 2 The CDW was multiplied by the volumes used (Supporting Information: Table S5) to get the total CDW per condition.Cells have a density of 1 g mL −1 , and on ethanol a mitochondrial volume of 0.35 mL mitochondria per mL ethanol-grown culture and 0.05 mL mitochondria per mL glucose-grown culture (Di Bartolomeo et al., 2020).From this follows approximately 0.26 g (±0.02) mtDW g CDW −1 for ethanol cultures and 0.027 (±0.001) g mtDW g CDW −1 for glucose cultures.This conversion factor was used to normalize over mitochondrial dry weight.All counts, either raw, normalized to CPM or normalized to CPM and mtDW, are listed in the Supporting Information (Koster et al SI_readcounts.xlsx).
Per-base coverage of the mitochondrial transcriptome was obtained with SAMtools (parameters: samtools depth -aa).Relative per-base coverage at distinct loci was computed in R (script: coverageplots.rmd) by normalizing coverage at each base position to the total read depth in counts-per-million (CPM).Ratios between intron and exon levels were determined by calculating the coverage of the last 250 base pairs of each splicing variant or exon, as these base pairs are unique between each variant, and then subsequently normalizing over the exon level.

| Read length estimation
For assessment of the read length estimation, a different reference alignment file was used in which the mitochondrial genome was split up into shorter reference sequences that each contained a single mitochondrial open reading frame (ORFs).This allowed for assigning of reads to single ORFs rather than to the full mtDNA, making quantification more straightforward and filtering out erroneous reads with gaps that span multiple ORFs.Read mapping was then performed using the same parameters as described.
Using the resulting BAM files, for each ORF, the average length of its respective mapped reads was calculated (script: readlengths.rmd).

| Visualization of read mapping
Read mapping was visualized using Integrated Genome Viewer (Robinson et al., 2011) and Tablet (Milne et al., 2012).Mapping of long reads was visualized using the full mitochondrial genome as a reference sequence and corresponding BAM files.Splicing was visualized using separate mitochondrial ORFs as reference sequences with the respective BAM files, as this increases the resolution of the data when looking into splicing events.

| Growth rate analysis
Growth rate analysis was performed in 96-well microtiter plates at 30°C and 250 rpm using a Growth Profiler 960 (EnzyScreen BV).
Frozen glycerol stocks were inoculated in 100 mL YPD medium and grown overnight.0.5 mL of the overnight culture was transferred to 100 mL SMD + Ura/Lys/Ade and grown until the OD 660 had doubled at least once to ensure exponential growth.From this culture, a 96well microtiter plate (EnzyScreen, type CR1496dl) containing either SMD + Ura/Lys/Ade or SMEG + Ura/Lys/Ade with final working volumes of 250 μL was inoculated with a starting OD 660 of 0.1.
Growth rate analysis and data analysis were performed as described in Boonekamp et al. (2022).Gigabases in total per sample.Reads of both samples were mapped using BWA (version 0.7.15) (Li & Durbin, 2009) against CEN.PK113-7D genome.Alignments were processed using SAMtools (version 1.3.1)(Li et al., 2009).De novo assembly was performed using SPAdes (version 3.9.0)(Bankevich et al., 2012) for both samples.Assembled contigs were aligned with nucmer (MUMmer package v3.1) (Kurtz et al., 2004) to CEN.PK113-7D reference; the overlapping contigs aligned to the mitochondrial chromosome were merged into scaffolds to reconstruct the mtDNA for both samples.The contigs containing COX1 and COB as well as the consensus sequences of the mitochondrial genomes of 161-U7 and 161-I 0 were aligned to the CEN.PK113-7D genome using BLAST (NCBI) and visualized in R using genoPlotR (file: alignments.Rmd) (Guy et al., 2010).The sequencing data and de novo assemblies are deposited at NCBI (https://www.ncbi.nlm.nih.gov/)under BioProject ID PRJNA902953.

| Isolation of mitochondria and preparation of RNA
Compared to nuclear DNA-encoded transcripts, the quantification of all RNA species in mitochondria presents specific technical challenges that require tailor-made experimental and computational procedures.
First, the cellular abundance of mitochondrial RNA in yeast is very low and represents circa 5% of the total cellular RNA pool under respiratory conditions, and approximately 1%-3% under glucose repression, of which 90%-95% represents mitochondrial rRNA (Mercer et al., 2011;Mueller & Getz, 1986) Yeast cultures were spheroplasted and homogenized, and mitochondria were isolated by differential centrifugation.Oxygen uptake measurements and microscopic analysis confirmed that isolated mitochondria were abundant and intact (Supporting Information: Figure S1).Western blot analysis and enzyme assays further showed that mitochondria were significantly enriched, but not fully devoid of cytosolic contamination (Supporting Information: Figure S1).The mitochondrial fraction can be contaminated by the carryover of cytosolic RNA species suspended in the cytosol and by RNA species attached to the outer mitochondrial membrane.These contaminants could be removed by RNAse treatment and by removal of the mitochondria outer membrane (i.e., mitoplasting).These treatments did however not significantly reduce the contamination by most cytosolic RNA and reduced mitochondrial RNA yields (Supporting Information: Figure S2).RNA was therefore directly extracted from crude isolated mitochondria without additional processing.
To quantify the contamination by non-mitochondrial RNA species and monitor RNA quality during extraction, two synthetic non-polyadenylated control RNAs were spiked to mitochondria before RNA extraction.A short (60 bp) in vitro-synthetized control simulated a tRNA-length transcript (IVT01), while the 1 kb eYFP transcript simulated messenger RNAs (IVT-eYFP, Figure 2a).Control RNAs of various lengths were also spiked further down the workflow.
Spiking before poly(A) tailing assessed the IVT reaction yield and the poly(A) tail length.A control supplied by Oxford Nanopore Technologies (ONT) was added before sequencing to assess sequencing throughput and library size and quality (Figure 2a).
Polyadenylation was required to make the mitochondrial and in vitrosynthesized RNA compatible with the direct RNA sequencing protocol of ONT, as the adapters required for Nanopore sequencing are added through a 10(dT) primer sequence that hybridizes with the poly(A) tail of mRNA.As mitochondrial RNA is not natively polyadenylated, addition of a poly(A) tail of at least ten adenine residues was required for successful adapter ligation and subsequent sequencing of the RNA.
Following the above-described protocol, RNA-seq was performed using independent triplicate cultures of the S. cerevisiae laboratory strain CEN.PK113-7D grown on glucose or on a mixture of ethanol and glycerol (further referred to as ethanol media) as the sole carbon source.RNA sequencing resulted in library sizes ranging between 0.4 and 1.1 Gb of data, with 0.4 × 10 6 to 1.2 × 10 6 reads that passed a quality score of above 7 (Spreadsheet S1).The average read N50 score was around 1.2 kb, indicating that most of the reads were of a length corresponding to a mature mRNA.The longest reads were around 8 kb, revealing that some, but not all, full-length polycistronic primary transcripts were detected (expected size 1.6-16 kb, were plotted against the mRNA length for identification of these longer and shorter reads.A few reads were substantially shorter or longer than the expected ORF length.Notably, the read length for the ATP8 and OLI1 genes was, respectively, six and three times longer than the expected mRNA length.(Figure 3a).In line with earlier reports, closer inspection of the read alignment of these loci confirmed the identification of a bicistronic transcript encoding ATP6 and ATP8 (Simon & Faye, 1984).However, long ATP8 transcripts were also due to the presence of 5′ untranslated leader (UTL) and 3′ untranslated region (UTR) sequences, rather than polycistronic expression with the surrounding genes: 44% of the reads mapping to ATP8 were bicistronic, whereas 55% of the reads were extended due to attached UTL and UTR regions (Figure 3d).
OLI1 is part of a 4.8 kb polycistron; however, while a few transcripts extending past the OLI1 dodecamer were detected (Figure 3e), no full-length primary transcripts were found.In agreement with previous reports, OLI1 transcripts elongation was caused by the presence of a 5′UTL (Thalenfeld et al., 1983;Turk et al., 2013).
Noncoding RNA sequences were, in general, shorter than their expected length.The most extreme case was 21S rRNA, for which less than 1% of the reads mapping to 21S rRNA gene displayed the expected gene length.Since the used ORF lengths already exclude intron sequences, this cannot be the result of splicing.Analysis of the raw reads mapped to the 21S rRNA locus revealed that reads were strongly truncated or degraded at approximately 1-1.5 kb after the 5′ start of the 21S rRNA gene (Figure 3c).In addition, 21S rRNA is the longest RNA present in the mitochondria, and may therefore be more sensitive to RNA degradation.Additionally, most of the intron sequences of COX1 and COB (aI1-aI5γ and bI1-4, respectively) were shorter than their expected length.This could potentially be explained by the rapid turnover of spliced introns, which is essential for efficient protein expression and respiration and to prevent intron accumulation (Margossian et al., 1996;Turk & Caprara, 2010), but this requires further investigation.
The fraction of tRNAs in the present data set was particularly low (Figure 2b).Processed tRNAs have a short length, ranging from 71 to 90 bp.The data showed that the spiked RNA controls below 102 bp were strongly depleted during the RNA-seq protocol (Supporting Information: Figure S7).This result was in line with earlier reports that Nanopore sequencing cannot accurately analyse short nucleotide sequences of <100 bp (Wilson et al., 2019) and suggests that short RNAs such as tRNAs were underrepresented in the present study.All tRNAs are part of a polycistronic transcript, either in a tRNA cluster or grouped with other RNA transcripts.Analysis of reads aligned to the tRNA cluster revealed that single processed tRNAs were indeed rarely detected and that any captured tRNAs were part of a polycistronic transcript (Figure 3b).

| mtRNA coverage and landscape
The present datasets originated from three biological culture replicates, mitochondria isolation rounds and sequencing runs, leading to different read distributions and library sizes (Figure 2b and Supporting Information: Table S5).Quantitative analysis of expression levels therefore required library normalization.
RPKM (reads per kilobase million) is the standard normalization method for RNA-seq, developed for short-read RNA sequencing in which read length (ca.150-200 bp) is substantially shorter than gene length (e.g., several kb in yeast).RPKM normalizes to library size, but also to gene length, as to compensate for this direct correlation between read number and gene length.However, normalization by gene length is not applicable to long-read RNA sequencing as read length often approaches or even exceeds the gene length.The datasets were therefore only normalized over library depth in countsper-million (CPM).
The sum of polycistronic primary transcripts of the mtDNA encompassed 54.1 kb of the heavy strand, which means that 64% of the heavy strand is theoretically expressed and 6.3 kb (7%) of the light strand.In the obtained RNA-seq data set, the breadth of coverage of the mitochondrial DNA heavy strand was higher than expected, with 74.7% and 85.7% for ethanol-and glucose-grown cells, respectively (Figure 4a,b, and Supporting Information: Figures S8 and S9), revealing a very good representation of the mitochondrial transcriptome.When looking at the raw read alignment of coding and noncoding regions, the respective 10%-20% higher expression does not have a clear origin and seems to mostly result from a combination of occasional read-through past the dodecamer or misalignment of read ends (Supporting Information: Figure S10), which was especially prevalent in the AT-rich 3′ end of 15S rRNA (Supporting Information: Figure S11).
For the light DNA strand, only containing one active origin of replication and tRNA tT(UAG)Q2, the observed coverage was 14.8% for ethanol-grown and 7% for glucose-grown cultures (Figure 4a and Supporting Information: Figures S8 and S9).Most expressions clearly originated from the expected primary transcripts.On ethanol, the breadth of coverage of the light strand was slightly higher than expected, but the depth of this coverage was very low.This might result from transcriptional read-through at the ORI3 locus or from the presence of mirror RNAs, which have been described to exist in similar low quantities in both human and yeast mitochondria and are possibly a by-product of RNA processing (Szczesny et al., 2012;Turk et al., 2013).
On ethanol, all of the coding regions were sufficiently covered by sequencing reads (Figure 4 and Supporting Information: Figure S10), including regions encoding tRNAs, albeit in a low amount and often as polycistronic reads (Figure 3b).The gaps in coverage represented 14%-25% of the mitochondrial genome, but all in noncoding regions.
The mitochondrial genome was therefore exhaustively sequenced, despite the low representation of mitochondrial RNAs in the total RNA-seq data (ca.7%, Figure 2b).Irrespective of the carbon source used, the per-base coverage showed a large variation in expression between the mitochondrial genes (up to 3300-fold, Figure 4a).The described method yielded a complete coverage of the mitochondrial genome and could therefore be used to quantify changes in RNA expression and processing of the mitochondrial genome between different conditions.

| Comparison of transcript levels between ethanol-and glucose-grown cultures
While comparing the relative expression of the various mitochondrial RNA species within a given growth condition is straightforward and only requires normalization over library depth in counts-per-million (CPM), comparing RNA abundance between growth conditions is more challenging.S. cerevisiae is a typical Crabtree-positive yeast, in which excess glucose triggers a respiro-fermentative metabolism.
Glucose exerts a strong repression on respiration, leading to low respiration rates, small and sparse mitochondria, and lower mitochondrial RNA levels (Altmann et al., 2007;Mueller & Getz, 1986;Visser et al., 1995).Conversely, ethanol is fully respired, leading to abundant and very active mitochondria.The present method first extracted mitochondria from different culture volumes on glucose and ethanol, while RNA sequencing is performed on the same input amount of RNA.Comparing mtRNA levels during growth on glucose and ethanol therefore requires an additional correction for the abundance of mitochondrial mass per cell, between the two carbon sources.Since direct RNA-seq results in smaller library sizes (~0.5-1M aligned reads) well below the required input to obtain a high-power differential gene expression using short-read sequencing (~10M reads), gene expression was measured as transcript-level counts (Dong et al., 2021;Jenjaroenpun et al., 2018;Soneson et al., 2019).Additionally, a recent study reported a correlation between mitochondrial mass and cell dry weight of glucose-grown and ethanol-grown cultures (Di Bartolomeo et al., 2020).This correlation was used to normalize the counts per mitochondrial dry weight, thereby enabling the direct comparison of mtRNA abundance between glucose-and ethanol-grown cultures (Figure 5 and Supporting Information: Table S5).
In both conditions, based on non-normalized data, the most abundant RNA species were rRNAs, with 15S rRNA being 5.5 times more abundant than 21S rRNA (Figure 5a,b).Transcripts of all mitochondrial mRNAs were recovered in both conditions, but their relative expression was affected by the type of carbon source.For instance, transcripts mapping to the cytochrome c oxidase subunits 1, 2 and 3 (COX1, COX2 and COX3, respectively), the cytochrome b (COB) and the F0-ATPase subunit c (OLI1) were most abundant in ethanol-grown cultures (Figure 5a).Conversely, on glucose, transcripts mapping to the COX1 ORF were most abundant, followed by COB and the meganuclease-enconding transcript I-SceI, encoded in the omega-intron of 21S rRNA (Figure 5b).In line with the reduced involvement of mitochondria in respiration with excess glucose, the overall normalized abundance of mtRNA was lower in glucose-than in ethanol-grown cultures.Accordingly, with the notable exception of COX1 and COB, the abundance of transcripts encoding respiratory chain subunits was 5-15 times higher in ethanol-than in glucosegrown cultures (Figure 5c and Supporting Information: Figure S12).
Lastly, since the used quantification method could not confidently distinguish splicing variants and primary transcripts, reads mapping to intron and exon sequences of the COX1 and COB ORFs were cumulated for this initial quantification.
Additionally, tRNAs were more abundant in ethanol-than in glucose-grown cultures, which is in line with the increase in protein synthesis in ethanol-grown cultures and previous studies on tRNA levels under glucose repression (Baldacci & Zennaro, 1982).Out of the 24 tRNAs, 22 could be quantified in all ethanol-grown cultures, while only tP(UGG)Q was reliably detected in glucose-grown cultures (Figure 5).In line with these observations, the third most abundant RNA species in ethanol-grown cultures was RPM1.RPM1 encodes the RNA component of RNase P, an enzyme highly active during respiratory growth that is essential for tRNA processing (Rossmanith, 2012;Sulo et al., 1995).All tRNAs are part of a polycistronic transcript, either in a tRNA cluster or grouped with other RNAs, and the three most abundant RNA species tL (UAA)Q, tF(GAA)Q and tE(UUC) are located at the start of a polycistronic primary transcript (Figure 1).
The data set also provides insight into the expression of the origins of replication.It is still unclear whether mitochondrial DNA replication relies on RNA priming, rolling-circle replication, or a combination of the two (Bernardi, 2005;Chen & Clark-Walker, 2018).
Nevertheless, transcripts for three (ORI2, ORI3 and ORI5) out of the eight mitochondrial origins of replication were detected under both growth conditions, suggesting that these origins were active under the tested conditions (Figures 4a and 5).This is in accordance with the presence of uninterrupted promoter sequences for these Oris (Baldacci et al., 1984;Graves, 1998;Kaufman et al., 2000;Turk et al., 2013).The abundance of these replication primers was three-to fivefold higher with ethanol as carbon source (Figure 5).Many reads also mapped to ORI6, which was most likely an artefact caused by the presence of the OLI1/VAR1 primary transcript within this origin.This hypothesis was supported by the lack of reads mapping to the 5′-end of ORI6 (Figure 3e).
Remarkably, the read coverage within genes was also carbonsource dependent, particularly for the COX1 and COB and 21S rRNA loci (Figure 4a).For instance, the coverage within the 21S rRNA on ethanol as compared to glucose showed a sudden decrease in between the 5′ end and the omega-intron, at approximately 1 kb after the 5′ end, which corresponds to the previously observed gap in read alignment (Figure 3c).A similar dip in coverage was observed for the RPM1 gene, for which the decrease in coverage and read truncation exactly matched the RNA processing pattern and proposed heptakaidecamer cleavage site (Supporting Information: 12.9 kb long, of which 1.6 kb consists of exon sequences that together encode the Cox1p.The COB pre-mRNA is 7.1 kb long, of which 1.2 kb encodes Cobp.The COX1 and COB introns belong to either Group I or Group II homing introns that are distinguished by their splicing mechanism and are considered phenotypically nonessential for S. cerevisiae (Dujon, 2020;Lambowitz & Belfort, 1993;Séraphin et al., 1987).The COX1 exon sequences are interrupted by four Group I and three Group II intron sequences (COX1-aI1, aI2, aI3, aI4 and aI5α,β,γ), and COB contains one Group I and four Group II intron sequences (COB-bI1, bI2, bI3, bI4 and bI5), that are spliced out of the pre-mRNA in a protein-facilitated manner (Huang et al., 2005;Lambowitz & Belfort, 1993;Tzagoloff & Myers, 1986).As a result of alternative splicing (part of) these introns can also encode proteins (Figure 4b,c).COB-BI2, -BI3 and -BI4 encode maturases, COX1-AI1 and -AI2 encode reverse transcriptases and COX1-AI3, -AI4 and -AI5α encode endonucleases, all of which facilitate splicing and intron mobility of their respective intron sequence (Foury et al., 1998;Lambowitz & Belfort, 1993;Moran et al., 1995;Tzagoloff & Myers, 1986).Finally, COX1-AI5β encodes a putative protein of unknown function; Group II introns aI5γ and bI1 are noncoding.
In ethanol-grown cultures, the pre-mRNAs of both COX1 and COB were clearly spliced and processed to yield an RNA containing only exon sequences (Figure 4b,c).For COX1, spliced intron sequences were also detected but displayed various degrees of degradation and were rarely found in full length, which is in agreement with the shorter-than-expected read length of intron sequences (Figure 3a).Overall, no significant difference was observed between intron and exon abundance of COX1 in ethanolgrown cultures (Figure 4b,d).For COB, intron sequences were rare, especially introns bI3 and bI4 were strongly depleted and present at only 10% of the exon level.Some reads showed patterns of alternative splicing, but since intron levels are relatively low and intron turnover is co-translational in mitochondria (De Silva et al., 2017), it is difficult to attribute the intron levels to either 'regular' (aI1-5, bI1-5) or alternative splicing (AI1-5, BI2-4) of the intron.
In glucose-grown cultures, the splicing pattern of the mature COX1 and COB exon was more difficult to infer from the raw read alignment, and unspliced pre-mRNAs were clearly more abundant (Figure 4b,c).Also, in contrast with ethanol-grown cultures, in glucose-grown cultures the coverage of several sequences mapping to intronic regions was substantially higher than the corresponding exon abundance for both COX1 and COB (Figure 4b,c).COX1-aI1, -aI2 and -aI5γ were, respectively, 57, 23 and 14 times more abundant than the mature COX1 exon-RNA (Figure 4d).Similarly, the COB-bI1 intron was 3.5 times more abundant than the COB mature exon-RNA (Figure 4e).Di Bartolomeo et al. (2020) reported that the abundance of the Cox1 protein was 15-50 times higher than its intron-encoded proteins, a result in contrast with the relative exon-intron abundance measured in the present study (Supporting Information: Figure S14).
Additionally, analysis of the raw read alignment to the COX1 and COB ORFs revealed that the increased intron levels were not due to alternative splicing, as the intron sequences were rarely attached to a 5′ exon sequence (Figure 4b,c).Lastly, considering that the aI5γ and bI1 introns do not encode any protein (Lambowitz & Belfort, 1993), the high intron abundance most likely resulted from the accumulation of the spliced intron sequences rather than the elevated expression of intron-encoded genes.

| Exploring the effect of the presence of Group II introns on yeast physiology
Group II introns specifically accumulated during respiro-fermentative growth in glucose medium.While this accumulation might be a side effect of some yet unknown condition-dependent regulation of intron stability, the possibility of a physiological role for mitochondrial lariats cannot be excluded.In higher eukaryotes, there are indications that noncoding RNA can play a role in gene expression, epigenetic regulation, and genome stability (Cavalcante et al., 2020;Zhao et al., 2018Zhao et al., , 2020)).The hypothesis that intron sequences may play a similar role in yeast can be tested by comparing the physiology of yeast strains with and without Group II introns.Despite the poor genetic accessibility of the mitochondrial genome, a few 'intron-less' strains have been previously constructed (Huang, 2004;Huang et al., 2005).
The physiological response of an intron-less strain from the 161 strain lineage (also known as ID41-6/161) to growth in the presence of glucose or ethanol was therefore explored (Mahler & Lin, 1978;Wenzlau et al., 1989).The intron-less strain 161-I 0 does not contain introns in COX1 and COB and also lacks the ω-intron (encoding I-SceI).COX1 and COB introns were removed through biolistic transformation (Huang, 2004;Huang et al., 2005, personal communication).The control congenic strain 161-U7 only lacks the ω-intron.
As the sequence of these strains' genome was unavailable, S.
cerevisiae 161-U7 and 161-I 0 genomes were sequenced using Illumina short-read technology.A de novo assembly was performed based on the sequencing reads (Supporting Information: Table S6), and resulting contigs containing COX1 and COB could be aligned to the CEN.PK113-7D mtDNA to determine the absence of intron sequences (Figure 6a).Additionally, a consensus sequence of the mitochondrial genomes of 161-U7 and 161-I 0 was generated by mapping the contigs from the de novo assembly of the two strains to the CEN.PK113-7D mitochondrial genome and extracting the resulting consensus sequence (Supporting Information: Figure S19).
In addition to the mitochondrial genome, the complete sequence of the nuclear genome of these strains was de novo assembled and is available at NCBI (see Materials and methods section).
Comparing the mitochondrial consensus sequences of the 161-U7 and 161-I 0 confirmed the absence of the ω-intron in both strains (Supporting Information: Figure S19) and the absence of all other introns in strain 161-I 0 (Figure 6).The mtDNA of 161-I 0 covered 77% of the 161-U7 genome, and the similarity between the two mitochondrial consensus genomes was 99.2% (Supporting Information: Table S7).The de novo assembly resulted in fragmented mitochondrial genomes (Supporting Information: Table S6) and must be complemented with long-read sequencing to confirm the correct structure of the mtDNA; therefore, the analysis of the COX1 and COB loci was performed using the correctly assembled de novo contigs that contained these loci rather than the generated consensus sequence (Figure 6).The similarity between the COX1 and COB alleles was 99.8% and 99.9%, respectively (Supporting Information: Table S7), yielding 12 SNPs in COX1 and 1 in COB.Additionally, in the COB locus of intron-less strain 161-I 0 , the 14 bp exon sequence between introns bI1 and bI2 appeared to be removed from the mtDNA (Figure 6a).
A previous study reported substantial physiological differences between the two strains during growth on glucose, but growth on ethanol was not explored (Rudan et al., 2018).In contrast with this earlier study, 161-U7 and its intron-less variant grew at similar rates on glucose medium.In cultures with ethanol as carbon source, the intron-less strain grew slightly but significantly slower (14%, p value 0.05, Student's t-test homoscedastic) than its parent (Figure 6b).Such a change in growth rate may be explained by mutations that occurred in construction of strain 161-I 0 or the absence of the second exon of COB.The respiration rate was similar for the two strains during growth on ethanol and was increased by ca.30% during growth on glucose for the intron-less strain (Figure 6c), which might indicate a partial alleviation of glucose repression.The two strains were equipped with a mitochondrial fluorescent protein to monitor mitochondrial mass.When applied to the CEN.PK113-7D control strain, this method showed a fourfold decrease in mitochondrial mass in glucose-grown as compared to ethanol-grown cultures (Figure 6d,e).This result was in good agreement with earlier reports for the same strain (Di Bartolomeo et al., 2020).The same method applied to 161-U7 and 161-I 0 showed a substantially smaller mitochondrial mass variation between glucose-and ethanol-grown cells, where mitochondrial mass on glucose represented approx.60% of the mitochondrial mass on ethanol but revealed the absence of difference in mitochondrial mass between the two strains.The absence of introns did therefore not lead to clear physiological differences when comparing growth on glucose and ethanol media.

| DISCUSSION
In the present study, RNA enrichment by mitochondria isolation combined with Nanopore long-read sequencing enabled the capture of the full mitochondrial transcriptome in the model eukaryote S.
cerevisiae.All expected RNA species were found and the mitochondrial RNA coverage was sufficient to quantify most of the transcriptome.Small RNA molecules below 100 bp, specifically tRNAs, were an exception; thus, studies focusing on these RNA species should rather opt for, or complement with, short-read sequencing, or use methods to specifically enrich these transcripts (Thomas et al., 2021).The coverage of the mtRNA was sufficient but can be further improved through removing contaminating RNA species by further purification of mitochondria (e.g., using Nycodenz gradient purification [Boldogh & Pon, 2007]).Additionally, treatment with nucleases could reduce the carryover of cytosolic RNA, which represents up to 97% of the sequenced RNA in the present study.
Depletion of rRNA (Wangsanuwat et al., 2020) is another approach to enrich samples for the relevant RNA species.Finally, the excessive number of reads originating from the RNA controls can be substantially reduced by decreasing the amount of spiked-in control RNA.Based on the nonanucleotide sequences that initiate translation in the mitochondria (Osinga et al., 1984), the mitochondrial genome is assumed to be expressed in 11 primary polycistronic transcripts that are further processed at internal dodecamer sites and by endolytic cleavage of tRNAs.However, no full-length primary transcripts were detected in the present study.Knowledge of processing of these polycistronic transcripts is still very limited, and polycistronic transcription has only been demonstrated directly for few primary transcripts (Simon & Faye, 1984;Thalenfeld et al., 1983).It was only recently found that the dodecamer is recognized by the protein Rmd9p and that deletion of this protein results in the detection of unprocessed primary transcripts (Hillen et al., 2021;Nouet et al., 2007).The lack of full polycistronic transcripts might reflect the real in vivo situation, in which the short half-life of polycistrons results from co-transcriptional splicing (De Silva et al., 2017).However, although mitochondrial RNA is stabilized at its dodecamer sequences, it cannot be excluded that during sample processing of the intact mitochondria, the RNA processing and turnover enzymes stay active (Gagliardi et al., 2004;Hillen et al., 2021), particularly during spheroplasting that is performed at 30°C for 30-45 min.Other studies were able to detect primary transcripts, albeit in extremely low amounts (Simon & Faye, 1984;Thalenfeld et al., 1983), so increasing the amount of mitochondrial RNA input by further purification of mitochondrial RNA may also aid in the detection of these polycistronic transcripts.Nevertheless, investigating the possibility of additional quenching or stabilization of the mitochondria before isolation may improve detection of unprocessed RNA (Evers et al., 2011;Schwalb et al., 2016).
While the present approach cannot give a perfect snapshot of in vivo RNA abundance, by applying the exact same protocol to all samples quantified, mtRNA abundance between different conditions was demonstrated with good reproducibility between biological replicates (Figure 5).In a previous study, the transcriptome of S.
cerevisiae grown in ethanol medium was monitored using short-read sequencing (Turk et al., 2013).The relative abundance of RNA  Mueller and Getz (1986).This may be explained by the secondary structure of these RNAs, which can interfere with the binding of short sequencing reads or RNA probes, whereas Nanopore direct RNA sequencing includes relaxation of secondary structures (Price et al., 2017;Soneson et al., 2019).The levels of mitochondrial rRNA could not be determined by the used ScreenTape assay for RNA quality control, as there was too much contamination by cytosolic rRNA (Supporting Information: Figure S20).
The reported high expression of COX1 relative to COX2 in glucose conditions was unexpected as COX1, 2 and 3 together encode the catalytic core of cytochrome c oxidase and exist in a 1:1:1 stoichiometry in the inner mitochondrial membrane (Fontanesi et al., 2008;Merle & Kadenbach, 1980;Tsukihara et al., 1996).The presented finding also contradicts a previous study on the stoichiometry of mitochondrial protein and RNA levels under glucose repression (Di Bartolomeo et al., 2020;Mueller & Getz, 1986).
However, the underlying cause of the elevated level of COX1 transcripts was the presence of alternative splicing products and intron sequences of the COX1 locus, which were especially highly expressed under glucose conditions (Figure 4d,e).Likewise, ATP8 and ATP6 transcripts, expressed on a single cistron, were expressed in similar levels, while the proteins are present in different stoichiometries in the cell, which can be attributed to posttranscriptional regulation of translation (Rak & Tzagoloff, 2009;Turk et al., 2013).Nevertheless, these large differences between the present study and earlier reports on mitochondrial RNA levels could be an artefact induced by Nanopore sequencing, and in future studies, it would be elegant to include a short-read method as a comparison to identify potential artefacts.
As opposed to short-read sequencing, the present approach enabled the analysis and visualization of splicing variants with single base pair resolution and the quantification of the abundance of RNA species at different processing stages.As transcripts were sequenced in a single read, RNA processing events could be identified by combining information on transcript coverage and on the presence of gaps in read mapping to the reference genome, as illustrated by comparing the known RNA cleavage sites of RPM1 with the coverage and read alignment of that gene (Supporting Information: Figure S13).
Similarly, a strong decrease in coverage of 21S rRNA and a gap in read alignment was found in a single locus during growth on ethanol media.The consistent occurrence of the same phenomenon for all three independent replicates of ethanol-grown cultures, but the absence in glucose-grown cultures, suggested the existence of a condition-dependent new processing site for these RNA species.
As the 21S rRNA sequence harbours no heptakaidecamer or internal dodecamer splicing sites, the exact target site for this novel RNA processing event remains to be defined.
Group II introns are circular RNAs (known as lariats) and are a result from the covalent linking between the 5′ end and an A/U-residue 8 to 9 base pairs from the intron 3′ end.The formation of this branch point is followed by exon ligation and cleavage at the 3′ splice site and the resulting release of a circular intron lariat (Lambowitz & Belfort, 1993;Schmelzer & Schweyen, 1986;Yang et al., 1998).
Nanopore technology cannot sequence circular RNA molecules, which means that (part of) the lariats were most likely linearized before sequencing.The existence of linear intron RNA in the mitochondria has been hypothesized previously (Baldacci & Zennaro, 1982;Halbreich et al., 1984).Linearization can result from different mechanisms, either mechanical, such as by shearing during sample processing or through in vivo processes, including debranching by specific enzymes, degradation by endonucleases or intron lariat reopening reactions causing linear intron intermediates (Jarrell et al., 1988;Pyle, 2016).Linear introns can also splice through a hydrolysis reaction (Li-Pook-Than & Bonen, 2006;Murray et al., 2001;van der Veen et al., 1986).In yeast, this mechanism has only been detected in vivo when introns were mutated in their branch site (Podar et al., 1998).Introns aI1 and aI2 are 2.3 kb in length, while aI5γ and bI1 are both around 0.8 kb.Over 75% of reads for the COX1-aI1, -aI2, aI5γ and bI1 intron had the expected length up to the branch point, which  [110]) is not localized to mitochondria (prediction using DeepLoc2.0,Supporting Information: Table S8; Thumuluri et al., 2022) and Dbr1p has not been detected in any mitochondrial proteome (Di Bartolomeo et al., 2020;Sickmann et al., 2003).Therefore, it can be reasonably assumed that lariats are either debranched in vivo by a yet unidentified debranching enzyme, or that (a subset of) the Group II introns is spliced through hydrolysis as linear RNA molecules when grown on glucose.This may also mean that the presented intron levels are an underrepresentation of the actual intron level, as the introns with lariat structure could not be sequenced using Nanopore sequencing.Using in vitro debranching with purified Dbr1 could provide more insight into this.
Regardless of the configuration of the Group II introns, a clear condition-dependent change in Group II introns abundance was observed when cultures were grown on glucose.The most likely mechanisms leading to this change in intron abundance are an increase in either intron stabilization by proteins on glucose or in vivo degradation of introns on ethanol.Potential target proteins may be identified by comparing the differential expression of RNA-binding proteins observed between glucose-and ethanol-grown cultures (Di Bartolomeo et al., 2020;Mitchell et al., 2013;Tsvetanova et al., 2010).From this comparison, YBR238C, mitochondrial paralog of Rmd9p that stabilizes mitochondrial RNA at their dodecamer sequence (Hillen et al., 2021;Nouet et al., 2007), was 1.8 times more abundant in glucose-than in ethanol-grown cultures (Supporting Information: Figure S14).MtRNA-binding protein levels between ethanol and glucose were also compared relative to Cox1p expression, as this gives insight into the different levels of protein required between ethanol and glucose to mature a similar level of COX1 mRNA.When normalizing to Cox1p levels, Mss116p, an RNAhelicase and -chaperone involved in splicing Group II introns (Huang et al., 2005), showed over twofold increase in expression between growth on ethanol and glucose (Supporting Information: Figure S14), as well as Dis3p, which was demonstrated to play a role in mitochondrial intron decay (Turk et al., 2013).Combined with the difference in intron RNA abundance between glucose-and ethanolgrown cultures, these data suggest condition-dependent Group II intron processing or stabilization by RNA-processing enzymes.The underlying molecular mechanism remains to be explored.
While accumulation of introns in glucose-grown cultures might be a mere side effect of condition-dependent intron turnover rates, an interesting fundamental question is whether these Group II introns have an impact on mitochondrial functions.In yeast, it has been demonstrated that efficient splicing and regulation of Group I introns plays a regulatory role (Margossian et al., 1996;Rudan et al., 2018;Turk & Caprara, 2010), while in higher eukaryotes, noncoding mtRNA may play a role in cell signalling and mitochondrial expression (Cavalcante et al., 2020;Zhao et al., 2018).Such functions have not yet been described for Group II introns.The Group II introns are not conserved within the Saccharomyces genus (Wolters, 2018), and as intron-free strains did not show a different phenotype in ethanol medium, it was assumed that Group II introns are phenotypically neutral (Dujon, 2020;Séraphin et al., 1987).Intriguingly, one study reported increased respiration rates and mitochondrial volume during growth on glucose medium using a fully intron-less S. cerevisiae strain, a phenotype which could also be achieved by overexpressing the Mss116p RNA-helicase and -chaperone involved in Group II introns splicing (Huang et al., 2005;Rudan et al., 2018).As there is no transcriptome data available for the 161 lineage control during growth on glucose and ethanol, there is currently no evidence of any condition-dependent variation in Group II introns levels in this

F
I G U R E 1 Annotated mitochondrial genome of CEN.PK113-7D.Primary polycistronic transcripts (light blue boxes); mRNA, messenger RNA, ncRNA, noncoding RNA; rRNA, ribosomal RNA; tRNA, transfer RNA.Exons that encode the main proteins are indicated in pink.Coding intron sequences are encoded in orange and noncoding intron sequences are indicated in white.
than 40 bp were aligned to the reference FASTA file.SAM files were filtered and converted to Binary Alignment/Map (BAM) files and indexed using SAMtools, whereby only uniquely aligning reads were extracted based on FLAGs 0 and 16.Poly(A) tail length was estimated by nanopolish-polya(Workman et al., 2019), for both PASSED and FAILED reads of one of the sequencing runs (Ethanol replicate #3), following provided instructions.For assessment and visualization of poly(A) tail lengths of datasets, only poly(A) tail estimations with nanopolish-polya-specific QC-tag PASS were used.2.4.5 | Feature quantification and normalizationAnnotation of alignment and quantification of features was performed using FeatureCounts(Liao et al., 2013), where the abovenamed reference file was supplied together with a filtered BAM file with uniquely aligned reads.Importantly, long-read specific, overlap allowed, directional counting in terms of strandedness and fractional counting (parameters: -L -O -s 1 --fraction) were employed.The present datasets originated from three biological culture replicates, mitochondria isolation rounds, and sequencing runs, leading to different read distributions and library sizes (Figure2band Supporting Information: Table Sequencing pipeline and quality control.(a) Schematic overview of the sample preparation and sequencing pipeline.(b, c) Waffle plots showing the distribution of read origin and type of the passed reads (Q ≥ 7) of the three replicates of ethanol-grown cultures (b) and glucose-grown cultures (c).The distribution of the origin of sequencing reads between mitochondrial, cytosolic and control RNA is shown on the left; on the right, the distribution of different types of mitochondrial RNA is shown.One square represents 1/100 of the total number of reads.

2. 5 |
Characterization of intron-less strains 2.5.1 | Fluorescence microscopy Microscopy was performed using a Zeiss Axio Imager Z1 (Carl Zeiss AG) equipped with a HAL 100 Halogen illuminator, HBO 100 illuminating system and AxioCam HRm Rev3 detector (60 N-C 1″ ×1.0) (Carl Zeiss AG).The lateral magnification objective ×100/1.3oil was used with Immersol 518F immersion oil (Carls Zeiss AG).Fluorescence of ymTurquoise2 was detected using filter set 47 (Carl Zeiss AG; excitation bandpass filter 436/20 nm, beamsplitter filter 455 nm, emission filter 480/40 nm).Results were analysed using the Fiji package of ImageJ (Schindelin et al., 2012).Mean fluorescence per cell was analysed by thresholding the phase-contrast channel of each image using the Otsu algorithm to determine the location of the cells in the image.Based on the thresholded phase-contrast image, a binary mask was generated indicating the regions of interest (ROI, i.e., cell locations) of the phase-contrast image.Subsequently, the mask was overlaid on the DAPI channel of the image and the mean fluorescence for each ROI was determined, resulting in a mean fluorescence per cell.
Specific rates of oxygen consumption were measured in 4 mL volume in a stirred chamber at 30°C.The oxygen concentration was measured with a Clark-type oxygen electrode (YSI).Strains were grown overnight in SMD + Lys/Ade.The overnight culture was diluted twice in 100 mL SMD + Lys/Ade and SMEG + Lys/Ade.Glucose cultures were grown for another 3 h to ensure exponential growth, and ethanol cultures were grown for another 48 h to ensure that strains were fully respiratory.Strains were spun down and washed twice to prevent carryover of carbon-containing medium and resuspended in SM.For respiration assays on glucose, 25 OD 660 units of culture were inoculated in a total volume of 4 mL in fully aerated SM + Lys/Ade to a final OD 660 of 6.25.Airflow in the chamber was stopped and 20 μL of a 50% (w/v) glucose solution was added to a final concentration of 14 mM.For respiration assays on ethanol, 40 OD 660 units of culture were inoculated in a total volume of 4 mL in fully aerated SM + Lys/Ade to a final OD 660 of 10.Airflow in the chamber was stopped and 50 μL absolute ethanol solution was added to a final concentration of 210 mM.The percentage of dissolved oxygen was measured until oxygen was depleted and was used to calculate the rate of oxygen consumption expressed as the decrease of oxygen in the reaction vessel over time divided by the biomass concentration in the vessel.2.5.4 | DNA sequencing and bioinformaticsWhole-genome sequencing of strains 161-U7 and 161-I 0 was performed by Macrogen Europe.Genomic DNA of samples 161-U7 and 161-I 0 were sequenced at Macrogen on a Novaseq.6000sequencer (Illumina) to obtain 151 cycle paired-end libraries with an insert-size of 550 bp using TruSeq Nano DNA library preparation, yielding 2 . Second, the absence of polyadenylation of yeast mitochondrial RNA prevents their enrichment by standard, polyA tail-based methods.Finally, to fully capture the diversity in mitochondrial RNA splicing variants, processing of the RNA should be kept to a bare minimum, as PCR or cDNA synthesis reactions might introduce biases.To overcome these problems, a mitochondrial RNA-seq workflow was developed and tested (Figure2a).Enrichment for mitochondrial RNA was achieved by physically separating mitochondria from yeast cells and extracting RNA from isolated mitochondria.To minimize steps that might bias the mitochondrial RNA pool, either by enriching for specific species or by modifying transcripts length, Nanopore long-read direct RNA sequencing was used, in which a single (polyadenylation) step is required between RNA extraction and sequencing.Synthetic control RNAs were spiked at three different stages to monitor mitochondrial RNA quality during extraction, polyadenylation, and sequencing (Figure2a).

Figure 1 )
Figure1).ONT direct RNA sequencing does sequence poly(A) tails of the RNA molecules, thereby enabling the quantification of poly(A) tail length.The enzymatically added poly(A) tails of mitochondrial and control RNAs were on average 40-50 adenine residues long, a length comparable to cytosolic RNA species (Supporting Information: FigureS3).Ninety-five percent of the basecalled reads passed the ONT quality control (Q-value > 7), and all passed reads mapped to the reference genome.The mitochondrial RNA species represented around 5%-12% of the total number of passed reads, excluding the fraction of control reads; this represented a ca.twofold enrichment of mitochondrial RNA in both conditions, the rest being mostly cytosolic ribosomal RNA (Figure2b,c and Supporting Information: FigureS4).The libraries were nevertheless large enough to comfortably cover the mitochondrial genome: the obtained data sets had on average 5 × 10 4 reads that mapped to the mitochondrial genome, equating approximately 600 Mb of sequencing data to cover the 86 kb mitochondrial genome.On ethanol, the breadth of coverage was at an average of 75% of the heavy strand and 15% of the light strand of the mitochondrial genome, and all the open reading frames (ORFs) were covered by sequencing reads.Direct RNA sequencing therefore yielded enough reads of sufficient quality and of the expected size and was further explored to characterize the mitochondrial transcriptome.

F
I G U R E 3 Read length and coverage of genes.(a) Average read length for each gene plotted versus the expected ORF length, where the ORF length is defined as the length of a matured RNA from start to stop codon, after splicing (if applicable).Reads from ethanol-grown cultures are shown as closed circles, and reads from glucose-grown cultures are shown as open diamonds.Reads are separated per type, mRNA: messenger RNA (pink), ncRNA: noncoding RNA (green) rRNA: ribosomal RNA (purple).Spliced genes are separated between the main exon (pink) and the different spliced-out introns (orange).The dashed line represents a full-length read (gene length = read length), and the blue-shaded area represents an interval where the read length is 75% -125% of the full gene length.(b-e) Visualization of raw reads of one representative replicate on ethanol, aligned using Tablet with the primary transcripts of the tRNA cluster (b), 21S rRNA (c), ATP8-6 (d) and OLI1-VAR1 (e).The coloured arrows indicate the genes and their respective locations and the primary polycistronic transcripts are indicated with light blue arrows.Boxes show visualization with Tablet of sequencing reads mapped to the different loci, a red line indicates correct mapping of a sequencing read to the location on the mtDNA and a grey line indicates a gap in the read.

F
I G U R E 4 Sequencing coverage of the mitochondrial transcriptome of cultures grown on ethanol and glucose.(a) per-base coverage of the mitochondrial transcriptome of cultures grown on ethanol (top) compared to the per-base coverage of the mitochondrial transcriptome of cultures grown on glucose (bottom), normalized per million bases as counts-per-million (CPM) of the heavy (forward) strand of the mitochondrial DNA.Coverage depth is represented in grey, and the standard deviation between the sequencing depth of triplicate experiments is shown in orange.(b) Schematic overview, sequencing depth and splicing of the COX1 open reading frame (ORF).The middle shows the COX1 ORF, where the exon sequences are shaded in pink; colourless areas are spliced out of the final COX1 mRNA.The different splicing variants are shown below the COX1 ORF.Capitalized names are used to indicate protein-encoding introns; noncoding introns are shown in lowercase.(i) shows per-base coverage of the COX1 ORF, normalized per million bases as CPM.(ii) shows visualization of sequencing reads mapped to the COX1 ORF visualized using Tablet; a red line indicates alignment of a sequencing read to the location on the ORF and a grey line indicates a gap in the read.The analysis was done for mitochondria of ethanol-grown cultures (top) and glucose-grown cultures (bottom).The location of the COX1 exon is shaded in pink for visualization purposes.(c) Schematic overview, sequencing depth and splicing of the COB ORF.(d,e) Transcript levels of the COX1 (d) and COB (e) introns on glucose (light pink) and ethanol (purple), normalized to library size in CPM and normalized to transcript levels of the exon mRNA.

F
I G U R E 6 Comparison of the genomes and physiology of the CEN.PK113-7D, 161-U7 and 161-I 0 strains.(a) Alignment of the (i) COX1 and (ii) COB loci on the mitochondrial genome of CEN.PK113-7D, and contigs of 161-U7 and 161-I 0 .COB of 161-U7 was divided into two contigs, which were combined at a 76-bp overlap.The mitochondrial genome is represented as a black line and the respective coordinates are indicated below each genome/contig.The locations of exon mRNA, rRNA and ncRNA are indicated as blue squares, and tRNA and Ori are not shown.Spliced exons are connected by a line, and annotations of the sequences are indicated above (CEN.PK113-7D, 161-U7) and below (161-I 0 ) the alignment.Grey boxes or lines indicate a sequence identity of >95% (BLAST) between the two alignments.(b) Maximum specific growth rate (h −1 ) in microtiter plates of CEN.PK113-7D, 161-U7 and 161-I 0 on ethanol and glucose medium.(c) Respiration rate, (d) mean fluorescence intensity per cell measured in the mTurquoise2 channel, determined by microscopy and (e) microscopy pictures of strains IMC251 (CEN.PK113-7D), IMC252 (161-U7) and IMC253 (161-I 0 ) grown on SM-ethanol or SM-glucose with a blue fluorescent mTurquoise2 (mTq2) protein targeted to the mitochondria with a preSU9 mitochondrial targeting signal.Mean fluorescence was calculated per cell (n ≥ 250); the cell area was based on the brightfield image.Scale bars represent 10 μm.An asterisk (*) represents a significant difference with p < 0.05 (paired two-tailed Student's t-test).
indicates little to no degradation (Figure4b,c).Additionally, the median weighted read length (N50) of the data set was 1.4 kb and most of the RNA was sequenced in full (Figure3a), meaning RNA degradation during library preparation was likely limited.Lastly, mechanic shearing of the circular RNA during library preparation would result in a random location of the break for each RNA molecule.When comparing the alignment of the raw reads to each intron, no random breakage of reads was observed, as reads either consistently covered the full intron from 5′ to 3′ end or were partially degraded from the 5′ end (Figure4b,c).These data suggested the accumulated Group II introns of mitochondria of glucose-grown cells were not sheared and were sequenced in full, up to the branching point.The reproducibility of Group II introns abundance and length between biological replicates suggests that the difference in intron abundance observed between glucose-and ethanol-grown cultures was not an artefact caused by mechanical RNA shearing during sample processing, as approximately half of the reads covered the lariat branching point and were sequenced in full length (Supporting Information: FiguresS15-S18).The only known intron-debranching enzyme in S. cerevisiae (Dbr1p specific lineage.Conversely, intron-less strains from the CEN.PK S. cerevisiae lineage, used for transcriptome analysis in the present study, are not available and are challenging to construct.More research will therefore be required by either constructing intron-less CEN.PK strains or by quantifying mtRNA levels in strains of the 161 lineage, to conclude on a potential physiological effect of Group II introns.Little is known about the context dependency of mitochondrial gene expression and splicing.Using the developed method, the impact of environment on transcriptional responses in the tractable model organism S. cerevisiae can be uncovered at a base pair-level resolution, both in terms of RNA quantity and processing under different conditions.Considering the important role of mitochondria in ageing in mammals, monitoring transcriptomic responses of yeast mitochondria to oxidative stress or in chronologically ageing cultures might reveal new layers in gene expression regulation.The presented methodology therefore opens the door to a deeper understanding of mitochondrial processes and their regulation.
Computational methods for RNA sequencing data analysis2.4.1 | Availability of dataWhen indicated, scripts and files required for analysis and interpretation of the data mentioned hereafter are deposited on our research group's servers and can be freely accessed at https://gitlab.tudelft.nl/ charlottekoste/mitornaseq. For R-based analysis, RStudio (RStudio) was used.Raw RNA-seq data and normalized expression levels were deposited at NCBI (www.ncbi.nlm.nih.gov)GEO under accession number GSE219013.