Recent advances in understanding the molecular genetic basis of mitochondrial disease

Abstract Mitochondrial disease is hugely diverse with respect to associated clinical presentations and underlying genetic causes, with pathogenic variants in over 300 disease genes currently described. Approximately half of these have been discovered in the last decade due to the increasingly widespread application of next generation sequencing technologies, in particular unbiased, whole exome—and latterly, whole genome sequencing. These technologies allow more genetic data to be collected from patients with mitochondrial disorders, continually improving the diagnostic success rate in a clinical setting. Despite these significant advances, some patients still remain without a definitive genetic diagnosis. Large datasets containing many variants of unknown significance have become a major challenge with next generation sequencing strategies and these require significant functional validation to confirm pathogenicity. This interface between diagnostics and research is critical in continuing to expand the list of known pathogenic variants and concomitantly enhance our knowledge of mitochondrial biology. The increasing use of whole exome sequencing, whole genome sequencing and other “omics” techniques such as transcriptomics and proteomics will generate even more data and allow further interrogation and validation of genetic causes, including those outside of coding regions. This will improve diagnostic yields still further and emphasizes the integral role that functional assessment of variant causality plays in this process—the overarching focus of this review.

4.7 per 100 000 in children. 2 "Mitochondrial disease" is a collective term for many different clinical disorders united by the common features of failure of mitochondrial function and aberrant energy metabolism. 3 Mitochondrial disorders can present at any age and result from pathogenic variants in either the nuclear genome (nDNA) or mitochondrial genome (mtDNA). Due to this dual genetic control of mitochondrial function, these disorders can be inherited with any inheritance pattern: sporadic, maternal, autosomal dominant, autosomal recessive or X-linked.
Mitochondria are present in all nucleated cell types and therefore mitochondrial disease may affect any organ or tissue in the body. Some patients have an organ-specific disease, such as "pure" myopathy, cardiomyopathy or optic neuropathy, while others have multisystem involvement at presentation or acquire this during the course of their progressive disease. While there may be some diagnostic, and potentially prognostic, utility in categorising the myriad clinical features as particular syndromes, for example MELAS (mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes), MERRF (myoclonic epilepsy with ragged red fibres), LHON (Leber hereditary optic neuropathy), NARP (neuropathy, ataxia, and retinitis pigmentosa), Leigh syndrome and Pearson syndrome, the reality is that many patients do not fit easily into this syndromic classification. A further complication is that genotype-phenotype correlations in mitochondrial disease are often poor, even within these defined syndromes. For example, Leigh syndrome, which presents as a progressive neurodegenerative disorder in childhood, exhibits marked genetic variability and is associated with pathogenic variants in more than 75 different mtDNA or nDNA genes. 4,5 Conversely, a single genotype can present with a range of phenotypes; the most common heteroplasmic pathogenic mtDNA variant, m.3243A>G, can present with a classic MELAS phenotype, but also with MIDD (maternally inherited diabetes and deafness), sensorineural hearing loss, myopathy, cardiomyopathy, seizures, migraine, ataxia, cognitive impairment, bowel dysmotility, short stature, diabetes, external ophthalmoplegia or Leigh syndrome, and 9% of individuals are asymptomatic. 6 The vast clinical and genetic heterogeneity of mitochondrial disease, coupled with poor genotype-phenotype correlations, makes the genetic diagnosis of patients a challenging task.

| Mitochondrial function and genetics
Mitochondria are dynamic organelles that are responsible for the generation of adenosine triphosphate (ATP) via oxidative phosphorylation (OXPHOS); approximately 90% of the energy requirement of the cell is met through hydrolysis of ATP produced this way. 7 However, mitochondria are also involved in many other processes including, but not limited to, iron sulfur cluster formation, 8 the citric acid cycle, 9 regulation of apoptosis 10 and calcium homeostasis in conjunction with the endoplasmic reticulum. 11 Human mtDNA is a closed-circular molecule of 16 569 bp and encodes 37 genes: 13 polypeptides, 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs (rRNAs). 12 The mitochondrial genome is exclusively maternally inherited and is present in multiple copies within cells. Cells can be homoplasmic, where all mtDNA molecules are identical, or heteroplasmic, where two (or more) variant populations of mtDNA exist within one cell. Heteroplasmy levels are a factor in determining the aforementioned clinical heterogeneity in patients harboring the common pathogenic m.3243A>G variant. 13 However, heteroplasmy does not fully explain the phenotypic variability, with sex 14 and heritable nuclear factors 15 likely to play a role.
All 13 mtDNA-encoded proteins are essential hydrophobic components of the OXPHOS system located in the inner mitochondrial membrane (IMM). The OXPHOS system comprises five multi-subunit complexes and two electron carriers (ubiquinone and cytochrome c). Complexes I-IV and the electron carriers constitute the electron transport chain, which establishes an electrochemical gradient across the IMM that is then dissipated via complex V (the F1FO ATP synthase) to synthesize ATP. Individual OXPHOS complexes can also combine into larger complexes; the crystal structures of some of these supercomplexes, including the mammalian respirasome, have recently been elucidated. 16,17 In addition to the 13 polypeptides encoded by mtDNA, more than 60 further nuclear-encoded respiratory chain proteins are translated in the cytosol and imported into mitochondria prior to incorporation into the OXPHOS complexes. Indeed, the nuclear genome accounts for 99% of the mitochondrial proteome, which is estimated to comprise 1158 total mitochondrial proteins. 18 A large proportion of these proteins have important roles in OXPHOS, either directly as respiratory complex subunits, cofactors, assembly factors and substrate-generating upstream pathways, or more indirectly, for example, factors involved in the expression of mtDNA-encoded genes. Mitochondrial gene expression requires proteins involved in mtDNA maintenance, transcription, RNA processing/maturation, translation and posttranslational insertion into the IMM. Furthermore, all of these proteins need to be correctly targeted and imported into the mitochondria. Thus, any defects in mitochondrial protein import or the structure of the mitochondria, caused by aberrant cristae formation or abnormal membrane lipid composition, as well as factors affecting mitochondrial fission and fusion, can also negatively impact the OXPHOS system.
The genetic heterogeneity of mitochondrial disorders is a consequence of this wide range of proteins that impact OXPHOS function. Over 300 mitochondrial disease-associated genes have been described to date (Figure 1), a list that has grown enormously over the last decade largely due to the advent and application of next generation sequencing (NGS) technologies. The impact of NGS in identifying novel disease genes can be illustrated using examples related to mitochondrial protein synthesis. Before the advent of widespread NGS, surprisingly few of the genes encoding proteins involved in mitochondrial protein synthesis were associated with disease. The first to be described was a variant in MRPS16 encoding a subunit of the mitoribosome. 19 Only one other mitoribosomal component (MRPS22) was associated with disease 20 before NGS was introduced, but since then pathogenic variants in MRPL3, 21 MRPL44, 22 MRPL12, 23 MRPS7, 24 MRPS23, 25 MRPS34, 26 MRPS2, 27 MRPS14, 28 MRPS28 29 and MRPS39 30 have been identified in patients. All were discovered by NGS methods with the exception of MRPL12, which was identified using microsatellite genotyping and Sanger sequencing. 23 Similarly, prior to use of NGS, only three mitochondrial aminoacyl tRNA synthetases were established as disease genes; the first described was the aspartyl tRNA synthetase (DARS2 31) followed by the arginyl (RARS2 32) and tyrosyl (YARS2 33) aminoacyl tRNA synthetases. The remaining 14 mitochondrial tRNA synthetases (AARS2, CARS2, EARS2, FARS2, HARS2, IARS2, LARS2, MARS2, NARS2, PARS2, SARS2, TARS2, VARS2 and WARS2) plus GARS and KARS, which encode synthetases used in both the cytosol and the mitochondrion, were all identified as disease genes by NGS, with the last one to be found being WARS2. 34-36 Mitochondria do not contain a glutaminyl tRNA synthetase; instead, tRNAGln is first charged with Glu by EARS2 before a transamidation reaction converts the Glu-mt-tRNAGln to Gln-mt-tRNAGln. This reaction is catalyzed by GatCAB, the glutamyl-tRNAGln amidotransferase protein complex, which consists of three proteins encoded by QRSL1, GATB and GATC respectively. Until recently, only variants in QRSL1 were associated with disease, 25 but the first pathogenic variants in GATB and GATC have now been identified in patients. 37 This completes the list of genes involved in mitochondrial tRNA aminoacylation associated with mitochondrial disorders.

FIGURE 1 List of genes currently associated with mitochondrial disease sorted according to function. Some genes have more than one mitochondrial function, so we have used broad categories to ensure their most appropriate assignment. Our selection criteria required that causative genes have a primary or secondary impact on OXPHOS, and exclude genes where variants have been described in cancer but not in a mitochondrial disorder (eg, SDHC). Over 150 genes linked to mitochondrial disease have been discovered since the implementation of next generation sequencing (NGS) in 2010. Today, pathogenic variants in 36/37 mitochondrial-encoded genes and 295 nuclear-encoded mitochondrial genes have been shown to affect mitochondrial energy metabolism, highlighting the impact NGS has had in the identification of causative genes that are associated with a wide range of mitochondrial functions.
Figure 1 panel text recovered from the source (incomplete):
mtDNA pathogenic variants: 36/37 genes
OXPHOS subunits: (CI) MT-ND1, MT-ND2, MT-ND3, MT-ND4, MT-ND4L, MT-ND5, MT-ND6; (CIII) MT-CYB; (CIV) MT-CO1, MT-CO2, MT-CO3; (CV) MT-ATP6, MT-ATP8
Ribosomal RNA: MT-RNR1
Transfer RNA: MT-TA, MT-TC, MT-TD, MT-TE, MT-TF, MT-TG, MT-TH, MT-TI, MT-TK, MT-TL1, MT-TL2, MT-TM, MT-TN, MT-TP, MT-TQ, MT-TR, MT-TS1, MT-TS2, MT-TT, MT-TV, MT-TW, MT- [truncated in source]
[category label missing in source]: ABAT, DGUOK, DNA2, MGME1, MPV17, POLG, POLG2, RNASEH1, RRM2B, SAMHD1, SLC25A4, SSBP1, SUCLA2, SUCLG1, TFAM, TK2, TOP3A, TWNK, TYMP
RNA maturation/modification: ELAC2, ERAL1, FASTKD2, GTPBP3, HSD17B10, LRPPRC, MRM2, MTFMT, MTO1, MTPAP, NSUN3, PNPT1, PUS1, TRIT1, TRMT10C, TRMT5, TRMU, TRNT1
Mitochondrial aminoacyl tRNA synthetases: AARS2, CARS2, DARS2, EARS2, FARS2, GARS, GATB, GATC, HARS2, IARS2, KARS, LARS2, MARS2, NARS2, PARS2, QRSL1, RARS2, SARS2, TARS2, VARS2, WARS2, YARS2
Mitoribosome: MRPS2, MRPS7, MRPS14, MRPS16, MRPS22, MRPS23, MRPS28, MRPS34, MRPL3, MRPL12, MRPL44, PTCD3
Translation: C12orf65, GFM1, GFM2, RMND1, TACO1, TSFM, [truncated in source]
This clearly demonstrates the impact of next generation sequencing in terms of expanding the spectrum of genes associated with disease; advances in sequencing technology will facilitate further gene discovery and this list is likely to continue expanding for some time yet. 38

| GENETIC DIAGNOSIS OF MITOCHONDRIAL DISORDERS: SEQUENCING STRATEGIES
Traditionally, initial suspicion of mitochondrial disease relies upon varied clinical observations, metabolic changes, such as increased plasma or cerebrospinal fluid (CSF) lactate or urinary 3-methylglutaconic acid, 39 and neuroimaging. 40 All of these can be indicators of a mitochondrial etiology, but are not in themselves unique to mitochondrial patients; the complexity of these disorders means that a clear diagnostic algorithm can be difficult to implement. This raises a number of questions concerning diagnosis in an era when NGS is so prevalent. When to perform a muscle biopsy and integrate functional testing? Which sequencing strategy should be used? Is a "multiomics" approach the future of mitochondrial diagnostics? Which experiments are necessary to affirm pathogenicity of novel gene variants or candidate disease genes? Here, we will dissect the importance and application of NGS technologies and discuss the functional validation of novel disease genes. Figure 2 outlines an overview of the various stages and techniques involved in the genetic diagnosis of mitochondrial disorders and will be expanded upon throughout the remainder of this review.
Historically, genetic diagnosis of mitochondrial disease was achieved through candidate gene studies guided by histochemical and biochemical phenotyping of patient tissue, usually collected from a skeletal muscle biopsy. This has been described as a "biopsy first" or "from function to gene" approach. 41 Only in cases with very clear syndromic presentations would a genetic test be carried out prior to a muscle biopsy; for example, a child presenting with MELAS would be tested for the common m.3243A>G MT-TL1 pathogenic variant. The advent of NGS has since revolutionized the diagnosis of many rare genetic disorders, especially heterogeneous disorders such as mitochondrial disease. Over the past decade a number of NGS approaches have been successful, including whole mtDNA sequencing, 42 targeted gene panels (eg, complex I), 43 targeted exome sequencing ("MitoExome"), 44,45 whole exome sequencing (WES), 25,46-48 whole genome sequencing (WGS) 49 and RNA-Seq. 50 We will further discuss the impact and merits of each strategy below.

| Whole mtDNA sequencing
Using NGS for whole mtDNA sequencing allows any mtDNA variant to be identified and gives accurate assessment of heteroplasmy levels. 42 It remains common practice in many mitochondrial diagnostic centers to first sequence the mtDNA to exclude mitochondrial variants before performing WES or WGS. In adult-onset cases, a mtDNA etiology is far more common, so whole mtDNA sequencing remains a more pragmatic option than going directly to WES/WGS. It is important to remember that many pathogenic mtDNA variants are restricted to clinically-affected tissues such as skeletal muscle. 51
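As a rough illustration of how a heteroplasmy level can be estimated from sequencing data, the sketch below computes the fraction of reads carrying the variant allele at a single mtDNA position. The function name, the tissue labels and the read counts are all hypothetical and chosen only for illustration; real pipelines also account for sequencing error, mapping artefacts and nuclear mitochondrial pseudogenes, none of which is modeled here.

```python
def heteroplasmy(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the variant allele at one mtDNA position."""
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / depth


# Hypothetical (reference, variant) read counts at one position
# in two tissues from the same individual.
samples = {"blood": (900, 100), "skeletal_muscle": (300, 700)}

levels = {tissue: heteroplasmy(ref, alt) for tissue, (ref, alt) in samples.items()}

for tissue, level in levels.items():
    print(f"{tissue}: {level:.0%} variant reads")
```

With these made-up counts the variant load differs markedly between the two tissues, which is the situation in which testing only one tissue could be misleading.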

| Targeted gene panels
Initial application of NGS methods in mitochondrial disorders tended to sequence targeted mitochondrial gene panels that included a spectrum of genes encoding respiratory chain components and known disease-associated genes 52,53 or more expansive panels, known as the "MitoExome," which included all genes listed in the MitoCarta inventory. 44,45 Smaller, more focussed, gene panels have also been successful in identifying variants in known genes related to a specific clinical phenotype and OXPHOS presentation. For example, we have used a custom Ampliseq panel for genes involved in Complex I function to identify pathogenic variants in TMEM126B and NDUFA6. 43,54 The effectiveness of this approach relies upon having biochemical evidence of an isolated complex I defect, and with continued improvements in turnaround times and decreasing sequencing costs of WES, using an approach that is not restricted to a specific subset of predetermined genes would seem sensible given the number of possible disease genes associated with mitochondrial dysfunction. Indeed, at least 15 genes now associated with mitochondrial disease are not included on the wide-ranging "MitoExome" panel 55 highlighting the advantages of non-targeted, unbiased approaches such as WES. This will no doubt change again as WGS costs are already decreasing to allow this to become more commonly used.

| Gene agnostic approaches (WES and WGS)
In mitochondrial disease, WES has been hugely successful in improving diagnostic yields in patients with nuclear gene defects and in identifying variants in novel disease genes. Many centers now report that approximately 60% of patients receive a genetic diagnosis. 46-48 This success has led towards a "genetics first" approach to diagnosis, which may avoid the requirement for skin or muscle biopsies entirely. The benefits of such an approach are clear if WES or WGS identifies a known pathogenic variant in a known disease gene, but this is only one of many potential outcomes and the only one that gives a firm diagnosis using sequencing alone.
Other outcomes from WES/WGS include identifying: (1) a novel variant in a known disease gene, (2) a novel variant in a known mitochondrial protein of known function that has not been previously associated with disease, (3) a predicted pathogenic variant in a protein of unknown function or (4) no clear candidate variants (Figure 2). With an ever-increasing number of variants of unknown significance (VUS) being identified, functional validation of pathogenicity is vital. Indeed, the high diagnostic rates of the aforementioned publications are partly due to WES being applied to biochemically well-characterized cases of mitochondrial disease, which aided the prioritization of variants. For example, in one study with more heterogeneous cohorts, the diagnostic yield was less than 39% compared to 57% in the subgroup with the highest suspicion of mitochondrial disease. 48 Despite the revolutionary impact of WES on the genetic diagnosis of mitochondrial disease, a significant proportion of cases (~40%) remain unresolved. This may be because the variant is detected but not prioritized by current bioinformatic pipelines, or because the causative variant does not reside in the coding regions of the genome. WGS is able to detect all genetic variants and therefore has the potential to further increase the diagnostic yield. However, with the increased number of VUS identified and incomplete coverage of inherited disease genes, 56 variant prioritization is a major challenge. There have been many studies assessing the annotation of variants, with examples of previously described pathogenic variants shown to be present in healthy individuals, 57,58 calling into question the accuracy of disease gene annotation. Furthermore, WGS technologies are still improving and the use of technical benchmarks is required to ensure accurate interpretation of variant calls. 59

FIGURE 2 An overview of the workflow utilized to identify and validate variants associated with mitochondrial disease. First, clinical information is vital to inform appropriate genetic testing. If no mitochondrial DNA (mtDNA) or syndrome-associated nuclear variants are identified, we advocate the use of trio whole-exome (WES) or whole-genome (WGS) sequencing. For each of the outcomes of WES/WGS, different levels of investigation are required to prove pathogenicity. Then, we have outlined some of the basic techniques that can be used to investigate the impact of those variants on OXPHOS metabolism using patient tissue or cells. In cases where disease mechanisms are poorly understood, these materials alongside cell and animal models can aid investigations. There is a plethora of techniques available, and instead of providing an exhaustive list we have highlighted those most commonly used, as well as gene function-specific investigations, some of which are expanded upon in the text. Abbreviations: Co-IP (co-immunoprecipitation); EMSA (electrophoretic mobility shift assay); FRET (fluorescence resonance energy transfer); iPSC (induced pluripotent stem cells); MELAS (mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes); MIDD (maternally inherited diabetes and deafness); OXPHOS (oxidative phosphorylation); SBF-SEM (serial block-face scanning electron microscopy); STED (stimulated emission depletion); TAP (transporter associated with antigen processing); TEM (transmission electron microscopy); WES (whole-exome sequencing); WGS (whole-genome sequencing); Y2H (yeast two-hybrid)

Variant prioritization is key and most in-house bioinformatic filtering pipelines should take into consideration the following: the rarity or presence of the variant in databases such as gnomAD (http://gnomad.broadinstitute.org) or ExAC 60; conservation of the amino acid; modeling the amino acid change in the protein (https://zhanglab.ccmb.med.umich.edu/I-TASSER/); use of in silico tools such as SIFT 61 and PolyPhen2 62 to predict pathogenicity of the variant. There are online tools which can take into account these factors, such as Ensembl's Variant Effect Predictor. 63 Bioinformatic tools will continue to improve and inclusion of such data may allow for better assessment of the authenticity of variants.
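The filtering considerations described above can be sketched as a small ranking routine. This is a minimal illustration and not the pipeline used by any particular diagnostic center: the `Variant` fields, the placeholder gene names, the allele-frequency cut-off of 0.001 and the simple additive score are all assumptions made for the example. In practice the annotations would come from resources such as gnomAD and from in silico predictors, and thresholds would depend on the suspected inheritance pattern.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                 # placeholder gene label (hypothetical)
    population_af: float      # allele frequency in a reference database
    conserved: bool           # affected residue is conserved across species
    predicted_damaging: bool  # in silico predictors agree the change is damaging


MAX_AF = 0.001  # illustrative rarity cut-off, not a recommended value


def prioritize(variants, max_af=MAX_AF):
    """Drop common variants, then rank the rest by supporting evidence."""
    rare = [v for v in variants if v.population_af <= max_af]
    # More supporting annotations first; ties broken by lower allele frequency.
    return sorted(
        rare,
        key=lambda v: (v.conserved + v.predicted_damaging, -v.population_af),
        reverse=True,
    )


candidates = [
    Variant("GENE_A", 0.05, True, True),     # too common, filtered out
    Variant("GENE_B", 0.0001, True, True),
    Variant("GENE_C", 0.0, False, True),
    Variant("GENE_D", 0.0005, True, False),
]

for v in prioritize(candidates):
    print(v.gene, v.population_af)
```

A ranking of this kind only orders candidates for follow-up; it does not establish pathogenicity, which still depends on segregation and functional evidence.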
Guidelines from the American College of Medical Genetics (ACMG) aim to standardize interpretation of variants into five classifications ("pathogenic," "likely pathogenic," "uncertain significance," "likely benign," and "benign") and there are five criteria listed as either "very strong" or "strong" evidence for classifying novel variants as pathogenic: (1) null variant, for example, nonsense, frameshift, splice site etc., (2) a variant resulting in the same amino acid change as an established pathogenic variant, (3) a de novo variant, where paternity and maternity have been established, (4) well-established functional studies show damaging effect on the gene or gene product, (5) the prevalence of the variant in affected individuals is significantly higher than in controls. 64 There are also six criteria listed as "moderate" and five listed as "supporting" evidence of pathogenicity, which can be found in the ACMG standards and guidelines. 64 When using WES or WGS, we advocate the sequencing of the family trio (ie, the patient and both unaffected parents) whenever possible. Trio sequencing is particularly powerful as it enables prioritization of de novo variants based on knowledge of segregation within the family (listed as criterion 3 of the ACMG guidelines above). Recently, there have been increasing reports of de novo dominant causes of mitochondrial disorders in cases suspected of having a recessive etiology, including variants in SLC25A4, 65,66 ATAD3A, 67 SLC25A24, 68,69 DNM1L, 70-72 CTBP1 73,74 and ISCU. 75 In the case of de novo SLC25A4 variants, sequencing the family trio was vital as the clinical phenotype of the patients did not resemble any of the previously reported patients with either autosomal dominant or recessive variants in SLC25A4. 66 Thus, the variant had failed to be prioritized in cases where only the proband was sequenced.
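The value of trio data for flagging candidate de novo variants can be shown with a short sketch. Genotypes are encoded here as the number of variant alleles (0, 1 or 2) in the proband, mother and father; this encoding, the function names and the example calls are assumptions for illustration only. A real analysis would also check read depth and genotype quality in all three samples and confirm parentage.

```python
def is_candidate_de_novo(proband: int, mother: int, father: int) -> bool:
    """Variant present in the proband but absent from both parents."""
    return proband > 0 and mother == 0 and father == 0


def fits_recessive_homozygous(proband: int, mother: int, father: int) -> bool:
    """Proband homozygous with both parents heterozygous carriers."""
    return proband == 2 and mother == 1 and father == 1


# Hypothetical genotype calls: (proband, mother, father) variant-allele counts.
trio_calls = {
    "variant_1": (1, 0, 0),  # heterozygous in the proband only
    "variant_2": (2, 1, 1),  # inherited from two carrier parents
    "variant_3": (1, 1, 0),  # inherited from the mother
}

de_novo = [name for name, gt in trio_calls.items() if is_candidate_de_novo(*gt)]
recessive = [name for name, gt in trio_calls.items() if fits_recessive_homozygous(*gt)]

print("candidate de novo:", de_novo)
print("fits homozygous recessive:", recessive)
```

Without the parental genotypes, `variant_1` and `variant_3` would look identical in the proband, which is why a heterozygous de novo change can be overlooked in proband-only sequencing.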
Of the five ACMG criteria we have listed above ("very strong" or "strong"), at least two are usually required to classify the variant as "pathogenic" rather than "likely pathogenic" or "uncertain significance" (see ACMG standards and guidelines for full classification rules 64). Criterion 4 (functional studies) is the only one of these "strong" criteria where there is scope to provide additional information on the consequences of the variant that may promote its classification to a clinically actionable level. Undertaking such studies is therefore immensely valuable and central to the genetic diagnosis of many mitochondrial diseases.
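To make the combining logic concrete, the sketch below encodes the rules for combining pathogenic evidence as we understand them from the ACMG standards and guidelines. 64 It is a simplification under stated assumptions: the function name is hypothetical, only counts of pathogenic evidence at each strength are considered, and benign evidence (which can offset or contradict pathogenic evidence in the full scheme) is ignored. The published guidelines remain the authoritative source.

```python
def classify_pathogenic_evidence(very_strong=0, strong=0, moderate=0, supporting=0):
    """Combine counts of pathogenic evidence; benign evidence is NOT modeled."""
    vs, s, m, p = very_strong, strong, moderate, supporting

    pathogenic = (
        (vs >= 1 and (s >= 1 or m >= 2 or (m >= 1 and p >= 1) or p >= 2))
        or s >= 2
        or (s == 1 and (m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4)))
    )
    if pathogenic:
        return "pathogenic"

    likely_pathogenic = (
        (vs >= 1 and m >= 1)
        or (s >= 1 and m >= 1)
        or (s >= 1 and p >= 2)
        or m >= 3
        or (m >= 2 and p >= 2)
        or (m >= 1 and p >= 4)
    )
    if likely_pathogenic:
        return "likely pathogenic"

    return "uncertain significance"


# A confirmed de novo variant alone (one "strong" criterion).
print(classify_pathogenic_evidence(strong=1))
# The same variant after a well-established functional study adds a second "strong" criterion.
print(classify_pathogenic_evidence(strong=2))
```

The two calls illustrate the point made in the text: adding functional evidence as a second "strong" criterion is what moves a variant with a single strong criterion into the "pathogenic" class.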

| FUNCTIONAL VALIDATION
WES is currently more widely used than WGS, but we expect this to change in the near future. There have been calls for a first-line WGS based approach in mitochondrial disease. 76 We agree that trio WGS should form a central role in the diagnostic process, but that the majority of cases will likely require some degree of functional validation (Figure 2). When a known pathogenic variant explaining the clinical phenotype has not been identified by WES/WGS, skin and skeletal muscle biopsies can be crucial. Their use has enabled identification of many novel genetic associations where functional assessment was essential for affirming pathogenicity. It is in this niche where skeletal muscle biopsy continues to prove invaluable in the diagnosis of mitochondrial disease.

| Variants in mtDNA
In the case of mtDNA variants it is helpful to assess the mutation load if the variant is not homoplasmic, but it is important to do so in an appropriate tissue since mutation load can differ between tissues. 13,51 Furthermore, for novel mtDNA variants of unknown significance, the "gold standard" approach for verifying pathogenicity of a mtDNA variant is to demonstrate an observable correlation between higher mutant loads and severity of biochemical phenotype; this is typically done, for novel mt-tRNA gene variants, by assessing heteroplasmy in individual COX-positive and COX-deficient fibers to demonstrate a functional threshold. 77-79

| Variants in nuclear DNA

| Novel variants in a known disease gene
The functional workup varies on a case by case basis, particularly when variants identified via WES are novel. When those variants appear in a gene that has previously been associated with disease, the functional workup requires confirmation of segregation in the family and demonstration of a biochemical phenotype, where appropriate, in patient samples that is similar to the phenotype previously described in other patients with variants in the same gene. If the clinical and biochemical phenotypes are similar and the variant segregates with disease, then there is little doubt as to the pathogenicity of the variant.

| Novel variants in a known mitochondrial protein not associated with disease
In cases where the results from WES have indicated a likely pathogenic variant in a gene not previously linked to disease then additional work may be required. If the function of the gene is known, then it is important to design experiments to test for an expected phenotype. For example, if the protein is known to be involved in RNA processing then demonstrating aberrant processing and increased levels of RNA precursors in patient samples can strengthen the case for the variant being pathogenic, as was demonstrated in cell lines harboring pathogenic TRMT10C (MRPP1) variants. 80 The gold standard for proving pathogenicity is to perform rescue experiments by introducing a wild-type copy of the gene into patient fibroblasts, often using a viral delivery system. Restoration of the biochemical phenotype to control levels upon expression of the wild-type gene allows confirmation of pathogenicity.

| Novel variants in a gene encoding a protein of unknown function
The approaches used to validate novel pathogenic variants in genes of unknown function identified by NGS approaches are similar to those described above with rescue experiments being particularly important. In these cases, patient samples can be instrumental in implicating a mitochondrial role for a gene of unknown function (for example, RMND1, 81 MGME1, 82 FBXL4 83 ) or elucidating a previously unknown mitochondrial function of a gene (for example, TRMT5 84 and TOP3A). 85

| No clear candidate pathogenic variants
One aspect of functional assessment that has grown with the advent of WGS is the use of transcriptomic and proteomic approaches to complement WGS data and aid in variant prioritization. The potential of using transcriptomics (RNA sequencing [RNA-Seq]) to tackle undiagnosed cases of mitochondrial disease has recently been assessed. 50 Of 48 cases that WES had previously failed to diagnose, RNA-Seq yielded a genetic diagnosis in five patients and candidate variants in the remaining 43, including identification of the novel disease gene TIMMDC1, which encodes a complex I assembly factor. 50 Additionally, Cummings and colleagues successfully diagnosed 35% of 50 unsolved rare muscle disease cases using RNA-Seq; this approach compared patient RNA-Seq data to RNA-Seq data from 184 control skeletal muscle samples, illustrating the power required to identify significant variations. 86 They also highlighted the importance of acquiring pathologically relevant tissue; analysis of tissue from the Genotype-Tissue Expression (GTEx) Consortium 87 revealed that many of the most common muscle-disease genes are associated with significantly lower expression in blood and fibroblasts compared to skeletal muscle, rendering analyses in those tissues underpowered. Implementation of RNA-Seq can therefore improve diagnostic rates when utilized alongside WGS when no clear variants are initially identified, but the tissue-specificity of many disorders may render the use of RNA-Seq case-limited and highlights another instance where skeletal muscle biopsy may be essential.
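One simple way to picture how RNA-Seq can flag a candidate gene is to compare a patient's expression value for each gene against a panel of controls from the same tissue. The sketch below uses a plain z-score as a stand-in; the cited studies used their own statistical frameworks, and the gene label, expression values (imagined as normalized log-scale counts) and the cut-off of 3 standard deviations are all hypothetical.

```python
from statistics import mean, stdev


def expression_z_score(patient_value, control_values):
    """Standard deviations separating the patient from the control mean."""
    sd = stdev(control_values)
    if sd == 0:
        raise ValueError("controls show no variation for this gene")
    return (patient_value - mean(control_values)) / sd


# Hypothetical normalized expression of one gene in control muscle samples.
controls = [10.0, 11.0, 9.0, 10.5, 9.5]
patient = 4.0

z = expression_z_score(patient, controls)
is_outlier = abs(z) > 3  # illustrative cut-off

print(f"GENE_X z-score: {z:.2f}, outlier: {is_outlier}")
```

The estimate of the control spread becomes more reliable as the control panel grows, and a gene that is barely expressed in the sampled tissue offers little signal to compare, which is consistent with the emphasis in the text on large control sets and pathologically relevant tissue.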

| METHODS TO FUNCTIONALLY VALIDATE PATHOGENICITY AND DISSECT MOLECULAR MECHANISMS
The techniques used to functionally characterize variants identified by WES also vary on a case by case basis, but there are some common approaches. Following Sanger sequencing confirmation and demonstration that the variant segregates with disease in the family, one of the first functional tests performed is western blotting of proteins from patient and control tissue samples, as it is important to assess the steady-state levels of the protein encoded by the variant gene. ACMG guidelines consider frameshift variants and those affecting splicing as loss-of-function alleles, which are defined as pathogenic, with the protein expected to be absent from affected patient tissues. Decreased steady-state protein levels are often observed if there is a missense change, either homozygous or compound heterozygous (with another missense or nonsense variant); this indicates either lower expression levels or increased turnover of the mutant protein and demonstrates a functional consequence of the identified variant. However, this is not always the case, as variants can cause a decrease in function without a decrease in expression.

| Protein studies
Since our definition of mitochondrial disorders generally focusses on impaired energy production, standard experiments assess the steady-state levels of various OXPHOS complex subunits via SDS-PAGE and immunoblotting and assembly of each complex by blue-native PAGE. More recent advances include complexome profiling, which uses mass spectrometric analysis of 2D BN/SDS-PAGE before hierarchical clustering to analyze the assembly of complexes and their subunits. This technique can be used to assess respiratory chain complexes 43,88 and the mitoribosome, recently showing that MRPS2-deficiency leads to aberrant small mitoribosomal subunit formation. 27 The aforementioned techniques can be achieved using either muscle biopsy or skin fibroblasts. Skeletal muscle often shows a more severe molecular phenotype and can be used to further demonstrate the pathogenic nature of variants, using oxidative enzyme histochemistry and quantitative immunohistochemistry, 89 but unlike skin fibroblasts it does not provide a renewable source of material for further investigation into disease mechanisms. For example, fibroblasts provide an opportunity to study mitochondrial translation defects that underlie an OXPHOS deficiency using [35S]-labeled methionine and cysteine incorporation to assess nascent mitochondrial protein synthesis and, when mitoribosomal defects are suspected, sucrose gradients to analyze their assembly. These techniques recently revealed that homozygous MRPS14 variants diminished translation, but did not affect mitoribosomal assembly. 28 An observable biochemical defect in patient fibroblasts also allows rescue experiments to be performed and, as stated earlier, these are crucial for functionally confirming the pathogenicity of novel variants in novel disease genes.

| Imaging studies
A multitude of imaging techniques can help further our understanding of the pathological role of novel disease genes, including transmission electron microscopy (TEM) and high-resolution confocal imaging. These techniques in combination helped characterize the mitochondrial abnormalities associated with SLC25A46 variants in Leigh syndrome and demonstrated the function of SLC25A46 within a mitochondrial/ER pathway involved in lipid transfer. 90 Similar imaging studies were employed in the characterization of a mitochondrial neurodegenerative disorder caused by NME3 variants and demonstrated that the dual functions of NME3 (NDP-kinase activity and a role in mitochondrial fusion) contributed to the disease mechanism. 91 High-resolution confocal microscopy in particular is playing a major role in elucidating the roles of novel mitochondrial genes in mitochondrial dynamics. Although many of these genes have not yet been described in disease etiology, a better understanding of their functions will aid prioritization of variants in genetically undiagnosed mitochondrial disease patients. Furthermore, improvements to more recently developed techniques, such as serial block-face scanning electron microscopy (SBF-SEM) in skeletal muscle, will help further our understanding of mitochondrial disease genes and disease mechanisms. 92

| Modeling putative disease variants
If a biochemical defect is only detectable in skeletal muscle, modeling experiments might be required to obtain more evidence of pathogenicity. One such example is the recent identification of de novo variants in SLC25A4 encoding the skeletal muscle isoform of the mitochondrial ATP/ADP carrier. 65,66 The biochemical defect was only observable in skeletal muscle, so the equivalent mutations were modeled in yeast and recombinant proteins were expressed in a bacterial system to allow assessment of ADP/ATP transport and demonstrate pathogenicity of the variants.
Limited availability of patient samples means that other model systems, including cell line models (eg, CRISPR, iPSCs) or whole organisms (eg, Drosophila, zebrafish and mice), may be used to provide supportive evidence of pathogenicity or to investigate the molecular mechanisms of disease and advance knowledge of mitochondrial biology.

| Discovery of novel genes linked to mitochondrial metabolism via genome wide CRISPR/Cas9 screening
The CRISPR/Cas9 genome editing system is a powerful tool that has accelerated our ability to discover and annotate gene functions, particularly when patient cells harboring pathogenic variants are not available. The phenotypic expression and tissue specificity of mitochondrial disorders also restrict our understanding of disease mechanisms occurring in different cell types and tissues, making it difficult to dissect the exact functions of mitochondrially-destined proteins. Therefore, generating either iPSC-based disease models 93 or animal models 94,95 using genome engineering tools is important for a complete functional characterization of mitochondrial disease genes.
The discovery of new genes linked to mitochondrial metabolism through functional genetic screening approaches, including genome-wide CRISPR/Cas9 screens, can aid the interpretation of variants identified by WES/WGS. A genome-wide CRISPR/Cas9 knockout screen has recently been employed to identify genes essential for human mitochondrial oxidative phosphorylation. 96 The screening strategy relied on a "death screen" using the cell death marker Annexin V and uncovered 191 high-confidence genes necessary for survival in galactose-rich media, where cells rely entirely on OXPHOS. In addition to the 72 known OXPHOS disease genes, the screen also identified two genes absent from MitoCarta 2.0 (TMEM261 and N6AMT1). Another high-throughput genetic screen of 2231 genes using a CRISPR interference (CRISPRi) library identified 136 mitochondrial genes involved in mitochondrial bioenergetics. 97 Additionally, the screen uncovered 20 nonmitochondrial genes whose knockdown led to a decrease in real-time ATP levels, measured by a combined Fluorescence Resonance Energy Transfer (FRET)-based biosensor and Fluorescence-activated Cell Sorting (FACS) system. The identification of a catalogue of new genes associated with mitochondrial metabolism will allow these to be factored into bioinformatic pipelines when analyzing WES/WGS data and may also provide further biochemical evidence of the mitochondrial defect, contributing to improved patient diagnosis.
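The logic of hit calling in an enrichment-based screen of this kind can be sketched as follows. This is a simplified illustration rather than the analysis used in the cited studies. It assumes that guide RNAs targeting OXPHOS-essential genes become enriched in the sorted dying-cell population relative to a reference population, and that a gene is scored by the median log2 enrichment of its guides. The gene labels, read counts, pseudocount and threshold are all hypothetical.

```python
from collections import defaultdict
from math import log2
from statistics import median


def gene_scores(guide_counts, pseudocount=1.0):
    """Return the median log2 enrichment of guides for each gene.

    guide_counts: list of (gene, guide_id, dying_count, reference_count),
    where dying_count is the read count in the sorted dying-cell
    population and reference_count the count in the reference population.
    """
    total_dying = sum(row[2] for row in guide_counts)
    total_ref = sum(row[3] for row in guide_counts)
    per_gene = defaultdict(list)
    for gene, _guide, dying, ref in guide_counts:
        # Normalize each count to its library size before comparing.
        freq_dying = (dying + pseudocount) / total_dying
        freq_ref = (ref + pseudocount) / total_ref
        per_gene[gene].append(log2(freq_dying / freq_ref))
    return {gene: median(values) for gene, values in per_gene.items()}


def call_hits(scores, threshold=1.0):
    """Return genes whose score meets the (hypothetical) threshold."""
    return sorted(gene for gene, score in scores.items() if score >= threshold)


# Invented read counts for two candidate genes and a non-targeting control.
example_counts = [
    ("CANDIDATE_1", "g1", 400, 100),
    ("CANDIDATE_1", "g2", 300, 100),
    ("CANDIDATE_2", "g1", 350, 100),
    ("CANDIDATE_2", "g2", 450, 100),
    ("NONTARGETING", "g1", 100, 400),
    ("NONTARGETING", "g2", 100, 400),
]
example_scores = gene_scores(example_counts)
example_hits = call_hits(example_scores)
```

Aggregating by the median across guides limits the influence of a single guide with off-target or inefficient activity. Real screens typically use many more guides per gene and a statistical model rather than a fixed cut-off.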

| Animal models
A variety of animal models (commonly including Caenorhabditis elegans, Drosophila melanogaster, zebrafish and mice) can be utilized to assess tissue specificity, disease progression and mechanisms associated with mitochondrial dysfunction. For example, in humans, pathogenic variants in TRMU (encoding the mitochondrial tRNA 5-methylaminomethyl-2-thiouridylate methyltransferase) are associated with deafness-associated mitochondrial disease but the pathophysiology was initially poorly understood. 98 A CRISPR-generated Mtu1 (homologue of human TRMU) knockout zebrafish model offered novel insights into the disease mechanisms underlying the deafness seen in patients, a phenotype that was recapitulated in the fish model. 95 A recent report of the first case of mitochondrial disease due to OXA1L variants utilized RNA interference of the Drosophila homologue, CG6404, to show defects in the assembly of complexes I, IV, and V in flies, which was consistent with what was observed in patient tissues. 99 Despite the ease of use of zebrafish and Drosophila and their ability to recapitulate mitochondrial disease in certain cases, mice remain the most commonly developed models for studying mitochondrial pathophysiology and dysfunction due to their genetic and physiological similarities to humans and the potential for testing novel therapies. 100 Some mouse models recapitulate human disease effectively; for example, mitochondrial translation optimization protein 1 (MTO1) knockout replicates the cardiomyopathy displayed in humans. 101 The mtDNA helicase Twinkle knock-in mouse phenotype correlates strongly with human disease presentation, including progressive neurodegeneration and epileptic seizures. 102 However, tissue-specific presentations are not always recapitulated; for example, the NDUFS4 mouse model reproduces a Leigh-like phenotype similar to that displayed by human patients, but not the characteristic basal ganglia changes. 103,104 Furthermore, some mouse models do not demonstrate biochemical phenotypes corresponding to those observed in human patients; for example, human SURF1 variants are associated with severe mitochondrial COX defects and early lethality, whereas SURF1−/− mice demonstrate increased longevity in the absence of many of the mitochondrial phenotypes associated with the human disease. 105,106 The aforementioned genes are all nuclear-encoded, highlighting the historic difficulties associated with generating faithful models of human disease related to pathogenic, heteroplasmic mtDNA variants. However, using random mutagenesis and a phenotype-driven approach, mice with a mutation (m.5024C>T) in the mitochondrial tRNA Ala gene have been generated, 107 and more recently used to test experimental therapeutic strategies. 108,109 These studies, among many others, demonstrate that animal models can be a powerful tool to study new disease genes, particularly when there is only a single patient or patient samples cannot be obtained. Moving forwards, high-throughput screening using targeted CRISPR/Cas9 in zebrafish and collaborations such as the International Mouse Phenotyping Consortium will enable functional and pathobiological investigations of many genes. 110,111 This will lead to the characterization of novel phenotypes and provide candidate genes for many genetically undiagnosed clinical diseases.

| "Multi-omic" approaches to investigate gene functions
As discussed earlier, the identification of causal variants in mitochondrial disease is moving slowly towards a "bi-omic" approach that analyses the genome and transcriptome. 50,86 However, truly "multi-omic" approaches that focus on investigating basic mitochondrial biology are emerging in a research setting. It is easy to envisage this approach elucidating novel disease mechanisms, especially when new disease genes of unknown function are identified. A recent study used analysis of mRNA, proteins, lipids and metabolites to identify over 90 targets of the RNA-binding protein Puf3p and delineate its role in coordinating Coenzyme Q and OXPHOS biogenesis in Saccharomyces cerevisiae. 112 Previous efforts to identify Puf3p targets delivered unclear results, as it was difficult to identify truly productive binding events. The integration of these four "omics" strategies highlights the power of a "multi-omics" approach in elucidating the function of a protein in certain situations, and we expect these approaches to be used more commonly in the future, including for investigating the function of novel mitochondrial disease genes.

| CONCLUDING REMARKS
The introduction of NGS into mainstream genetics, combined with the large number of mitochondrial disease gene candidates, means that putative pathogenic variants are now identified in many different scenarios. We recognize the many difficulties associated with assigning pathogenicity and, in our experience, the collaborative integration of accredited diagnostic pathways and research activity has proven vital. Clinical information can be extremely important in directing appropriate genetic testing, but we often advocate the use of WES or WGS on the basis of speed, comprehensive coverage of nuclear and mitochondrial DNA variants and simultaneous assessment of heteroplasmy. It must be reiterated that in such cases it is vital that trios are sequenced whenever possible to ensure rapid variant prioritization. Then, when novel disease genes are identified, case-specific techniques should be undertaken to further understand disease mechanisms as outlined (Figure 2). Rescue experiments, either in patient cell lines or in CRISPR-generated cell lines, are particularly important in proving the pathogenicity of a novel variant.
It is also worth highlighting the importance of international collaboration within the mitochondrial disease field. Identification of additional cases with similar clinical phenotypes is hugely beneficial in being able to confirm pathogenicity of variants in novel disease genes. Online tools such as GeneMatcher 113 allow cases to be brought together and aid in the collaborative effort to report new genes associated with disease. Collaboration has also facilitated studies that complement those in humans, as demonstrated by the International Mouse Phenotyping Consortium.
The main output measure in a diagnostic setting should relate to the help offered to families affected by mitochondrial disease. There is no cure for mitochondrial disease, so consistent and rapid diagnosis, best delivered using NGS, will ensure that these families are able to make informed decisions regarding the care of affected members and, for many, reproductive options. It is apparent that a "multi-omics" approach (eg, WES/WGS in addition to transcriptomics, proteomics, metabolomics, etc.) has the potential to increase the diagnostic yield even further and will no doubt result in the identification of new disease genes, which will further enhance the understanding of mitochondrial biology and disease mechanisms.