Extreme diversification of the mating type–high-mobility group (MATA-HMG) gene family in a plant-associated arbuscular mycorrhizal fungus



  • Arbuscular mycorrhizal fungi (AMF) are important plant symbionts that have long been considered evolutionary anomalies because of their apparent long-term lack of sexuality, but recent explorations of available DNA sequence have challenged this notion by revealing the presence of homologues of fungal mating type–high-mobility group (MATA-HMG) and core meiotic genes in these organisms.
  • To obtain more insights into the sexual potential of AMF, homologues of MATA-HMGs were sought in the transcriptome of three AMF isolates, and their functional and evolutionary trajectories were studied in genetically divergent strains of Rhizophagus irregularis using conventional and quantitative PCR procedures.
  • Our analyses revealed the presence of at least 76 homologues of MATA-HMGs in R. irregularis isolates. None of these was found to be surrounded by genes generally found near other known fungal mating type loci, but here we report the presence of a 9-kb-long region in the AMF R. irregularis harbouring a total of four tandem-repeated MATA-HMGs; a feature that highlights a potentially elevated intragenomic diversity in this AMF species.
  • The present study provides intriguing insights into the genome evolution of R. irregularis, and represents a stepping stone for understanding the potential of these fungi to undergo cryptic sex.


Arbuscular mycorrhizal fungi (AMF) are an ancient and ubiquitous group of obligate plant symbionts that are thought to have assisted the colonization of land by plants c. 450 million yr ago through the establishment of the mycorrhizal symbiosis, a close association of AMF with the roots of over 80% of land plants, and many important crops (Smith et al., 1997; Sanders, 2002, 2003; Corradi & Charest, 2011). The hallmark of this symbiosis is the transfer of photosynthetically produced carbohydrates from the plant to the fungus in exchange for increased supplies of water and mineralized nutrients, and for this reason AMF are considered key mutualists in terrestrial ecosystems (van der Heijden et al., 1998; Smith et al., 2000; Munkvold et al., 2004; Wagg et al., 2011).

Besides their ecological relevance, AMF are also intriguing from a cellular point of view, as they harbour coenocytic hyphae (i.e. aseptate) that are perpetually multinucleate (Rosendahl, 2008). These multinucleated hyphae have been proposed to be able to fuse through a process called anastomosis, which could allow nuclear exchange between genetically dissimilar AMF (Croll et al., 2009). AMF are also characterized by an unusually elevated molecular diversity within single spores, whose origin is currently debated (Lloyd-Macgilp et al., 1996; Lanfranco et al., 1999; Kuhn et al., 2001; Pawlowska & Taylor, 2004; Rodriguez et al., 2004; Hijri & Sanders, 2005; Pawlowska, 2005; Stukenbrock & Rosendahl, 2005; Heitman et al., 2007).

AMF are also peculiar from an evolutionary perspective, as they are currently thought to have propagated for the past 500 million yr in the absence of sexual reproduction. This remarkable long-term clonal evolution has resulted in these curious fungi being referred to as ‘ancient asexuals’ – that is, evolutionary ‘anomalies’ that should have long gone extinct as a consequence of their intrinsic incapacity to offset the accumulation of deleterious mutations over time (e.g. through meiotic recombination; Smith, 1986; Judson & Normark, 1996; Butlin et al., 1998; Welch & Meselson, 2000; Normark et al., 2003). Although sexual reproduction has yet to be formally observed in AMF, alternative scenarios for their long-term evolutionary and ecological success in the absence of sexual stages have started to emerge. In particular, AMF have been recently proposed to be capable of undergoing a cryptic sexual cycle following the identification of events of recombination in some natural populations (Vandenkoornhuyse et al., 2001; Croll & Sanders, 2009; den Bakker et al., 2010), and the detection of many homologues of genes essential for meiosis (Halary et al., 2011; Corradi & Lildhar, 2012) and those normally found in the mating type (MAT) loci of early diverging fungi (Tisserant et al., 2012).

Among these potential signatures of sex, the identification of Rhizophagus homologues of SexM and SexP is particularly intriguing, as it suggested an intrinsic potential for these fungi to harbour a bona fide MAT locus. Briefly, a fungal MAT locus is a genomic region found in most fungi that serves the common purpose of determining sexual compatibility between two individuals of outcrossing species, or a region required for sexual differentiation in self-fertile species. Sexual identity is determined by allelic variation at this locus, but the gene content and structure of the locus can differ quite drastically across members of the fungal kingdom (Lee et al., 2010). Alleles of the MAT locus are often called idiomorphs to denote sequences that occupy the same locus on a chromosome, but do not necessarily share conservation in sequence, gene order, or a common descent (Metzenberg & Glass, 1990). Overall, the MAT loci of diverse fungi can contain different combinations of genes, including homeodomain proteins, high-mobility group (HMG) transcription factors, and an alpha-box (which has recently been reclassified as an HMG domain; T. Martin et al., 2010), but the putative ancestral version of them all is currently found in zygomycetes, where the two idiomorphs are represented by MATA-HMG genes known as SexM or SexP (Idnurm et al., 2008).

The ancestral role of MATA-HMGs in fungal mating, combined with the fact that homologues of these genes are often found in most known fungal MAT loci, makes them ideal candidates for new signatures of sexual reproduction that might be found in surveys of AMF genomes. Indeed, homologues of SexM and SexP have been identified in Rhizophagus irregularis, but their diversity and genomic context in AMF are completely unknown. Here, we surveyed the transcriptomes of two isolates of R. irregularis (SwiC2, this study; and DAOM 197198, Tisserant et al. (2012)) and one of the closely related species Rhizophagus diaphanus (MUCL 43196, this study) for the presence of homologues of MATA-HMGs, and our explorations revealed the presence of a surprisingly elevated number of gene homologues in all strains investigated.

Materials and Methods

AMF culture, DNA extraction and conventional PCR

A total of 14 isolates (Table 1) from the AMF Rhizophagus irregularis and one from Rhizophagus diaphanus were investigated at different levels for the presence of MATA-HMG genes. Isolates of R. irregularis, referred to as SwiA4, SwiC2, SwiB3, SwiA1, SwiA5 and SwiA2, were previously harvested from one field in Taenikon, Switzerland, and cultivated within in vitro split-plates in symbiotic association with Ri (root inducing) T-DNA-transformed Daucus carota roots as previously described (Koch et al., 2006). SwiA4, SwiC2 and SwiB3 were chosen for preliminary assessment of among-isolate genetic variation, as they have been previously shown to be genetically and phenotypically different from each other (Koch et al., 2006; Corradi et al., 2007; Croll et al., 2008, 2009; Angelard & Sanders, 2011) but maintain the ability to undergo anastomosis between one another (Croll et al., 2009). Mycelium was harvested as previously described (Koch et al., 2006), and genomic DNA was extracted from mycelium using a protocol based on phenol-chloroform extraction combined with the MasterPure™ Complete DNA and RNA purification kit from Epicentre Biotechnologies (Madison, WI, USA). Upon extraction, DNA was precipitated using 99% isopropanol following an incubation for 30 min at −20°C, and a final centrifugation at 4°C at 10 000 g for 10 min. The supernatant was discarded and the DNA pellet was washed twice with 70% v/v ethanol, vacuum-dried and suspended in TE buffer.

Table 1. Strains of Rhizophagus irregularis used in this study
DAOM | Origin (name)
240477 | Poland (Pol)
234180 | Ripon Québec (CanQc2)
240201 | Îles-de-la-Madeleine Québec (CanQc4)
240448 | Tunisia (Tun)
240434 | Larose Forest Ontario (CanOn1)
240721 | Belgium, Louvain-la-Neuve (Bel)
229457 | Clarence Creek Ontario (CanOn2)
240159 | Revelstoke BC (CanBc)
A1 | Switzerland (SwiA1)
A2 | Switzerland (SwiA2)
A4 | Switzerland (SwiA4)
A5 | Switzerland (SwiA5)
B3 | Switzerland (SwiB3)
C2 | Switzerland (SwiC2)

Conventional PCR procedures were performed in 50-μl reaction volumes using 2X Econotaq (Lucigen, Middleton, WI, USA) mastermix, 0.2 μl of each primer pair (10 mM), and 10 ng of template DNA. PCR programmes generally followed: 95°C for 3 min, followed by 34 cycles of 94°C for 30 s, 55°C for 30 s, then 60°C elongation temperature for 60 s, and a final elongation for 5 min at 60°C. The resulting sequences were aligned onto existing transcripts using muscle (Edgar, 2004), or using the ‘Map to reference’ assembly function available within the geneious software package (Biomatters, Auckland, New Zealand).

Inverse PCR procedures

Genomic DNA (c. 100 ng) of isolate SwiC2 was digested with restriction enzymes, and the digests were precipitated with sodium acetate and 100% ethanol, and used in ligations with T4 DNA ligase (20 μl; 4°C for 16 h). The ligation reactions were used directly as the templates for PCR. Eleven restriction enzymes were used. ClaI, EcoRI, HindIII, KpnI, NdeI, PstI, XbaI and XhoI recognize 6-bp sites. Three enzymes, BfaI, HpaII and Sau3AI, recognize 4-bp sites. The restriction enzymes and T4 DNA ligase were purchased from New England Biolabs (Ipswich, MA, USA). Two PCR conditions were employed with Ex Taq (Takara, Shiga, Japan). The parameters were either 94°C for 2 min, then 32 cycles of 94°C for 20 s, 50°C for 20 s and 60°C for 4 min, or 94°C for 2 min, then 32 cycles of 94°C for 20 s, 50°C for 20 s and 68°C for 3 min. PCR reactions were resolved on 0.8% agarose 1X Tris-acetate-EDTA gels. PCR products were purified from agarose gel slices, and directly sequenced. In cases where amplicons were faint, they were cloned into the plasmid TOPO pCR2.1 (Invitrogen/Life Technologies, Grand Island, NY, USA). Independent plasmid clones were sequenced using the universal M13F and M13R primers, and internal primers. The sequence reads were assembled with sequencher software version 4.8 (Gene Codes Corporation, Ann Arbor, MI, USA). All primers used in the present study are listed in Supporting information Table S1. Inverse PCR procedures were instrumental in obtaining a total of five supercontigs, which have been deposited in GenBank under the following accession numbers: KC517357 (SwiC2), KC571706 (SwiA4), KC571707 (SwiA4), KC597699 (SwiB3) and KC597698 (SwiB3).
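The choice of both 6-bp and 4-bp cutters affects the size of the fragments available for circularization. As a rough illustration only, the sketch below computes the expected average spacing between recognition sites under the simplifying assumption of equal base frequencies (an assumption that is unlikely to hold exactly for AT-rich AMF genomes). The function name is hypothetical and the calculation is not part of the original protocol.

```python
# Expected average spacing between restriction sites, ASSUMING the four
# bases are equally frequent and independent (a simplification).

def expected_site_spacing(site_length: int, n_bases: int = 4) -> int:
    """Average distance in bp between occurrences of one fixed site."""
    return n_bases ** site_length

# Enzymes listed in the text, grouped by recognition-site length.
six_cutters = ["ClaI", "EcoRI", "HindIII", "KpnI", "NdeI", "PstI", "XbaI", "XhoI"]
four_cutters = ["BfaI", "HpaII", "Sau3AI"]

for name in six_cutters:
    print(name, expected_site_spacing(6), "bp")
for name in four_cutters:
    print(name, expected_site_spacing(4), "bp")
```

Under this assumption, 6-bp cutters would yield fragments of a few kilobases on average and 4-bp cutters fragments of a few hundred base pairs, so the two classes sample different distances from the known sequence.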

RNA extraction and qPCR procedures

Total RNA was isolated using the QIAgen Plant RNA extraction kit (Qiagen, Venlo, Netherlands). Between 20 and 40 mg of fresh mycelium was crushed in β-mercaptoethanol diluted with the manufacturer's lysis buffer using a plastic pestle. The homogenized mycelium was then subjected to RNA extraction following the manufacturer's instructions. Upon RNA extraction, the solution was treated with RNase-Free DNase I (Epicentre Biotechnologies) at 37°C for 1 h, followed by DNase I treatment in the presence of 200 μl of T and C Lysis solution (Epicentre Biotechnologies). The resulting solution was vortexed for 5 s, 200 μl of MPC protein precipitation reagent (Epicentre Biotechnologies) was added for protein precipitation, and the mixture was vortexed for an additional 5 s. The solution was placed on ice for 5 min, followed by a 10 000 g centrifugation at 4°C for 10 min to pellet debris. The supernatant was placed in a new tube, and RNA was precipitated using isopropanol, washed with 70% v/v ethanol, vacuum-dried, and ultimately re-suspended in RNase-free H2O (Epicentre Biotechnologies). RNA concentration was determined using a Nanodrop spectrophotometer (Fisher Scientific, Waltham, MA, USA). In all cases, 1 μg of DNase-free RNA was immediately subjected to RT-PCR using the iScript kit (Bio-Rad Laboratories, Hercules, CA, USA) following the manufacturer's protocol. In addition, 0.5 μg of RNA was heated at 65°C for 5 min and then run on a 2% agarose 1X Tris-EDTA gel to visually inspect RNA for consistent quality across samples. No-RT controls were performed alongside all cDNA syntheses reported in the present study to confirm the absence of genomic DNA contamination.

A total of six AMF MATA-HMG genes (three variable and three monomorphic; HMG6, HMG1, HMG37, HMG22, HMG52 and HMG65) were selected for analysis of gene expression using quantitative real-time PCR (qPCR). Selection of the three gene targets showing allelic variation (i.e. variable HMGs: HMG6, HMG1 and HMG37) for qPCR analyses was based on the presence of substantial divergence (> 90%) at the amino acid level between at least two out of three isolates used for comparisons (SwiA4, B3 and C2; HMG37 is the sole exception), and successful optimization of primer sets – that is, only HMGs having over 90% divergence at the amino acid level that produced consistent standard curves resulting from optimal primer specificity were used for downstream qPCR analyses. Invariable HMGs were selected randomly but all produced consistent standard curves. The expression of housekeeping genes encoding β-tubulin, actin and EF1α was used for comparison. Primers for qPCR were designed to amplify a small region of six selected HMG sequences, and three housekeeping genes (β-tubulin, actin and EF1α; Table S1). Real-time PCR reactions were carried out on a CFX 96 thermal cycler (Bio-Rad Laboratories) and analysed with the Bio-Rad cfx manager software V2.0 (Bio-Rad Laboratories). All reactions were performed according to the manufacturer's conditions, and consisted of 0.6 μl of H2O, 0.2 μl each of forward and reverse primers, 5 μl of Ssofast 2X master mix (Bio-Rad Laboratories) and 4 μl of cDNA diluted 40×. In all cases, optimum annealing temperature, primer specificity and amplification efficiency were determined, respectively, using a PCR temperature gradient, a single melt-curve peak as well as gel electrophoresis and a serial dilution (Taylor et al., 2010), with a cDNA mixture containing equal amounts of cDNA extracted from each condition. For all primers used, a 57°C annealing temperature provided robust PCR amplifications of the desired target sequence, and reaction efficiencies estimated from the serial dilutions fell within the lower and upper thresholds of 90% and 110%, with a minimum R2 value of 0.98, suggesting optimal qPCR conditions (Taylor et al., 2010). Amplifications were performed using the following conditions: initial denaturing of 95°C for 3 min followed by 40 cycles of 95°C for 10 s and 57°C for 6 s and a final melt curve from 65 to 95°C, with 0.5°C increments, holding at each step for 5 s, with ‘no template’ controls included for every target in each run.
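The efficiency and R2 thresholds mentioned above are conventionally derived from a standard curve of Cq against the log10 of relative template amount. The following is a minimal sketch of that calculation, assuming the standard relationship E = 10^(-1/slope) - 1; the function names and the dilution-series values are hypothetical and do not come from the present study.

```python
# Sketch: estimating qPCR amplification efficiency from a dilution series.
# All data values below are HYPOTHETICAL and for illustration only.

def standard_curve(log10_amounts, cq_values):
    """Least-squares fit of Cq on log10(template amount).

    Returns (slope, intercept, r_squared).
    """
    n = len(cq_values)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    syy = sum((y - mean_y) ** 2 for y in cq_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

def efficiency_percent(slope):
    """Amplification efficiency in percent: (10^(-1/slope) - 1) * 100."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0

def passes_thresholds(slope, r_squared, low=90.0, high=110.0, min_r2=0.98):
    """Apply the 90-110% efficiency window and the minimum R2 of 0.98."""
    return low <= efficiency_percent(slope) <= high and r_squared >= min_r2

# Hypothetical 10-fold dilution series (log10 of relative amount) and Cq values.
log_amounts = [0, -1, -2, -3, -4]
cq = [18.0, 21.4, 24.7, 28.1, 31.4]

slope, intercept, r2 = standard_curve(log_amounts, cq)
print(round(slope, 3), round(efficiency_percent(slope), 1), round(r2, 4))
print(passes_thresholds(slope, r2))
```

A slope close to -3.32 corresponds to a doubling of product per cycle (100% efficiency); shallower or steeper slopes move the estimate above or below that value.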

Target gene expression was determined by cycle of quantification (Cq) values, using the single baseline threshold provided with the cfx software V2.0 (Bio-Rad Laboratories). Inter-run calibrations within the cfx manager software were implemented following the manufacturer's recommendations to compare experimental conditions which were measured on separate qPCR runs and used in order to normalize for inter-plate variability (Pabinger et al., 2009). In this case, the software adjusts Cq values of all samples for each target gene between runs using the pairwise difference of each target gene amplified from a common sample defined by the software, which was present on both runs. For all genes, normalization of gene expression was performed using the data-driven normalization algorithm implemented in NORMA-Gene (Heckmann et al., 2011). This latter algorithm runs in Microsoft Excel, and estimates a normalization factor by calculating mean expression values for each replicate of all target genes (Navarro-Martín et al., 2012). This effectively reduces technical variation (i.e. as a result of experimental bias), but has no effect on relative differences between treatments (Holmstrup et al., 2011). Normalization was performed using 10 target genes (a minimum of five genes is suggested to yield robust results; Heckmann et al., 2011). Student t-tests (two-tailed), assuming unequal variances for all comparisons, were used to measure statistically significant changes in the expression of a target gene between growth conditions. Changes in expression were considered statistically significant at P < 0.05.
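A t-test that assumes unequal variances is commonly known as Welch's t-test. The sketch below shows how the test statistic and the Welch-Satterthwaite degrees of freedom can be computed for two groups of replicate measurements. The function name and the example values are hypothetical, and the P-value itself (obtained from the t distribution with these degrees of freedom, for instance through a statistics library) is deliberately left out.

```python
# Sketch of Welch's t statistic (two-sample t-test, unequal variances).
# Example values are HYPOTHETICAL normalized expression levels.
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df): Welch's t and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb                          # squared standard error
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

standalone = [1.0, 1.2, 0.9]   # three hypothetical biological replicates
outcrossed = [2.0, 2.3, 1.8]

t_stat, dof = welch_t(standalone, outcrossed)
print(round(t_stat, 3), round(dof, 3))
```

With three replicates per group, the degrees of freedom fall between 2 and 4, which is one reason why only fairly large expression changes reach significance in designs of this size.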

Acquisition of transcriptome data from R. irregularis and R. diaphanus and identification of AMF MATA_HMG domains

RNA isolated from in vitro cultures of R. irregularis and R. diaphanus was subjected to Illumina sequencing at Fasteris S.A. (Geneva, Switzerland). The respective RNA extracts were used to produce cDNA libraries following the company's protocols, and were then sequenced using one complete channel on the HiSeq 2000 instrument (Illumina, San Diego, CA, USA). The sequencing procedure resulted in 201 051 108 and 176 382 504 reads (100 bp long) for R. diaphanus and R. irregularis-SwiC2, respectively. Reads were assembled using velvet oases (Schulz et al., 2012) with a hash value of 93 for both species, resulting in the acquisition of c. 20 000 contigs for each species. Homologues of fungal mating type HMG genes representative of different fungal phyla (MATA-HMG; n = 25; Table S2) were searched across available transcriptome sequence data newly obtained from R. irregularis (strain SwiC2; this study) and R. diaphanus (MUCL 43196; this study), and publicly available transcriptome data from R. irregularis (DAOM 197198; Tisserant et al., 2012) using reciprocal Blast procedures (i.e. BlastX, tBlastX, BlastP and tBlastN). For comparisons, similar searches were also performed across publicly available genome sequence data from members of the Chytridiomycota, Zygomycota, Basidiomycota and Ascomycota (i.e. the chytrids Allomyces macrogynus and Batrachochytrium dendrobatidis; the basidiomycetes Ustilago maydis, Puccinia graminis and Cryptococcus neoformans; the ascomycetes Saccharomyces cerevisiae, Aspergillus nidulans and Neurospora crassa; and the zygomycetes Phycomyces blakesleeanus and Rhizopus oryzae). All potential MATA-HMGs identified in the transcriptome data from Rhizophagus spp. (this study) and other fungal genomes were further compared against the GenBank nr database in order to confirm their homology, and their sequences were manually inspected to avoid redundancy. The accession numbers resulting from these analyses are shown in Table 2.
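The two-step logic of such a search (reference queries against the transcriptomes, then candidate transcripts back against a general database) can be sketched as follows. This is only one plausible way to organize the confirmation step, not the pipeline actually used: the tuple layouts, the function name and all identifiers in the example are hypothetical, and real Blast output would first need to be parsed into this form.

```python
# Sketch of a reciprocal confirmation step for Blast candidates.
# Data structures and identifiers are HYPOTHETICAL placeholders.

def confirmed_candidates(forward_hits, back_hits, target_annotation="MATA_HMG"):
    """Keep candidate transcripts whose best back-search hit carries the target annotation.

    forward_hits: iterable of (reference_query, transcript, evalue)
    back_hits:    iterable of (transcript, db_accession, annotation, evalue)
    Returns {transcript: (db_accession, annotation, evalue)}.
    """
    candidates = {transcript for _, transcript, _ in forward_hits}
    best_back = {}
    for transcript, accession, annotation, evalue in back_hits:
        if transcript not in candidates:
            continue  # ignore transcripts never hit by a reference query
        if transcript not in best_back or evalue < best_back[transcript][2]:
            best_back[transcript] = (accession, annotation, evalue)
    return {t: hit for t, hit in best_back.items() if hit[1] == target_annotation}

forward = [
    ("SexM_ref", "transcript_1", 1e-10),
    ("SexP_ref", "transcript_2", 1e-4),
    ("SexM_ref", "transcript_3", 0.5),
]
back = [
    ("transcript_1", "acc_A", "MATA_HMG", 1e-12),
    ("transcript_1", "acc_B", "Sox_TCF", 1e-3),
    ("transcript_2", "acc_C", "Sox_TCF", 1e-8),
    ("transcript_2", "acc_D", "MATA_HMG", 1e-2),
    ("transcript_4", "acc_E", "MATA_HMG", 1e-20),
]

confirmed = confirmed_candidates(forward, back)
print(sorted(confirmed))
```

An automated filter of this kind would still need the manual inspection described above, for example to remove redundant transcripts or to re-examine candidates whose domain annotation is ambiguous.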

Table 2. List of predicted mating type–high-mobility group (MAT-HMG) domains found within Rhizophagus irregularis and Rhizophagus diaphanus (MUCL 43196) isolates and the reciprocal Blast first hit of each MAT-HMG domain-containing query sequence
MAT-HMG # | Query accession | Organism | Annotation | Protein | E-value | First Blast hit accession | SexM or SexP from Rhizopus sp. (E-value)
1 | remain_C20901 (a) | Fusarium sacchari | MATA_HMG | MAT1-2-1 | 0.003 | BAE94382.1 | SexM (0.070)
2 | BM959072.1 (b) | Xanthoria polycarpa | MATA_HMG | MAT1-2-1 | 0.001 | CAI59768.2 | SexM (0.23)
3 | remain_C19309 (a) | Colletotrichum higginsianum | MATA_HMG | HMG box protein | 4.00E-12 | CCF36618.1 | SexM (9E-09)
4 (c) | remain_C5083 (a) | Metarhizium acridum | MATA_HMG | HMG transcription factor | 8.00E-05 | EFY86728.1 | SexM (3E-04)
5 | KC785106 (b) | Grosmannia clavigera | MATA_HMG | MAT1-2-1 | 0.004 | EFX05114.1 | SexM (0.015)
6 | KC785107 (b) | Verticillium alboatrum | MATA_HMG | Predicted protein | 4.00E-06 | XP_003007798.1 | SexM (0.012)
7 | KC517357 (b) | Melampsora larici-populina | MATA_HMG | Hypothetical protein | 9.00E-07 | EGF99649.1 | SexM (2E-05)
8 | KC785103 (b) | Talaromyces marneffei | MATA_HMG | MAT1-2-1 | 0.003 | ABC68485.1 | SexP (0.002)
9 | GW085214.1 (b) | Schizosaccharomyces pombe | MATA_HMG | Mc 2 | 2.00E-10 | NP_595867.1 | SexM (7E-06)
10 | KC785109 (b) | Penicillium chrysogenum | MATA_HMG | Hypothetical | 4.00E-05 | XP_002564591.1 | SexP (0.17)
11 | KC785101 (b) | Erysiphe necator | MATA_HMG | MAT1-2-1 | 0.079 | AEB33764.1 | SexM (0.71)
12 | GW085640.1 (b) | Diaporthe sp. | MATA_HMG | Mating type gene | 8.00E-10 | BAE93753.1 | SexM (4E-11)
13 | KC785098 (b) | Penicillium chrysogenum | MATA_HMG | Mating type gene | 3.00E-11 | CCE33026.1 | SexM (9E-11)
14 (c) | GW086179.1 (b) | Mucor mucedo | MATA_HMG | SexM | 4.00E-10 | AFA26123.1 | SexM (1E-10)
15 | KC785099 (b) | Cryphonectria parasitica | MATA_HMG | MAT1-1-3 | 4.00E-04 | AAK83344.1 | SexM (0.023)
16 | GW088572.1 (b) | Trametes versicolor | MATA_HMG | HMG box protein | 5.00E-04 | EIW55118.1 | SexP (1.1)
17 | KC785097 (b) | Cladonia galindezii | MATA_HMG | MAT1-2 | 1.00E-04 | AAT48651.1 | SexP (0.004)
18 | GW088698.1 (b) | Fusarium oxysporum | MATA_HMG | Mat-2 protein | 4.00E-14 | BAA28611.1 | SexM (2E-13)
19 | KC785108 (b) | Rhynchosporium secalis | MATA_HMG | HMG box protein | 3.00E-04 | CAD62166.1 | SexP (0.006)
20 | GW088846.1 (b) | Schizosaccharomyces japonicus | MATA_HMG | MatMc | 1.00E-17 | AFM85245.1 | SexM (1E-17)
21 | KC785100 (b) | Fibroporia radiculosa | MATA_HMG | Predicted protein | 0.002 | CCM00606.1 | SexM (0.090)
22 | KC785104 (b) | Trichoderma atroviride | MATA_HMG | Hypothetical protein | 0.006 | EHK50111.1 | SexM (0.12)
23 | KC785105 (b) | Cercospora apiicola | MATA_HMG | MAT1-2 | 3.00E-10 | ABB83710.1 | SexM (1E-06)
24 | KC785102 (b) | Rhynchosporium secalis | MATA_HMG | HMG box protein | 2.00E-10 | CAD62166.1 | SexM (1E-06)
25 | KC785120 (b) | Fibroporia radiculosa | MATA_HMG | Predicted protein | 0.002 | CCM01306.1 | SexM (0.002)
26 | GW090102.1 (b) | Talaromyces marneffei | MATA_HMG | HMG box | 6.00E-12 | XP_002151220.1 |
27 | KC785126 (b) | Rhynchosporium secalis | MATA_HMG | HMG box protein | 3.00E-04 | CAD62166.1 | SexP (0.011)
28 | GW093400.1 (b) | Trametes versicolor | MATA_HMG | Hypothetical | 2.00E-09 | EIW63176.1 | SexM (3E-09)
29 (c) | KC814215 (b) | Talaromyces marneffei | MATA_HMG | MAT1-2-1 | 0.97 | ABC68485.1 | SexM (1.7)
30 | GW098009.1 (b) | Piriformospora indica | MATA_HMG | Hypothetical | 5.00E-11 | CCA67490.1 | SexM (1E-10)
31 | KC785117 (b) | Paracoccidioides brasiliensis | MATA_HMG | MAT1-2 | 3.00E-05 | AEI83491.1 | SexM (3E-04)
32 | GW098177.1 (b) | Penicillium chrysogenum | MATA_HMG | HMG box | 0.006 | XP_002564591.1 | SexM (0.013)
33 | KC785121 (b) | Ustilago hordei | MATA_HMG | Prf1 | 2.00E-11 | CCF52951.1 | SexP (1E-07)
34 | GW103650.1 (b) | Talaromyces marneffei | MATA_HMG | MAT1-2-1 | 1.00E-04 | XP_002152469.1 | SexM (0.16)
35 | KC785119 (b) | Pneumocystis murina | MATA_HMG | Hypothetical protein | 3.00E-06 | EMR09318.1 | SexM (3E-04)
36 | GW103707.1 (b) | Rhizopus oryzae | MATA_HMG | SexP | 0.28 | ADU04732.1 | SexP (2E-04)
37 | GW090024.1 (b) | Trametes versicolor | MATA_HMG | Hypothetical protein | 1.8 | EIW55066.1 |
38 | GW111241.1 (b) | Schizosaccharomyces pombe | MATA_HMG | mc 2 | 5.00E-05 | NP_595867.1 | SexM (0.011)
39 | KC785113 (b) | Fibroporia radiculosa | MATA_HMG | Hypothetical | 8.00E-05 | CCM01306.1 | SexP (0.58)
40 | GW112127.1 (b) | Gaeumannomyces graminis | MATA_HMG | Hypothetical | 4.00E-04 | EJT79613.1 | SexM (0.004)
41 | KC785110 (b) | Aspergillus kawachii | MATA_HMG | Hypothetical | 2.4 | GAA93066.1 | SexP (0.003)
42 | GW115599.1 (b) | Zymoseptoria tritici | MATA_HMG | Hypothetical | 4.00E-04 | XP_003855030.1 | SexM (3E-06)
43 | GW088880.1 (b) | Talaromyces marneffei | MATA_HMG | MAT1-2-1 | 0.008 | ABC68485.1 | SexM (0.14)
44 | GW118912.1 (b) | Cryphonectria parasitica | MATA_HMG | MAT1-1-3 | 1.00E-07 | AAK83344.1 | SexM (0.014)
45 | KC785115 (b) | Talaromyces stipitatus | MATA_HMG | MAT1-2-1 | 2.00E-06 | XP_002488738.1 | SexM (5E-06)
46 | GW120847.1 (b) | Trametes versicolor | MATA_HMG | Hypothetical | 1.00E-08 | EIW52457.1 | SexM (4E-06)
47 | KC785125 (b) | Schizophyllum commune | MATA_HMG | Hypothetical | 3.00E-05 | XP_003029891.1 | SexM (1E-04)
48 | GW122078.1 (b) | Piriformospora indica | MATA_HMG | Hypothetical | 1.00E-04 | CCA72393.1 | SexM (7E-04)
49 | KC785113 (b) | Syzygites megalocarpus | MATA_HMG | SexP | 0.010 | AET35404.1 | SexP (6E-04)
50 | remain_C10144 (a) | Mycosphaerella populorum | MATA_HMG | Hypothetical protein | 1 | EMF17374.1 | SexP (5E-04)
51 | KC785111 (b) | Xanthoria polycarpa | MATA_HMG | MAT1-2-1 | 0.002 | CAI59768.2 | SexM (0.051)
52 | remain_C10514 (a) | Rhynchosporium secalis | MATA_HMG | HMG box protein | 2.00E-09 | CAD62166.1 | SexM (2E-08)
53 | KC785128 (b) | Talaromyces marneffei | MATA_HMG | MAT1-2-1 | 3.00E-07 | ABC68485.1 | SexM (0.002)
54 | remain_C15333 (a) | Metarhizium acridum | MATA_HMG | HMG transcription factor | 3.00E-07 | EFY86728.1 | SexM (1E-07)
55 (c) | KC785127 (b) | Tremella mesenterica | MATA_HMG | Hypothetical | 0.079 | EIW72397.1 | SexM (1.5)
56 | remain_C16561 (a) | Debaryomyces hansenii | MATA_HMG | DEHA2E09460p | 0.007 | XP_459717.2 | SexP (1E-04)
57 (c) | remain_C20306 (a) | Metarhizium anisopliae | MATA_HMG | MAT1-1-3 | 0.005 | BAE93596.1 |
58 (c) | remain_C26978 (a) | Phomopsis sp. | MATA_HMG | MAT1-2-1 | 0.058 | AFP89369.1 | SexM (0.10)
59 | remain_C2725 (a) | Pyrenophora teres | MATA_HMG | Hypothetical | 1.00E-16 | XP_003297060.1 | SexM (3E-15)
60 | remain_C2727 (a) | Salpingoeca sp. | MATA_HMG | Hypothetical | 0.034 | EGD75508.1 | SexP (1E-06)
61 | remain_C28337 (a) | Schizosaccharomyces japonicus | MATA_HMG | matMc | 0.006 | AFM85245.1 | SexP (0.005)
62 | remain_C5967 (a) | Piriformospora indica | MATA_HMG | Hypothetical | 1.00E-04 | CCA72393.1 | SexM (0.014)
63 (d) | remain_C6410 (a) | Acyrthosiphon pisum | Sox_TCF | Hypothetical | 1.00E-05 | XP_003245920.1 |
64 (c) | remain_C8213 (a) | Metarhizium anisopliae | MATA_HMG | MAT1-1-3 | 6.1 | EFZ01123.1 |
65 | remain_C2832 (a) | Colletotrichum higginsianum | MATA_HMG | HMG box protein | 2.00E-04 | CCF38267.1 | SexM (3E-04)
66 (c) | remain_C9612 (a) | Schizosaccharomyces pombe | MATA_HMG | matMc | 0.75 | AAB28876.1 | SexM (0.001)
67 | KC785112 (b) | Colletotrichum higginsianum | MATA_HMG | HMG box protein | 0.016 | CCF38267.1 | SexP (8E-04)
68 | KC785123 (b) | Phycomyces blakesleeanus | MATA_HMG | SexM | 9.00E-06 | ABX27909.1 | SexM (9E-06)
69 | KC785122 (b) | Verticillium longisporum | MATA_HMG | MAT1-1-2 | 0.096 | AEA29200.1 |
70 | KC785124 (b) | Gaeumannomyces graminis | MATA_HMG | Hypothetical | 0.002 | EJT79613.1 | SexP (9E-06)
71 | KC785118 (b) | Verticillium dahlia | MATA_HMG | Hypothetical | 8.00E-06 | EGY15843.1 | SexP (2E-06)
72 | GW104503.1 (b) | Hymenoscyphus pseudoalbidus | MATA_HMG | MAT1-2-1 | 0.001 | AFQ90566.1 | SexM (0.002)
73 | GW083220.1 (b) | Colletotrichum higginsianum | MATA_HMG | HMG box protein | 4.00E-04 | CCF38267.1 | SexP (0.004)
74 | remain_C3698 (a) | Baudoinia compniacensis | MATA_HMG | MAT1-2-1 | 2.00E-05 | EMC98166.1 | SexM (1E-05)
75 | GW082685.1 (b) | Beauveria bassiana | MATA_HMG | HMG box protein | 1.00E-08 | EJP65574.1 | SexM (7E-11)
76 | KC785116 (b) | Xanthoria polycarpa | MATA_HMG | MAT1-2-1 | 0.012 | CAI59768.2 | SexM (0.024)

The accession numbers of the query sequences are shown, along with the organism's name, the predicted domain name, the functional domain annotation and the e-value, and accession of the first Blast hit retrieved from the National Center for Biotechnology Information nr database is listed for each Rhizophagus spp. MAT-HMG gene. The last column shows whether the query sequences are more similar to SexM or SexP from R. delemar. Absence of information in the last column denotes a lack of similarity with SexM or SexP.
(a) Transcripts can be accessed at the URL: http://mycor.nancy.inra.fr/IMGC/GlomusGenome/index3.html.
(b) Transcripts can be accessed in the NCBI EST database.
(c) Potential pseudogenes (i.e. stop codons within the ORF).
(d) CDD search indicates this gene contains a MATA_HMG domain.

Crossing experiments

Two experimental designs, based on in vitro culturing of AMF, were used to determine whether some R. irregularis MATA-HMGs may be involved in the process of partner recognition. The first type of in vitro culture represents a negative control, where one single isolate was grown alone in M-medium (St-Arnaud et al., 1996), so the RNA isolated from these cultures represents a collection of transcripts originating from the mycelial network of one single isolate. These were called ‘standalone’ cultures (Fig. 1a). The second type of culture was designed to harvest hyphae originating from two interacting mycelial networks. These were called the ‘crossing’ cultures (Fig. 1b). Crossing cultures consist of 150-mm (diameter) circular plates with subcompartments created with 70-mm circular plates. 70-mm plates contained M-medium with sugar, while the area within the 150-mm plate but outside the 70-mm plate contained M-medium without sugar to avoid extensive root proliferation outside the 70-mm plate container. ‘Standalone’ plates contained one 70-mm plate, while crossing plates contained two 70-mm plates. Hyphal exit points (two in the crossing cultures, and six in the standalone plates) were created in the 70-mm plates by heating 1-cm-wide tweezers and melting openings in the edge of the 70-mm dish down to the level of the M-medium. M-medium bridges were later produced using a pipette across all exit points (Fig. 1b).

Figure 1.

Drawing of the standalone (a) and crossing (b) cultures. Smaller circles within the larger circle are 70-mm plates inside the 150-mm plates. Gaps in small circles are exit points. Brown lines inside the 70-mm plates represent carrot roots. Finer black lines coming out of the carrot roots represent hyphae.

In crossing experiments, a total of three isolates were used, namely the isolates SwiA4, SwiC2 and SwiB3 of R. irregularis. These were chosen because they originated from the same population in Switzerland (Koch et al., 2004, 2006; Croll et al., 2008), but genetically diverge from each other and are capable of undergoing anastomoses. Each of these three isolates was grown in replicates of three using the two abovementioned cultures and the following conditions: ‘standalone’, ‘self-crossed’, where two mycelial networks of the same isolate were present (referred to as A4–A4, C2–C2 or B3–B3), or ‘outcrossed’, where two mycelial networks of different isolates were present (referred to as A4–C2, C2–B3 or A4–B3). Throughout the AMF culturing period, which always lasted a total of 32 d at 25°C, carrot roots growing out of the 70-mm dish were redirected back into the subcompartment, while hyphae were allowed to grow through the exit points containing M-medium bridges, and thus proliferate and interact with other hyphae in the same compartment. To obtain good RNA yields, two plates per biological replicate were pooled.
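The full set of culture conditions can be enumerated programmatically, which is a convenient way to check sample counts before an experiment of this kind. The sketch below assumes that every condition was run with three biological replicates and two pooled plates per replicate, which is one reading of the design described above; the variable names are hypothetical.

```python
# Sketch: enumerating the crossing-experiment conditions.
# ASSUMES three replicates per condition and two pooled plates per replicate.
from itertools import combinations

isolates = ["A4", "C2", "B3"]

standalone = list(isolates)
self_crossed = [f"{iso}-{iso}" for iso in isolates]
outcrossed = [f"{a}-{b}" for a, b in combinations(isolates, 2)]

conditions = standalone + self_crossed + outcrossed

replicates_per_condition = 3
plates_per_replicate = 2

n_rna_samples = len(conditions) * replicates_per_condition
n_plates = n_rna_samples * plates_per_replicate

print(conditions)
print(n_rna_samples, n_plates)
```

Under these assumptions the design comprises nine conditions, and the pooled plates translate into one RNA sample per biological replicate.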

Identification of recombination events and phylogenetic reconstruction of MAT-HMG genes

Recombination events were detected using seven tools implemented in the program package rdp 4.13 (D. P. Martin et al., 2010), including rdp (Martin & Rybicki, 2000), geneconv (Padidam et al., 1999), chimaera (Posada & Crandall, 1998), MaxChi (Smith, 1992), BootScan (Salminen et al., 1995), SiScan (Gibbs et al., 2000) and 3seq (Boni et al., 2007). These methods provide separate assessments of recombination events – that is, BootScan, SiScan and rdp detect recombination based on phylogenetic methods, while geneconv, chimaera, MaxChi and 3seq are alignment-based (D. P. Martin et al., 2010). All settings in rdp were left as default with the exception that sequences were set to linear (as opposed to circular). Auto sequence masking was applied to alignments as suggested by the rdp manual in order to remove highly similar sequences from the scan for recombination (D. P. Martin et al., 2010). As the program infers a potential recombinant and two potential parental sequences (D. P. Martin et al., 2010), we applied sequence masking to all sequences in an alignment but three sequences for each scan for recombination (Vergin et al., 2007).
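Masking all but three sequences per scan implies one scan per triplet of sequences in the alignment. The sketch below lists those triplets and the sequences that would be masked in each case; it illustrates the combinatorics only, does not call rdp, and uses hypothetical sequence names.

```python
# Sketch: which sequences stay unmasked in each triplet-based scan.
# Sequence names are HYPOTHETICAL; this does not run any recombination test.
from itertools import combinations
from math import comb

def triplet_scans(sequence_names):
    """Yield (unmasked_triplet, masked_names) for every possible triplet."""
    names = list(sequence_names)
    for triplet in combinations(names, 3):
        masked = [name for name in names if name not in triplet]
        yield triplet, masked

alignment = ["seq1", "seq2", "seq3", "seq4"]
scans = list(triplet_scans(alignment))

for unmasked, masked in scans:
    print(unmasked, "masked:", masked)

# The number of scans grows quickly with alignment size.
print(comb(14, 3))
```

For an alignment of 14 sequences, for example, this scheme would correspond to 364 possible triplets, which is why such scans are usually scripted or restricted to sequences that survive the automatic masking step.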

Following identification of recombination, MATA-HMG orthologues from 14 isolates were used in combination with the neighbour-network algorithm available in splitstree v4.6 (Huson, 1998) to test for reticulate branching patterns, with p-distance and 1000 bootstrap replicates to retrieve branch support. The Φw test for recombination implemented in splitstree was also used as an additional measure of recombination (Bruen et al., 2006). Sequence alignments on either side of the suggested recombination breakpoints were also analysed using phylogenetics to independently test for phylogenetic incongruence. In this case, the best-fit model of nucleic acid substitution among 88 models implemented in jModelTest was selected (Posada, 2009), using the Akaike information criterion corrected for small sample sizes (AICc), and maximum likelihood trees were constructed using PhyML 3.0 (Guindon et al., 2010), with default settings, the only exceptions being that the starting tree topology search was set to best of nearest neighbour interchange (NNI) and subtree pruning and regrafting (SPR), and 100 bootstrap replicates used for branch support.
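Model selection by AICc amounts to computing, for each candidate model, AICc = 2k - 2lnL + 2k(k + 1)/(n - k - 1), where k is the number of free parameters, lnL the maximized log-likelihood and n the sample size, and then retaining the model with the lowest value. The sketch below applies this formula to hypothetical likelihoods; the model names, likelihood values, parameter counts and alignment length are illustrative assumptions, and the parameter counts used by a real program (which may also include branch lengths) can differ.

```python
# Sketch: choosing a substitution model by AICc.
# Likelihoods, parameter counts and alignment length are HYPOTHETICAL.

def aicc(log_likelihood, n_params, n_sites):
    """Akaike information criterion with small-sample correction."""
    aic = 2 * n_params - 2 * log_likelihood
    correction = (2 * n_params * (n_params + 1)) / (n_sites - n_params - 1)
    return aic + correction

def best_model(fits, n_sites):
    """fits maps model name -> (log_likelihood, n_params); lowest AICc wins."""
    return min(fits, key=lambda name: aicc(fits[name][0], fits[name][1], n_sites))

fits = {
    "JC": (-1050.0, 1),
    "HKY": (-1010.0, 5),
    "GTR+G": (-1008.0, 10),
}
n_sites = 500

for name, (lnl, k) in fits.items():
    print(name, round(aicc(lnl, k, n_sites), 2))
print("selected:", best_model(fits, n_sites))
```

In this toy example the most parameter-rich model has the highest likelihood but is not selected, because its small gain in fit does not offset the penalty for additional parameters.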

A phylogeny of AMF MATA-HMG genes was reconstructed using PhyML (Guindon & Gascuel, 2003) and the LG model. In this case, amino acid sequences from representatives of SexM/P of the Mucorales, as well as AMF MAT-HMG genes from Table 2 with E-values ≤ 1E-10 and all their best reciprocal hits, were extracted and aligned using muscle (Edgar, 2004) and trimmed using trimal (Capella-Gutiérrez et al., 2009) with the Gappyout method. A total of 1000 bootstrap replicates were used for branch support. The resulting phylogenetic tree and original alignment are available as Fig. S1.


The genome of R. irregularis contains an unusually high number of MATA_HMG domains

Searches for homologues of the MATA-HMGs across three available transcriptomes from isolates of R. irregularis and R. diaphanus resulted in the identification of a total of 75 transcripts, all of which were found to harbour the motif defining the HMG domain of fungal mating type loci (i.e. MATA_HMG; CDD ID: cd01389). One additional HMG was also retrieved using inverse PCR in subsequent analyses (Table 2). With one exception, all these genes could be amplified from one single isolate of R. irregularis (isolate SwiC2) following PCR and Sanger sequencing with specific primers (Table 2), demonstrating that the variation identified among transcripts did not result from alternative splicing. The number of MATA-HMGs present in one AMF individual far exceeds that found in other fungi with a sequenced genome (Fig. 2). The vast majority of AMF MATA-HMGs have maintained some weak similarity with SexM and SexP, which comprise the mating type locus of several zygomycetes (Table 2, Fig. S1), although reciprocal Blast procedures showed most of the 76 MATA-HMGs to be more closely related to MATA-HMGs from higher fungi (i.e. ascomycetes and basidiomycetes; MAT 1-2-1, MAT 1-1-2, MATMc and the pheromone response factor, Prf1) than to homologues from the Mucorales (Table 2). A single transcript was found to contain two MATA_HMG domains (HMG74), and we also identified eight putative expressed pseudogenes, all characterized by the presence of an early stop codon along the open reading frame surrounding the MATA-HMG Blast hit.
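
The criterion used to flag putative pseudogenes (an early stop codon within the open reading frame) can be sketched in Python as below. The sketch assumes the standard genetic code and a known reading frame; the function names and test sequences are hypothetical and are not taken from the transcripts analysed here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def first_inframe_stop(orf, frame=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = orf.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def has_premature_stop(orf, frame=0):
    """True if a stop codon occurs before the final codon of the frame."""
    stop = first_inframe_stop(orf, frame)
    if stop is None:
        return False
    n_codons = (len(orf) - frame) // 3
    last_codon_start = frame + 3 * (n_codons - 1)
    return stop < last_codon_start


# Toy sequences: the first ends with a stop, the second is interrupted early.
print(has_premature_stop("ATGAAACCCGGGTAA"))
print(has_premature_stop("ATGAAATAGGGGTAA"))
```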

Figure 2.

Number of mating type–high-mobility group (MATA-HMG) domain-containing genes identified in the genomes of several Rhizophagus irregularis strains, and the genomes of representative species of the Ascomycota, Basidiomycota, Zygomycota and Chytridiomycota. The schematic phylogenetic representation is based on rRNA sequences.

Some MATA_HMG domains show substantial divergence between R. irregularis isolates from one population

To explore the possibility that some of the 76 MATA-HMGs could represent AMF idiomorphs, we amplified the DNA region surrounding these MATA-HMG domains by PCR from three R. irregularis isolates (isolates SwiA4, SwiB3 and SwiC2). In all cases, direct sequencing of PCR products resulted in chromatograms with no ‘double peaks’ (i.e. no closely related paralogues were simultaneously amplified in the PCR reaction), and the resulting sequences were always found to be much more closely related to their respective alleles from other isolates than to any other MAT-HMG identified in this study. Taken together, these results suggest that the allelic variation described here is likely to result from orthology rather than paralogy.

PCR reactions resulted in successful amplification from all isolates investigated for the vast majority of genes tested, the few exceptions being those previously identified as pseudogenes (e.g. HMG55), demonstrating that none of those genes is too divergent to co-amplify using specific primers. Accordingly, we found most AMF MATA-HMGs either to be monomorphic or to vary only slightly at the sequence level between different members of the population, and only 35 AMF MATA-HMG genes were found to be polymorphic at the amino acid level between the three isolates (Fig. 3), the most extreme being HMG49 with an average amino acid identity of 77% between pairs of isolates. This sharply contrasts with typical idiomorphs, which share virtually no obvious similarity.
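
An average pairwise amino acid identity of the kind reported for HMG49 can be computed as in the following Python sketch. It assumes pre-aligned sequences and ignores gapped columns, which may differ from the procedure actually used; the three toy sequences are invented and are not real HMG49 alleles.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Per cent identical residues over aligned, ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def mean_pairwise_identity(aligned):
    """Average per cent identity over all pairs of isolates.

    `aligned` maps an isolate name to its aligned amino acid sequence.
    """
    values = [percent_identity(aligned[x], aligned[y])
              for x, y in combinations(sorted(aligned), 2)]
    return sum(values) / len(values)


# Toy 10-residue alignment; these are not real HMG49 sequences.
toy = {
    "SwiA4": "MKRPLNAFIL",
    "SwiB3": "MKRPLNAFIV",
    "SwiC2": "MKRALNSFIL",
}
print(mean_pairwise_identity(toy))
```

Here the three pairwise identities are 90%, 80% and 70%, giving a mean of 80%.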

Figure 3.

Pairwise amino acid similarity between mating type–high-mobility group (MATA-HMG) genes in isolates SwiA4, SwiB3 and SwiC2 of Rhizophagus irregularis. Each coloured circle represents a MATA-HMG gene and the colour represents the per cent amino acid similarity between isolates SwiA4 and SwiC2. The per cent amino acid similarity between isolates SwiA4 and SwiB3 is represented on the x-axis and that between isolates SwiB3 and SwiC2 on the y-axis. The numbers above some points represent variable MATA-HMG genes HMG6 (1), HMG1 (2), and HMG37 (3) which were analysed by quantitative real-time PCR.

Sequence variation was also investigated across additional isolates for a number of MATA-HMGs (HMG49, n = 14; HMG37, 47, 51, 61 and 63, n = 9). These latter genes were chosen because they showed elevated sequence divergence and were repeatedly found to be present in only one version of two divergent alleles among the three original isolates investigated, making them intriguing candidates for population genetics purposes. As previously shown by others, allelic variation was sometimes found to be maintained over extremely large geographical distances (i.e. R. irregularis isolates harvested from different continents harboured highly conserved polymorphisms; Vandenkoornhuyse et al., 2001; Croll & Sanders, 2009; den Bakker et al., 2010; Figs S2–S7).

Detection of inter- and intra-isolate recombination events in AMF

The large number of homologous sequences (i.e. MATA-HMG genes) we isolated from different strains of one AMF species represented a unique opportunity to seek the presence of homologous recombination in this putatively asexual lineage, and potentially confirm previous findings of recombination based on other regions of the AMF genome and other species (Vandenkoornhuyse et al., 2001; Croll & Sanders, 2009; den Bakker et al., 2010). Our inspections of orthologous MATA-HMGs isolated from different strains resulted in the identification of recombination in only three cases (out of 76). Two of these were suggested to have taken place within single isolates (i.e. intragenomic gene conversion; HMG6 and HMG1; Table 3), while another MATA-HMG displayed alternations in nucleotide similarity between isolates along an alignment (HMG49; Table 3, Fig. S2). Specifically, HMG6 from SwiC2 was found to contain a 139-bp fragment identical to HMG7 from the same isolate, while a second recombination event was suggested to have taken place between HMG1 and HMG61 in isolates SwiC2 and SwiB3. Gene conversion was supported by independent phylogenetic analyses, which supported conflicting evolutionary relationships between different regions of the putative recombinant HMG6 in isolate SwiC2 (Fig. S8A). However, the recombination event between HMG1 and HMG61 was not supported using the same methodology (data not shown).

Table 3. Results of rdp analysis of the two mating type–high-mobility group (MAT-HMG) sequences where recombination events were detected using seven recombination detection tools

Putative inter-isolate recombination
Recombinant HMG # | 49 (a,d) | 49 (b,d) | 49 (c,d) | 49 (c,d) | 49 (b,d) | 49 (b,d) | 49 (a,d) | 49 (c,d)
First breakpoint | 101* | 137 | 66* | 81* | 29* | 345* | 152* | 316*
Second breakpoint | 301* | 359 | 324* | 278 | 311 | 438* | 347* | 440*
Recombinant sequence | CanOn2 | CanOn2 | CanOn2 | CanOn2 | CanOn2 | CanOn2 | CanOn2 | CanOn2
Major parent sequence | SwiA4 | SwiA4 | CanQc2* | SwiA1* | CanBc* | SwiA2 | SwiC2* | SwiC2
Minor parent sequence | SwiA2 | SwiC2 | SwiA2 | SwiA2 | SwiA2 | CanBc* | SwiA1 | CanB3*
rdp | 3.2E-3
geneconv | 3.80E-02 | 0.043 | 0.0048 | 0.015 | 0.0089 | 1.1E-2 | 0.014
BootScan | 1.40E-02 | 0.031 | 0.00095 | 4.30E-3 | 0.017 | 0.045 | 3.8E-2 | 0.024
MaxChi | 1.70E-03 | 0.0028 | 0.0026 | 6.7E-3 | 0.0065 | 0.0029 | 5.6E-3 | 0.008
chimaera | 4.40E-02 | 0.007 | 0.044 | 7.7E-3 | 0.00014 | 0.0012 | 1.3E-3 | 0.025
SiScan | 2.30E-07 | 2.4E-9 | 8.9E-3 | 0.0046 | 0.0041 | 2.4E-2 | 0.000005
3Seq | 3.40E-02 | 0.011 | 0.0039 | 8.8E-4 | 0.00088 | 0.018 | 9.20E-3

Putative intra-isolate recombination
Recombinant HMG # | 7 (d) | 1 (d) | 1 (d)
First breakpoint | 197* | 150 | 147
Second breakpoint | 336 | 264* | 264*
Recombinant sequence | HMG6-SwiC2 | HMG1-SwiB3 | HMG1-SwiC2
Major parent sequence | HMG6-SwiB3 | HMG1-SwiA4 | HMG1-SwiA4
Minor parent sequence | HMG7-SwiC2 | HMG61-SwiB3 | HMG61-SwiB3
geneconv | 4.7E-7 | 9.9E-06 | 1.3E-7
BootScan | 1.30E-04 | 1.9E-7 | 3.1E-8
MaxChi | 1.10E-09 | 0.00046 | 0.0014
chimaera | 3.60E-09 | 0.0013 | 0.0013
SiScan | 2.30E-09 | 3.3E-9
3Seq | 1.00E-9 | 6.60E-7 | 8.1E-7

Asterisks next to recombination breakpoints and isolates indicate an uncertain designation of a recombination breakpoint, minor parent or major parent by rdp.
a, The major parent is the potential recombinant.
b, Either parental sequence is a potential recombinant.
c, The minor parent is the potential recombinant.
d, Possible misidentification of the recombinant sequence.

Surprisingly, potential for inter-isolate recombination was also found, although only for one gene (HMG49; Table 3, Fig. 4). In this case, a total of eight potential recombination events were identified that may have shuffled this genomic region among several isolates of R. irregularis. As for HMG6, independent phylogenetic reconstructions using regions between recombination breakpoints revealed phylogenetic incongruences suggestive of recombination (Fig. S8B). The presence of recombination at this locus was also confirmed by obtaining a reticulate branching pattern following phylogenetic network reconstructions (Huson & Bryant, 2006), which was strongly supported by the Φw test for recombination (P = 0.00000074). The combination of reticulate associations and Φw test is considered a robust analysis that effectively distinguishes between recurrent mutations and recombination in an alignment (Bruen et al., 2006).

Figure 4.

Neighbour network of AMF HIGH MOBILITY GROUP gene 49 (HMG49) in Rhizophagus irregularis isolates of a broad geographical origin. Bootstrap proportions above 90% (n = 1000 replicates) are shown. Isolate codes can be found in Table 1. The scale bar represents 0.01 substitutions per site.

Some MATA-HMGs are located in tandem repeats in the AMF genome

The regions surrounding several divergent HMGs were investigated using inverse PCR to identify adjacent genes, and a potential conservation in gene order (i.e. synteny) with known fungal mating type loci. The inverse PCR procedures were successful in expanding sequence information for only one gene (HMG6), and resulted in the acquisition of a 7272-bp-long DNA stretch. This region was then used as a query against available genome data from R. irregularis (kindly provided by the Rhizophagus Genome Consortium) to further explore the architecture of this region of the AMF genome, which helped us produce a supercontig with a total length of 8572 bp. Gene annotation revealed the presence of four tandem-repeated MATA-HMGs along this supercontig (Fig. 5), and the assembly was confirmed by PCR and Sanger sequencing using DNA from the isolate SwiC2. In parallel, two large portions of this genomic region were also isolated from additional strains of one population (SwiA4 and SwiB3) to investigate evolutionary patterns along this large region of the AMF genome. Our explorations revealed most polymorphism to be located within noncoding regions, with one notable exception: the gene HMG6 is present in two allelic forms that are specific to SwiC2 and SwiA4, respectively, while both alleles are present in the genome of SwiB3 (Fig. S9).

Figure 5.

Schematic representation of an 8572-bp region of the Rhizophagus irregularis genome harbouring four tandem-repeated mating type–high-mobility group (MATA-HMG) genes obtained using inverse PCR and bioinformatics approaches.

MATA_HMG gene expression following hyphal interactions in the AMF

Following qPCR analyses, only a subset of genes were found to be significantly up-regulated (Fig. 6) in crossing experiments as opposed to standalone conditions, but no unambiguous pattern of gene expression emerged. Specifically, even when expression of specific MATA-HMGs significantly increased following self-crossing and outcrossing experiments compared with standalone cultures (e.g. HMG6, 37), the up-regulation was often found to vary substantially among isolates and was also found in some cases to affect housekeeping genes. Similarly, no significant differences in gene expression could be identified between self-crosses and outcrosses, and two monomorphic HMGs (HMG62 and HMG55) were also found to be differently regulated among the different experimental conditions. More generally, the expression of the transcriptome seems to increase during crossings, resulting perhaps from a more active mycelium.

Figure 6.

Gene expression of variable mating type–high-mobility group (MAT-HMG) domains (a) HMG6-A, (b) HMG6-B, (c) HMG1 and (d) HMG37 and invariable MAT-HMG targets (e) HMG65, (f) HMG52 and (g) HMG22, and potential reference genes (h) Elongation factor Ef1-α, (i) β-tubulin and (j) actin in R. irregularis isolates SwiA4 (A4), SwiB3 (B3), and SwiC2 (C2) in standalone conditions and crossing conditions: self-crossings A4–A4, C2–C2 and B3–B3, and out-crossings A4–C2, A4–B3 and B3–C2. n = 3–4 biological replicates were carried out for every experimental condition. n = 2 technical replicates were performed for target gene measurements of all experimental conditions, except for reactions measuring the three variable gene targets and β-tubulin in the standalone conditions (i.e. standalone conditions A4, C2, and B3 measured with HMG49, HMG52, HMG65 and β-tubulin) where n = 1 reaction for each biological replicate measured with these targets. Axes for each gene target are rescaled to the highest expressed sample (i.e. the sample with the lowest Ct value has a value of 1). Significant increases compared with standalone conditions are indicated by: *, P < 0.05; **, P < 0.02; ***, P < 0.01. Error bars are +SD.
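
The rescaling described in the caption (the sample with the lowest Ct value is set to 1) can be sketched in Python as follows. The sketch assumes ideal exponential amplification, with the product doubling at each cycle; the amplification efficiency actually used is not stated here, so the `efficiency` parameter, the function name and the Ct values are all illustrative assumptions.

```python
def relative_expression(ct_values, efficiency=2.0):
    """Rescale Ct values so that the lowest-Ct sample equals 1.

    `efficiency` is the assumed per-cycle amplification factor
    (2.0 means perfect doubling); it is an assumption of this sketch.
    """
    ct_min = min(ct_values.values())
    return {sample: efficiency ** (ct_min - ct)
            for sample, ct in ct_values.items()}


# Invented Ct values for one target gene, for illustration only.
toy_ct = {"A4": 27.0, "A4-A4": 25.0, "A4-C2": 24.0}
for sample, value in relative_expression(toy_ct).items():
    print(sample, value)
```

With these toy values, a sample that crosses the threshold three cycles later than the most highly expressed sample is plotted at 2^-3 = 0.125 of its level.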


Extensive paralogy and the potential origin of intra-individual variation in Rhizophagus

In the present study, a total of 76 gene transcripts harbouring the MATA-HMG domain were identified, all of which were later found to be present within the genomes of several isolates of R. irregularis. These genes are very diverse in sequence, as signified by the wide range of fungal MATA-HMGs with which these genes share homology, and appear to have rapidly expanded within the AMF lineage. Many other genes have been previously reported to be present in many so-called ‘variants’ within one AMF genome: for example, up to 112 variants have been reported for the large subunit ribosomal RNA genes (Boon et al., 2010) and 15 variants for the heat shock protein Hsp70 (Kuhn et al., 2001), and between 15 and 217 variants have been suggested to exist for the polymerase 1-like sequence, PLS1 (Kuhn et al., 2001; Pawlowska & Taylor, 2004; Hijri & Sanders, 2005; Boon et al., 2010). Interestingly, the presence of several homologues within our transcriptome could only be confirmed for Hsp70, while only a single homologue of the PLS1-like gene could be identified. In both cases, the MATA-HMGs were clearly much more abundant (see Fig. S10 for a comparison of abundance with other genes previously suggested to be highly variable within AMF genomes).

So, where do all these MATA-HMGs come from? Are they segregated among co-existing nuclei as suggested by some (Kuhn et al., 2001; Hijri & Sanders, 2005), or are they all present within one genome as suggested by others (Pawlowska & Taylor, 2004; Stukenbrock & Rosendahl, 2005)? Presently, our findings support the notion that intragenomic gene duplications play a central role in originating intra-sporal genetic diversity in AMF genes. This is exemplified by one specific region of the AMF genome along which we identified up to four tandem-repeated HMGs, and there are currently no reasons not to expect other members of this family (or any other AMF gene for that matter) to be present in a similar context elsewhere in the genome. Certainly, if rampant paralogy is confirmed for other regions of the R. irregularis genome, then this would offer a conventional explanation for the difficulties in assembling a complete genome sequence from representatives of this fungal group (i.e. high intra-nuclear diversity), and a robust alternative to the ‘heterokaryosis hypothesis’ for the origin of molecular polymorphism in AMF.

Rare recombination in AMF – meiosis or mitosis?

In this study, isolating MATA-HMGs from representatives of many populations of R. irregularis has allowed us to explore the presence of recombination events in these supposedly ancient asexuals, and our findings were in broad agreement with those previously obtained by others (Vandenkoornhuyse et al., 2001; Croll & Sanders, 2009; den Bakker et al., 2010). Specifically, our stringent procedures revealed recombinational events in only 4% of the 76 genes analysed here, with only one potentially resulting from gene exchange between isolates (i.e. cryptic sex; HMG49), but it remains to be elucidated whether this rare gene shuffling has resulted from meiosis (i.e. sex) or mitotic events.

Unfortunately, the exact origin of these recombining sequences is currently difficult to assess in AMF, because little cellular evidence has been gathered so far regarding the presence or absence of (para)sexually related processes in these organisms (Tommerup, 1988; Tommerup & Sivasithamparam, 1990; Giovannetti et al., 1999; Croll et al., 2009), so most assumptions of their presence are currently based on molecular data (Riley & Corradi, 2013). For instance, AMF have been proposed to be capable of undergoing meiosis based on the presence of a complete set of meiosis-specific genes within their genomes, but a meiotic cycle has yet to be formally observed in this lineage (Halary et al., 2011; Corradi & Lildhar, 2012; Riley & Corradi, 2013). For this important reason, one can still assume that meiosis-specific genes could function in processes that are not necessarily related to sexual reproduction in AMF (e.g. DNA repair only). Similarly, although the identification of potential inter-strain gene shuffling in this study supports the idea that AMF are capable of exchanging genetic material, we presently cannot exclude the possibility that such events could have arisen following asexually related processes (e.g. mitotic recombination or transposition) until conclusive evidence for the presence of genetic exchange in natural populations is presented.

However, the detection of recombination in present and past studies is still essential for our understanding of the genetics of these organisms, as it demonstrates that AMF have indeed found a way to shuffle their genetic information across the genome (and possibly between members of one species) to reduce their mutational load. The next step will be to understand whether these organisms have been recombining following conventional sexual cycles (i.e. through meiotic recombination following nuclear fusion) or by using a number of nonsexual and unconventional means such as mobile DNA elements.

No conclusive evidence for a potential AMF MAT locus, for now

The identification of MATA-HMGs within different strains of R. irregularis is highly intriguing because these transcription factors are often found within the mating type loci of other fungal organisms (Lee et al., 2010), so their presence in AMF has suggested the provocative hypothesis that these may be used for similar purposes (e.g. sexual reproduction and partner recognition). In other fungi, these transcription factors tend to follow specific evolutionary trajectories, which include allelic variation between different members of one species (Geiser, 2008), a relatively well-conserved gene order between members of one fungal class (Lee et al., 2010) and, finally, an increase in gene expression following crossing experiments using genetically different strains of one species (Wetzel et al., 2012).

In the present study, searching for similar sexual patterns resulted in mixed, but highly intriguing, results. On the one hand, alleles of each MATA-HMG were never found to diverge among isolates to the extent one would expect for typical idiomorphs in a heterothallic species, and the simultaneous presence of homologues of SexM/P in one mycelium might suggest that R. irregularis is homothallic. On the other hand, however, searches for the presence of gene order conservation surrounding some of these genes revealed no obvious conservation with known fungal MAT loci, and our quantitative PCR approaches were also inconclusive in identifying an ideal AMF MAT-locus candidate, although this latter search was hindered by the tremendous number of homologues that could have been chosen for such analyses, and by the notorious difficulties in building optimal and fully axenic experimental designs using these obligate plant symbionts as models.

Concluding remarks and future directions

The discovery of an expanded set of genes that are homologous to those found in several fungal mating type loci provides important insights, at different levels, into the evolution of these ecologically relevant fungi and their genomes. First, the number of MATA-HMG homologues we found in single AMF individuals was remarkable per se, reaching numbers well in excess of those found in any other known fungus with a sequenced genome. It is also noteworthy that the genes we report here only represent expressed portions of the genome, so the true extent of this gene family in AMF is very likely to be greater still. Why AMF expanded this gene family over others (e.g. many AMF genes are only found in one copy) is currently unknown, but the persistence of so many homologues within one individual almost certainly reflects their functional importance. These genes may be conserved and expanded because they were recruited for other functions, a hypothesis supported by several reports indicating that mating-type transcription factors can also control processes not directly related to sexual reproduction (Adam et al., 2011; Bidard et al., 2011; Wang et al., 2012). However, given their similarity with genes known to be involved in sex in other fungi, we also believe that claims of long-term clonal evolution in AMF may have been a little overstated, and should be treated with some caution until conclusive evidence for asexuality is brought forward.

Without knowledge of the type of genetic system for mating that exists, it is difficult to infer whether any of the 76 MATA-HMG genes identified could correspond to a bona fide mating type locus. For instance, the small sample size may preclude identification of an idiomorphic region in a heterothallic species, while in many homothallic species the MAT locus is common between all individuals. In any case, the acquisition of large-scale genome data from AMF will be instrumental in tackling many of the issues that could not be addressed in the present study; most notably by providing a general picture of the overall diversity of MATA-HMGs in AMF, and that of other genes that are generally linked to the presence of sexuality in other fungi, and by demonstrating whether these families are preferentially inflated over others. The availability of large genome contigs from one AMF individual could also be used to investigate genome-wide expression information (e.g. mRNAseq) instead of single genes; and such analyses could be performed following crossing experiments similar to those used in the present study. These approaches, combined with more sophisticated experimental designs, may finally allow the identification of those genomic regions that may be involved in mating in AMF, with the potential to result in a dramatic step forward in our understanding of the lifestyle of these ecologically critical symbionts of land plants.


We would like to thank Daniel Croll and two anonymous reviewers for comments on a previous version of the manuscript, and Alexander H. Koch and Ian R. Sanders for providing valuable in vitro cultures of R. irregularis. N.C. is a Fellow of the Integrated Microbial Biodiversity program of the Canadian Institute for Advanced Research (CIFAR-IMB). This work was supported by grants from the Natural Sciences and Engineering Research Council of Canada to N.C. (NSERC-Discovery) and the US National Science Foundation to A.I. F.M. receives funding from the Laboratory of Excellence ARBRE (ANR-11-LABX-0002-01).