Author for correspondence: J. Labbé Tel: +33 3 83 39 40 80 Fax:+33 3 83 39 40 69 Email: firstname.lastname@example.org
• A genetic linkage map for the ectomycorrhizal basidiomycete Laccaria bicolor was constructed from 45 sib-homokaryotic haploid mycelial lines derived from the parental S238N strain progeny. For map construction, 294 simple sequence repeats (SSRs), single-nucleotide polymorphisms (SNPs), amplified fragment length polymorphisms (AFLPs) and random amplified polymorphic DNA (RAPD) markers were employed to identify and assay loci that segregated in backcross configuration.
• Using SNP, RAPD and SSR sequences, the L. bicolor whole-genome sequence (WGS) assemblies were aligned onto the linkage groups. A total of 37.36 Mbp of the assembled sequences was aligned to 13 linkage groups. Most mapped genetic markers used in alignment were colinear with the sequence assemblies, indicating that both the genetic map and sequence assemblies achieved high fidelity.
• The resulting matrix of recombination rates between all pairs of loci was used to construct an integrated linkage map using JoinMap. The final map consisted of 13 linkage groups spanning 812 centiMorgans (cM) at an average distance of 2.76 cM between markers (range 1.9–17 cM).
• The WGS and the present linkage map represent an initial step towards the identification and cloning of quantitative trait loci associated with development and functioning of the ectomycorrhizal symbiosis.
Underlying a tree's ability to generate large amounts of biomass or store carbon are its interactions with soil microbes known as ectomycorrhizal fungi, symbiotic organisms that excel at procuring necessary, but scarce, nutrients such as phosphate and nitrogen. The fungus within the root is protected from competition with other soil microbes and gains preferential access to carbohydrates within the plant, while concurrently transferring most of its obtained nutrients to the tree, and thus a mutualistic relationship is established (Martin et al., 2007). The ectomycorrhizal fungus Laccaria bicolor (Maire) P.D. Orton (Basidiomycota, Agaricales, Hydnangiaceae) (Matheny et al., 2007) forms symbiotic associations with a wide variety of tree species in the northern hemisphere (Mueller, 1982, 1991). In Europe, L. bicolor is mainly associated with Pinaceae, but it is sometimes found under deciduous trees such as Quercus or Fagus. In North America, L. bicolor appears to be primarily associated with members of Pinaceae. L. bicolor has been a major experimental model for decades (Martin et al., 2004), and has been used in large-scale mycorrhizal inoculation programs (Le Tacon et al., 1992). The elucidation of its genome, which has been sequenced under the initiative of the US Department of Energy (Martin et al., 2008), will be of interest to communities studying everything from physiology and ecology in forest ecosystems to fundamental questions in evolution and development of host and symbionts.
The whole-genome sequence (WGS) reads of L. bicolor S238N-H82 were initially assembled using the JAZZ assembler (J. Chapman et al., unpublished). After excluding redundant and short segments of assembled sequences or scaffolds, there remained 65.4 Mb of sequence (665 scaffolds with 248 > 10 kb in length), of which 6.6 Mb (10.1%) existed as unassembled sequences or captured gaps (Martin et al., 2008). After initial computational and manual annotations, the WGS data were re-assembled using the ARACHNE assembler (Batzoglou et al., 2002). The ARACHNE algorithm was less sensitive to repeat regions, such as abundant transposable elements. The ARACHNE assembly generated larger scaffolds, so-called supercontigs, with fewer gaps (total supercontig size, 59.9 Mb with 0.6 Mb gaps; 82 supercontigs > 10 kb), suggesting inconsistencies between genome assemblies. As a result, it was difficult to assess the degree of mis-assembly, as no independent comparator, such as a genetic linkage map, was detailed enough to verify the released WGS assemblies. Most eukaryotic genome sequencing projects are preceded by the construction of physical and genetic (meiotic) maps. For the L. bicolor genome project, there was no physical BAC map, and although a L. bicolor linkage map was constructed by Doudrick et al. (1995), the polymorphic marker sequences were not available, thus precluding genetic map-assembled sequence integration. Moreover, there were no reports on the chromosome number for L. bicolor; however, in L. montana, a closely related species, the haploid number of chromosomes appeared to be n = 9 (Mueller et al., 1993).
In addition to WGS, another important component of ecological genomics and association genetics programs is the development of a linkage and quantitative trait loci (QTL) map. Linkage maps in ascomycetous fungi have been shown to be useful for identifying QTLs, candidate gene mapping, and comparative mapping between species (Hulbert et al., 1988; Tzeng et al., 1992; Debets et al., 1993; Arnau et al., 1994; Nitta et al., 1997; Gale et al., 2005). However, among basidiomycetes genetic linkage maps are scarce. Thus far, there are genetic maps for Phanerochaete chrysosporium (Raeder et al., 1989), Agaricus bisporus (Callac et al., 1997), Coprinopsis cinerea (Muraguchi et al., 2003), Cryptococcus neoformans (Marra et al., 2004) and Pleurotus ostreatus (Park et al., 2006). One feature of basidiomycetes that facilitates genetic mapping is the presence of homokaryotic haploid meiotic spores produced in the fruiting body. Assay of mycelium issued from these spores directly reveals the products of meiosis (essentially, behaving like a cross to a homozygous tester strain), allowing efficient mapping of genes.
In the present study, we generated a genetic linkage map for L. bicolor based on a mapping panel comprising 45 individual haploid siblings from a progeny of the parental S238N dikaryon strain. For map construction, 294 simple sequence repeats (SSRs), single-nucleotide polymorphisms (SNPs), amplified fragment length polymorphisms (AFLPs) and random amplified polymorphic DNA (RAPD) markers were employed to identify loci that segregated in backcross configuration. The JAZZ and ARACHNE WGS assemblies were anchored to this genetic linkage map using 49 sequence-tagged markers. Finally, key structural genes coding for mating-type loci, nuclear rDNA, laccase lcc6 and hydrophobin LbH2 were mapped on the linkage map. The release of the WGS and development of a linkage map for L. bicolor would enable us to map and clone QTLs associated with symbiosis development and functioning, and acquisition of scarce soil nutrients.
Materials and Methods
Fungal strain and culture conditions
Spores were sampled from caps of Laccaria bicolor (Maire) P.D. Orton fruiting bodies growing beneath Pseudotsuga menziesii (Mirb.) Franco seedlings inoculated with L. bicolor strain S238N in a glasshouse or in a nursery (Di Battista et al., 1996) and germinated according to Fries (1983). Ninety-one individual homokaryotic (haploid) mycelia were used in this study, including the S238N-H82 strain from which the genome was sequenced (Martin et al., 2008). All homokaryotic mycelia were subcultured in Petri dishes containing a Pachlewski agar medium (Henrion et al., 1992) and stored at 4°C with yearly subculturing. To provide material for DNA isolation, mycelium was grown in glass tubes containing 20 ml of Pachlewski liquid medium for 3 wk at 25°C. All strains are stored and subcultured at INRA-Nancy.
Homokaryotic mycelium was removed from the growth medium, rinsed in H2O and frozen in liquid nitrogen. Approximately 80 mg (FW) were used for DNA isolation using the DNeasy Plant Mini Kit (Qiagen, Courtaboeuf, France) according to the manufacturer's instructions. DNA was recovered in 50 µl of deionized water. Taxonomic identity of strains and quality of DNA were ascertained by PCR amplification and sequencing of the internal transcribed spacer (ITS) of the nuclear ribosomal DNA, using the following primers: ITS1 (5′-TCCGTAGGTGAACCTGCGG) and ITS4 (5′-TCCTCCGCTTATTGATATGC). The PCR was performed in a Perkin-Elmer Cetus thermocycler 9700 (Applied Biosystems, Foster City, CA, USA) according to Di Battista et al. (2002).
Detection, scoring and sequencing of RAPD fragments
The sequence of 23 10-mer primers (Table 1) used by Doudrick et al. (1995) was retrieved from the UBC primer set (http://www.michaelsmith.ubc.ca/services/NAPS/Primer_Sets/) available at the Nucleic Acid Protein Service Unit (University of British Columbia, Canada). RAPD analysis was carried out on the 91 sibling homokaryons and the parental S238N dikaryon. PCR amplification of the DNA template was as per the protocol of Doudrick et al. (1995), except that PCR was performed using a Perkin-Elmer Cetus thermocycler 9700 using 0.5 U of Taq DNA polymerase (Qbiogene, Strasbourg, France) and the Qbiogene 10× buffer containing 25 mM MgCl2. After amplification, RAPD fragments were resolved by electrophoresis (5 V cm−1) for 2 h in 2% agarose gels (one-third agarose (Qbiogene), two-thirds wide-range agarose (Sigma-Aldrich, Saint-Quentin Fallavier, France)) in TBE buffer (45 mM Tris base, 45 mM boric acid, and 1 mM EDTA, pH 8). RAPD fragments were detected by staining with ethidium bromide (0.5 mg ml−1) for 20 min. RAPD fragments were scored as dominant segregating markers (presence/absence) when amplicons of identical size were detected in three replicates for each polymorphic fragment. RAPD markers were identified by the letter R followed by a number (e.g. R1). The markers were imported into JoinMap v 3.0 (Van Ooijen and Voorrips, 2001) and checked for deviation from 1 : 1 ratios. Polymorphic RAPD fragments were excised from agarose gel using QIAquick Gel extraction kit (Qiagen), cloned using TOPO TA cloning Kit pCR 2.1TOPO vector (Invitrogen, Cergy-Pontoise, France) and sequenced on both strands using M13 forward and M13 reverse primers, the CEQ Dye-labeled Dideoxy-Terminator Cycle Sequencing kit (Beckman Coulter, Fullerton, CA, USA) and the automated CEQ 8000 XL sequencer (Beckman Coulter).
Sequences of these RAPD fragments are available at http://mycor.nancy.inra.fr/IMGC/LaccariaGenome/GeneticMap/RAPDMarkerSequences and were located by similarity search on the JAZZ and ARACHNE genome assemblies using the BLAST algorithm at the JGI L. bicolor genome website (http://www.jgi.doe.gov/laccaria) or INRA LaccariaDB website (http://mycor.nancy.inra.fr/IMGC/LaccariaGenome/).
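The check for deviation from a 1 : 1 ratio can be illustrated with a chi-square goodness-of-fit test on presence/absence counts. The sketch below is not JoinMap's implementation; the function names, the example counts and the 5% significance level (critical value 3.841 for one degree of freedom) are assumptions chosen for illustration.

```python
# Illustrative chi-square test for 1:1 segregation of a dominant marker
# scored as present/absent in haploid progeny. Names and the 5% threshold
# are assumptions for this sketch, not taken from JoinMap.

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05


def chi_square_1to1(n_present: int, n_absent: int) -> float:
    """Return the chi-square statistic for an expected 1:1 ratio."""
    total = n_present + n_absent
    if total == 0:
        raise ValueError("at least one scored individual is required")
    expected = total / 2
    return ((n_present - expected) ** 2 + (n_absent - expected) ** 2) / expected


def fits_1to1(n_present: int, n_absent: int) -> bool:
    """True when the counts do not deviate significantly from 1:1."""
    return chi_square_1to1(n_present, n_absent) <= CRITICAL_5PCT_DF1


if __name__ == "__main__":
    # Hypothetical counts in a panel of 45 homokaryons.
    for present, absent in [(25, 20), (35, 10)]:
        stat = chi_square_1to1(present, absent)
        print(present, absent, round(stat, 3), fits_1to1(present, absent))
```

With these hypothetical counts, a 25 : 20 split is compatible with 1 : 1 segregation, whereas a 35 : 10 split would be flagged as distorted.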
Table 1. Sequences of the 23 random amplified polymorphic DNA (RAPD) primers used for the genetic linkage map construction (table body not reproduced here; surviving column header: annealing temperature, °C)
SSR analysis
The SSR markers were detected on the L. bicolor ARACHNE genome assembly (v. 1.0) (http://mycor.nancy.inra.fr/IMGC/LaccariaGenome/Annotation/index.php?select=fast) using MAGELLAN 1.1 software (Lim et al., 2005). Details on the SSR distribution analysis will be presented elsewhere (J. Labbé et al., unpublished). We selected all SSR types except for the compound motifs in an effort to sample the 20 larger ARACHNE supercontigs. SSR primers were designed using the online Primer3 tool (Koressaar & Remm, 2007). The SSR analyses were conducted at Oak Ridge National Laboratory (ORNL) and INRA-Nancy using two different protocols.
At ORNL, SSRs were analyzed with Fluorescein-12-dUTP (Enzo Roche, Penzberg, Germany). PCR reactions were carried out in a total volume of 15 µl containing 25 ng of template DNA, 7.5 ng forward and reverse oligonucleotide primers (Operon Technologies, Alameda, CA, USA), 20 µM of each dNTP, 0.5 U Taq DNA polymerase (New England Biolabs, Beverly, MA, USA), 1.5 µl 10× buffer (containing 100 µM Tris-HCl, pH 8.3, 500 mM KCl, 20 mM MgCl2 and 10.0 g l−1 bovine serum albumin). PCR was conducted in a Perkin-Elmer Cetus thermocycler 9700, using 10 touchdown cycles performed with annealing temperature starting at 60°C and ending at 50°C with a 1°C decrease each cycle. The touchdown cycles were followed by 20 cycles of 94°C for 30 s, 50°C for 30 s, and 72°C for 1 min, with a final extension at 72°C for 7 min. PCR products were detected on an ABI 3730XL sequencer using the standard microsatellite genotyping module. The amplification products were mixed in appropriate ratios and diluted 1 : 10 with loading buffer (91% deionized formamide, 9% internal standard GeneScan 450ROX (Applied Biosystems)), then denatured at 95°C for 5 min followed by rapid cooling on ice.
At INRA-Nancy, SSRs were analyzed with one fluorescent dye-labeled primer (D2-PA, D3-PA, D4-PA, WellRED dyes; Proligo, Paris, France). PCR reactions were carried out in a total volume of 10 µl containing 10 ng of template DNA, 10 ng forward and reverse oligonucleotide primers (Invitrogen), 200 µM of each dNTP, 0.5 U Taq DNA polymerase (Qbiogene), 1 µl Qbiogene buffer 10× containing 25 mM MgCl2. PCR was conducted in a Perkin-Elmer Cetus thermocycler 9700 (Applied Biosystems) with 30 cycles of 94°C for 30 s, 60°C for 1 min, and 72°C for 2 min combined with a final extension at 72°C for 10 min. PCR products were detected on a CEQ 8000 XL sequencer using the CEQ 8000 V9.0 genotyping module (Supporting information, Fig. S3). Before analysis, each amplification product was diluted 1 : 10 with deionized water. One microliter of this dilution was mixed with 0.5 µl DNA size standard (DNA size standard kit 600 bp, Beckman Coulter) and with 30 µl SLS buffer (Beckman Coulter).
AFLP analysis
The AFLP analyses, carried out at the ORNL, have been applied in map construction for many different pedigrees (Yin et al., 2002, 2004). In this study, we used three replicates of the DNA template extracted from the dikaryon S238N and H82 (the sequenced homokaryon). Two initial screening steps were performed before genotyping the entire progeny set. We initially used S238N and H82 to screen primer combinations to identify discernible amplification profiles. In a second step, we used S238N, H82 and six additional homokaryons to screen primer combinations for reproducible amplification profiles. These eight individuals were then included in the progeny genotyping step. Thus, the reproducibility of the AFLP amplification profiles was confirmed during each of the three steps listed above.
The AFLP procedure was performed as described by Vos et al. (1995) with the following modifications. Pre-amplification reactions (15 µl) were performed for 3 µl of the diluted DNA template using 20 pmol each of a pair of AFLP primers (Operon Technologies) with no selective 3′ nucleotides on the ‘E’ primer and 1 ‘C’-selective 3′ nucleotide on the ‘M’ primer. Reaction conditions were identical to those described for SSRs. The pre-amplified products were diluted 1 : 30 as DNA template for selective amplification. Selective amplification was carried out in a volume of 15 µl reaction mixture, containing 3 µl diluted pre-amplification product, 0.5 pmol ‘E’ primer with two selective nucleotides (Hex-labeled) and 5 pmol ‘M’ primer with three selective nucleotides (Operon Technologies), 200 µM each dNTP, 0.5 U Taq DNA polymerase (Promega, Madison, WI, USA), 1.5 µl 10× buffer (100 µM Tris-HCl, pH 8.3, 500 mM KCl, 20 mM MgCl2), 10.0 g l−1 BSA and 1% (v/v) deionized formamide. Thermocycling conditions for selective amplification were 12 cycles of 94°C for 30 s, 65°C for 30 s decreasing by 0.7°C per cycle, and 72°C for 60 s, followed by 23 cycles of 94°C for 30 s, 56°C for 30 s and 72°C for 60 s. PCR products were detected using the ORNL SSR genotyping protocol discussed earlier on the ABI 3730XL sequencer. The restriction site sequences of AFLP fragments were too short for aligning the AFLP markers on the genome sequence assemblies.
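The annealing temperatures of the selective-amplification program can be listed cycle by cycle. The following sketch assumes that the first touchdown cycle anneals at 65°C and that the 0.7°C decrement is applied from the second cycle onward; the function name and its parameters are illustrative and do not correspond to any thermocycler software.

```python
# Illustrative listing of annealing temperatures for the AFLP selective
# amplification: 12 touchdown cycles from 65 C in 0.7 C steps, then 23
# cycles at 56 C. Assumes the first cycle anneals at the start temperature.

def annealing_schedule(start=65.0, step=0.7, touchdown_cycles=12,
                       plateau=56.0, plateau_cycles=23):
    """Return the annealing temperature (deg C) of every cycle, in order."""
    touchdown = [round(start - step * i, 1) for i in range(touchdown_cycles)]
    return touchdown + [plateau] * plateau_cycles


if __name__ == "__main__":
    schedule = annealing_schedule()
    print(len(schedule), "cycles")
    print("touchdown:", schedule[:12])
    print("plateau:", schedule[12], "x", len(schedule) - 12)
```

Under these assumptions the last touchdown cycle anneals at 57.3°C, just above the 56°C plateau used for the remaining 23 cycles, for 35 cycles in total.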
Mapping of selected nuclear genes, SNPs and mating-type genes
Five structural loci were mapped: hydrophobin (LbH2), laccase (lcc6), mating-type A and B (MATa and MATb), and nuclear ribosomal DNA intergenic spacer 1 (IGS1). PCR amplification of the genes coding for LbH2 (JGI ID 399267) and lcc6 (JGI ID 399748) from genomic DNA of the 91 monokaryons was conducted in a Perkin-Elmer Cetus thermocycler 9700 with 30 cycles of 94°C for 30 s, 53°C for 30 s, and 72°C for 2 min combined with a final extension at 72°C for 10 min. Primers (Table 2) were designed using the online Primer3 tool (Koressaar & Remm, 2007). PCR reactions were carried out in a total volume of 25 µl containing 25 ng of template DNA, 20 ng forward and reverse oligonucleotide primers (Invitrogen), 250 µM of each dNTP, 0.5 U Taq DNA polymerase (Qbiogene), and 2.5 µl Qbiogene buffer 10× containing 25 mM MgCl2. PCR products were sequenced on both strands using the respective forward and reverse PCR primers, the CEQ Dye-labeled Dideoxy-Terminator Cycle Sequencing kit and the automated CEQ 8000 XL sequencer according to the manufacturer's instructions. Sequences were then assembled using CLUSTALW (http://bioinfo.hku.hk/services/analyseq/cgi-bin/clustalw_in.pl) and Sequencher 4.2 (Gene Code Corporation, USA) software. SNPs were detected for lcc6 at positions 1758, 1869, and 1940, and for LbH2 at position 478.
Table 2. Sequence-tagged markers used for anchoring whole-genome sequence (WGS) assemblies on the genetic linkage map (table body not reproduced here; surviving column headers: positions on ARACHNE supercontigs; positions on JAZZ scaffolds; linkage groups (genetic distance, cM))
(F) and (R) indicate the forward and reverse primers, respectively, used to amplify the 49 sequence-tagged markers. cM values indicate the genetic distance on linkage groups in centiMorgans from the proximal end.
Analysis of meiotic segregation of the intergenic spacer 1 (IGS1) of the nuclear rDNA was carried out as per Selosse et al. (1996). The IGS1 haplotypes, α or β, were amplified from genomic DNA of the 91 monokaryons using the primer 5SA (5′-CAGAGTCCTATGGCCGTGGAT) and the fluorescent dye-labeled primer INRAjLIGS1 (5′-CAGTGGAGTAAGTCAG-D4-PA) (D2-PA, D3-PA, D4-PA, WellRED dyes, Proligo). PCR amplification was conducted in a Perkin-Elmer Cetus thermocycler 9700 with 35 cycles of 94°C for 30 s, 50°C for 30 s, and 72°C for 5 min combined with a final extension at 72°C for 10 min. PCR reactions were carried out in a total volume of 10 µl containing 10 ng of template DNA, 10 ng forward and reverse oligonucleotide primers (Invitrogen), 200 µM of each dNTP, 0.5 U Taq DNA polymerase (Qbiogene), and 1 µl Qbiogene buffer 10× containing 25 mM MgCl2. Amplified IGS1 homo- and heteroduplexes (Selosse et al., 1996) were detected on the CEQ 8000 XL sequencer using the fragment detection module. Before analysis, each amplification product was diluted 1 : 10 with deionized water. One microlitre of this dilution was mixed with 0.5 µl DNA size standard (DNA size standard kit 600 bp, Beckman Coulter) and with 30 µl SLS Beckman buffer. The IGS1 sequence, being part of the rDNA tandem repeat (18S-ITS1-5.8S-ITS2-28S-IGS1-5S-IGS2) (Martin et al., 1999), was assumed to co-map with the whole rDNA repeat region. The mating-type genes STE3, HD1 and HD2 were mapped by segregation analysis of the amplification products described in Niculita-Hirzel et al. (2008).
Linkage analysis and map construction
Given the number of markers used (i.e. 301), pre-mapping diagnostics showed that these markers could be resolved with confidence down to a minimum mapping population size of 39 (JoinMap χ2 test; Van Ooijen & Voorrips, 2001). The mapping population consisted of a L. bicolor S238N progeny of 45 haploid homokaryons. The haploid nature of the members of this population allowed the application of a backcross model for handling data. Linkage analysis between markers, estimation of recombination frequencies and determination of the linear order of loci were performed using JoinMap V3.0 software (Van Ooijen and Voorrips, 2001). Recombination rates were converted to genetic distances in centiMorgans (cM) using Kosambi's mapping function (Kosambi, 1944). This mapping function assumes crossover interference in multipoint analysis, meaning that the presence of one crossover reduces the probability of another in the same region. Thus, when multiple markers were analyzed, the crossover disturbing coefficient C was introduced into the mapping function to adjust the genetic distance on linkage groups. Thus, for markers A-B-C:
where Rr is the observed recombination rate, Re is the expected recombination rate, and r1 and r2 are the independent recombination rates between marker A-B and marker B-C. The standard deviation of C is:
C is a value between 0 and 1. If C = 1, there is no crossover disturbance; if C = 0, there is complete crossover disturbance. C is then integrated into the calculation of Kosambi's mapping function (Kosambi, 1944). Thus, the genetic distances between markers were adjusted by this parameter in a multipoint analysis, and consequently differ from those obtained in a single two-point analysis.
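The conversion from recombination fraction to map distance, and the idea behind the coefficient C, can be sketched as follows. The Kosambi formula used here is the standard one, d = 25 ln((1 + 2r)/(1 − 2r)) in cM. The expression for C is the textbook coefficient of coincidence (observed double-recombinant frequency divided by r1 × r2) and is an assumption about the form intended above, since the equations are not reproduced in this text; how JoinMap incorporates C internally is not shown.

```python
import math

# Standard Kosambi mapping function and a textbook coefficient of
# coincidence. The form of C is an assumption for illustration; it is not
# taken from the equations of the original article or from JoinMap.


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction (0 <= r < 0.5) to centiMorgans."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def coincidence(observed_double: float, r1: float, r2: float) -> float:
    """Observed double-recombinant frequency over its expectation r1 * r2."""
    expected = r1 * r2
    if expected <= 0.0:
        raise ValueError("r1 and r2 must both be positive")
    return observed_double / expected


if __name__ == "__main__":
    # Hypothetical values for three ordered markers A-B-C.
    print(round(kosambi_cm(0.10), 2))               # distance A-B in cM
    print(round(coincidence(0.01, 0.10, 0.20), 2))  # C for the A-B-C triplet
```

In this hypothetical triplet, half as many double recombinants are observed as expected under independence, which gives C = 0.5, an intermediate level of interference.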
Referring to the genome assembly, we adapted the grouping criteria, reaching final minimal LOD scores of 3.0 for small linkage groups (LGs) and 5.0 for large LGs, and a maximum recombination fraction of 0.4. In some cases, because of the lack of adequate recombination and LOD information, complete maps for some groups could not be created. In these cases, markers within these groups were split based on linkage relationships; two or more maps were generated and linkage between smaller groups and larger groups assessed. When smaller groups linked within larger groups, the smaller groups were removed. A third round of marker addition was performed, but without reordering first- and second-generation maps. LGs with ≥ four loci were retained as major LGs to represent the L. bicolor genome. We calculated total map distance that covered all LGs and average distance between markers as total map distance divided by the number of mapped markers. Several markers were linked as pairs or triplets, but not integrated in the map (e.g. linkage pair 4). The integration of the AFLP markers into our preliminary linkage map (Martin et al., 2008), constructed using only RAPD and SSR markers, led to the exclusion of several markers. The positions of these excluded sequence-tagged markers were determined by alignment on the ARACHNE assembly.
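To give a sense of what LOD thresholds of 3.0 and 5.0 mean in a panel of 45 haploid progeny, a two-point LOD score can be computed from the number of recombinants between two markers. This is the standard two-point formula for a backcross-type population, offered as a minimal sketch; it is not JoinMap's grouping routine, and the function name and example counts are illustrative.

```python
import math

# Two-point LOD score for a backcross-type (haploid) population:
# LOD = n*log10(2) + k*log10(r) + (n - k)*log10(1 - r), with r = k / n.
# A standard textbook formula; not the JoinMap implementation.


def two_point_lod(n: int, k: int) -> float:
    """LOD for linkage given k recombinants among n scored progeny."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    r = k / n
    if r >= 0.5:
        return 0.0  # no evidence for linkage; estimate capped at r = 0.5
    lod = n * math.log10(2.0)
    if k > 0:
        lod += k * math.log10(r)
    lod += (n - k) * math.log10(1.0 - r)
    return lod


if __name__ == "__main__":
    for recombinants in (0, 5, 10, 18):
        print(recombinants, round(two_point_lod(45, recombinants), 2))
```

With 45 progeny, two markers showing no recombinants reach a LOD of about 13.5, whereas pairs with recombination fractions approaching 0.4 fall well below a threshold of 3.0.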
Alignment of JAZZ and ARACHNE assemblies
Masked sequences for whole-genome JAZZ and ARACHNE assemblies (version 1.0) were downloaded from the JGI (http://www.jgi.doe.gov/laccaria) and the INRA LaccariaDB (http://mycor.nancy.inra.fr/IMGC/LaccariaGenome/Annotation/), respectively. The JAZZ assembly scaffolds were aligned on the ARACHNE assembly supercontigs by using the BLASTN algorithm (99% identity, E value 0.0) at the INRA LaccariaDB BLAST server, and the GEnome PAir – Rapid Dotter tool (Gepard; http://mips.gsf.de/services/analysis/gepard) (Krumsiek et al., 2007). The sequenced genetic map markers were located on this assembly alignment with BLASTN, and the genetic map and the genome assemblies were aligned manually. Average recombination rates were obtained by dividing the total linkage distance (cM) by the total physical length (Mb) for each LG (pseudochromosome). These estimates were not adjusted for gaps between LGs or supercontigs or for differences in marker density.
Of the 144 SSR primers screened using the parental S238N (N+N) dikaryon and a pilot set of seven randomly selected haploid homokaryons, 63 yielded PCR amplicons. Among these 63 SSRs, we retained 38 polymorphic SSRs (13 codominant and 25 dominant). For map construction, we selected 45 homokaryons yielding one of the two alternative alleles inherited from the dikaryon, by using the 13 codominant SSRs. Homokaryons displaying neither of these two alternative alleles were excluded on the assumption of segregation distortion (possibility of aneuploidy) (Table 3, Fig. S2). Ultimately, 37 SSRs were placed on the genetic map, with one SSR remaining unlinked. The primers and the sequences of these 37 SSRs were then used to anchor the SSR markers on the JAZZ/ARACHNE genome assemblies (see later). In the AFLP analysis, from the 99 AFLP primer combinations, 46 primer pairs revealed 254 polymorphisms (average of five loci per primer combination). These AFLP markers were used to improve map coverage in regions of the genome that lacked mapped SSRs or sequence-tagged RAPD markers. The mapped marker list is presented in Table 2. In the RAPD analysis, 23 RAPD primers, chosen from Doudrick et al. (1995) for their profile clarity, generated 39 segregating RAPD markers. Only eight were finally kept for mapping calculation because of ambiguous segregation in the remaining 31 markers.
Table 3. The codominant microsatellite markers (simple sequence repeats, SSRs) segregating in the Laccaria bicolor mapping pedigree identified during SSR primer screening
Among the initial set of 91 homokaryons, 45 were selected by using 13 codominant SSRs. Primer sequences are given in Supporting Information, Table S1.
Genetic map construction
Thus, a total of 301 markers that segregated in a 1 : 1 ratio in a L. bicolor progeny of 45 selected haploid homokaryons were used to calculate average pairwise LOD and recombination data. The initial calculation with JoinMap was able to group 294 markers (four SNPs, eight RAPDs, 37 SSRs, 243 AFLPs, two mating-type genes) at LOD thresholds of three (small groups) and five (large groups) with a maximum recombination fraction of 0.4. These LOD thresholds were chosen as they produced the minimum number of LGs, while maintaining the integrity of the map. Adding AFLP markers significantly increased the map size from 400 to 812 cM. The final map contained 287 markers positioned on 13 LGs, four marker pairs and one marker triplet (Fig. S1). LGs contained four to 66 markers and their sizes ranged from 10 to 124.7 cM, covering a total genetic length of 812 cM, with an average distance of 2.76 cM between the adjacent markers (range 1.9–17 cM). Thus, the 287 markers in the current map may be assumed to provide a reasonably comprehensive coverage of the L. bicolor genome. Owing to unknown chromosome number, the strict correspondence between LG ID and L. bicolor chromosomes remained unresolved.
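The average marker spacing follows from the definition given in Materials and Methods (total map distance divided by the number of mapped markers). The short calculation below uses only figures quoted in the text; which marker count was used as the denominator is not stated, so both candidates are shown, and the variable names are illustrative.

```python
# Average inter-marker distance = total map length / number of markers.
# Figures are those quoted in the text; the choice of denominator is an
# open question, so both marker counts are evaluated.

TOTAL_MAP_CM = 812.0
GROUPED_MARKERS = 294      # markers grouped by JoinMap
POSITIONED_MARKERS = 287   # markers positioned on the final map


def average_spacing(total_cm: float, n_markers: int) -> float:
    """Return the mean distance per marker in cM."""
    if n_markers <= 0:
        raise ValueError("marker count must be positive")
    return total_cm / n_markers


if __name__ == "__main__":
    print(round(average_spacing(TOTAL_MAP_CM, GROUPED_MARKERS), 2))
    print(round(average_spacing(TOTAL_MAP_CM, POSITIONED_MARKERS), 2))
```

Dividing 812 cM by 294 markers gives about 2.76 cM, which agrees with the reported average; dividing by 287 gives about 2.83 cM.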
Integration of genome assemblies and genetic map
To anchor the JAZZ and ARACHNE assembled sequences to the L. bicolor genetic map, we aligned the sequences of 49 mapped SSR, RAPD and SNP markers onto the assembled genomic sequences using BLASTN. The number of markers located on the assemblies ranged from one (LG 9) to nine (LG 1) per linkage group. No sequenced markers were available to align assembled sequences to LG 10, 11 and 12. Table 2 summarizes the total number of JAZZ scaffolds and ARACHNE supercontigs anchored on the genetic map.
Schematics of the assembled sequences-genetic map integration are shown in Fig. 1. For example, based on sequence alignments of nine mapped markers, ARACHNE supercontigs 1211 and 1184, and large regions of supercontigs 1199 and 1195 were anchored to LG 1 (124 cM, 6.38 Mb) (Fig. 1). Eight scaffolds (e.g. 83, 8, 71, 41 and 7) from the JAZZ assembly were similarly integrated into LG 1. Based on the alignment of nine markers, LG 2 (97 cM, 4.68 Mb), comprising two subgroups, was anchored to ARACHNE supercontig 1205, partly to ARACHNE supercontig 1198 and to several scaffolds (18, 35, 2, 9, 44, 10 and 17) from the JAZZ assembly. In total, c. 38 Mb of the ARACHNE assembled sequences were integrated into pseudochromosome units along 10 LGs. The majority of the mapped markers used in alignment were colinear with the sequence assembly (Table 2, Fig. 1). The 10 LGs aligned with 16 ARACHNE supercontigs accounted for 63% of this genome sequence assembly. Other supercontigs were too short to be genetically oriented. Several unoriented JAZZ scaffolds (e.g. 41, 71, 8 and 83 on LG 1) were placed on pseudochromosomes, but their orientation was determined using BLASTN against the ARACHNE supercontig.
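Colinearity between the genetic map and a sequence assembly can be expressed as a simple ordering test: when markers anchored to one supercontig are sorted by genetic position, their physical coordinates should run monotonically in one direction. The sketch below illustrates this idea with hypothetical marker coordinates; it is not the procedure used in this study, where the alignment was done manually.

```python
# Illustrative colinearity test for markers anchored to one supercontig.
# Each marker is (genetic position in cM, physical position in bp).
# All coordinates below are hypothetical.


def is_colinear(markers):
    """True if physical order matches genetic order in either orientation."""
    ordered = sorted(markers, key=lambda m: m[0])
    physical = [bp for _, bp in ordered]
    pairs = list(zip(physical, physical[1:]))
    increasing = all(a <= b for a, b in pairs)
    decreasing = all(a >= b for a, b in pairs)
    return increasing or decreasing


if __name__ == "__main__":
    forward = [(0.0, 120_000), (12.5, 640_000), (30.1, 1_900_000)]
    flipped = [(0.0, 1_900_000), (12.5, 640_000), (30.1, 120_000)]
    shuffled = [(0.0, 120_000), (12.5, 1_900_000), (30.1, 640_000)]
    for name, data in [("forward", forward), ("flipped", flipped),
                       ("shuffled", shuffled)]:
        print(name, is_colinear(data))
```

Accepting both increasing and decreasing order reflects the fact that a supercontig may be anchored in either orientation relative to the linkage group; only a change of direction within the supercontig points to a possible mis-assembly or marker-order error.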
The ARACHNE assembly displayed some discrepancies that can be corrected by the present genetic map. For instance, it is clear that regions of ARACHNE supercontigs 1195 and 1199 anchored on LG 1 (Fig. 1) should be removed from the rest of their assembled sequences that were anchored on LG 3 and LG 9. Similarly, based on the organization of LG 2 (Fig. 1), a part of the ARACHNE supercontig 1198, which is mostly anchored to LG 3, should be moved into ARACHNE supercontig 1205. According to Martin et al. (2008), the largest ARACHNE supercontig 1195 (6.8 Mb) likely belongs to LG 9.
The ratio of genetic distances to physical lengths provides an estimate of the recombination rate. In L. bicolor, the ratio averages 20.92 cM Mb−1 for the anchored part of the genome (Fig. 1). In many species there is a large variation in the recombination rate among linkage groups and a general tendency for the smallest linkage groups to recombine more than the large ones (Solignac et al., 2007). In L. bicolor, the recombination rate is very similar for all LGs, varying from 15.58 cM Mb−1 on LG 5 to 25.14 on LG 4. A region with a high recombination rate was identified on LG 5 from approximately locus 1207M10 to locus TA-CCC498, whereas a region of low recombination rate was noted on LG 6 from approximately TC-CAT72 to TA-CCA73. The two regions of the mating-type loci also showed distinct recombination rates.
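As a worked example of this ratio, the rate for a single linkage group is its genetic length divided by the physical length of the anchored sequence. The calculation below uses the rounded values quoted earlier for LG 1 (124 cM, 6.38 Mb) and LG 2 (97 cM, 4.68 Mb); the dictionary and function names are illustrative.

```python
# Recombination rate per linkage group = genetic length (cM) / anchored
# physical length (Mb). Input values are the rounded figures quoted in the
# text for LG 1 and LG 2.

ANCHORED_GROUPS = {
    "LG 1": (124.0, 6.38),
    "LG 2": (97.0, 4.68),
}


def recombination_rate(genetic_cm: float, physical_mb: float) -> float:
    """Return the recombination rate in cM per Mb."""
    if physical_mb <= 0:
        raise ValueError("physical length must be positive")
    return genetic_cm / physical_mb


if __name__ == "__main__":
    for name, (cm, mb) in ANCHORED_GROUPS.items():
        print(name, round(recombination_rate(cm, mb), 2), "cM/Mb")
```

Both values (about 19.4 and 20.7 cM Mb−1) fall within the 15.58–25.14 cM Mb−1 range reported across linkage groups.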
Mapping structural genes
Nucleotide sequencing of LbH2 and lcc6 (P. E. Courty, unpublished) genes from the L. bicolor S238N homokaryons allowed us to detect a single SNP in LbH2 intron 3 (Shyd2.1) and three SNPs in lcc6 coding sequence (Table 2). By using these SNPs, we located the corresponding loci on the linkage map. LbH2 (identified by Shyd2.1) mapped on LG 1 (supercontig 1211) (Fig. 1), whereas lcc6 (identified by SLac14.1, SLac14.2) mapped on pair 2 (supercontig 1215) (Fig. 1). The mating-type locus MATa (homeodomain transcriptional factors, HD1 and HD2) was mapped on LG 1 (supercontig 1195) (Figs S1, S4). This result suggests that LG 1 and LG 9 are tandem ends of the same chromosome. MATb (STE3-like pheromone receptors) was mapped on nonintegrated linkage pair 4 on supercontig 1168; the latter likely belonging to LG 8 (see previous map version in Martin et al. 2008).
The whole nuclear rDNA repeat is not part of the large JAZZ and ARACHNE assembled sequences currently available, because of the repetitive nature of this region. A BLASTN analysis using known nuclear rDNA tandem repeat sequences (Martin et al., 1999) as queries revealed that portions of the rDNA were located on unassembled JAZZ scaffolds (Martin et al., 2008). Sequence reads corresponding to the consensus sequence of the rDNA unit were found at c. 500 copies in raw sequence traces (NCBI Trace Archive), whereas single copy genes were present at 10 copies, indicating that the L. bicolor genome contains 50 repeats of the 10 kbp-rDNA unit, that is, 500 kbp. The linkage analysis of the segregating rDNA IGS1 heteroduplexes (α or β) allowed the mapping of this rDNA tandem repeat locus on the distal end of LG 7 (Fig. 1).
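The copy-number estimate for the rDNA unit is a read-depth ratio: the depth observed for the repeat is divided by the depth of single-copy genes, and the result is multiplied by the unit length. The sketch below reproduces this arithmetic with the approximate figures given in the text; the function names are illustrative.

```python
# Read-depth estimate of tandem-repeat copy number, using the approximate
# figures quoted in the text (about 500 reads for the rDNA unit, about 10
# for single-copy genes, 10 kb per rDNA unit).


def estimate_copy_number(repeat_depth: float, single_copy_depth: float) -> float:
    """Copies of the repeat per haploid genome, from relative read depth."""
    if single_copy_depth <= 0:
        raise ValueError("single-copy depth must be positive")
    return repeat_depth / single_copy_depth


def array_size_kb(copies: float, unit_kb: float) -> float:
    """Total length of the tandem array in kb."""
    return copies * unit_kb


if __name__ == "__main__":
    copies = estimate_copy_number(500, 10)
    print(copies, "copies;", array_size_kb(copies, 10), "kb in total")
```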
The present genetic linkage map is composed of 287 markers on 13 LGs covering a total genetic length of 812 cM, with an average of 2.76 cM between adjacent markers. The advantage of this linkage map over that generated in Doudrick et al. (1995) is greater genome coverage and integration of the assembled sequences of the L. bicolor S238N-H82 WGS. Nucleotide sequences of the RAPD markers used by Doudrick et al. (1995) are unavailable and therefore the two L. bicolor genetic maps could not be reconciled. We were, however, able to create and map sequence tags for 39 RAPD markers derived from 23 Doudrick et al. RAPD primers, but only eight of these markers were integrated into the current genetic map.
In total, we aligned 48 sequence-based markers with the JAZZ and ARACHNE genome sequence assemblies; nine for LG 1, LG 2 and LG 3, four for LG 4, LG 5 and LG 8, two for LG 6, pairs 2 and 4, and one for LG 7 and LG 9 (Fig. 1). This validated the current large-scale structure of the draft WGS assemblies, although additional finishing sequencing will be required to improve the assembly of highly polymorphic transposon-rich regions and the integration of telomeric and centromeric regions. The majority of the mapped markers used in our alignment were colinear with the sequence assembly, indicating that both the genetic map and sequence scaffolds achieved high fidelity. The genetic map anchored c. 38 Mb (66%) of the ARACHNE assembly. For example, the ARACHNE supercontig 1184 and corresponding JAZZ scaffolds were anchored through four markers and the ARACHNE supercontig 1211 was anchored by three markers on LG 1 (Fig. 1). The anchoring of the distal end of LG 1, corresponding by similarity searches to the JAZZ scaffolds 83, 8, 71 and 41, was not confirmed by a genetic linkage marker. The number of markers allowing the anchoring of LGs on the genome assemblies is still limited and, as a consequence, a large number of the ARACHNE assembled sequences remained poorly anchored, or even unanchored (e.g. anchoring of supercontig 1214 on LG 6, Fig. 1).
In most cases, the linkage map and physical sequence reciprocally validated marker order and position; the comparison also revealed some problems in each. Improving the integration of the genetic map with the assembled sequence will require an increased number of assignable markers designed to target poorly aligned regions. Additional SNP, SSR and AFLP markers will continue to be added to the genetic map, furthering the integration of genetic and physical resources. However, filling the gaps in the WGS assemblies will be limited in the centromeric and heterochromatic regions (Grewal & Klar, 1997). Similarly, regions having a high density of protein-coding genes (Poyatos & Hurst, 2007) will require additional measures.
Based on the JAZZ and ARACHNE assemblies, the haploid genome size of L. bicolor was estimated to be 59.9–64.9 Mb (Martin et al., 2008). An average of 46.7 kb cM−1 was obtained by dividing the size of the ARACHNE-assembled genome integrated into the genetic map by the total genetic size. This ratio was close to those reported in other Agaricales: 48.5 kb cM−1 in Agaricus bisporus (Kerrigan, 1993) and 35.1 kb cM−1 in Pleurotus ostreatus (Larraya et al., 2000). The estimated size of the L. bicolor genome contained within the integrated genetic map and sequence assembly is 37.36 Mb.
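The physical-to-genetic ratio is a simple quotient, sketched below under stated assumptions. The function name is hypothetical; the inputs are the two figures given in the text for the anchored sequence (37.36 Mb and c. 38 Mb) and the total map length (812 cM). The result depends on which anchored size is used and on rounding, so the sketch reports both.

```python
def kb_per_cm(anchored_mb: float, map_length_cm: float) -> float:
    """Average physical distance per unit of genetic distance, in kb per cM."""
    return anchored_mb * 1000.0 / map_length_cm  # convert Mb to kb, then divide


MAP_LENGTH_CM = 812.0  # total genetic length reported for the 13 LGs

# Two anchored-sequence sizes quoted in the text: 37.36 Mb and c. 38 Mb.
for anchored_mb in (37.36, 38.0):
    ratio = kb_per_cm(anchored_mb, MAP_LENGTH_CM)
    print(f"{anchored_mb} Mb / {MAP_LENGTH_CM} cM = {ratio:.1f} kb per cM")
```

Both inputs give a value of roughly 46 to 47 kb per cM, in line with the reported average.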
Several regions of relatively high or low recombination were discernible when the genetic linkage distances were compared with the physical sequence distances (Fig. 1). Recombination suppression has been reported in the basidiomycetes Ustilago hordei (Lee et al., 1999) and Cryptococcus neoformans (Lengeler et al., 2002) for sex chromosomes where mating-type loci were present in large genomic regions that fail to pair properly during meiosis. A similarly low recombination rate for the region surrounding the L. bicolor mating-type A (Fig. S4) was proposed by Niculita-Hirzel et al. (2008) based on the determination of intergenic distance. Niculita-Hirzel et al. (2008) reported that the STE3-like pheromone receptor genes (LbSTE3.1, LbSTE3.2, LbSTE3.3, LbSTE3.4) were clustered in one locus (i.e. MATb) of JAZZ scaffold 56. Our data suggest that MATb is located on linkage pair 4, which likely belongs to LG 8. The size of this locus is c. 9.5 kb with three sub-loci, each containing two genes that encode a mating-type-specific pheromone and the corresponding receptor (Niculita-Hirzel et al., 2008). Analysis by PCR of the MATa and MATb loci showed that the two mating-type loci were unlinked, in contrast to previous findings (Doudrick et al., 1995). In our study, the MATa locus was localized on ARACHNE supercontig 1195 (JAZZ scaffold 1) anchored onto LG 1 (Fig. 1).
We located the ribosomal DNA at the distal end of LG 7. Based on the number of copies (i.e. 50) per haploid genome and the size of the rDNA tandem repeat (10 kb, Martin et al., 1999), the size of the rDNA locus is c. 0.5 Mb. The occurrence of numerous telomeric motifs (TTAGGG) in the intergenic spacer (IGS) (Martin et al., 1999) is in agreement with the suggested peritelomeric location.
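The rDNA locus size follows directly from the copy number and repeat length quoted above; a minimal sketch of the arithmetic, with a hypothetical function name, is:

```python
def tandem_locus_size_mb(copies: int, repeat_kb: float) -> float:
    """Size of a tandem-repeat locus in Mb, given copy number and repeat unit length in kb."""
    return copies * repeat_kb / 1000.0  # total kb converted to Mb


# Values from the text: 50 copies per haploid genome, 10 kb per rDNA repeat.
print(tandem_locus_size_mb(50, 10.0))  # 0.5
```

The estimate assumes that all copies are arranged in a single uninterrupted tandem array.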
The current genetic map reduced the number of LGs from 15 (Doudrick et al., 1995) to 13. Cytological chromosome counts for Laccaria montana, a closely related species, suggested a minimal haploid number of nine chromosomes (Mueller et al., 1993). Ongoing cytological chromosome counts and electrophoretic karyotyping will confirm the haploid number of chromosomes in L. bicolor S238N. Based on the integration of the linkage map into WGS assemblies (i.e. unaligned supercontigs and scaffolds), we postulate that the total haploid number of L. bicolor chromosomes is fewer than 13.
The principal interest of genetic maps is to localize and then identify Mendelian genes or quantitative trait loci (QTLs) in association mapping studies. The current linkage map is therefore a precursor to several genomics applications, including QTL mapping, comparisons of genetic maps among Laccaria species, interspecific transfer of genomic information and candidate-gene association studies. Moreover, this study supplies useful marker resources with known physical and genetic positions for population genetic studies; these markers could be used to differentiate local genetic populations (Tuskan et al., 1990) or to conduct future genome-wide association studies. The availability of a genetic linkage map integrated with the WGS (Martin et al., 2008) in L. bicolor will enhance our ability to perform comprehensive analyses of the regulation of chromosome recombination, the mechanisms underlying fruiting body and ectomycorrhiza formation, ecological symbiosis fitness and evolutionary interactions between the symbiont model L. bicolor and its various hosts.
This project was supported by the Lorraine Region and INRA through a PhD scholarship to JL. Funds were also provided by INRA, the European Commission Network of Excellence EVOLTREE, the Bioenergy Center Program at ORNL and the US Department of Energy. Oak Ridge National Laboratory (ORNL) is managed by UT-Battelle, LLC for the US Department of Energy under contract no. DE-AC05-00OR22725. We would like to thank Christine Delaruelle, Véronique Jorge, Brett Mommer, Lee Gunter, Hélène Niculita-Hirzel and Joseph Armento for their assistance and helpful discussions. We also thank Pierre-Emmanuel Courty and Julien Gibon for permission to use the LbH2 and lcc6 sequences. We thank the Joint Genome Institute and the Stanford Human Genome Sequencing Center for making the L. bicolor genome assemblies available before publication.