Unexpected vagaries of microsatellite loci in Glomus intraradices: length polymorphisms are rarely caused by variation in repeat number only


(*Author for correspondence: tel +41 61 267 23 20; fax +41 61 267 23 30; email thomas.boller@unibas.ch)

Microsatellite markers, or simple sequence repeats (SSRs), are widely used as tools to distinguish genotypes or individuals in paternity analyses, forensics and population genetics (Ellegren, 2004). Microsatellites have been studied extensively in many fungal genomes (Lim et al., 2004) but, surprisingly, were not exploited to study the population genetics of arbuscular mycorrhizal fungi (AMF), a class of important plant symbionts, until two publications independently claimed the utility of these markers for a specific species, Glomus intraradices (Croll et al., 2008a; Mathimaran et al., 2008). In a letter contributed to this forum (Croll et al., 2008b), these ‘microsatellite markers’ were tabulated with the aim of clarifying possible confusion about their suitability in population genetics. The authors of this letter concluded that ‘as expected, a majority of the loci from both studies show length polymorphism in the repeat motif’ (cited from Croll et al., 2008b). However, only 10% of the length polymorphisms they observed (Croll et al., 2008a) were caused, at least partially, by changes in the repeat motif (Table 1). The vast majority of length polymorphisms (90%) were caused by insertions–deletions (indels) in the flanking regions; some of the so-called SSR loci did not contain any repeat longer than two triplets and were not polymorphic in these areas.

Table 1. Nature of polymorphisms observed at simple sequence repeat (SSR) loci in Croll et al. (2008a) and this study

| Study | Alleles sequenced | Total number of length polymorphisms | Only caused by SSR | Partly caused by SSR | Not at all caused by SSR |
|---|---|---|---|---|---|
| Croll et al. (2008a) | 40 | 29 | 2 (7%) | 1 (3%) | 26 (90%) |
| Present study | 36 | 28 (a) | 5 (18%) | 10 (36%) | 13 (46%) |

(a) Including length polymorphisms compared with the original sequence obtained from public databases (see Supporting Information Fig. S1).
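As a check on the arithmetic, the percentages in Table 1 can be recomputed from the counts. The following is a minimal sketch in Python; the dictionary layout, the key names and the function name `shares` are illustrative assumptions, and only the counts themselves come from Table 1.

```python
# Counts of length polymorphisms taken from Table 1.
# Key names ("only", "partly", "none") are illustrative labels, not the authors' terms.
counts = {
    "Croll et al. (2008a)": {"only": 2, "partly": 1, "none": 26},
    "Present study": {"only": 5, "partly": 10, "none": 13},
}


def shares(row):
    """Return each category as a rounded percentage of the row total."""
    total = sum(row.values())
    return {category: round(100 * n / total) for category, n in row.items()}


for study, row in counts.items():
    print(study, sum(row.values()), shares(row))
```

For Croll et al. (2008a), the categories "only" and "partly" together account for 3 of 29 length polymorphisms, which is the 10% quoted in the text.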

In our own database screen, we identified microsatellite loci using well-defined criteria (at least five identical repeats of two, three or four nucleotides, or a stretch of at least 10 identical single nucleotides). Examining eight different strains of G. intraradices, we found clear length polymorphisms in 18 loci selected in this way. The target repeat sequence was present in each case, and it would have been logical to assume that the length polymorphisms were caused by changes in the number of repeats. However, when we sequenced two alleles of different size for each of the 18 loci (see Supporting Information Fig. S1), we found that the length difference was based exclusively on repeat number polymorphism in only 18% of the length polymorphisms observed, and at least partially in a further 36%. For almost half of the length polymorphisms (46%), the repeat was not affected and the length difference was caused by adjacent indels (Table 1).
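The screening criteria stated above can be illustrated with a short script. This is only a sketch under stated assumptions, not the pipeline used in the study: the function names `find_ssrs` and `is_primitive`, the regular-expression approach and the example sequence are all hypothetical, and only the thresholds (at least five copies of a 2 to 4 nucleotide motif, or at least 10 copies of a single nucleotide) are taken from the text.

```python
import re

# Minimum number of tandem copies per motif length, as stated in the text.
MIN_COPIES = {1: 10, 2: 5, 3: 5, 4: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT' is not)."""
    k = len(motif)
    for d in range(1, k):
        if k % d == 0 and motif[:d] * (k // d) == motif:
            return False
    return True


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect repeats meeting the thresholds."""
    seq = seq.upper()
    hits = []
    for k, min_copies in MIN_COPIES.items():
        # A motif of length k followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # reported under the shorter motif length instead
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical example sequence: a dinucleotide repeat with five copies of 'AC'.
print(find_ssrs("GGACACACACACTT"))
```

A screen of this kind only locates candidate repeats. As the sequencing results show, whether an observed length difference actually lies in the repeat has to be established by comparing the sequenced alleles.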

The frequency of length polymorphisms in the targeted microsatellites was only marginally higher than in the nontargeted flanking regions (i.e. 5.1% per base pair in the microsatellite region compared with 2.7% per base pair in the flanking region for our study) (Fig. S1).

We conclude that microsatellites of short length (n ∼ 5 for di-, tri- and tetranucleotides, and n ∼ 10 for mononucleotides), as investigated in both studies (Croll et al., 2008a; Mathimaran et al., 2008), do not seem to increase significantly the probability of finding length polymorphisms of value for population genetic analysis. Nevertheless, as also stated in the accompanying letter (Croll et al., 2008b), length polymorphisms that happen to occur within and around such short microsatellites may still be highly useful in genotyping.

Length polymorphisms such as those analysed here are useful to demonstrate genetic differences among G. intraradices isolates in general; whether markers in expressed sequences or in noncoding regions are of greater interest remains a matter of debate. Mutation rates vary across the genome, and it is generally assumed that noncoding regions evolve at a higher rate than coding regions, as a result of selective constraints on the transcripts and proteins encoded by the genes. On the other hand, markers in expressed parts of the genome, such as expressed sequence tag (EST)-derived markers, have advantages over nonexpressed markers because they can be used both for gene mapping and for population genetics. Moreover, EST-derived markers are believed to be more readily transferable across species (Varshney et al., 2005; Ellis & Burke, 2007; Hisano et al., 2007). For population genetics, ‘neutral’ markers not subject to selection are of particular interest, and markers derived from ESTs (Mathimaran et al., 2008) may be less favourable in this respect. However, markers derived from a genome survey (Croll et al., 2008a) may also be expressed. Moreover, nonexpressed parts of the genome can be under selection as strong as that acting on expressed parts, and we therefore suggest that ‘neutrality’, if required, has to be tested for each locus instead of relying on global assumptions.

Coincidentally, two of the bona fide microsatellites selected in our study, namely Glint09 and Glint18 (Mathimaran et al., 2008), were located in a previously studied sequence encoding a P-type II ATPase D (Corradi et al., 2007). Analysis of each of these two loci displayed a clear single band in all our single-spore DNA preparations, indicating that each locus was represented by an allele (or alleles) of a single size in an individual spore of each strain analysed. With respect to the locus of Glint09, this corresponded to a band of 107 bp, 115/116 bp or 121/122 bp (Mathimaran et al., 2008). Experiments with DNA from mixed spores showed that two alleles of different size appeared as clear doublets with the appropriate size difference (data not shown). The sizes of the alleles found in individual spores matched the length variants (105/106, 114/115 and 121 bp) that, in the previous study, were found in combinations of two or three alleles in DNA preparations of mycelium from root-organ cultures of single strains (Corradi et al., 2007). We do not have an explanation for this difference, but we point out that different single spores of a given strain, subjected to whole-genome amplification (WGA), always yielded a unique band of constant length for a given polymorphic locus (Mathimaran et al., 2008).

The ability to detect single alleles at a polymorphic locus in single spores is a clear advantage of the WGA method. Whole-genome amplification is particularly useful for detecting low-copy-number sequences from environmental samples (Gonzalez et al., 2005) where standard polymerase chain reaction (PCR) methods are insufficient, and it has successfully been used to genotype powdery mildew (Fernandez-Ortuno et al., 2007). Owing to the high-fidelity proof-reading function of Phi29 DNA polymerase, the WGA product is a highly accurate copy of the original genome (Dean et al., 2002). Indeed, using this technique with four separate amplifications from single spores of two different isolates of G. intraradices, there was faithful amplification for all of the three loci tested (Mathimaran et al., 2008). Thus, the WGA procedure greatly enhances opportunities to detect size polymorphisms at multiple loci in single spores.

The potential SSR markers identified by Mathimaran et al. (2008) have been deposited in a newly developed database for Glomus (http://glomus.vital-it.ch/), which is maintained by the Swiss Institute of Bioinformatics and is now accessible to scientists worldwide. In the future, this database will be upgraded to allow users to retrieve as well as to deposit useful length-polymorphic markers for tracing AMF. This is particularly important because large numbers of markers may soon be available from various AMF species, and these will need to be consolidated into a relational database that gives easy access to any particular marker locus, as in the case of databases for other eukaryotes (see e.g. the Swiss Vitis Microsatellite Database).

Interestingly, both studies reviewed here and in the accompanying letter (Croll et al., 2008b) clearly show that all the loci characterized by length polymorphisms have a single size within a given isolate and thus are not heterogeneous in descendants of a single spore. This means that such length polymorphisms – whether caused by indels or microsatellite repeat polymorphisms – can be used to genotype AMF strains. This will be a great asset for future population genetic and ecological studies as well as for the re-identification and tracing of AMF strains of particular value used as biofertilizers in agriculture. Moreover, the absence of multiple alleles in a given strain suggests that AMF are essentially homokaryotic with a haploid genome rather than having an unusual heterokaryotic lifestyle, two contrasting hypotheses discussed recently in New Phytologist (Rosendahl, 2008; Young, 2008).