• Open Access

Optimizing TILLING populations for reverse genetics in Medicago truncatula

Authors

  • Christine Le Signor,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Vincent Savois,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Grégoire Aubert,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Jérôme Verdier,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Marie Nicolas,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Gaelle Pagny,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Françoise Moussy,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Myriam Sanchez,

    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
  • Dave Baker,

    1. John Innes Genome Laboratory, Norwich NR4 7UH, UK
  • Jonathan Clarke,

    1. John Innes Genome Laboratory, Norwich NR4 7UH, UK
  • Richard Thompson

    Corresponding author
    1. Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), BP 86510, F-21065 Dijon, France
      * Correspondence (fax +33-380-693-263; e-mail thompson@dijon.inra.fr)


Summary

Medicago truncatula has been widely adopted as a model plant for crop legume species of the Vicieae. Despite the availability of transformation and regeneration protocols, there are currently limited tools available in this species for the systematic investigation of gene function. Within the framework of the European Grain Legumes Integrated Project (http://www.eugrainlegumes.org), chemical mutagenesis was applied to M. truncatula to create two mutant populations that were used to establish a TILLING (targeting induced local lesions in genomes) platform and a phenotypic database, allowing both reverse and forward genetics screens. Both populations had the same M2 line number, but differed in their M1 population size: population 1 was derived from a small M1 population (one-tenth the size of the M2 generation), whereas population 2 was generated by single seed descent and therefore has M1 and M2 generations of equal size. Fifty-six targets were screened, 10 on both populations, and 546 point mutations were identified. Population 2 had a mutation frequency of 1/485 kb, twice that of population 1. The strategy used to generate population 2 is more efficient than that used to generate population 1, with regard to mutagenesis density and mutation recovery. However, the design of population 1 allowed us to estimate the genetically effective cell number to be three in M. truncatula. Phenotyping data to help forward screenings are publicly available, as well as a web tool for ordering seeds at http://www.inra.fr/legumbase

Introduction

TILLING (targeting induced local lesions in genomes) is a reverse genetic strategy for the identification of mutations throughout a genome and a screening method facilitating the localization of these mutations. Mutations are induced by the chemical mutagen ethyl methane sulphonate (EMS), and particular regions can be screened for the presence of mutations by high-throughput polymerase chain reaction (PCR). This procedure was developed for Arabidopsis (McCallum et al., 2000; Colbert et al., 2001; Comai and Henikoff, 2006; see http://TILLING.fhcrc.org:9366/home.html for experimental details and advice), and has been extended to a wide variety of organisms, including maize, rice, pea, lotus, tomato, Drosophila and zebrafish (Perry et al., 2003; Wienholds et al., 2003; Till et al., 2007). The main advantage of the TILLING strategy is that a series of allelic mutations can be obtained, displaying a range of phenotypes as a result of changes in the functionality of genes. It allows the recovery of weak alleles of genes for which complete loss-of-function is lethal. In Medicago truncatula, other techniques are available that yield a high proportion of gene knockouts: insertional mutagenesis with the retrotransposon Tnt1 (Benlloch et al., 2006) and fast-neutron mutagenesis coupled to reverse genetic screening (G. Oldroyd, John Innes Centre, Norwich, UK, pers. commun.). The chemical mutagen EMS produces a high density of mutations, allowing a moderate population size to be used relative to other mutagenic treatments (γ-rays or fast-neutron irradiation), which generate a high proportion of gene knockouts, but with low mutation densities, and therefore require large population sets. TILLING requires the availability of either cDNA or annotated genomic sequences of the gene of interest in order to detect and locate point mutations in target genes; 260 000 M. truncatula cDNA sequences are available in the public domain and, by the end of 2009, the entire gene-rich regions of M. truncatula will be fully sequenced. Thus, the availability of cDNA and genomic sequences is not a restriction on the application of this technique.

We have generated two TILLING populations, aiming to optimize the mutation density whilst retaining a high mutant recovery rate. The choice of EMS dosage is crucial because a high dosage will lead to a high mutation density, but also to a high frequency of infertile plants. The two populations were generated using different schemes: in population 1 (EMS1), 500 M1 plants were used to produce 4500 M2 plants; in population 2 (EMS2), 4350 M2 plants were derived by single seed descent (SSD) from 4350 M1 plants. In this study, the mutation density of the two populations is established and their respective screening efficiencies are compared. The results are based on a batch of 56 genes, 10 of which were screened on both populations, the remainder being screened in EMS2. The TILLING platform (http://www.jicgenomelab.co.uk/revgenuk.html) and the phenotypic database, legumbase (http://www.inra.fr/legumbase), set up to exploit this resource, are also presented.

Results

Production of A17 mutant populations

Line A17, issued from the cultivar Jemalong, is the worldwide reference line for M. truncatula biology, used for genomic sequence determination, expressed sequence tag (EST) generation and mutant resource creation. More than 90% of the genomic resources available were obtained from this line. A seed batch of more than 40 000 seeds of M. truncatula Jemalong line A17 was produced from a seed lot of 100 seeds received from INRA Montpellier, France (UMR1097, J. M. Prosperi). Although M. truncatula is predominantly self-fertilizing, some residual cross-pollination can occur. In order to avoid crosses, plants were raised in insect-proof glasshouses and covered with perforated plastic bags to keep them isolated from their neighbours. To ensure the quality of the TILLING populations, 100 parental plants were analysed for genetic uniformity using a set of 16 simple sequence repeat (SSR) markers distributed over the eight chromosomes (SSR markers and genetic map provided by T. Huguet, ENSAT Toulouse, France; Pierre et al., 2008). No heterogeneity in banding patterns was observed.

The mutagenized M. truncatula population was created by first testing a range of doses from 0.025% to 0.4% EMS (corresponding to 2–32 mM) using, as an early selection criterion, the percentage of seedling survival. A ‘kill-curve’ analysis on batches of 200 seeds showed that seedling survival decreased markedly at doses above 0.2% (Figure 1). Thirty plants of each treatment were grown to maturity under glasshouse conditions to assess fertility and seed production. The fertility of M1 plants decreased rapidly with high mutagen doses (one-third sterility at a dose of 0.2%, one-half at a dose of 0.3% and two-thirds at a dose of 0.4%). To balance maximum mutation density with acceptable plant survival rate, the optimum dosage appeared to be between 0.15% and 0.2% for M. truncatula line A17. The two TILLING populations were produced from two different seed batches, each issued from the initial 100 parent plants. After checking for seedling survival in the two M0 seed batches for the two EMS dosages (0.15% and 0.2%) and the control treatment, optimal doses of 0.2% EMS for population 1 (EMS1) and 0.15% EMS for population 2 (EMS2) were selected. The spectrum of mutations generated in each population was checked by examining seed and seedling phenotypes in the M2 generation (Table 1). The comparable results obtained for EMS1 and EMS2 with regard to albino frequency and the higher percentages of chlorotic phenotypes and M2 embryo abortion in EMS2 suggest that the mutagenesis efficiency is dependent on both the EMS dosage and the M0 seed batch quality. The mutagenesis efficiencies obtained correspond well with previous results for M. truncatula (Penmetsa and Cook, 2000).
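The correspondence between percentage doses and millimolar concentrations quoted above can be checked with a short conversion. This is a minimal sketch, not the authors' calculation: it assumes the percentages are weight/volume (1% = 10 g/L) and takes the molar mass of EMS (C3H8O3S) as about 124.16 g/mol; neither assumption is stated in the text, and the function name is illustrative.

```python
# Assumed molar mass of ethyl methane sulphonate (C3H8O3S), in g/mol.
EMS_MOLAR_MASS = 124.16


def percent_wv_to_millimolar(percent_wv, molar_mass=EMS_MOLAR_MASS):
    """Convert a % (w/v) concentration to mM, assuming 1% (w/v) = 10 g/L."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / molar_mass * 1000.0


# Doses mentioned in the text: range limits and the two doses retained.
for dose in (0.025, 0.15, 0.2, 0.4):
    print(f"{dose}% EMS is about {percent_wv_to_millimolar(dose):.1f} mM")
```

Under these assumptions, the two ends of the tested range come out close to the 2 mM and 32 mM given in the text.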

Figure 1.

Percentage of seedling survival assessed after ethyl methane sulphonate (EMS) treatment on imbibed seeds of Jemalong line A17 at various dosages. Dose 0% corresponds to water treatment. Each treatment was performed on 200 seeds.

Table 1.  Frequency of albinos and chlorotic phenotypes in the M2 generation for populations EMS1 and EMS2 (EMS, ethyl methane sulphonate). Data were recorded on 4000 plants in each population. The embryonic lethality of M2 seeds (in M1 pods) was added for both populations and was recorded for 600 pods per population
Phenotype                 EMS1 (0.20%)   EMS2 (0.15%)
Albinos                   1.4%           1.2%
Chlorotic or pale green   4.4%           7.6%
Embryonic lethality       15%            19%

The detailed construction of the two populations is shown in Figure 2. A first population, ‘EMS1’, of 4500 M2 plants raised from 500 M1 plants was available with DNA extracts in 2004 (families of 12 M2 per M1). A second population, ‘EMS2’, of 4350 M2 plants raised by SSD from 4350 M1 plants was produced subsequently (seeds and DNA lots harvested in 2005). To produce the M2 generation, 20 M2 seeds were sown and phenotypes were scored for 10 plantlets at the seedling stage. By the pre-flowering stage, plants were thinned out so that only four remained in each pot. After flowering, one healthy-looking plant was retained as a source of leaf material for DNA and M3 seeds at maturity.

Figure 2.

Design of the two ethyl methane sulphonate (EMS) populations. M0 seeds of line A17 from cultivar Jemalong were EMS mutagenized. M1 seeds resulted in approximately 500 fertile M1 plants in population 1 (EMS1) and 4350 M1 plants in population 2 (EMS2). M2 seeds were harvested from each individual M1 plant. In each population, up to 20 seeds were sown and plants were visually scored. In EMS1, up to 12 M2 plants per M1 plant were leaf collected and harvested. In EMS2, one healthy M2 plant was leaf collected, self-fertilized and harvested. Each population resulted in about 4500 M2 plants. DNA was extracted from collected tissue and the samples were pooled to increase TILLING (targeting induced local lesions in genomes) throughput (pools of 12 individuals in EMS1 and pools of eight individuals in EMS2). After polymerase chain reaction (PCR) amplification of target genes, heteroduplexes were formed and digested using ENDO1 nuclease. Fragments were separated by denaturing polyacrylamide gel electrophoresis and visualized using a Li-Cor system at the Dijon Platform (image shown) and by capillary electrophoresis at the Genome Laboratory (John Innes Centre).

Phenotyping of the A17 mutant populations and creation of the Legumbase database

To facilitate phenotypic scoring, an ontology adapted to M. truncatula was defined, based on those already developed for pea (Dalmais et al., 2008) and lotus (Perry et al., 2003). A high-throughput scoring strategy (which excluded both root and symbiosis evaluation) was employed, applied to 1000 mutant lines within a growing season. A detailed phenotypic catalogue was devised with three major categories, 22 categories and 66 subcategories (Table S1, see Supporting Information). The data for the EMS1 population contain one record for each of the 12 plants derived from a single M2 family. Of the 4850 M2 families grown and phenotyped, 1939 (40%) showed a visible phenotype; the proportion was 47% in EMS1 and 39% in EMS2. The most commonly observed phenotypes were related to leaf and cotyledon colour, plant size, leaf size and leaflet shape, as reported for pea and lotus (Perry et al., 2003; Dalmais et al., 2008). Among the lines showing a phenotype, 53% were scored for a single altered trait (42% in EMS1, 55% in EMS2) and 47% displayed multiple altered traits (58% in EMS1 and 45% in EMS2). These data are consistent with those obtained for the TILLING population of pea (Pisum sativum var. Caméor) (Dalmais et al., 2008). The large number of multiply altered families in EMS1 is a result of the larger number of plants analysed per family in this population (12 times as many as in EMS2). However, the relatively low discrepancy between the phenotyping of the two populations allows us to conclude that a minimum number of plants (10 plantlets and four flowering plants) is sufficient to evaluate major visible phenotypes present in the family. The vocabulary used, together with the number of lines found in each phenotype subcategory, is shown in Table S1 (see Supporting Information).

To manage the phenotypic and seed stock data, the LegumBase database was developed, which is publicly available through a web interface (http://www.inra.fr/legumbase). This relational database contains phenotypic data from about 10 000 legume genotypes (M. truncatula, P. sativum and Vicia faba). These genotypes are of scientific or heritage interest: EMS mutants, wild or breeding genotypes. Among these, about 5100 were M. truncatula M2 mutant lines, all publicly available. The phenotypic descriptors of LegumBase fit the vocabulary used to describe the lines shown in Table S1 (see Supporting Information). One of the main goals of LegumBase is to enable the user to search and visualize the collections; mutant phenotypes can thus be searched by selecting either a single category or a combination of traits: cotyledon and plantlet characteristics at the young plant stage; architecture, leaf, stem and flower traits at the flowering plant stage; pod and seed characteristics at the mature plant stage. A search using different phenotypic criteria permits the use of the resource for forward genetics. More than 1100 digital images are available. On registration, users can order seeds via the web interface. An automatic Material Transfer Agreement (MTA) is sent to the requester by e-mail and all operations (MTA sent or received, seeds sent or received) are stored in the database, with the date of execution.

Reverse genetics screenings

TILLING screenings of EMS1 and EMS2

DNA samples were prepared from 4500 M2 plants of the EMS1 population and 4350 plants of the EMS2 population, and organized in pools of 12 for EMS1 (one pool per M1 progeny) and pools of eight for EMS2 (no redundancy in the M2 generation). Mutations were first detected in the amplified targets in pools using the mismatch-specific endonucleases ENDO1 (Triques et al., 2007) and CEL1. Individual mutant lines were identified following the same protocol on individual DNA from pools, and the mutations were confirmed by sequencing. A total of 546 mutants was detected in the 56 targets using either EMS2 alone or both populations: 34 mutants were identified within 10 targets screened in the EMS1 population, and 512 mutants were identified within 56 targets in the EMS2 population. The mutations observed were predominantly G → A or C → T changes, except one T → G, three T → A and two A → C changes (0.92%). As reported previously by Greene et al. (2003), EMS mutagenesis in Arabidopsis induced 99% changes from G/C to A/T. The six exceptions observed could arise from spontaneous mutation: a natural rate of around 10⁻⁸ per bp per generation has been reported for two dicotyledons, Arabidopsis thaliana and Nicotiana tabacum (Filkowski et al., 2004), but an error-prone repair mechanism is more probable.

Comparative results of 10 duplicate screens on EMS1 and EMS2 are presented in Table 2. In the comparison of EMS1 with EMS2, the mean number of mutations observed in exons per kb screened is twice as high in EMS2 (10 mutants per exon kb) as in EMS1 (4.5 mutants per exon kb). The data for 56 screens on the EMS2 population are presented in Table 3. The mean number of mutations in EMS2 is 12.9 mutants per exon kb screened. The induced mutations observed in exons in EMS1 consisted of 62% missense, 35% silent and 3% stop or truncation mutations, and, in EMS2, of 69.7% missense, 25.4% silent and 4.9% stop or truncation mutations (Table 4). The distribution of mutations is similar in the two populations (pair-wise comparison of mutation distribution in EMS1 with distribution in EMS2: χ² = 2.95, P = 0.23). The number of observed mutations is consistent with the expected frequency predicted by CODDLE (see χ² tests in Table 4). Of the 512 mutants detected in EMS2, 171 were homozygous, confirming the expected proportion of homozygous mutants of one-third. In EMS1, the proportion of homozygous mutants could not be assessed as only one or two individuals per pool were sequenced to confirm the mutation.

Table 2.  Comparative results of the molecular screening of 10 targets in the two ethyl methane sulphonate (EMS) populations. Screenings were undertaken on one 384-pool plate for each target, representing 4500 individuals for EMS1 and 3072 individuals for EMS2. The detection method used was the Li-Cor system. The total number of mutations, type of mutation (missense or stop) and homozygosity (homo) are indicated, as well as the number of mutants per pool in population EMS1. A hyphen marks a cell with no entry.

                                      EMS1: number of mutations per class           EMS2: number of mutations per class
Gene    Size (bp)  Coding size (bp)   Total  Missense  Stop  Homo  Per pool         Total  Missense  Stop  Homo
L1L     720        720                5      2         0     2     1; 5             6      1         0     2
VP1     820        607                6      4         0     1     2; 2; 2; 2       9      7         0     5
PS1     1128       1128               7      5         0     1     4; 1; 1; 3; 1    9      8         0     4
PI1     520        410                3      1         0     0     2                3      2         0     1
RLKin   1100       1100               6      5         0     2     1; 1; 1; 9; 9    16     9         1     8
LYR6    1220       1200               2      2         0     2     6; 6             13     11        -     1
PII     607        258                0      0         0     0     -                2      1         1     1
Dof2    1150       1010               3      0         1     0     3                9      8         -     2
HMT     1500       576                0      0         0     0     -                2      2         0     0
NewLec  805        573                2      2         0     0     1; 1             3      1         0     1
Total   9570       7582               34     21        1     8     -                78     54        2     25
Mean    957        758.2              3.4    2.1       0.1   0.8   2.9              7.2    5         0.3   2.5

Table 3.  Summary of additional screening on EMS2 for Grain Legumes Integrated Project (GLIP) requests. In Dijon, France (Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines), the technique used was the Li-Cor system: genes 1–11 and screenings performed on 3072 individuals. At the John Innes Centre, Norwich, UK (Genome Laboratory), a capillary sequencer was used: genes 12–46 and screenings performed on 4350 individuals. EMS, ethyl methane sulphonate. A hyphen marks a cell with no entry.

                                                 EMS2: number of mutations per class
Number  Gene       Size (bp)  Coding size (bp)   Total  Missense  Stop  Homozygous
1       CCS52      897        437                3      3         -     1
2       LYK2       1253       669                5      3         1     1
3       LYK4       1010       637                4      3         -     0
4       NRT2B      1207       966                2      1         1     0
5       bZIP       1154       810                6      2         -     1
6       Wrky       1270       725                4      2         -     2
7       FB1        550        340                2      2         -     0
8       Enod11     737        525                5      5         -     2
9       CL13       939        255                10     9         -     5
10      HB1        1211       726                5      1         -     0
11      bHLH       1194       868                9      7         1     1
12      AP2        1043       915                30     23        4     3
13      bZIPATB2   951        471                5      4         -     1
14      CHR17A     927        303                10     8         2     0
15      Cl12       822        246                3      2         -     1
16      DCL1A      1362       1017               17     11        -     6
17      DCL3A      924        532                12     8         -     6
18      ERF1       1020       282                3      2         1     0
19      ERFlike1   1051       496                8      5         -     4
20      ERN2       1049       916                6      5         -     1
21      G04A       1113       476                11     6         1     4
22      G04B       1049       502                9      3         -     6
23      G07A       1091       1091               19     15        -     4
24      HK1_new    1210       683                6      4         -     2
25      HK1_reg1   1036       615                9      6         1     2
26      HK3        1137       866                7      5         -     2
27      Lyr2       1074       1074               18     9         1     8
28      Lyr4       1143       1143               20     16        1     4
29      MADS1      1089       306                1      1         -     0
30      Nacinit    989        690                20     16        -     4
31      NacNod     904        474                7      6         1     0
32      NFP        1097       1097               24     20        -     5
33      Nup133     1123       1123               10     9         1     0
34      PHYA       1004       1004               6      4         1     1
35      PHYB       994        994                18     12        2     4
36      PHYE       996        996                10     7         1     2
37      RDR2       1012       1012               12     8         1     3
38      RL12       956        459                8      6         -     1
39      RLP1       1133       1069               12     7         -     6
40      RR4        965        362                4      4         -     0
41      SGS2       1166       1058               9      5         1     3
42      SGS3B      1001       711                16     10        -     6
43      SINA4      884        387                8      4         -     5
44      SINA6      1141       705                7      4         -     3
45      TC99911    932        546                6      4         -     2
46      XRN4       1370       632                8      3         1     4
        Mean       1047.4     700.2              9.4    6.5       1.3   2.5

Table 4.  Distribution of missense and truncation mutations along the TILLED fragments and the number of heterozygotes and homozygotes observed in detected mutants in EMS1 and EMS2. The data are based on the synthesis of 10 screenings in EMS1 and 56 screenings in EMS2. χ² values with two degrees of freedom were used for the comparison with the CODDLE prediction models. EMS, ethyl methane sulphonate. A hyphen marks a cell with no entry.

                     EMS1: number of mutations              EMS2: number of mutations
                     All    Silent  Missense  Truncation    All    Silent  Missense  Truncation
Distribution         34     13      20        1             512    142     345       25
% expected           -      27.4    67.9      4.7           -      26.80   67.70     5.50
% observed           -      35.0    62.0      3.0           -      27.7    67.4      4.9
χ² value (2 d.f.)    3.49   -       -         -             0.53   -       -         -
P                    0.17   -       -         -             0.77   -       -         -
Heterozygous (Het)   -      -       -         -             341    96      227       18
Homozygous (Hom)     -      -       -         -             171    46      118       7
Ratio Het/Hom        -      -       -         -             1.99   2.09    1.92      2.57
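
The χ² comparison of observed and expected mutation classes reported in Table 4 can be illustrated with a short goodness-of-fit calculation. This is a sketch under stated assumptions, not the authors' script: it uses the EMS2 counts (142 silent, 345 missense, 25 truncation) and the expected percentages (26.80, 67.70, 5.50) from Table 4, and relies on the fact that, for exactly two degrees of freedom, the upper-tail probability of a χ² value x is exp(-x/2). The function names are illustrative.

```python
import math


def chi_square_gof(observed, expected_fractions):
    """Pearson chi-square statistic for observed counts vs. expected fractions."""
    total = sum(observed)
    return sum(
        (obs - total * frac) ** 2 / (total * frac)
        for obs, frac in zip(observed, expected_fractions)
    )


def p_value_2df(chi2):
    """Upper-tail probability for a chi-square variable with exactly 2 d.f."""
    return math.exp(-chi2 / 2.0)


# EMS2 values taken from Table 4: silent, missense, truncation.
ems2_observed = [142, 345, 25]
ems2_expected = [0.268, 0.677, 0.055]

chi2 = chi_square_gof(ems2_observed, ems2_expected)
print(f"chi2 = {chi2:.2f}, P = {p_value_2df(chi2):.2f}")
```

With these inputs the sketch gives χ² of about 0.53 and P of about 0.77, the values listed for EMS2 in Table 4.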

The average density of mutations was calculated as (size of the exon screened × total number of individuals screened/total number of identified mutants). The average mutation density observed in EMS2 (1/485 kb in Dijon, France; 1/424 kb in Norwich, UK) is 2.5-fold higher than that observed in EMS1 (1/1290 kb), confirming the better quality of EMS2.
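
The density formula above can be written as a small helper. This is a minimal sketch with a hypothetical function name, and the example figures are made up for illustration; they are not taken from the screens reported here.

```python
def mutation_density_kb(screened_bp, individuals, mutants):
    """Return N such that the density is one mutation per N kb.

    screened_bp: size of the exon sequence screened, in base pairs
    individuals: total number of individuals screened
    mutants:     total number of identified mutants
    """
    if mutants <= 0:
        raise ValueError("at least one mutant is needed to estimate a density")
    return screened_bp * individuals / mutants / 1000.0


# Hypothetical screen: 1000 bp of exon, 3000 individuals, 6 mutants found.
print(f"1 mutation per {mutation_density_kb(1000, 3000, 6):.0f} kb")
```

A smaller N means a denser population, so a population at 1/485 kb carries more mutations per unit of sequence than one at 1/1290 kb.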

Distribution of the mutations

Mutations were not regularly spaced within the fragments screened (Figure 3). The detection of mutants falls off towards both ends of the targets as a result of the detection technique used. Greene et al. (2003) reported that Li-Cor analysis with fluorescent primers suffers from high fluorescent background noise at the low-molecular-weight end of the gel and a weak signal at the high-molecular-weight end of the lane. Approximately 100 base pairs at each end of the target were not screened. This technical limit had little influence on our results because the primers were designed outside of the target zone identified by CODDLE, mainly in introns, and so the mutants could be detected and recovered throughout the exons.

Figure 3.

Distribution of mutations as a function of scaled fragment coordinates. The length of a TILLED fragment was scaled to 1000 bp and the positions of each mutation were calculated on this scale. The data are based on results obtained from EMS1 and EMS2, and represent 546 point mutations from 56 screens. Axis, number of mutations in the range considered; ordinate, position of the mutation along a 1-kb fragment. EMS, ethyl methane sulphonate.

Local compositional bias around the mutated ‘G’

The nucleotide composition was examined around the mutated ‘G’. Deviations from the expected frequencies were observed in the neighbourhood of each ‘G’ based on the 56 fragments TILLED (Figure 4). In positions –1 and +1, a highly significant bias (P < 10⁻⁴) was found, and a bias significant at the 0.05 level in position +2. No significant bias was observed at positions –2/–3/+3 (χ² test of expected vs. observed frequencies). In position –1, ‘G’ was more frequent than expected (×1.5) and ‘T’ was less frequent (×0.6), whereas, in position +1, ‘A’ was more frequent (×1.6) and ‘C’ less frequent (×0.4) than expected. In position +2, there was a significant deviation from the expected frequency for ‘G’ (×1.7). Although our results are less robust than those obtained for Arabidopsis (Greene et al., 2003), because of the smaller number of targets, we observed the same tendency of a higher frequency of purines and a lower frequency of pyrimidines directly around the mutated ‘G’. No significant bias was seen among the observed and expected triplets. The expected frequency of triplets was estimated from the product of the frequencies of the bases at the –1 and +1 positions.

Figure 4.

Observed frequency (left bar) of the four bases in each position around the mutated ‘G’ (position 0) for the 546 mutations detected in the EMS1 and EMS2 populations. The expected frequencies (right bar) of each base observed in all sequences (forward and reverse) in the neighbourhood of a ‘G’ base. Discrepancies between observed and calculated frequencies superior to 15% are indicated by an asterisk on the observed bar. EMS, ethyl methane sulphonate.

Discussion

A TILLING strategy has been used successfully to detect mutations in 56 genes in M. truncatula, providing a unique tool for functional validation studies. In the population on which all targets were scored (EMS2), an allelic series of about 13 mutations per exon kb screened was found on average, 67% of which were missense and 5% nonsense mutations. Two EMS populations were generated independently using two different designs. Although the M2 generations consisted of the same number of plants, the M1 generations differed in size: EMS1 was a narrow-base M1 population of 500 individuals, as opposed to the large-base M1 population of EMS2 of 4350 individuals. The recording of visible phenotypes indicated that the two populations exhibited the same categories of mutations. Owing to redundancy in the M2 generation of EMS1, as expected, a higher mutation density was found in the EMS2 population than in EMS1. The 1 : 2.5 ratio observed between the mutation densities of the two populations, based on the results from molecular screens, can be compared with estimations of the mutation frequency from phenotypic data via an estimation of the genetically effective cell number (GECN) and mutation rate.

EMS population design, GECN and mutation rate

As mutations occur randomly in each cell, different cells in the same seed will contain different mutations. Thus, M1 plants, derived from seed mutagenesis, are chimeric. For the detection of mutations, only those that occur in cells that form the germ-line can be detected in the M2 generation (Koornneef, 2002). The number of meristem cells that contribute to the formation of seed (germ-line), called GECN (Carroll et al., 1988), has typically been estimated to be between two and three (Li and Redei, 1969; Carroll et al., 1988). In consequence, the progeny of an M1 plant may be derived from a sector in which a specific gene (A) is not mutated (AA) or from a sector carrying a heterozygous mutation (Aa). The segregation observed in the M2 generation for this gene (progeny of the chimeric plant) is 5 : 3 (AA : Aa + aa), where GECN = 2. This 5 : 3 ratio is obtained from a 1 : 3 ratio (AA : Aa + aa) from the mutated sector and a 4 : 0 ratio from the non-mutated sector. Where GECN = 3, the estimated (AA : Aa + aa) ratio is 9 : 3, that is 1 : 3 for the mutated sector and 4 : 0 for the two other non-mutated sectors. Considering that the mean number of mutated plants recovered in each M2 pool of EMS1 is three of 12 plants, the observed ratio (AA : Aa + aa) is close to 9 : 3, and we can infer that the mean GECN in M. truncatula is three.
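
The segregation ratios described above follow from simple counting, which the sketch below reproduces. It assumes, as the text does, that exactly one of the GECN equally contributing germ-line sectors is heterozygous (Aa) and that each sector contributes four "parts" of selfed progeny; the function names are illustrative.

```python
from fractions import Fraction


def m2_segregation(gecn):
    """Return the (AA, Aa + aa) parts of the M2 segregation ratio.

    The mutated sector contributes 1 : 3 and each of the (gecn - 1)
    non-mutated sectors contributes 4 : 0.
    """
    wild_type = 1 + 4 * (gecn - 1)
    mutant = 3
    return wild_type, mutant


def mutant_fraction(gecn):
    """Fraction of M2 progeny carrying the mutation (Aa or aa)."""
    wild_type, mutant = m2_segregation(gecn)
    return Fraction(mutant, wild_type + mutant)


for gecn in (1, 2, 3):
    wild_type, mutant = m2_segregation(gecn)
    carriers_in_12 = mutant_fraction(gecn) * 12
    print(f"GECN = {gecn}: {wild_type} : {mutant}, "
          f"{carriers_in_12} carriers expected in a family of 12")
```

For GECN = 3 the expected number of carriers in a family of 12 is three, which is the observation used above to infer the GECN.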

From this estimation, we can compare the mutation density and mutation rate between our two populations. If GECN = 3, we can infer that the 500 M1 plants in EMS1 give rise to 1500 independent mutation sets in the M2 generation. EMS1 is indeed equivalent to a population of 1500 M2, with, on average, four individuals per single mutation set. This assumes no loss of mutation caused by selfing. In EMS2, the SSD scheme induced a loss of one-quarter of the mutations as wild-type/wild-type (wt/wt) Mendelian segregants in the M2 generation. Thus, the 4000 M2 plants are equivalent to only 3000 mutated lines. Our screening results support the balance between 1500 apparent M2 lines in EMS1 and 3000 apparent M2 lines in EMS2.
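
The arithmetic behind the "apparent" population sizes can be laid out explicitly. This is a minimal sketch using the figures quoted in the paragraph above (500 M1 plants and GECN = 3 for EMS1; 4000 M2 plants and a one-quarter loss to wt/wt segregants for EMS2); the function names are hypothetical.

```python
def independent_sets_family_design(m1_plants, gecn):
    """Independent mutation sets when each M1 plant yields a family of M2 plants."""
    return m1_plants * gecn


def independent_sets_ssd(m2_plants, wild_type_loss=0.25):
    """Independent mutated lines under single seed descent.

    wild_type_loss is the fraction of M2 plants expected to be wt/wt
    segregants for any given heterozygous M1 mutation.
    """
    return m2_plants * (1 - wild_type_loss)


ems1_sets = independent_sets_family_design(500, 3)
ems2_sets = independent_sets_ssd(4000)
print(ems1_sets, ems2_sets, ems2_sets / ems1_sets)
```

The resulting two-fold ratio is the balance between the two populations referred to above.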

The mutation rate was also estimated using GECN estimation and the frequency of pale-green phenotypes (Table 1) to be 4.5 × 10⁻⁵ in EMS1 and 1 × 10⁻⁴ in EMS2. This 1 : 2 ratio of mutation rates between the two populations is in accordance with the molecular screening results, indicating twice as many mutants in EMS2 as in EMS1.

From these results, we can infer the optimum balance between the sizes of the M1 and M2 generations for the generation of a maximum number of independent mutations. The recovery of the mutations occurring in the M1 generation is determined by the size of the M2 families. If a single germ-line cell is heterozygous for a mutation and GECN = 3, the probability of recovering the mutation in either a heterozygous or homozygous mutant state is P = 0.25 when only one individual is tested in M2, that is a one-in-three chance of sampling the mutated sector multiplied by the three-quarters of its selfed progeny that carry the mutation (if GECN = 1, this probability would be P = 0.75). The number of M2 plants per M1 plant needed to recover at least one mutant (heterozygous or homozygous) in the M2 population with a given probability is calculated from the binomial distribution with a per-plant recovery probability of 0.25 (Table 5).

Table 5.  Probability of recovery of at least one mutant in M2 among the descendants of M1 plants as a function of the number of M2 plants tested per M1 plant (1–10 plants). We assume that the genetically effective cell number (GECN) is three and consider a total M2 population of 1000 plants. Consequently, the number of M2 families tested is indicated, as well as the expected number of mutations recovered and the total number of plants produced.

                                                          Number of M2 individuals per M1
                                                          1      2      3      4      6      8      10
P (at least one mutant in M2)                             0.25   0.44   0.58   0.68   0.82   0.89   0.94
Number of M1 plants for M2 size of 1000*                  1000   500    333    250    167    125    100
Number of plants produced M1 + M2                         2000   1500   1333   1250   1167   1125   1100
Number of mutations recovered for population size of 1000 750    660    580    510    411    334    282
Number of mutations/number of plants                      0.375  0.44   0.43   0.41   0.35   0.30   0.26

* Equivalent to number of M1 plants.
Number of mutations recovered in M2 = number of M1 plants × GECN × P (probability of recovery of at least one mutant).

With one M2 individual per M1 plant, the probability of recovering a mutation in a single heterozygous cell in one M1 plant is P = 0.25. Ten individuals per M1 plant are sufficient to recover the majority of mutations occurring in M1 (P = 0.94; Table 5). From a genetic point of view, the highest number of mutations recovered is obtained with one M2 plant per M1 plant, and this population is that with no redundancy. From an economic point of view, an optimal balance between the number of M1 plants and the number of M2 plants exists in order to obtain the maximum number of mutations at the lowest possible cost (Redei and Koncz, 1992). If the production costs of M1 and M2 plants are identical (all plants individually harvested), the efficiency can be estimated from the ratio of the number of mutations recovered/number of plants produced (see Table 5). A good compromise for M. truncatula would be an M1 population size half that of M2, that is families of two M2 plants per M1 plant.
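
The quantities in Table 5 can be recomputed from the model described in the text. The sketch below is one possible reading rather than the authors' own calculation: it assumes GECN = 3, a per-plant recovery probability of 0.25, a fixed M2 population of 1000 plants and equal production costs for M1 and M2 plants. Function names are illustrative, and small differences from the printed table are expected because the table values appear to be rounded at intermediate steps.

```python
GECN = 3
# One sector in GECN is mutated and 3/4 of its selfed progeny carry the allele.
P_SINGLE = 0.75 / GECN


def p_at_least_one(n_m2_per_m1, p=P_SINGLE):
    """Probability that at least one of n M2 plants carries the mutation."""
    return 1.0 - (1.0 - p) ** n_m2_per_m1


def design_row(n_m2_per_m1, m2_total=1000, gecn=GECN):
    """Return (P, M1 plants, plants produced, mutations recovered, efficiency)."""
    m1_plants = m2_total / n_m2_per_m1
    p = p_at_least_one(n_m2_per_m1)
    recovered = m1_plants * gecn * p
    produced = m1_plants + m2_total
    return p, m1_plants, produced, recovered, recovered / produced


family_sizes = (1, 2, 3, 4, 6, 8, 10)
for n in family_sizes:
    p, m1_plants, produced, recovered, efficiency = design_row(n)
    print(f"{n:>2}  P={p:.2f}  M1={m1_plants:5.0f}  plants={produced:5.0f}  "
          f"mutations={recovered:4.0f}  efficiency={efficiency:.3f}")

# Family size with the most mutations recovered per plant produced.
best = max(family_sizes, key=lambda n: design_row(n)[4])
print("most efficient family size:", best)
```

Under these assumptions the efficiency peaks at two M2 plants per M1 plant, in line with the compromise proposed above.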

Mutation detection and local bias around the mutated ‘G’

A total of 546 point mutations was detected in the two EMS populations (34 in EMS1 and 512 in EMS2). The calculated overall mutation rate of one mutation every 485 kb, found in our best-characterized population (EMS2), is not significantly different from the rate of one mutation per 500 kb reported in maize (Till et al., 2004), 1.5-fold lower than the rates reported for Arabidopsis (1/300 kb; Greene et al., 2003), rice (1/300 kb; Till et al., 2007) and Caenorhabditis elegans (1/293 kb; Gilchrist et al., 2006), and two-fold lower than that observed in pea (1/203 kb; Dalmais et al., 2008). A much higher mutation density has been observed for tetraploid wheat (1/40 kb) and hexaploid wheat (1/24 kb; Slade et al., 2005), a species able to withstand much higher doses of EMS without an obvious impact on survival or fertility rates, because of multiple gene redundancies in the polyploid genome.

Our preliminary analysis of the first 56 targets suggested a limited local sequence bias around the mutated site in our populations. This might reflect a cleavage preference of the enzyme, but the use of the mismatch-specific endonuclease ENDO1, which cleaves a broad spectrum of mismatches with high efficiency (Triques et al., 2007), makes this hypothesis unlikely. Another explanation for the observed bias would be the mismatch repair (MMR) of alkylation damage. As the sites of damage are recognized by MMR proteins with different affinities, depending on the identities of several base pairs around the mismatch, the damaged region is possibly excised and repaired by a special DNA polymerase. Wu et al. (2003) showed, in A. thaliana, a specialization of DNA-MMR for particular mismatches and sequence contexts. Although the heterodimers AtMSH2–MSHx bind G/T mismatches preferentially, the AtMSH2–MSH7 complex binds G/G, G/A, A/A and, especially, C/A mismatches as well as or better than G/T mismatches. Mismatches induced by EMS are therefore likely to be repaired in plants with high efficiency.

Functional validation and TILLING platforms

One-half of the induced missense mutations are predicted to result in a damaged protein (Markiewicz et al., 1994). An allelic series of 5–10 mutants should therefore be sufficient for phenotypic analysis of a non-redundant locus. Backcrossing is currently being carried out for the 10 targets screened on both populations. Two backcrosses to the reference line A17 will remove approximately 75% of the background EMS mutations. Homozygous mutant lines derived from these backcrosses will be sufficiently pure for use in genetic and phenotypic analyses. The allelic series of mutations provided by TILLING are powerful tools for functional validation of candidate genes.
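
The two figures in this paragraph can be checked with simple arithmetic. The sketch below assumes that background mutations are unlinked to the target, so that each backcross halves them on average, and that each missense allele independently has a one-in-two chance of being damaging. Both function names are illustrative.

```python
def background_removed(n_backcrosses: int) -> float:
    """Expected fraction of unlinked background mutations removed after n backcrosses."""
    return 1.0 - 0.5 ** n_backcrosses


def at_least_one_damaging(n_alleles: int, p_damaging: float = 0.5) -> float:
    """Probability that an allelic series of n missense alleles contains
    at least one damaging allele, assuming independence between alleles."""
    return 1.0 - (1.0 - p_damaging) ** n_alleles


if __name__ == "__main__":
    print(f"Background removed after 2 backcrosses: {background_removed(2):.0%}")
    for n in (5, 10):
        print(f"Series of {n} alleles: P(at least one damaging) = {at_least_one_damaging(n):.3f}")
```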

The two EMS populations created for M. truncatula offer a total of more than 9000 M2 plants for forward or reverse genetics. Phenotyping data are publicly available via our web interface LegumBase (http://www.inra.fr/legumbase) for forward genetics (1100 images and 66 distinct phenotypes described). In order to exploit these populations for reverse genetics, genomic DNA was prepared for both populations. A TILLING platform was set up in INRA-Dijon, France for the duration of the European Grain Legumes Integrated Project (GLIP). To provide a community Medicago TILLING platform, DNA from EMS2 was transferred to the Genome Laboratory at the John Innes Centre (Norwich, UK), who provide the TILLING service (http://www.jicgenomelab.co.uk/revgenuk.html). Seed requests should be made to the Centre for Biological Resources for mutants for M. truncatula [Unité Mixte de Recherche en Génétique et Ecophysiologie des Légumineuses à Graines (UMR-LEG), Institut National de la Recherche Agronomique (INRA), Dijon, France, or the website http://www.inra.fr/legumbase].

Experimental procedures

Plant production

All TILLING plants were multiplied under glasshouse conditions (from January 2003 to December 2004) in 0.5-L pots filled with pouzzolane. Plants were automatically watered with a nutrient solution of 3.5 N/3.1 P/8.6 K. The temperature was above 19 °C during the night and below 30 °C during the day; complementary artificial lighting was provided to attain a 16-h day. The plants were not inoculated with Sinorhizobium sp. bacteria and the nitrogen supply was designed to be mainly mineral and non-limiting. The pods were harvested at maturity and threshed, and the seeds were sieved to eliminate broken seeds. Under these glasshouse conditions, the duration of the growth cycle is 4–5 months.

EMS treatments

Seeds of M. truncatula were scarified by immersion in concentrated sulphuric acid for 3 min, and rinsed five times in a large volume of distilled water. EMS was diluted according to the chosen dose in deionized water. Seeds were then added to the bottle. The closed bottles were placed on a rotary shaker (50 r.p.m.) all day and overnight (24 h imbibition). EMS solution was then removed and the seeds were rinsed extensively 12 times for 30 min with gentle shaking. Before sowing, imbibed seeds were vernalized for 24 h at 4 °C in a cold room.

Two EMS treatments were carried out: EMS1 in 2003 at 0.20% (16 mM) and EMS2 in 2004 at 0.15% (12 mM). The choice of dose depended on the percentage of seedling survival observed in each seed batch. Mutagenesis efficiencies for the two populations were assessed by recording albino or chlorotic phenotypes on M2 plantlets and embryonic lethality in M2 seeds on M1 plants at the seed filling stage: 200 M1 pods from 1500 plants were dissected under a binocular microscope and the numbers of aborted and wild-type seeds were recorded.
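
The percentage and millimolar values of the two doses can be cross-checked as follows. This is a minimal sketch that assumes the percentages are weight/volume and uses the molar mass of ethyl methanesulfonate (124.16 g/mol); neither assumption is stated in the text, and the function name is illustrative.

```python
EMS_MOLAR_MASS_G_PER_MOL = 124.16  # ethyl methanesulfonate, C3H8O3S (assumed value)


def percent_wv_to_millimolar(percent_wv: float,
                             molar_mass: float = EMS_MOLAR_MASS_G_PER_MOL) -> float:
    """Convert a % (w/v) concentration to millimolar."""
    grams_per_litre = percent_wv * 10.0  # 1% (w/v) = 1 g per 100 mL = 10 g/L
    return grams_per_litre / molar_mass * 1000.0


if __name__ == "__main__":
    for dose in (0.20, 0.15):
        print(f"{dose:.2f}% (w/v) ~ {percent_wv_to_millimolar(dose):.1f} mM")
```

Under these assumptions, the two doses round to the 16 and 12 millimolar values given for EMS1 and EMS2.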

Estimation of the mutation rate

To obtain an estimation of the mutation rate in our populations, the formula described by Li and Redei (1969) was used

R = M/(S × GECN × 2D)

where R is the mutation rate per µM, M is the number of M2 families segregating for recessive mutants, S is the number of surviving plants, GECN is the genetically effective cell number of the mutagen-treated tissues (germ-line), ‘2’ is a correction factor for diploidy and D is the dose of mutagen in µM.
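
As an illustration, the formula can be written as a small function. This is a sketch only: the argument names are ours, and the numbers in the usage example are hypothetical values chosen to show the arithmetic, not data from the populations described here.

```python
def mutation_rate(m_families: float, s_survivors: float,
                  gecn: float, dose_um: float) -> float:
    """Mutation rate R = M / (S x GECN x 2D), after Li and Redei (1969).

    m_families  -- M, number of M2 families segregating for recessive mutants
    s_survivors -- S, number of surviving plants
    gecn        -- genetically effective cell number of the treated germ-line
    dose_um     -- D, mutagen dose in micromolar
    """
    denominator = s_survivors * gecn * 2.0 * dose_um  # '2' corrects for diploidy
    if denominator <= 0:
        raise ValueError("S, GECN and D must all be positive")
    return m_families / denominator


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(mutation_rate(m_families=10, s_survivors=100, gecn=2, dose_um=5))
```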

Genotyping platform

DNA preparation

DNAs from M2 plants were prepared from one trifoliate leaf (about 0.15 g) using the DNeasy Plant 96 Qiagen Kit (Qiagen S.A., France) for 96 samples, following the manufacturer's instructions. For EMS1, concentrations were estimated by visualization on a 1% agarose gel and roughly adjusted to equal concentration with deionized water, if necessary, before 12-fold pooling by a Tecan (Tecan Group, Männedorf, Switzerland) liquid handling robot. For EMS2, DNAs were quantified using PicoGreen (Molecular Probes, Invitrogen Corporation, Carlsbad, California, USA) against a universal DNA concentration standard on a Tecan Genios plate reader. All samples were normalized to 0.5 ng/µL (diluted in deionized water) and eight-fold pooling was performed by a Xiril (Xiril AG, Hombrechtikon, Switzerland) liquid handling robot.

Choice of the TILLING fragment

The CODDLE (codons optimized to discover deleterious lesions; http://www.proweb.org/coddle; Till et al., 2003) program, combined with the Primer3 tool (Rozen and Skaletsky, 2000), was used to define the best amplicon for TILLING. CODDLE identifies areas within the gene which have the highest probability of affecting gene function when mutated by EMS, and scores possible missense and nonsense changes.

Detection of mutations using Li-Cor (at Dijon, France)

A protocol modified from the TILLING protocol set up for the Arabidopsis TILLING Project (Colbert et al., 2001) was used. As a result of heterogeneity in PCR yield among pools, a two-step PCR was employed. The nested PCR was performed in 11-µL volumes (5 µL of DNA + 6 µL of mix) using the Taq Core kit 25 [including concentrated Taq (15 U/µL), deoxynucleoside triphosphate (dNTP) and 10 × buffer] from MP-QBiogene (MP Biomedicals, France). Pools from EMS1 were arranged on one single 384-well plate; pools from EMS2 were arranged on one 384-well plate plus one-half of a 384-well plate. All primers were obtained from MWG Biotech (MWG, Ebersberg, Germany). For PCR1, external primers were diluted to a final concentration of 0.2 µM; for PCR2, a mixture of labelled and unlabelled internal primers was used (ratio 3 : 2 labelled : unlabelled for IRD700, and ratio 4 : 1 for IRD800) and diluted to a final concentration of 0.2 µM. The PCR1 product was diluted at either 1 : 20 or 1 : 50, depending on yield, and 5 µL of this dilution was transferred robotically to a new 384-well PCR plate. Cycling was performed in an MWG Biotech 384-well cycler as follows: (i) PCR1: 94 °C for 3 min, 30 cycles (94 °C for 30 s, Tm for 40 s, 72 °C for 1 min/kb), 72 °C for 5 min; (ii) PCR2: as PCR1, followed by 99 °C for 10 min, 70 cycles (72 °C for 20 s, –0.3 °C/cycle), to facilitate heteroduplex formation. Amplification was followed by ENDO1 treatment (endonuclease and protocols provided by A. Bendahmane, Institut National de la Recherche Agronomique, Unité de Recherche en Génomique Végétale, Evry, France; Triques et al., 2007) in a final volume of 20 µL, with incubation at 37 °C for 20 min. Reactions were stopped by the addition of 5 µL of 0.15 M ethylenediaminetetraacetic acid (EDTA). Twenty microlitres of the mix were transferred robotically to 4 × 96-well spin-plates (Millipore, Billerica, Massachusetts, USA) filled with pre-swollen Sephadex G-50.
After centrifugation according to the manufacturer's protocol, 3 µL of mix supplemented with 2 µL of a formamide loading buffer (2 × stock solution: bromophenol blue 1%; formamide 98%; EDTA, pH 8.0, 10 mM) was pipetted into a 96-well plate. Samples on the plate were denatured at 96 °C for 3 min and then stored on ice. One microlitre of each sample was pipetted into a 96-LiCor loading rack and absorbed on to a membrane comb before loading on the Li-Cor gel. Following the pre-run focusing step on a Li-Cor 4300, the comb was inserted. Following electrophoresis for about 3 h (for a 1-kb fragment), 700 and 800 channel images were analysed using Adobe Photoshop software: point mutants seen on one image have counterparts in the second channel, and the two fragment sizes add up to the homoduplex length. Once mutants had been detected in pools, the individuals comprising the pools were screened following the same protocol as for the pools (nested PCR, ENDO1 digestion, cleaning up and electrophoresis run), but using mixed DNA (individual + wild-type A17) to ensure heteroduplex formation. The nature of the mutation was further determined by sequencing individual PCR products. Sequences were analysed using Chromas software. For a heterozygous mutation, the mutated base was identified by two superposed peaks higher than background noise. In the case of a homozygous mutation, the wild-type sequence was aligned with the mutant sequence to identify the single nucleotide polymorphism (SNP) (CLUSTAL analysis). The impact of the mutation was then assessed at the transcript level by checking its effect on intron splicing and, at the protein level, by checking its effect on codon change. The mutation density was estimated as the total number of mutations divided by the total number of base pairs screened, that is the amplicon size × number of individuals screened.
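
The density estimate described at the end of this paragraph can be sketched as follows. The function name and the example figures are hypothetical and serve only to show the arithmetic; they are not results from EMS1 or EMS2.

```python
def kb_per_mutation(n_mutations: int, amplicon_bp: int, n_individuals: int) -> float:
    """Return the mutation density expressed as kilobases screened per mutation.

    Density = mutations / (amplicon size x individuals screened); the value
    returned is its reciprocal in kb, i.e. 'one mutation every X kb'.
    """
    if n_mutations <= 0:
        raise ValueError("at least one mutation is needed to estimate a density")
    total_bp_screened = amplicon_bp * n_individuals
    return total_bp_screened / n_mutations / 1000.0


if __name__ == "__main__":
    # Hypothetical screen: 10 mutations in a 1000-bp amplicon across 3000 plants.
    print(f"1 mutation per {kb_per_mutation(10, 1000, 3000):.0f} kb")
```

For several targets, the same calculation would sum the mutations and the base pairs screened over all amplicons before taking the ratio.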

Detection of mutations using a capillary sequencer (at John Innes Centre, Norwich, UK)

PCR was performed in 12.5-µL volumes using ExTaq (Takara Bio Inc, Shiga, Japan) polymerase and 5 µL of genomic pooled DNA at 0.5 ng/µL. Primers were obtained from Operon (Eurofins Operon, Ebersberg, Germany) and mixed in a molar ratio of 3 : 2 (labelled : unlabelled) for final primer concentrations of 0.2 µM. Primers were designed with melting temperatures of 60–70 °C using CODDLE, and Primer3 when CODDLE provided unsuitable designs, dependent on region structure and client preference. Cycling was performed on a G-Storm (Gene Technology Limited, Essex, UK) 96-well cycler as follows: 95 °C for 2 min; eight cycles of 94 °C for 20 s, Tm +3 °C to Tm –4 °C decreasing by 1 °C/cycle and 72 °C for 1 min; 45 cycles of 94 °C for 20 s, Tm –5 °C for 30 s and 72 °C for 1 min; 72 °C for 5 min; 99 °C for 10 min; 70 cycles of 20 s at 70 °C to 49 °C, decreasing by 0.3 °C/cycle. Five microlitres of the PCR product was cleaved with CEL1 by incubation at 45 °C for 15 min. The reaction was stopped by the addition of 5 µL of 150 mM EDTA. The sample was then isopropanol precipitated and resuspended in 9.9 µL of Hi-Di Formamide and 0.1 µL of MRK1000 (Gel Company, San Francisco, California, USA) Rox size standard. The sample was then heat denatured before loading on to an ABI 3730 (Applied Biosystems, Foster City, California, USA) using a 96-capillary 50-cm array with POP-7. GeneMapper 4.0 was used to analyse the outputs from the ABI sequencer to identify possible cleavage products resulting from the cleavage of heteroduplex mismatches. PCR and DNA sequencing were performed on all candidate individuals from the eight-fold pools in which cleavage products were detected. Sequence traces were analysed in Mutation Surveyor (SoftGenetics, State College, Pennsylvania, USA) to identify individuals with mutations and possible protein changes.
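
The touchdown and heteroduplex steps of this cycling programme can be written out explicitly. The sketch below is illustrative: the function names are ours, the Tm of 60 °C in the example is a hypothetical primer value, and the ramp assumes that the first of the 70 holds is at 70 °C.

```python
def touchdown_annealing(tm: float, cycles: int = 8) -> list[float]:
    """Annealing temperatures for the touchdown phase: Tm + 3 down to Tm - 4 in 1 degree steps."""
    return [tm + 3 - i for i in range(cycles)]


def heteroduplex_ramp(start: float = 70.0, step: float = 0.3, cycles: int = 70) -> list[float]:
    """Hold temperatures for the slow re-annealing ramp (20 s per hold)."""
    return [round(start - step * i, 1) for i in range(cycles)]


if __name__ == "__main__":
    tm = 60.0  # hypothetical primer Tm
    print("Touchdown annealing:", touchdown_annealing(tm))
    print("Main annealing (45 cycles):", tm - 5)
    ramp = heteroduplex_ramp()
    print(f"Ramp: {len(ramp)} holds from {ramp[0]} to {ramp[-1]} degrees C")
```

With these assumptions, the last hold is 0.3 °C above the 49 °C endpoint given in the text, which is reached after the final decrement.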

Screening management

The resource was created within the framework of GLIP, and priority was reserved for partners in this project. Ten genes were screened on both populations. Screenings were performed in Dijon on one 384-pool plate for each target, i.e. on 4600 individuals for EMS1 and 3072 individuals for EMS2. All available DNA from EMS2 (4350) was screened in the Genome Laboratory at the John Innes Centre (Norwich, UK); 21 genes were screened on EMS2 by the Dijon platform and 35 genes by the Norwich platform in response to GLIP requests.
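
The numbers of individuals screened per plate follow from the pooling depth used for each population (12-fold for EMS1, eight-fold for EMS2). The following sketch, with illustrative function names, shows the arithmetic and the number of pools needed for the full EMS2 DNA set.

```python
import math

WELLS_PER_PLATE = 384


def individuals_per_plate(pool_size: int, wells: int = WELLS_PER_PLATE) -> int:
    """Number of individuals represented on one full plate of pools."""
    return pool_size * wells


def pools_needed(n_individuals: int, pool_size: int) -> int:
    """Number of pools required to cover n_individuals at a given pooling depth."""
    return math.ceil(n_individuals / pool_size)


if __name__ == "__main__":
    print("EMS1, 12-fold pools:", individuals_per_plate(12))  # reported as about 4600
    print("EMS2, 8-fold pools:", individuals_per_plate(8))
    n_pools = pools_needed(4350, 8)
    print(f"All EMS2 DNA: {n_pools} pools = {n_pools / WELLS_PER_PLATE:.2f} plates")
```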

Acknowledgements

This work was supported by the Grain Legumes Integrated Project (FOOD-CT-2004-506223) of the European Commission FP6 Framework Programme. The authors wish to thank B. Darchy for expert technical assistance in the management of plants and seeds, B. McCullagh, L. MacPherson and R. Goram for running the HTP TILLING platform at John Innes Genome Laboratory, Norwich, UK, and Abdelhafid Bendahmane (Institut National de la Recherche Agronomique, Unité de Recherche en Génomique Végétale, Evry, France) for supplying endonuclease ENDO1, associated protocols and helpful advice. We are grateful to Bradley J. Till for providing the protocols of the original TILLING method at the beginning of the project. We also thank Karine Gallardo for critical reading of the manuscript and for providing helpful comments.
