Sharka is a devastating viral disease caused by the Plum pox virus (PPV) in stone fruit trees and few sources of resistance are known in its natural hosts. Since any knowledge gained from Arabidopsis on plant virus susceptibility factors is likely to be transferable to crop species, Arabidopsis's natural variation was searched for host factors essential for PPV infection.
To locate regions of the genome associated with susceptibility to PPV, linkage analysis was performed on six biparental populations as well as on multiparental lines. To refine quantitative trait locus (QTL) mapping, a genome-wide association analysis was carried out using 147 Arabidopsis accessions.
Evidence was found for linkage on chromosomes 1, 3 and 5 with restriction of PPV long-distance movement. The most relevant signals occurred within a region at the bottom of chromosome 3, which comprises seven RTM3-like TRAF domain-containing genes. Since the resistance mechanism analyzed here is recessive and the rtm3 knockout mutant is susceptible to PPV infection, this suggests that other gene(s) present in the small identified region encompassing RTM3 are necessary for PPV long-distance movement.
In consequence, we report here the occurrence of host factor(s) that are indispensable for virus long-distance movement.
Viruses are obligatory intracellular parasites that rely on the host cell machinery for progression of their infectious cycle. Successful viral infection in plants requires compatible interactions between host and viral factors during replication of the viral genome, translation of the viral proteins, and viral movement, both cell-to-cell and long-distance (Gilbertson & Lucas, 1996; Carrington & Whitham, 1998; Noueiry & Ahlquist, 2003; Waigmann et al., 2004; Scholthof, 2005). Interruption of these processes by mutations or the complete absence of relevant host factors results in the failure of the corresponding viral infection step, which is operationally equivalent to recessive resistance (Fraser, 1990). Virus-permissive forms of these host factors, considered as virus susceptibility genes, also participate in other cell functions and are largely conserved between plant species. A case in point is the identification of the eukaryotic translation initiation factors 4E (eIF4E) and 4G (eIF4G) (for a review, see Robaglia & Caranta (2006)). Following this initial discovery, a growing number of recessive resistance genes have been cloned and nearly all of them, with the exception of the tom and rim genes (Yamanaka et al., 2000, 2002; Tsujimoto et al., 2003; Yoshii et al., 2009), encode translation initiation factors (Truniger & Aranda, 2009). The exact steps through which these factors contribute to viral infection have not yet been determined precisely, but viral RNA translation, replication and/or virus cell-to-cell movement have been proposed as potential candidates (Robaglia & Caranta, 2006; Truniger & Aranda, 2009). Interestingly, no susceptibility factor participating in virus systemic movement has yet been identified, although it is clear that multiple host factors are required for virus long-distance movement (Schaad & Carrington, 1996; Scholthof, 2005; Requena et al., 2006).
Although Arabidopsis is not a crop, it has significant utility for the acquisition of basic knowledge of plants and their reactions towards biotic or abiotic stress. Its small genome (c. 150 Mb) and the ease of its manipulation have positioned it as the model plant for biotechnological and genetic studies, including plant disease resistance analysis. Arabidopsis is susceptible to various viral pathogens such as potyviruses (e.g. Turnip mosaic virus; Tobacco etch virus (TEV); Lettuce mosaic virus (LMV); Plum pox virus (PPV)), cucumoviruses (Cucumber mosaic virus), luteoviruses (Beet western yellows virus) and others, making it an ideal host for identifying genes that underlie susceptibility to viral infection (Carr & Whitham, 2007). Because of the significant conservation of these host factors across species, genera and even taxonomic families, information on plant susceptibility factors for an Arabidopsis-virus pathosystem may be applied to screen for, or to create, resistance against related viruses in crop species (Gómez et al., 2009; Pavan et al., 2010). Recessive resistance seems to be more frequent for plant–potyvirus pathosystems (40% of the resistances identified to date) than for other viral groups or for other plant pathogens (Fraser, 1990; Truniger & Aranda, 2009). Thus, extensive screening of susceptibility genes in model plants such as Arabidopsis offers the potential to develop resistances in a wide range of crops to which potyviruses pose a significant threat.
The goal of the present study was to identify host factors that influence the susceptibility to PPV. In Arabidopsis, PPV inoculation results in a wide range of phenotypic outcomes, from full susceptibility to complete resistance (Decroocq et al., 2006). To identify PPV susceptibility genes (and thus recessive resistance candidate genes), we conducted a resistance screen in two core collections of 24 and 20 Arabidopsis accessions (McKhann et al., 2004; Clark et al., 2007). Heterozygous F1 and segregating F2 populations were used to select and map host determinants linked to recessive resistance. Genetic analysis and linkage mapping in biparental and multiparental populations confirmed the existence of new recessive resistance genes that do not colocalize with already known susceptibility factors. In a second step, genome-wide association mapping was used to refine the position of candidate loci. By combining results from linkage mapping in F2 and recombinant inbred line (RIL) crosses with association analysis, several promising true associations were identified, including a major locus closely linked to restriction of PPV long-distance movement. This study illustrates the potential of genome-wide association scans complemented by traditional mapping methods as a tool for dissecting the genetics of plant–virus interactions. It also allowed us to identify the first susceptibility gene(s) involved in viral long-distance movement.
Materials and Methods
Arabidopsis thaliana (L.) Heynh accessions from the 24-core collection were provided by the VNAT INRA facility (http://dbsgap.versailles.inra.fr/vnat/). Other accessions were purchased from the Nottingham Arabidopsis Stock Centre (http://nasc.life.nott.ac.uk/) (Clark et al., 2007). The accessions used are described in Table 1 and in the Supporting Information, Table S2. F1 and F2 populations from a cross between Ler (N8581) or Col-0 (N1092) and RRS-7 (N22688), Ts-1 (N22692) or St-0 (62AV) were obtained in the laboratory. Heterozygosity of the F1 populations was checked with the MSAT2.5 simple sequence repeat (SSR) marker (see http://www.inra.fr/internet/Produits/vast/msat.php). F2 populations were obtained by self-pollinating single F1, heterozygote plants. The RIL population derived from the cross between Ts-5 (N22648) and MZ-0 as well as the Multiparent Advanced Generation Inter-Cross (MAGIC) recombinant population were also obtained from NASC (http://nasc.nott.ac.uk/).
Table 1. Identification of recessive, allelic resistance mechanism(s) of Plum pox virus (PPV) systemic restriction in F1 Arabidopsis thaliana populations
Crossings (parent 1 × parent 2) | Number of plants tested | Observed segregation ratio (S : R)
M353, M336, M53, M338, M370 and M449 correspond to Multiparent Advanced Generation Inter-Cross (MAGIC) recombinant lines resistant to PPV infection. They are numbered following the NASC nomenclature (see http://nasc.life.nott.ac.uk/CollectionInfo?id=112). Haplotypes of the MAGIC RILs at the sha3 locus are indicated as follows: HI, Hi-0 haplotype; Sf, Sf-2 haplotype; Rec, recombinant between the Hi-0 and Sf-2 haplotypes.
Plants were grown in a BL-3 containment glasshouse under temperature- and humidity-controlled conditions (20°C and relative humidity of 60%). Six plants per accession of each core collection were phenotyped following PPV agroinoculation. Experiments were repeated at least three times.
Viral clones used in this study are all derived from the PPV-R isolate, which belongs to the PPV-Dideron strain (Riechmann et al., 1990). Construction of pICPPVnkGUS and pBINPPVnkGFP, which contain the full-length nucleotide sequence of PPV-R coupled with the β-glucuronidase (GUS) or green fluorescent protein (GFP) coding sequence, has been described in Fernández-Fernández et al. (2001) and Jimenez et al. (2006), respectively.
PPV resistance phenotyping
The pBINPPVnkGFP viral clone was agroinoculated on the rosette leaves at 4 wk after sowing. Virus infection was scored at 21 d postinoculation (dpi) in noninoculated tissues (floral stems or newly developed rosette leaves). Protocols for Agrobacterium preparation and PPV agro- and biolistic inoculations have been described elsewhere (Decroocq et al., 2006; Jimenez et al., 2006). The agroinoculation procedure was carried out starting from Agrobacterium C58C1 cultures (OD600 = 0.6) induced with acetosyringone. Agrobacterium pBINPPVnkGFP-transformed cells were applied with a toothpick to three Arabidopsis rosette leaves. Direct particle bombardment inoculation was performed with the pICPPVnkGUS clone using a handheld device (Bio-Rad) and starting from 0.1 μg of infectious DNA per shot.
At 21 dpi, viral accumulation was estimated for each individual plant by double antibody sandwich enzyme-linked immunosorbent assay (DAS-ELISA) (Sicard et al., 2008). Optical densities (ODs) were normalized using the PPVnkGFP-infected Nicotiana benthamiana positive control deposited on every ELISA plate of an assay. Quantitative data were expressed relative to the value of the PPVnkGFP-infected N. benthamiana, which was set at 100. In the case of RILs, the final viral accumulation value is the average of normalized measurements from all PPV-inoculated replicates of each RIL.
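The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function names and the OD readings are hypothetical, and it assumes each sample is paired with the positive-control reading from its own ELISA plate.

```python
from statistics import mean

def normalize_od(sample_od, positive_control_od):
    """Express an ELISA OD relative to the plate's positive control (set at 100)."""
    if positive_control_od <= 0:
        raise ValueError("positive control OD must be positive")
    return 100.0 * sample_od / positive_control_od

def ril_accumulation(replicates):
    """Average the normalized values over all inoculated replicates of one RIL.

    `replicates` is a list of (sample_od, positive_control_od) pairs, each
    sample being paired with the control read on the same plate.
    """
    return mean(normalize_od(od, ctrl) for od, ctrl in replicates)

# Hypothetical readings for illustration only (not data from the study).
example_replicates = [(0.40, 1.60), (0.30, 1.50), (0.45, 1.80)]
ril_value = ril_accumulation(example_replicates)
```

Normalizing per plate before averaging keeps plate-to-plate differences in signal intensity from inflating or deflating the mean of a line.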
Fluorescence detection of GFP-tagged PPV is described in Decroocq et al. (2006) and histochemical GUS assay was carried out according to the method of Jefferson et al. (1987). Leaf and flower tissues were incubated in reaction buffer containing 50 mM NaH2PO4 (pH 7), 0.01% Tween 20, 10 mM Na2EDTA and 0.3% (w/v) 5-bromo-4-chloro-3-indolyl D-glucuronic acid. Explants were placed at 37°C for 24 h and then treated with 70% ethanol to intensify the blue staining before observation under a Leica stereomicroscope.
Construction of parental linkage maps and mapping of the genetic determinants in biparental populations
From the hundreds of PCR markers available on the TAIR website (www.arabidopsis.org), we selected 100 that are distributed throughout the Arabidopsis genome. They were then tested on a set of Arabidopsis parental accessions including Ler, RRS-7, Ts-1 and St-0. New SSR markers were also developed in the laboratory (P. Cosson et al., unpublished); they are named BSATX.YY, X being the chromosome and YY its relative position on the physical map (see https://www6.bordeaux-aquitaine.inra.fr/bfp/Recherche/Equipe-de-Virologie-vegetale as well as Notes S1 for the table of primers and the Arabidopsis linkage maps).
Linkage maps were generated with the Kosambi mapping function in JoinMap software (Van Ooijen & Voorrips, 2001). A logarithm of odds (LOD) score of 5 was used to assign markers to linkage groups. In a few cases, a decrease of the LOD threshold to 3 was necessary to insert all markers in the map. Alternatively, they were added manually, following their expected position as depicted on the Arabidopsis physical map (http://www.arabidopsis.org/). Linkage maps are displayed in the mapping supplemental data. A first map based on 240 F2 individuals was produced and used for QTL linkage mapping. Based on this first information, the regions of chromosomes 1 and 3 where QTLs for resistance to PPV infection were identified were specifically targeted with five to eight extra molecular markers. These markers were used in F2 populations of at least 400 individuals in order to carry out fine-mapping of the resistance loci.
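For readers unfamiliar with the Kosambi mapping function, the short Python sketch below shows the standard textbook conversion between a recombination fraction and a map distance. It is not JoinMap's implementation, and the function names are illustrative.

```python
import math

def kosambi_cm(r):
    """Map distance in centimorgans from a recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # Kosambi: d (Morgans) = 1/4 * ln((1 + 2r) / (1 - 2r)); times 100 for cM.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def kosambi_r(d_cm):
    """Inverse function: recombination fraction from a Kosambi distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)

# A recombination fraction of 0.10 corresponds to slightly more than 10 cM.
distance = kosambi_cm(0.10)
```

Unlike a simple r × 100 conversion, the Kosambi function accounts for undetected double crossovers under partial interference, so distances grow faster than r as r approaches 0.5.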
For the Ts-5 × MZ-0 recombinant lines, a set of 84 markers was previously scored as shown on http://www.jic.ac.uk/staff/ianbancroft/arabidopsis-populations.htm#TJ or in Notes S1. They served to map Ts-5 PPV resistance determinant(s). Fifty of the original set of 92 Ts-5 × Mz-0 RILs were delivered by the NASC and later tested in replicates in a four-block random design. A PPV-resistant control, the eIFiso4E loss-of-function mutant, named E6 (Duprat et al., 2002), together with the two parents, Ts-5 and Mz-0, were added to the study.
Quantitative trait locus analysis on F2 and recombinant biparental populations was done as described in Sicard et al. (2008), following the multiple QTL mapping method after automatic cofactor selection. A permutation test was performed to calculate the 5% limit for statistical significance both at chromosome level and genome-wide. The percentage of the phenotypic variation explained by the QTL corresponds to the regression value R2 taken at the peak LOD score of the QTL in the MapQTL6 software (www.kyazma.nl). Epistatic interactions and additive QTL effects were tested by running the linkage mapping data through the QTLNetwork-2.1 software (http://ibi.zju.edu.cn/software/qtlnetwork/) (Yang et al., 2008) as described in Brachi et al. (2010).
Quantitative trait locus meta-analysis was performed using the BioMercator V3 software (under development, O. Sosnowski & J. Joest, unpublished) (Arcade et al., 2004) and was conducted in three steps as described in Marandel et al. (2009). The QTL meta-analysis algorithm implemented by Goffinet & Gerber (2000) was used to determine the best-fitting model, including the number of meta-QTLs effectively underlying the observed QTLs and their respective position and confidence intervals.
Mapping of the genetic determinants in multiparental population
Four hundred and thirty-five of the original set of 527 MAGIC recombinant lines were tested in triplicate, following a complete random three-block design. The 19 founders of the MAGIC lines were added, together with the PPV-resistant E6 mutant. The heritability among MAGIC lines was estimated by fitting a random-effects model (Kover et al., 2009). ANOVA allowed the specific effect of ‘genotype’ and the broad-sense heritability to be determined, which is the ratio between the genetic variance and the total phenotypic variance and is calculated using the formula $H^2 = \sigma_G^2 / (\sigma_G^2 + \sigma_E^2 / n)$, where $\sigma_G^2$ is the genetic variance, $\sigma_E^2$ is the environmental variance and n is the number of replicates. Procedures for QTL analysis are described in Kover et al. (2009). Briefly, the first step of the analysis is based on a probabilistic reconstruction of the haplotype mosaic of each MAGIC line. This allows computation of the probability that a given founder contributed a given locus in a given MAGIC line, and consequently gives the contribution of each founder to the detected QTLs. The genome is then scanned for the presence of QTLs in each single nucleotide polymorphism (SNP) interval using a fixed effects model. We used the HAPPY R package to map QTLs (http://mus.well.ox.ac.uk/magic/).
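As an illustration of how a broad-sense heritability of this kind can be obtained, the sketch below estimates variance components from a balanced one-way ANOVA and applies the usual line-mean form H² = σ²G / (σ²G + σ²E / n). This is an assumed simplification: it ignores the block effect, is not the random-effects model fitted in the study, and uses hypothetical data and names.

```python
from statistics import mean

def broad_sense_heritability(lines):
    """Estimate H^2 on a line-mean basis from a balanced one-way layout.

    `lines` maps each line name to its list of n replicate values.
    """
    rep_counts = {len(values) for values in lines.values()}
    if len(rep_counts) != 1:
        raise ValueError("a balanced design is assumed")
    n = rep_counts.pop()
    g = len(lines)
    if n < 2 or g < 2:
        raise ValueError("need at least two lines and two replicates")
    grand = mean(x for values in lines.values() for x in values)
    # Mean squares between and within lines.
    ms_between = n * sum((mean(v) - grand) ** 2 for v in lines.values()) / (g - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in lines.values() for x in v) / (g * (n - 1))
    var_e = ms_within                                  # environmental variance
    var_g = max((ms_between - ms_within) / n, 0.0)     # genetic variance
    return var_g / (var_g + var_e / n)

# Hypothetical normalized accumulation values for three lines, three replicates.
example = {"A": [10, 12, 11], "B": [50, 52, 51], "C": [90, 88, 92]}
h2 = broad_sense_heritability(example)
```

In a balanced design this estimate reduces to 1 − MS_within / MS_between, which makes it easy to check by hand.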
One hundred and forty-seven accessions drawn from Atwell et al. (2010) (Table S2) were tested following a complete random four-block design. Quantitative data for viral accumulation were generated as described earlier. For binary data and to avoid intermediate scores, accessions were rated as susceptible (and assigned a value equal to 1) when the OD value reached at least three times the mean OD value of the PPV-resistant E6 control (Decroocq et al., 2006). Otherwise, accessions were rated as resistant (value of 0).
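The binary scoring rule can be written down directly. The following Python sketch applies the threshold of three times the mean OD of the resistant E6 control; the function name and the example readings are hypothetical.

```python
from statistics import mean

def score_accession(od, resistant_control_ods, factor=3.0):
    """Return 1 (susceptible) if od reaches factor x the mean control OD, else 0."""
    threshold = factor * mean(resistant_control_ods)
    return 1 if od >= threshold else 0

# Hypothetical E6 control readings and two test accessions.
e6_ods = [0.10, 0.12, 0.08]
susceptible_call = score_accession(0.35, e6_ods)
resistant_call = score_accession(0.20, e6_ods)
```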
Genotyping data (based on SNP markers) are from Atwell et al. (2010) and are publicly available. Four different types of analyses were performed to test for the association between genotypes and phenotypes. Fisher's exact tests were implemented on binary data. Wilcoxon rank-sum tests were used on quantitative data. In addition, a regression analysis was implemented using the Plink software (Purcell et al., 2007) (http://pngu.mgh.harvard.edu/purcell/plink/). These three analyses are expected to yield false positives as a result of population structure. Thus, we also used the EMMA method (Kang et al., 2008), based on a mixed model that accounts for confounding as a result of population structure. The matrix of genotype similarity was calculated on a subset of 1000 SNPs regularly sampled from the set of 214 553 filtered SNPs using a custom R program. The same set of 1000 SNPs was used to calculate the genetic relationship between the 147 accessions and to build a UPGMA tree using the DARwin software (Perrier & Jacquemoud-Collet, 2006). Linkage disequilibrium analysis was performed using TASSEL (Bradbury et al., 2007).
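To make the single-SNP test on binary data concrete, here is a minimal standard-library sketch of a two-sided Fisher's exact test on a 2 × 2 table (allele × phenotype). It is a generic textbook implementation, not the code used in the study, and the table layout is an assumption made for illustration.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Illustrative layout: rows are the two alleles of a SNP, columns are
    susceptible and resistant accessions.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1.0 + 1e-9))
    return min(1.0, p_value)

# Hypothetical counts: allele 1 in 5 susceptible and 0 resistant accessions,
# allele 2 in 0 susceptible and 5 resistant accessions.
p_perfect = fisher_exact_two_sided(5, 0, 0, 5)
```

Because this test ignores relatedness between accessions, a mixed-model method such as EMMA is the natural complement, as noted in the text.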
An in silico search was done on the TAIR8 Col-0 genome annotation for candidate genes linked to the SNP linkage disequilibrium (LD) bin(s). Ler-predicted amino acid sequences were retrieved from the MAGIC founders’ sequence database (http://mus.well.ox.ac.uk/19genomes/) (Gan et al., 2011).
Variation in response to PPV infection across accessions of two Arabidopsis core collections
We evaluated resistance to PPV, over three replicate experiments, with the 24-accession VNAT core collection (McKhann et al., 2004) and the 20-accession core collection (Clark et al., 2007). The pBINPPVnkGFP PPV infectious cDNA clone, introduced in Agrobacterium tumefaciens, was used to agroinoculate the plants because its original PPV parental form, PPV-Rankovic, is able to overcome the dominant restricted TEV movement (RTM) resistance, which restricts PPV long-distance movement in Arabidopsis (Decroocq et al., 2006, 2009). Arabidopsis accessions showed a range of phenotypic variation from full susceptibility to resistance to PPV (see Table S1).
The PPV-resistant accessions St-0, RRS-7 and Ts-1 were crossed with the PPV-susceptible parent Ler. The F1 populations were then challenged by agroinfection with the pBINPPVnkGFP viral clone. Nearly all individuals in the F1 generation of the St-0 × Ler, Ler × RRS-7 and Ler × Ts-1 crosses supported PPV long-distance movement (hereafter called PPV-LDM) (Table 1), demonstrating that the resistance mechanism controlling PPV-LDM in these accessions is recessive.
In consequence, St-0 (62AV) from the VNAT, and RRS-7 (N22688) and Ts-1 (N22692) from the other collection (Clark et al., 2007) were selected for further phenotypic observations. These accessions were again inoculated, either via agroinoculation using pBINPPVnkGFP or by biolistics using pICPPVnkGUS. PPV accumulation in inoculated leaves and in uninoculated floral tissues was scored at 9 and 21 dpi, respectively (Fig. 1). Ler plants served as a positive, susceptible control while plants of the E6, AteIF(iso)4E knockout line served as a negative, fully resistant control (Duprat et al., 2002; Decroocq et al., 2006). While PPV was able to replicate and accumulate both locally and systemically in the Ler control plants, a clear restriction of PPV long-distance spread and systemic accumulation was demonstrated by both GFP fluorescence observation (Fig. 1a) and GUS histochemical staining (Fig. 1b) in the St-0, RRS-7 and Ts-1 accessions. As expected, no PPV infection could be detected, locally or systemically, in E6 (Decroocq et al., 2006). Since pBINPPVnkGFP and pICPPVnkGUS behaved similarly, the impairment in PPV-LDM is not related to resistance to Agrobacterium-mediated inoculation.
Linkage mapping of the recessive resistance trait(s) in biparental populations
Restriction of PPV-LDM was evaluated in five F2 biparental populations comprising 240–400 individuals each (see Table 2) following agroinoculation with the pBINPPVnkGFP viral clone. It segregated as a trait controlled by one or two recessive genes, with the proportion of PPV-negative F2 plants ranging from 24.89 to 43.10%. Five distinct linkage maps were constructed from SSR and simple sequence length polymorphism (SSLP) molecular markers available on the TAIR website (http://www.arabidopsis.org/) or developed in the laboratory (see https://www6.bordeaux-aquitaine.inra.fr/bfp/Recherche/Equipe-de-Virologie-vegetale and Notes S1). QTL analysis was performed as described in the 'Materials and Methods' section, using MapQTL6 and QTLNetwork software. Significant loci contributing to PPV-LDM restriction are presented in Table 2. When crossed with Ler, one to three loci (designated sha for resistance to sharka, followed by the linkage group number) were detected, the most significant being sha3 (LOD scores ranging from 15.86 to 28.61, with an R2 reaching 66.4%). Interestingly, when RRS-7 and Ts-1 were crossed with the PPV-susceptible parent Col-0, only the effect of the sha3 locus was observed, with LOD scores of 46 and 22.6, respectively, and a maximum R2 of 42.5% (Table 2). Either Col-0 and the PPV-resistant accessions Ts-1 and RRS-7 share similar alleles at the sha1 and sha5 loci, or the effect of these loci is too weak to be detected in a Col-0 susceptible background.
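The segregation reasoning can be illustrated with a chi-square goodness-of-fit test. The sketch below compares an observed count of resistant F2 plants with two expected fractions: 1/4 for a single recessive gene and 7/16 for two independent recessive genes, either of which is sufficient for resistance. The second model is an assumption introduced here for illustration, and the counts are hypothetical.

```python
import math

def chi2_gof(n_resistant, n_total, expected_resistant_fraction):
    """Chi-square goodness-of-fit (1 df) for a resistant/susceptible split.

    Returns (chi2, p_value).
    """
    exp_r = n_total * expected_resistant_fraction
    exp_s = n_total - exp_r
    obs_s = n_total - n_resistant
    chi2 = (n_resistant - exp_r) ** 2 / exp_r + (obs_s - exp_s) ** 2 / exp_s
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical F2 population: 60 resistant plants out of 240.
one_gene = chi2_gof(60, 240, 1 / 4)    # 3 S : 1 R expectation
two_genes = chi2_gof(60, 240, 7 / 16)  # 9 S : 7 R expectation (assumed model)
```

A population near 25% resistant plants fits the one-gene model, whereas a population near 43% is closer to the 7/16 expectation of the assumed two-gene model.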
Table 2. Identification of three Arabidopsis thaliana quantitative trait loci (QTLs) involved in recessive restriction of Plum pox virus (PPV) long-distance movement in bi- and multiparental populations
Number of markers used to build the core genetic map (simple sequence repeat or simple sequence length polymorphism).
Significant after 1000 permutations and at 95% statistical confidence.
Detected only by interval mapping (IM).
Fine-mapping of the sha3 and sha1 loci was performed in those populations, with five to eight extra molecular markers.
in four repeats for the Ts-5 × JIC240#14 recombinant inbred lines (RILs) and triplicates for the Multiparent Advanced Generation Inter-Cross (MAGIC) lines.
logP is equivalent to the −log10(P-value). nd, not determined.
The sha3 locus is represented in bold. Bp, base pairs; maximum LOD, score associated with the peak of the logarithm of odds (LOD) plot using multiple QTL mapping; R2, proportion of the phenotypic variation explained by the peak of the LOD plot using multiple QTL mapping (explained variance); Nb, number.
These results suggest a major effect of the sha3 locus in the control of PPV-LDM in Arabidopsis in the three resistant accessions analyzed. No epistatic effect between the different QTLs was detected at significance levels of 1–5%.
In parallel, an F8 RIL population derived from a cross between Ts-5 and Mz-0 was challenged using the same conditions as described earlier. Only 50 of the 94 Ts-5 × Mz-0 RILs were obtained from the NASC and were tested in four replicates, following a full random block design. Twenty-one days after inoculation, PPV was scored by ELISA in the noninoculated tissues. Before the QTL analysis, a linkage map was generated from the Ts-5 × Mz-0 locus genotype file (http://www.jic.ac.uk/staff/ian-bancroft/arabidopsis-populations.htm#TJ). Composite interval mapping scans detected only the sha3 locus in the Ts-5 × Mz-0 recombinant population (Table 2). Genotyping the Ts-5 × Mz-0 RILs with an additional set of eight CAPS markers (Hou et al., 2010) (see http://amp.genomics.org.cn/) spanning the sha3 region allowed the fine-mapping of this single major-effect locus down to an 875 kb interval, delimited by markers AL390921–8963 (position 21 027 402 bp) and AL356014–9336 (21 903 190 bp) (data not shown).
The respective intervals of the sha3 locus were combined in a QTL meta-analysis. It showed that the model that best fits the phenotypic observations is the one-QTL model (AIC 18.4), meaning that a single meta-QTL (named Meta-Sha3 in Fig. 2b) underlies all QTLs detected on LG3.
Linkage mapping of the recessive resistance trait in a multiparental recombinant population
Four hundred and thirty-five of the 527 MAGIC RILs described by Kover et al. (2009) and obtained from NASC were evaluated for restriction of PPV-LDM in a randomized three-block design. The broad-sense heritability of PPV resistance for the MAGIC lines was calculated from the variance analysis (see the 'Materials and Methods' section) and reached 0.89. Each founder accession was tested in parallel to the MAGIC RILs (see Fig. S1). Only two accessions, Hi-0 and Sf-2, appeared fully resistant to pBINPPVnkGFP systemic infection. However, other accessions displayed a continuous distribution of the mean viral accumulation value estimated by ELISA, with six accessions (Kn-0, Edi-0, Bur-0, Ct-2, Wu-0 and Can-0) showing partial resistance to PPV accumulation. This indicates the existence of other genetic factors with significant effects on the degree of virus accumulation in susceptible accessions. A similar variability of PPV accumulation in the Col-0, Cvi-1 and Ler-susceptible accessions has been described previously (Sicard et al., 2008).
A significant block effect was also detected and accounted for either by integrating it in the computation or by analyzing each block separately. We performed a QTL mapping analysis using data from each block separately as well as one using the mean over all blocks. Analysis of the variation in susceptibility to PPV infection identified for each analysis one major QTL, with a maximum of the –log10(P-value) of 33.93 for the mean values. This locus mapped at the bottom of chromosome 3 (see Fig. 2c) and colocalized with the previously identified Meta-sha3 QTL. The genetic interval ranged from 16 666 699 to 23 291 586 bp, with a peak at 21 517 584 bp (Table 2). Interestingly, the same genomic region was identified when using data from each block separately or from the mean values of the three blocks. QTL analysis of MAGIC lines allowed reconstruction of the genome of each line as a mosaic of the founder haplotypes (Kover et al., 2009). Based on this reconstruction, it was possible to determine that the two founders contributing the QTL detected in the MAGIC lines were indeed Hi-0 and Sf-2 (Fig. S2).
To verify that the resistance identified in the MAGIC founders is recessive, Hi-0 and Sf-2, as well as various PPV-resistant MAGIC recombinant lines, were crossed to the susceptible accession Ler. All F1 populations proved susceptible to PPV systemic infection (Table 1), confirming that the sha3-controlled resistance trait is recessive. To test whether the sha3 genetic determinant is the same in RRS-7, Ts-1, St-0, Ts-5, Hi-0 and Sf-2, these accessions were crossed with one another in an allelism test. All resulting F1 populations were entirely resistant (Table 1), confirming that sha3 is allelic in all these accessions.
Genome-wide association mapping of Arabidopsis resistance to PPV long-distance movement
To refine QTL intervals to a candidate gene level, a genome-wide association study (GWAS) in 147 Arabidopsis accessions (Table S2) from wild populations was conducted (see Fig. S3). Two PPV-susceptible controls (namely Col-0 and Ler-0) that have been previously phenotyped under similar conditions (Decroocq et al., 2006) were added to the experiment, together with E6, used as a negative control. For phenotypic assays, a completely randomized design in four independent blocks was used. Two types of data were generated from the phenotypic assays: quantitative virus accumulation data (normalized ELISA OD values) and binary data (1 for susceptible, 0 for resistant).
The broad-sense heritability of PPV resistance for the accessions reached 0.92. A block effect was detected with both quantitative and binary data, which was mainly the result of a difference of means between blocks 1 and 2 on one side and blocks 3 and 4 on the other. We performed association analyses on each block separately and on means (mean of blocks 1 and 2 and mean of blocks 3 and 4). Using the Bonferroni correction (significant −log10(P-value) > 6.63), a very low number of SNPs showed association with resistance to PPV. However, on LG3, a block of SNPs showed significant association in all analyses (Table S3) with a peak at position 21 603 348 identified with both binary and quantitative analyses, including the EMMA analysis which takes population structure into account.
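The Bonferroni threshold quoted above can be reproduced with a one-line calculation. The sketch below assumes a family-wise error rate of 0.05 and the 214 553 filtered SNPs mentioned in the Materials and Methods; the value of alpha is an assumption that happens to give back the reported threshold.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Return the -log10(P) value a test must exceed after Bonferroni correction."""
    return -math.log10(alpha / n_tests)

# Assumed alpha of 0.05 over the 214 553 filtered SNPs.
threshold = bonferroni_threshold(0.05, 214553)
```

With these inputs the threshold rounds to 6.63, in agreement with the value used for the association scans.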
Notably, 15 of the 100 SNPs with the lowest P-values obtained when analyzing binary data were located in the sha3 interval as determined by QTL analysis in the multiparental population (16 666 699–23 291 586 bp; see Table 2). While processing the quantitative data, eight or nine SNPs of the 100 SNPs with the lowest P-values were identified in the same interval with the EMMA and PLINK packages, respectively (Table S3). In all analyses, the peak SNPs included the SNP at position 21 603 348, even after taking into account the population structure (EMMA analysis).
To determine the resolution of this association study, the extent of LD around the best-associated SNP (21 603 348) was investigated. We successively used three sets of SNPs: 460 SNPs covering the sha3 interval, 158 SNPs distributed over 30 kb on the centromeric and telomeric sides of 21 603 348 (see the results in Fig. S4), and 82 SNPs present within 10 kb on either side of 21 603 348. A significant LD (P < 0.0001) around the 21 603 348 SNP, extending roughly from positions 21 592 393 to 21 606 184 (13.79 kb wide), was detected. For simplicity, SNPs that are highly correlated with SNP 21 603 348 are referred to as belonging to a sha3 LD bin and map in an interval of 13.8 kb (Fig. S4).
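One common way to quantify LD between two biallelic SNPs is the squared correlation r². The Python sketch below computes it for homozygous (inbred) accessions coded 0/1; it is a generic illustration with hypothetical genotypes, not a description of the TASSEL procedure used here.

```python
def ld_r2(snp_a, snp_b):
    """Squared correlation r^2 between two biallelic SNPs coded as 0/1 per accession."""
    n = len(snp_a)
    if n == 0 or n != len(snp_b):
        raise ValueError("SNP vectors must be non-empty and of equal length")
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    # Frequency of accessions carrying allele 1 at both SNPs.
    p_ab = sum(1 for x, y in zip(snp_a, snp_b) if x == 1 and y == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom

# Hypothetical genotypes for four accessions.
complete_ld = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
no_ld = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

SNPs whose r² with the peak SNP stays high over a contiguous stretch would form an LD bin of the kind described in the text.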
Outside the sha3 region, several intriguing association signals were found elsewhere in the genome, mostly within the sha5 locus, with the highest evidence of association observed at positions 15 820 863 and 17 196 614 (−log10(P-value) = 4.62 and 5.20 for the binary and quantitative data, respectively, Tables S3a, S3c). Most of the SNPs identified on chromosome 5 are located within or just beyond the sha5 region identified through linkage analysis in the St-0 × Ler (positions 8 665 026–17 061 229) and Ts-1 × Ler (positions 8 428 136–15 021 915) biparental populations. However, while the same locus was detected, GWAS based on quantitative and binary data did not share strictly the same SNPs (Tables S3a,b). A significant enrichment in SNPs associated with restriction of PPV-LDM is observed on chromosome 5 between positions 13 902 662 and 17 196 614 for quantitative data (14 SNPs within the 100 best-associated SNPs after EMMA analysis), and around position 15 820 863 for binary data. Nevertheless, the genetic interval remains large (c. 3.2 Mb) and the effect of this locus in relation to sha3 awaits further study. A larger set of Arabidopsis accessions might also help, in the future, to refine the sha1 and sha5 loci.
In summary, different alleles of the same sha3 gene appear to confer restriction of PPV-LDM in St-0, RRS-7, Ts-1, Ts-5, Sf-2 and Hi-0. Seven new accessions among the 147 that were tested presented a similar phenotype at 21 dpi, but this can be linked either to another sha3 allele or to alternative gene(s) in the same bin region conferring the same movement restriction phenotype. In the initial experiments, Fei-0 presented an intermediate response to PPV infection (Table S1), with no accumulation detected upon mechanical or biolistic inoculation, but with detectable (although extremely weak) accumulation upon agroinoculation. This could be explained by the occurrence of a ‘leaky’ sha3 allele or by the effect of sha3-independent background gene(s). We cannot favor one hypothesis over the other until we map the respective determinant(s), which is in progress for Fei-0, Ei-2, Ca-0 and An-1 (RILs at http://arabidopsis.info/webservices/index.html).
Genetic and geographic origins of PPV resistant accessions
Although GWAS has allowed significant advances for plant pathogen resistance studies (Aranzana et al., 2005; Nemri et al., 2010; Todesco et al., 2010), to date no susceptibility locus for virus infection has been identified using this strategy. LD analysis allowed us to refine the mapping of sha3. It consists of a tightly linked LD bin spanning c. 13.8 kb that includes 10 genes whose polymorphisms define resistance alleles that are found in Arabidopsis accessions issued from several geographic areas. There is no obvious geographic structure to the distribution of sha3-controlled, PPV-resistant accessions. While the Iberian populations (Spain, Portugal) concentrate half of the resistant alleles (Ts-1, Ts-5 and Sf-2), the origin of the other resistant accessions ranges from northern Europe (Sweden for St-0, the Netherlands for Hi-0) to North America (RRS-7).
To understand the origin of the PPV-resistant accessions, we selected a subset of 1000 SNPs regularly sampled from the filtered set of 214 553 SNPs. They were used to build a UPGMA tree (Fig. 3), which revealed two main clusters. The PPV-resistant phenotypes are spread over two clusters and there is no strong correlation between resistance and the genetic origin of individuals. Hi-0, RRS-7 and St-0 belong to the first genetic cluster, as depicted in Fig. 3, while the Ts accessions belong to group 3. Sf-2 was not included in this analysis because it was not genotyped with the set of 250 K SNPs. Genetic similarity analysis also revealed new possible resistant alleles in accessions ranging from all over Europe to the USA (Fig. 3). While one is related to the previously identified resistant accessions (Se-0 groups with the Ts accessions), some are clearly distinct from the original founders (Ca-0, An-1, Pu2-7, Ra-0, Fei-0, Ei-2) or even belong to a new genetic similarity group (An-1 and Ra-0 in group 2, Fig. 3).
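The two steps described above, regular thinning of the SNP set and UPGMA clustering, can be sketched as follows. This is a minimal illustration and not the software used for Fig. 3: the distance measure (proportion of mismatching alleles) is an assumption, since the text does not specify one, and the accession names and genotype strings are hypothetical toy data.

```python
from itertools import combinations

def sample_regular(n_total, n_keep):
    """Indices of n_keep markers spaced evenly across n_total markers."""
    step = n_total / n_keep
    return [int(i * step) for i in range(n_keep)]

def mismatch(a, b):
    """Proportion of positions at which two genotypes differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def upgma(names, genotypes):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (subtree, size)
    dist = {
        frozenset((i, j)): mismatch(genotypes[i], genotypes[j])
        for i, j in combinations(range(len(names)), 2)
    }
    next_id = len(names)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)          # closest pair of clusters
        i, j = sorted(pair)
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        del dist[pair]
        for k in clusters:                      # size-weighted average distance
            d_ik = dist.pop(frozenset((i, k)))
            d_jk = dist.pop(frozenset((j, k)))
            dist[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]

# HYPOTHETICAL toy data: four accessions, ten biallelic SNPs coded 0/1.
names = ["acc1", "acc2", "acc3", "acc4"]
full = ["0000000000", "0000000010", "1010100000", "1010100010"]

keep = sample_regular(len(full[0]), 5)          # every second SNP here
thinned = ["".join(g[i] for i in keep) for g in full]
tree = upgma(names, thinned)
print(tree)
```

For the real data, the same thinning call would be `sample_regular(214_553, 1000)`, which keeps roughly one SNP in every 215.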
Candidate genes for the control of sha3 resistance trait
Focusing on the sha3 QTL interval, we retained SNPs that both showed a highly significant association signal and mapped within the sha3 QTL interval identified in the multiparental population (Table S3). The SNP coordinates were assigned to functional and structural positions using the TAIR8 annotation (http://www.arabidopsis.org/).
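Assigning a SNP coordinate to a structural position amounts to an interval lookup against the gene models. The sketch below illustrates the idea with a binary search over sorted, non-overlapping gene intervals; the gene names and coordinates are hypothetical and are not taken from TAIR8, and the function name is ours.

```python
from bisect import bisect_right

# HYPOTHETICAL gene models: (start, end, name), sorted by start, non-overlapping.
GENES = [
    (1_000, 2_500, "geneA"),
    (4_000, 5_200, "geneB"),
    (7_500, 9_000, "geneC"),
]
STARTS = [start for start, _, _ in GENES]

def annotate(pos):
    """Describe where a SNP position falls relative to the gene models."""
    i = bisect_right(STARTS, pos) - 1           # last gene starting at or before pos
    if i >= 0 and GENES[i][0] <= pos <= GENES[i][1]:
        return f"within {GENES[i][2]}"
    left = GENES[i][2] if i >= 0 else None
    right = GENES[i + 1][2] if i + 1 < len(GENES) else None
    if left and right:
        return f"between {left} and {right}"
    return f"before {right}" if right else f"after {left}"

for snp in (1_500, 3_000, 9_500):
    print(snp, annotate(snp))
```

A fuller version would also use strand and exon/intron coordinates in order to report entries such as "fourth exon" or "51 bp downstream", as listed in Table 3.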
Based on the genome-wide association scan, we observed the most significant association signal at position 21 603 348 (Table S3a–c, Fig. 2e), which lies in the fourth exon of the RTM3 gene (At3g58350) (Table 3). The RTM3 product, which is characterized by a meprin and TRAF (MATH) homology domain, is hypothesized to be part of a multiprotein complex that blocks the long-distance movement of several potyviruses (Whitham et al., 1999; Cosson et al., 2010), including PPV (Decroocq et al., 2006). However, several observations argue against the involvement of RTM3 in a susceptibility mechanism controlling PPV-LDM. First, the resistance conferred by RTM3 in the Col-3 accession is dominant (Mahajan et al., 1998; Whitham et al., 1999), while sha3 controls a recessive resistance mechanism. Moreover, the PPV-R isolate used in this study, as well as all its derivatives, including pBINPPVnkGFP and pICPPVnkGUS, overcomes the RTM resistance mechanism (Decroocq et al., 2006, 2009), so that Columbia accessions (Col-0, Col-1 and Col-3) are fully susceptible to PPV-R (and its derivatives). In addition, the RTM3-/- mutants developed by Cosson et al. (2010) were fully susceptible to PPV infection when challenged with the pBINPPVnkGFP viral clone (Fig. S5). Since homozygous RTM3-/- mutants remain susceptible to PPV infection, this result rules out the hypothesis that RTM3 could be the SHA3 susceptibility factor needed for successful PPV-LDM in Arabidopsis.
Table 3. Candidate genes associated with the most significant single nucleotide polymorphisms (SNPs) in the sha3 region
SNP (position in bp)
Gene function (BEST Arabidopsis thaliana protein match)
RTM3, MATH-TRAF and coiled-coil protein
Noncoding region between AT3G58330 and AT3G58340
MATH-TRAF and coiled-coil proteins, RTM3-like
3p region, 51 bp downstream AT3G58320
Coiled-coil RTM3-like protein
Coiled-coil RTM3-like protein
Similar to unknown protein [Arabidopsis thaliana] (TAIR:AT1G48460.1)
5p region, 38 bp upstream AT3G57700
Putative protein kinase
3p region, 220 bp downstream AT3G56380
Identical to Two-component response regulator ARR17
3p region, 263 bp downstream AT3G57760
Wall-associated protein kinase 1
Leucine-rich repeat protein kinase
Leucine-rich repeat protein kinase
Nucleolar protein gar2-related protein
Between AT3G58390 and AT3G58400
Between genes coding for eukaryotic release factor 1 and MATH-TRAF domain-containing protein
Similar to S locus F-box-related protein
Between AT3G49350 and AT3G49360
Between genes coding for RabGAP/TBC domain-containing protein and glucosamine/galactosamine-6-phosphate isomerase-related protein
Still, the P-values of four other SNPs in this region (in bold in Table 3) indicate that SNPs situated near the RTM3 gene are strongly associated with restricted PPV movement (Table S3, Fig. 2e). Three intergenic SNPs (21 599 048, 21 596 493 and 21 613 164) and one exonic SNP (21 597 258) were found to be associated with restriction of PPV-LDM (Table 3). They all map within or close to gene copies containing an RTM3-type MATH domain. This result, together with the LD analysis (ranging from 21 592 393 to 21 606 184), defines an extended sha3 LD bin of c. 20 kb (from At3g58280 to At3g58400; see Fig. S6) that encompasses a cluster of seven MATH domain-containing genes (similar to the RTM3 MATH domain), five non-MATH domain-containing genes (phospholipases) and one pseudogene (Table S4).
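The relationship between the LD interval and the associated SNPs can be checked with simple arithmetic on the coordinates quoted above. The short sketch below uses only positions given in the text; the variable names are ours.

```python
# Coordinates (bp, chromosome 3) as quoted in the text.
LD_START, LD_END = 21_592_393, 21_606_184
SNPS = [21_603_348, 21_599_048, 21_596_493, 21_613_164, 21_597_258]

span_bp = LD_END - LD_START
inside = [p for p in SNPS if LD_START <= p <= LD_END]
outside = [p for p in SNPS if not LD_START <= p <= LD_END]

print(f"LD interval: {span_bp} bp (c. {span_bp / 1000:.1f} kb)")   # 13791 bp, c. 13.8 kb
print(f"SNPs inside the LD interval: {inside}")
for p in outside:
    print(f"SNP {p} lies {p - LD_END} bp beyond the interval")      # 21613164: 6980 bp
```

The interval spans 13 791 bp, in agreement with the c. 13.8 kb LD bin. Four of the five SNPs fall inside it, while SNP 21 613 164 lies 6980 bp beyond its distal end, which is consistent with the bin being extended to c. 20 kb once the association signals are taken into account.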
We retrieved the entire genomic sequences of the 19 MAGIC parents (Gan et al., 2011), including the PPV-resistant founders (Hi-0 and Sf-2), as well as the corresponding coding sequences (CDS) of each of the genes in the extended sha3 LD bin (from AT3G58280 to AT3G58400) (http://mus.well.ox.ac.uk/19genomes/). Genomic and coding sequences were aligned with ClustalW in the BioEdit package. Unfortunately, the relatively high sequence variability (up to 14 substitutions and one indel in At3g58350, for example) allowed us to eliminate only one candidate gene (At3g58300), which is identical between PPV-susceptible and PPV-resistant accessions.
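The elimination logic applied to the aligned sequences can be sketched as a search for alignment columns at which resistant and susceptible accessions share no allele. A gene without any such column, for example one that is identical across all accessions, cannot by itself account for the phenotype. The sketch below is a simplified illustration under that assumption; the line names and sequences are hypothetical toy data, and alignment gaps ('-') are simply treated as an extra character state.

```python
def diagnostic_sites(aligned, resistant):
    """Return alignment columns where the resistant and susceptible groups
    share no allele. `aligned` maps line name -> aligned sequence (equal
    lengths); `resistant` is the set of resistant line names."""
    res = [seq for name, seq in aligned.items() if name in resistant]
    sus = [seq for name, seq in aligned.items() if name not in resistant]
    sites = []
    for col in range(len(res[0])):
        res_alleles = {seq[col] for seq in res}
        sus_alleles = {seq[col] for seq in sus}
        if res_alleles.isdisjoint(sus_alleles):
            sites.append(col)
    return sites

RESISTANT = {"res1", "res2"}                      # HYPOTHETICAL line names

# HYPOTHETICAL gene identical in all lines: no diagnostic site, can be set aside.
gene_identical = {"res1": "ATGCA", "res2": "ATGCA", "sus1": "ATGCA", "sus2": "ATGCA"}
# HYPOTHETICAL variable gene: column 0 varies within both groups,
# column 3 separates the two phenotypic groups.
gene_variable = {"res1": "ATGCA", "res2": "TTGCA", "sus1": "ATGAA", "sus2": "TTGAA"}

print(diagnostic_sites(gene_identical, RESISTANT))
print(diagnostic_sites(gene_variable, RESISTANT))
```

Since the text leaves open that the resistant accessions may carry different mutations in the same gene, the absence of a shared diagnostic column in a variable gene would not be sufficient grounds to exclude it; only an identical gene can be discarded in this way.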
In this study, we report the identification in A. thaliana of genomic regions associated with susceptibility to PPV, and more particularly with virus long-distance movement. To fine-map the host determinant(s), we combined linkage mapping in F2 and RIL crosses with association analysis. Each of the bi- and multiparental linkage mapping experiments detected a major and recurrent locus, named sha3, as well as a further set of unique loci that depended on the genetic background. In a single cross, not all the loci affecting a specific complex phenotype are expected to be detected, since a locus can exert allele-specific effects only in crosses derived from two accessions carrying different alleles. Allelism tests indicate that the restriction of PPV-LDM shared by the St-0, RRS-7, Ts-1 and Ts-5 accessions is controlled by the same gene, within the sha3 locus. However, we cannot rule out the possibility that, while these accessions share the same genetic determinant, the mutations affecting the SHA3 susceptibility host factor differ between them. Linkage mapping placed the sha3 locus at the bottom of linkage group 3, between the F9D24.1 and T20N10.2 markers (c. 265 kb). Interestingly, the mapping accuracy of the sha3 QTL was not improved in the multiparental population when compared with traditional biparental mapping populations such as Ler × RRS-7, but the use of the MAGIC lines confirmed the sha3 QTL with a relatively low mapping effort and identified two new PPV-resistant accessions.
A GWAS was conducted on 147 Arabidopsis accessions sampled among unrelated, wild populations. It resulted in a significant increase in mapping resolution, from QTL intervals to candidate genes. Overall, the GWAS confirmed putative loci and fine-mapped the major sha3 locus down to 13 candidate genes. The candidates identified here point to an overlap between host factors that inhibit PPV-LDM in a dominant fashion (RTM3) (Decroocq et al., 2006; Cosson et al., 2010) and other factor(s) that participate in PPV-LDM (SHA3) and whose variation restricts it in a recessive fashion. Indeed, the predominance of the peak over RTM3-like genes in the genome-wide scan indicates that allelic variation at this locus is the major determinant of global variation in PPV long-distance movement.
The comparison of genetic similarity in the set of 147 accessions with natural variation in PPV-LDM did not show strong geographic or genetic relationships among resistant accessions, although several of them originated from the Iberian Peninsula. It is notable that none of the Caucasian and Mid-Asian Arabidopsis ecotypes tested in the present study is resistant to PPV infection, although these ecotypes are also markedly underrepresented in our sample. Schmid et al. (2006) showed previously that accessions originating from the Iberian Peninsula and from Central Asia constitute distinct and genetically diverged clusters. They also showed that Central Asian accessions display a low degree of polymorphism, which could explain the absence of mutation(s) in the sha3 locus. Cao et al. (2011) confirmed a clear differentiation between European and Central Asian populations, thus possibly explaining the prevalence of PPV resistance alleles in the former but not in the latter. More interestingly, François et al. (2008) inferred a major east-to-west migration wave of A. thaliana in Europe, which is consistent with a natural, postglacial recolonization from an Eastern glacial refugium, as suggested by Schmid et al. (2006). Similarly, Jørgensen & Mauricio (2004) placed the origin of North American A. thaliana populations with the group of weeds from Europe that invaded North America at the time of European colonization. Therefore, either mutation(s) in the sha3 locus may not be rare in the ancestral, Eastern population, explaining their occurrence in distinct European and North American populations after colonization, or the mutation(s) occurred independently in separate and unrelated populations. Alternatively, allelic variation at the sha3 locus, comprising functional and nonfunctional alleles in the same local population, could be maintained in distant, unrelated wild populations, which would be consistent with this locus being under selection (see LD analysis in the Results section).
Indeed, significant LD was detected within the sha3 region and around the most significant SNP marker (SNP 21 603 348) in comparison with the Arabidopsis whole-genome LD analysis (Kim et al., 2007). This also means that any polymorphic SNP marker detected in this sha3 LD bin will be tightly linked to RTM3, and that sha3-controlled PPV-LDM restriction may be caused by polymorphic allele(s) of any gene in this sha3 LD bin.
Our results also show that RTM3 by itself is not a susceptibility gene, since RTM3-/- lines remain susceptible to PPV infection. This clearly distinguishes the (dominant) RTM3-controlled restriction of PPV-LDM from the (recessive) sha3 control of PPV systemic invasion. In this second case, the product of the dominant SHA3 allele is predicted to be recruited by the virus to perform long-distance movement in the host plant. However, we cannot rule out at this stage the possibility that SHA3 might code for an RTM3-like protein. The possibility that more than one gene of the sha3 LD bin could be recruited for PPV-LDM, although not formally excluded, seems unlikely given the allelism tests performed with the RRS-7, Ts-1, St-0, Ts-5, Hi-0 and Sf-2 accessions, which indicated that all of them were allelic.
RTM3 and RTM3-like genes in the extended sha3 LD bin are characterized by a meprin and TRAF (MATH) homology domain and a coiled-coil domain (Park et al., 1999), but they differ in length and by the presence or absence of one of these two domains (Cosson et al., 2010). Indeed, within the extended sha3 LD bin, RTM3-like predicted proteins display RTM3 features, except for AT3G58300, AT3G58320, AT3G58330 and AT3G58370, which have only the highly conserved coiled-coil domain (Cosson et al., 2010). Other candidate genes of the sha3 region are characterized by a phospholipase domain with or without the MATH domain. The complexity of this MATH-related gene cluster suggests evolutionary rearrangements, including proximal duplication and/or gene conversion events. This would lead to contraction/expansion of MATH-related genes and probably loss/acquisition of function(s). A similar pattern of intracluster evolution has been extensively described for nucleotide binding site–leucine-rich repeat (NBS-LRR) resistance gene clusters, where it appears related to the accumulation of repeat sequences (transposons) (Ratnaparkhe et al., 2011), which is not the case in the sha3 LD bin.
The role of MATH proteins in plants is still poorly understood. The first biological function experimentally demonstrated for a MATH protein was its involvement in RTM resistance, which is active against at least three potyviruses (TEV, LMV and PPV) (Whitham et al., 1999; Decroocq et al., 2006; Cosson et al., 2010). MATH domain-containing proteins are able to self-interact (Park et al., 1999; Cosson et al., 2010) as well as to interact with other proteins (Weber et al., 2005). Therefore, assuming that MATH proteins are able to interact with one another, RTM3 may interact, as a dominant-negative factor, with the SHA3 protein, preventing it from playing its role in potyvirus long-distance movement. However, this hypothesis can only be tested once the sha3 gene has been identified.
Ultimately, this linkage and association mapping delimited a short list of candidates for further genetic and functional studies. To our knowledge, the present report describes the first combined GWAS and linkage mapping study identifying susceptibility factors associated with virus long-distance movement. Future research will focus on validating the effect of the candidate genes identified in this study, challenging the corresponding loss-of-function mutants with PPV and understanding how the SHA3 gene product participates in virus long-distance movement in A. thaliana. This work also raises the question of the exact role of the MATH protein(s) in plant–virus interactions, either in restricting or in participating in virus long-distance movement. The answer will determine how this class of genes can be used in crop species such as stone fruit trees for the deployment of an efficient resistance strategy.
This work was supported by the SharCo FP7 Small Collaborative Project No 204429, by grants from the EPR Aquitaine (nos 20081201005 and 20091201003) and INRA divisions of Plant Health and Plant Breeding. We are grateful to T. Mauduit and A. Bailly for plant production. Special thanks are given to G. Marandel and S. Decroocq for advice in the statistical analysis. The development of the new version of BioMercator is supported by ANR 08GENO126. Grateful thanks to F. Roux (LGEPV, Lille University) and Prof. A.G. Abbott (Clemson University) for critical reading of the manuscript.
Author contributions: V.D. conceived and designed the experiments; G.P., P.P., S.P., M.B., G.G., L.N. and V.D. performed the experiments; P.C., G.P., P.P. and V.D. developed molecular markers and mapping tools; J.P.E. and A.C. contributed reagents/material/ELISA analysis; V.D., G.P. and P.P. performed the QTL analysis; S.M. analyzed the data for association mapping and from multiparental lines; V.D. and S.M. wrote the paper; and T.C. proofread the paper.