Map-based cloning has been widely used to identify genes responsible for mutant phenotypes in Arabidopsis, especially those mutants generated by EMS or fast neutron mutagenesis. The success of map-based cloning relies on the availability of molecular markers that distinguish the polymorphisms between two Arabidopsis ecotypes. So far, most molecular markers in Arabidopsis have been generated by individual laboratories or the Arabidopsis Information Resource (TAIR). However, the TAIR markers, which are distributed unevenly on the five Arabidopsis chromosomes, only cover approximately 25% of the Arabidopsis BACs. Designing and testing molecular markers is still a time-consuming endeavor. Here we report the construction of a high-resolution BAC-based Arabidopsis mapping platform (AMP), using Col-0 and Ler as model ecotypes. The AMP comprises 1346 markers (1073 INDEL and 273 CAPS/dCAPS markers), of which 971 were newly designed and experimentally confirmed, 179 were from published papers and 196 were TAIR markers. These AMP markers cover 1186 BACs, 1121 of which are in non-centromere regions, representing approximately 75% of the Arabidopsis BACs in non-centromere regions. All the marker information is included on the AMP website (http://amp.genomics.org.cn/) for easy access and download, and sets of standard markers for initial chromosomal localization of a particular gene are recommended. The feasibility of using the AMP to map mutated genes is also discussed.
Arabidopsis thaliana is a model plant that is widely used for genetic analysis of various biological processes. Many plant development questions, including those on flower development and hormone homeostasis/signaling, have been resolved through characterization of Arabidopsis mutants. However, identification of the molecular lesions responsible for mutant phenotypes is often difficult and time-consuming. Therefore, development of reliable molecular markers will greatly accelerate map-based cloning in Arabidopsis.
Finding an informative mutant is the starting point in addressing a biological question. Various strategies have been developed to identify the mutated genes in Arabidopsis mutants generated by various methods. Adaptor ligation-mediated PCR and thermal asymmetric interlaced PCR (TAIL-PCR) are often used to identify the flanking sequences of T-DNA or transposon insertion sites (Liu and Whittier, 1995; O’Malley et al., 2007). However, EMS-generated mutants are often cloned by mapping. Map-based cloning, also known as positional cloning or mapping, relies on linkage analysis of a mutant phenotype to genetic or molecular markers whose positions on the chromosome are known (Jander et al., 2002; Peters et al., 2003; Jander, 2006). Map-based cloning is effective for characterizing all types of mutants including T-DNA mutants. For point mutations or small insertions/deletions generated by EMS or fast-neutron bombardment, map-based cloning is still the best method until a cheap and effective sequencing technology is developed and utilized (Jander et al., 2002; Jander, 2006). Over the course of its development for more than 30 years, the markers used in map-based cloning have changed from genetic markers to molecular markers, e.g. restriction fragment length polymorphism (RFLP) markers, and PCR-based markers including random amplified polymorphic DNA (RAPD) and amplified fragment length polymorphism (AFLP) markers (Chang et al., 1988; Williams et al., 1990; Chang and Meyerowitz, 1991; Alonso-Blanco et al., 1998). In recent years, insertion/deletion (INDEL) and cleaved amplified polymorphic sequence (CAPS) markers have become the two most reliable mapping markers (Konieczny and Ausubel, 1993; Bell and Ecker, 1994).
Both INDEL and CAPS markers are commonly used PCR-based markers. With or without restriction enzyme digestion, the PCR products are subjected to electrophoresis analysis, from which linkage data are easily and directly obtained (Konieczny and Ausubel, 1993; Bell and Ecker, 1994). Many INDEL and CAPS markers have been developed and reported, and are also collected in the Arabidopsis Information Resource (TAIR) (http://www.arabidopsis.org/marker). In TAIR, approximately 25% of the Arabidopsis BACs have at least one INDEL or CAPS marker, which facilitates the cloning process. Moreover, a set of standard markers for first-pass mapping has been recommended (Lukowitz et al., 2000). However, the available markers are not evenly distributed across the chromosomes. There are several possible reasons for the limited number of available markers. For example, only those markers closely linked to the mutants may be reported and submitted to TAIR. In many cases, markers are not even submitted after gene cloning. Furthermore, some of the reported markers only work under strict PCR or electrophoresis conditions, or the amplicon variations between Arabidopsis ecotypes may be too small to distinguish, which limits the utility of these markers in map-based cloning. At the whole-genome level, approximately 75% of the BACs have no reported marker (http://www.arabidopsis.org). Although markers can be generated with additional help from the Monsanto polymorphism list (Jander et al., 2002), some published polymorphisms are incorrect, and the process of generating and verifying new markers is time-consuming. This lack of markers makes map-based cloning a relatively time-consuming process in Arabidopsis molecular genetics studies.
Many Arabidopsis accessions or ecotypes have been used to study natural variations or adaptation (Koornneef et al., 2004). As the sequencing technology has developed rapidly and the cost has reduced greatly, more and more genome sequences of various ecotypes and accessions have been released, and new mapping strategies have been reported. For example, an affordable INDEL array, which covers 240 unique INDEL polymorphisms with a mean spacing of approximately 500 kb, has been developed (Salathia et al., 2007). Recently, another strategy called ‘deep sequencing’ has been described (Lister et al., 2009; Schneeberger et al., 2009). Deep sequencing adopts the standard mapping procedure before primary mapping, but allows sequencing of the whole genomes of more than 20 individual F2 plants so that the sequence variations can be directly characterized in approximately 1 week (Lister et al., 2009; Schneeberger et al., 2009). However, under current circumstances, deep sequencing is still very expensive. At the same time, deep sequencing is not applicable to identification of fragment deletion/insertion mutants or epigenetic mutants in which no DNA sequence differences occur. It is believed that combining conventional fine mapping with deep sequencing will allow the most accurate and fastest results to be obtained.
To improve mapping efficiency, we set out to collect as many verified markers as possible. By analyzing the genome sequences of Col-0 and Ler, we first identified 375 INDEL/CAPS markers from TAIR and from previously published papers, and allocated them to the corresponding BACs on each chromosome. We then generated 971 INDEL/CAPS markers for the remaining BACs, all verified by PCR and electrophoresis. Information on all 1346 INDEL/CAPS markers is provided on the Arabidopsis mapping platform (AMP) website (http://amp.genomics.org.cn/). Information on approximately 2000 mutants described in research papers or reported in other databases is also included, so that users can easily determine whether a mutant has been reported previously.
AMP mapping markers
Bacterial artificial chromosomes (BACs) and yeast artificial chromosomes (YACs) were used to construct a physical map for genome sequencing (Burke et al., 1987; Shizuya et al., 1992). Four types of clones, i.e. BAC, YAC, transformation competent artificial chromosome (TAC) or P1 clones, were used for genome sequence assembly in Arabidopsis. Here we use the simplified name ‘BAC’ to represent all four types of clones. In Arabidopsis, 1622 BACs cover the entire genome, of which 145 are in the centromere region and 1477 in the non-centromere region (Kumekawa et al., 2000, 2001; Hosouchi et al., 2002), according to TAIR (http://www.arabidopsis.org) (Table 1). TAIR has become the first choice for plant scientists to retrieve mapping markers. Although TAIR includes 341 INDEL markers and 318 CAPS markers, only some of them met our criteria and were integrated into the AMP. These markers only cover 24.4% of the Arabidopsis BACs (21.4% in the centromere region and 24.6% in the non-centromere region) (Table 1).
Table 1. Comparison of markers on TAIR and the AMP
Summary of Arabidopsis BACs (a)
Markers for BAC on TAIR (b)
Markers for BAC on the AMP (b)
The numbers in parentheses indicate BAC coverage.
(a) Data from TAIR and the Monsanto polymorphism list. Here BAC represents BACs and YACs.
The marker density in TAIR is not high enough for efficient mapping, and the markers are not evenly distributed on the chromosomes. To increase the density and improve the distribution of markers on chromosomes, we designed and generated new markers based on polymorphic differences between the two most commonly used Arabidopsis ecotypes, Col-0 and Ler, and produced an Arabidopsis mapping platform (AMP). We first collected information on almost all markers reported in published papers, 179 in total (Table 2). We then selected markers from TAIR using two criteria: (i) the marker fills a gap in marker coverage, and (ii) the marker gives a clear amplicon size difference between Col-0 and Ler. In total, 196 TAIR markers met these criteria (Table 2). Finally, we generated new markers for those BACs with no marker information. Based on the Monsanto polymorphism list downloaded from TAIR, we designed and experimentally tested INDEL polymorphisms with divergence greater than 6 bp. We eventually generated 870 INDEL markers for the whole Arabidopsis genome (Table 3). For BACs without suitable INDEL sites, we generated 101 CAPS/dCAPS markers to make the AMP more complete (Table 3). Although some DNA sequences are of lower quality or cannot be assembled in the centromere regions and a few other regions, we found 45 markers for gene mapping in these difficult regions (Table 1). Altogether, these markers cover 73.1% of the Arabidopsis BACs, meaning that the AMP has the highest density of mapping markers for Col-0 and Ler to date (Table 1). BAC coverage in the non-centromere region was even higher, at 75.9%, and that in the centromere region was 31.0% (Table 1).
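The coverage figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the BAC counts given in the text (1622 BACs in total, 1477 outside the centromeres, 1186 and 1121 of them carrying AMP markers), and the function and variable names are our own, not part of the AMP.

```python
# BAC counts quoted in the text; names are illustrative assumptions.
TOTAL_BACS = 1622                  # all BACs covering the genome
NON_CENTROMERE_BACS = 1477         # BACs outside the centromere regions
AMP_COVERED_BACS = 1186            # BACs carrying at least one AMP marker
AMP_COVERED_NON_CENTROMERE = 1121  # covered BACs outside the centromeres


def coverage_percent(covered: int, total: int) -> float:
    """Return the share of BACs with at least one marker, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= covered <= total:
        raise ValueError("covered must lie between 0 and total")
    return round(100.0 * covered / total, 1)


overall = coverage_percent(AMP_COVERED_BACS, TOTAL_BACS)
non_centromere = coverage_percent(AMP_COVERED_NON_CENTROMERE, NON_CENTROMERE_BACS)
print(f"overall: {overall}%, non-centromere: {non_centromere}%")
```

With these inputs the two values round to the 73.1% and 75.9% reported for the AMP.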
Table 2. Marker sources for the AMP
Published papers (%)
This study (%)
Table 3. Classification of newly generated markers in the AMP
INDEL (SSLP) (insertion/deletion length)
8 bp ≤ INDEL ≤ 20 bp
Marker types in the AMP
In the AMP, INDEL markers are always the first choice due to their convenience and simplicity. However, CAPS markers are a good supplement to INDEL markers because they are easy to design as a result of the abundance of single nucleotide polymorphisms (SNPs). In summary, we generated 870 INDEL markers and 101 CAPS/dCAPS markers for the AMP (Table 3). For the INDEL markers, sites at which the insertion/deletion was longer than 6 bp were chosen to design new markers. Of the newly generated INDEL markers, 229 (26.3%) contain insertions/deletions >20 bp, and 572 (65.7%) contain insertions/deletions of between 8 and 20 bp (Table 3). Sixty-nine INDEL markers (7.9% of the total INDEL markers) showed smaller divergence (<8 bp) between Col-0 and Ler (Table 3). Almost all the SNP-based markers in the AMP were CAPS markers, with only a few dCAPS markers (Table 3).
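As a rough illustration of the size classes used in Table 3, the following sketch assigns an INDEL length to one of the three classes and recomputes the reported percentages from the class totals given in the text. The function name, the class labels and the example lengths are our own assumptions, added only to show the binning.

```python
from collections import Counter


def indel_size_class(length_bp: int) -> str:
    """Assign an INDEL length (bp) to one of the three Table 3 size classes."""
    if length_bp <= 0:
        raise ValueError("INDEL length must be positive")
    if length_bp < 8:
        return "<8 bp"
    if length_bp <= 20:
        return "8-20 bp"
    return ">20 bp"


# Class totals reported in the text for the newly generated INDEL markers.
reported_counts = {">20 bp": 229, "8-20 bp": 572, "<8 bp": 69}
total = sum(reported_counts.values())
shares = {name: round(100.0 * n / total, 1) for name, n in reported_counts.items()}

# Hypothetical INDEL lengths, only to show the classifier in use.
example_lengths = [7, 8, 14, 20, 21, 35]
example_classes = Counter(indel_size_class(n) for n in example_lengths)

print(total, shares)
print(dict(example_classes))
```

The three class totals add up to the 870 INDEL markers, and the recomputed shares agree with the percentages stated above.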
Arrangement of markers on the AMP
To obtain a clear view of all the markers distributed on chromosomes, and allow users to retrieve them conveniently, we developed a new method of marker nomenclature. The name for each marker has three parts. The first part is a single number from 1 to 5, indicating the specific chromosome on which the marker is located. The second part is the BAC accession number at National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/), representing the BAC clone in which the marker is located. The third part is a four-digit number, indicating the relative physical location of the marker on the chromosome. All the markers are arranged along the chromosomes according to their relative locations (http://amp.genomics.org.cn/), and the web interface is shown in Figure 1(a). The five chromosomes are arranged with a 1 million base pair (Mb) physical map scale. Each red bar indicates an individual marker, while each blue spot represents an individual mutant from TAIR (http://www.arabidopsis.org), the Arabidopsis hormone database (AHD, http://ahd.cbi.pku.edu.cn/), the Seed Genes database (http://www.seedgenes.org/) and the papers by Meinke et al. (2003) and Peng et al. (2009). Using the two buttons in the lower left quarter of the screen (Figure 1a), the scale can be magnified or reduced. The greatest magnification is 28-fold. A magnification level can be chosen at which the marker distribution on the chromosomes is clear, and detailed information on each marker or mutant is obtained by simply clicking the link.
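The three-part naming scheme lends itself to simple parsing. The sketch below is a minimal illustration, assuming names follow the pattern chromosome, BAC accession, four-digit code separated by hyphens, as in 1-AC011611-1259. The class and function names are hypothetical and are not part of the AMP website.

```python
import re
from typing import NamedTuple


class MarkerName(NamedTuple):
    chromosome: int      # 1 to 5
    bac_accession: str   # NCBI accession of the BAC clone
    position_code: str   # four-digit relative location, kept as text


# Chromosome digit, accession and four-digit code, separated by hyphens.
_MARKER_RE = re.compile(r"^([1-5])-([A-Za-z0-9_.]+)-(\d{4})$")


def parse_marker_name(name: str) -> MarkerName:
    """Split a three-part AMP marker name into its components."""
    match = _MARKER_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a three-part AMP marker name: {name!r}")
    chromosome, accession, code = match.groups()
    return MarkerName(int(chromosome), accession, code)


marker = parse_marker_name("1-AC011611-1259")
print(marker)
```

The position code is kept as a string so that leading zeros are preserved. Names written in a shorter form, without the leading chromosome number, are rejected by this sketch and would need their own pattern.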
As an example (shown in Figure 1b), detailed information for a typical marker on the AMP, 1-AC011611-1259, is given. The information for the marker has five major parts, i.e. the type of marker, the position information, the sequence information, the PCR information and the gel image. ‘Marker type’ includes three types, i.e. INDEL, CAPS or dCAPS. As an example, ‘1-AC011611-1259’ is labeled as an INDEL marker. ‘BAC name’ and ‘BAC accession’ are information about the TAIR BAC in which the marker is located. ‘Start position in BAC’ shows the physical position of the 5′ end of the polymorphism in the BAC, and ‘Start position in genome’ is the position of the 5′ end of the polymorphism on the chromosome. ‘+’ indicates that the marker is on the sense (+) strand of the DNA. It is also possible to search for a specific sequence (a marker or flanking sequence) in the whole Arabidopsis genome by BLAST, by choosing the ‘ath_genome.fasta’ database on the AMP BLAST webpage. The sequence position on the chromosome will be displayed in order to clarify whether the sequence is located in or beside the mapping interval. Only the markers newly generated in this study have detailed PCR information and corresponding electrophoresis images, from which suitable primers and their sequences can be downloaded for further mapping experiments. Typical detailed information on mutants from TAIR or other databases, which are adjacent to markers, is given in Figure 1(c).
Search for the most suitable markers or mutants on the AMP
In order to retrieve chosen marker information or find known mutants within a certain interval, we developed a powerful search engine with marker and mutant information. Marker information can be searched by either marker ID (AMP nomenclature), BAC name/BAC accession number or chromosome position (physical location) (Figure 2a,b). Similarly, mutants can be searched by either mutant ID (gene ID), mutant name, chromosome position or flanking markers (Figure 2c,d). Markers matching the query conditions are shown on the lower part of the same webpage (Figure 2). We also developed the search tool ‘GbrowseView’ to provide further information with various display channels (see the AMP website). A detailed tutorial on how to use these search tools is provided on the website (http://amp.genomics.org.cn/).
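A search by chromosome position amounts to filtering marker records by chromosome and coordinate range. The sketch below shows one way this could be done offline on a downloaded marker list. It is not the AMP search engine: the record layout, the function name, and all marker IDs and coordinates are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Marker:
    marker_id: str
    chromosome: int
    position_bp: int   # start position in the genome


def markers_in_interval(markers: List[Marker], chromosome: int,
                        start_bp: int, end_bp: int) -> List[Marker]:
    """Return markers on one chromosome within [start_bp, end_bp], sorted by position."""
    if start_bp > end_bp:
        start_bp, end_bp = end_bp, start_bp
    hits = [m for m in markers
            if m.chromosome == chromosome and start_bp <= m.position_bp <= end_bp]
    return sorted(hits, key=lambda m: m.position_bp)


# Hypothetical records: IDs and coordinates are invented for illustration.
catalogue = [
    Marker("demo-C", 1, 2_400_000),
    Marker("demo-A", 1, 1_200_000),
    Marker("demo-B", 1, 1_950_000),
    Marker("demo-D", 4, 1_500_000),
]
hits = markers_in_interval(catalogue, chromosome=1,
                           start_bp=1_000_000, end_bp=2_000_000)
print([m.marker_id for m in hits])
```

Returning the hits in positional order makes it easy to pick flanking markers on either side of a mapping interval.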
Recommendation of standard markers for primary mapping
Since the first successful isolation of an Arabidopsis gene by map-based cloning, many markers have been developed for primary mapping. Often different laboratories use different primary mapping markers. The most popular markers for primary mapping are the 22 recommended markers that are distributed evenly on the five Arabidopsis chromosomes (Lukowitz et al., 2000). In most cases, these markers work well; however, some of them require strict conditions. In the AMP, we also recommend 25 markers with band size variation of approximately 20 bp (Figure 3). These 25 markers are distributed evenly on each chromosome and produce good PCR bands using normal PCR and electrophoresis conditions, and are very suitable for first-pass mapping (Figure 3). These recommended markers are represented by green solid circles in the AMP (Figure 1a).
Application of the AMP for characterization of a methyl IAA-resistant mutant
In our screening of methyl IAA-resistant mutants (Hou et al., 2009), we obtained a strong methyl IAA-resistant mutant, which we designated methyl IAA-resistant 2 (mir2). On medium with a high concentration of methyl IAA, mir2 showed much longer hypocotyls and roots than wild-type for both dark- and light-grown seedlings (Figure 4). After crossing with Col-0, the mutant showed a 1:1 ratio for resistant:sensitive in the F1 progeny (Table 4), suggesting that mir2 was heterozygous. Interestingly, the segregation of resistance in F2 was 2:1 (Table 4), suggesting that this mutation is dominant and embryo-lethal. This segregation ratio has never been reported previously for auxin-related mutants. We used AMP markers to perform primary mapping (markers T4I9-0729, FCA8-5268 and T15N24-7217) and fine mapping (markers F17A8-3303, T6G15-4247, FCA1-4498 and FCA5-4901). The mapping procedure was completed in <2 weeks (Figure 4). The mutated gene was mapped to the axr5/iaa1 locus with the same sequence mutation as reported previously (Yang et al., 2004). We also characterized another mutation, G→A, at position 1377 in a putative embryo development-related gene EMB2739 located 13 kb downstream of axr5 (X. Hou et al., unpublished data). These data suggest that the two mutations together possibly account for methyl IAA resistance and embryo lethality. Using the AMP, we were able to quickly characterize several other alleles of auxin-resistant mutants, such as shy2/iaa3, axr2/iaa7 and bdl/iaa12 (L. Li et al., unpublished data). We conclude that the AMP is a highly effective platform for mapping new alleles of known genes or unknown genes.
Table 4. Genetic analysis of mir2 mutants
χ² (0.95 level) (b)
(a) Seedlings were scored for auxin resistance on 5 μM MeIAA in the dark.
(b) χ² was calculated and compared with the theoretical value at the 0.95 confidence level.
mir2/+ ♀× Col-0 ♂
0.11 < 3.84
mir2/+ ♂× Col-0 ♀
0.23 < 3.84
0.44 < 3.84
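The comparisons in Table 4 are chi-square goodness-of-fit tests against an expected segregation ratio, with 3.84 as the critical value for one degree of freedom. The sketch below shows the calculation in general form. The seedling counts are hypothetical, chosen only to illustrate the test, and are not the counts from Table 4. The function names are our own.

```python
from typing import Sequence

# Critical value used in Table 4 (1 degree of freedom, 0.95 level).
CRITICAL_CHI2_DF1 = 3.84


def chi_square(observed: Sequence[int], ratio: Sequence[float]) -> float:
    """Chi-square statistic for observed class counts against an expected ratio."""
    if not observed or len(observed) != len(ratio):
        raise ValueError("observed and ratio must be non-empty and of equal length")
    total = sum(observed)
    ratio_sum = sum(ratio)
    if total <= 0 or any(r <= 0 for r in ratio):
        raise ValueError("counts must sum to a positive number and ratios must be positive")
    chi2 = 0.0
    for obs, r in zip(observed, ratio):
        expected = total * r / ratio_sum
        chi2 += (obs - expected) ** 2 / expected
    return chi2


def fits_ratio(observed: Sequence[int], ratio: Sequence[float],
               critical: float = CRITICAL_CHI2_DF1) -> bool:
    """True if the data do not differ significantly from the expected ratio."""
    return chi_square(observed, ratio) < critical


# Hypothetical counts (resistant, sensitive); not taken from Table 4.
backcross = (52, 48)    # compared with a 1:1 ratio
selfed = (130, 70)      # compared with 2:1 and with 3:1

print(round(chi_square(backcross, (1, 1)), 2), fits_ratio(backcross, (1, 1)))
print(round(chi_square(selfed, (2, 1)), 2), fits_ratio(selfed, (2, 1)))
print(round(chi_square(selfed, (3, 1)), 2), fits_ratio(selfed, (3, 1)))
```

In this made-up example the selfed progeny fit 2:1 but not the 3:1 ratio expected for a simple dominant mutation, which is the kind of pattern interpreted in the text as dominance combined with lethality of one genotype class.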
In this study, we generated 971 newly designed markers, confirmed their effectiveness, and documented them, together with 375 previously reported markers, in an Arabidopsis mapping platform (AMP). In addition, we recommend a new set of markers for first-pass mapping. Using this mapping platform, Arabidopsis researchers can easily narrow down the linkage interval to a 100 kb region in a short period of time.
By combining previous markers from TAIR and published papers with those newly generated in this work, we improved the marker coverage from 24.4% (TAIR) to 73.1% (AMP). For the non-centromere region, the coverage is even higher, reaching 75.9%. For most regions of the Arabidopsis chromosomes, the mean distance between two markers was <100 kb, which makes the AMP a highly effective mapping platform, with the highest marker coverage to date. Moreover, all the markers are sequentially arranged on the chromosome (Figure 1a). Using the web interface, markers in a region of interest can be retrieved with the marker search tool, or all marker information can be browsed in the whole-chromosome view. Furthermore, the newly designed markers (nearly 1000) identified in this study were all experimentally confirmed. We provide the PCR conditions and gel images for these markers to assist in their use. This free resource will accelerate the map-based cloning process in Arabidopsis.
Col-0 and Ler-0 are the two most commonly used Arabidopsis ecotypes for mapping. We created this mapping platform based on the polymorphisms between these two ecotypes to meet the demands of most laboratories all over the world. Because genome variation among accessions is subtle and DNA polymorphisms are sometimes conserved, markers generated from Col-0 and Ler-0 can also be used for other accessions. AMP markers can therefore also be used to map mutants from other rare accessions (such as Ws, C24 or En) or Arabidopsis relatives. In practice, we recommend that the AMP markers are tested before performing first-pass mapping, especially when mapping with rare accessions. Conversely, we found that some markers used in mapping of interesting mutants in rare accessions are also applicable in Col-0 and Ler (data not shown). The AMP can therefore be further improved by recruiting more markers, and this can be done by AMP users themselves. We have included an online submission engine to allow users to update marker information. In addition, as all the markers (both newly designed and previously reported) in the AMP represent true polymorphisms between Col-0 and Ler, it is possible to combine them with high-throughput technology (such as DNA chips) to create a more convenient system for mapping.
Conventional positional cloning is a time-consuming process. An even faster technology, deep sequencing, has been developed to accelerate the mapping procedure (Schneeberger et al., 2009). Deep sequencing is suitable for mapping point mutations, but not for INDEL or epigenetic mutations. In addition, the cost of this technology is much higher than that of conventional mapping, which limits its wide application. It may be better to combine the AMP mapping system and deep sequencing. First-pass and fine mapping would be performed using the AMP to narrow down the interval into a small region, and to ensure that the mutation is a new one. Then deep sequencing could be used to obtain the exact mutation site. This combination of the AMP and deep sequencing will greatly accelerate the discovery of new genes or new alleles in Arabidopsis.
The two most commonly used Arabidopsis thaliana ecotypes, Columbia (Col-0) and Landsberg erecta (Ler) (Rédei, 1992), were used in this study. The genome sequences of Col-0 and Ler are available on the TAIR website (http://www.arabidopsis.org). Genomic DNA was extracted from 7-day-old seedlings by the CTAB method with slight modifications (Su et al., 2003). F1 DNA was isolated from the F1 offspring of Col-0 and Ler-0.
Search for INDELs or SNP polymorphism and confirmation
Because INDEL markers are more convenient to use than CAPS markers, identification of INDEL markers was our first priority. If no compatible INDEL polymorphisms were found, CAPS markers were designed based on SNP variations. Using the recently released Monsanto Arabidopsis Polymorphism and Ler Sequence Collections (http://www.arabidopsis.org/browse/Cereon/index.jsp), we searched for INDEL variations longer than 6 bp between Col-0 and Ler. To confirm each variation, we used approximately 300 bp DNA sequences from Col-0 flanking the variation site as query sequences for BLAST against Ler-0 (Jander et al., 2002). If the BLAST result also revealed this variation, the two 300 bp DNA fragments were used for primer design with Primer Premier 5.0 software (http://www.PremierBiosoft.com). All primer sequences were documented and are included in http://amp.genomics.org.cn/.
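The selection and query-extraction steps described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the pipeline used in this study: it assumes the reference sequence is held as a plain string with 0-based coordinates, and the function names, constants and toy sequence are our own. The BLAST search and primer design steps are not reproduced.

```python
from typing import Tuple

MIN_INDEL_BP = 6   # polymorphisms must be longer than this to be considered
FLANK_BP = 300     # approximate flank length taken on each side of the site


def is_candidate(indel_length_bp: int, min_bp: int = MIN_INDEL_BP) -> bool:
    """True if the INDEL is longer than the minimum length threshold."""
    return indel_length_bp > min_bp


def flanking_sequences(reference: str, site_start: int, site_length: int,
                       flank: int = FLANK_BP) -> Tuple[str, str]:
    """Return (upstream, downstream) flanks around a 0-based polymorphic site.

    site_length is the number of reference bases spanned by the polymorphism;
    it may be 0 when the extra bases are present only in the other ecotype.
    """
    if site_start < 0 or site_length < 0 or site_start + site_length > len(reference):
        raise ValueError("site lies outside the reference sequence")
    site_end = site_start + site_length
    upstream = reference[max(0, site_start - flank):site_start]
    downstream = reference[site_end:site_end + flank]
    return upstream, downstream


# Toy reference with a 12 bp site at position 400; invented for illustration.
toy_reference = "A" * 400 + "GATTACAGATTA" + "C" * 400
if is_candidate(12):
    upstream, downstream = flanking_sequences(toy_reference, 400, 12)
    print(len(upstream), len(downstream))
```

Near the ends of a sequence the flanks are simply shorter than 300 bp, so a real workflow would probably want to check flank length before designing primers.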
PCR and electrophoresis
We used standard PCR conditions and protocols, except that lower annealing temperatures were used in some cases. The PCR conditions for some primers were further modified to obtain better results, for example by increasing the Mg2+ concentration or adding DMSO or BSA. The PCR products were separated using 4–5% agarose gels, and electrophoresis gel images were recorded and documented on the AMP website (http://amp.genomics.org.cn/), together with the PCR conditions.
The methods used for screening and genetic analysis of methyl IAA-resistant mutants were as described previously (Hou et al., 2009). The mapping procedure and protocols were also as described previously (Lukowitz et al., 2000). The primary and fine mapping markers used were obtained from the AMP.
We would like to thank Dr Hong-Wei Guo (College of Life Sciences, Peking University, China) for comments and discussions. This study was supported by the National Basic Research Program of China (grant number 2009CB941503), the National Natural Science Foundation of China (grant numbers 30625002 to L.-J.Q. and 30628012 to Y.Z.) and a China Postdoctoral Fellowship (20080440006 to L.L.).