Genome‐wide association study identifies five new cadmium uptake loci in wheat

Cadmium (Cd) toxicity is a serious threat to future food security and health safety. To identify genetic factors contributing to Cd uptake in wheat, we conducted a genome‐wide association study with genotyping from 90K SNP array. A spring wheat diversity panel was planted under normal conditions and Cd stress (50 mg Cd/kg soil). The impact of Cd stress on agronomic traits ranged from a reduction of 16% in plant height to 93% in grain iron content. Individual genotypes showed a considerable variation for Cd uptake and translocation subdividing the panel into three groups: (1) hyper‐accumulators (i.e. high Leaf_Cd and low Seed_Cd), (2) hyper‐translocators (i.e. low Leaf_Cd and high Seed_Cd), and (3) moderate lines (i.e. low Leaf_Cd and low Seed_Cd). Two lines (SKD‐1 and TD‐1) maintained an optimum grain yield under Cd stress and were therefore considered as Cd resistant lines. Genome‐wide association identified 179 SNP‐trait associations for various traits including 16 for Cd uptake at a significance level of P < .001. However, only five SNPs were significant after applying multiple testing correction. These loci were associated with seed‐cadmium, grain‐iron, and grain‐zinc: qSCd‐1A, qSCd‐1D, qZn‐2B1, qZn‐2B2, and qFe‐6D. These five loci had not been identified in the previously reported studies for Cd uptake in wheat. These loci and the underlying genes should be further investigated using molecular biology techniques to identify Cd resistant genes in wheat.


INTRODUCTION
Cadmium (Cd) toxicity is detrimental to human health. Some of the adverse effects of Cd on human health are bone demineralization, kidney dysfunction, renal dysfunction, and an increased risk of lung cancer (Bernard, 2008). Polluted environment can be a direct source of Cd intake in humans which includes industrial exposure and intake through contaminated food. Industries (such as marble, steel, and aluminum) and the application of Cd containing phosphate fertilizers severely contaminate agricultural land and water resources (Grant, Clarke, Duguid, & Chaney, 2008;Singh & McLaughlin, 1999). The situation is more critical in the developing countries such as Pakistan, where the Cd concentration ranges from 0.02 to 184 mg/kg from normal soil to contaminated soil (Parveen & Shadab, 2012). Only in the last decade, the concentration of Cd in the polluted lands has increased by 200% in Pakistan (Waseem et al., 2014). As compared to other heavy metals, Cd has a better water solubility which enables it to percolate into the deeper soil layers thereby causing contamination of irreplaceable agricultural land and ground water resources (Nica, 2015). This high solubility of Cd also lets it easily absorbed by plant roots and rapidly translocated to the aerial parts via symplastic and apoplastic pathways. Consequently, it accumulates into the aerial parts of the plants including grains resulting in their contamination (Clemens, Aarts, Thomine, & Verbruggen, 2013). Nearly 27% of Cd intake through diet is attributed to grains and grain products (EFSA, 2012). Since wheat (Triticum aestivum L.) is the largest harvested crop worldwide, contamination of wheat grains by Cd would not only result in a food shortage crisis but can also prompt harmful effects on human health. Cd toxicity affects both morphological and physiological traits of plants such as biological yield, plant height, leaf area, number of grains, kernel weight, and grain yield. Oxidative stress, destruction of photosynthetic apparatus, genotoxicity, reduction in the root metabolism, and poor enzyme efficiency caused by high Cd have been known to cause plant cell death in various agricultural crops including wheat (Ci et al., 2010;Duan et al., 2010;Oncel, Keles, & Ustun, 2000), rice (Hassan, Wang, Ali, & Zhang, 2005;Lin et al., 2013) and maize (Andresen & Küpper, 2013;Schutzendubel et al., 2001;Xu et al., 2010).
The uptake, accumulation, and translocation of Cd in plants are genetically controlled traits, hence vary among plant species and even between the various genotypes of the same plant species (Stolt, Asp, & Hultin, 2006;Verma, George, Singh, & Singh, 2007). Genetic approaches such as genome-wide association study (GWAS) are often used to identify potential genomic regions controlling such variations so that the favorable alleles can be used to improve desired traits. These approaches have also been used for

Core Ideas
• Cd uptake shows a genotypic variation in the diversity panel, which subdivides the panel into hyper-accumulators, hyper-translocators, and moderate lines. • Two Cd stress resistant lines  are identified in the wheat panel based on yield potential under stress. • Five new loci for Cd, Zn, and Fe uptake in seeds are identified.
identifying Cd resistant genes in wheat but only a limited success stories are known. A major breakthrough in the research for Cd resistant wheat was achieved more than a decade ago when Knox et al. (2009) reported that marker assisted selection of single low Cd allele on chromosome 5B (Cdu-B1) can significantly reduce Cd concentration in wheat plants. Recently, a Cd/Zn transporter TdHMA3-B1 has been cloned in the region of Cdu-B1. The gene is reported to have two alleles: TdHMA3-B1a and TdHMA3-B1b. TdHMA3-B1a is a resistant allele that transports Cd to the vacuoles in the roots, thus limiting its uptake and translocation to shoots. TdHMA3-B1b has a 17-bp duplication which makes it a non-functional allele resulting in a 2-to 3-fold increase in Cd uptake to the aerial plant parts in durum wheat (Maccaferri et al., 2019). Some studies have also identified genomic regions associated with Cd accumulation in bread wheat. Guttieri et al. (2015) identified five SNPs (IWA1439, IWA7579, IWA6681, IWA1752,  and IWB43741) associated with grain Cd in multilocation trials on moderate-Cd and low-Cd locally adapted wheat varieties in Nebraska, USA. All these SNPs are in linkage with each other and map in the homologous region of Cdu-B1 on chromosome 5A in bread wheat. In another study, five marker-trait associations (MTA) were identified for grain Cd on five different genomic regions including 1A, 2A, 2D, 3A, and 6D which do not colocalize with any of the previously identified loci (Bhatta et al., 2018). In a more recent study, Hussain, Campbell, Jarquin, Walia, and Morota (2019) have identified two loci for Cd on chromosome 2A and 2B by variance heterogeneity based GWAS, which determines the variability of trait rather than the mean value. These two loci also do not coincide with any of the reported loci. Such variation of the identified genomic regions in numerous studies attributes to the complexity of the trait in bread wheat. GWAS has also been effectively applied to identify genomic regions governing Cd accumulation and tolerance in other crops, such as Aegilops tauschii (Qin et al., 2015), Oryza sativa (Oono et al., 2014), and Arabidopsis thaliana (Chao et al., 2012).
Here, we present a genome-wide association of loci for Cd uptake in leaf and seed, Zn and Fe uptake in seeds, and grain yield related traits in a Pakistan historical spring wheat panel under Cd stress environment. Five MTA in our study (IWB29686, IWB39994, IWB2735, IWB12428, and  IWB66179) colocalize with Cdu-B1. However, the statistical significance of these MTA is relatively lower to pass the multiple testing correction. Five loci are identified as highly significant which are associated with Cd, Fe, and Zn uptake in seeds. These loci do not coincide with any of the previously reported QTL and should be further investigated by molecular biology techniques to identify causal genes for metal uptake and grain yield regulation under metal stress.

Planting material and experimental design
The diversity panel used in this study consisted of 120 historical spring wheat lines adapted to irrigated and rain-fed conditions of Pakistan (Supplemental Table S1). It comprised of 19 landraces, 30 green revolution lines , 32 post green revolution lines, (1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000) and 39 elite lines (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013). The panel was planted for three years in field plots (2011)(2012)(2013)(2014) at National Agricultural Research Centre (NARC) to check for phenotypic purity (Ain et al., 2015). The experiment was performed at the experimental field of the Department of Plant Sciences, Quaid-i-Azam University, Islamabad, Pakistan (33.7476 • N, 73.1381 • E). Special plots (4 m length by 4 m width by 2 m depth) with paved bottoms and sides were prepared for the field experiment. The soil for Cd stress plots was treated with 50 mg Cd kg −1 soil, which is an average reported toxic concentration in the Cd affected soils of Pakistan (Waseem et al., 2014). The conditions of the Cd affected areas of Pakistan were replicated in the experiment. Three replications of 1 m row for normal (treatment 1 [T 1 ]) and Cd stress (treatment 2 [T 2 ]) conditions were planted in an augmented alpha-lattice design. The dimensions of 4 m by 4 m plots were such that each plot had three beds of 1 m rows with each bed having 20 rows of plants. The row to row distance was 20 cm and the path between two beds was 0.5 m. Twelve plots were used for the experiment in total with each plot having 60 lines: 2 (plots/rep) × 3 (reps) × 2 (treatments) = 12 plots. Treatments and replicates were randomized within the plots. The bottoms and the sides of the plots were paved/cemented and a 2 m soil layer was added for planting. These precautions were taken to avoid the leaching of Cd into the deeper soil layers and to con-trol the treatment conditions. Soil fertility level and texture were evaluated by analyzing three randomly collected samples (Chen & Ma, 2001). The tested soil had following characteristics: 3:3:5 ratio of sand, silt, and clay, 7.5 pH, 1:3 soil to water ratio, 0.8% organic matter, 3.3 dS m −1 mean EC value, 0.045 mg kg −1 phosphorus, 0.06 mg kg −1 potassium, and 0.73 mg kg −1 nitrogen. Seeds were surface sterilized with 4% H 2 O 2 for 10 minutes and then thoroughly washed with double distilled water before planting. Thirty seeds from each accession were planted in 1 m rows.

Phenotyping
Phenotypic traits such as plant height (PH), grains per spike (GpS), tiller number (TN), thousand grain weight (TGW), grain yield (GY), spike length (SL), and biological yield (BY) were evaluated at the maturity stage following the methods of Pask, Pietragalla, Mullan, and Reynolds (2012). Chlorophyll index was measured nondestructively at the mid position of the penultimate leaf using a portable chlorophyll-meter (SPAD-502 Minolta, Japan) at two growth stages (tillering [Chl _till ] and heading [Chl _head ]). Stress tolerance index (STI) was calculated for each cultivar following the method of Zanke et al. (2014): Here, GY for each line refers to the average GY of the three replicates of that line, whereas the mean GY in T 1 refers to the average GY of all the lines in T 1 . Reduction (%) of agronomic traits from T 1 to T 2 was marked as treatment 3 (T 3 ) for the use in association analysis. Cadmium concentration was estimated from flag leaf (after anthesis) and seeds. The Cd concentration in T 1 was either zero or negative for both leaf and seeds. Iron (Fe) and zinc (Zn) concentrations were also estimated from the seeds. Certified reference material (CRM) of duck weed BCR (BCR-670; Sigma-Aldrich) was used to verify the accuracy of the analytical equipment. The CRM and the plant samples (0.5 g dry weight) were digested by adding 7.5 ml of 65% HNO 3 and 2.5 ml of 36% HCl (Merck) in CEM -MARS 6 (microwave accelerated reaction system). Three digested samples from each replicate were analyzed using an atomic absorption spectrophotometer model WFX-210 (Beijing Beifen-Ruili Analytical Instrument Co. Ltd., China) to estimate the concentrations of Cd in leaf and Cd, Fe, and Zn in seeds. The atomic absorption results of Duck weed BCR (BCR-670) CRM, that is, 20 ± 3 mg kg −1 , 0.90 ± 0.01 mg kg −1 , and 78.5 ± 2.1 µg kg −1 (Zn, Fe, and Cd, respectively) showed non-significant differences with its certified values (24 ± 2.1 mg kg −1 , 0.94 ± 0.04 mg kg −1 , and 75.5 ± 0.5 µg kg −1 for Zn, Fe, and Cd, respectively), which suggested the accuracy of analytical instrument.

2.3
Statistical analysis

Phenotyping
Descriptive statistics and a mixture-design analysis of variance for alpha-lattice design were performed in XLSTAT 2017 to evaluate the statistical variation between genotypes within a treatment and between two treatments (T 1 and T 2 ). Correlation, scatter plots, and histograms for all traits were visualized using GGally package in R.

Genotyping and genome-wide association analysis
Genotyping with Illumina iSelect 90K SNP array was reported in one of our earlier reports on the diversity panel (Ain et al., 2015). The SNP markers were aligned to IWGSC RefSeq v.1.0 to obtain their physical positions. Genotyped data had 81,587 markers which were trimmed for MAF < 5%, monomorphic markers, markers with missing data such as genomic positions, and markers with indels. Finally, 14,960 high quality polymorphic SNP markers were used for genome-wide association study (GWAS). Population structure (Q) was estimated using STRUC-TURE software while kinship matrix (K) was analyzed in TASSEL software. Linkage disequilibrium (LD) between marker pairs was estimated in TASSEL using the observed allele frequencies vs. the expected allele frequencies. LD estimation was based entirely on mapped markers. The critical r 2 value, beyond which LD is due to true physical linkage, was estimated by taking the 95th percentile of the square root transformed r 2 data of unlinked markers (Breseghello & Sorrells, 2006). The percentage of marker pairs significant at various r 2 values (0.2 and 0.2641) and P < .001 were estimated for all chromosomes to compare the degree of LD among genes. GWAS was performed using mixed linear model (MLM), the Q+K model, in TASSEL (Yu & Buckler, 2006). Significance of threshold for GWAS was set according to Bonferroni correction of α = 0.05 (0.05/14960) to account for false positives expected from the type-I error during multiple testing.

Identification of loci and candidate genes
Pairwise LD was used to estimate the linkage of MTA with the surrounding SNPs and to measure the dis-tance of LD blocks. The LD block distance was considered as one locus. If the MTA had no linkage with any other SNP or if the block size was too small due to a small number of SNPs in 90K array, the LD decay distance around the MTA was considered as a locus to get the associated genes. Genes were retrieved using IWGSC RefSeq v.1.0 GFF3 version present in the Ensem-blPlant database (http://www.ensemblgenomes.org/pub/ plants/release-44/gff3/triticum_aestivum). Gene annotation was retrieved using EnsemblPlant and EMBL-EBI (http://www.ebi.ac.uk/interpro) databases. Gene annotation was performed using BLAST2GO (Conesa & Götz, 2008) and was compared with orthologous gene functions from rice and arabidopsis for an accurate and reliable functional annotation.

Population structure, linkage disequilibrium, and association mapping
Population structure analysis using the rate of change of log probability between successive subgroup K values divided the panel into seven subgroups. Population structure estimation details were provided in our earlier reports on the diversity panel (Ain et al., 2015;Safdar et al., 2020), readers are encouraged to see those papers for further details. Population LD decay analysis showed highest LD in B sub genome (800 kb), followed by D sub genome (500 kb), and A sub genome (300 kb) (Supplemental Figure S1).

Gene prediction
Pairwise LD in the QTL regions (based on LD decay) of the five detected loci indicated that either the SNPs had no linkage with nearby SNPs or the LD block was too small (Figure 2). This could be due to the low marker density resulting from a small number of SNPs in the array dataset. This suggests that relying on pairwise LD to select the candidate region while dealing the array data may result in the loss of many potentially causal genes. There-fore, the genes were selected based on the sub genomic LD decay distance around the significant loci. Wheat reference assembly IWGSC RefSeq v.1.0 (Appels et al., 2018) database present at EnsemblPlant (http://plants.ensembl. org/Triticum_aestivum/Info/Index) was accessed to get the genes present in the five detected loci. This resulted in locating 208 genes (Table S5), among which 90 were regarded as IWGSC high confidence protein coding genes (Table S6). Limited information was found for these genes from the wheat genome annotation database, and hence the orthologous genes from rice and arabidopsis were extracted to get the annotation information. Based on the annotation information for the high confidence protein coding genes, many of the genes encoded proteins putative to the locus-associated traits (Cd, Zn, and Fe). These included plant peroxidase, acetyltransferase, hexokinase, glycosyltransferase, and zinc-finger proteins which have been associated with the genetic regulation of Cd in plants (Nazar et al., 2012;Whitt et al., 2018). Potentially important genes based on the information from annotations and literature are listed in Table 3.

Effects of favorable alleles on leaf _Cd and seed _Cd
The alleles of the significantly associated SNPs to Seed _Cd had a minor additive effect on the cadmium concentration in seeds. Seed _Cd concentration decreased with an increase in the number of favorable alleles. Linear regression (r 2 ) between the number of favorable alleles and phenotype was 0.20. There was no selection pressure on any of these favorable alleles as they were evenly distributed across old and modern cultivars (Figure 4a). The allelic distribution in SNPs associated to leaf and seed Cd showed that the F I G U R E 3 Genomic visualization of five loci and significant MTA identified at a significance threshold of 1/nSNP (-log10p > 4.31). The outer circle represents SNP density from the 90K array data in the diversity panel. The middle circle represents all MTA identified on chromosomes harboring significant MTA; -log10p increases from inside out. The inner circle represents the additive effects of MTA; Add F value increases from inside out. Black arrows indicate representative SNPs of novel loci. The detailed description of MTA and loci is presented in Table S3 and Table 2, respectively favorable alleles had a significant impact on phenotype (Figure 4b-m).

DISCUSSION
The present study describes the physiological implications of Cd toxicity in bread wheat and locates several potential loci for identifying Cd resistant genes. We observe that a high Cd uptake and accumulation in plants drastically lowers the uptake of other essential micronutrients like Fe (−93.5%) and Zn (−92.9%) attributing to the fact that Cd is transported by the same channels used for the Fe, Zn, and Mg uptake (Papoyan, Pineros, & Kochian, 2007;Roth, von Roepenack-Lahaye, & Clemens, 2006). Therefore, identifying Cd resistant genes is not only important to ensure edible quality of the grain but is also critical to maintain the mineral composition of the plant. In our diversity panel, the majority of genotypes expressed more than an 80% reduction in Fe uptake and a more than 75% reduction in Zn uptake, under Cd stress. Such deficiency of these micronutrients, especially Fe, causes severely damaging effects on different biological and physiological processes in plants including photosynthesis. Photosynthetic rate of a plant significantly decreases with the damage of photosystem II (PSII) under heavy metal stress, which results in an impaired plant growth (Rodriguez-Serrano et al., 2009;Tanyolac, Ekmekci, & Unalan, 2007). This damage of PSII may also involve a significant reduction in the chlorophyll content-an 8% reduction in the chlorophyll content is observed here at the tillering stage. Likewise, Cd toxicity also has unfavorable effects on other agronomic traits. For example, a reduction of 66% in BY is observed under Cd stress in this study, which is similar to an earlier report ofDias et al. (2013), where higher Cd concentrations are noticed to reduce the overall plant TA B L E 3 A list of potential genes associated with the uptake of Cd and other minerals in plants. These genes are shortlisted based on information from annotations and literature biomass by 76%. Such large-scale reductions in the crop productivity by Cd toxicity can potentially challenge the continuously increasing demand of food crops needed to feed an exponentially growing global population. Cd toxicity is so critical that it damages the plant chemistry even when the plant otherwise looks normal. For example, two varieties in our panel (SKD-1 and TD-1) retained a nor-mal growth in PH, TN, and GY under Cd stress and were considered resistant. Although these two varieties maintained normal growth, the grain Fe and Zn concentrations were both reduced by more than 80% in these varieties. All these findings indicate that Cd not only hampers grain yield or toxifies the grain, but also demineralizes the grains and reduces their overall nutritional quality. However, these resistant varieties can be used in conventional breeding approaches along with low-Cd and optimum mineral uptake varieties to breed nutritional and healthy wheat cultivars. GWAS analysis performed here identified several potential SNP-trait associations for Cd uptake and other agronomic traits under Cd stress. The MTA associated with agronomic traits as well as leaf and seed Cd were randomly distributed among landraces, green revolution, post green revolution, and modern cultivars, suggesting that they have not been targeted in any breeding programs. The cultivars having a higher number of favorable alleles can be used as a breeding resource and the present results may provide a basis to initiate genomic selection methods to maximize genetic gains (Jannink, Lorenz, & Iwata, 2010). Therefore, they should be combined in future breeding programs for low or no Cd, just like {Rasheed, 2016 #3}Ppd-D1, Rht-B1, Rht-D1, TaGW2-6A, and TaCKX6-D1 in Pakistani wheat cultivars (Rasheed et al., 2016). The excluder mechanism rendering low or zero Cd in grains is of significant importance and must be fully explored in future studies.
Most of the MTA identified here are mapped under the previously reported QTL either in wheat or its grass relatives. Among the ten MTA identified for PH in T 2 , four present on 2B, 4B, 5B, and 6B are positioned under a previously identified QTL for shoot height on the orthologous chromosomal positions (Xue, Chen, & Zhang, 2009). The Rht1 locus on 4B, which was a major breakthrough against lodging as part of the green revolution, is identified in both experimental conditions in the present study. MTA for BY identified on chromosome 1B is present under a QTL for shoot fresh weight under Cd stress in wheat (Ci et al., 2012). Similarly, MTA for Cd concentration on chromosome 1A, 1D, 2B, 2D, 3B, 3D, and 6B are present under a QTL for Cd uptake on the orthologous position in rice (Wang et al., 2018;Xue et al., 2009). Some loci identified in present study are located in the regions of already cloned Cd resistant genes in rice. For example, MTA for STI, GY, BY, and Seed _Cd on chromosomes 1A, 1B, and 1D are located in the region of OsMTP1 (Yuan, Yang, Liu, Zhang, & Wu, 2012) and MTA for Seed _Cd and TGW detected on chromosomes 2A and 2D are present in the orthologous position of OsPCR1 (Song et al., 2015). Five MTA on 5B (IWB29686, IWB39994, IWB2735, IWB12428, and IWB66179) are identified in the region of Cdu-B1 (5B: 695659830-696420845) which is reported as a major QTL controlling Cd uptake in wheat (Knot et al. 2009). These five MTA are associated to GY under T 2 (Cd stress) and their −log10 P values range from 3.19 to 3.92 (Table S3), none of which could pass the stringent criteria of multiple testing corrections along with many other important MTA discussed above. Although the five MTA present in the region of Cdu-B1 demonstrate the role of this important locus in the uptake of Cd in the diversity panel, it is unclear if the same QTL and candidate gene are causing the associations in this study. These findings suggest that the stringent criteria of Bonferroni correction, used so frequently in GWAS studies, result in the loss of many potential genetic variants (Gordon et al., 2016). However, this is not the first time that Cdu-B1 has been found as a minor effect locus in a GWAS study for Cd uptake in bread wheat. Another possible reason, apart from the statistical stringency, could be the polygenic and complex nature of this trait and the genotypes of different genetic background used, as has been suggested in earlier studies where Cdu-B1 was found as a minor effect locus (Bhatta et al., 2018). Another important SNP on 5B (IWB25639) was identified as associated with Seed _Cd with a −log10 P value of 5.092. This SNP had two alleles as AA and GG in the diversity panel. Genotypes having allele GG show a lesser uptake of Cd as compared to the genotypes with allele AA. Recently, Maccaferri et al. (2019) have reported a Cd/Zn transporter (HMA3) on chromosome 5B with two alleles; TdHMA3-B1a and TdHMA3-B1b. First transports Cd to the vacuoles in the roots thus limiting its uptake to shoot, while the later increases Cd uptake to the aerial plant parts in durum wheat. IWB25639 on 5B showing significant genotypic variation in Cd uptake for its two alleles may have been capturing this HMA3 transporter; however, this too could not fulfill the significance criterion of Bonferroni. This again suggests that the statistically stringency of multiple testing correction to remove false positives emerging from type-I error may result in false negatives caused by type-II error. Therefore, other factors such as consistency over experimental environments or association models and the relevance of underlying genes with the target traits should also be considered before ruling out the significantly associated SNP.
Five loci that passed the multiple testing correction do not colocalize with any of the reported QTL for cadmium uptake in wheat, and hence should be further validated and investigated for candidate genes using other functional genomics approaches. Genes present in these loci encode proteins including glucosyltransferase, acetyltransferase, zinc-finger, and others that have been associated with minerals (particularly Cd) uptake and regulation (Nazar et al., 2012;Whitt et al., 2018). An important locus on chromosome 2B (qZn-2B1 -IWB8132) has six genes that encode 2OG-Fe, zinc finger CCCH-type, zinc finger RING/FYVE/PHD-type, and glucosyltransferase proteins suggesting that it could potentially be involved in the uptake of Cd since Cd uptake takes the same route as that of Fe and Zn (Papoyan et al., 2007;Roth et al., 2006). Further research would require a careful and deep understanding of the evolutionary history of these genes and their physiological role in the relative grass species, considering the complexity and polygenic nature of Cd uptake in plants.

CONCLUSIONS
The present study provides a comprehensive information on genetic variations associated with Cd toxicity and uptake in a historical bread wheat diversity panel from Pakistan. Five novel loci associated with Cd, Fe, and Zn uptake are identified and their associated regulatory genes are studied in reference to their established mechanisms studied or reported in other crops. Candidate genes underlying these loci are potential targets for functional genomics approaches to develop high yielding and contamination free wheat varieties. Further, morphophysiological screening of the diversity panel has identified Cd stress resistant lines (SKD-1 and TD-1), which sustain their grain yield under stress although accumulating high Cd concentration in seeds. These lines can be used with moderate lines (low Cd uptake lines) in conventional breeding programs to produce resistant and healthy wheat cultivars. Thus, this study not only provides an invaluable information for functional genomics through candidate gene mining, it also supplies material for varietal development in stress affected areas.

DATA AVAILABILITY
The phenotype dataset used in this study is available in the supplemental file (Supplemental Table S2). The genetic variation dataset can be found in the European Variation Archives (EVA), Project: PRJEB36127; Analyses: ERZ1283759.

AUTHORS CONTRIBUTION STATEMENT
UMQ designed the study. LBS, FA, SS and ME participated in phenotyping. LBS, FA, SS and ZA participated in the laboratory experimental work. LBS and FA performed the statistical analysis and visualizations. LY, MMT and MI provided assistance in data analysis. LBS drafted the initial manuscript. UMQ, SL and ZM edited the manuscript. UMQ and SL provided the necessary resources and supervised the study. Also, we thank Professor Luigi Cattivelli (CREA -Research Centre for Genomics and Bioinformatics, Italy) for providing valuable suggestions to improve the analytical methods.

C O N F L I C T O F I N T E R E S T
The authors declare no conflict of interest.