Genome-wide association mapping and comparative genomics identifies genomic regions governing grain nutritional traits in finger millet (Eleusine coracana L. Gaertn.)


Societal Impact Statement
Micronutrient deficiency is a serious and underestimated global health concern.
Identifying existing micronutritional richness in traditional crops, and breeding this potential into staple crops that are more frequently consumed, could offer a potential low-cost, sustainable solution to micronutrient deficiency. Here, we provide the first insight into genetic control of grain micronutrient content in the staple food crop finger millet (Eleusine coracana). Quantifying the existing natural variation in nutritional traits, and identifying the regions of the genome associated with these traits, will underpin future breeding efforts to improve not only global food and nutrition security, but also human health.

Summary
• Finger millet is an excellent cereal crop to alleviate micronutrient malnutrition in marginal communities because of its nutrient-dense grains. Identification of the alleles governing these characteristics will help to develop improved germplasm.
• An assembly of 190 genotypes was evaluated for six minerals (iron, zinc, calcium, magnesium, potassium, and sodium) and protein content. A combination of genotyping-by-sequencing (GBS) and genome-wide association study (GWAS) was applied to identify marker-trait associations (MTAs). Candidate genes underlying significant associations were predicted through comparative mapping with other cereals.
• A wide range of variation was observed for all traits. GBS generated 169,365 single-nucleotide polymorphisms, and three subpopulations were identified. GWAS, using general linear and mixed model approaches to correct for population structure and genetic relatedness, identified 418 common markers (p-value ≤ 10⁻³, FDR < 0.1) linked with mineral content. Of these, 34 markers crossed the Bonferroni threshold, of which 18 showed homology with candidate genes having putative functions in binding, remobilization, or transport of metal ions.
• This is the first report to utilize the phenotypic variability of grain minerals in finger millet genotypes to identify MTAs and predict associated putative candidate genes. Post-validation, these markers may be employed to improve grain nutrient quality through marker-assisted breeding.

KEYWORDS
Finger millet, genome-wide association analysis, genotyping-by-sequencing, micronutrients, minerals, population structure, TASSEL

| INTRODUCTION
Key micronutrients such as iron, zinc, and calcium are essential to almost every process in living organisms. Dietary deficiency of these nutrients, also known as micronutrient (MN) malnutrition or hidden hunger, affects almost 2 billion people worldwide across all age groups, genders, and ethnicities (FAO & International Life Science Institute, 1997). Iron and zinc deficiencies affect almost 60% and 30% of the global population, respectively, and MN deficiency is thus a major global health concern in both low- and high-income countries (McLean, Cogswell, Egli, Wojdyla, & De Benoist, 2009). Where populations are weakened by hidden hunger, individuals are at increased risk of succumbing to infection or developing chronic disease. As such, infection and chronic disease account for almost 80% of deaths in malnourished populations. Young children and pregnant and nursing women in the poorest regions of the world are at the highest risk of suffering from such malnutrition. In developing countries, these groups are also the least likely to be able to afford necessary and timely treatment for conditions related to MN malnutrition, placing considerable pressure on both the healthcare sector and family income. Increasing the micronutritional value of staple crops through targeted breeding programs involving varieties with the highest nutritional contents may be a sustainable low-cost solution.
Millets are a group of traditional and heterogeneous cereals often grown in the harshest areas of Asia and Africa (Ramakrishnan et al., 2017). Recently, interest in this group has increased, and millets have gained much attention as potential crops for a "New Green Revolution" due to their high inherent nutritional quality and environmental hardiness (Goron & Raizada, 2015; Padulosi et al., 2009). Finger millet (Eleusine coracana L. Gaertn.), an allotetraploid (2n = 4x = 36, AABB) annual millet, is predominantly grown across Asia and Africa and accounts for 12% of the global cultivated area under millets (Vetriventhan, Upadhyaya, Dwivedi, Pattanashetti, & Singh, 2015). It is the fourth most important millet after sorghum, pearl millet, and foxtail millet (Upadhyaya, Gowda, & Reddy, 2007). In hot, arid regions with low soil fertility, it is able to produce reasonable grain and fodder yields. This ability can partly be attributed to its efficient carbon-concentrating mechanism, the C4 pathway (Hittalmani et al., 2017).
It is also a rich source of several essential amino acids and health-benefitting MNs, phytochemicals, and vitamins. Amino acids like lysine and methionine are often scarce in plant food crops, but they are found in abundance in finger millet. Compared to other cereals, finger millet also contains a high concentration of calcium (350 mg/100 g) in its grains and can be an inexpensive food to treat problems related to osteoporosis. Alongside calcium, finger millet grains are known to contain high quantities of other MNs including iron, zinc, phosphorus, and potassium (Shashi, Sharan, Hittalamani, Shankar, & Nagarathna, 2007; Tripathi & Platel, 2010). Other important properties of finger millet are that it is gluten-free (helpful to patients suffering from celiac disease), has a low glycemic index (helping to control blood sugar levels), and possesses excellent malting (nourishing food for infants) and nutraceutical properties (Kumar, Metwal, et al., 2016).
Major breeding targets for finger millet have included improved agro-morphological characteristics, calcium accumulation, blast resistance, nitrogen use efficiency, and tolerance to phosphorus deficiency (reviewed extensively in Sood et al., 2016 and Gupta et al., 2017). Molecular marker technology is now emerging as an important tool in selection and breeding programs, including for traits that are expensive to phenotype and have complex genetic architecture. Important genes or genomic regions underlying complex quantitative traits can be tagged by markers using biparental or association mapping approaches (Sehgal et al., 2015). Biparental mapping populations are limited by a low number of segregating alleles and lower mapping resolution. Association mapping (AM), on the other hand, utilizes the existing genetic diversity in natural germplasm populations and exploits their historic recombination events to map or fine-map genes. This approach, also known as linkage disequilibrium (LD) mapping, has a higher mapping resolution due to the use of a genetically diverse population (Buckler & Thornsberry, 2002).
In finger millet, AM has been used to identify quantitative trait loci (QTLs) associated with different agro-morphological traits, protein and tryptophan contents, as well as blast tolerance and low phosphorus tolerance (Babu, Agrawal, Pandey, Jaiswal, & Kumar, 2014; Babu, Dinesh, et al., 2014; Ramakrishnan, Ceasar, Duraipandiyan, Vinod, et al., 2016; Ramakrishnan et al., 2017). More recently, Sharma et al. (2018) used SNP markers to conduct genome-wide association mapping of major agro-morphological traits in finger millet. Compared to these traits, understanding of the genetic basis of nutrient accumulation in finger millet grains remains limited.
Our objectives in this study were to generate a large set of genome-wide markers (single-nucleotide polymorphism; SNP) through genotyping by sequencing (GBS) and to demonstrate their use in capturing genetic variations associated with nutritional traits through genome-wide association studies (GWAS). We made use of a population of 190 genotypes assembled by combining individuals from core, minicore, and elite varieties for generating SNP variants, and phenotyped them for grain minerals such as iron, zinc, calcium, potassium, sodium, magnesium, and for total protein content.

| Plant material and assembly of GWAS population
A GWAS population was developed using a set of 190 accessions of finger millet with diverse geographic origins (Table S1). This experimental population included 142 traditional cultivar/landrace accessions derived from the finger millet core and minicore collection at ICRISAT, representing the entire trait diversity in finger millet germplasm (Upadhyaya, Gowda, Pundir, Reddy, & Singh, 2006; Upadhyaya et al., 2010). These BLUPs were used for downstream analysis.

| DNA extraction and genotyping
The population was grown in a greenhouse setting at IBERS, Aberystwyth University, UK. About 100 mg leaf tissue was harvested from 3-week-old seedlings and immediately frozen in liquid nitrogen. The samples were ground to a fine powder using TissueLyser

| SNP identification and filtering
The raw sequence files were parsed based on their barcodes and all reads were trimmed to 64 bp. In the absence of a finger millet reference genome sequence, SNPs were called de novo using the UNEAK pipeline (Lu et al., 2013). SNPs were also called using the GBS pipeline (Glaubitz et al., 2014) and the reference genome sequences of cereals closely related to finger millet, such as tef (Eragrostis tef), foxtail millet (Setaria italica), and rice (Oryza sativa). Both pipelines were implemented in the TASSEL 3.0 package (Bradbury et al., 2007). All the pipeline arguments used are listed in Table S2.
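The barcode parsing and trimming step described above can be pictured with a minimal Python sketch. This is not the TASSEL implementation: the function name, the barcode-to-sample mapping, and the rule of discarding reads shorter than 64 bp after barcode removal are assumptions made for illustration only.

```python
def demultiplex_and_trim(reads, barcodes, length=64):
    """Assign reads to samples by leading barcode, strip the barcode, trim to `length`.

    `reads` is a list of sequence strings; `barcodes` maps barcode -> sample name.
    Both are hypothetical inputs used only for this illustration.
    """
    by_sample = {sample: [] for sample in barcodes.values()}
    # Try longer barcodes first so a short barcode never shadows a longer one.
    ordered = sorted(barcodes, key=len, reverse=True)
    for read in reads:
        for barcode in ordered:
            if read.startswith(barcode):
                insert = read[len(barcode):]
                # Assumption: reads shorter than `length` after barcode removal are dropped.
                if len(insert) >= length:
                    by_sample[barcodes[barcode]].append(insert[:length])
                break  # a read is assigned to at most one sample
    return by_sample
```

Reads that carry no recognized barcode are silently discarded in this sketch; a production pipeline would normally count and report them.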
Genotype likelihood scores were calculated following Etter, Bassham, Hohenlohe, Johnson, and Cresko (2011), and the most probable genotype was assigned as a function of a genotype quality (GQ) score calculated according to the GATK version (https://software.broadinstitute.org/gatk/). VCFtools version v0.1.12a was used to calculate the depth and missingness (Danecek et al., 2011). The remaining SNPs obtained from both pipelines were filtered according to the commands described in Table 1. Initially, a set of SNPs that passed stringent filtering criteria ("stringent" filtering, Table 1) was generated to validate data quality and for population structure analysis. Larger SNP sets were then generated using "relaxed" filtering criteria for GWAS analyses (Table 1). We used the PLINK software tool (Purcell et al., 2007) to calculate the number of unlinked single-nucleotide variants.

| Population structure and genetic relatedness analysis
Discriminant analysis of principal components (DAPC) (Jombart, Devillard, & Balloux, 2010) was employed to analyze the population genetic structure. The most likely number of clusters was inferred using the find.clusters function of the R package adegenet (Jombart, 2008; Jombart & Ahmed, 2011; R Core Team, 2013). The appropriate principal components were calculated from the probability of assignment of individuals to individual clusters, as advised in the manual. The Bayesian Information Criterion (BIC) for K = 1-10 (K = number of populations) was used to indicate the optimal number of populations, taken as the K with the minimum observed BIC value. The optimal number of PCs to retain was determined by optimizing the α-score, which measures the difference between the proportion of successful reassignment in the analysis (observed discrimination) and the values obtained using random groups (random discrimination; Jombart, 2008; Jombart & Ahmed, 2011).
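The idea of choosing K by minimum BIC can be sketched as follows. The expression used here, n·ln(WSS/n) + K·ln(n), is one common form of BIC for k-means clustering and is an assumption of this sketch; it is not claimed to be the exact expression evaluated by the software used in the study. The within-cluster sums of squares (WSS) are hypothetical inputs.

```python
import math

def kmeans_bic(n_individuals, wss, k):
    """BIC for a k-means solution with within-cluster sum of squares `wss` (assumed form)."""
    return n_individuals * math.log(wss / n_individuals) + k * math.log(n_individuals)

def best_k(n_individuals, wss_by_k):
    """Return the K whose BIC is lowest; `wss_by_k` maps K -> WSS."""
    return min(wss_by_k, key=lambda k: kmeans_bic(n_individuals, wss_by_k[k], k))
```

Because WSS always decreases as K grows, the penalty term K·ln(n) is what stops the criterion from favoring ever more clusters; in practice the BIC curve is also inspected visually for an elbow.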
The SNP-based genetic groups were employed to estimate genetic distances using Nei's standard genetic distance (Saitou & Nei, 1987). These values were then used to construct a phylogenetic tree using the neighbor-joining (NJ) method in the R "ape" package (Paradis, Claude, & Strimmer, 2004). The "dist" function of the package was employed to calculate the Euclidean distance matrix.
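For reference, Nei's standard genetic distance between two groups X and Y is D = −ln(J_XY / √(J_X · J_Y)), where J_X and J_Y are the summed squared allele frequencies within each group and J_XY is the summed product of allele frequencies across groups. The sketch below applies this to biallelic SNPs; the input format (one reference-allele frequency per locus and per group) is an assumption made for illustration.

```python
import math

def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance for biallelic loci.

    `freqs_x` and `freqs_y` hold the reference-allele frequency at each locus
    in groups X and Y (hypothetical inputs); the alternate allele has frequency 1 - p.
    """
    jx = jy = jxy = 0.0
    for px, py in zip(freqs_x, freqs_y):
        jx += px * px + (1 - px) * (1 - px)   # homozygosity within X
        jy += py * py + (1 - py) * (1 - py)   # homozygosity within Y
        jxy += px * py + (1 - px) * (1 - py)  # identity between X and Y
    return -math.log(jxy / math.sqrt(jx * jy))
```

The function is undefined when the two groups share no alleles at any locus (J_XY = 0), which this sketch does not handle.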

| Analytical determination of mineral and protein content
About 5 g of seeds from each entry were finely ground in a Retsch® mill (model ZM 200 GmbH, Germany) fitted with a 1-mm filter mesh. The milled samples were used for determination of six minerals (iron, sodium, potassium, magnesium, calcium, and zinc) along with nitrogen for estimating total protein. Briefly, 1 g of powdered sample was subjected to overnight aqua regia digestion in 100 ml Kjeldahl flasks. The digest was filtered using filter paper disks (Whatman®) into a 50 ml volumetric flask and further diluted with additional dilute aqua regia mix. The multielement analysis was carried out on each grain digest using inductively coupled plasma optical emission spectroscopy (ICP-OES; Optima 8000DV, PerkinElmer, USA) in triplicate. Nitrogen content in the ground seed samples was detected by combustion followed by thermal conductivity using the Leco FP-528 Nitrogen/Protein Determinator (LECO, 2016). The total nitrogen percentage was multiplied by 6.25 to calculate crude protein content in the grains (Mariotti, Tomé, & Mirand, 2008). To ensure experimental accuracy, two standard analytical quality control samples and a blank were included for each run. All the estimations were conducted at the IBERS analytical chemistry and metabolomics facility according to the standard Association of Analytical Communities (AOAC) protocols (AOAC, 2016). Two genotypes were excluded from downstream analysis as they failed the ICP-OES analysis.
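The two unit conversions implied above can be written out as a short sketch. The nitrogen-to-protein factor of 6.25 is taken from the text; the assumption that the ICP-OES reading is a concentration in mg/L of a 50 ml digest prepared from 1 g of sample is ours, and the function names are hypothetical.

```python
def crude_protein_percent(nitrogen_percent, factor=6.25):
    """Crude protein (%) from total nitrogen (%), using the conversion factor in the text."""
    return nitrogen_percent * factor

def mineral_mg_per_100g(conc_mg_per_l, digest_volume_ml=50.0, sample_mass_g=1.0):
    """Convert a digest concentration (mg/L) to grain content (mg/100 g).

    Assumes the whole digest was made up to `digest_volume_ml` from
    `sample_mass_g` of ground grain, with no further dilution.
    """
    mg_in_digest = conc_mg_per_l * digest_volume_ml / 1000.0
    return mg_in_digest / sample_mass_g * 100.0
```

Under these assumptions, a calcium reading of 62.8 mg/L corresponds to about 314 mg/100 g of grain.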

| Linkage disequilibrium and genome-wide association analysis (GWAS)
The stringently filtered set of SNPs was used to analyze linkage disequilibrium (LD) using TASSEL software with the default settings. The squared Pearson correlation values (R²) and pairwise distances between SNPs from this analysis were then imported into R to generate the genome-wide LD decay plots.
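As an illustration of the LD statistic, the following sketch computes the squared Pearson correlation between two SNPs. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with None for missing calls; this coding and the function name are assumptions, and the calculation is not claimed to reproduce TASSEL's internal handling of genotypes.

```python
def ld_r2(genotypes_a, genotypes_b):
    """Squared Pearson correlation between two SNPs across individuals.

    Individuals with a missing call (None) at either SNP are skipped.
    Returns NaN when fewer than two individuals remain or a SNP is monomorphic.
    """
    pairs = [(a, b) for a, b in zip(genotypes_a, genotypes_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov * cov / (var_a * var_b)
```

An LD decay plot is then obtained by plotting these values against the distance between the two SNPs for all marker pairs.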
To conduct GWAS, three statistical models were employed in the software TASSEL 5.0 (Bradbury et al., 2007; Buckler et al., 2009).
Table 1 (fragment). LD pruning with PLINK (window 1,000; LD 0.5): ./plink --bfile binaryfilename --make-founders --indep-pairwise 1000 50 0.5 --out givenewname
The significance of associations was estimated using a multiple testing approach with Bonferroni correction as well as by the false discovery rate (FDR) method (Benjamini & Hochberg, 1995) employed through the QVALUE package in R (Storey & Tibshirani, 2003). Overall, an MTA was called significant if it had −log10 p ≥ 3.00 (p ≤ .001) for GLM and −log10 p ≥ 2.00 (p ≤ .01) for MLM, and presented an FDR < 0.1 across both models. The corresponding R² values were used to represent the proportion of the phenotypic variation explained (PVE) by each marker. The p-values from the models were used as input to generate Manhattan and quantile-quantile (QQ) plots using the R package qqman (Turner, 2018).
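The multiple-testing logic above can be sketched in a few lines. The study used the QVALUE package; the sketch below instead uses Benjamini-Hochberg adjusted p-values as a simple stand-in, so its numbers would not match q-values exactly. The function names, the 0.05 family-wise level for Bonferroni, and the per-marker inputs are assumptions for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold under Bonferroni correction."""
    return alpha / n_tests

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def is_significant_mta(p_glm, fdr_glm, p_mlm, fdr_mlm, fdr_cutoff=0.1):
    """Apply the thresholds stated in the text: p <= .001 (GLM), p <= .01 (MLM), FDR < 0.1 in both."""
    return p_glm <= 1e-3 and p_mlm <= 1e-2 and fdr_glm < fdr_cutoff and fdr_mlm < fdr_cutoff
```

Requiring a marker to pass in both models is conservative: a marker that is significant in the GLM alone may reflect population structure that the MLM's kinship term absorbs.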

| Genotyping and detection of SNP variants
The two libraries (four lanes) generated a total of ~66 GB of raw data with 1.02 × 10⁹ reads (

| Population structure analysis
The population structure of this collection was described without any a priori group assignment. The "find.clusters" function, using the k-means algorithm, retained 200 principal components (PCs) that accounted for more than 99% of the variance. Figure 1a shows the percentage of variance explained by the first 10 PCs. The α-score (the reassignment probability for the given populations minus the reassignment probability for randomly permuted groups) was used to determine the optimal number of PCs to retain. Optimization based on the α-score reduced the number of PCs needed for the assignment analysis to only two (Figure 1b). An elbow curve of BIC values as a function of K indicated that the optimal number of clusters was three (Figure 1c).
Using this information (three clusters from BIC and two α-optimized PCs from DAPC), a scatter plot was drawn (Figure 1d). This plot distinguished three separate clusters, corresponding to their geographic origins. The first PC, which explained 7.63% of the genetic variation, separated cluster 2 (mainly Asia) from clusters 1 and 3 (African region; Figure 1d). The second PC explained 5.13% of the genetic variation and mainly separated the latter two populations into two further groups: East African populations (cluster 1) and those originating from southern Africa (cluster 3; Figure 1d). The probability of membership assignment was 100% for clusters 1 and 2 and 98% for cluster 3. Pairwise Fst values among DAPC clusters ranged from 0.047 (Cluster 1-Cluster 3) to 0.074 (Cluster 2-Cluster 3; Table 2), signifying high genetic differentiation of the subpopulations. In addition, the posterior probability plots that were drawn based on the posterior membership probabilities of each individual to either of the three clusters (Table S3)

| Phenotypic variation among accessions
The entire range of six minerals (calcium, iron, zinc, sodium, potassium, magnesium) and total protein content in the finger millet GWAS population is shown in Figure 3. The variability across all traits, as estimated by the coefficient of variation, ranged from ~8% to 37%. The mean calcium content over all the accessions was 314 mg/100 g, which is in agreement with several previous studies. Weak to moderate significant correlations were found between grain MN and protein content traits (Table 4). Grain iron in particular was only weakly correlated with other traits such as magnesium and calcium content. Magnesium showed a moderate positive correlation with calcium but a weak correlation with zinc content. Grain potassium content showed a weak negative correlation with magnesium and calcium content, but a mild positive correlation with sodium content. Zinc and sodium were weakly negatively correlated. Weak positive correlations were also found between sodium and potassium, and between protein and zinc. None of the traits showed very strong correlations.
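The two summary statistics used in this paragraph, the coefficient of variation and the Pearson correlation between traits, can be computed as in the sketch below. The function names and the use of the sample standard deviation are assumptions; the trait values passed in would be per-accession means, which are not reproduced here.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Coefficient of variation in percent: sample standard deviation over the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long trait vectors."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Applying `pearson_r` to every pair of traits gives a correlation matrix of the kind summarized in Table 4.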
When compared within groups, the mineral and protein contents were not vastly different (Figure S1). Calcium and potassium content was slightly lower and zinc was higher in the elite local varieties, while iron was higher in the minicore accessions (Figure S1). The genotypes belonging to the Asian subpopulation had relatively higher calcium content than the African population, whereas the latter was richer in potassium and zinc content (Figure S2).

| Linkage disequilibrium, genome-wide association analysis, and identification of candidate genes
As finger millet genome sequencing is still in its infancy, it is difficult to predict the exact rate of LD decay. The rate of LD decay, measured by R² values (squared Pearson correlation) plotted against the physical distance between markers, is shown in Figure S3.
For grain iron content in finger millet genotypes, 894 markers were identified using the GLM, while 148 associations were found using the MLM approach (Dataset S1; Figure S4). All MTAs generated by the MLM, except marker S1_42935743, were also among the MTAs for iron content identified through the GLM analysis. Of the 148 MTAs, a locus at marker S1_5895347 was the most strongly associated with iron and explained 24.59% of the phenotypic variation. The 444 markers identified as associated with potassium content through GLM analysis were reduced to 174 markers after correcting for family relatedness (K). The two analyses had 164 markers in common (Dataset S1; Figure S5). The most significant association was shown by S1_55418346, explaining 18.03% of the variation. Grain sodium content was another trait with a high number of trait-SNP associations. GLM- and MLM-based association analyses revealed 639 and 106 MTAs, respectively (Dataset S1; Figure S6). Of these, 104 MTAs were common to both approaches. About 18.66% of the phenotypic variation for sodium content was explained by marker S1_47207745. Both GLM and MLM identified five common significant MTAs for magnesium content, viz. S1_47630040, S1_463458, S1_33226241, S1_37853063, and S1_23369654 (Dataset S1; Figure 4a-4c). Of these, marker S1_47630040 explained 18.84% of the phenotypic variation (R²) in magnesium content. Although the GLM approach identified 96 markers as significantly associated with finger millet zinc content, none met the FDR cut-off in the MLM (Dataset S1; Figure S7). Similarly, the GLM identified seven MTAs for calcium content above the p-value and FDR thresholds. Although these markers were also identified through the MLM approach, they did not cross the FDR < 0.1 threshold (Dataset S1; Figure S8). Both GLM and MLM identified several SNPs as significantly (p ≤ .001) associated with grain protein content as well; however, they were not considered further after correcting for multiple testing.
We also found a few SNPs to be associated with more than one trait.
The SNP S1_15880246 was significantly associated with iron (p-value identified across GLM and MLM models, the 34 high confidence MTAs were used to perform homology search with other plants. In addition, as finger millet is known for high calcium content in its grains, we also performed homology search using the seven SNPs found to be associated with grain calcium content in the GLM.
Due to the current lack of a high-quality complete assembly of the finger millet whole genome sequence, it was difficult to directly estimate the genomic location of identified associations.
Hence, based on an in silico comparative mapping approach using the 64 bp SNP harbouring sequences (Table S5) (Table S6). The remaining 12 SNPs did not show any hits in the genomes of other monocots. Of those having an orthologous region, 18 corresponded to predicted mRNA or genic sequences, whereas four belonged to chromosomal regions/scaffolds. Furthermore, of the seven SNPs associated with calcium, three (S1_4620123, S1_44130155, and S1_5982733) matched orthologous cDNAs in other species. The majority of orthologues were observed in other monocot species such as Setaria italica, Oryza sativa, Zea mays, Eragrostis tef, Brachypodium distachyon, Oropetium thomaeum, Sorghum bicolor, Panicum hallii, and Aegilops tauschii. Many of these were predicted to have roles in metal ion binding, metal remobilization, or detoxification (Table S6).

| DISCUSSION
GBS has been used efficiently in millets with poorly assembled genomes, such as pearl millet, to generate a huge repository of SNPs for conducting various analyses such as genetic diversity and/or GWAS (Hu et al., 2015; Sehgal et al., 2012). Using this robust, multiplexed, high-throughput, and low-cost GBS technique, 169,365 genome-wide high-quality SNPs were generated in our study. We have therefore shown that GBS can be used to generate a large number of high-quality markers in orphan species like finger millet, where marker number is currently limited.
The DAPC-based population structure analysis divided the population into three well-defined clusters related to their inherent genetic differences, which were mostly associated with their geographic origin. Irrespective of the germplasm collection, that is, among both the minicore/core and the elite germplasm, a clear distinction between the subpopulations enriched in accessions from Asia and those from parts of Africa was revealed. Such differentiation based on geographic pattern has also been previously reported (Kumar, Sharma, et al., 2016). This division was further supported by Fst values. Pairwise comparisons between the two African subpopulations (cluster 1 and cluster 3) showed lower Fst values than those between the East African and Asian subpopulations (cluster 1 and cluster 2), or between the southern African and Asian subpopulations (cluster 3 and cluster 2). Thus, the less diverse genetic background of the East and southern African accessions suggests that they have a common evolutionary lineage and might have evolved from the same natural population with its primary center of origin in Africa (Harlan & De Wet, 1971). Grouping of accessions from major finger millet growing regions (for example, East African cluster countries such as Kenya, Uganda, and Tanzania) could be attributed to their close geographic proximity. This grouping is indicative of the conservation of a common gene pool between the primary and secondary centers of origin of finger millet, resulting in higher rates of gene flow between the member countries (Hilu, de Wet, & Harlan, 1979). Unlike in the study by Bharathi (2011), however, European or American accessions were not grouped together into a single cluster. The marker density used in previous studies may have been too poor to reveal these genetic groups. As Kumar, Sharma, et al. (2016) highlighted previously, some overlap of genotypes with accessions from other countries may be due to extensive exchange and the ensuing hybridization and selection of Indian and African germplasm, leading to allelic reshuffling among indigenous germplasm.
The minicore and core germplasm sets used in this study have been well characterized previously and possess ample variability for several morphological and agronomic traits as well as grain nutrient content (Upadhyaya et al., 2006, 2011). We specifically included 48 locally adapted elite genotypes in the set to establish a direct comparison (of mineral and protein contents) with the genotypes of the core and minicore collections. In agreement with prior studies, we found high variation for all seven traits in this study, suggesting that our association panel has sufficient variation to be effectively used for GWAS of various grain quality traits. However, with the exception of the sodium content of elite genotype KNE 622, most of the minicore/core genotypes surpassed the trait values of the elite varieties. This is possibly the result of the modern elite varieties being developed through breeding programs aimed at improving specific traits, such as grain yield, rather than grain nutritional content. As the traditional objective of agricultural systems and public policies has been to improve crop yields rather than their nutritional content, such high-yielding cultivars often suffer a tradeoff between quality and quantity (Graham, Senadhira, Beebe, Iglesias, & Monasterio, 1999). Higher micronutrient concentration, however, does not always lead to a grain yield penalty, as evidenced by grain yield remaining unlinked to higher grain iron and zinc content in wheat (Welch & Graham, 2004) and pearl millet (Gupta, Velu, Rai, & Sumalini, 2009; Rai, Govindaraj, & Rao, 2012). Such results suggest that simultaneous selection for higher micronutrient contents without compromising grain yield is possible. Genotypes of the finger millet minicore/core with better nutritional content can be used as donor genotypes in crosses for finger millet improvement.
A decisive factor in the design of GWAS is the systematic characterization of the LD patterns in the genome (Serba et al., 2019). LD is affected by several factors including fertilization behavior, rate of recombination, selection pressure, genetic drift, physical linkage, and population structure. As finger millet is a self-pollinated species, its lower rate of recombination allows for relatively larger haplotype blocks. A previous study in finger millet showed that 17.9% of SNP marker pairs had significant LD at R² > 0.05 (Sharma et al., 2018). In other self-pollinating millets such as foxtail millet, the genome-wide LD is reported to range from 100 to 177 kb (Jaiswal et al., 2019; Jia et al., 2013). In the polyploid Arabidopsis kamchatica, the mean LD decay was found to be 5-10 kb, similar to A. thaliana and Medicago truncatula (LD decay within 2-10 kb; Branca et al., 2011; Cao et al., 2011). The exact LD pattern of markers could not be identified here due to the lack of physical mapping distances between markers. A more detailed analysis of LD and more efficient QTL analysis may benefit from the recently published whole-genome draft sequence and assembly of finger millet (Hatakeyama et al., 2018).
QTL studies in finger millet have utilized association mapping based on low numbers of genic or genomic SSR markers. Such studies have allowed the identification of QTLs in finger millet including three for resistance to finger blast, three for leaf blast, one for neck blast (Babu, Dinesh, et al., 2014), five for agro-morphological characters (Babu, Agrawal, Pandey, Jaiswal, et al., 2014), seven for leaf blast resistance, tiller number, root length, and seed yield (Ramakrishnan, Ceasar, Duraipandiyan, Vinod, et al., 2016), and four for phosphorus response traits (Ramakrishnan et al., 2017). In principle, the selfing nature of finger millet, and its low rate of LD decay, should render a low marker density sufficient for identifying candidate genes within a larger genomic region (Ramakrishnan et al., 2017). On this basis, a recent study found 109 novel SNPs to be associated with important agro-morphological traits such as grain yield in finger millet (Sharma et al., 2018). With only a few reports of genome-wide SNP markers and no reported MTA for micronutrient content in finger millet, we proceeded to generate a unique panel of SNPs and conduct a targeted GWAS. We attempted to identify MTAs between several new GBS-based SNP markers and a set of traits that are crucial for human nutrition. In total, 418 MTAs for four out of seven traits were identified. These SNPs can be considered robust as they were retained in both GLM and MLM analyses, although the latter model was more effective in controlling for confounding by population structure. Spurious false positive associations due to multiple testing were controlled by applying FDR. The results also suggested that some of the significant MTAs were not detected by MLM because they did not meet the FDR criterion. It has often been reported that stringent FDR correction can sacrifice genuine MTAs as false negatives (Jaiswal et al., 2019; Kulwal et al., 2012).
Thus, the MTAs identified for grain zinc and calcium content may still be worthy of further study and validation. Candidate genes for iron, sodium, potassium, and magnesium have not been previously reported in finger millet, so our results present a novel finding. The MTAs identified through such genome-level profiling are critical to initiate the identification of donor genotypes carrying desirable traits to act as divergent parents in finger millet breeding. The recent release of the finger millet whole genome sequence will serve as a powerful reference tool in the coming years. As the rate of LD decay becomes known and the genomic location of clusters of significant SNPs can be revealed, the data presented in this study can be used to conduct precise, high-throughput GWAS. In terms of marker density, the number of SNPs used in this study could not capture the entire range of QTLs; GWAS resolution would benefit from a higher marker density that covers nearly every haplotype block in finger millet. For example, an increase in SNP density in rice genotyping arrays from 44,100 SNPs to 700,000 SNPs, and a further imputation-based augmentation to 5.2 million SNPs, has markedly improved the detection of genotype-phenotype associations and functional polymorphisms underlying many of the QTLs identified by GWAS (McCouch et al., 2016; Wang et al., 2018; Zhao et al., 2011). Thus, the accurate dissection of QTLs and associated SNPs will prove highly beneficial for MAS of grain nutritional traits of finger millet.
Furthermore, we used the markers to detect candidate loci underlying the complex trait of grain nutrient content and accumulation. Due to the lack of whole-genome sequence information, however, the exact genomic position of the identified loci could not be predicted. Unlike the candidate gene-based approach employed by Babu, Agrawal, Pandey, Jaiswal, et al. (2014), Babu, Dinesh, et al. (2014) and Nirgude et al. (2014), we utilized a comparative genomics approach to identify genomic regions that remain conserved across genomes. This method has previously proved useful in finger millet GWAS to delineate putative candidate genes for grain yield, flowering time and time to maturity (Sharma et al., 2018). The 64 bp sequences harboring the SNP tags provided a relatively short query against the NCBI/Phytozome/Ensembl databases. An orthologous predicted mRNA or genic sequence was present for 18 SNPs. Among these, probably the most interesting candidate genes are those with a predicted molecular function in metal ion binding or transport. For example, S1_30253617, which was found to be associated with iron content, was similar to an uncharacterized foxtail millet protein containing a No apical meristem-associated (NAM) domain. NAM, a member of the NAC protein family (Puranik, Sahu, Srivastava, & Prasad, 2012), has been reported to play a role in iron and zinc remobilization to seeds during leaf senescence (Ricachenevsky, Menguer, & Sperotto, 2013).
For marker S1_23343453, which was associated with iron content, a probable mitochondrial 3-hydroxyisobutyrate dehydrogenase-like 1 (LOC101754224), homologous to that in Setaria italica, was identified. The mammalian homologue of this gene is involved in accumulation and trafficking of intracellular iron (Devireddy, Hart, Goetz, & Green, 2010; Liu, Velpula, & Devireddy, 2014). The identification of associations with a number of markers indicates that there may be several genes functioning to remobilize, traffic, and maintain the levels of grain iron content in finger millet. We also found the sodium-associated marker, S1_53281655, to be a possible homologue of the uncharacterized Et_s7379-1.39-1.mrna1 from a close relative of finger millet, Eragrostis tef. This gene encodes a transcript containing a tetratricopeptide repeat, believed to have a role in providing salinity tolerance (Rosado et al., 2006). It is possible that this kind of gene may work to maintain the intracellular osmotic ratio and protect the seed from ionic imbalance. Some transcripts were identified with functions involved in phenotypic responses such as root development and regulation of plant inflorescence architecture, or in defense response to stress, flavonoid biosynthesis, proteolysis, and regulation of transcription. Interestingly, of the seven SNPs associated with calcium content, the sequence harboring SNP S1_5982733 was similar to a SEUSS-like transcriptional corepressor. This corepressor is known to be involved in several developmental processes, which may often occur in a calcium-dependent manner, as known in mammals (Kashani et al., 2006). No previous studies have reported the role of these genes in finger millet; hence, it is imperative to validate these significant MTAs in future studies to ascertain the mechanism of transport, homeostasis, and allocation of grain minerals from the source to sink organs.
Postvalidation, these SNPs can be utilized in full-length gene cloning and aid the introgression of favorable alleles into locally well-adapted germplasm through marker-assisted breeding.

CONCLUSIONS
Our study highlights, for the first time, the potential of a large set of GBS-derived SNP markers for identifying population structure and for GWA mapping of six grain MNs and protein content in finger millet. We included a diverse set of finger millet germplasm, including several African elite varieties that have never before been utilized in any such study. The study uncovers several novel MTAs and underlying candidate genes for grain mineral content in finger millet that until now remained unidentified. Without further knowledge of LD decay and a reference genome, the true genomic location of the identified QTLs cannot currently be reported. However, with the finger millet genome sequence information recently becoming available in the public domain, the associated markers identified in this study could be used much more precisely and, once validated, applied in finger millet breeding programs. The work provides an opportunity to use this sequence variation for the identification and better characterization of novel alleles and genotypes, and to help breed more nutritious finger millet genotypes. Promising alleles can provide strong leads in this direction for future marker-assisted selection or for candidate gene cloning. Several of the elite varieties identified as having lower nutritional content may be taken up as target breeding material, exploiting the existing variability to further improve their value. The work can be a route to understanding the genetic pathways underlying the high nutritional value of finger millet grains. Moreover, from the human nutritional perspective, such genotypes can be encouraging candidates to improve the delivery of the recommended daily intake of MNs, benefiting the population as a whole and particularly currently marginalized communities.

ACKNOWLEDGMENTS
We thank Dr. Gancho Slavov for key support with GWAS concept.
This study was performed thanks to the funds from the European and GACR Junior grant (20-25845Y), respectively. Moreover, constructive comments by the two reviewers are greatly appreciated as they helped us to significantly improve the manuscript.

CONFLICT OF INTEREST
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.