Joint‐linkage mapping and GWAS reveal extensive genetic loci that regulate male inflorescence size in maize

Summary Both insufficient and excessive male inflorescence size leads to a reduction in maize yield. Knowledge of the genetic architecture of male inflorescence is essential to achieve the optimum inflorescence size for maize breeding. In this study, we used approximately eight thousand inbreds, including both linkage populations and association populations, to dissect the genetic architecture of male inflorescence. The linkage populations include 25 families developed in the U.S. and 11 families developed in China. Each family contains approximately 200 recombinant inbred lines (RILs). The association populations include approximately 1000 diverse lines from the U.S. and China. All inbreds were genotyped by either sequencing or microarray. Inflorescence size was measured as the tassel primary branch number (TBN) and tassel length (TL). A total of 125 quantitative trait loci (QTLs) were identified (63 for TBN, 62 for TL) through linkage analyses. In addition, 965 quantitative trait nucleotides (QTNs) were identified through genomewide study (GWAS) at a bootstrap posterior probability (BPP) above a 5% threshold. These QTLs/QTNs include 24 known genes that were cloned using mutants, for example Ramosa3 (ra3), Thick tassel dwarf1 (td1), tasselseed2 (ts2), liguleless2 (lg2), ramosa1 (ra1), barren stalk1 (ba1), branch silkless1 (bd1) and tasselseed6 (ts6). The newly identified genes encode a zinc transporter (e.g. GRMZM5G838098 and GRMZM2G047762), the adapt in terminal region protein (e.g. GRMZM5G885628), O‐methyl‐transferase (e.g. GRMZM2G147491), helix‐loop‐helix (HLH) DNA‐binding proteins (e.g. GRMZM2G414252 and GRMZM2G042895) and an SBP‐box protein (e.g. GRMZM2G058588). These results provide extensive genetic information to dissect the genetic architecture of inflorescence size for the improvement of maize yield.


Introduction
The male inflorescence (tassel) is an indispensable organ for maize production because it provides pollen for hybridization (Upadyayula et al., 2006). Although well-developed male inflorescences have been observed in teosintes (Piperno and Flannery, 2001), they were reduced during maize domestication and improvement (Duvick and Cassman, 1999) because a larger male inflorescence would compete for photosynthate or contribute to the shading effect (Crue and Wasson, 1996). In fact, varieties with an ideal male inflorescence can produce sufficient pollen and transform more energy into kernels (Eveland et al., 2014;Upadyayula et al., 2006). Brewbaker (2015) reported a significant decrease in male inflorescencerelated traits, such as tassel primary branch number (TBN) and tassel length (TL), from typical landraces to temperate adapted lines. The male inflorescence is controlled by a large number of genetic loci/genes (Tanaka et al., 2013); however, knowledge of its molecular basis remains poor (Eveland et al., 2014). Therefore, an in-depth investigation of the genetic loci/genes that control male inflorescence could provide information for determining the molecular mechanisms of its variation and could improve breeding strategies for enhancing maize grain yield.
There are considerable efforts to map quantitative trait loci (QTLs) for male florescence-related traits (Berke and Rocheford, 1999;Chen et al., 2014;Mickelson et al., 2002;Upadyayula et al., 2006). However, only limited number of QTLs were identified because only two parental inbreds were selected in each population, which resulted in a limited number of recombination events. To further dissect the genetic basis of the male inflorescence, Yang et al. (2014) used the GWAS method in a diverse panel and identified 30 TBN-and 33 TL-related genetic loci, which significantly improved the resolution compared with previous studies. However, the use of GWAS in diverse panels was challenged by the false discoveries caused by the population structure and unequal relatedness among individuals (Zhang et al., 2010). Thus, McMullen et al. (2009) employed a different design, that is the nested association mapping population (US NAM) to dissect genetic architecture of complex traits ) using joint-linkage analysis. Furthermore, effective and reliable results were obtained from GWAS in these populations . Using this US NAM population, many genetic loci associated with different traits of agronomic importance were identified Buckler et al., 2009;Cook et al., 2012;Kump et al., 2011;Peiffer et al., 2014;Tian et al., 2011). In particular, Brown et al. (2011) detected 39 TBN-and 37 TLrelated QTLs and also identified 241 TL-related QTNs and 325 TBN-related QTNs. In their study, 13 male inflorescence-related genes that were cloned using mutant genetics were validated with TBN-or TL-related QTNs falling within 1 centi-Morgen (cM) of these genes. The NAM populations provide high power for the dissection of the genetic architecture of the male inflorescence of maize. However, the mapping resolution is less desirable for gene cloning due to limited number of recombination events.
Recently, with the development of high-throughput sequencing platforms, Eveland et al. (2014) identified 164 genes that were specifically expressed in the male inflorescence using transcriptional expression profile analysis ofra1-R, ra2-R and ra3-fea1 introgressed families with an identical genetic background of B73. They established a regulatory model and provided additional proof for validating the gene function of the male inflorescence. However, the understanding of the male inflorescence remains limited due to complex molecular networks (Eveland et al., 2014) and a limited number of parental lines used previously Chen et al., 2014;Mickelson et al., 2002;Upadyayula et al., 2006). Therefore, more parental lines with wider genetic diversity are desirable, and a combination of joint-linkage mapping, GWAS and transcriptional expression profile analysis would be better for dissecting the genetic basis of the male inflorescence in maize.
In this study, we constructed another NAM population (CN NAM) using Huangzaosi (HZS) as the common parental line, with larger male inflorescence size (TBN of 13.93, TL of 29.06) compared with that of B73 (TBN of 10.24, TL of 28.76), which can provide abundant pollen for hybridization in breeding practice. The other 11 parental lines of CN NAM were from different heterotic groups of Reid, Tangsipingtou (TSPT), Lancaster, and the P and Tem-tropic I groups (Wu et al., 2014), which would provide some new allelic variations. In addition, one global association panel (AP) of 945 diverse lines was collected, which could maximize the genetic diversity of the maize germplasm. The objective of this study was to identify the QTLs, QTNs and genes associated with TBN and TL by combining CN NAM, US NAM and AP to provide extensive genetic information for dissecting their potential molecular mechanisms.

Phenotypic variations
The phenotypes of TBN and TL are shown in Tables 1, S2 and Figure S2. The results indicated that there were abundant phenotypic variations in the experimental materials, which benefited the dissection of the genetic architecture of the male inflorescence. Broad-sense heritability ranged from 0.83 to 0.92 for TBN, and 0.69 to 0.85 for TL. A correlation analysis of TBN and TL showed significantly positive associations among different environments, with a P value less than 1 9 10 À4 (Table S3). Therefore, a single best linear unbiased predictor (BLUP) value was calculated for TBN and TL across all locations.

Genetic loci controlling phenotypic variations
For CN NAM, 28 TBN-related QTLs were detected (Table 1 and Fig. 1a), 12 of which were shared in single families (Fig. S3); the average phenotypic variation explained (PVE) value was 10.19%, and this value varied from 5.69% to 25.33% (Table S5). A total of 25 TL-related QTLs were detected (Table 1 and Fig. 1b), 13 of which were shared in single families (Fig. S4), with an average PVE value of 11.46% and a range of 5.92% to 30.02% (Table S5). For US NAM, 35 TBN-related QTLs were identified (Table 1 and Fig. 1a), 29 of which were shared in single families (Fig. S3); the average PVE value was 10.73%, and this varied from 3.75% to 39.57% (Table S6). A total of 37 TL-related QTLs were identified (Table 1 and Fig. 1b), 22 of which were shared in single families (Fig. S4); the average PVE value was 10.58%, and this varied from 4.57% to 31.12% (Table S6).
The comparison analysis of the joint-linkage analysis results between CN and US NAM (Fig. 2a) showed that 32 tassel-related QTLs were observed specifically in CN NAM. A total of 51 tassel-related QTLs were identified specifically in US NAM. A total of 21 QTLs, including 13 TBN-and 8 TL-associated QTLs, were identified as common to both CN and US NAM. In addition, there were 23 common QTLs between TL and TBN (Fig. 2b).
The GWAS results showed that a total of 549 TBN-related QTNs with BPP values greater than or equal to 0.05 at a P value less than 5.46 9 10 À6 were identified. The use of CN&US NAM showed the largest detection power compared with any other population, where 328 QTNs were detected compared with 163 in US NAM, 32 in CN NAM, 15 in CN&US AP, 0 in US AP, and 11 in CN AP. Similarly, a total of 416 TL-related QTNs were identified with BPP values greater than or equal to 0.05 at a P value less than 5.66 9 10 À6 . CN&US NAM also showed the largest detection power (252 QTNs found) compared with 106 in US NAM, 39 in CN NAM, 19 in CN&US AP, and 0 in US AP and CN AP (Table S4). The comparison analysis of the GWAS results from CN, US and CN&US NAM showed that none of the common QTNs associated with TL were found in the three NAM populations (Fig. 2c). However, 28 TL-related QTNs were shared between US and CN&US NAM, 5 were shared between CN and CN&US NAM, and 3 were shared between CN and US NAM (Fig. 2c). In addition, there was one common QTN associated with TBN in CN, US and CN$US NAM, 45 QTNs were shared between US and CN&US NAM, 2 were shared between CN and US NAM, and 4 were shared between CN and CN&US NAM (Fig. 2d).

Overlap between QTNs and QTLs
Significant overlaps existed between the QTNs detected using GWAS and the QTLs detected using the joint-linkage analysis (Fig. 3), with P value less than 0.001 between actual proportion and expected ones. Detailed information on the QTNs and QTLs are listed in Table S4. For CN&US NAM, the frequency of the observed SNPs falling within the QTLs was significantly higher than that of random SNPs, and similar results were found in both CN NAM and US NAM (Fig. 3b). In addition, for the TBN-related QTNs, 300 of 549 QTNs were co-located within the QTL intervals, a finding that was supported by the joint-linkage analysis, with 78 QTNs specific only to the CN NAM QTL intervals, 179 QTNs specific only to the US NAM QTL intervals, and 43 QTNs shared between the CN NAM and the US NAM QTL intervals. Among the 416 TL-related QTNs, 243 were co-located within the QTL intervals, with 61 QTNs specific only to the CN NAM QTL intervals, 161 QTNs specific only to the US NAM QTL intervals and 21 QTNs shared between the CN NAM and US NAM QTL intervals. These loci would be more robust and closely linked to the causal effects of variations in the male inflorescence.

QTN enrichment in candidate genes
The QTNs identified in association studies only indicated the linkage disequilibrium between the SNPs and the phenotypes. To build connections between these associated SNPs and the potential causal genes, we conducted the enrichment studies on candidate genes. The results showed significant QTN enrichment within the candidate genes, with 3.47 vs 1 between the actual QTNs and the random SNPs falling within 0 kb of the candidate genes, 1.5 vs 1 between the actual QTNs and the random SNPs falling within 50 kb of the candidate genes, 1.4 vs 1 between the actual QTNs and the random SNPs falling within 100 kb of the candidate genes, 0.82 vs 1 between the actual QTNs and the random SNPs falling within 200 kb of the candidate genes, and 0.61 vs 1 between the actual QTNs and the random SNPs falling within 400 kb of the candidate genes (Fig. 4). In addition, 113 candidate genes (Fig. 1, Table 2 and Table S4) with peak signal SNPs falling within 100 kb of them are listed, among which 24 male inflorescence-related genes were previously cloned using mutant genetics. For example, the SNP of S7_166858805 with a P value of4 9 10 À10 falls within 100 kb of the ra3 gene ( Fig. 1a and  b), and the SNP of S1_46687160 with a P value of 5.3 9 10 À10 falls within 6 kb of the tasselseed2 (ts2) gene ( Fig. 1a and Table S4). Additionally, we identified 23 new candidate genes that were remarkably associated with tassel phenotypic variations and that are specifically expressed in the male florescence (Eveland et al., 2014). These candidate genes are involved in tassel development and provide positive information for dissecting the genetic basis of tassel architecture variations.

Comparison of GWAS among different populations
The GWAS results using different populations were compared (Table 1, Fig. 5 and Table S4). For the TL trait, 252 QTNs were identified when using the largest mapping population of CN&US NAM (6595 RILs), which showed the highest number of QTNs among the three NAM populations; only 106 QTNs were identified in US NAM using 4623 RILs, and 39 QTNs were identified in CN NAM using 1972 RILs. In addition, the American panel with 280 lines and the Chinese panel with 665 lines showed no detection of TL-related QTNs when using the two populations independently; however, 19 TL-related QTNs were identified when the two APs were combined (Fig. 5  TL). For the TBN trait, 328 QTNs were identified using CN&US NAM, which also showed the highest number of QTNs

Abundant phenotypic variations in the male inflorescence
The US NAM involved included 26 diverse parental inbreds, which maximized the maize phenotypic diversity in the population . In the study conducted by Brown et al. (2011), the population had a large phenotypic variation in TL, ranging from 3.80 to 58.50 cm, and in TBN, ranging from 1.00 to 41.00 (Table 1), which was considerably higher than in the bi-parental populations of ILP9 B73 (Upadyayula et al., 2006) and Chang 7-2 9 787 (Chen et al., 2014). Because greater phenotypic variation would be beneficial for dissecting the genetic architecture of the male inflorescence, more experimental materials, including 1732 RILs of CN NAM and 945 diverse lines of CN&US AP, were added in the present study. These lines were evaluated across 13 environments, and the results showed that the phenotypic variations in TL and TBN were remarkably higher than those in any single population described previously Chen et al., 2014;Upadyayula et al., 2006).

Common and adaptive loci affect male inflorescence conjointly
The results obtained in this study revealed some common and adaptive genetic loci in both CN NAM and US NAM. A total of 25 common QTLs (17 for TBN and 8 for TL) between CN NAM and US NAM, 32 specific QTLs in CN NAM (15 for TBN and 17 for TL) and 51 specific QTLs in US NAM (22 for TBN and 29 for TL) were identified (Fig. 2a,b). In addition, 64 QTNs were co-located within 25 common QTL intervals, 340 QTNs fell within 51 US NAM QTL intervals, and 139 QTNs fell within 32 CN NAM QTL intervals (Table S4). A common genetic mechanism for the male inflorescence is perhaps under standable because some parental inbreds of CN NAM were clustered into the same group as that of US NAM (Fig. S1). The adaptive genetic loci that were detected might be explained by the different germplasm background of CN NAM and US NAM. The common parent of US NAM, that is B73, which is a representative inbred of the SS group, is widely used in the U.S. maize breeding programme (van Heerwaarden et al., 2012), with TBN of 10.24 AE 4.94 (Table S2). The other 25 parents of US NAM, including 13 inbreds from tropical regions (TBN of 11.57 AE 5.32), 9 inbreds from temperate regions (TBN of 6.55 AE 5.96) and 3 from Europe (TBN of 6.25 AE 5.78), showed larger variations in the male inflorescence, that is the tropical lines showed the largest TBN value, which is consistent with the report of higher diversity in tropical maize germplasm (Yan et al., 2009). However, according to Mir et al. (2013), maize in Eastern Asia is closely related to the Mexican highland landraces. After its initial introduction into China, probably 500 years ago, maize has extended to wide regions of this country and has formed a number of landraces with strong adaptability.
Huangzaosi (HZS) is a typical inbred line that was derived from local Chinese landraces. This line has well-developed tassels to provide abundant pollen under different ecological environments and has an average TBN of 13.93 AE 4.34, which is significantly higher than that of B73 (Table S2). The feature of large tassels is one reason why HZS is more adaptive than B73 in the Chinese maize breeding programme, especially in the Huanghuihai region. HZS has become a foundation line in the TSPT group, which is a popular heterotic group used in China (Wang et al., 2011). The other parents of CN NAM included 5 inbreds from TSPT, 3 from modified Reid, 1 from Lancaster and 2 from Temtropic I, some of which were derived from U.S. germplasm (Wang et al., 2008;Wu et al., 2014).Thus, although the number of QTL detected in CN NAM was less than that in US NAM due to a narrower genetic diversity of the CN NAM population, a joint analysis of the two NAM populations could reveal the common and adaptive genetic loci controlling the male inflorescence of maize. This information may provide interesting clues for maize improvement programmes with different breeding targets aimed at different environments.
Larger and more diverse mapping populations improved the detection resolution Differences in the population size and genetic diversity would be the main factors for different detection resolutions and power because larger population sizes and allelic diversity would contain a greater number of recombinants (McMullen et al.,209). In this study, the two independent NAM populations were used in the joint-linkage analysis, resulting in 72 and 53 male inflorescencerelated QTLs that were identified in US NAM using 4623 RILs and CN NAM using 1972 RILs, respectively (Table S5 and S6). In addition, 269, 71 and 580 male inflorescence-related QTNs were identified in US NAM, CN NAM and CN&US NAM, respectively (Table S4). For the association panels, CN&US AP with 945 diverse lines showed the highest resolution with 34 tassel-associated QTNs, whereas CN AP with 665 inbreds showed the second highest resolution with 11 tassel-associated QTNs, and US AP with 280 lines had no detection power (Table S4). The results suggest that larger and more diverse populations may benefit the dissection of the genetic architecture of complex traits, irrespective of the linkage analysis and GWAS.
Terpene synthase (GRMZM2G049538) catalyses the biosynthesis of terpene, which defends the plant from insect attack (Schnee et al., 2002). O-methyl-transferase (GRMZM2G147491) is a key enzyme in the biosynthesis of lignin (Fornale et al., 2006), which could increase the strength and stiffness of fibres, improve the efficiency of water transport through the vascular system and protect plants from pathogen attack (Boerjan et al., 2003). The helix-loop-helix (HLH) DNA-binding proteins (GRMZM2G414252 and GRMZM2G042895) regulate the biosynthesis of anthocyanin in maize (Tominaga-Wada et al., 2012), which attracts pollinators and seed dispersers and defends plants against abiotic and biotic stresses (Petroni and Tonelli, 2011). Cytochrome (P450) (GRMZM2G075244) affects meristem function (Miyoshi et al., 2004). Glutathione S-transferase (GRMZM2G447632) plays a role in the cellular response to auxins during the normal metabolism of plant secondary products, such as anthocyanins and cinnamic acid (Banerjee and Goswami, 2013). Legume lectin (GRMZM2G465165) contains a conserved residue of carbohydrate recognition domain (CRD), which is associated with organisms from all kingdoms of life (De Hoff et al., 2009) and which can coordinate metals (Ca 2+ and Mn 2+ ) (Arason, 1996) and bind to phytohormones such as adenine-related cytokinins, abscisic acid and gibberellic acid (Bogoeva et al., 2004). The SBP-box protein (GRMZM2G058588) regulates primordia initiation, and family members of the transcription factor genes ub2 and ub3 affect TBN .
These candidate genes may play important roles in male inflorescence development, but their biological functions require further investigation. Although many of QTLs/QTNs identified in this study were enriched for previous candidate gene studies, our study was primary served the first step towards full discoveries of genes controlling male inflorescence.

Plant materials and phenotypic evaluations
A total of 6595 recombinant inbred lines (RILs) were used in this study, including 4623 RILs from US NAM and 1972 RILs from CN NAM. The construction of US NAM and CN NAM has been described previously . In addition, one global association panel consisting of 945 diverse lines was used in this study, which was collected from the U.S. and China, including one U.S. association panel (US AP) with 280 diverse lines and one Chinese association panel (CN AP) with 665 diverse lines. Detailed information on the NAM parents, CN NAM RILs and AP lines are listed in Table S1 and shown in Figure S1.
For US NAM and US AP, the TBN and TL data were collected across eight environments that were described previously   At each location, the parental lines and the RILs were planted according to a randomized experimental design, with a single row plot, a row length of 3 m, 0.6 m between adjacent rows, two replications and 11 plants in each row. TL and TBN were measured 15 days after pollen shedding, according to a method described previously . TL was defined as the length from the base of the first branch to the tip of the tassel, measured in centimetres (cm), and TBN was counted in the branch zone.

Phenotypic data analysis
Phenotypic data analysis was performed using SAS software (Release 9.3; SAS Institute, Cary, NC), and ANOVA was performed using the PROC GLM model. Pearson correlation analysis was calculated using the PROC CORR model. The broad-sense heritability (h 2 ) for TBN and TL was calculated as follows: h 2 ¼ r 2 g =ðr 2 g þ r 2 ge =n þ r 2 e =nkÞ (Hallauer and Miranda, 1988), where r 2 g is the genotypic variance, r 2 ge is the interactive variance of genotype and environment, r 2 e is the error variance, and n and k represent the environment and replication number, respectively. The best linear unbiased predictor (BLUP) calculation was performed using the PROC MIXED model, with genotype, location, genotype*location and replications as random effects .

Joint-linkage analysis
The RILs from CN NAM and US NAM were genotyped using the method of genotyping by sequencing (GBS) (EIshire et al., 2011). High-density markers (~1 000 000 SNPs) were obtained, and approximately 950 000 high-quality SNPs were selected to construct a bin map. A total of 5296 and 4932 bin markers were observed for US NAM and CN NAM, respectively (Li et al., 2015). To perform the joint-linkage analysis of TBN and TL, these bin markers were selected independently in US and CN NAM using the stepwise selection procedure in the PROC GLMSELECT model . TBN-and TL-related QTLs were independently mapped in US NAM using 4623 RILs and in CN NAM using 1972 RILs. For US NAM, the family term was introduced into the model, and each of the 5296 marker-by-family terms was made available for inclusion. Significance levels for the entry and exit of the model terms were determined by permutation as follows: Phenotypic data were separately permuted against the genotypic data within each family. All 5296 marker-by-family terms were tested, and the lowest P value was recorded for each permutation. A total of 1000 permutations were performed, and alpha was set at 0.001. For CN NAM, 4932 bin markers were used to perform the joint-linkage analysis using the same method described for US NAM. For each single bi-parental family, QTL analysis was performed using the IciMapping software with a modified algorithm of composite interval mapping (ICIM) (Li et al., 2007).

Genomewide association study of tassel-related traits
The genomewide association study (GWAS) used the association panel (AP) that was genotyped previously using the MaizeSNP50 BeadChip (Cook et al., 2012;Wu et al., 2014) and the two NAMs that were genotyped as described above. All markers were screened according to the following criteria: (i) the minor allele frequency (MAF) was larger than 0.05, and (ii) the marker had an unambiguous position on the physical map. Subsequently, GWAS analysis was performed independently for US NAM, CN NAM, CN&US NAM, US AP, CN AP and CN&US AP. Approximately 0.5 million SNPs for the NAMs and 44 000 SNPs for AP were selected to conduct the GWAS by using mixed linear model implemented in the GAPIT R package (Lipka et al., 2012).
On top of GWAS analysis, we used the subsampling-based multiple SNPs model Tian et al., 2011). Briefly, 80% of the original entries were sampled in the new subpopulation without replacement, and forward regression was used to fit the SNPs using permutation-derived significance thresholds. This process was repeated 100 times to obtain a bootstrap posterior probability (BPP) value for each SNP, ranging from 0 to 1, which represented the proportion of samples in which that SNP was selected. Only SNPs with BPP values greater than or equal to 0.05 were considered to be significantly associated with phenotypic variations and were then designated to quantitative trait nucleotides (QTNs).

Overlapping between QTNs identified using GWAS and QTLs identified by joint-linkage analyses
To evaluate the overlap between the GWAS and joint-linkage results, 549 observed TBN-, 416 observed TL-associated QTNs and 125 QTLs identified in CN NAM and U.S. NAM were selected to compare whether QTNs filled within QTLs according to the identical physical position. If one QTN filled within QTL, we deduced this QTN was overlapped with relevant QTL, in contrast, none overlapped was existed between QTN and QTL. Then, calculate the actual proportion of overlapped QTNs, with the number of QTNs filled within QTLs divided by the total number of QTNs, which was defined as actual intra-QTL ratio. In addition, a subset of 965 SNPs was randomly sampled from 0.5 million SNPs across the genome, with 1000 replications. Then, the proportion of random SNPs falling within the QTL intervals was calculated as described above, this was defined as the expected intra-QTL ratio by chance. Then, a comparison analysis between the actual intra-QTL ratio and random intra-QTL ratio was performed using the binomial distribution test, and overlaps between the QTNs and QTLs were recorded as significant when the actual intra-QTL ratio was significantly higher than the expected intra-QTL ratio . Beyond that, we have compared the significant differences of (intra-QTL ratio)/(1 -intra-QTL ratio) between actual values and expected ones using the binomial distribution test, providing another proof for validating the overlapping between GWAS results and linkage results.

QTN enrichment in candidate genes associated with tassel traits
To evaluate the QTN enrichment, 203 tassel-related candidate genes were collected according to previous literature (Bai et al., 2012;Brown et al., 2011;Chuck et al., 2007Chuck et al., , 2014Eveland et al., 2014;McSteen and Hake, 2001). These candidate genes included 43 male inflorescence-related genes cloned using mutant genetics and 164 candidate genes that are specifically expressed in the male inflorescence (Eveland et al., 2014). First, we have defined candidate genetic regions as 0, 50, 100, 200 and 400 kb apart from target candidate genes, then evaluate the overlapping between QTNs and relevant candidate genetic regions (CGLs). QTNs falling within CGL according to the identical physical position were overlapped with relevant CGLs that were defined as the actual intra-CGL, and relevant number was counted. Then, calculate the actual proportion of overlapped QTNs, with the number of QTNs filled within CGL divided by the total number of QTNs, which was defined as actual intra-CGL ratio. In addition, a subset of 965 SNPs was randomly sampled from 0.5 million SNPs across the genome, with 1000 replications . The proportion of random SNPs falling within CGL intervals was calculated as described above, which was defined as the expected intra-CGL ratio by chance. Then, a comparison analysis between the actual intra-CGL ratio and random intra-CGL ratio was performed using the binomial distribution test, overlaps between the QTNs and CGLs were recorded as significant when the actual intra-CGL ratio was significantly higher than the expected intra-CGL ratio , with ratio of actual values divided by expected ones to be more than 1.
Elimination of potential biases due to population structure and nongenetic effects As our plant material was developed and measured in both the US and China, it is critical to eliminate the biases due to germplasm and nongenetic effects such as the environmental difference between the US and China. We have modelled our analyses in three layers to eliminate the potential biases. First, the joint-linkage analyses and the association studies were performed on inbred BLUPs, which were adjusted by locations, and the interaction between inbreds and locations. The phenotypic differences due to nongenetic effects, such as the environmental differences between the US and China, were removed from BLUPs. Second, the joint-linkage analysis included families as covariates. The QTL effects were based on segregation within family. Third, we derived principal components (PCs) from all genetic markers that cover whole maize genome. We incorporated PCs as covariate in the US-CN joint GWAS to eliminate the bias due to population structure.

Supporting information
Additional Supporting information may be found in the online version of this article: Figure S1 Scatter plot of the first three principal components. Figure S2 Phenotypic variations of tassel-related traits. Figure S3 QTL associated with tassel primary branch number (TBN) across the 36 RILs families. Figure S4 QTL associated with tassel length (TL) across 36 RILs families. Table S1 Summary information of the CN NAM and the lines in the association panel.