Using an incomplete block design to allocate lines to environments improves sparse genome‐based prediction in plant breeding

Genomic selection (GS) is a predictive methodology that trains statistical machine‐learning models with a reference population that is used to perform genome‐enabled predictions of new lines. In plant breeding, it has the potential to increase the speed and reduce the cost of selection. However, to optimize resources, sparse testing methods have been proposed. A common approach is to guarantee a proportion of nonoverlapping and overlapping lines allocated randomly in locations, that is, lines appearing in some locations but not in all. In this study we propose using incomplete block designs (IBD), principally, for the allocation of lines to locations in such a way that not all lines are observed in all locations. We compare this allocation with a random allocation of lines to locations guaranteeing that the lines are allocated to the same number of locations as under the IBD design. We implemented this benchmarking on several crop data sets under the Bayesian genomic best linear unbiased predictor (GBLUP) model, finding that allocation under the principle of IBD outperformed random allocation by between 1.4% and 26.5% across locations, traits, and data sets in terms of mean square error. Although a wide range of performance improvements were observed, our results provide evidence that using IBD for the allocation of lines to locations can help improve predictive performance compared with random allocation. This has the potential to be applied to large‐scale plant breeding programs.


INTRODUCTION
Genomic selection (GS) was proposed by Meuwissen et al. (2001) to exploit dense genome-wide markers for predicting complex traits. It is a predictive methodology that trains a statistical machine-learning model using a reference population (with phenotypic and genotypic information) to calculate predicted breeding or phenotypic values for new lines that were only genotyped. For this reason, GS allows candidate lines to be selected early in the selection process, and under a careful and efficient implementation, offers tremendous opportunities to improve rates of genetic gain in plant and animal breeding (Bhat et al., 2016; Crossa et al., 2017; Heffner et al., 2010; Zhong et al., 2009).
The objective of plant breeding is genetic improvement by producing new genotypes (lines) with improved productivity and quality. In many plant breeding programs at preliminary breeding stages, a majority of hybrids are generated by crossing doubled-haploid lines (or lines developed using a pedigree scheme) to a tester from a complementary heterotic group. The test-cross hybrids are evaluated in a few locations (three to five), and subsequently, the best 10-15% of the lines within or across locations are selected to advance to further yield trials (Beyene et al., 2019). Effective selection decisions at the initial stage of yield testing (typically denoted Stage 1) are critical for the advancement of lines with the greatest potential to perform in the resource-intensive multilocation, multitester testing stages (typically denoted Stage 2; Atanda et al., 2021). However, phenotypic selection in Stage 1 material is not completely effective because only one tester is used for test-cross hybrid evaluation and because the evaluation is done in a few locations, which do not guarantee a representative sample of the target population of locations (Endelman et al., 2014).
Multilocation trials are key elements in successful breeding programs, permitting the evaluation of promising candidate genotypes under different locational conditions. As such, it is possible to identify stable genotypes or genotypes with specific adaptation by modeling the genotype × environment (or location) (G×E) interaction. However, the ideal implementation where all genotypes are observed in each location requires extensive field testing and considerable resource allocation (Smith et al., 2015a, 2015b). Experimental designs are powerful tools that have historically been used in breeding programs to increase the precision or reduce the cost of generating parameter estimates in field trials. Popular experimental designs in plant breeding include randomized block designs, incomplete block designs (IBD), row-column designs, or α designs (see Bailey [2008], John and Williams [1995], and Patterson and Williams [1976], for examples). For early generation testing, both the p-rep design developed by Cullis et al. (2006) and p-rep with augmented designs (where only checks are repeated; developed by Williams et al. [2011]) are popular.

Core Ideas
• Incomplete block design (IBD) principle is applied in sparse field testing.
• Genome-based sparse testing from IBD concept is proposed.
• Sparse testing across environments for genome-based prediction is optimized.
• Genome-based prediction sparse testing with IBD includes G×E interaction.
Sparse testing is a technique where not all lines are observed in all locations, with lines allocated to locations using a sparse testing design. For example, cross-validation CV2 evaluates the prediction accuracy of models when some genotypes have been evaluated in some locations but not in others and can be used for building sparse testing designs. However, it is also possible to use many traditional experimental designs to allocate treatment to plots or blocks and thus build a sparse testing design. This reshapes the original multilocation breeding trial system into one where all lines are not replicated in all locations, as high costs and factors like seed, land, and water availability might impede the implementation of replicated trials.
In this study we investigate the use of IBDs to more efficiently allocate lines to locations in order to enable sparse genomic prediction. We also compare the predictive performance obtained when lines are allocated to locations using IBD against the conventional random allocation, using three crop species: wheat (Triticum aestivum L.), groundnut (Arachis hypogaea L.), and maize (Zea mays L.), each with data on different traits. The comparison uses the mean squared error (MSE) of prediction under the IBD and random allocations, both implemented with the Bayesian genomic best linear unbiased predictor (GBLUP) model, which was chosen because it is the most widely used model in genome-enabled prediction. The resulting predictions under the two methods were also compared in the absence (NO_GE) and presence (GE) of G×E interactions.

MATERIALS AND METHODS
Data sets used for the benchmarking of the two allocation methods are described below.

Data Sets 1 and 2. Elite wheat yield trial years 2013-2014 and 2016-2017
Two data sets from the Global Wheat Program at the International Maize and Wheat Improvement Center (CIMMYT) were used. They consisted of performance data from elite wheat yield trials (EYTs) established in four different cropping seasons with four locations in each. The lines involved in this study correspond to years 2013-2014 (Data Set 1) and to 2016-2017 (Data Set 2). The EYT Data Set 1 and Data Set 2 contain 766 lines and 980 lines, respectively. In both data sets, an experimental alpha-lattice design was used where the lines were sown in 39 trials, each covering 28 lines and two checks in six blocks with three replications. In these data sets, several traits were available for the selected locations and lines. In this study, we included four traits that were measured for each line in each location: days to heading (DTHD, number of days from germination to 50% spike emergence); days to maturity (DTMT, number of days from germination to 50% physiological maturity or the loss of the green color in 50% of the spikes); plant height (height, in cm); and grain yield (GY, in tons per hectare). Full details of the experimental design and computation of best linear unbiased estimates (BLUEs) can be found in Juliana et al. (2018). For EYT Data Set 1, the selected locations were bed planting with five irrigations (Bed5IR), flat planting with five irrigations (Flat5IR), early heat (EHT), and late heat (LHT). For EYT Data Set 2, the locations were Bed5IR, EHT, Flat5IR, and flat planting with drip irrigation (FlatDrip).
Genome-wide markers for the 1,746 (766 + 980) lines in the two data sets were obtained using genotyping-by-sequencing (Elshire et al., 2011; Poland et al., 2012) at Kansas State University using an Illumina HiSeq2500. After filtering, 2,038 markers remained from an initial set of 34,900 markers. The imputation of missing marker data was carried out using LinkImpute (Money et al., 2015) and implemented in TASSEL v5 (Bradbury et al., 2007). Lines that had >50% missing data were removed, thus providing a total of 1,506 lines for this study (766 lines in the first data set and 980 lines in the second data set). A high level of relatedness by pedigree or kinship between lines is expected within a year of testing and across years of testing because of the nature of the lines under study.

Data Set 3. Groundnut
The phenotypic data set reported by Pandey et al. (2020) includes information on the phenotypic performance of 318 groundnut lines for various traits in four locations. We assessed genomic-enabled predictions for the following four traits: pods per plant (NPP), pod yield per plant (PYPP) measured in grams, seed yield per plant (SYPP) in grams, and yield per hectare (YPH) in kilograms. The locations are denoted as Location 1 (ENV1: Aliyarnagar_Rainy 2015), Location 2 (ENV2: Jalgoan_Rainy 2015), Location 3 (ENV3: ICRISAT_Rainy 2015), and Location 4 (ENV4: ICRISAT Post-Rainy 2015). The data set is balanced, giving a total of 1,272 assessments with each line included once in each location. Marker data were available for all lines, and 8,268 single-nucleotide polymorphism (SNP) markers remained after quality control (with each marker coded as 0, 1, or 2).

Data Set 4. Wheat data
This data set was first used by Crossa et al. (2010) and Cuevas et al. (2016, 2017, 2019) and is comprised of 599 wheat lines from the CIMMYT Global Wheat Program evaluated in four international locations representing four basic agroclimatic regions (mega-locations). Here, we considered GY data available for the lines evaluated in each of the four mega-locations. The 599 wheat lines were genotyped using 1,447 diversity array technology markers generated by Triticarte Pty. Ltd.

Data Set 5. Maize data
This maize data set was included in Souza et al. (2017), originating from Universidad Sao Paulo and consisting of 722 maize hybrids obtained by crossing 49 inbred lines. The hybrids were evaluated in four locations (E1-E4) in Piracicaba and Anhumas, São Paulo, Brazil, in 2016 to yield a total of 2,888 observations (722 hybrids × 4 locations). The hybrids were evaluated using an augmented block design with two commercial hybrids as checks to correct for microlocational variation. At each site, two levels of nitrogen (N) fertilization were used: ideal N conditions (plots received 100 kg ha⁻¹ of N [30 kg ha⁻¹ at sowing and 70 kg ha⁻¹ in a coverage application] at the V8 plant stage) and low N (plots received 30 kg ha⁻¹ of N at sowing). The parental lines were genotyped with an Affymetrix Axiom Maize Genotyping Array (Unterseer et al., 2014) of 616 K SNPs. Markers with minor allele frequency of 0.05 were removed. After applying quality control, 54,113 SNPs were available for predictions.

Bayesian GBLUP model
The Bayesian GBLUP model is represented by the following equation:
$$Y_{ij} = \mu + L_i + g_j + gL_{ij} + \varepsilon_{ij} \qquad (1)$$
where $Y_{ij}$ is the response of line $j$ in location $i$; $\mu$ is the general mean; $L_i$, where $i = 1, \ldots, I$, is the fixed effect of locations; $g_j$, where $j = 1, \ldots, J$, is the random effect of lines; $gL_{ij}$ is the random effect of location-line interaction; and $\varepsilon_{ij}$ is the random error component, assumed to be independent normal random variables with mean 0 and variance $\sigma^2$. Furthermore, it is assumed that $\boldsymbol{g} = (g_1, \ldots, g_J)^{T} \sim N_J(\boldsymbol{0}, \sigma^2_g \boldsymbol{G})$ and $\boldsymbol{gL} = (gL_{11}, \ldots, gL_{1J}, \ldots, gL_{IJ})^{T} \sim N_{IJ}[\boldsymbol{0}, \sigma^2_{gL}(\boldsymbol{I}_I \otimes \boldsymbol{G})]$, where $\boldsymbol{G}$ is the genomic relationship matrix as computed by VanRaden (2008), $\otimes$ denotes the Kronecker product, and $\boldsymbol{I}_I$ is the identity matrix of size $I$. The implementation of this model was carried out in the BGLR library of Pérez and de los Campos (2014). It is important to point out that this model (Equation 1) contains G×E interaction but was also implemented without G×E interaction (NO_GE), that is, the model without the fourth component on the right side of Equation 1.
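To make the covariance structure of the random effects more concrete, the following Python sketch builds a genomic relationship matrix with the first method of VanRaden (2008), G = ZZ'/(2 Σ p(1 − p)), from a toy marker matrix coded 0, 1, or 2, and then forms the block-diagonal Kronecker product of an identity matrix with G. This is only an illustration under stated assumptions: the marker values and the function names vanraden_g and kron_identity are hypothetical, and the sketch does not reproduce the BGLR implementation used in the study.

```python
def vanraden_g(markers):
    """Genomic relationship matrix G = ZZ' / (2 * sum(p * (1 - p)))."""
    n_lines = len(markers)
    n_markers = len(markers[0])
    # Frequency of the counted allele at each marker (0/1/2 coding).
    freqs = [sum(row[m] for row in markers) / (2 * n_lines)
             for m in range(n_markers)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    # Center each marker column by twice its allele frequency.
    z = [[row[m] - 2 * freqs[m] for m in range(n_markers)] for row in markers]
    return [[sum(z[a][m] * z[b][m] for m in range(n_markers)) / scale
             for b in range(n_lines)]
            for a in range(n_lines)]


def kron_identity(n_locations, g):
    """Kronecker product of an identity matrix (n_locations) with G."""
    size = len(g)
    total = n_locations * size
    out = [[0.0] * total for _ in range(total)]
    for loc in range(n_locations):
        for a in range(size):
            for b in range(size):
                # One copy of G per location on the diagonal, zeros elsewhere.
                out[loc * size + a][loc * size + b] = g[a][b]
    return out


# Toy marker matrix: 3 lines x 4 markers (hypothetical values).
toy_markers = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
]
G = vanraden_g(toy_markers)
K_gl = kron_identity(2, G)  # covariance structure of gL for 2 locations
```

With the location-major ordering of the interaction effects, each diagonal block of K_gl is a copy of G and the off-diagonal blocks are zero, so interaction effects are correlated between lines within a location but not across locations.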
Under both types of allocation methods, IBD and random allocation, we use the notation J as the number of lines, k as the number of lines per location, I as the number of locations, and r as the number of replications of each line j in the entire design. It should be noted that in IBDs, k will be less than J, since not all of the lines can be assigned to every location. An equal number of replications per entry is the best way to ensure minimum variance when making all possible pairwise comparisons. Therefore, since r_j = r for all lines, the total number of observations in the experiment is N, where N = J(r) = I(k).

Allocation of lines to locations using the IBD method
A balanced IBD is one where all pairs of lines occur together within a location an equal number of times (λ). In general, we will specify λ_jj′ as the number of times line j occurs with line j′ in a location. To generate this sparse allocation of lines to locations, we can use the function find.BIB() from the R package crossdes. For example, suppose there were J = 12 lines and I = 4 locations; this means that we need 48 plots to allocate the 12 lines to the four locations. However, assume that we will use an IBD and a training set equal in size to N_TRN = 36 (75%) of the total plots required under a randomized complete block design. Therefore, the number of lines per location can be obtained by solving kI = N_TRN for k, which results in k = N_TRN/I. This means that k = 36/4 = 9 lines per location. Then, the corresponding elements for the training set can be obtained with the function find.BIB(12, 4, 9) using the package crossdes. The numbers used in the function find.BIB() denote the lines, the locations, and the lines per location, respectively. Finally, the lines tested in each location that correspond to the training set are shown in Table 1.
Based on Table 1, each line is present in three locations and missing in one. All the lines shown in Table 1 correspond to the training set, while those not allocated in each location form the testing set. For example, in Location 1, the test set includes lines G2, G8, and G10; in Location 2, the test set is comprised of lines G4, G6, and G12; in Location 3, the test set has lines G1, G7, and G9; and in Location 4, the test set is comprised of lines G3, G5, and G11. It is important to highlight that the function does not always guarantee a balanced IBD, and for this reason, we use the general term IBD method to cover both balanced and partially balanced IBDs (Sailer, 2013).

TABLE 1 Allocation of J = 12 lines to I = 4 locations under the incomplete block design method. The allocated lines represent the training set (75%); the size of each location is equal to nine, and each line is repeated r = I(k)/J = 36/12 = 3 times

Location  1    2    3    4    5    6    7    8    9
1         G1   G3   G4   G5   G6   G7   G9   G11  G12
2         G1   G2   G3   G5   G7   G8   G9   G10  G11
3         G2   G3   G4   G5   G6   G8   G10  G11  G12
4         G1   G2   G4   G6   G7   G8   G9   G10  G12
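The replication and concurrence structure of this example can be checked directly from the test sets listed in the text. The Python sketch below is only an illustration (the study used R and the crossdes package), and its variable names are hypothetical. It rebuilds the training set of each location as the complement of its test set, counts in how many locations each line is observed (r), and counts in how many locations each pair of lines is observed together (λ).

```python
from collections import Counter
from itertools import combinations

lines = set(range(1, 13))  # G1..G12

# Test sets per location, as listed in the text.
test_sets = {
    1: {2, 8, 10},
    2: {4, 6, 12},
    3: {1, 7, 9},
    4: {3, 5, 11},
}

# Training set of a location = lines allocated (observed) in that location.
train_sets = {loc: lines - test for loc, test in test_sets.items()}

# r: number of locations in which each line is observed.
replication = Counter(g for train in train_sets.values() for g in train)

# lambda: number of locations in which each pair of lines occurs together.
concurrence = Counter(
    pair
    for train in train_sets.values()
    for pair in combinations(sorted(train), 2)
)

print(sorted(set(replication.values())))  # [3]
print(sorted(set(concurrence.values())))  # [2, 3]
```

Every line is observed in three locations, but pairs of lines occur together in either two or three locations, so this particular allocation is partially balanced rather than balanced. This agrees with the standard necessary condition for a balanced IBD, λ(J − 1) = r(k − 1), which for J = 12, r = 3, and k = 9 would require λ = 24/11, a non-integer value.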

Random allocation of lines to locations
Starting from a balanced data set with J lines and I locations, the random allocation of lines to locations was built in such a way that each line is repeated in approximately r out of I locations, and all locations are of the same size (k). The algorithm of this random allocation is as follows:
1. First, we compute $k = \lceil Jr/I \rceil$ (the least integer greater than or equal to $Jr/I$). Then k lines out of the J lines are randomly allocated to the first location.
2. Then, for the second location, k out of the J lines are again randomly allocated.
3. This process is repeated until the Ith location is completed, with the caveat that the lines allocated to a particular location can be present in at most r locations, ideally in exactly r locations. The lines that do not satisfy this restriction are not candidates for allocation to a particular location.
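The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name random_allocation is hypothetical, the location size is assumed to be k = ⌈Jr/I⌉ (which follows from N = J(r) = I(k)), and when fewer than k candidate lines remain for a location the sketch simply allocates all of them, so the last locations can fall short of k.

```python
import random


def random_allocation(n_lines, n_locations, r, seed=None):
    """Randomly allocate lines to locations, each line in at most r locations."""
    rng = random.Random(seed)
    # Integer ceiling of J * r / I: assumed number of lines per location.
    k = -(-(n_lines * r) // n_locations)
    counts = {line: 0 for line in range(1, n_lines + 1)}
    allocation = []
    for _ in range(n_locations):
        # Only lines still present in fewer than r locations are candidates.
        candidates = [line for line, c in counts.items() if c < r]
        # Assumption: if fewer than k candidates remain, take all of them.
        chosen = rng.sample(candidates, min(k, len(candidates)))
        for line in chosen:
            counts[line] += 1
        allocation.append(sorted(chosen))
    return allocation, counts


# Same dimensions as the IBD example: J = 12 lines, I = 4 locations, r = 3.
allocation, counts = random_allocation(n_lines=12, n_locations=4, r=3, seed=1)
for location, allocated in enumerate(allocation, start=1):
    print(location, allocated)
```

Because the restriction is applied greedily, some lines may end up in fewer than r locations, which matches the description that each line is repeated in approximately r locations.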

Cross-validation strategy
To evaluate and compare the predictive performance of the IBD and random allocations, we used cross-validation with 10 random partitions and 50% of the data for training and 50% for testing. The average MSE was computed with the 10 random partitions and this metric was used to assess the predictive performance in each data set. For each location in each data set, the predictive performance in terms of MSE was computed as the average of the 10 MSEs in the 10 random partitions. Across locations, the MSE in each partition was computed between averages of true and predicted phenotypic values over locations; subsequently, the average of the MSEs of the 10 partitions was reported as prediction performance in each data set. It must be highlighted that 50% of the data was used for training-testing in each partition since each of the five data sets under study included four locations. Therefore, under both types of allocations, we guaranteed that each line was replicated exactly two times (in two locations). Those lines allocated under the IBD and random allocations were used as training and the remaining were used as testing sets. To compare the predictive performance between the IBD and random allocation, we computed the relative efficiency (RE) as follows:

$$\mathrm{RE} = \frac{\mathrm{MSE\_Random}}{\mathrm{MSE\_IBD}}$$
where MSE_Random is the MSE under random allocation and MSE_IBD is the MSE under IBD allocation. The RE indicates how much more efficient the IBD allocation is in comparison with the random allocation; in percentage terms, the gain is (RE − 1) × 100. If the value of RE is >1, then the IBD allocation results in a smaller prediction error; however, if the RE is <1, the IBD allocation is less efficient (with more prediction error) than the random allocation. Relative efficiency is commonly used to make comparisons between randomized complete block designs and IBDs (Kuehl, 2001).
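The arithmetic behind these metrics can be sketched in Python as follows. The function names and the toy numbers are hypothetical and are not results from the study. The across-location MSE is interpreted here as the MSE between per-line averages of observed and predicted values over locations, which is an assumption about how the averages were formed.

```python
def mse(observed, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def across_location_mse(records):
    """MSE between per-line means of observed and predicted values.

    records: iterable of (line, observed, predicted), pooled over locations.
    """
    sums = {}
    for line, obs, pred in records:
        total = sums.setdefault(line, [0.0, 0.0, 0])
        total[0] += obs
        total[1] += pred
        total[2] += 1
    obs_means = [t[0] / t[2] for t in sums.values()]
    pred_means = [t[1] / t[2] for t in sums.values()]
    return mse(obs_means, pred_means)


def relative_efficiency(mse_random, mse_ibd):
    """RE > 1 means the IBD allocation has the smaller prediction error."""
    return mse_random / mse_ibd


def percent_gain(re):
    """Gain of IBD over random allocation in percentage terms."""
    return (re - 1.0) * 100.0


# Toy example (hypothetical MSE values averaged over partitions).
re = relative_efficiency(mse_random=1.2, mse_ibd=1.0)
print(round(re, 3), round(percent_gain(re), 1))
```

For instance, an RE of 1.107 corresponds to a gain of 10.7% for the IBD allocation, which is how the percentages in the Results section are obtained from the reported REs.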

RESULTS
First, we provide a summary of the phenotypic values and variance components of each trait for each data set. The summary of each trait for all data sets is given in Table 2, where we can see that each trait has a different scale and varies significantly, as exemplified by the minimum and maximum values of each trait. We can also see that the GY traits of the wheat and maize data sets are scaled, as they yielded values between −3.58 and 4.88. Likewise, we can appreciate that the mean and median are different for most of the traits except for YPH, PYPP, and NPP in the groundnut data set and height in Data Set 1 (EYT) and Data Set 2 (EYT). The difference between the mean and median was stronger, and for this reason, the data are more asymmetric for these traits.
In Table 3, we can see the variance components for locations (L), genotypes (G), genotype × location (G×E) interaction, residual (R), and total, together with the proportion of total variability explained by each component, for each trait in all the data sets. We can see (Table 3) that in Data Set 1 (EYT), the largest proportion of total variability was explained by the locations, while the second largest was explained by the lines in traits DTHD, DTMT, and height. In the GY trait, the second largest was explained by the G×E and residual components. In Data Set 2 (EYT), the largest proportion of variability was explained by locations in three out of the four traits, whereas in the DTHD trait, the largest proportion of variability was explained by the genotypes. However, in the groundnut data set, the largest proportion of variability was explained by the G×E and residual variance components. Conversely, in the wheat data set, the largest proportion of variability was explained by the residual and the second largest by the G×E variance component. Finally, in the maize data set, the largest proportion of variability was also explained by the G×E and residual terms (Table 3). In Appendix A, biplots for each trait of each data set show how similar and different the locations and cultivars under study are based on the site regression model (Crossa & Cornelius, 1997).

Data Set 2 (EYT years 2016-2017)
First, the prediction performance for each location is given including the G×E interaction. In Table 6 we can observe that the IBD allocation outperformed the random allocation in terms of MSE since for each of the traits, the relative efficiencies in most locations were >1. For trait DTHD, the REs observed were 1.107 (Bed5IR), 1.069 (EHT), 1.190 (Flat5IR), and 1.231 (FlatDrip), which means that the IBD outperformed the random allocation by 10.7, 6.9, 19.0, and 23.1%, respectively. For the DTMT trait, the IBD was more efficient than the random allocation by 15.8, 9.9, 15.4, and 6.2% in locations Bed5IR, EHT, Flat5IR, and FlatDrip, respectively, since the REs were 1.158, 1.099, 1.154, and 1.062, respectively (Table 6). For the GY trait, the IBD outperformed the random allocation by only 2.1, 1.6, and 0.7% in locations Bed5IR, EHT, and FlatDrip, respectively, whereas for the height trait, the IBD was superior to the random allocation by 5.0% only in location Flat5IR. Also, in Table 6, when ignoring the G×E interaction (NO_GE), we can observe in each location that the IBD allocation was better than the random allocation since for most of the traits, the REs in the locations were >1. For the DTHD trait, the REs observed were 1.104 (Bed5IR), 1.052 (EHT), 1.148 (Flat5IR), and 1.209 (FlatDrip). Therefore, in this trait, the IBD allocation outperformed the random allocation by 10.4 (Bed5IR), 5.2 (EHT), 14.8 (Flat5IR), and 20.9% (FlatDrip) (Table 6). For the DTMT trait, the IBD outperformed the random allocation by 11.5 (Bed5IR, with RE = 1.115), 6.2 (EHT, with RE = 1.062), 10.7 (Flat5IR, with RE = 1.107), and 6.5% (FlatDrip, with RE = 1.065). For the GY trait, the IBD was better than the random allocation by 1.6, 1.9, and 0.7% in locations Bed5IR, EHT, and FlatDrip, respectively (Table 6), while in the height trait, the IBD outperformed the random allocation by 4.0% only in location Flat5IR (with RE = 1.040) (Table 6).
Across locations, including G×E interaction for the four traits of Data Set 2, we can observe that the best prediction performance (lower MSE) was obtained under the IBD allocation with the following REs in each trait: 1.184 (DTHD), 1.101 (DTMT), 1.253 (GY), and 1.014 (height). This means that the IBD increased prediction performance in terms of MSE over the random allocation by 18.4, 10.1, 25.3, and 1.4% in traits DTHD, DTMT, GY, and height, respectively (Figure 2, Table 5).
TABLE 5 Data Sets 1-5. Prediction performance in terms of mean square error (MSE) across locations for the five data sets under study

Data Set 3 (groundnut)
For this data set (groundnut), which also contained four traits (NPP, PYPP, SYPP, and YPH), we first provide the results including G×E interaction. Across locations, the IBD allocation outperformed the random allocation in terms of MSE since the REs obtained in the four traits are all >1: 1.090 (NPP), 1.126 (PYPP), 1.099 (SYPP), and 1.114 (YPH). This means that the increase in terms of prediction performance (lower MSE) of the IBD over the random allocation was 9.0, 12.6, 9.9, and 11.4%, respectively (Figure 3, Table 5). When the G×E interaction was not considered (NO_GE), the IBD allocation (Figure 3, Table 5) also outperformed the random allocation with the following relative efficiencies: 1.076 (NPP), 1.124 (PYPP), 1.094 (SYPP), and 1.108 (YPH). This implies that the prediction performance of using the IBD over the random allocation increased in the four traits by 7.6, 12.4, 9.4, and 10.8%, respectively (Figure 3, Table 5). Details of the prediction performance for each location for this data set can be found in Appendix Table B1.

Data Sets 4 (wheat) and 5 (maize)
The wheat data set (Data Set 4) only contains the GY trait, and initial results include G×E interaction. Across locations, the IBD allocation outperformed the random allocation, in terms of MSE, by 16.4% (RE = 1.164) (Figure 4a, Table 5). When G×E interaction was ignored (NO_GE), the IBD allocation outperformed (RE = 1.188) the random allocation by 18.8% (Figure 4a, Table 5). Similarly, the maize data set (Data Set 5) only contains the GY trait, and when considering the G×E interaction, we observed that the IBD allocation was superior to the random allocation by only 0.8% (RE = 1.008) in terms of MSE (Figure 4b, Table 5). When the G×E interaction was ignored (NO_GE), the IBD allocation only had a 0.7% (RE = 1.007) gain over the random allocation (Figure 4b, Table 5). Details of the prediction performance for each location for these two data sets can be found in Appendix Table B2.

FIGURE 2 Data Set 2. Prediction performance in terms of mean square error (MSE) across locations for Data Set 2 (elite wheat yield trial years 2016-2017) for trait (a) days to heading (DTHD), (b) days to maturity (DTMT), (c) grain yield (GY), and (d) height. NO_GE, model ignores the genotype × location interaction; GE, model considers the genotype × location interaction; IBD, incomplete block designs

DISCUSSION
Genomic selection can help optimize resources for the early selection of candidate genotypes. This is because only a sample of candidates needs to be phenotyped and genotyped, while the remaining individuals need only be genotyped, with genome-enabled prediction models used to compute their genomic estimated breeding values. The accuracy of GS is linked to the quality of the predictions, and therefore, better predictions lead to a more accurate GS methodology. For this reason, research to improve the efficiency of the GS methodology continues, and our study aimed to test the use of IBDs for improving the efficiency of sparse testing. The aim is to save significant resources without a loss of prediction power compared with the standard practice of random allocation. We found that the allocation of lines to locations (or environments) using IBD is superior to the random allocation across the data sets analyzed. In Data Set 1, IBD outperformed random allocation across locations and traits by between 6.1 and 20.3% (for GE) and between 8.07 and 18.0% for NO_GE. In Data Set 2 the IBD method outperformed the random method across locations and traits by between 1.4 and 18.4% (for GE) and by between 6 and 26.5% for NO_GE. In Data Set 3 across locations and traits, the IBD gain over the random method was between 9 and 12.6% for GE and between 7.6 and 12.4% for NO_GE. In Data Set 4, the IBD was superior to the random allocation method by 16.4% for GE and by 18.8% for NO_GE.
These results also show that the superiority of the proposed IBD allocation is not substantially affected by the degree of G×E interaction, as exemplified by the five data sets studied (Table 3). These results provide empirical evidence that the random allocation of lines to locations, which is common practice in plant breeding programs to design sparse testing in the context of genomic selection, is less efficient than the IBD allocation, which allocates the lines to locations under a classical experimental design called balanced IBD or partially balanced IBD.
However, the gain in predictive performance when using IBD over random allocation requires additional considerations. Specifically, the allocation of lines to locations using the IBD method is computationally more demanding than random allocation because the IBD allocation is built under a combinatorial process, which is considerably more time consuming. As the number of lines increases, so too does the time requirement for the allocation process. However, in real applications, this allocation process is only required once.
Additionally, the IBD allocation does not always guarantee that each line is allocated exactly to r out of I locations, meaning that the allocation is not always balanced. Even under these circumstances, the IBD allocation is expected to perform better overall than the random allocation. In this sense, it is of paramount importance to continue studying strategies for efficient sparse allocation of lines to locations to increase the efficiency of GS. Our study presents new areas of opportunity to evaluate numerous IBDs. In addition to helping the optimization of parameter estimates, they can also be helpful for the construction of sparse testing allocation of lines to locations to increase prediction accuracy.
Experimental designs play an important role in plant breeding since appropriate experimental designs guarantee accurate data collection, proper data analysis, precise parameter estimates, and the right interpretation of the data (Masood et al., 2008). Additionally, breeders are aware that a properly planned experiment is necessary to ensure that the right type of data and a sufficient sample size and power are available to answer the research questions of interest as clearly and efficiently as possible.
In general, experimental designs are important in guaranteeing the quality of parameter estimates that provide more precision to the research questions at hand. In the current study, however, we illustrated the use of experimental designs (partially balanced IBDs) for the sparse allocation of lines to locations, thus improving the accuracy of predictions. Therefore, from our results, we observe that the improvement of parameter estimates by using partially balanced IBDs for the sparse allocation of lines to locations also translates into an increase in prediction performance.
The proposed allocation of lines to locations under partially balanced IBD (IBD allocation) is primarily of interest to breeders when their goal is to evaluate some lines (J denotes all lines available) in some locations, such as evaluating each line in r out of I locations, and to make predictions for the lines not tested (unobserved) in each location. This allocation of lines to locations is done only with the goal of prediction; within each location, the allocated lines should then be assigned to plots, blocks, and trials under a different and specific experimental design. From this local (inner experimental design) allocation of lines to plots, blocks, and trials, we can obtain the BLUEs using the specific experimental design of each location. This means that it is possible within each location to allocate the lines under different experimental designs. Then, with the BLUEs of each line in each location, the model is trained with the training set resulting from the allocation of lines to locations under the partially balanced IBD to predict the lines not observed in those locations.
Therefore, this process involves the use of two experimental designs: (a) one for the allocation of lines to plots, blocks, and trials within each location (which can be a different experimental design in each location) and (b) another experimental design for building the training set with the BLUEs of the lines tested in each location. This second experimental design should be a partially balanced IBD that uses the lines allocated to each location as the training set, so that the lines not allocated to each location, which form the testing set, are predicted with the trained model. Our proposed approach coincides with what is called a two-phase experimental design, where a randomization is performed in each phase to be able to obtain robust phenotypic data (McIntyre, 1955). This approach has been proposed in the context of plant breeding for improving parameter estimates in horseshoe pelargonium [Pelargonium zonale (L.) LʼHér.] (Brien et al., 2011; Molenaar et al., 2017, 2018); however, to the best of our knowledge, this is the first time that this two-phase experimental design is proposed in the context of genome-based selection. In this study, we do not evaluate the role of population structure on the proposed method. This ceased to be a concern when de los Campos et al. (2015) pointed out that population structure does not play the role of a confounding factor but rather that of a modifying factor. However, for a complete understanding of these issues, future studies should be conducted to quantify how the population structure of the genomic relationship matrix or kinship matrix affects the prediction performance of the incomplete blocks created to implement the sparse testing method proposed here.
Furthermore, more evaluations are necessary: although the five data sets come from three different crops and differ in the proportion of total variability explained by each variance component, they are not representative of all crops or of the variability of data generated in plant breeding programs. Finally, as pointed out above, other forms of IBD could be used to design sparse testing methods for allocating lines to locations, which would further support the goal of increasing prediction performance in the context of GS. However, such additional designs still need to be evaluated to confirm that they help increase prediction performance.

CONCLUSION
In this study, we proposed the use of IBDs for the sparse testing allocation of lines to locations for genomic prediction. We found that the proposed IBD allocation helps to improve predictions compared with the standard random allocation of lines to locations, reducing the mean square error by between 1.4% and 26.5% across locations, traits, and data sets. We also found that, for larger data sets, allocating lines using IBDs is more time consuming and computationally intensive; this is unlikely to be a major barrier, however, because the allocation is required only once in a breeding application. The proposed IBD method adds to the sparse testing methods available for plant breeding and makes the GS methodology more efficient, as it provides better prediction performance than the random allocation of lines to locations. Nevertheless, we suggest performing more empirical evaluations to accumulate further evidence of the utility of IBD for an efficient allocation of lines to locations for sparse testing in GS. Other experimental designs can also be evaluated for their use in sparse testing genomic prediction, which would support an increase in the power of the GS methodology.

APPENDIX A
The SREG model (Crossa & Cornelius, 1997) provides the multiplicative operators computed from a reduced-rank model matrix of the deviations of the parametric cell mean of each genotype (G) in each environment (E) from the mean of that environment (i.e., the effects of the genotypes plus the effect of the G×E).
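For orientation, the SREG model is commonly written in the form below. The notation is an assumption introduced here for illustration and is not taken from this appendix: \bar{y}_{ij} is the mean of cultivar i in environment j, \mu_j is the mean of environment j, \lambda_k is the k-th singular value, \alpha_{ik} and \gamma_{jk} are the cultivar and environment scores on component k, t is the number of retained components, and \bar{\varepsilon}_{ij} is the residual.

```latex
\bar{y}_{ij} = \mu_j + \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \bar{\varepsilon}_{ij}
```

Because only the environment mean \mu_j is removed before the multiplicative terms are fitted, those terms carry the cultivar main effect together with the G×E. Under this notation, a biplot would typically display the cultivar and environment scores on the first two components (k = 1, 2).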
For the SREG biplots (Figures A1-A14, where cultivars are shown in green and environments, or sites, in blue), the cosine of the angle between two cultivar (or environment) vectors approximates the correlation between the cultivars (or environments) with respect to the main effect of cultivar plus the G×E. Acute angles indicate positive correlations, with parallel vectors (in exactly the same direction) representing a correlation of 1.0. Obtuse angles represent negative correlations, with opposite directions indicating a correlation of −1.0. Perpendicular directions indicate a correlation of zero. Environmental vectors having the same direction as the cultivar vectors have positive cultivar plus G×E effects (that is, these environments favored these cultivars), whereas vectors in the opposite direction have negative cultivar plus G×E effects. Thus, environmental vectors pointing in the same direction indicate a less complex G×E than those pointing in opposite directions. For example, the biplot of Figure A13 for grain yield (GY) of the wheat data set displays Site 1 pointing in a different direction than Sites 2, 3, and 4, thus showing a more complex G×E than that of Figure A14 for GY of the maize data set, where all sites point towards the right-hand side of the biplot.
The Plant Genome
FIGURE A1 Biplot for trait days to heading (DTHD) of Data Set 1 (elite wheat yield trial)
FIGURE A2 Biplot for trait days to maturity (DTMT) of Data Set 1 (elite wheat yield trial)
FIGURE A3 Biplot for trait grain yield (GY) of Data Set 1 (elite wheat yield trial)

FIGURE A4 Biplot for trait height of Data Set 1 (elite wheat yield trial)
FIGURE A5 Biplot for trait days to heading (DTHD) of Data Set 2 (elite wheat yield trial)
FIGURE A6 Biplot for trait days to maturity (DTMT) of Data Set 2 (elite wheat yield trial)
FIGURE A7 Biplot for trait grain yield (GY) of Data Set 2 (elite wheat yield trial)
FIGURE A8 Biplot for trait height of Data Set 2 (elite wheat yield trial)
FIGURE A9 Biplot for trait number of pods per plant (NPP) of groundnut data set
FIGURE A10 Biplot for trait pod yield per plant (PYPP) of groundnut data set
FIGURE A11 Biplot for trait seed yield per plant (SYPP) of groundnut data set
FIGURE A12 Biplot for trait yield per hectare (YPH) of groundnut data set
FIGURE A13 Biplot for trait grain yield (GY) of wheat data set
FIGURE A14 Biplot for trait grain yield (GY) of maize data set

APPENDIX B
TABLE B1 Data Set 3. Prediction performance in terms of mean square error (MSE) for each location for Data Set 3 (groundnut)