Including genotypic information in genetic evaluations increases the accuracy of sheep breeding values

Abstract The impact of inclusion of genome‐wide genotypes into breeding value predictions for UK Texel sheep is addressed in this article. The main aim was to investigate the level of change in the accuracy values for EBVs when information from animal genotypes is incorporated into the genetic evaluations. New genetic parameters for a range of lamb growth, carcass composition and health traits are described and applied in the estimation of conventional breeding values (EBVs) for almost 822,000 animals as well as genomic breeding values (gEBVs) after adding 10,143 genotypes. Principal component analyses showed that there are no major distinct groups; hence, the population is mainly homogenous and genetically well‐linked. Results suggested that the highest change in accuracy was observed for the animals that are not phenotyped but have good links to the reference population. This was seen especially for the lowly heritable health traits thereby proving that the use of genotypes in breeding values estimation may accelerate the genetic gain by producing more accurate values especially for young, un‐phenotyped animals.


| INTRODUCTION
Genomic evaluation is now widely used as a breeding tool for genetic selection in several species of farm animals across the world but is less well developed for sheep (Hayes et al., 2012; Berry et al., 2016; Rupp et al., 2016; Fitzmaurice et al., 2021; Berry et al., 2022). In the UK, estimation of breeding values in sheep is based only on the conventional Best Linear Unbiased Prediction (BLUP; Henderson, 1949) method of analysis of animal phenotypic and pedigree data provided by the breeders and Breed Societies. This approach works well, especially for animal traits with moderate to high heritability and/or sufficient amounts of phenotypic data available. Availability of informative pedigree linking animals from different flocks is also a prerequisite (Simm, 1998). These conditions are met for multiple growth and carcass traits; however, this method limits the maximum accuracy of the estimated breeding values (EBV) that can be achieved by young animals which have not yet had the chance to be phenotyped. Furthermore, for traits that are difficult to measure, measured on one sex only or with low heritability, such as reproduction and health traits, the accuracy of EBV may be low, even for animals with measured phenotypes. This means that selection decisions based on such EBVs may slow down the achievement of the genetic goal set by the breeders. The alternative approach, and one which simultaneously combines animal genotypes with phenotypes and pedigree, is Single-Step BLUP (SS-BLUP) (Christensen & Lund, 2010; Legarra et al., 2009; Misztal et al., 2009). This approach may lead to more accurate EBVs and reduce the generation interval by enabling early selection of candidates, especially for traits that can be measured late in life or on adult progeny only, such as mastitis.
Genomic selection is already used in small ruminant breeding in many countries, such as Australia (https://www.sheepgenetics.org.au), New Zealand (Auvray et al., 2014), Ireland (https://www.sheep.ie/), France (Palhière et al., 2022) or internationally (Teissier et al., 2022), mostly addressing production traits. Furthermore, incorporating health traits such as mastitis or parasite resistance into breeding programmes might positively affect overall animal welfare, as well as economic gain (Pacheco et al., 2021; Walkom et al., 2022). This will be possible only if the accuracy of animal EBVs is satisfactorily high and above a certain threshold, which would allow publication of EBVs and reduce risks associated with making selection decisions. Minimum accuracy thresholds are extensively used across a variety of traits in many species in the UK (www.fas.scot), Australia (https://breedplan.une.edu.au/general/understanding-ebv-accuracy/) and Ireland (Sheep Ireland Guide & Directory of Breeders, 2020).
The objective of the present study was to assess the impact of incorporating genomic information in the EBV estimation on the accuracy of genetic evaluation for health (footrot and mastitis) and production (birth weight, weaning weight, scan weight and fat and muscle depth) traits in UK Texel sheep.

| Phenotypes
Two sets of traits were examined in this research. Firstly, these included growth and carcass composition traits measured in lambs, which were: birth weight (BWT), eight-week weight (EWW, growth rate to 8 weeks of age), scan weight (SWT, growth rate to finishing), muscle depth (MD, loin muscularity) and fat depth (FD, potential to produce lean/fat carcasses). Growth and carcass measurements of sheep were taken from the iTexel database, where data for these phenotypes are routinely reported by the breeders. This dataset contained phenotypes for 645,840 animals born between 1970 and 2021.
Secondly, we included health traits measured in adult ewes, namely, footrot (FRT) and California Mastitis Test score (CMT). The CMT was used instead of the conventional somatic cell count in milk as a mastitis indicator because it was found to be highly (up to 0.98) correlated with somatic cell count (McLaren et al., 2018), and can be measured on-farm without the need for laboratory analyses. For health traits (FRT and CMT), the data were collected specifically for this research project between 2015 and 2019 on 32 farms across the UK on 3434 milking females. Trained technicians visited the farms at least once per year to score animals for both health traits using a five-point scale, from 0 (not infected) to 4 (severe infection) (Conington et al., 2008; McLaren et al., 2018). Animals were scored between one and five times over the course of 5 years in 2015-2019.
Further edits on phenotypes were performed, removing values that were out of biologically expected ranges, as summarized in Table 1.

| Genotypes
The data used in this research contain 10,193 Texel sheep genotypes, collected between January 2015 and March

Implications
The use of animal genotypes from genome-wide DNA arrays in the genetic evaluation process has become a new standard for many livestock species. Improving the accuracy of breeding value estimates is critical to the success of breeding programmes. Accuracies are often low especially for lowly heritable traits with low numbers of phenotypic measurements. They are also often low when the traits of interest are expressed in female adults, but males are selected as young stock, such as in sheep breeding programmes. This research shows that incorporating genomic information into the genetic evaluation increases the accuracies of breeding values enabling selection especially for animals without phenotypes and for low to moderate heritability traits. It allows for more accurate selection of males and female replacements to be made at an earlier stage in life without having to wait for phenotypes from adult females to be collected.
Animal genotypic data received from the genotyping laboratory were subjected to a series of quality control procedures. These included the rejection of genotypes that did not meet the call rate threshold of 89.4% (which is used in the UK national genomic evaluations for cattle). Subsequently, a parentage check was undertaken to discard genotypes for which the parentage was not confirmed. The opposing homozygotes method (Hayes, 2011) was used on a subset of 8119 SNPs that were common to the four genotyping arrays as described in Kaseja et al. (2022), and if an animal failed the genomic parentage verification (over 1% of conflicting SNPs) and the correct parent did not appear from the parentage discovery, the unverified parent was set to "unknown." Additionally, when there was more than one sample per animal and only one passed parentage verification, then that sample was kept. If more than one sample confirmed the parentage, then the genotypes were compared with each other to confirm they were identical, and if so, one genotype was chosen based on the density of the SNP array used for genotyping, in the following order of priority: 50 K, HD, LDv2 and LDv1.
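The opposing homozygotes check described above can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts with -1 for a missing call; the function names and the coding are illustrative assumptions, not the authors' pipeline. Only the 1% conflict threshold comes from the text.

```python
import numpy as np

def opposing_homozygote_rate(offspring, parent):
    """Share of jointly called SNPs at which the two animals carry opposite homozygotes.

    Genotypes are coded 0/1/2 (copies of one allele); -1 marks a missing call (assumed coding).
    """
    offspring = np.asarray(offspring)
    parent = np.asarray(parent)
    called = (offspring >= 0) & (parent >= 0)           # SNPs called in both animals
    conflict = called & (np.abs(offspring - parent) == 2)  # 0 vs 2 or 2 vs 0
    n_called = int(called.sum())
    return conflict.sum() / n_called if n_called else float("nan")

def parent_verified(offspring, parent, max_conflict=0.01):
    """Parentage is accepted when conflicts do not exceed 1% of the compared SNPs."""
    return bool(opposing_homozygote_rate(offspring, parent) <= max_conflict)
```

A pair that fails this test would, following the text, have the unverified parent set to "unknown" unless parentage discovery identifies the correct parent.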
The final dataset contained 9391 genotypes (971 HD, 2709 50 K, 2350 LDv1 and 3361 LDv2). The next stage was quality control checks at the SNP level, which involved removing SNPs with a call rate under 89.4%, a minor allele frequency below 0.05 or a departure from Hardy-Weinberg equilibrium (p-value threshold of 0.05), thereby producing subsets of 450,686, 36,654, 12,427 and 10,725 SNPs for the HD, 50 K, LDv2 and LDv1 SNP arrays, respectively. All genotypes were then imputed to the subset of most informative SNPs from the 50 K array (n = 36,654 SNPs) using Findhap V3 software (VanRaden et al., 2011).
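As an illustration of the SNP-level filters, the sketch below applies the three thresholds from the text to a genotype matrix. The 0/1/2 coding with -1 for missing calls, the function name and the use of a one-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium are assumptions; the text does not state which test was used.

```python
import math
import numpy as np

def snp_qc_mask(genotypes, min_call_rate=0.894, min_maf=0.05, hwe_alpha=0.05):
    """Return a boolean mask of SNPs (columns) that pass all three filters.

    genotypes: animals x SNPs array coded 0/1/2, with -1 for a missing call (assumed coding).
    """
    g = np.asarray(genotypes)
    n = (g >= 0).sum(axis=0)                 # called animals per SNP
    call_rate = n / g.shape[0]
    n_aa = (g == 0).sum(axis=0)
    n_ab = (g == 1).sum(axis=0)
    n_bb = (g == 2).sum(axis=0)

    keep = np.zeros(g.shape[1], dtype=bool)
    for j in range(g.shape[1]):
        if n[j] == 0 or call_rate[j] < min_call_rate:
            continue
        p = (2 * n_bb[j] + n_ab[j]) / (2 * n[j])   # frequency of the counted allele
        if min(p, 1 - p) < min_maf:
            continue
        # Chi-square goodness of fit against Hardy-Weinberg proportions (assumed test).
        expected = np.array([(1 - p) ** 2, 2 * p * (1 - p), p ** 2]) * n[j]
        observed = np.array([n_aa[j], n_ab[j], n_bb[j]])
        chi2 = float(((observed - expected) ** 2 / expected).sum())
        p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 d.f.
        keep[j] = p_value >= hwe_alpha
    return keep
```

The minor allele frequency filter runs before the equilibrium test, so the expected genotype counts are never zero when the chi-square statistic is computed.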

| Pedigree
A pedigree file including all phenotyped animals and their parents (n = 821,692) was built using information provided by the breeders on the iTexel database and altered accordingly based on the information from genomic parentage verification and discovery as described above (where possible).

| Data analysis
The following mixed effect model was used in all data analyses:

$$y = Xb + Za + Wp + e \qquad (1)$$

where y is the vector of observations, X is the design matrix relating records to b, the vector of fixed effects, Z is the design matrix relating records to a, the vector of random additive genetic effects, W is the design matrix for the random permanent environment effect, p is the vector of random permanent environment effects, and e is the vector of random residual effects. Random effects were assumed to be normally distributed with a mean of zero. A summary of the fixed and random effects is shown in Table 2. As the data were collected across many years and in many flocks, contemporary grouping (CG) was used to compare animals more directly. CG were defined as flock-season of birth-sex for production traits or month-year and farm-year for CMT.
Both FRT and CMT were analysed as the natural logarithm of the sum of scores for all hooves and both udder halves, respectively, plus one (to avoid taking the logarithm of zero) as described in McLaren et al. (2018).
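The transformation of the health scores can be written in a few lines. This is a minimal sketch; the function name is hypothetical, and the inputs are the per-hoof or per-udder-half scores on the 0 to 4 scale described earlier.

```python
import math

def log_sum_score(scores):
    """Natural logarithm of (sum of per-unit scores + 1)."""
    return math.log(sum(scores) + 1)
```

For example, an animal scored 0 on all four hooves is assigned a value of exactly zero for FRT, since ln(0 + 1) = 0.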

| Parameter estimation
Variance components were first estimated for each trait with Model (1) using the ASReml software (Gilmour et al., 2015), with the use of an informative subset of data containing only animals with at least one valid phenotype, born between 2011 and 2021. Additional data edits in this step excluded lambs that had been fostered, were born as a result of embryo transfer or were born within a litter of over four lambs. The contemporary group had to have a minimum of five individuals. The data used for parameter estimations are summarized in Table 1. In this analysis, the random additive genetic effects were assumed to be normally distributed, $a \sim N(0, A\sigma_a^2)$, for all animals, where $\sigma_a^2$ is the additive genetic variance and A is the pedigree relationship matrix. A separate series of bivariate analyses based on Model (1) was used to derive estimates of the genetic and phenotypic correlations among the studied traits.
The estimated variance components were then used to derive animal EBVs for each trait with Model (1) and the MiX99 software (Lidauer et al., 2015).

| Genomic prediction
Conventional BLUP estimates were derived first with the same distribution assumptions for the random effects as the variance component estimation step. Subsequently, the random genetic effect variance structure was modified to accommodate information from two sources through the term $G^{-1} - A_{gg}^{-1}$, where G (obtained using the first method of VanRaden, 2008) is the genomic relationship matrix and $A_{gg}$ is the pedigree-based relationship matrix for genotyped animals; the A matrix in the model was thereby replaced with H, the combined relationship matrix (Christensen, 2012). The key difference between these two analyses was the addition of information from animals' genotypes to SS-BLUP, while pedigree and phenotypes remained the same in both.
Breeding values were estimated for the full available dataset (n = 821,692 animals). Reliabilities of the estimated breeding values were estimated using Apax99 software (Lidauer et al., 2015), using a two-step method which, in the first instance, calculates information due to observations coming from the above model and secondly uses the method of Misztal and Wiggans (1988) to add the relationship information. The reliabilities produced were subsequently converted to accuracy values by taking their square root.
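To make the construction of the combined relationship matrix concrete, the sketch below builds a genomic relationship matrix with the first method of VanRaden (2008) and adds the term G^{-1} - A_gg^{-1} to the genotyped block of the pedigree inverse, which is the standard single-step form. It is an illustration under stated assumptions (dense NumPy matrices, genotypes coded 0/1/2 with no missing calls, hypothetical function names), not the MiX99 implementation.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix, first method of VanRaden (2008).

    genotypes: animals x SNPs array coded 0/1/2 with no missing calls (assumed).
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0                 # observed allele frequencies
    z = m - 2.0 * p                          # centre each SNP by twice its frequency
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def h_inverse(a_inv, a_gg, g, genotyped_idx):
    """Add (G^-1 - A_gg^-1) to the genotyped block of the pedigree inverse A^-1."""
    h_inv = np.array(a_inv, dtype=float, copy=True)
    block = np.ix_(genotyped_idx, genotyped_idx)
    h_inv[block] += np.linalg.inv(g) - np.linalg.inv(a_gg)
    return h_inv
```

A G matrix centred with observed allele frequencies is singular, so in practice it is usually blended with A_gg or otherwise regularised before inversion; how this was handled in the present analysis is not stated in the text.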
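The conversion from reliability to accuracy is a one-line calculation, shown here as a small sketch with a hypothetical function name.

```python
import math

def accuracy_from_reliability(reliability):
    """Accuracy of an EBV is the square root of its reliability (both on a 0-1 scale)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return math.sqrt(reliability)
```

Because of the square root, a given gain in reliability translates into a larger gain in accuracy at the low end of the scale than at the high end.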

| Population structure
The results from the PCA to determine population structure are shown in Figure 1. Clustering based on the principal components of the genotype matrix did not reveal any major outliers, indicating that the population is mostly homogeneous. The first two principal components explained 14.8% and 4.7% of variation, respectively. The structure obtained for this population shows a small cluster of 80 animals somewhat separated from the main cluster. Further investigation indicated that these animals were imported into the UK from New Zealand; hence, their genetic background differs from the rest of the main population, which is UK Texel sheep.

TABLE 2 Fixed and random effects considered in parameter and breeding value estimation.
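A principal component analysis of a relationship matrix can be sketched through its eigendecomposition. The code below is an assumption about how such an analysis might be carried out, with a hypothetical function name; the software actually used for the PCA is not named in the text.

```python
import numpy as np

def pca_of_relationship_matrix(g, n_components=2):
    """Principal component scores and explained variance shares from a symmetric matrix."""
    g = np.asarray(g, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(g)     # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0.0, None)   # guard against tiny negative eigenvalues
    explained = positive / positive.sum()
    scores = eigvecs[:, :n_components] * np.sqrt(positive[:n_components])
    return scores, explained[:n_components]
```

Plotting the first column of the scores against the second gives a plot of the kind described for Figure 1, and the explained shares correspond to the percentages of variation reported for each component.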

| Genetic parameters
A summary of variance components and genetic parameters by trait is shown in Table 3. The trait heritabilities range from 0.07 (low, for CMT) to 0.33 (moderate, for SWT) and are in line with the heritabilities obtained from similar research for growth and health traits in sheep (McLaren et al., 2018; Mucha, Bunger, & Conington, 2015; Mucha, Mrode, et al., 2015; Safari et al., 2005). All correlations between traits estimated using the bivariate models are summarized in Table 4. The genetic correlation between FRT and CMT was 0.28 (±0.11), indicating likely pleiotropic effects on these two health traits. To the authors' knowledge, these are the first estimates of mastitis and lameness correlation for meat sheep. The genetic correlations between the health traits and growth or body composition traits were not significantly different from zero. However, the correlations estimated among lamb growth and body composition were significantly different from zero and within the range of values expected and previously reported for example by Fitzmaurice et al. (2021), Lambe et al. (2008) or Mortimer et al. (2018).

| Accuracy values
When comparing accuracy values generated from SS-BLUP to conventional BLUP, there was almost no change in mean accuracy values for the whole population (n = 821,692). For BWT, EWW, SWT, MD and FD, the average difference in accuracy was 0; for FRT and CMT, the change was 0.02 and 0.03, respectively. This is due to the very large number of animals included in the prediction, some of which are not well connected with the genotyped population and hence do not benefit from the inclusion of the genotypes in the evaluation. The results also showed that there are some animals whose EBV accuracy may decrease after the inclusion of genotypes, although this was only observed for production traits and for animals that were neither phenotyped nor genotyped. The maximum reduction was 0.03 for BWT, EWW and SWT and 0.04 for MD and FD. There was no observed reduction in accuracy value for FRT and CMT. Further analysis revealed that animals that had lower accuracy values following the inclusion of genotypic information were not themselves genotyped or phenotyped and had no close relatives that were phenotyped either. The accuracy for the conventional BLUP EBVs of these animals was less than 0.15, meaning that these animals were not likely to be serious candidates for selection. There were no reductions in accuracy values for any traits for any of the animals that had been genotyped. The accuracy of GEBVs increased for the majority of genotyped animals compared with their conventional evaluation. The maximum increases in accuracy values were 0.40 for BWT, 0.32 for FD, 0.31 for MD, 0.25 for EWW and 0.22 for SWT. For health traits, these were 0.47 for FRT and 0.52 for CMT. These high increases were observed for animals that were genotyped, but not phenotyped.
Changes in the accuracy of EBV from SS-BLUP and BLUP for genotyped animals are summarized in Table 5, which shows there are no genotyped animals with reduced accuracy values following the inclusion of their genotypes in the genetic evaluation. On average, the biggest change in accuracy values is observed for traits with the lowest heritability, which are the health traits (CMT and FRT) and 8-week weight. Changes in accuracy values were greater for animals with no phenotypic information available, meaning that the inclusion of genotypic information is critical to increase the precision of EBV, especially for young animals or for males in terms of measuring the CMT. The average changes in accuracy values for un-phenotyped animals were 170% higher than those from the reference population (animals which are both genotyped and phenotyped). The accuracy values for all traits obtained with conventional BLUP versus SS-BLUP for both phenotyped and non-phenotyped animals that were genotyped are shown in Figure 2, illustrating the potential of genomic information to enhance the accuracy values, especially for hard to measure health traits with low heritability (FRT and CMT), where the maximum accuracy increased from 0.18 to 0.47 for FRT and 0.30 to 0.52 for CMT. For traits that are more easily recorded, have more records available and which are moderately heritable (SWT, MD and FD), the increase of accuracy for genotyped animals is still clear but substantially lower than for FRT or CMT. For all traits of this study, phenotyped animals had more accurate EBVs regardless of the evaluation method, which is in accordance with the theoretical expectations (Simm, 1998). These findings are in line with results from previous studies, demonstrating increased accuracy when information from genotypes is included, such as for Manech Tête Rousse dairy sheep (Macedo et al., 2020), a small population of Dorper sheep (Moghaddar et al., 2022) or chicken mortality (Bermann et al., 2020). Furthermore, several studies comparing accuracies obtained from various genomic models indicate single-step BLUP as the best way of obtaining high accuracy values in sheep (Baloche et al., 2013) or dairy goats (Mucha, Bunger, & Conington, 2015; Mucha, Mrode, et al., 2015).

TABLE 4 Estimates for genetic (below diagonal) and phenotypic (above diagonal) correlations; estimate followed by standard error in brackets.

| CONCLUSION
This study has combined production and health traits recorded in a well-phenotyped population of UK Texel sheep. This was the first study to estimate the potential gain in prediction accuracy of adding genomic data into the estimation of breeding values for UK sheep. It has shown that the structure of the data used for the evaluations affects the changes seen in accuracy values, which also differ according to the traits analysed. In all scenarios, adding animal genotypes in a single-step BLUP evaluation increased the accuracy of prediction compared with conventional BLUP. Therefore, increased animal genotyping is recommended in a breeding programme in order to improve the accuracy of estimated breeding values and reduce the risks associated with making selection decisions. It will also achieve accelerated rates of genetic gain, enhanced efficiency of production and lead to enhanced animal welfare when health traits are included in the breeding programme.

(1) y = Xb + Za + Wp + e

TABLE 1 Data description by trait.

FIGURE 1 Plot of first (PC1) and second (PC2) principal component of the genomic relationship matrix for all genotyped animals. Animals originating from New Zealand are coloured in red.

TABLE 3 Estimated variance components and parameters, followed by standard errors.

FIGURE 2 Regression of the accuracy of estimated breeding values derived from single-step BLUP on conventional BLUP for genotyped animals with and without phenotypes by trait. Grey dots represent genotyped animals without phenotypes and black dots represent animals with both genotypes and phenotypes available.

Trait Direct genetic variance Permanent environment variance Maternal variance Residual variance Phenotypic variance Heritability
Note: N/E, correlation could not be estimated due to negative sum of squares.
Abbreviations: BWT, birth weight; CMT, California mastitis test; EWW, eight-week weight; FD, fat depth; FRT, footrot; MD, muscle depth; SWT, scan weight.

TABLE 5 Summary of changes* in accuracy of estimated breeding values (EBV) for genotyped animals with and without phenotypic records per trait.