Distinguishing between recent balancing selection and incomplete sweep using deep neural networks

Balancing selection is an important adaptive mechanism underpinning a wide range of phenotypes. Despite its relevance, the detection of recent balancing selection from genomic data is challenging as its signatures are qualitatively similar to those left by ongoing positive selection. In this study, we developed and implemented two deep neural networks and tested their performance to predict loci under recent selection, either due to balancing selection or incomplete sweep, from population genomic data. Specifically, we generated forward‐in‐time simulations to train and test an artificial neural network (ANN) and a convolutional neural network (CNN). ANN received as input multiple summary statistics calculated on the locus of interest, while CNN was applied directly on the matrix of haplotypes. We found that both architectures have high accuracy to identify loci under recent selection. CNN generally outperformed ANN to distinguish between signals of balancing selection and incomplete sweep and was less affected by incorrect training data. We deployed both trained networks on neutral genomic regions in European populations and demonstrated a lower false‐positive rate for CNN than ANN. We finally deployed CNN within the MEFV gene region and identified several common variants predicted to be under incomplete sweep in a European population. Notably, two of these variants are functional changes and could modulate susceptibility to familial Mediterranean fever, possibly as a consequence of past adaptation to pathogens. In conclusion, deep neural networks were able to characterize signals of selection on intermediate frequency variants, an analysis currently inaccessible by commonly used strategies.

at the population level. In negative frequency-dependent selection, rare alleles have a fitness advantage. Finally, spatially and temporally varying selection creates a scenario where different alleles are advantageous in different environments.
Until 2006, the general consensus was that only a few loci in the human genome had been targets of balancing selection (Asthana et al., 2005; Bubb et al., 2006). Since then, the availability of large-scale population genomics data and the development of ad hoc statistical tests have contributed to the current view that balancing selection is a widespread adaptive mechanism underlying a broad spectrum of features in the genetic architecture of phenotypes (Key et al., 2014; Llaurens et al., 2017).
In humans, balancing selection is responsible for shaping the diversity of genes involved in the adaptive and innate immune response (Andrés et al., 2009; DeGiorgio et al., 2014; Ferrer-Admetlla et al., 2008; Meyer et al., 2006), metabolism and other processes (Bitarello et al., 2018). Notably, variants targeted by pathogen-driven balancing selection have been found to be associated with susceptibility to several autoimmune diseases (Fumagalli et al., 2011). Therefore, by elucidating the genomic signals of balancing selection we have the ability to identify common alleles with critical functional consequences. For instance, balancing selection has been hypothesized to maintain a common variant in an angiotensin-converting enzyme, which has recently been associated with increased susceptibility to SARS-CoV-2 (Delanghe et al., 2020).
The application of such methods to large-scale human population genomic data has enabled the characterization of targets of long-term balancing selection (i.e. selection that predates the time to the most recent common ancestor in a species) in humans and their association with several diseases (Cagliani et al., 2008; Siewert & Voight, 2017). Nevertheless, all these studies contributed little to the understanding of the role of balancing selection in recent human evolution, despite short-term or transient balancing selection being predicted to be a common phenomenon in nature (Sellis et al., 2011).
Recent balancing selection leaves traces that are almost indistinguishable from those left by recent positive selection (Fijarczyk & Babik, 2015), with beneficial alleles segregating at intermediate frequency in contemporary genomes in both cases (Charlesworth, 2006). Additionally, even when signatures of balancing selection are identified, the underlying evolutionary mechanism (e.g. overdominance or negative frequency-dependent selection) is often unknown (Llaurens et al., 2017). As such, current methods have only limited power to identify and characterize signatures of recent balancing selection in the human genome.
A promising solution to address this issue is provided by supervised machine learning (ML), which has recently been introduced in population genetics and successfully applied for evolutionary inferences. For instance, several ML methods have been proposed and successfully applied to population genetic data to predict and classify neutral and selective events on genomic loci (Lin et al., 2011; Mughal & DeGiorgio, 2019; Pavlidis et al., 2010; Ronen et al., 2013; Schrider & Kern, 2016; Sugden et al., 2018). Deep learning is a class of ML algorithms based on artificial neural networks (ANNs), which comprise nodes in multiple layers connecting features (input) and responses (output) (Lecun et al., 2015). ANNs have the potential to be used in population genetics to estimate parameters from genomic data using multiple summary statistics as input (Sheehan & Song, 2016).
Notably, deep learning algorithms can effectively learn which features (i.e. measurable properties of the data) are sufficient for the prediction (Krizhevsky et al., 2012; Lecun et al., 2015). Although deep learning in population genetics is in its infancy, several studies have already applied convolutional neural networks (CNNs) to full population genomic data, with convolutional layers automatically extracting informative features (Chan et al., 2018; Flagel et al., 2019; Sanchez et al., 2020; Torada et al., 2019; Xue et al., 2021). A convolutional layer comprises several weight matrices that slide across the input image and perform a matrix convolution to produce image matrices (Jiuxiang et al., 2018; Lecun et al., 1998).
Recent reviews provide more detailed information on convolutional neural networks in population genetic inference (Flagel et al., 2019; Sanchez et al., 2020).
In this study, we aimed at developing and implementing deep neural networks to predict loci at intermediate allele frequency (i.e. between 40% and 60%) under natural selection (Test 1). By doing so, our goal is also to distinguish between signals of incomplete sweep (e.g. ongoing positive selection) and signals of balancing selection (Test 2), either due to overdominance or negative frequency-dependent selection. As mentioned above, these two types of selection are different biologically but leave similar signatures in genomes, making their discernment particularly challenging.
Specifically, we compared the predictive power between ANNs (i.e. based on summary statistics) and CNNs (i.e. based on full population genomic data) to perform such classification.
Finally, we deployed the trained deep neural networks on population genomic data to identify and characterize signals of natural selection acting on the MEFV gene. Mutations in the MEFV gene cause familial Mediterranean fever (FMF), an autoinflammatory disease with recurrent episodes of fever, abdominal, joint and chest pain, with gradual development of nephropathic amyloidosis (kidney failure) in some cases (Touitou, 2001). FMF is highly prevalent in populations of Mediterranean origin (Touitou, 2001), and the 3' terminal region of the MEFV gene has been hypothesized to be under balancing selection due to overdominance in some European populations (Fumagalli et al., 2009a). Recently, causative mutations in the MEFV gene have been reported as target of recent positive selection in the Turkish population as they confer resistance to Yersinia pestis (Park et al., 2020). By applying our deep neural networks on a large sample size of genomic data, we sought to establish which type of natural selection has been acting on MEFV with regard to susceptibility to FMF.

| Simulations of population genomic data
We performed extensive simulations both to assess the predictive power of summary statistics and to train deep neural networks.
We generated synthetic population genomic data using SLiM 3.2, a forward-in-time genetic simulation software (Haller & Messer, 2019). Further details on the simulation model employed are available in Table S1 (Gravel et al., 2011).
For simulating scenarios of natural selection, we generated loci of 50 kbp (kilobase pairs) with the selected variant at the centre of the simulated sequence. We assumed a model of selection on a de novo mutation. For illustrative purposes of this study, the selected mutation was introduced in the European population at 21 different times, ranging from 40 k to 20 kya (Figure S1). We classified these times into three categories: recent (20 k to 26 kya), medium (27 k to 33 kya) and old (34 k to 40 kya) selection.
To mimic the effect of a selected variant at intermediate frequency, we conditioned the final (i.e. contemporary) allele frequencies to be between 40% and 60% in the sample. If the final frequency of the selected allele was not within this range, the simulation restarted at the generation where the selected variant was introduced. For each selection scenario and time of onset of selection, we chose selection coefficients and parameters which maximized the probability of the final allele frequency being between 40% and 60% (Table S2). At the end of the simulations, we sampled 198 chromosomes (i.e. haploid individuals) to match the sample size of CEU (Central European) individuals in the 1000 Genomes Project (1000 Genomes Project Consortium, 2015).
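The conditioning step above amounts to rejection sampling on the final allele frequency. A minimal sketch follows, assuming a simple Wright-Fisher model in NumPy rather than the SLiM model used in this study; the function name, the symmetric overdominance scheme, the population size and the selection coefficient are all illustrative assumptions, not values from Table S2.

```python
import numpy as np

def simulate_conditioned_frequency(n_copies=500, s=0.2, generations=200,
                                   low=0.4, high=0.6, max_restarts=1000, seed=1):
    """Wright-Fisher sketch with symmetric overdominance (hypothetical parameters).

    Genotype fitnesses are 1 - s for both homozygotes and 1 for the
    heterozygote, so the deterministic equilibrium frequency is 0.5.
    Returns the accepted final frequency and the number of attempts used.
    """
    rng = np.random.default_rng(seed)
    for attempt in range(1, max_restarts + 1):
        p = 1.0 / n_copies  # de novo mutation: a single copy at introduction
        for _ in range(generations):
            q = 1.0 - p
            mean_fitness = 1.0 - s * (p * p + q * q)
            # Expected frequency after selection (marginal fitness of the allele).
            p_selected = p * (p * (1.0 - s) + q) / mean_fitness
            # Genetic drift: binomial sampling of the next generation.
            p = rng.binomial(n_copies, p_selected) / n_copies
            if p == 0.0 or p == 1.0:
                break  # allele lost or fixed; this attempt will be rejected
        if low <= p <= high:
            return p, attempt
    raise RuntimeError("no trajectory accepted within max_restarts")
```

Each rejected trajectory restarts from the introduction of the mutation, which mirrors the restart rule described above; the sketch has no burn-in or demographic history to preserve.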
In the neutral scenario, no selected variant was introduced.
Instead, we generated data with a neutral variant at the centre of the sequence with a frequency between 40% and 60%. To achieve this, we (i) simulated a larger region of 500 kbp under neutral evolution, (ii) sampled 198 chromosomes, (iii) identified a variant with a frequency between 40% and 60%, and (iv) trimmed the large region to obtain a 50 kbp locus (Figure S2).
For CNN, we created images from the alignment of sampled haplotypes, similar to previous studies (Chan et al., 2018; Flagel et al., 2019; Torada et al., 2019). In this data representation, each row of the image is a sampled haplotype (i.e. individual chromosome) and each column corresponds to a specific segregating site. The colour coding indicates if a variant is derived or ancestral, or any other polarization of alleles (e.g. major/minor, reference/alternate). To disentangle the effect of random sorting of sampled haplotypes (Torada et al., 2019), we reordered rows of images as follows: (i) sampled haplotypes are divided into two groups based on the presence or absence of the targeted allele, (ii) haplotypes within each of the two groups are sorted separately based on haplotype frequency, (iii) the two sorted groups are combined to obtain the final reordered image.
Lastly, to take into account the different dimensions of simulated loci, we resized images into 128 × 128 pixels (Torada et al., 2019) using the Image module from Pillow package (https://pypi.org/project/Pillow).
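Steps (iii) and (iv) can be sketched as follows, assuming the sampled chromosomes are held as a 0/1 NumPy matrix with one column per segregating site. The function name is hypothetical, and taking the first eligible variant is an assumption, since the selection rule among candidate variants is not specified here.

```python
import numpy as np

def centre_on_intermediate_variant(haplotypes, positions, region_length=500_000,
                                   window=50_000, low=0.4, high=0.6):
    """Trim a large neutral region to a window centred on an intermediate-frequency site.

    haplotypes: (n_haplotypes, n_sites) array of 0/1 alleles.
    positions:  (n_sites,) sorted base-pair coordinates within the region.
    """
    freqs = haplotypes.mean(axis=0)
    half = window // 2
    # A candidate needs an intermediate frequency and a full window inside the region.
    eligible = ((freqs >= low) & (freqs <= high)
                & (positions >= half) & (positions <= region_length - half))
    candidates = np.flatnonzero(eligible)
    if candidates.size == 0:
        return None  # no suitable variant: the region would be simulated again
    centre = positions[candidates[0]]  # assumption: take the first eligible site
    keep = (positions >= centre - half) & (positions <= centre + half)
    # Coordinates are shifted so that the trimmed locus starts at 0.
    return haplotypes[:, keep], positions[keep] - (centre - half), centre
```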
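The row reordering and resizing can be illustrated with a short NumPy sketch. The function names are hypothetical, placing carriers of the targeted allele first is an assumption, and the nearest-neighbour resize is a dependency-free stand-in for the Pillow resize used in this study, whose interpolation settings are not given here.

```python
import numpy as np

def sort_haplotypes(haplotypes, target_col):
    """Split rows by the targeted allele, then sort each group by haplotype frequency."""
    def by_frequency(group):
        if group.shape[0] == 0:
            return group
        _, inverse, counts = np.unique(group, axis=0, return_inverse=True,
                                       return_counts=True)
        inverse = inverse.ravel()
        # Primary key: descending haplotype count; secondary key keeps identical rows together.
        order = np.lexsort((inverse, -counts[inverse]))
        return group[order]

    carries_target = haplotypes[:, target_col] == 1
    return np.vstack([by_frequency(haplotypes[carries_target]),
                      by_frequency(haplotypes[~carries_target])])

def resize_nearest(image, size=(128, 128)):
    """Nearest-neighbour resize by index sampling (stand-in for Pillow's Image.resize)."""
    rows = (np.arange(size[0]) * image.shape[0]) // size[0]
    cols = (np.arange(size[1]) * image.shape[1]) // size[1]
    return image[np.ix_(rows, cols)]
```

Because the output size is fixed, loci with different numbers of segregating sites all map to the same input dimensions for the network.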

| Implementation and training of neural networks
Both ANN and CNN models were implemented in Python using the Keras library with a TensorFlow backend (Chollet, 2015). The ANN model comprises one input, three hidden and one output fully connected (i.e. dense) layers. Similar to a previous study (Sheehan & Song, 2016), the hidden layers consist of 20, 20 and 10 neurons, respectively, all with a rectified linear unit (ReLU) activation function. The output layer, which performs the binary classification, consists of a single neuron with a sigmoid (i.e. logistic) activation function. To control for overfitting, in addition to batch normalization, we used a dropout rate of 0.5 and L2 weight decay of 0.005 across all but the output layers. Models were optimized using the Adam optimizer with a batch size of 64 and a learning rate of 0.005 (Kingma & Ba, 2014; Ruder, 2017).
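To make the layer structure concrete, the following is a minimal inference-time sketch of the dense network in plain NumPy rather than Keras. It assumes 64 input features (one per summary statistic), uses randomly initialized weights, and omits batch normalization, dropout and weight decay, which only matter during training; all names are illustrative.

```python
import numpy as np

# 64 inputs is an assumption; hidden sizes 20, 20, 10 and one output unit follow the text.
LAYER_SIZES = [64, 20, 20, 10, 1]

def init_params(layer_sizes, seed=0):
    """Random weights and zero biases for each pair of consecutive layers."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x, params):
    """ReLU on the hidden layers, sigmoid on the single output neuron."""
    h = x
    for weights, bias in params[:-1]:
        h = np.maximum(h @ weights + bias, 0.0)
    weights, bias = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ weights + bias)))
```

The output is a value between 0 and 1 per locus, read as the probability of one of the two classes in the binary task.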
The CNN model consisted of three sets of 2D convolution layers, each followed by a batch normalization layer and ReLU activation layer. A max-pooling layer was also applied after each of the first two convolution layers. All convolutional layers consisted of 32 filters with a kernel size of 3 × 3, applied at stride 1. The pooling layers had a size of 2 × 2 and were applied at stride 2. The convolutional layers were followed by a flatten layer, which transforms a two-dimensional feature matrix into a vector. Finally, we used a fully connected layer consisting of 128 units that uses the flattened feature vector as an input, followed by an output layer. Again, we used ReLU activation function on the output from the fully connected layer and the sig-  Table S4. Further, we performed data augmentation during the training of CNN models by randomly flipping images horizontally (Figure S10) using the ImageDataGenerator function from Keras (Chollet et al., 2015). Similarly, we performed hyper- We performed 480,000 simulations in total for training all deep neural networks. Each single model employed 80,000 simulated data samples, 64,000 of them for training and the remaining 16,000 for validation. All models were trained for 50 epochs each. Testing was performed on approximately 16,000 data samples. We trained both ANN and CNN to perform two classification tasks: predict loci under natural selection vs. neutral evolution (Test 1) and predict loci under balancing selection vs. incomplete sweep (Test 2). The predictive power of ANN and CNN for each test was quantified with a confusion matrix, where each row represents the instances of true class and each column the corresponding number of predicted instances.
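The confusion matrix described above can be computed as in the sketch below. The 0.5 decision threshold on the sigmoid output and the function names are assumptions for illustration.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_probs, threshold=0.5):
    """2 x 2 count matrix: rows are true classes, columns are predicted classes."""
    true_labels = np.asarray(true_labels, dtype=int)
    predicted = (np.asarray(predicted_probs) >= threshold).astype(int)
    matrix = np.zeros((2, 2), dtype=int)
    # Add one count at (true class, predicted class) for every sample.
    np.add.at(matrix, (true_labels, predicted), 1)
    return matrix

def accuracy(matrix):
    """Fraction of samples on the main diagonal (correct predictions)."""
    return np.trace(matrix) / matrix.sum()
```

Dividing each row by its sum gives per-class rates, which are easier to compare across tests when class sizes differ.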

| Prediction of natural selection from genomic data
We deployed the trained networks on phased population genomic data from the 1000 Genomes Project for the CEU population (1000 Genomes Project Consortium, 2015). We filtered all non-biallelic positions and selected all variants with a frequency between 40% and 60% in the CEU population within the MEFV gene region. We retrieved 41 such variants and, for each one, generated a haplotype matrix (Torada et al., 2019) of 50 kbp surrounding the putative target variant. We calculated summary statistics (for ANN) and generated images (for CNN) for each variant by applying the same pipeline used for training the networks. Test 2 was performed only on variants predicted to be under selection for Test 1. Genomic annotations were obtained using the EnsDb.Hsapiens.v75 package in R (Rainer, 2017), and the Gviz package was used for visualization (Hahne et al., 2016). We also employed the same procedure on data from 99 randomly sampled individuals of Tuscans in Italy (TSI) from the 1000 Genomes Project (1000 Genomes Project Consortium, 2015).
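The two-step prediction, in which Test 2 is run only for variants that pass Test 1, can be sketched as a small cascade. The two model arguments are hypothetical callables that return a probability; the 0.5 threshold and the convention that a high Test 2 output means balancing selection are assumptions.

```python
def classify_variant(features, test1_model, test2_model, threshold=0.5):
    """Run Test 2 only when Test 1 predicts selection.

    test1_model(features) -> probability of selection (vs. neutrality)
    test2_model(features) -> probability of balancing selection (vs. incomplete sweep)
    """
    if test1_model(features) < threshold:
        return "neutral"
    if test2_model(features) >= threshold:
        return "balancing selection"
    return "incomplete sweep"
```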
We further deployed the trained networks on genomic regions hypothesized to be neutrally evolving. We extracted two putative neutral

| Software availability
A Python package called BaSe (Balancing Selection) that implements deep neural networks (both ANN and CNN) for the detection of selection and for discerning between incomplete sweep and balancing selection is available at https://github.com/ulasisik/balancing-selection. Data visualizations were performed in R, using ggplot2 (Wickham, 2016), ggpubr (Kassambara, 2020) and pheatmap (Kolde, 2018) libraries. All remaining analyses were performed in Python.

| Summary statistics are not sufficient to discriminate between balancing selection and incomplete sweep
Our first aim was to test whether commonly used summary statistics were sufficient to discriminate between loci under neutrality and natural selection, the latter comprising both incomplete sweep and balancing selection (Test 1). We calculated a total of 64 different summary statistics and compared their distributions calculated on simulated loci under either neutrality or selection, with the targeted allele at intermediate frequency (between 40% and 60%) in the centre of the region (Figure S15). Figure 1 (upper panel a) shows a subset of these comparisons and indicates that the distribution of several summary statistics under neutral evolution or natural selection is statistically different. Therefore, these summary statistics can be used to predict loci under natural selection. This effect is particularly notable for haplotype-based summary statistics (Figure 1, upper left panel a), and it is consistent across all times of onset of selection (recent, medium and old), in line with the effect of recent selection on patterns of LD.
Next, we tested whether summary statistics were able to distinguish between loci under incomplete sweep and balancing selection (Test 2), and, again, we compared their distributions (Figure S16). Figure 1 (lower panel b) shows the same subset of comparisons. These results suggest that only a few summary statistics can discern genomic patterns created by incomplete sweep from those created by balancing selection, and only marginally. This deficiency is particularly severe for allele frequency-based summary statistics and for medium to old times of selection onset.

| Convolutional neural network has higher prediction accuracy than ANN to distinguish between incomplete sweep and balancing selection
As summary statistics do not have power to discriminate between incomplete sweep and balancing selection if considered individually, we then tested whether their predictive power increased when jointly integrated. Thus, we implemented a deep ANN which receives as input all calculated summary statistics (Sheehan & Song, 2016) and predicts whether a given locus is under either neutrality or natural selection, either due to an incomplete sweep or balancing selection (Test 1). We compared the predictive accuracy of ANN to an approach based on convolutional layers, in the form of a CNN applied to full population genomic data as an alignment of sampled haplotypes (Torada et al., 2019).  events. However, we should stress that ANN will achieve better performance (and possibly similar prediction accuracy to the CNN) if a larger number of informative statistics are given as input.
Overall, CNN had high power to identify loci under selection and substantial power to distinguish between incomplete sweep and balancing selection, two modes of evolution that leave extremely similar genomic patterns.

| Convolutional neural network is more robust than ANN to misspecified training data
The training of a neural network for population genetic inferences is conditional on a demographic and selection model to generate genomic data under different evolutionary scenarios. Therefore, we tested the robustness of both ANN and CNN to misspecified evolutionary parameters during training. Specifically, we used the already generated synthetic data and calculated the prediction accuracy for identifying loci under selection (Test 1) and for distinguishing between incomplete sweep and balancing selection (Test 2) when both ANN and CNN were trained on a specific time of onset of selection (recent, medium, old) but tested on a different value. By doing so, we were able to quantify any drop in accuracy when the training data did not reflect the underlying true evolutionary model.

Numbers outside the antidiagonal indicate accuracy values when the models employed for training and testing differed. We observed a marginal decline in accuracy when using incorrect training data for Test 1 for both networks, which performed similarly. These results were confirmed when investigating all corresponding confusion matrices (Figure S17). For Test 2, the drop in accuracy when employing a different model for training was more evident than for Test 1, although CNN outperformed ANN in most scenarios (Figure 3, Figure S18).
To further test the robustness of our inferences to a misspecified model, we tested both architectures, trained on a European population, on simulated data generated from a demographic model (Jouganous et al., 2017).
Accuracy values for Test 1 are marginally affected by a misspecified demographic model (Figures S19 and S20), while we observed a slightly more pronounced decrease in performance for Test 2 (Figures S19 and S20).

| Convolutional neural network identifies signatures of recent natural selection in MEFV gene
We deployed the trained networks, both ANN and CNN, on genomic data for the MEFV gene from the CEU population from the 1000 Genomes Project (1000 Genomes Project Consortium, 2015). The MEFV gene has been previously associated with both balancing selection (Fumagalli et al., 2009a) and ongoing positive selection (Park et al., 2020). Here we tested whether the MEFV gene has been targeted by natural selection and, if so, whether by balancing selection or incomplete sweep.
To assess the false-positive rate, we extracted flanking genomic regions to MEFV predicted to be under neutral evolution (Arbiza et al., 2012) and deployed both ANN and CNN algorithms on all intermediate frequency variants. We expected the networks not to predict signals of selection within these control neutral regions.
ANN predicted 23 out of 42 sites to be under selection regardless of the time of onset of selection (Figure S21). Therefore, we decided Sites predicted to be under selection (or in LD with the target of selection) encompass a haplotype block spanning from intron 2 to 3' UTR (untranslated region, Figure S23). Most of these variants are possibly functionally silent as they lie within introns or represent synonymous substitutions (Figure 4, third to fifth panels from top).
However, two mutations within this region represent either missense (rs1231123, rs1231122) or stop-gained (rs1231122) substitutions, depending on the corresponding isoform. The predicted signals of selection in the MEFV gene were confirmed when deploying the trained network to genomic data from TSI samples (1000 Genomes Project Consortium, 2015), another European population (Figure S24). However, the results obtained using the TSI population showed a higher false-positive rate when deployed to neutral genomic regions (Figure S25) than the ones obtained using the CEU population, possibly because the network was trained on simulated data conditional on a demographic model inferred for the CEU population. In fact, 7, 14 and 10 out of 38 neutral sites were predicted to be under selection with recent, medium and old time of onset, respectively, using the TSI population. In contrast, 3, 13 and 9 out of 42 neutral sites were labelled as targets of selection with recent, medium and old time of onset, respectively, using the CEU population.
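The counts on neutral sites reported above translate into false-positive rates as follows. This is a worked calculation on the reported counts only; the dictionary layout and function name are illustrative.

```python
# Counts of neutral sites predicted to be under selection, as reported above.
NEUTRAL_SITE_CALLS = {
    "TSI": {"n_sites": 38, "recent": 7, "medium": 14, "old": 10},
    "CEU": {"n_sites": 42, "recent": 3, "medium": 13, "old": 9},
}

def false_positive_rates(calls):
    """False-positive rate per population and time of onset (calls / neutral sites)."""
    return {
        population: {
            onset: entry[onset] / entry["n_sites"]
            for onset in ("recent", "medium", "old")
        }
        for population, entry in calls.items()
    }

rates = false_positive_rates(NEUTRAL_SITE_CALLS)
```

The rates are approximately 0.18, 0.37 and 0.26 for TSI and 0.07, 0.31 and 0.21 for CEU (recent, medium and old onset, respectively), so the difference between populations is largest for the recent-onset model.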

| DISCUSSION
In this study, we demonstrated the utility of deep learning to identify genomic signals of recent natural selection on intermediate frequency variants. We showed that algorithms based on either summary statistics (i.e. ANN) or full genomic data (i.e. CNN) had comparably high power to infer selective regimes (Figure 2) and exhibit lower false-positive and false-negative rates than commonly used neutrality tests (Figure S26). However, CNN had higher accuracy to distinguish between loci under balancing selection and incomplete sweep (Figure 2), it was generally more robust to incorrect training data (Figure 3), and it had a lower false-positive rate when deployed on neutral genomic regions than ANN (Figures S21 and S22). Finally, we illustrated the applicability of deep neural networks to detect and characterize signals of natural selection on common variants within the MEFV gene region (Figure 4).
Our results on the high predictive power offered by deep learning, and specifically by convolutional neural networks, to detect signals of natural selection expand previous findings (Chan et al., 2018; Flagel et al., 2019; Sanchez et al., 2020; Torada et al., 2019) to cases where the beneficial allele is at intermediate frequency.

CNN outperformed ANN in distinguishing between incomplete sweep and balancing selection, although, in our analyses, its training was slower by a factor of 300. In fact, CNN had more than 4 million parameters to estimate, in contrast to ANN which had approximately 2,000. Additionally, ANN received as input informative features (i.e. summary statistics) while convolutional filters in the CNN learned the optimal features from the raw data while training. In machine learning, the design of such features had been a major part of information engineering. As an illustration, in the field of computer vision, the 'features' used for many practical algorithms until the early 2000s consisted of hand-engineered gradient estimators (Shen & Bai, 2006), typically at multiple spatial scales (Gauch, 1999; Lowe, 1999), applied to images (arrays of pixels). The observation that features emerge within a deep network has been repeated in different domains. Therefore, we envisage that a novel area of research will focus on extracting informative features from trained networks for population genetic inference, possibly by analysing activation or saliency maps (Bahdanau et al., 2015). It is important to note that ANN will achieve higher performance with the inclusion of additional summary statistics not considered herein.
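The two parameter counts can be reproduced with simple layer arithmetic. The sketch below assumes a single-channel 128 × 128 input, zero-padded ('same') convolutions and 64 summary statistics as ANN input; none of these is stated explicitly in this section, and the function names are illustrative. Batch normalization is counted with four values per channel for the CNN and left out of the ANN count.

```python
def conv_params(kernel, channels_in, channels_out):
    """Weights plus biases of a 2D convolutional layer."""
    return kernel * kernel * channels_in * channels_out + channels_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def cnn_parameter_count(size=128, channels=1, filters=32, kernel=3,
                        dense_units=128, same_padding=True):
    total, channels_in = 0, channels
    for block in range(3):
        total += conv_params(kernel, channels_in, filters)
        if not same_padding:
            size -= kernel - 1       # unpadded convolution shrinks the feature map
        total += 4 * filters         # batch norm: scale, shift, moving mean, moving variance
        channels_in = filters
        if block < 2:
            size //= 2               # 2 x 2 max-pooling at stride 2 after the first two blocks
    total += dense_params(size * size * filters, dense_units)
    total += dense_params(dense_units, 1)  # single sigmoid output unit
    return total

def ann_parameter_count(layer_sizes=(64, 20, 20, 10, 1)):
    return sum(dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:]))
```

Under these assumptions the CNN has 4,213,761 parameters and the ANN 1,941, in line with the figures quoted above; almost all CNN parameters sit in the first fully connected layer. With unpadded convolutions the same arithmetic gives about 3.2 million, which is why padding is assumed here.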
This study also contributes to ongoing efforts to design architecture and devise training techniques for deep learning algorithms in population genetics (Sanchez et al., 2020). Resizing images to smaller dimensions appeared to reduce overfitting and learning time (Figure S9) and could be considered a complementary strategy to approaches based on cropping or padding (Flagel et al., 2019). We show that deep neural networks achieved higher prediction power to differentiate between the effects of neutral evolution, balancing selection and incomplete sweep for variants segregating at intermediate frequency (Figure 2) than commonly used summary statistics (Figure 1). However, the accuracy to distinguish between incomplete sweep and balancing selection using CNN ranges from 72% to 80% depending on the time of onset of the selection, with more recent events (around 20 kya) more accurately classified (Figure 3). While this accuracy is far higher than that achieved using summary statistics, higher accuracy could be achieved by employing a larger training data set, by using more extensive hyper-parameter tuning and architecture search, and by treating overdominance and negative frequency-dependent selection as separate prediction categories. In fact, future extensions of this study will include testing to distinguish between overdominance and negative frequency-dependent selection once a variant is predicted to be under balancing selection. It is likely that a different CNN architecture and training data is needed for this purpose as, for instance, information on heterozygosity (not considered herein given the simulation strategy) will likely emerge as an important feature. Additionally, a wider spectrum of times of the onset of selection should be considered to assess the power to predict balancing selection at different evolutionary scenarios. Finally, the CNN herein proposed requires fully resolved haplotypes with no missing data.
Furthermore, we argue that such approach should be locus-specific as the network needs to be trained with the local characteristics of the region of interest (e.g. recombination rate). Therefore, this CNN is more suitable to be deployed to deep resequencing data on single loci of interest rather than to genome-wide low-coverage sequencing data. In the latter case, we argue that an ANN receiving as input statistical estimates of summary statistics from genotype likelihoods (Korneliussen et al., 2013) might be a valuable alternative. Nevertheless, the effect of data uncertainty should be further explored.
The analyses on the MEFV gene performed herein complement previous findings (Schaner et al., 2001) et al., 2007;Stella et al., 2019). Nevertheless, from our data this central region is apparently under recent selection (Figure 4) or is in LD with beneficial alleles (Figure S23) (Park et al., 2020).
However, the possibility that other pathogens could have concurred in conferring a selective advantage cannot be ruled out. Indeed, contrary to previous claims of overdominance acting on MEFV (Fumagalli et al., 2009a), our new results and Park et al.'s study suggest that the selection on human Pyrin is directional and either recent or possibly still ongoing. In fact, the frequency of M694V and V726A kept rising (Park et al., 2020) although no plague outbreaks rose to the scale of a pandemic after the 17th century.
The population sample we analysed in this study is different from the Turkish cohort investigated by Park et al., which overlaps significantly with one of the plague outbreak sites. Nevertheless, even in the different population sample we analysed, the data presented herein suggest signals of recent selection on the human Pyrin. While our computational predictions are unable to identify the causal variant, it is possible to hypothesize that Pyrin, specifically its B30.2 region, could confer resistance to a broader range of pathogens including those causing more recent pandemics. Likewise, we cannot rule out more complex evolutionary scenarios, with MEFV being subjected to long-term balancing selection and recent positive selection on standing variation. A comprehensive picture of ongoing selection signatures in MEFV could be achieved by deploying deep neural networks trained on variants segregating at low or high frequency and to a wide range of Mediterranean populations. Finally, additional power to characterize recent selection in MEFV could be gained by integrating data from ancient genomes (Dehasque et al., 2020) as this would be particularly suitable to relate adaptation to past epidemics to current pathogenic threats (Patin, 2020).
In this study, we demonstrated how deep learning and, in particular, convolutional neural networks were able to perform predictions currently inaccessible by commonly used strategies based on summary statistics. In particular, we showed that deep neural networks can differentiate between signals of incomplete sweep and balancing selection, despite the two evolutionary events leaving qualitatively similar patterns of genetic variation. Furthermore, our application to detect signals of selection on FMF-associated alleles highlighted the importance of a population genetic approach to understand the molecular basis of susceptibility and/or resistance to infectious diseases.

ACKNOWLEDGEMENTS
This work was supported by a Leverhulme Trust Research Grant (RPG-2018-208) and an Imperial College European Partners Fund to MF. We acknowledge the support offered by the Erasmus+ programme to UI. We are grateful to Aida Andrés, Anil A. Bharath and Mehmet Somel for discussions and Bárbara Bitarello and two anonymous reviewers for comments on the manuscript. We also thank Kivilcim Basak Vural and the METU Comparative and Evolutionary Biology Group for computational support.

AUTHOR CONTRIBUTIONS
MF and UI designed the research. UI performed the research with contributions from AS. MF, UI and AS analysed data and wrote the paper.

DATA AVAILABILITY STATEMENT
Detailed tutorials on pipelines for training and prediction, along with all the scripts used in this study, are available within the BaSe package at https://github.com/ulasisik/balancing-selection. Sequencing data on human populations were retrieved from The International Genome Sample Resource (IGSR) at https://www.internationalgenome.org.