The search for sexually antagonistic genes: Practical insights from studies of local adaptation and statistical genomics

Abstract Sexually antagonistic (SA) genetic variation—in which alleles favored in one sex are disfavored in the other—is predicted to be common and has been documented in several animal and plant populations, yet we currently know little about its pervasiveness among species or its population genetic basis. Recent applications of genomics in studies of SA genetic variation have highlighted considerable methodological challenges to the identification and characterization of SA genes, raising questions about the feasibility of genomic approaches for inferring SA selection. The related fields of local adaptation and statistical genomics have previously dealt with similar challenges, and lessons from these disciplines can therefore help overcome current difficulties in applying genomics to study SA genetic variation. Here, we integrate theoretical and analytical concepts from local adaptation and statistical genomics research—including F ST and F IS statistics, genome‐wide association studies, pedigree analyses, reciprocal transplant studies, and evolve‐and‐resequence experiments—to evaluate methods for identifying SA genes and genome‐wide signals of SA genetic variation. We begin by developing theoretical models for between‐sex F ST and F IS, including explicit null distributions for each statistic, and using them to critically evaluate putative multilocus signals of sex‐specific selection in previously published datasets. We then highlight new statistics that address some of the limitations of F ST and F IS, along with applications of more direct approaches for characterizing SA genetic variation, which incorporate explicit fitness measurements. We finish by presenting practical guidelines for the validation and evolutionary analysis of candidate SA genes and discussing promising empirical systems for future work.


Impact Summary
Genome sequences carry a record of the evolutionary and demographic histories of natural populations. Research over the last two decades has dramatically improved our ability to detect genomic signals of adaptation by natural selection, including several widely-used methods for identifying genes underlying local adaptation and quantitative trait variation. Yet the application of these methods to identify sexually antagonistic (SA) genes-wherein variants that are adaptive for one sex are maladaptive for the other-remains underexplored, despite the potential importance of SA selection as a mechanism for maintaining genetic variation. Indeed, several lines of evidence suggest that SA genetic variation is common within animal and plant populations, underscoring the need for analytical methods that can reliably identify SA genes and genomic signals of SA genetic variation. Here, we integrate statistics and experimental designs that were originally developed within the fields of local adaptation and statistical genomics and apply them to contexts of sex-specific adaptation and SA genetic variation. First, we evaluate and extend statistical methods for identifying signals of SA variation from genome sequence data alone. We then apply these methods to reanalyze previously published data on allele frequency differences between sexes-a putative signal of SA selection. Second, we highlight more direct approaches for identifying SA genetic variation, which use experimental evolution and statistical associations between individual genetic variants and fitness. Third, we provide guidelines for the biological validation, evolutionary analysis, and interpretation of candidate SA polymorphisms. 
By building upon the strong methodological foundations of local adaptation and statistical genomics research, we provide a roadmap for rigorous analyses of genetic data in the context of sex-specific adaptation, thereby facilitating insights into the role and pervasiveness of SA variation in adaptive evolution.
A population's evolutionary capacity for adaptation hinges upon the nature and extent of the genetic variation it harbors (Fisher 1930). In simple environments where selection is uniform over time, across space, and among different classes of individuals within the population, adaptation may proceed by fixing unconditionally beneficial mutations and eliminating deleterious ones. Yet species exist in complex environments, where opportunities for adaptation can be limited by genetic trade-offs among traits and fitness components (Otto 2004; Gomulkiewicz and Houle 2009; Chevin 2013; Connallon and Hall 2018) or by gene flow and conflicting directional selection among habitats within a species' range (Kirkpatrick and Barton 1997; Lenormand 2002; Duputié et al. 2012). Such contexts allow maladaptation to persist in spite of abundant genetic variation within the population (Walsh and Blows 2009).
"Sexually antagonistic" (SA) genetic variation, wherein alleles that are beneficial when expressed in one sex are harmful when expressed in the other, represents a particularly common form of genetic trade-off (Bonduriansky and Chenoweth 2009; Van Doorn 2009). SA genetic variation arises from sex differences in selection (a.k.a. sex-specific selection) on traits that are genetically correlated between the sexes (Connallon and Clark 2014b), and may contribute substantially to fitness variation (Kidwell et al. 1977; Abbott 2011; Connallon and Clark 2014a; Olito et al. 2018) and maladaptation (Lande 1980; Matthews et al. 2019). Estimates of phenotypic selection suggest that sex differences in directional selection are common (Cox and Calsbeek 2009; Lewis et al. 2011; Gosden et al. 2012; Stearns et al. 2012; Morrissey 2016; De Lisle et al. 2018; Singh and Punzalan 2018), implying that many genetic variants affecting quantitative traits have SA effects on fitness. Likewise, estimates of genetic variation for fitness suggest that the genetic basis of female and male fitness components is partially discordant, with some multilocus genotypes conferring high fitness in one sex and low fitness in the other (reviewed in Connallon and Matthews 2019).
Although studies of sex-specific selection indicate that SA alleles contribute to fitness variation in several animal and plant populations (e.g., Chippindale et al. 2001;Fedorka and Mousseau 2004;Svensson et al. 2009;Delph et al. 2011;Mokkonen et al. 2011;Berger et al. 2014), the population genetic basis of this fitness variation is largely unknown, leaving several important questions unanswered. For example, what fraction of genetic variance for fitness is attributable to SA alleles versus other classes of genetic variation (e.g., deleterious mutations)? Is SA genetic variation attributable to many small-effect loci or to a few large-effect loci? Are SA polymorphisms maintained under balancing selection, or are they transient and primarily evolving via mutation, directional selection, and drift? Are SA alleles randomly distributed across the genome or are they enriched on certain chromosome types (e.g., sex chromosomes)? These questions are part of the broader debate about the genetic basis of fitness variation and the evolutionary forces that maintain it (Lewontin 1975;Charlesworth and Hughes 1999), which is one of the oldest in this field and perhaps the most difficult to resolve (e.g., Lewontin 1975, p. 23).
As in most areas of evolutionary biology, research on SA selection is increasingly drawing upon genomics. A few studies have identified candidate SA polymorphisms with large effects on traits related to fitness (Roberts et al. 2009;Barson et al. 2015;Rostant et al. 2015;VanKuren and Long 2018;Pearse et al. 2019), and this handful of SA loci almost certainly represents the tip of the iceberg (Ruzicka et al. 2019). Many other studies have highlighted genomic patterns of expression or sequence diversity that could be indicative of sex-specific selection (e.g., Innocenti and Morrow 2010;reviewed in Kasimatis et al. 2017;Mank 2017;Rowe et al. 2018).
Although there is optimism that genomics will facilitate the study of sex-specific selection, we still face several challenges in applying genomic data to identify and characterize SA genetic variation. For example, some putative genomic signals of sex-specific selection, such as sex-biased gene expression, are ambiguous: at best, they may serve as indirect proxies of sex-specific selection, or at worst provide no information about contemporary selection on each sex (Kasimatis et al. 2017; Rowe et al. 2018). Allele frequency differences between sexes (e.g., between-sex $F_{ST}$) may represent more straightforward genomic signatures of sex-specific selection (e.g., Cheng and Kirkpatrick 2016; Lucotte et al. 2016; but see Bissegger et al. 2019; Kasimatis et al. 2020). Yet ambiguous null hypotheses for empirical estimates of between-sex $F_{ST}$, along with high statistical noise relative to biological signal in these estimates, raise questions about statistical power and the prevalence of false positives within such data (Kasimatis et al. 2019). In addition, we need to better understand the extent to which common pitfalls of genome sequence datasets (for example, mismapped reads [Tsai et al. 2019], sampling biases, hidden population structure, and effects of linkage and hitchhiking) yield artifactual signals of sex-specific processes, and thus questionable candidate SA genes.
The challenges in applying genomics to sex-specific selection have strong parallels in the fields of local adaptation and statistical genomics. Although the study of SA loci is still in its infancy, the fields of local adaptation and statistical genomics have already grappled intensively with many of the conceptual and methodological challenges that research on sex-specific selection now faces (Hoban et al. 2016;Visscher et al. 2017). For example, local adaptation research has long emphasized the importance of clear null models for distinguishing genes involved in local adaptation from false positives that simply reside in the tails of neutral null distributions (Lewontin and Krakauer 1973;Günther and Coop 2013;Whitlock and Lotterhos 2015;Lohse 2017). Similarly, statistical genomics researchers have repeatedly warned that hidden population structure in genome-wide association studies (GWAS) can lead to spurious conclusions about the genetic basis of quantitative traits, complex diseases, and the role of adaptation in population differentiation (Lander and Schork 1994;Price et al. 2010;Barton et al. 2019;Berg et al. 2019;Sohail et al. 2019). Lessons from local adaptation and statistical genomics research can therefore sharpen hypothesis framing, guide statistical methodology, and inform best practices for disentangling signal, noise, and artifacts in studies of sex-specific selection.
Here, by drawing insights from local adaptation and statistical genomics research, we present practical guidelines for population genomic analyses of sex-specific fitness variation. We first outline two statistics that can, in principle, provide indirect evidence of sex-specific fitness effects of genetic variation: between-sex F ST , which is sensitive to sex differences in viability selection and some components of reproductive success, and F IS , a measure of Hardy-Weinberg deviations in diploids, which is sensitive to sex differences in overall selection (i.e., cumulative effects of viability, fertility, fecundity, and mating competition). We develop theoretical null models for each metric, provide an overview of their sampling distributions and statistical power, and present a reanalysis of published F ST data in light of our models. We also highlight complementary methods adapted from case-control GWAS to overcome some of the limitations of these metrics. Second, we evaluate several direct approaches for characterizing sex-specific genetic variation for fitness, which combine elements from quantitative genetics, reciprocal transplant studies, and experimental evolution. These direct approaches have been extensively employed to study the genetic basis of locally adapted phenotypes and quantitative traits, but rarely to identify SA loci. Third, we outline approaches for validating candidate genes and discuss best practices for the analysis and interpretation of their evolutionary histories.

Indirect Approaches for Identifying SA Genes
Estimating fitness under natural conditions is difficult, rendering approaches for identifying SA genes that rely on fitness measurements (i.e., direct approaches; see section "Direct Approaches for Identifying SA Genes") unfeasible for many populations. Any widely applicable approach must therefore make use of indirect empirical signals of SA selection in genome sequence data, which can now be collected for virtually any species.
Two specific patterns of genome sequence variation could be indicative of contemporary SA selection, as emphasized by several recent studies (e.g., Cheng and Kirkpatrick 2016;Lucotte et al. 2016;Eyer et al. 2019;Kasimatis et al. 2019). First, because sex differences in selection during the life cycle are expected to generate allele frequency differences between breeding females and males (i.e., the members of each sex that contribute to offspring of the next generation; see Box 1), allele frequency differences between samples of females and males of a population could be indicative of sex differences in selection-including SA selection, sex-limited selection, or ongoing sexually concordant selection that differs in magnitude between the sexes. Ideally, inferences of sex-specific selection from allele frequency estimates should be based on samples of breeding adults that have passed the filter of viability selection, sexual selection, and fertility/fecundity selection, although in practice genome sequences of random samples of adults are more readily obtainable and will reflect viability selection (Cheng and Kirkpatrick 2016;Kasimatis et al. 2019). Second, allele frequency differences between breeding females and males of a given generation elevate heterozygosity in offspring of the next generation relative to Hardy-Weinberg expectations (Kasimatis et al. 2019). Inflated heterozygosity in a large random sample of offspring could therefore reflect sex differences in viability selection, sexual selection, and/or fertility and fecundity selection during the previous generation.

SELECTION IN FIXATION INDICES
Fixation indices, which are widely applied in studies of population structure (Wright 1951), can be used to quantify allele frequency differences between sexes ($F_{ST}$) and elevations in heterozygosity in the offspring of a given generation ($F_{IS}$), each of which is a predicted consequence of sex-specific selection (Boxes 1-2). Several studies have estimated $F_{ST}$ between sexes using gene sequences sampled from adults (Cheng and Kirkpatrick 2016; Lucotte et al. 2016; Flanagan and Jones 2017; Wright et al. 2018, 2019; Bissegger et al. 2019; Vaux et al. 2019) or from breeding individuals (Dutoit et al. 2018), yet it remains unclear how much information about SA selection is contained within these estimates. A major problem, as recently emphasized by Kasimatis et al. (2019), is that the contribution of sex-specific selection to allele frequency differentiation between the sexes may often be weak compared to effects of sampling error in allele frequency estimates. Indeed, simulations presented in several studies (e.g., Cheng and Kirkpatrick 2016; Lucotte et al. 2016; Connallon and Hall 2018; Kasimatis et al. 2019) confirm that signals of SA selection in between-sex $F_{ST}$ are swamped by sampling error in small population genomic datasets. Nevertheless, without clearly defined probability distributions for between-sex $F_{ST}$ and related metrics of allele frequency differentiation, it remains difficult to evaluate whether sampling error, by itself, is sufficient to account for the empirical distributions of population genomic metrics that are putatively associated with sex-specific selection.
As we show in Appendix A (Supporting Information) (see Box 2), a null distribution for between-sex $F_{ST}$ estimates at loci with no sex differences in selection conforms to a special case of Lewontin and Krakauer's (1973) classic null model for $F_{ST}$ estimated between populations. An appealing feature of our two-sex null model is its insensitivity to some of the simplifying assumptions inherent in Lewontin and Krakauer's original model (i.e., that subpopulations are independent; see Nei and Maruyama 1975; Robertson 1975; Charlesworth 1998; Beaumont 2005; Whitlock and Lotterhos 2015), and to issues arising from genetic linkage (Charlesworth 1998), which do not affect the two-sex null distribution when $F_{ST}$ is independently estimated per single nucleotide polymorphism (SNP), but can strongly impact the null for concatenated sequences (e.g., gene-wide $F_{ST}$ estimates; see Booker et al. 2020; Appendix A [Supporting Information]). Another appealing feature of the null model for $F_{ST}$ is its insensitivity to the distribution of allele frequencies in the population, provided SNPs with very low minor allele frequencies (MAFs) are excluded from the analysis (e.g., MAFs < 0.05 for small datasets; MAFs < 0.01 for large datasets; see Appendices A and D [Supporting Information]). This insensitivity to MAF under the null is unique to $F_{ST}$; other metrics of allele frequency differentiation strongly covary with MAF, so that simulations based on the MAF distribution of the study population are required to generate genome-wide null predictions against which data can be compared.
By comparing the distribution of between-sex $F_{ST}$ under the null (i.e., no sex differences in selection) with the corresponding distribution under SA selection (Box 2), we can formally evaluate the minimum strength of selection ($s_{min}$, defined as the minimum cost, per sex, of inheriting the "wrong" SA allele) required for SA loci to reliably reside within the upper tail of the null distribution.
For example, the 99th percentile for the null distribution is $\hat{F}_{ST(99\%)} \approx 3.32/n_H$, where $n_H$ is the harmonic mean sample size of female- and male-derived gene sequences, and $\hat{F}_{ST}$ refers to an estimate of $F_{ST}$ (see Box 2; results are based on Nei's 1973 estimator for $F_{ST}$, which closely aligns with Wright's 1951 definition for $F_{ST}$ between a pair of populations; see Appendix A [Supporting Information] for discussion of alternative $F_{ST}$ estimators). Roughly 1% of $\hat{F}_{ST}$ estimates should fall above this threshold when there are no sex differences in selection. The probability that a SA locus resides within the tail of the null distribution depends on the allele frequencies at the locus, the strength of selection, and the sample size of individuals that are sequenced (Appendix A [Supporting Information]). In studies with large sample sizes (i.e., $n_H = 10^5$ or greater, as in some human genomic datasets: see Fig. 1), $\hat{F}_{ST}$ for a SA locus with intermediate allele frequencies and a fitness effect of a few percent will reliably fall within the upper tail of the null distribution (Fig. 1; Appendix D [Supporting Information]). In contrast, studies where $n_H < 10^4$ require very strong selection ($s_{min} > 0.05$) for SA loci to reliably reside within the upper tail of the null $F_{ST}$ distribution (Fig. 1), and are unlikely to identify individual SA loci (i.e., significant $F_{ST}$ outliers), even in cases where SA genetic polymorphism is common throughout the genome. Under scenarios of sex-limited selection, or sexually concordant selection that differs in magnitude between the sexes, sample sizes required to reliably identify true $F_{ST}$ outliers must be even larger, as the degree of allele frequency differentiation under sex-limited selection is roughly half the differentiation expected under SA selection, and differentiation is further muted under sexually concordant selection.

Figure 1 (caption, beginning missing): [...] with intermediate equilibrium allele frequencies (p, q = 1/2) is in the top 5% tail of the null distribution for $\hat{F}_{ST}$. $|s_m|$ is the fitness cost of being homozygous for the "wrong" SA allele (see Appendices A and B [Supporting Information]). See Appendices C and D (Supporting Information) for further analyses of statistical power.
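The sample-size dependence of the null threshold can be illustrated with a short calculation. The sketch below assumes that per-SNP between-sex $\hat{F}_{ST}$ under the null behaves like a chi-square variable with one degree of freedom scaled by $1/(2 n_H)$; this scaling is an assumption adopted here because it reproduces the $3.32/n_H$ value quoted for the 99th percentile, and the exact null is the one derived in Box 2 and Appendix A. The function names are hypothetical.

```python
from statistics import NormalDist


def harmonic_mean_sample_size(n_female: float, n_male: float) -> float:
    """Harmonic mean of the female- and male-derived sequence counts (n_H)."""
    if n_female <= 0 or n_male <= 0:
        raise ValueError("sample sizes must be positive")
    return 2.0 / (1.0 / n_female + 1.0 / n_male)


def null_fst_quantile(n_h: float, prob: float = 0.99) -> float:
    """Approximate upper quantile of between-sex F_ST with no sex differences in selection.

    ASSUMPTION: F_ST_hat ~ chi-square(1 df) / (2 * n_h). A chi-square(1)
    quantile equals the squared two-sided standard normal quantile.
    """
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    if n_h <= 0:
        raise ValueError("n_h must be positive")
    z = NormalDist().inv_cdf((1.0 + prob) / 2.0)
    return z * z / (2.0 * n_h)


# Thresholds shrink in proportion to 1/n_H, which is why weak selection
# only becomes detectable in very large samples.
for n_h in (1e3, 1e4, 1e5):
    print(n_h, null_fst_quantile(n_h, 0.99))
```

Under this approximation, a tenfold increase in $n_H$ lowers the threshold tenfold, whereas the expected differentiation at a SA locus does not depend on sample size.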
A second putative signal of sex-specific selection is an enrichment of heterozygotes among offspring cohorts, as inferred from high values of $F_{IS}$ (defined in Box 2; Appendix B [Supporting Information]). Although only a single study has used estimates of $F_{IS}$ ($\hat{F}_{IS}$) to test for sex-specific selection (Eyer et al. 2019; see Boxes 1-2), the potential for future applications warrants evaluation of signals of sex-specific selection using this metric. Hardy-Weinberg deviations in a sample, as captured by $\hat{F}_{IS}$, may arise from selection, nonrandom mating (e.g., inbreeding or population structure), or random sampling of genotypes from the population (Crow and Kimura 1970; Weir 1997; Lachance 2009). Statistical properties of Hardy-Weinberg deviations in genotype samples are well established (Weir 1997), and easily adaptable for our point of interest: the distribution of $\hat{F}_{IS}$ in a randomly mating population in which allele frequencies may differ between the female and male parents of a given generation (Box 2). As illustrated in Box 2, the sampling variance for $\hat{F}_{IS}$ exceeds that of $\hat{F}_{ST}$ by a substantial margin. Consequently, $\hat{F}_{IS}$ has far less power than $\hat{F}_{ST}$ to distinguish signal of SA selection from noise (Fig. 1A), let alone to distinguish sex-limited selection or sexually concordant selection that differs in magnitude between the sexes. An additional issue is that signals of elevated heterozygosity are expected to be strongest among cohorts sampled at birth, yet selection occurring during the life cycle can potentially decrease heterozygosity, further dampening signals of sex-specific selection in $\hat{F}_{IS}$ estimates from adult samples.
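To make the heterozygosity argument concrete, the following sketch computes the expected heterozygote excess in offspring when breeding females and males differ in allele frequency. It assumes a biallelic locus and random union of gametes. Because the text associates heterozygote enrichment with high values of $F_{IS}$, the excess is reported as a positive number; the exact definition used in Box 2 is not reproduced here, so the sign convention and the function names are assumptions of this illustration.

```python
def offspring_heterozygosity(p_f: float, p_m: float) -> float:
    """Expected heterozygote frequency among offspring under random mating.

    p_f and p_m are the frequencies of the same allele among breeding
    females and breeding males, respectively.
    """
    return p_f * (1.0 - p_m) + p_m * (1.0 - p_f)


def heterozygote_excess(p_f: float, p_m: float) -> float:
    """Proportional excess of heterozygotes over Hardy-Weinberg expectation.

    ASSUMPTION: the Hardy-Weinberg expectation uses the unweighted mean
    of the two parental allele frequencies.
    """
    p_bar = (p_f + p_m) / 2.0
    h_exp = 2.0 * p_bar * (1.0 - p_bar)
    if h_exp == 0.0:
        raise ValueError("locus is monomorphic; excess is undefined")
    return offspring_heterozygosity(p_f, p_m) / h_exp - 1.0


# A 10-point allele frequency difference between parents produces only
# a ~1% excess of heterozygotes, illustrating how subtle the signal is.
print(heterozygote_excess(0.55, 0.45))
```

Under these assumptions the excess equals $(p_f - p_m)^2 / [4\bar{p}(1-\bar{p})]$, so even large allele frequency differences between breeding parents translate into small deviations from Hardy-Weinberg proportions.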

SEX-SPECIFIC SELECTION
Genome scans for individually significant SA loci ($\hat{F}_{ST}$ or $\hat{F}_{IS}$ outliers) are severely underpowered unless SA loci segregate for intermediate frequency alleles with large fitness effects and sample sizes are very large (see above). Indeed, no empirical $F_{ST}$ study to date has yielded individually significant autosomal candidate SA SNPs that have survived corrections for multiple testing and rigorous controls for genotyping error and read-mapping artifacts (see below).
Although $F_{ST}$ scans for individually significant SA loci are highly conservative, the full empirical distribution of $\hat{F}_{ST}$ for autosomal SNPs may nonetheless carry a cumulative signature of sex-specific selection at many loci, even in the absence of individually significant SA genes. For example, SNPs responding to sex-specific selection (i.e., SNPs with sex-specific fitness effects or SNPs in linkage disequilibrium with them) should inflate average $\hat{F}_{ST}$ and the proportion of observations in upper quantiles predicted by null models (Fig. 2). An excess of observations in the upper quantiles of the theoretical null may imply an enrichment of SNPs responding to sex-specific selection in the tail of the empirical $\hat{F}_{ST}$ distribution, with SNPs in the tail representing interesting candidates for follow-up analyses (see section "Validation and Follow-up Analyses of Candidate SA Genes"). Moreover, the fraction of true versus false positives among candidates can be quantified. For example, if 2% of observed SNPs fall within the top 1% quantile of the theoretical null, this implies a 1:1 ratio of true to false positives within the top 2% of observations (i.e., a false discovery rate of 50%).

Figure 2 (caption, beginning missing): [...] generates some discrepancy between the discrete distribution of $\hat{F}_{ST}$ estimates (for both observed and permuted data) and the continuous theoretical null outlined in Box 2. The top 1% theoretical quantiles for flycatchers and pipefish are enriched by ∼50% and ∼100%, respectively, implying that ∼1/3 and ∼1/2 of SNPs above the 99% threshold of the null are "true" positives (which still require biological validation and filtering for putative artefacts, as described in the main text). Code and data are available at https://github.com/ldutoit/male_female_fst.
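The arithmetic behind these true-to-false positive ratios can be written out in a few lines. The sketch below is only an illustration of the reasoning in the text: it treats the null tail fraction as the expected share of false positives and compares it with the observed share. The function name is hypothetical.

```python
def tail_false_discovery_rate(observed_tail_fraction: float,
                              null_tail_fraction: float) -> float:
    """Share of SNPs above a null quantile that are expected to be false positives.

    observed_tail_fraction: fraction of observed SNPs exceeding the threshold.
    null_tail_fraction: fraction expected to exceed it under the null
    (e.g., 0.01 for the 99th percentile).
    """
    if observed_tail_fraction <= 0.0:
        raise ValueError("no observations above the threshold")
    # Cap at 1.0: a deficit of tail observations implies no detectable enrichment.
    return min(1.0, null_tail_fraction / observed_tail_fraction)


# 2% of SNPs observed in the top 1% null quantile: half are expected false positives.
fdr = tail_false_discovery_rate(0.02, 0.01)
print(fdr, 1.0 - fdr)
```

The same calculation applied to a 50% enrichment of the top 1% quantile (an observed fraction of 0.015) gives a true-positive share of about one third.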
To test for elevation of empirical $\hat{F}_{ST}$ estimates relative to our theoretical null model, we reanalyzed three representative datasets from previously published studies (collared flycatcher Ficedula albicollis: Dutoit et al. 2018; gulf pipefish Syngnathus scovelli: Flanagan and Jones 2017; human: The 1000 Genomes Project Consortium 2015 data used by Cheng and Kirkpatrick 2016; code available at https://github.com/ldutoit/male_female_fst). For human and flycatcher whole-genome resequencing datasets, we used autosomal coding variants, excluding any SNP with missing data. For the pipefish RAD-seq dataset, coding and noncoding variants with less than 50% missing data were included; sex-linked regions are unknown in this species and could not be excluded. In all datasets, polymorphic sites with MAFs below 5% were also excluded, as sites with low MAF exhibit inflated sampling variances (see Whitlock and Lotterhos 2015; Appendix A [Supporting Information] and Figs. S5-S7). Analyses were carried out in bedtools (Quinlan and Hall 2010), vcftools (Danecek et al. 2011), and R (R Core Team 2020).
For all three datasets, permuted $\hat{F}_{ST}$ distributions (i.e., $\hat{F}_{ST}$ calculated after randomly permuting sex labels across individuals) conform well to the theoretical null model for $\hat{F}_{ST}$ (Figs. S1-S4). For the 1000 Genomes human dataset, $\hat{F}_{ST}$ observations (i.e., nonpermuted estimates) are indistinguishable from both the theoretical null and permuted distributions, with no enrichment of observations within the top quantiles of the null (Fig. 2; $\chi^2$ tests; 5% tail: P = 0.18; 1% tail: P = 0.33; comparison between the nonpermuted mean $\hat{F}_{ST}$ and the $\hat{F}_{ST}$ means for 1000 permutations of the data: P > 0.05). In contrast, flycatcher and pipefish datasets show elevated $\hat{F}_{ST}$ values relative to the 5% and 1% tails of their null distributions (Fig. 2; $\chi^2$ tests; $P = 4.33 \times 10^{-9}$ and $P < 2.2 \times 10^{-16}$ for flycatcher; $P < 2.2 \times 10^{-16}$ and $P < 2.2 \times 10^{-16}$ for pipefish), and exhibit inflated means for observed $\hat{F}_{ST}$ relative to the $\hat{F}_{ST}$ means of 1000 permutations of the data (P < 0.05 for both datasets). Such enrichment shows that sampling error is not sufficient to explain empirical distributions of $\hat{F}_{ST}$, and instead implies that many loci are responding to sex differences in selection (either directly or indirectly through hitchhiking with selected loci), or that a false signal of elevated $\hat{F}_{ST}$ has been generated by population structure and/or data quality issues, as discussed below.
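The permutation procedure described above can be sketched for a single SNP as follows. This is a minimal illustration, not the pipeline used for the reanalysis (which relied on bedtools, vcftools, and R): it assumes diploid genotypes coded as alternate-allele counts (0, 1, or 2), complete data, and a two-group form of Nei's estimator with equal weighting of the sexes. All function names are hypothetical.

```python
import random


def nei_fst(p_f: float, p_m: float) -> float:
    """Two-group F_ST = (H_T - H_S) / H_T from female and male allele frequencies.

    ASSUMPTION: the sexes are weighted equally when averaging.
    """
    p_bar = (p_f + p_m) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)                # total expected heterozygosity
    if h_t == 0.0:
        return 0.0                                   # monomorphic site: no differentiation
    h_s = p_f * (1.0 - p_f) + p_m * (1.0 - p_m)      # mean within-sex heterozygosity
    return (h_t - h_s) / h_t


def between_sex_fst(genotypes, is_female) -> float:
    """F_ST between sexes for one SNP, from alt-allele counts per individual."""
    females = [g for g, f in zip(genotypes, is_female) if f]
    males = [g for g, f in zip(genotypes, is_female) if not f]
    if not females or not males:
        raise ValueError("both sexes must be represented")
    return nei_fst(sum(females) / (2.0 * len(females)),
                   sum(males) / (2.0 * len(males)))


def permuted_fst(genotypes, is_female, n_perm: int = 1000, seed: int = 0):
    """F_ST values obtained after randomly permuting sex labels across individuals."""
    rng = random.Random(seed)
    labels = list(is_female)
    values = []
    for _ in range(n_perm):
        rng.shuffle(labels)                          # keeps the number of each sex fixed
        values.append(between_sex_fst(genotypes, labels))
    return values
```

Comparing the observed value of `between_sex_fst` with the distribution returned by `permuted_fst` gives an empirical null that can be checked against the theoretical one; in a genome-wide analysis the same permuted labels would be applied to all SNPs at once.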

SEX-SPECIFIC SELECTION
The analyses presented above suggest that $\hat{F}_{ST}$, although severely underpowered for detecting individually significant outlier SNPs, may capture polygenic signals of sex-specific selection. However, sex differences in selection should not be invoked as the cause of such elevations in $\hat{F}_{ST}$ without excluding artifacts that may generate similar patterns.
First, incorrect mapping of sex-linked markers to autosomes can potentially lead to artificial inflation of $F_{ST}$ estimates. For example, Y- or W-linked sequences may be erroneously mapped to sequence paralogs on autosomes (Tsai et al. 2019), resulting in artificially high $F_{ST}$ inferences at autosomal sites (Bissegger et al. 2019; Kasimatis et al. 2020). This problem can be mitigated in species with high-quality reference genomes, where mismapped reads can be eliminated through quality-filtering steps (i.e., removal of SNPs associated with low MAFs or extreme deviations from Hardy-Weinberg expectations), removal of candidate regions showing high sequence similarity to sex chromosome sequences (Kasimatis et al. 2020), and exclusion of regions with sex biases in read coverage. However, mismapping is difficult to control for in species lacking high-quality reference genomes, including those where the sex determination system is unknown (e.g., the pipefish dataset in Fig. 2), or where sex chromosomes are young or rapidly evolving. Moreover, the effects of demographic processes, including recent admixture events or sex-biased migration, play out differently between sex chromosomes and autosomes (Hedrick 2007), so that caution is required in interpreting elevated between-sex $F_{ST}$ on the X or Z, as has been reported in humans (e.g., Lucotte et al. 2016).
Second, sex differences in population structure-arising from the broad geographic sampling of individuals or recent migration into a single sampled population-can also generate signals of genetic differentiation between females and males in the absence of sex differences in selection (Box 1). Taxa with broad contemporary distributions (e.g., humans and Drosophila) often show significant genetic differentiation among populations. Uneven or unrepresentative sampling of individuals of each sex from a set of different locations can, by chance, inflate allele frequency differences between the sexes beyond expectations for a single, panmictic population. If loci showing high between-sex F ST also exhibit high between-population F ST , this could be indicative of population structure contributing to allele frequency divergence between the sexes in the empirical sample. Studies that sample individuals from a single population may also show artificially elevated between-sex F ST if migration is sex biased (Box 1), which is common among animals (Trochet et al. 2016).
Sex-specific population structure can be accounted for by leveraging the statistical framework of case-control GWAS, in which associations between polymorphic variants and binary phenotypic states are quantified (e.g., presence or absence of a disease). The case-control GWAS approach treats sex (female or male) as the binary phenotypic state and scans for loci with the strongest associations, which should exhibit elevated absolute odds ratios (see Appendix C [Supporting Information]; Kasimatis et al. 2020;Pirastu et al. 2020). Although the underlying logic is identical to between-sex F ST , existing analytical methods for case-control GWAS can take population structure and relatedness in the empirical sample into account by including kinships (or the top principal components derived from kinships) as covariates (Astle and Balding 2009;Price et al. 2010). The case-control GWAS framework also permits estimation of SNP-based heritability of the phenotype (i.e., sex; Yang et al. 2011;Speed et al. 2017), which can be used to quantify a genome-wide signal of sex-specific selection.
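As a simplified illustration of treating sex as the binary phenotype, the sketch below computes an allelic odds ratio and an approximate z-score from a 2 × 2 table of allele counts by sex. It deliberately omits the kinship or principal-component covariates discussed above, which require a regression framework. The 0.5 pseudocount is a commonly used continuity adjustment and is an assumption of this example, as are the function and argument names.

```python
import math


def allelic_odds_ratio(alt_female: int, ref_female: int,
                       alt_male: int, ref_male: int,
                       pseudocount: float = 0.5):
    """Odds ratio for carrying the alternate allele in females versus males.

    Returns (odds_ratio, z), where z is the log odds ratio divided by its
    approximate standard error (normal approximation).
    ASSUMPTION: a pseudocount is added to every cell to avoid division by zero.
    """
    a = alt_female + pseudocount
    b = ref_female + pseudocount
    c = alt_male + pseudocount
    d = ref_male + pseudocount
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return math.exp(log_or), log_or / se


# Allele counts (not individuals): 2 counts per diploid individual.
odds_ratio, z = allelic_odds_ratio(alt_female=520, ref_female=480,
                                   alt_male=480, ref_male=520)
print(odds_ratio, z)
```

An odds ratio above 1 here indicates that the alternate allele is more frequent among females; the absolute log odds ratio plays the role that $\hat{F}_{ST}$ plays in the fixation-index approach.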
Despite the advantages of leveraging an existing statistical framework, using case-control GWAS to test for associations between alternative alleles and sex does not sidestep all of the challenges faced by F ST and F IS statistics (see Appendices C and D [Supporting Information]). As with F ST and F IS , large sample sizes remain essential for discriminating between sampling variance and true signal of sex differences in selection (especially when selection is weak), and the methods perform poorly when MAFs are low. Additionally, association tests using odds ratios depend heavily on a normal approximation, and there is a deep and still evolving literature regarding hypothesis testing using these methods that users should be aware of (e.g., Haldane 1956; Wang and Shan 2015).

Direct Approaches for Identifying SA Genes
In exceptional study systems, candidate SA polymorphisms can be identified through explicit statistical associations between genotypes and fitness. Such direct inference approaches present two major advantages over indirect methods. First, the inclusion of fitness measurements can potentially increase power to detect individual SA loci, relative to indirect methods (e.g., Fig. 1). Second, association tests can be conducted across many components of fitness (e.g., viability, fecundity, and mating success), facilitating identification of the life history stages and selective contexts affected by SA loci. We outline two general approaches for direct inference of sex-specific selection, GWAS and evolve-and-resequence (E&R) studies, which have been extensively employed to identify genes associated with human trait variation and/or local adaptation (Long et al. 2015; Visscher et al. 2017), yet rarely to identify SA loci.

GWAS OF SEX-SPECIFIC FITNESS
GWAS quantify statistical associations between phenotypic variation and polymorphic SNPs throughout the genome. Using GWAS to identify SA loci further requires that data on fitness components and genotypes are collected from individuals of each sex. A major advantage of GWAS is the availability of statistically rigorous methods to identify candidate loci, including methods to control for covariates in analyses (Price et al. 2010), and approaches that correct for multiple testing (e.g., family-wise or false discovery rate correction; Benjamini and Hochberg 1995) or that reduce the number of tests through gene-based association analysis (Nagamine et al. 2012; Riggio et al. 2013; Bérénos et al. 2015). We discuss the application of GWAS to three dataset types: (i) datasets in which genotypes and phenotypes are each measured independently in each sex (e.g., humans); (ii) systems amenable to experimental manipulation, in which each genotype can be replicated among female and male carriers (e.g., isogenic or hemiclone fruit fly lines); and (iii) pedigreed populations, in which the genealogical relationships between all individuals are known (e.g., some sedentary vertebrate populations).
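As an illustration of the false discovery rate correction mentioned above, the step-up procedure of Benjamini and Hochberg (1995) can be sketched in a few lines. The function name and the example P-values are hypothetical; the sketch assumes independent (or positively dependent) tests and is not tied to any particular GWAS software.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking tests rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max (step-up rule).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical per-SNP P-values from an association scan.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
significant = benjamini_hochberg(example_p, alpha=0.05)
```

Because the rule is applied to ranked P-values, a test can be rejected even when its own P-value exceeds its rank-specific threshold, provided a larger P-value further down the ranking meets its threshold.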
Where genotypic and phenotypic measurements are performed among independently sampled individuals of each sex, as in humans, SA loci can be identified by first performing a separate GWAS in each sex ("sex-stratified" GWAS) (Fig. 3A), and then quantifying the difference between male- and female-specific effect sizes (see also Gilks et al. 2014). Illustrating this approach, Winkler et al. (2015) performed a sex-stratified GWAS on several human anthropometric traits, then defined a t-statistic,
$$t = \frac{\beta_M - \beta_F}{\sqrt{SE_M^2 + SE_F^2 - 2\rho\,SE_M SE_F}},$$
where $\beta_M$ and $\beta_F$ are the sex-specific effect sizes, $SE_M$ and $SE_F$ are the sex-specific standard errors, and $\rho$ is the between-sex rank correlation among genome-wide loci. For each polymorphic site, P-values were generated by comparing the observed t-statistics to a null t-distribution with no sex-specific effects (where $E[t] = E[\beta_M - \beta_F] = 0$ under the null). This approach has been applied to nonfitness traits in humans (Randall et al. 2013; Myers et al. 2014; Winkler et al. 2015; Mitra et al. 2016; reviewed in Khramtsova et al. 2019), but has yet to be applied to fitness components (e.g., "number of children" phenotype in the UK Biobank; Sudlow et al. 2015).
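The sex-difference test can be sketched for a single SNP as follows. The sketch assumes the commonly used form of the statistic, in which the difference in effect sizes is divided by the square root of SE_M^2 + SE_F^2 - 2ρ SE_M SE_F, and it approximates the null distribution by a standard normal, which should be reasonable only when the sex-specific samples are large. The function names and input values are hypothetical.

```python
import math


def sex_difference_t(beta_m, se_m, beta_f, se_f, rho=0.0):
    """t-statistic for a difference between male and female effect sizes.

    rho is the genome-wide between-sex correlation of effect estimates
    (assumption: supplied by the user; 0 for fully independent samples).
    """
    var_diff = se_m ** 2 + se_f ** 2 - 2.0 * rho * se_m * se_f
    if var_diff <= 0.0:
        raise ValueError("variance of the difference must be positive")
    return (beta_m - beta_f) / math.sqrt(var_diff)


def two_sided_p_normal(t):
    """Two-sided P-value using a large-sample normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Hypothetical SNP with opposite-signed effects in the two sexes.
t_stat = sex_difference_t(beta_m=0.3, se_m=0.1, beta_f=-0.1, se_f=0.1)
p_value = two_sided_p_normal(t_stat)
```

A significant difference in effect sizes does not by itself demonstrate antagonism; opposite signs of the sex-specific estimates for a fitness component would be the relevant additional criterion.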
In some experimental systems (e.g., fruit flies; flowering plants), the creation of isogenic or hemiclone lines (Abbott and Morrow 2011; Mackay et al. 2012; Berger et al. 2014) allows the same genotypes to be replicated and phenotypically assayed in carriers of each sex. Here, genotypes are effectively transplanted into male and female bodies or "environments," analogous to the reciprocal transplantation of individuals sampled from different environments in local adaptation studies (Price et al. 2018). Identifying SA loci can then be achieved by transforming the bivariate coordinate system of male and female fitness values of a set of genotypes through matrix rotation (see Berger et al. 2014), which generates a univariate SA phenotype amenable to GWAS analysis (Fig. 3B). The approach is exemplified by a recent study in D. melanogaster (Ruzicka et al. 2019), which identified ∼230 candidate SA polymorphic sites.
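The rotation step can be illustrated with a minimal sketch. It assumes that line-mean fitness has already been standardized within each sex (e.g., to mean 0 and unit variance) and that the rotation angle is 45 degrees, so that one new axis captures sexually concordant variation and the other captures antagonistic variation. The sign convention (positive values denoting male-beneficial, female-detrimental genotypes) and the function name are assumptions made for illustration.

```python
import math


def rotate_sex_specific_fitness(w_female, w_male):
    """Rotate (female, male) fitness coordinates by 45 degrees.

    Returns two lists: a sexually concordant axis and an antagonistic axis.
    Inputs are assumed to be standardized line means, one value per genotype.
    """
    if len(w_female) != len(w_male):
        raise ValueError("need one female and one male value per genotype")
    c = 1.0 / math.sqrt(2.0)
    concordant = [(f + m) * c for f, m in zip(w_female, w_male)]
    # Positive values: relatively high male and low female fitness.
    antagonistic = [(m - f) * c for f, m in zip(w_female, w_male)]
    return concordant, antagonistic


# Three hypothetical genotypes: good in both sexes, female-beneficial
# but male-detrimental, and poor in both sexes.
concordant, antagonistic = rotate_sex_specific_fitness(
    [1.0, 1.0, -1.0], [1.0, -1.0, -1.0]
)
```

The antagonistic axis can then be treated as a single quantitative phenotype in a standard association analysis.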
In pedigreed vertebrate populations, such as Soay sheep (Ovis aries) or Florida scrub jays (Aphelocoma coerulescens), the genetic relationships between all individuals are known, and transmission of individual alleles across successive generations can be estimated (MacCluer et al. 1986). Because an individual's genetic contribution to future generations is a genuine representation of its Darwinian fitness, alleles transmitted more frequently by one sex relative to the other represent candidate SA variants. Analyses of pedigreed populations of Florida scrub jays have been used to identify alleles with above-average transmission rates to descendants irrespective of sex (i.e., unconditionally beneficial alleles; Chen et al. 2019), yet this type of GWAS remains to be used to identify SA loci. It should be noted, however, that many pedigreed populations are necessarily small (given the logistics of monitoring them), which may hinder detection of loci affecting fitness variation.
Although GWAS-based identification of candidate SA loci shows promise, two major drawbacks must be kept in mind. First, measurements that capture total lifetime reproductive success are difficult to obtain, and caution is required in interpreting results based on a single fitness component (e.g., reproductive but not viability selection), which may correlate imperfectly with total fitness. Second, effect sizes are typically small for polygenic traits (Visscher et al. 2017), including fitness. Powerful GWAS of sex-specific fitness may therefore be logistically prohibitive, and candidate SA loci will necessarily represent the subset of loci with particularly large fitness effects.

E&R WITH SEX-LIMITED SELECTION
Experimental elimination of selection in one sex but not the other (i.e., sex-limited selection) is a powerful way to identify SA selection in action. Various sex-limited selection designs have been implemented, including (i) restricting transmission of the genome to the male line, and thereby removing selection through females (Rice 1996, 1998; Prasad et al. 2007; Bedhomme et al. 2008; Abbott et al. 2010), or vice versa (Rice 1992), using Drosophila hemiclones; (ii) eliminating fitness variance in one sex (e.g., by enforcing random contributions to offspring number, or removing opportunity for mate choice) but not the other (Rundle et al. 2006; Morrow et al. 2008; Maklakov et al. 2009; Hollis et al. 2014; Immonen et al. 2014; Chenoweth et al. 2015; Veltsos et al. 2017); and (iii) applying sex-limited artificial selection on a specific fitness component, such as mating success (Dugand et al. 2019), lifespan (Berg and Maklakov 2012; Chen and Maklakov 2014; Berger et al. 2016), reproductive tactic (Bielak et al. 2014), or mating investment (Pick et al. 2017).
To identify SA loci, sex-limited selection can be combined with genotyping at multiple time points during experimental evolution within each selection regime (E&R), thereby connecting population genetic changes to the phenotypic responses accrued during experimental evolution. By tracking allele frequencies in male-limited and female-limited selection lines, alleles that show a significant time-by-treatment interaction point to candidate SA loci, and their frequency dynamics can be characterized using current analytical tools for E&R experiments (Wiberg et al. 2017; Vlachos et al. 2019). E&R is a powerful and proven approach for identifying the genomic basis of phenotypic variation and local adaptation (Turner et al. 2011; Long et al. 2015; Barghi et al. 2019). Moreover, because it is experimental, the issues of sex-specific population structure that arise in between-sex F ST studies (see section "Indirect Approaches for Identifying SA Genes") can be minimized. Yet despite these advantages, we are not aware of any published study that has used E&R to identify SA loci (see Chenoweth et al. 2015 for the closest effort to date).
Resequencing can be performed using population samples taken at multiple time points within a single generation or across multiple generations, with the latter approach benefitting from the fact that allele frequency responses to selection are cumulative over multiple generations. E&R, like GWAS, remains best suited for detecting loci with relatively large fitness effects. Selection on complex polygenic traits typically leads to small changes in allele frequencies at large numbers of loci, resulting in genomic signals of selection that are difficult to distinguish from genetic drift (Schlötterer et al. 2015). Consequently, the study organism, the number of replicates, the effective population sizes of selection and control lines, and the duration of experiments must be carefully considered in the design of E&R experiments (Baldwin-Brown et al. 2014; Kofler and Schlötterer 2014; Kessner and Novembre 2015).
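The expected contrast between selection regimes can be illustrated with a deterministic recursion for a single autosomal locus. This is a sketch under simplifying assumptions rather than a model taken from the studies cited above: genotype fitnesses are assumed to be 1 + s, 1 + hs, and 1 for AA, Aa, and aa; offspring are assumed to be in Hardy-Weinberg proportions at birth; the population is treated as infinite (no drift); and selection is assumed to be entirely removed in the nonselected sex. All function and parameter names are hypothetical.

```python
def frequency_after_selection(p, s, h):
    """Frequency of allele A among adults of one sex after selection."""
    mean_fitness = 1.0 + s * (p * p + 2.0 * h * p * (1.0 - p))
    delta = s * p * (1.0 - p) * (h + p * (1.0 - 2.0 * h)) / mean_fitness
    return p + delta


def allele_trajectory(p0, s_f, s_m, generations,
                      h_f=0.5, h_m=0.5, regime="both"):
    """Allele frequency at birth across generations under a given regime.

    regime: "both", "female_limited" (selection through females only),
    or "male_limited" (selection through males only).
    """
    if regime not in ("both", "female_limited", "male_limited"):
        raise ValueError("unknown selection regime")
    p = p0
    trajectory = [p0]
    for _ in range(generations):
        p_f = p if regime == "male_limited" else frequency_after_selection(p, s_f, h_f)
        p_m = p if regime == "female_limited" else frequency_after_selection(p, s_m, h_m)
        # Each sex contributes half of the autosomal alleles in offspring.
        p = 0.5 * (p_f + p_m)
        trajectory.append(p)
    return trajectory


# A female-beneficial, male-detrimental allele under each regime.
female_limited = allele_trajectory(0.5, 0.1, -0.1, 20, regime="female_limited")
male_limited = allele_trajectory(0.5, 0.1, -0.1, 20, regime="male_limited")
```

Under these assumptions the allele rises in female-limited lines and declines in male-limited lines, which is the kind of time-by-treatment divergence that an E&R design would aim to detect against a background of drift.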

Validation and Follow-Up Analyses of Candidate SA Genes
Candidate SA genes and SNP sets enriched for SA alleles (i.e., identified using methods outlined above) provide context for addressing long-standing questions about SA variation, including the genomic distribution, biological functions, and population genetic processes shaping SA polymorphisms. We focus on two specific issues in follow-up analyses of putative SA variants. First, we outline approaches for biologically validating SA candidates, a crucial task given that candidate gene sets may include appreciable proportions of false positives and artifactual signals of sex-specific selection (see section "Indirect Approaches for Identifying SA Genes"). Second, we discuss population genetic analyses and issues of interpretation with bearing on the evolutionary histories of SA genes.

BIOLOGICAL VALIDATION OF SA GENES
Candidate SA loci can be directly validated in laboratory-amenable taxa by experimentally manipulating each allele and measuring its sex-specific fitness effect. A good example of experimental validation of naturally occurring SA genes is a study by VanKuren and Long (2018), in which RNA interference and CRISPR-Cas9 were used to demonstrate SA effects of tandem duplicate genes Apollo and Artemis on offspring production in D. melanogaster. Similarly, Akagi and Charlesworth (2019) used manipulative molecular experiments to study candidate SA genes in several plant species. As a third example, several studies investigated a P450 transposable element insertion that upregulates the Cyp6g1 gene and increases DDT resistance in D. melanogaster (Smith et al. 2011; Rostant et al. 2015; Hawkes et al. 2016). Although evidence for SA effects at this particular locus is mixed, the experimental approaches used, including measurements of sex-specific fitness among isogenic lines and tracking the frequencies of each alternative allele in experimental cages, represent validation steps with potential for broad usage.
Direct experimental manipulation of candidate SA genes is not always feasible, and in instances where it is not, their biological validity can be assessed in other ways. In organisms harboring nonfunctional genomic material, it is possible to test whether candidate loci are enriched in genomic regions that are putatively functional (e.g., coding or regulatory) rather than inert (e.g., intergenic). Such "genic enrichment," which is expected for SA polymorphisms with genuine phenotypic effects, has previously been used to strengthen validity of candidate alleles for local adaptation (Barreiro et al. 2008; Coop et al. 2009; Key et al. 2016). Another way to increase confidence in candidate loci is to look for multiple signals of SA selection. For example, candidates identified through elevated F ST that are also associated with SA fitness effects in a GWAS represent the best candidates for follow-up evolutionary analyses (see below). Finally, if independent data exist on the sex-specific phenotypic effects of individual mutations (e.g., in RNAi databases), these data can be mined to support the validity of candidate SA genes.

EVOLUTIONARY DYNAMICS OF SA GENES
We do not outline the range of population genetic analyses that could be used to describe the evolutionary dynamics of candidate SA loci, as these have been comprehensively reviewed elsewhere (e.g., Vitti et al. 2013; Fijarczyk and Babik 2015). Instead, we provide guidance on some common issues that are likely to arise when analyzing patterns of genetic variation at SA loci and interpreting their mode of evolution.
First, we emphasize that the evolutionary dynamics of a contemporary SA gene may have, in the past, been governed by any combination of genetic drift, net directional selection (selection favoring fixation of one SA allele), or balancing selection (selection maintaining SA polymorphism). Theory often focuses on the conditions generating balancing selection at SA loci (e.g., Kidwell et al. 1977; Patten and Haig 2009; Fry 2010), leading some empirical studies to use signals of balancing selection (e.g., elevated Tajima's D) as indirect proxies for SA selection (e.g., Dutoit et al. 2018; Wright et al. 2018, 2019; Sayadi et al. 2019). However, whether contemporary candidate SA alleles evolved under balancing selection hinges upon both the historical pattern of sex-specific selection and dominance at such loci (Kidwell et al. 1977; Connallon and Chenoweth 2019), which can be influenced by spatially and temporally varying selection, and effective population size (Connallon and Clark 2012, 2014a; Mullon et al. 2012). SA loci with large and symmetric selection coefficients or beneficial reversals of dominance (e.g., $h_f = 1$ and $h_m = 0$ in Box 1) are most conducive to balancing selection, whereas sufficient asymmetry between the sexes in the strength of selection (e.g., Sharp and Agrawal 2013) should result in net directional selection that removes SA polymorphism rather than maintaining it (Kidwell et al. 1977; Kasimatis et al. 2019). Even when conditions for long-term balancing selection are met, the efficacy of balancing selection relative to drift may often be weak at SA loci, leading to genetic diversity patterns that are indistinguishable from neutrally evolving loci (Connallon and Clark 2012, 2013; Mullon et al. 2012). In short, loci under contemporary SA selection can have a broad range of possible evolutionary histories. As such, the typical mode of evolution operating at candidate SA loci cannot be assumed a priori and should instead be viewed as a question that must be resolved empirically.
Second, the detection of elevated polymorphism at SA loci does not necessarily imply balancing selection. For example, SA candidate loci may exhibit significantly elevated MAFs relative to non-SA loci (Ruzicka et al. 2019), yet relaxed directional selection can account for this pattern if non-SA loci encompass a mix of neutral sites and sites evolving under sexually concordant directional selection. To establish that SA loci are evolving under balancing selection, it is necessary to show that SA genetic variation is significantly elevated compared to confirmed neutral sites (e.g., short introns; Parsch et al. 2010) and cannot be accounted for by demographic or mutational processes (Andrés et al. 2009; DeGiorgio et al. 2014; Bitarello et al. 2018). On the other hand, significant reductions in polymorphism at SA loci, relative to neutral sites, do not necessarily rule out balancing selection either. Counterintuitively, when the equilibrium frequency of the minor allele is low (i.e., equilibrium MAF < 0.2, approximately), balanced polymorphisms can be lost more rapidly than neutral polymorphisms, leading to reduced genetic variation relative to neutral expectations (Robertson 1962; Mullon et al. 2012).
A third and final point is that nonrandom patterns of genetic variation at SA loci can be generated by ascertainment bias alone. For example, data filtering steps that remove low-MAF SNPs (see "Indirect Approaches for Identifying SA Genes") necessarily exclude rare SA variants from all downstream analyses. Elevated power to detect fitness effects among intermediate-frequency sites in a GWAS can also generate a spurious positive relationship between candidate SA sites and MAF that might be mistaken for nonneutral evolution. Similarly, SA loci could be nonrandomly distributed across the genome (e.g., enriched in regions with low or high recombination), thereby generating spurious patterns in population genomic data that appear to indicate nonneutral evolution. It is therefore important to correct for such biases where possible by, for example, incorporating external data on recombination rate variation (Comeron et al. 2012; Elyashiv et al. 2016) or assessing evidence of trans-species polymorphisms among SA loci before and after removal of CpG sites (Leffler et al. 2013).

Moving Forward
We have critically assessed a broad range of methods for detecting genomic signatures of SA selection, including indirect methods based on genome sequence analysis (section "Indirect Approaches for Identifying SA Genes") and direct methods based on associations between genome sequences and fitness measurements (section "Direct Approaches for Identifying SA Genes"). An inescapable conclusion from our indirect inference models is that very strong sex differences in selection or very large sample sizes are required to detect individual SA candidate polymorphisms with high confidence (Fig. 1; Appendices A-D [Supporting Information]), in agreement with previous simulation studies of between-sex F ST (Lucotte et al. 2016; Connallon and Hall 2018; Kasimatis et al. 2019). Nevertheless, estimates of the full distribution of F ST from previously published flycatcher and pipefish datasets reveal an intriguing elevation of genome-wide F ST relative to our null models, which justifies future empirical studies of allele frequency differences between sexes (see below). Although an elevated signal of between-sex F ST is not present in our reanalysis of human data, the absence of a signal is perhaps unsurprising given the small number of loci analyzed, following the removal of noncoding sequences and loci with rare polymorphisms. Genome-wide analyses of sex-specific selection in humans using larger datasets (Kasimatis et al. 2020), and examinations of sex-biased expression among sites with elevated F ST (Cheng and Kirkpatrick 2016), which were the focus of previous work but not of our reanalysis, are therefore encouraged. With regard to direct methods for identifying SA genes, the substantial logistical challenge of accurately measuring fitness must be circumvented, yet the approach is powerful when feasible (see Ruzicka et al. 2019) and certain to be a key component of future work on the genetics of sex-specific fitness variation.
Although there is little doubt that identifying and characterizing SA genes is challenging, there are several reasons for optimism. First, the low power of indirect metrics to detect selection at an individual locus level does not rule out the detection of a cumulative signal of polygenic sex differences in selection (e.g., Fig. 2). Although such an approach implies that candidate SA genes (e.g., those in the highest F ST quantiles) will include many false positives, elevated false discovery rates are not necessarily problematic if we are interested in the general properties of SA candidates relative to samples of putatively neutral (or non-SA) loci. Nevertheless, in studies with low-to-moderate sample sizes, where many candidate genes will be false positives, researchers should minimally demonstrate that (i) the empirical distribution of the metric of interest differs significantly from its appropriate null (see Kasimatis et al. 2019; our reanalyses in section "Indirect Approaches for Identifying SA Genes"), (ii) putative signals of selection are not driven by sex-specific population structure or other artefacts (see section "Indirect Approaches for Identifying SA Genes"), and (iii) candidate loci are situated in putatively functional genome regions (see section "Validation and Follow-Up Analyses of Candidate SA Genes").
Second, the power to detect SA genes using indirect metrics can often be increased in relatively simple ways. For example, pooled sequencing is a cost-effective method for estimating allele frequencies from samples of many individuals (Schlötterer et al. 2015), and well-suited for genome-wide F ST scans (although not for F IS scans, as estimating F IS requires individual-level genotype data). Researchers could, alternatively, focus attention toward large publicly available genomic datasets that are adequately powered for detecting loci under moderately strong SA selection (see Fig. 1), or toward genomic regions predicted to have relatively high statistical power. For example, studies of pseudoautosomal regions of recombining sex chromosomes have substantially higher power to detect F ST outliers driven by sex differences in selection (Qiu et al. 2013; Kirkpatrick and Guerrero 2014). Targeted sampling strategies may also amplify power to identify SA genes. For example, estimating allele frequencies among breeding adults, which have passed filters of viability selection and components of adult reproductive success, increases the number of episodes of selection that can contribute to allele frequency differentiation between sexes, improving the potential for detecting elevated between-sex F ST.
Third, well-chosen study systems can improve prospects for accurately measuring lifetime reproductive success and identifying SA loci through direct methods (GWAS or E&R). For example, difficulties in accurately measuring fitness under field conditions can be mitigated in pedigreed populations, where the genetic contribution of each individual to successive generations is known (provided the population is well monitored), and each genotype can therefore be associated with an accurate estimate of total lifetime reproductive success in each sex. Emerging approaches to infer pedigrees from genomic data alone (Snyder-Mackler et al. 2016) may further facilitate identification of SA loci in the absence of long-term monitoring efforts. In some experimental systems, such as laboratory-adapted hemiclones of  Delph et al. 2011; Muyle et al. 2012), plant systems remain underused in research on SA selection. One advantage of plants is their greater amenability to field measurements of fitness components, as widely used in studies of local adaptation and species' range limits (Hargreaves et al. 2014). Another advantage is the great diversity of reproductive systems in flowering plant species, the vast majority of which are hermaphroditic and susceptible to SA selection (Jordan and Connallon 2014; Tazzyman and Abbott 2015; Olito 2017; Olito et al. 2018), potentially leading to allele frequency differences between the pollen and ovules contributing to fertilization under haploid selection, and to elevated F IS among offspring. A third advantage of plants is their greater tendency to express genetic variation during the haploid stage of their life cycle (e.g., Immler and Otto 2018). Haploid (relative to diploid) expression is expected to inflate the contribution of genetic polymorphism to fitness variance and magnify evolutionary responses to selection, including within-generation allele frequency divergence between sexes (Connallon and Jordan 2016). Exploiting plant systems may thereby increase statistical power to identify candidate SA genes or genomic signals of SA variation using direct (GWAS and E&R) or indirect inference approaches.

Box 1. Processes generating sexually divergent allele frequencies
Several evolutionary scenarios can lead to sex differences in the frequencies with which individual alleles are transmitted to offspring (Hedrick 2007; Úbeda et al. 2011). We focus on two processes, sex differences in selection and sex-biased migration, that may each commonly arise and affect estimates of allele frequency differences between sexes.
Sex differences in selection. Consider a single biallelic locus in which the focal allele (allele A) has a frequency of $p$ at birth within a given generation; the alternative a allele has a frequency of $1 - p$. Selection during the life cycle alters the allele frequencies in the set of adults that contribute offspring to the next generation. The frequencies of the A allele in breeding females and males (respectively) are as follows:
$$p_f = p + s_f\,p(1 - p)\,[h_f + p(1 - 2h_f)] + O(s_f^2, s_m^2, s_f s_m),$$
$$p_m = p + s_m\,p(1 - p)\,[h_m + p(1 - 2h_m)] + O(s_f^2, s_m^2, s_f s_m),$$
where $s_f$ and $s_m$ are female and male selection coefficients for the A allele, $h_f$ and $h_m$ are the dominance coefficients (Table 1), and $O(s_f^2, s_m^2, s_f s_m)$ refers to second-order terms in the selection coefficients, which are negligible (and can be ignored) when $s_f$ and $s_m$ are small, as expected for most loci (Charlesworth and Charlesworth 2010, p. 97).
We expect allele frequency differences between breeding adults of each sex ($p_f \neq p_m$) when the fitness effects of each allele differ between sexes, that is, when (1) the same allele is favored in each sex but the strength of selection differs between sexes (e.g., $p_f > p_m$ when $s_f > s_m > 0$); (2) alleles have sex-limited fitness effects (e.g., $p_f > p_m$ when $s_f > 0 = s_m$); or (3) alleles are sexually antagonistic (e.g., $p_f > p_m$ when $s_f > 0 > s_m$). Allele frequencies are expected to remain equal between sexes ($p_f = p_m$) when genetic variation is neutral ($s_f = s_m = 0$), or selection and dominance coefficients are identical between sexes ($s_f = s_m$ and $h_f = h_m$).
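The first-order approximation for allele frequencies in breeding adults can be recovered in a few steps. The derivation below is a sketch that assumes genotype fitnesses of $1 + s$, $1 + hs$, and $1$ for genotypes AA, Aa, and aa in the sex under consideration (an assumed parameterization, chosen so that $h_f = 1$ and $h_m = 0$ correspond to a beneficial reversal of dominance when $s_f > 0 > s_m$) and Hardy-Weinberg genotype frequencies at birth. Sex subscripts are dropped for brevity, and $p'$ denotes the frequency of A after selection.

```latex
\begin{align*}
\bar{w} &= 1 + s\left[p^2 + 2hp(1-p)\right], \\
p' &= \frac{p^2(1+s) + p(1-p)(1+hs)}{\bar{w}}, \\
p' - p &= \frac{s\,p(1-p)\left[h + p(1-2h)\right]}{\bar{w}} \\
       &= s\,p(1-p)\left[h + p(1-2h)\right] + O(s^2).
\end{align*}
```

Applying the result separately with $(s_f, h_f)$ and $(s_m, h_m)$ gives, to first order, $p_f - p_m \approx p(1-p)\{s_f[h_f + p(1-2h_f)] - s_m[h_m + p(1-2h_m)]\}$, which is zero when selection and dominance coefficients are identical between sexes.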
Sex-biased migration. Consider an island population receiving new migrants each generation, with migration occurring before reproduction during the life cycle. At birth, the frequencies of A and a alleles in the island population are $p$ and $1 - p$, respectively. Let $m_f$ and $m_m$ represent the proportions of breeding females and breeding males on the island that are migrants. The expected frequencies of the A allele in breeding females and males (respectively) will be $p_f = (1 - m_f)p + m_f\tilde{p}$ and $p_m = (1 - m_m)p + m_m\tilde{p}$, where $\tilde{p}$ is the frequency of the A allele in migrant individuals. The identity $p_f - p_m = (m_f - m_m)(\tilde{p} - p)$ implies that allele frequency differences between breeding females and males ($p_f \neq p_m$) would require both sex-biased migration ($m_f \neq m_m$) and allele frequency differences between migrant and resident (nonmigrant) individuals ($\tilde{p} \neq p$).
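A short numerical example makes the two requirements concrete. The function name and parameter values below are hypothetical, and the migrant allele frequency is written as p_migrant.

```python
def breeding_frequencies(p, p_migrant, m_f, m_m):
    """Frequency of allele A in breeding females and males on the island.

    p: frequency among residents at birth; p_migrant: frequency among migrants;
    m_f, m_m: proportions of breeding females and males that are migrants.
    """
    p_f = (1.0 - m_f) * p + m_f * p_migrant
    p_m = (1.0 - m_m) * p + m_m * p_migrant
    return p_f, p_m


# Female-biased migration from a source with a different allele frequency.
p_f, p_m = breeding_frequencies(p=0.5, p_migrant=0.9, m_f=0.2, m_m=0.0)
difference = p_f - p_m  # equals (m_f - m_m) * (p_migrant - p)

# No sex bias in migration: frequencies remain equal between the sexes.
q_f, q_m = breeding_frequencies(p=0.5, p_migrant=0.9, m_f=0.1, m_m=0.1)
```

In the first case the sexes differ in allele frequency without any selection, which is why sex-biased migration must be ruled out before interpreting between-sex differentiation as a signal of selection.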

Box 2. Fixation indices (F ST and F IS ) applied to sex differences
Allele frequency differences between breeding adults of each sex can be quantified by way of fixation indices, originally devised by Wright (1951) for characterizing genetic differentiation among populations. We again consider a biallelic autosomal locus with the focal allele (A) at a frequency of $p_f$ in breeding females and $p_m$ in breeding males.
Between-sex F ST. F ST is a standardized measure of the allele frequency difference between the sexes: $F_{ST} = (p_f - p_m)^2 / [4\bar{p}(1 - \bar{p})]$, where $\bar{p} = (p_f + p_m)/2$. Estimates of F ST from a population sample, $\hat{F}_{ST}$, are approximately proportional to a random variable that has a noncentral chi-squared distribution with one degree of freedom and a noncentrality parameter, $\lambda$, that increases with $(p_f - p_m)^2$ and with the sample sizes $n_f$ and $n_m$ (see Appendix A [Supporting Information]), where $n_f$ and $n_m$ are the numbers of sequences derived from females and males, respectively. The approximation can break down when $n_f$ and $n_m$ are small or the minor (rarer) allele at the locus has a frequency close to zero. Under the statistical null distribution, the true allele frequencies do not differ between the sexes ($p_f = p_m$), and therefore $\hat{F}_{ST} \approx X_0 / (2n_H)$, where $X_0$ is a chi-squared random variable with one degree of freedom, and $n_H = 2(1/n_f + 1/n_m)^{-1}$ is the harmonic mean sample size.
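The null distribution can be checked by simulation. The sketch below is a minimal illustration using only the Python standard library; it assumes the two-group form of the statistic, $(p_f - p_m)^2 / [4\bar{p}(1 - \bar{p})]$ with $\bar{p}$ the unweighted mean of the two sample frequencies, and it draws female and male allele counts binomially from a common frequency. The function names, sample sizes, and random seed are hypothetical. Because a chi-squared variable with one degree of freedom has a mean of 1, the average of $2n_H\hat{F}_{ST}$ across simulated loci should be close to 1 under the null.

```python
import random


def fst_between_sexes(p_f, p_m):
    """Between-sex F_ST from female and male allele frequencies."""
    p_bar = 0.5 * (p_f + p_m)
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0  # monomorphic sample: no differentiation to measure
    return (p_f - p_m) ** 2 / (4.0 * p_bar * (1.0 - p_bar))


def sample_frequency(p, n, rng):
    """Sample frequency of allele A among n sequences (binomial draw)."""
    return sum(rng.random() < p for _ in range(n)) / n


def simulate_null_fst(p, n_f, n_m, reps, seed=1):
    """Simulate F_ST estimates when the true frequency is equal in both sexes."""
    rng = random.Random(seed)
    return [
        fst_between_sexes(sample_frequency(p, n_f, rng),
                          sample_frequency(p, n_m, rng))
        for _ in range(reps)
    ]


n_f, n_m = 100, 100
n_h = 2.0 / (1.0 / n_f + 1.0 / n_m)  # harmonic mean sample size
null_fst = simulate_null_fst(p=0.5, n_f=n_f, n_m=n_m, reps=2000)
scaled_mean = sum(2.0 * n_h * f for f in null_fst) / len(null_fst)
```

Comparing the quantiles of an empirical genome-wide distribution against those of such a null, rather than inspecting individual outliers, follows the multilocus logic described in the main text.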
F IS in offspring. F IS can be used to quantify deviations between the observed heterozygosity in a cohort of individuals before selection (e.g., individuals sampled and sequenced at birth) and the expected heterozygosity under Hardy-Weinberg equilibrium. With random mating among breeding adults with female and male allele frequencies $p_f$ and $p_m$, and ignoring effects of genetic drift or segregation distortion, the frequency of the A allele in offspring of the next generation will be $\bar{p} = (p_f + p_m)/2$, and the proportion that is heterozygous will be $P_{Aa} = p_f(1 - p_m) + p_m(1 - p_f)$. Under these conditions, heterozygosity exceeds its Hardy-Weinberg expectation, $2\bar{p}(1 - \bar{p})$, whenever $p_f \neq p_m$, and the magnitude of F IS is $(p_f - p_m)^2 / [4\bar{p}(1 - \bar{p})]$, which is equivalent to F ST between breeding females and males of the prior generation (Kasimatis et al. 2019). This expression applies to the entire set of offspring in a population, whereas empirical estimates of F IS will be based on the genotypes of offspring sampled from the population. As shown in Appendix B (Supporting Information), estimates of F IS will be approximately normally distributed, with a mean and variance that depend on $n$, the number of offspring genotyped for the locus. The approximation applies when the sample size is large. Under a null model, in which offspring are outbred and mating is random, estimates $\hat{F}_{IS}$ will be normal with mean of $1/(2n)$ and variance of $1/n$.
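The equivalence between the heterozygote excess in offspring and between-sex F ST in their parents follows directly from the definitions of $\bar{p}$ and $P_{Aa}$. The worked steps below assume the two-group form $F_{ST} = (p_f - p_m)^2 / [4\bar{p}(1 - \bar{p})]$ and express the deviation as observed over expected heterozygosity minus one; under the alternative convention of one minus observed over expected heterozygosity, the same quantity carries a negative sign.

```latex
\begin{align*}
P_{Aa} - 2\bar{p}(1-\bar{p})
  &= \left(p_f + p_m - 2p_f p_m\right)
     - \left[(p_f + p_m) - \tfrac{1}{2}(p_f + p_m)^2\right] \\
  &= \tfrac{1}{2}(p_f + p_m)^2 - 2p_f p_m \\
  &= \tfrac{1}{2}(p_f - p_m)^2, \\[4pt]
\frac{P_{Aa}}{2\bar{p}(1-\bar{p})} - 1
  &= \frac{(p_f - p_m)^2}{4\bar{p}(1-\bar{p})} = F_{ST}.
\end{align*}
```

For example, with $p_f = 0.6$ and $p_m = 0.4$, $\bar{p} = 0.5$, $P_{Aa} = 0.52$, and the relative heterozygote excess is $0.52/0.5 - 1 = 0.04$, matching $F_{ST} = 0.04/1 = 0.04$.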

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Figure S1. Permuted versus observed $\hat{F}_{ST}$ for simulated data.
Figure S2. Permuted versus observed $\hat{F}_{ST}$ for the flycatcher dataset.
Figure S3. Permuted versus observed $\hat{F}_{ST}$ for the pipefish dataset.
Figure S4. Permuted versus observed $\hat{F}_{ST}$ for the human 1000 Genomes data.
Figure S5. Human 1000 Genomes data with SNPs with low minor allele frequencies (MAFs) included in the analysis.
Appendix A. Distribution of F ST estimates.
Table S1. Sex-specific relative fitness of genotypes of a biallelic locus with additive fitness effects in each sex.
Appendix B. Sex-specific allele frequencies and F IS estimates.
Appendix C. Case-control GWAS and the Log-Odds Ratio.
Figure S6. Probability that L. /SE for an additive SA locus.
Figure S7. Theoretical signal of multilocus SA polymorphism: the ratio of observed versus permuted absolute log odds ratio estimates within 100 quantiles of the theoretical null of L. /SE.
Appendix D. Comparison of null model distributions for different metrics of allele frequency differentiation between sexes.