Measuring viability selection from prospective cohort mortality studies: A case study in maritime pine

Abstract By changing the genetic background available for selection at subsequent life stages, stage‐specific selection can define adaptive potential across the life cycle. We propose and evaluate here a neutrality test and a Bayesian method to infer stage‐specific viability selection coefficients using sequential random genotypic samples drawn from a longitudinal cohort mortality study, within a generation. The approach is suitable for investigating selective mortality in large natural or experimental cohorts of any organism in which individual tagging and tracking are unfeasible. Numerical simulation results indicate that the method can discriminate loci under strong viability selection, and provided samples are large, yield accurate estimates of the corresponding selection coefficients. Genotypic frequency changes are largely driven by sampling noise under weak selection, however, compromising inference in that case. We apply the proposed methods to analyze viability selection operating at early recruitment stages in a natural maritime pine (Pinus pinaster Ait.) population. We measured temporal genotypic frequency changes at 384 candidate‐gene SNP loci among seedlings sampled from the time of emergence in autumn until the summer of the following year, a period with high elimination rates. We detected five loci undergoing allele frequency changes larger than expected from stochastic mortality and sampling, with putative functions that could influence survival at early seedling stages. Our results illustrate how new statistical and sampling schemes can be used to conduct genomic scans of contemporary selection on specific life stages.

changes associated with ontogeny, in addition to potentially associated changes in demography and the environment, should result in viability selection variation with age. Importantly, selective pressures operating during critical ontogenic and demographic transitions might have a disproportionate contribution to total viability selection. In the case of long-living tree species, and many other organisms producing vast offspring cohorts, most mortality occurs during the first few months after emergence, frequently exceeding 90% during the first year (e.g., Castro, Gómez, García, Zamora, & Hódar, 1999;Vizcaíno-Palomar, Revuelta-Eugercios, Zavala, Alía, & González-Martínez, 2014). Although the intensity of selection is not expected to increase rapidly with early elimination rates (Haldane, 1931), the combination of strong early seedling mortality, high interindividual variance in seed fecundity (Kang, Bila, Harju, & Lindgren, 2003), and substantial genetic variation at early fitness traits (e.g., Savolainen, Bokma, García-Gil, Komulainen, & Repo, 2004) indicates ample opportunities for strong viability selection at early tree life stages (Hufford & Hamrick, 2003;Petit & Hampe, 2006;Savolainen, Pyhäjärvi, & Knürr, 2007).
Methods for detecting loci associated with viability selection during specific developmental stages are thus essential to understand the evolution of traits that are potential major contributors to total lifetime fitness and therefore important determinants of species niches and range limits under present and future climatic conditions (Donohue, 2014;Donohue, Rubio de Casas, Burghardt, Kovach, & Willis, 2010;Jackson, Betancourt, Booth, & Gray, 2009).
Genomic scans for selection are a widely used tool for detecting adaptive loci, based on the comparison of observed versus expected patterns of genetic variation under particular demographic and selective models (Jensen, Foll, & Bernatchez, 2016). Several test statistics are available to identify signatures of selection from polymorphism data taken at a single point in time, such as site frequency spectra, patterns of linkage disequilibrium, or interpopulation differentiation data (reviewed in Bank, Ewing, Ferrer-Admettla, Foll, & Jensen, 2014). These approaches are not suitable for investigating viability selection separately, however, because they are designed to detect footprints of historical selective processes, with target patterns of variation reflecting the cumulative effects of demographic and selective processes over generations, including fecundity, sexual, and gametic selection. Alternative methods use genomic time series data, tracking allele frequency trajectories over multiple generation samples, within and/or among populations (Bollback, York, & Nielsen, 2008;Feder, Kryazhimskiy, & Plotkin, 2014;Ferrer-Admetlla, Leuenberger, Jensen, & Wegmann, 2016;Foll, Shim, & Jensen, 2015;Gompert, 2016;Malaspinas, Malaspinas, Evans, & Slatkin, 2012;Mathieson & McVean, 2013;Nishino, 2013;Schraiber, Evans, & Slatkin, 2016;Shim, Laurent, Matuszewski, Foll, & Jensen, 2016). The temporal dimension usually allows more powerful joint inference of selection coefficients and demographic parameters such as effective population size and gene flow, in the necessary disentanglement of the confounding effects of demography (versus selection) on allele frequency change across generations (Malaspinas, 2016). Even if these methods were powerful enough to infer selection over as few as two consecutive generations, however, they would still be measuring total selection and thus failing to isolate signatures of viability selection.
Genomic scans for viability selection acting on specific developmental stages require methods capable of testing allele frequency change caused by mortality during the target ontogenic period, within a single generation. For this purpose, a logistically appealing approach for long-living tree species would consist of simultaneously sampling for genotyping several cohorts of different ages coexisting nearby within the same population. This rapid cross-sectional sampling scheme has been used frequently in studies of neutral molecular variation across tree life stages (e.g., Alvarez-Buylla, Chaos, Piñero, & Garay, 1996). But this strategy is questionable, at least for viability selection inference, because the ontogenetic staging of cohorts recruited at different times being necessarily asynchronous, observed allele frequency differences will be inflated by cross-cohort differences in parental structure (e.g., due to masting and/or interannual pollination variation) and by temporal variation in selection.
Longitudinal cohort studies circumvent this problem, by sequentially sampling a target cohort of same-age individuals multiple times as they age and die, and are more suitable for analysis of viability selection (e.g., Christiansen, Frydenberg, & Simonsen, 1978;Edwards & Heath, 1983). The few available genome-wide analyses of selection based on longitudinal cohort studies have investigated contemporary allele frequency change caused by mortality under experimental selective regimes (Anderson, Lee, & Mitchell-Olds, 2014;Gompert et al., 2014;Pespeni et al., 2013), after mass die-offs caused by toxins in the wild (De Wit, Rogers-Bennett, Kudela, & Palumbi, 2014), and in response to unknown natural selective pressures operating during critical ontogenic and demographic transitions (Bourret, Dionne, & Bernatchez, 2014) or during invasion (Monnahan, Colicchio, & Kelly, 2015). All of these studies formally tested the null neutral hypothesis that observed temporal allele frequency changes were caused by sampling and stochastic mortality alone, while two of the studies additionally estimated selection coefficients. In particular, studies where it was feasible to track and genotype exhaustively every single individual in the (small) experimental cohort did conduct selection coefficient estimation, based on observed absolute differences in survival probability between alternative homozygotes (Anderson et al., 2014;Gompert et al., 2014). The other studies focused on big natural or experimental populations, in which individual tagging and exhaustive sampling were unfeasible, so instead independent random samples were taken from the large target cohorts before and after the selective episode of interest, and no attempts were made to estimate selection coefficients from such incomplete longitudinal samples (Bourret et al., 2014;De Wit et al., 2014;Monnahan et al., 2015;Pespeni et al., 2013). The null hypothesis of genotypic-independent mortality was still tested, either based on permutation analyses of genotypes among temporal samples (De Wit et al., 2014;Pespeni et al., 2013), homogeneity tests of genotypic proportions (Monnahan et al., 2015), or through a direct application of outlier tests designed for the analysis of historic adaptive differentiation among populations (Bourret et al., 2014). None of these studies evaluated the expected statistical performance of the employed methods with simulations, as would be required to determine their error rates and reliability (Lotterhos & Schaal, 2014).
As in the case of many tree species, natural populations of numerous other plant and animal taxa are large, with vast seedling or larval cohorts in which individual tagging and tracking may be difficult and costly. Further methods to infer viability selection from independent random samples in longitudinal cohort mortality studies would thus be broadly useful, facilitating the investigation of adaptive processes at particular ontogenetic stages and contemporary selective episodes. For instance, monitoring contemporary selection in exploited fisheries or forests during critical developmental or demographic stages could provide valuable real-time information to orient adaptive conservation management. Measuring stage-specific contemporary selection should also be central in studies tracking ongoing global-change effects on the evolutionary ecology of wild or managed populations. We propose here tools to address this kind of practical questions, introducing tests of neutrality and a Bayesian scheme to estimate viability selection coefficients from independent random genotypic samples, taken sequentially from a large cohort within a generation. Using simulations, we analyze the accuracy and expected error rates of these methods under contrasting sampling and selective scenarios. Using SNP markers within a large number of candidate genes, we apply the approach to investigate viability selection operating at early recruitment stages in a relict Mediterranean maritime pine (Pinus pinaster Ait.) population.

| Demographic model
We assume a cohort of diploid individuals, of which a set of (n 0 , n 1 , …, n T ) are sampled randomly at an initial reference time step (t = 0) and at T subsequent time steps, with T ≥ 1. The cohort size throughout the sampling period (N 0 , N 1 , …, N T ) is assumed to be unknown. We assume that the sampled cohort is coetaneous, and that no new individuals join it after t = 0, neither through reproduction nor through migration. The cohort is followed over time to determine whether genotypic-dependent mortality occurs throughout one or several demographic or environmental episodes of interest, temporally delimited by the sampling intervals. For this purpose, the temporal samples are genotyped at L biallelic loci, yielding the y = {y 11 lt ,y 12 lt ,y 22 lt } vector of observed genotypic counts, where y 11 lt , y 12 lt , and y 22 lt are the number of individuals in the sample collected at time t that are homozygous for the first allele, heterozygous, and homozygous for the second allele at locus l, respectively.
We consider as point of reference for measuring genotypic frequency change the initial genotypic frequencies in the zygote pool originating the target cohort (p 0 ), immediately after gamete fusion. The N 0 genotypes in the cohort at t = 0 can then be assumed to be multinomial draws from the initial zygote pool, while the n 0 genotypes in the initial temporal sample are in turn drawn from a finite population (cohort) of size N 0 . This two-step sampling is equivalent to one-step multinomial sampling of n 0 genotypes from the initial pool of genotypes, no matter whether the n 0 individuals are sampled from the cohort with replacement or not (Nei & Tajima, 1981;Waples, 1989). This equivalence holds as well for subsequent samples (t ≥ 1) in the absence of selection. If selection operates after the initial reference sample is collected at t = 0, however, then expected genotypic frequencies for t > 0 will diverge from p 0 .

| Quasi-exact and Monte Carlo tests of neutrality
We propose a quasi-exact test and a fast Monte Carlo approximation to test the null hypothesis of genotypic-independent mortality (i.e., neutrality) in the cohort. The question is whether observed genotype frequency changes among temporal samples are greater than those expected from random mortality and sampling alone. For the l locus, we used the standardized variance in allele frequency for measuring the observed allele frequency change between time steps t − 1 and t (Pollak, 1983;see Waples, 1989 for the two-allele expression) where f lt = (2y 11 lt + y 12 lt )∕(2n lt ) and f lt−1 = (2y 11 lt−1 + y 12 lt−1 )∕(2n lt−1 ). We allow for differences in sample size across loci to account for possible differences in missing data. Under the null (neutral) hypothesis, the n i genotypes are multinomial draws from the initial zygote pool (with frequencies p l0 ) for any t. We can thus compute exactly, given p l0, the cumulative probability (the p-value of the test) that two independent multinomial draws of n lt and n lt-1 genotypes from the initial pool yield an allele frequency change equal or greater than the where y 11 l , y 12 l , and y 22 l are the observed counts of genotypes 11, 12, and 22 in the total pooled sample, respectively, and n l is the (2) p 11 l0 = (y 11 l + 1∕3)∕(n l + 1),p 12 l0 = (y 12 l + 1∕3)∕(n l + 1) andp 22 l0 = (y 22 l + 1∕3)∕(n l + 1), pooled sample size. We will refer to the test in Equation 2 as the quasi-exact neutrality test, to reflect the fact that, although it is nonparametric and based on exhaustive combinatorial enumeration, it relies on the estimation of initial allele frequencies. Unlike homogeneity tests used in adult-offspring selection component analysis (Christiansen & Frydenberg, 1973;Monnahan et al., 2015), our test does not require Hardy-Weinberg equilibrium assumptions. And in contrast to homogeneity tests specifically designed for the analysis of a sequentially sampled cohort (Christiansen et al., 1978;Edwards & Heath, 1983), our quasi-exact test is not based on parametric asymptotic approximations, expected to break down for low genotypic counts and rare alleles (e.g., Cressie & Read, 1989).
The quasi-exact neutrality test becomes however extremely time-consuming for large sample sizes. Therefore, we also employed a fast approximation, by approaching the expected neutral distribution of Δ lt with a random subsample of the possible outcomes of the compound binomial draws. In particular, we used the following Monte Carlo simulation algorithm: 1. Randomly draw the number of individuals from each genotypic class in the simulated samples at time steps t (y sim lt ) and t − 1 (y sim lt−1 ) from a multinomial distribution with three classes with probabilities {p 11 l0 , p 12 l0 , p 22 l0 } and n lt and n lt-1 trials, respectively. 2. Compute the standardized allelic frequency increment in the simulated sample (Δ sim lt ) as in Equation 1, but using allelic frequencies calculated from the simulated genotypic counts y sim lt and y sim lt−1 . (1) and (2) 10,000 times and calculate the p-value as the proportion of simulated replicates where Δ sim lt ≥Δ obs lt .

| Bayesian inference of selection coefficients
Under selection, expected genotypic frequencies for t > 0 will diverge from p 0 . Denote the relative fitness of each genotype as w 11 lt = 1 + 2s lt , w 12 lt = 1 + 2h l s lt and w 22 lt = 1, where s lt is the selection coefficient for locus l and time step t, and h l is the heterozygous effect for locus l. The expected genotypic frequencies after selection for 0 < t ≤ T are then (Gillespie, 2004) where w lt = w 11 lt p 11 lt + w 12 lt p 12 lt + w 22 lt p 22 lt is the mean fitness of the cohort. Given the above assumptions and the unknown initial genotypic frequencies p l0 , selection coefficients s l , and dominance coefficient h l , the probability of observing the genotypic count y l0 at locus l in the initial sample is given by while the probability of observing genotypic count y lt for t > 0 can be written where s lt− is the vector of size t with the values of the selection coefficients for locus l at previous time steps (from 0 to t − 1). In computing Pr(y lt |p l0 ,s lt− ,h l ) for the i-th time step (i > 0), the vector of expected genotypic frequencies are calculated sequentially from t = 1 to i using Equation 3, conditional on initial genotypic frequencies p l0 , selection coefficient values at previous (t < i) time steps s lt− , and dominance coefficient h l . Note that multinomial Equation 5 assumes either that individual sampling is nondestructive (i.e., with replacement) for t > 0 or that it is destructive but from a cohort of large enough size (N i >> n i ), so that the multinomial approximates sufficiently well the actually hypergeometric sampling process.
The likelihood for locus l considering all time steps is then We employed uninformative prior distributions (f) for all parameters, namely a flat Dirichlet prior for the genotypic frequencies at each locus and time step, p lt ~ Dir(α = 1), a uniform prior on the interval (−0.5, 10 6 ) for the selection coefficient of each locus and time step (s lt ), and a uniform prior on the interval (0, 1) for the dominance coefficient of each locus (h l ).
Given the y l vector of genotypic counts for locus l, the joint posterior distribution over parameter set l = (p l0 ,s l ,h l ) is given by Bayes' rule: where the f functions on the right-hand of the equation are the prior distributions. For each locus, the joint posterior distribution of Equation 7 was estimated using the MCMC algorithm described in Supporting information Appendix S1.

| Simulation study of methods performance
Using the Monte Carlo simulations detailed in Supporting information Appendix S2, we calculated the power and Type I error rate of the quasi-exact neutrality test (Equation 2) and its Monte Carlo fast approximation. We also computed the expected bias, accuracy (root mean square error, RMSE), and credible interval noncoverage rate (NCR) of Bayesian estimates of selection coefficients (Equation 7).
We considered a cohort of N individuals, from which two temporal samples of size n were taken and genotyped at one locus with initial minor allele frequency MAF and selection coefficient s. We assumed codominance (h = 0.5) and investigated the effect on s estimates of variable levels of N (from 10 2 to 10 5 ), of n (ranging from 10 2 to 10 4 ), of s (0, −0.01, and −0.1), and of MAF (from 0.05 to 0.5). Exhaustive sampling scenarios (i.e., n = N) were included to assess the performance of the methods in an extreme reference setting without random sampling effects, even if the presented methods are actually inappropriate in that case, as the independent multinomial (or hypergeometric with n << N) sampling assumption is then violated, and more direct and powerful approaches then apply (e.g., Gompert et al., 2014). All simulation and inference algorithms were coded in C++ programs.
Executables of the program used for inference (sgcs) will be available from https://sites.google.com/site/jjrobledo2/software.

| Empirical study site, field sampling, and laboratory analysis
We studied a small native population of P. pinaster at Fuencaliente, ing slope (averaging 40%), high summer temperatures (with maxima occasionally exceeding 40°C), and low summer precipitation (averaging 59 mm) contribute to strong early mortality during the seedto-sapling transition in the area (Charco et al., 2017). Seed dispersal by wind typically spans from May to October in Iberian populations of P. pinaster, with seedlings emerging from October to November, enhanced with the first rains after the summer drought (Juez et al., 2014;personal observations). In a preceding study at Fuencaliente, Unger, Heuertz, Vendramin, and Robledo-Arnuncio (2016) tested the presence of exotic gene flow from conspecific plantations growing in the surroundings of the native stand, and its potential early fitness consequences. For this purpose, they took three random sequential seedling samples from a contemporary seedling cohort in the native stand, from the time of emergence in autumn until the summer of the following year. Using microsatellite markers, they found absence of seed and pollen immigration from the exotic plantations in the sampled seedling cohort. As detailed in Unger et al. (2016), seedlings were collected where found over the entire population, and the three temporal samples comprised n 0 = 101 seedlings collected shortly after emergence (November 2010), n 1 = 109 seedlings taken in late winter (March 2011), and n 2 = 45 seedlings taken in mid-summer (July 2011). The initial cohort size was presumably in the order of 10 5 or more, with a study in the same population reporting over 17,000 cones (typically bearing well over ten viable seeds each) produced during a single seed dispersal season (Charco et al., 2017).
Using candidate-gene SNP loci, we genotype here the same three temporal seedling samples to investigate whether selective mortality occurred during the first 8 months after seedling emergence at the P. pinaster population in Fuencaliente. Protocols for DNA extraction can be found in Unger et al. (2016). The 255 seedlings were genotyped with an oligo pool assay including a selection of 384 SNPs using the Illumina VeraCode platform (see table S2 in Budde et al., 2014; see also Jaramillo-Correa et al., 2015). The used SNPs are distributed in 221 candidate genes that include drought stress candidates, genes overexpressed under abiotic stress, and functional candidates for phenology, growth and wood properties, as described in more detail in previous studies using the same assay (Budde et al., 2014;Chancerel et al., 2011;Jaramillo-Correa et al., 2015). For each locus, we tested the null neutral hypothesis using the quasi-exact test in Equation 2. We corrected for multiple testing by setting the expected rate of false discoveries among rejected null hypothesis at FDR = 0.05 (Benjamini & Hochberg, 1995). We proceeded to estimate selection coefficients (using Equation 7) for loci with positive tests after FDR.

| Expected accuracy of Bayesian inference of selection coefficients
No convergence problems in MCMC runs used to obtain the posterior distribution of selection coefficients (s) were observed for any of the simulated data sets. We present first as reference the results for a simulated cohort of size N = 10 4 . Assuming that the locus under study is neutral (s = 0), the single most important factor determining the expected errors of the selection coefficient estimates was sample size (n), with both the bias and RMSE of ŝ rapidly increasing with decreasing n (Figure 1). Although biases were consistently lower when measured with respect to the median than with respect to the mean, the corresponding variances were similar and comparatively high, translating into very similar RMSE in general, except for the lowest sample size considered (n = 100), for which the smaller RMSE of the median was more evident (white versus gray bars in wide CIs reflecting the uncertainty in s estimation (Figure 4).
For a given sample size n, the RMSE and NCR of ŝ were weakly sensitive to the cohort size N, irrespective of the assumed values of s and MAF (Figures 1-3 Figures S2-S10). The exhaustive sampling scenarios (N = n) were exceptional, however, as the RMSE and (especially) the NCR of ŝ decreased relative to scenarios with the same n but with n < N (Figures 1-3 and Supporting information Figures S2-S10).

| Expected power and error rates of neutrality tests
The fast Monte Carlo approximation to the quasi-exact neutrality test (Equation 2) was very close in the simulated data sets, with strong p-value correlation (R 2 = 0.999; Supporting information Figure S1). The simulation results reported below were obtained with the quasi-exact test in cases with n ≤ 500, and via the fast F I G U R E 1 Effect of sample size (n) and minor allele frequency on selection coefficient estimates for a neutral locus (s = 0). RMSE is the root mean square error, and NCR the noncoverage rate of 95% credible intervals (the dotted line shows the nominal 5% value). Bias and RMSE were measured with respect to the mean (white bars) or with respect to the median (gray bars). Based on 1,000 Monte Carlo replicates per scenario, assuming a cohort of size N = 10,000 and a biallelic locus Minor allele frequency approximation for n > 500, as the quasi-exact test became prohibitively time-consuming in the latter cases.
The expected Type I error (false positive) rate of the neutrality test (assuming s = 0) was zero in the extreme exhaustive sampling scenario, and ranged between 3 and 6% for the different values of sample size n and minor allele frequency MAF considered ( Figure 5, top), that is, close to the nominal 5% value in all cases.
The expected false positive rate of the neutrality test was quite similar to the NCR of 95% CI of ŝ obtained with the Bayesian inference method when s = 0, although the NCR tended to be very slightly larger for the smaller sample sizes and lower minor allele frequencies in the simulations (Figure 1 and also plotted for reference in Figure 5, top).
The power of the neutrality test was moderate-to-low for a strongly selected locus (s = −0.1), increasing with raising n and MAF  Figures S11-S13).

| Genomic analysis of early selection in Pinus pinaster
The 384-plex SNP assay produced reliable seedling genotypes at 356 loci, of which 302 were polymorphic. The genomic analysis of selection was conducted on 237 (78%) polymorphic loci having MAF > 0.05, which exhibited a substantial range of temporal allele frequency change, somewhat wider during the first (winter) than during the second (spring-summer) target periods ( Figure 6). The allele frequency change was not larger than expected by random mortality and sampling for most loci, as indicated by the quasi-exact neutrality test (Figure 6). In particular, there was no significant evidence of genotypic-dependent mortality during the second study period for loci showed temporal changes significantly larger than expected from random mortality and sampling ( Figure 6 and Table 1 The tests corresponded to either the quasi-exact neutrality test in Equation 2 ("QE" white symbols) or to the proportion of times the 95% CI of Bayesian s estimates did not include zero ("By" black symbols). Assumed temporal sample sizes were n = 100 (circles), n = 500 (triangles), n = 1,000 (diamonds) and n = 10,000 (squares). Based on 1,000 Monte Carlo replicates per scenario, assuming a cohort of size N = 10,000 and a locus with two codominant alleles  (Table 1). The loci potentially under selection are found in genes related with water stress response (m650), abiotic stress responses (m1115), disease resistance (m1496), or more general functions such as cellular signaling (m102) and protein assembly (m698).

| Measuring selection from cohort mortality studies
Our model represents a first attempt to explicitly estimate contem- TA B L E 1 Identity, annotation, and Bayesian selection coefficient estimates (ŝ) for five SNPs exhibiting significant temporal allele frequency change in the Pinus pinaster prospective cohort mortality study The allele increasing in frequency appears in bold. b Minor allele frequency. c Observed allele frequency increment (p-value provided by quasi-exact neutrality test after FDR correction).
s estimates become very large relative to small s values even for large samples (Figure 3). This is because observed allele frequency changes in the cohort are largely driven by sampling noise under weak selection, an intrinsic problem that seems difficult to circumvent (as already noted by Christiansen et al., 1978) and is shared with methods based on multigenerational temporal genotypic samples Gompert, 2016).

| Model assumptions and perspectives
Unlike statistical methods to infer selection from multigeneration proportions. The method relies however on some potentially limiting assumptions that should be noted. First, samples taken from the target cohort are assumed to be free of immigrants and recruits newly established after the initial sampling step, either because they do not exist or because they can be discriminated during sampling (as they could represent a confounding source of temporal genotypic frequency change). This requirement will be easily met in experimental studies, but also in natural populations of many organisms, such as plant species for which both seed dispersal and germination occur in discrete synchronous periods, or animal species with synchronous reproduction and temporally coherent juvenile cohorts (e.g., anadromous fishes). Otherwise, special caution should be taken, for instance when studying seedling cohorts in plants with spread germination periods, since genetically determined emergence phenology (Donohue, 2005) might inflate temporal variation in genotypic frequencies. The model also assumes a cohort of same-age individuals, considering as reference population the initial zygotic pool that originated such cohort. If multiple age classes were present, then the total cohort itself (at the time of the initial sample) could be considered as reference population, and our methods should presumably be robust to this reference change as long as the cohort is large.
Another assumption of the model is that sampling is random, that is, that the probability of sampling a genotype does only depend on showed that exhaustively sampling a small cohort, which violates the independent sampling assumption, tends to hamper the statistical detection of selection (tests become nonsignificant), especially for weak selection, even if the RMSE of the (nonsignificantly different from zero) selection coefficient estimates actually decrease. This is not, however, a drawback of our random-sampling oriented methods, as they are not intended for studying a cohort small enough that every individual could be tracked, in which case more straightforward and powerful approaches should be employed (e.g., Anderson et al., 2014;Gompert et al., 2014).
In summary, as guidance for empirical studies, our approach should work best when large independent random genotypic samples are sequentially taken from a vast cohort of same-age individuals, in which immigrants or new recruits arriving after the time of the first temporal sample either do not exist or can be easily discriminated. Ideally, the cohort should consist of same-age individuals, but the presence of multiple age classes is unlikely to compromise inference if the cohort is large. As a rule of thumb, loci under strong selection should be detectable with acceptable power, and the corresponding selection coefficients estimated with acceptable accuracy, for sample sizes in the order of 10 3 genotypes randomly taken from a cohort of size in the order of 10 4 individuals or more. Our model neither makes assumptions nor provides information on whether correlated selection among loci occurs or not (e.g., due to genetic linkage or epistasis). Estimated selection coefficients represent differences in expected single-locus marginal fitness, that is, difference in relative fitness between the alternative homozygotes in a locus, averaged over genomic backgrounds (Ewens, 2004;Gompert et al., 2014). The ability of the model to detect selection will thus depend on the size of expected locus-specific effects, including

| Contemporary selection on maritime pine recruits
We investigated for the first time whether genotypically based selection could determine nonrandom mortality during a critical early developmental stage in a tree species. In line with the model assumptions, seedling emergence is expected to be largely synchronous in the species, and samples were obtained over the entire population area, proportionally to local abundance and nondestructively whenever possible (Unger et al., 2016). as it changes the genetic background available for selection at subsequent life stages, especially in novel environments or extreme climatic events, and when pleiotropy is present (Donohue, 2014).
We used candidate-gene marker loci, so positive tests must necessarily correspond to putatively functional SNPs. It is however noteworthy that three of the five identified loci potentially under selection during the first 4 months after seedling emergence have been related with functions that could influence survival precisely at early stages. In particular, the putative ADP-ribosylation factor (SNP m650; Table 1) has shown altered expression levels in P. pinaster seedlings subject to experimental water deficit (Dubos et al., 2003). The SNP m102 encodes proteins related to the protein-kinase family, associated with general cellular signaling functions increased during pine seedling development (Ávila, Pérez-Rodríguez, & Cánovas, 2006). And the putative NAC transcription factor encoded by m698 belongs to a family apparently involved in abiotic stress responses of P. pinaster seedlings (Pascual, Cánovas, & Ávila, 2015). Of the two other identified loci, the putative protein encoded by m1496 is related to the TIR/NBS/LRR disease resistance protein family, found in Arabidopsis and other plant genomes including pines (Meyers, Morgante, & Michelmore, 2002), and the 6-phosphogluconate dehydrogenase protein-coding gene (m1115) is associated with general abiotic stress responses in several model plant species (Hou, Huang, Yu, & Zhang, 2007).
The estimated average absolute strength of selection experienced by the five loci over the 4-month period was rather high (0.362), though of the same order than the one estimated for loci undergoing exceptional temporal allele frequency change in another short-term cohort mortality study, which focused on rapid adaptation of stick insects to a sudden host shift (Gompert et al., 2014).
These values should obviously not be regarded as unbiased estimates of the average strength of genome-wide selection, since they correspond to a small subset of sampled loci with the strongest evidence of selection, and estimates of locus-specific size effects based on crossing significance thresholds are expected to be upwardly biased (Göring, Terwilliger, & Blangero, 2001;Ioannidis, 2008;Williams & Haines, 2011). Moreover, the observed exceptional allele frequency changes were probably determined by both selective mortality and sampling noise, with their relative contributions being unknown (Gompert et al., 2014).
Apart from statistical and stochastic factors, two main biological reasons could help explain the significant signals of strong stage-specific selection observed in our field cohort study. First, drought stress is a major driver of pine early seedling mortality in the Iberian Peninsula (Castro, Zamora, Hódar, & Gómez, 2004;Vizcaíno-Palomar et al., 2014), and recent mortality could be genotype-dependent because the relict P. pinaster population might be currently undergoing rapid genetic adaptation to new drier conditions. Recent climate change is actually resulting in a higher warming rate across Spain than the global average, accompanied by reductions in rela- microenvironment. An alternative biological explanation consistent with our strong selection estimates is that viability selection (measured by survival) tends to be stronger when measured over shorter periods, which could result from selection over longer time intervals tempering bursts of strong selection, via reversal or stasis periods (Hoekstra et al., 2001).
Beyond increasing sample sizes for enhanced statistical power, further insights could be gained by replicating our study on different cohorts, newly established over subsequent seed dispersal seasons. It would then be possible to test whether the same set of loci tend to exhibit nonrandom temporal allele frequency changes for cohorts established under similar seasonal environmental conditions. This would help supporting the true association of detected loci with selective mortality under a particular environment (Chanock et al., 2007). If in addition reverse locusspecific allelic frequency changes were consistently detected in cohorts established under contrasting environmental conditions (e.g., dry versus wet), it would suggest that fluctuating selection is acting at early developmental stages, a process potentially contributing to the high within-population genetic variation of trees (Yeaman & Jarvis, 2006). It would be particularly interesting to

ACK N OWLED G EM ENTS
This work was supported by the Spanish Ministry of Economy, Industry and Competitiveness and the European Regional Development Fund (projects CGL2009-09428 and CGL2015-64164-R to JJRA and PhD grant BES-2010-031797 to GMU). We specially thank Santiago C.
González-Martínez for providing early access to SNP marker information and protocols, and for genotyping assistance. We also thank Myriam Heuertz and Fernando del Caño for laboratory and field assistance, and three anonymous reviewers for helpful comments.
Most computations in this study were conducted at CESGA high performance computing facilities, with funds from INIA's Information Technology Unit.

CO N FLI C T O F I NTE R E S T
None declared.

DATA A R C H I V I N G S TAT E M E N T
Data for this study are available at the Dryad Digital Repository: https://doi.org/10.5061/dryad.3p5n8bj.