Determining the form of relationships between traits and fitness is central to many evolutionary studies. Under the assumption that a trait has a causative effect on fitness, the form of the relationship between a trait and fitness gives insight into the form of natural selection. Given this central importance in evolutionary biology, it is not surprising that a great many techniques for quantifying and characterizing the form of natural selection have been developed (Robertson 1966; Lande and Arnold 1983; Arnold and Wade 1984; Schluter 1988; Janzen and Stern 1998; Shaw et al. 2008), and also that long-running discussions exist regarding the merits and interpretation of different approaches (see, e.g., Endler 1986; Schluter 1988; Brodie et al. 1995).

Regression-based approaches have been central to the analysis of natural selection, and two regression-based techniques are particularly important. Following Pearson (1903), Lande and Arnold (1983) presented a means by which selection coefficients (gradients and differentials) can be obtained using parametric (multiple) regression analysis. In particular, their results lead to the simple regression equation

$$
w = a + \sum_{j} \beta_j z_j + \frac{1}{2}\sum_{j} \gamma_{jj} z_j^2 + \sum_{j}\sum_{k>j} \gamma_{jk} z_j z_k + e, \tag{1}
$$

where *w* is relative fitness, that is, absolute fitness divided by mean absolute fitness ($w = W/\bar{W}$), *z* are phenotypic trait values, *a* is an intercept, *j* and *k* index traits, β are directional selection gradients, γ are quadratic selection gradients, and *e* are residual errors (Walsh and Lynch 2012a). Assuming multivariate normality of *z* before selection, the estimates of β and γ that can be obtained via equation (1) are directly applicable in a quantitative genetic framework for the prediction of evolution, and in standardized forms are easily comparable across studies. However, these estimates have limited interpretation for assessing the form of selection (e.g., whether selection is predominantly directional, stabilizing, or disruptive). Essentially, although this approach yields very useful estimates of selection coefficients based on approximation of the (relative) fitness function, the shape approximated by the quadratic function itself should not be used to make quantitative inference of the form of selection. For example, positive or negative values of γ should not be interpreted as evidence for disruptive or stabilizing selection, respectively, even if they represent minima or maxima of the quadratic approximation within the range of observed phenotypes in a population (Schluter 1988).

The main alternative to parametric regression has been the application of spline-based semiparametric regression analyses, as first advocated by Schluter (1988). These analyses are desirable because they lead to inferences of the form of selection that make few a priori assumptions. In addition, these models can be fitted with much more sensitivity to nonnormal distributions of fitness residuals than the linear models used in the Lande–Arnold approach. Thus far, these spline-based techniques have not replaced the parametric analysis because biologists cannot relate them to evolutionary parameters such as selection gradients, or compare them among studies in a standardized form.
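The parametric regression of equation (1) can be sketched for the single-trait case as follows. This is a minimal illustration rather than the procedure used in any of the cited studies: the function names `solve_linear_system` and `quadratic_selection_fit` are hypothetical, the trait is mean-centered before fitting, the quadratic term is assumed to carry the conventional factor of one-half, and the example data are invented and noise-free.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def quadratic_selection_fit(z, W):
    """Least squares fit of w = a + beta*z + 0.5*gamma*z^2 + e for one trait.

    z: phenotypic values; W: absolute fitness (hypothetical inputs).
    Returns (a, beta, gamma) on the relative-fitness scale.
    """
    n = len(z)
    mean_W = sum(W) / n
    w = [Wi / mean_W for Wi in W]        # relative fitness
    mean_z = sum(z) / n
    zc = [zi - mean_z for zi in z]       # mean-centered trait (assumption)
    # Design matrix columns: intercept, z, z^2 / 2
    X = [[1.0, zi, 0.5 * zi * zi] for zi in zc]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)]
           for i in range(3)]
    Xtw = [sum(row[i] * wi for row, wi in zip(X, w)) for i in range(3)]
    a, beta, gamma = solve_linear_system(XtX, Xtw)  # normal equations
    return a, beta, gamma


# Invented, noise-free example: W = 2 + 0.4 z - 0.2 z^2, mean fitness 1.6
z_obs = [-2.0, -1.0, 0.0, 1.0, 2.0]
W_obs = [0.4, 1.4, 2.0, 2.2, 2.0]
a_hat, beta_hat, gamma_hat = quadratic_selection_fit(z_obs, W_obs)
print(a_hat, beta_hat, gamma_hat)
```

In this toy example the fitted γ is negative, but, as noted above, the sign of γ alone should not be read as evidence of stabilizing selection.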

As statistical models of relationships between phenotype and fitness, the least squares regression analyses used to generate selection gradient estimates via equation (1) are undesirable for statistical hypothesis testing. Residuals of fitness, or of fitness components, are often highly nonnormal. For example, viability over particular episodes of selection is typically a binary trait, and measures of reproductive success are generally better modeled as Poisson variables. Consequently, it has become common practice to use generalized linear models for statistically testing (null) hypotheses about the relationship between fitness and phenotype. However, because generalized linear models estimate regression coefficients for linear relationships on imaginary but useful latent scales, those coefficients cannot be directly used as quantitative estimates of selection gradients. It is also very difficult to test hypotheses about the shape of phenotype-fitness functions with such models, because link functions (typically logit for binary data and log for Poisson data) impose curvature on the data scale even when only linear terms are modeled on the latent scale.
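The curvature imposed by link functions can be illustrated numerically. The sketch below assumes a single trait and a purely linear predictor on the latent scale; the function names and the intercept and slope values are hypothetical choices made for illustration only. It evaluates the second derivative of expected fitness on the data scale by central finite differences.

```python
import math


def expected_fitness_logit(z, intercept, slope):
    """Expected binary fitness (e.g., survival) under a logit link."""
    latent = intercept + slope * z       # linear on the latent scale
    return 1.0 / (1.0 + math.exp(-latent))


def expected_fitness_log(z, intercept, slope):
    """Expected count fitness (e.g., offspring number) under a log link."""
    return math.exp(intercept + slope * z)


def second_derivative(f, z, h=1e-4):
    """Central finite-difference approximation of f''(z)."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)


# Hypothetical coefficients: no quadratic term on the latent scale.
def survival(z):
    return expected_fitness_logit(z, 0.0, 1.0)


def fecundity(z):
    return expected_fitness_log(z, 0.0, 0.5)


for z in (-1.0, 0.0, 1.0):
    print(z, second_derivative(survival, z), second_derivative(fecundity, z))
```

Under the logit link, the data-scale curvature is positive below the inflection point of the logistic curve and negative above it. Under the log link, it is positive everywhere. In both cases the latent-scale model contains no quadratic term, so curvature on the data scale cannot by itself be taken as evidence of nonlinear selection.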

Here we develop and illustrate an approach for obtaining quantitatively informative and interpretable selection gradients from any statistical characterization of a fitness function, including spline-based selection analyses. The approach uses generalized models, allowing nonnormal distributions of fitness (or of fitness components) to be explicitly modeled, while still yielding the quantitatively interpretable selection gradients that are typically obtained via least squares–based regression of relative fitness on phenotype. Given sufficient data, the approach can be applied with generalized additive models, that is, models with smooth regression terms and generalized response distributions, so that flexible models of phenotype-fitness functions can be characterized in terms of explicit quantitative genetic metrics of selection.