Model-Based Estimates for Operant Selection

We present a new methodology to partition different sources of behavior change within a selectionist framework based on the Price equation – the Multilevel Model of Behavioral Selection (MLBS). The MLBS provides a theoretical background to describe behavior change in terms of operant selection. Operant selection is formally captured by the covariance based law of effect (CLOE) and accounts for all changes in individual behavior that involve a covariance between behavior and predictors of evolutionary fitness (e.g., food). In this article we show how the CLOE may be applied to different components of operant behavior (e.g., allocation, speed, and accuracy of responding), thereby providing quantitative estimates for various selection effects affecting behavior change using data from a published learning experiment in pigeons.


INTRODUCTION
Operant learning has repeatedly been characterized as a process that works analogously to natural selection (Broadbent, 1961; Campbell, 1956; Gilbert, 1970; Herrnstein, 1964; Palmer & Donahoe, 1992; Pringle, 1951; Skinner, 1966; Thorndike, 1900). Whereas in natural selection, species adapt to the environment as a result of the fitness consequences of inheritable traits, operant selection consists of individuals adapting to specific contexts as a result of the consequences of repeatable actions. Skinner (1981) proposed that both processes should be subsumed under the common explanatory mode of selection by consequences.
Although the conceptual framework of selection by consequences has a long tradition in behavior analysis and remains a popular narrative (e.g., Baum, 2023; Becker, 2019; Donahoe, 2011; Donahoe et al., 1993; Hull et al., 2001; McDowell, 2013, 2023; Simon, 2020; Simon & Hessen, 2019), it seems to have had few effects on the actual practices of many behavior analysts (but see McDowell, 2019, and Li et al., 2018). One possible reason for this gap between theory and practice may be that operant selection is sometimes used as a mere synonym for what is traditionally called "reinforcement." Of course, adopting the language and vocabulary of evolutionary biology alone does not add much to the theoretical foundations of the experimental analysis of behavior. To be useful for the development of substantive theory, the conceptual framework of operant selection needs to be scrutinized and formalized such that it becomes more than a loose analogy. Given such a formal account of operant selection, methodological implications may be derived that might eventually affect the way behavior analysts frame their experiments.
In this article, our goal is to provide a first building block for a methodology of behavior analysis that builds on the conceptual framework of operant selection. In particular, we seek to estimate the amount of selection in different components of operant behavior. To reach this objective, we provide a coherent theoretical background for operant behavior that builds on the concept of selection by consequences (Skinner, 1981). We start with an introduction to the selectionist account of operant behavior and its formalization within the multilevel model of behavioral selection (MLBS; Borgstede & Eggert, 2021). Second, we use the MLBS to develop a methodological approach that allows for the empirical estimation of operant selection by means of theory-based modeling. Third, we apply our methodology to derive the amount of operant selection on different components of behavior using training data from a published pigeon experiment. Finally, we discuss the implications of our approach with respect to behavioral selection theory and potential practical applications of the method.

BEHAVIORAL SELECTION THEORY
The principle of operant selection was initially formalized by Baum (2017) and further developed by Borgstede and Eggert (2021), who integrated individual-level behavioral selection with population-level natural selection within the MLBS. The MLBS builds on the abstract description of selection processes by means of the Price equation (Price, 1970, 1972). The Price equation describes selection as the result of the covariance between the individual values of a quantitative character (e.g., an inheritable trait, such as size) and individual evolutionary fitness (i.e., the contribution of an individual to the future population). A positive covariance is associated with a positive change in mean character value (i.e., selection results in a higher average character value), whereas a negative covariance indicates a negative change (i.e., selection results in a lower average character value). In other words, character values that are statistically related to evolutionary fitness are selected and thus alter the population average.
In the Price equation, all other sources of change (i.e., those that do not refer to selection of the characteristic of interest) are subsumed in a residual term. This term has different interpretations depending on the context. For example, when applied to the evolutionary change of gene frequencies, the residual term captures the effects of imperfect transmission (i.e., mutation and recombination). When applied to phenotypic change over generations (e.g., body weight or behavioral traits), the term may capture environmental factors influencing the phenotype (Luque & Baravalle, 2021). The Price equation provides a mathematical description of all selection processes, without relying on the specific mechanisms of variation, selection, and transmission (Luque, 2017). In fact, in its most general form, the partitioning of change into a selection and a nonselection term is merely a mathematical identity. Therefore, it is more of a formal definition of what is meant by selection than a statement about hypothetical mechanisms. The benefit of such a definition is that it provides a consistent conceptual background that can then be used to construct more specific models of evolutionary change.
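The identity character of this partitioning can be checked numerically. The following Python sketch uses made-up fitness values, character values, and transmission changes (all numbers and function names are illustrative assumptions, not taken from any of the cited studies) and confirms that, in the standard form of the Price equation, the selection term and the residual term add up to the fitness-weighted total change in the mean character value.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def price_partition(w, b, delta_b):
    """Partition the change in mean character value (standard Price equation).

    w: fitness of each individual (hypothetical values)
    b: character value of each individual
    delta_b: change in character value between parent and offspring
    Returns (selection term, residual term, mean fitness * total change).
    """
    w_bar, b_bar = mean(w), mean(b)
    # Selection term: population covariance between fitness and character value.
    selection = mean([wi * bi for wi, bi in zip(w, b)]) - w_bar * b_bar
    # Residual term: expected fitness-weighted transmission change.
    residual = mean([wi * di for wi, di in zip(w, delta_b)])
    # Mean character value of the next population, weighted by fitness.
    b_bar_next = sum(wi * (bi + di) for wi, bi, di in zip(w, b, delta_b)) / sum(w)
    total = w_bar * (b_bar_next - b_bar)
    return selection, residual, total


# Illustrative (made-up) population of four individuals.
fitness = [1.0, 2.0, 3.0, 2.0]
character = [10.0, 12.0, 15.0, 11.0]
transmission_change = [0.5, -0.2, 0.0, 0.3]

selection, residual, total = price_partition(fitness, character, transmission_change)
print(selection, residual, total)
```

Because the two terms always sum to the total, the sketch illustrates why the partitioning by itself carries no empirical content: it holds for any numbers that are entered.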
In the MLBS, the Price equation framework is applied to a population of individuals that vary in a certain quantitative behavior (such as average number of pecks emitted in the presence of certain environmental cues). At the population level, the covariance term refers to the effects of natural selection on average behavior tendencies of the population and the residual term captures the average change within individuals. The main contribution of the MLBS is that it formally links this latter within-individual change to the general framework of the Price equation. Following the rationale that individual changes in behavior can also be explained through selection by consequences, the MLBS extends the Price equation by applying the same covariance principle at the within-individual level. Here, the population average in behavior is not calculated over different individuals but over recurring instances of the same context (e.g., experimental trials). Behavior change, such as an increase or decrease in key-pecking rate, averaged over a longer period, is described according to the covariance partitioning from the Price equation. However, the criterion of selection is not individual fitness itself (in terms of a direct contribution to the future population) but statistical predictors of individual fitness. In other words, at the within-individual level, behavior is not selected by means of reproduction or survival (in fact, if the individual dies, all of its behavior immediately ceases) but by events that predict expected change in evolutionary fitness (Borgstede, 2020, 2024). For example, food is generally a positive predictor of evolutionary fitness because it raises the probability of survival and, thus, future reproduction. Conversely, physical threat is a negative predictor of evolutionary fitness because it lowers the chances of survival and future reproduction. The concept of a fitness predictor is largely equivalent to what Baum (2012) calls a "phylogenetically important event."
The conceptual framework of the MLBS allows us to describe behavioral selection at the individual level by means of a within-individual covariance between behavior in recurring contexts and its consequences in terms of statistical fitness predictors.¹ Similar to the process of natural selection, behaviors that covary with events that signal a change in expected evolutionary fitness are selected, which in turn changes the average behavior of the individual. Following the MLBS, the amount of behavior change due to selection corresponds to the covariance between behavior and fitness predictor, weighted by the slope of the fitness function of the fitness predictor. Formally, the amount of behavior change due to within-individual selection can be expressed by the following equation (Borgstede & Eggert, 2021):
$$\Delta_s \bar{b} = \beta_{wp}\,\mathrm{Cov}(b, p) \tag{1}$$
In Equation 1, $b$ designates a quantitative behavior (such as the amount of key pecking in a recurring context) and $\Delta_s \bar{b}$ corresponds to the within-individual change in average behavior due to selection. The value $p$ is a quantitative fitness predictor, and $\mathrm{Cov}(b, p)$ is the covariance between the behavior and the fitness predictor. Finally, $\beta_{wp}$ is the slope of the function relating the fitness predictor $p$ to the actual fitness $w$.
¹ Note that statistical fitness predictors are not limited to events that directly increase fitness (such as mating or feeding) but also include indirect predictors of fitness, such as information about consistent cues for the availability of food (Anselme, 2022; Fortes et al., 2016; McDevitt et al., 2016; Zentall, 2016). Parameters that control information seeking might also be relevant predictors of evolutionary fitness, even if their influence may be less direct (e.g., Inglis et al., 1997; also Borgstede, 2021).
Just like the original Price equation, the MLBS introduces a nonselection term that captures all other sources of behavior change. Designating the overall change in average behavior as $\Delta \bar{b}$, the within-individual behavior change can now be expressed as the sum of a selection term and a nonselection term $\delta$:
$$\Delta \bar{b} = \beta_{wp}\,\mathrm{Cov}(b, p) + \delta \tag{2}$$
This equation is called the covariance-based law of effect (CLOE).² The CLOE can be regarded as a fundamental principle of behavior in that it captures the essence of behavioral selection by partitioning the overall change in behavior into a selection component and a nonselection component (Borgstede & Luque, 2021). The selection component largely corresponds to what is traditionally called reinforcement (i.e., contingency-based effects of behavioral consequences), whereas the nonselection term subsumes all other sources of behavior change. Note that Equation 2 is valid for any combination of fitness predictors and behavioral measures. Therefore, instead of absolute counts of pecking and food delivery, the relative allocation of behavior over two options ("allocation") may be used. Likewise, additional behavioral parameters may equally be treated as potential targets of selection, such as peck frequency ("speed") or the average success rate of pecking ("accuracy"). In the following section, we use the MLBS as a starting point for a theory-based approach to quantifying selection as a source of within-individual behavior change.
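To make the partitioning in the CLOE concrete, the following Python sketch computes the selection term for a single individual from repeated instances of the same context and obtains the nonselection term as the remainder. The data, the value of the fitness slope, the observed change, and all function names are hypothetical and serve only to illustrate the arithmetic.

```python
def population_covariance(x, y):
    """Population covariance (divides by n) of two equally long lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / n


def cloe_partition(behavior, predictor, beta_wp, observed_change):
    """Split an observed change in average behavior into two components.

    selection    = beta_wp * Cov(behavior, predictor)
    nonselection = observed_change - selection   (the residual term delta)
    """
    selection = beta_wp * population_covariance(behavior, predictor)
    nonselection = observed_change - selection
    return selection, nonselection


# Hypothetical data: four recurring instances of the same context.
pecks = [4.0, 6.0, 5.0, 9.0]   # behavior b per instance
food = [1.0, 2.0, 1.0, 4.0]    # fitness predictor p per instance
beta_wp = 0.5                  # assumed slope relating p to fitness w
observed_change = 1.2          # assumed overall change in average behavior

selection, nonselection = cloe_partition(pecks, food, beta_wp, observed_change)
print(selection, nonselection)
```

In this sketch the residual is defined by subtraction, which mirrors the fact that the nonselection term collects whatever part of the change is not captured by the covariance term.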
However, it is less obvious how the MLBS might be applied to actual empirical data from behavioral experiments. The reason for this gap between theoretical explanation and empirical application lies in the nature of the Price equation. Because the Price equation is a mathematical identity, it makes no empirically testable predictions per se. Consequently, a fundamental theoretical principle such as the CLOE cannot itself be put to empirical test. The CLOE, like all fundamental theoretical principles, is best understood as a formalization of the conceptual framework used in the underlying theory. In other words, the CLOE tells us what exactly is meant by operant selection and, in doing so, provides the conceptual groundwork for more specific models that may then be applied to empirical data (Borgstede & Luque, 2021; Killeen, 2023).
Although the idea that a fundamental theoretical principle itself has no empirical content may seem at odds with the foundations of empirical science, it is in fact the rule rather than the exception. For example, some of the most fundamental "laws" of behavior are actually true by definition (Killeen, 1972). The same holds for fundamental principles in other sciences, such as physics. For example, Newton's second law of motion ($F = ma$) alone says nothing about any particular physical system. It is only through the construction of specific models by means of auxiliary assumptions and empirically derived regularities that concrete applications become possible (Borgstede & Eggert, 2023). In other words, fundamental laws provide the theoretical backbone of more specific models that may then be applied to actual empirical data. Some of these applications may serve as critical experiments in the evaluation of the theory as a whole. Others may exploit the theory by estimating hitherto unknown model parameters. Such latter applications often do not question the theory itself but use it to infer the specific values of one or more theoretical entities in a given context. A well-known example in behavior analysis consists of using the generalized matching law to estimate the bias and sensitivity parameters in a given context (Baum, 1974). Applications that seek to infer the values of theoretical entities may not only provide useful practical information but also form the foundation of theory-based measurement (Borgstede & Eggert, 2023). The purpose of the following section is to provide a corresponding methodology for the estimation of operant selection from behavioral experiments.

MODEL-BASED INFERENCE OF OPERANT SELECTION
Although the partitioning of change into selection and nonselection components by means of the Price equation is always possible at a theoretical level, empirical applications require specific models for the dynamics of change. Estimation of selection effects becomes possible by comparing observed data to the predictions from such models. The basic rationale behind this approach consists of constraining a specific model such that it predicts what would be expected in the absence of selection and contrasting this prediction with the actual empirical observations. Consequently, an empirical application of the MLBS requires a model of the specific conditions that generated the observed behavioral data. In particular, the model needs to account for the covariance between a behavior and a fitness predictor in a given context (e.g., a specific behavioral experiment).
² See the Appendix for a mathematical derivation of the CLOE from the elementary Price equation.
As outlined in Borgstede and Luque (2021), the theoretical covariance between an observed behavior and a quantitative fitness predictor can be obtained from the feedback function of a reinforcement schedule. Moreover, the quantitative behavior itself (and, consequently, behavior change) can be observed over several repeated experimental trials. Given the feedback function of the target behavior in a specified context, the CLOE may then be applied to empirical data. If the goal is to obtain quantitative values for selection, one can use the model to calculate the amount of operant selection as defined in the MLBS. Technically, selection estimates are calculated using a constrained model (or null model) that is identical to the model that describes the observed behavior, except for the part that is responsible for selection to occur (Okasha & Otsuka, 2020). In practice, this means applying a minimal change to the model parameters such that the covariance term becomes zero (because a zero covariance implies zero selection).
Mathematically, there are several ways to ensure that the covariance term in the CLOE is zero. However, the most plausible candidate for our null model is that the fitness predictor (e.g., the amount of food received per unit of time) is set equal across trials. For example, if we conduct an experiment with two consecutive trials, we may observe behavior that yields three food items per minute in Trial 1 and five food items per minute in Trial 2. The null model would use the feedback function from Trial 2 to calculate the behavior that would have resulted had the individual received the exact same number of food items per minute as in Trial 1 (i.e., only three food items per minute instead of five). Conditional on the MLBS, the difference between the actually observed behavior in Trial 2 and the behavior predicted from the null model corresponds to the amount of behavior change that can be ascribed to selection. The corresponding selection estimate may be calculated for any component of the observed behavior, such as relative time allocation, peck frequency, or average success rate of pecking, the only difference being that the feedback function used in the null model needs to be specified such that it captures the effects of the corresponding target behavior. The proposed method can thus be summarized by the following steps: first, describing the experimental scenario in terms of behavioral selection using a specific model that is consistent with the MLBS; second, constructing a null model to calculate the amount of behavior change that would have occurred in the absence of selection; and, third, subtracting the behavior change predicted from the null model from the actually observed behavior change.
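The two-trial example can be worked through numerically under two additional assumptions that are made here purely for illustration: the feedback function in Trial 2 is linear through the origin, and the observed behavior in Trial 2 takes the hypothetical value $b_2 = 0.8$. With the fitness predictor $p_1 = 3$ in Trial 1 and $p_2 = 5$ in Trial 2, the null-model prediction $b_2^{0}$ and the selection estimate follow as:

```latex
% Assumptions (illustrative only): linear feedback function through the origin;
% b_2 = 0.8 is a hypothetical observed behavior in Trial 2.
\begin{align*}
p_1 &= 3, \qquad p_2 = 5, \qquad b_2 = 0.8 \\
\beta_2 &= \frac{p_2}{b_2} = \frac{5}{0.8} = 6.25
  && \text{(slope of the Trial 2 feedback function)} \\
b_2^{0} &= \frac{p_1}{\beta_2} = \frac{3}{6.25} = 0.48
  && \text{(behavior predicted by the null model)} \\
b_2 - b_2^{0} &= 0.8 - 0.48 = 0.32
  && \text{(estimated selection component)}
\end{align*}
```

Under these assumptions, 0.32 units of the observed behavior in Trial 2 would be attributed to selection; a different feedback function would yield a different estimate.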
In the following section, we will demonstrate the method outlined above by applying it to an actual empirical data set. We will show how empirical estimates of selection for various components of operant behavior can be obtained and how these model-based estimates can be evaluated, compared, and tested for statistical significance using a permutation test framework.

APPLICATION: OPERANT SELECTION BETWEEN LEARNING TRIALS
We demonstrate how the general methodology described above may be implemented in an empirical study using a minimal example for illustrative purposes. We apply the method to the data from two training trials (first and last days) of a published behavioral experiment involving pigeons (Anselme et al., 2022). The focus of the main experiment was to investigate the effects of differential distribution of food items per patch in the holes of a board on foraging behavior. Here, we focus on the training trials administered prior to the main experiment, consisting of two conditions only ("no food" vs. "guaranteed food" at the beginning of a trial). The question we address is to what extent several components of the pigeons' behavior may have been the target of operant selection and whether the estimated selection effects are significantly distinct from zero.

Experimental apparatus and data acquisition
We used video data from a published study (Anselme et al., 2022) to obtain suitable data for the application of the method outlined above. Here, we only provide the methodological details that are relevant to understanding our analyses.
Nine naïve pigeons were maintained at 85%-90% of their free-feeding body weight to motivate them to eat in the task. The pigeons were tested in a rectangular wooden box with a floor that was a horizontally removable plate of wood (120 × 70 × 2 cm [L × W × H]), perforated with holes (1.5 cm in diameter and ± 1.5 cm in depth). The foraging board contained 60 holes organized as six rows of 10 regularly spaced holes. The board was covered with a black plastic tape with a crosscut above each hole to create an opening, which allowed the pigeons to access the food items while being unable to visually detect their presence from a distance (Figure 1). Specific stimuli (green and red; 21 × 14.5 cm [L × W]) were used to signal the consistent presence or absence of food per hole in one area. The two areas were separated by means of a colored strip glued on the plastic tape, dividing the board into two equal left and right areas of 30 holes each from the entrance compartment. Each session was recorded with an external camera that was placed above the apparatus.
In each of the 30 holes of one area (left or right, counterbalanced across trials within the same individuals), we positioned one food item (corn, green pea, yellow pea, or sunflower), and this area was associated with one discriminative stimulus (red or green, counterbalanced across individuals) placed on each wall (4 cm above the floor level) from the first to the last trial. For a given individual, a stimulus location was counterbalanced across trials. The 30 holes of the adjacent area remained empty and were associated with the other stimulus placed on each wall. Of note, the pigeons were initially trained for 3-4 days with uncovered holes, each containing one grain, such that they could see the food in the holes on inspection. No discriminative stimulus was used at this stage. After this initial training, the pigeons were trained as reported above for four consecutive days with covered holes.

Data extraction
Data were collected by manual counting (food items consumed and number of pecks per area for each trial). Determining whether a peck at a hole was successful (food item grasped) or not was mostly impossible from the videos, such that pecking is not synonymous with consumption. A peck simply means a vertical downshift of the pigeon's head above a hole. We considered a pigeon to be positioned in a given area if its head was in this area, because its body could be in one area while it pecked in the adjacent one. Sometimes, the pigeon missed a grain (picked it up and lost it) such that it rolled on the board. In the attempt to get it, the pigeon could cross the demarcation line between the two areas. A peck given outside of a hole, even to get a missed grain, was not counted.

Data analysis
We focused on the observed quantitative changes in individual foraging behavior between trials and their relation to individual capture rates. The primary measures used in the analysis were the time spent at the food and the no-food region ($T_+$ and $T_-$, respectively), the number of pecks emitted while staying at the food region and the no-food region ($B_+$ and $B_-$, respectively), and the number of food items retrieved during each trial ($R$). Because the most plausible predictor of evolutionary fitness in the current scenario is the retrieval of food per time (capture rate, $C$), we divided the number of retrieved food items during a trial by the total amount of time spent foraging (i.e., the total time the pigeon spent either in the food or the no-food region during each trial, $T = T_+ + T_-$) such that $C = R/T$. As possible targets of operant selection, we calculated three derived behavioral measures. First, relative time at the food region (time allocation, $A$) was calculated by dividing the time spent at the food region by the total foraging time, that is, $A = T_+/T$. Second, differential peck frequency (peck speed or velocity, $V$) was calculated by dividing the number of observed pecks at the food region by the time spent at the food region for each trial, that is, $V = B_+/T_+$. Third, the average success rate of a peck emitted at the food region (peck accuracy or skill, $S$) was calculated by dividing the number of retrieved food items by the number of pecks at the food region, that is, $S = R/B_+$.
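The derived measures can be computed directly from the four raw counts per trial. The following Python sketch shows one way to do so; the class and function names as well as the example numbers are hypothetical (they are not data from the experiment), and the time unit is left unspecified because it does not affect the definitions.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Raw measures for one pigeon in one trial (hypothetical container)."""
    time_food: float      # T+: time spent at the food region
    time_no_food: float   # T-: time spent at the no-food region
    pecks_food: int       # B+: pecks emitted at the food region
    retrieved: int        # R: number of food items retrieved


def derived_measures(trial):
    """Return capture rate C and the three candidate targets of selection."""
    total_time = trial.time_food + trial.time_no_food   # T = T+ + T-
    return {
        "C": trial.retrieved / total_time,           # capture rate, R / T
        "A": trial.time_food / total_time,           # time allocation, T+ / T
        "V": trial.pecks_food / trial.time_food,     # peck speed, B+ / T+
        "S": trial.retrieved / trial.pecks_food,     # peck accuracy, R / B+
    }


# Made-up example values, chosen only to illustrate the calculation.
example = Trial(time_food=240.0, time_no_food=60.0, pecks_food=150, retrieved=12)
measures = derived_measures(example)
print(measures)
```

Multiplying the three derived measures recovers the capture rate, because the intermediate quantities cancel: $(T_+/T)(B_+/T_+)(R/B_+) = R/T$.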
As pigeons are known to forage systematically, thereby avoiding sites that they have already exploited (Baum, 1987), the expected number of food items per peck only depends on the average pecking success, yielding a feedback function that is approximately linear until all grains are retrieved (for higher numbers of pecks, the slope of the feedback function is zero).³ The slopes of the feedback functions correspond to the average gain in capture rate per unit change in the corresponding behavior for each trial. The average gains follow directly from the definitions of the derived measures. The slopes of the corresponding feedback functions are a result of the equality $C = AVS$.⁴ For time allocation, rearrangement of the above identity yields a slope of $\beta_{CA} = VS$; for peck speed, the corresponding slope is $\beta_{CV} = AS$; and for peck accuracy, we obtain $\beta_{CS} = VA$. As an illustrative example, the linear feedback function for time allocation of one individual (P118) is depicted in Figure 2. The dashed black line depicts the feedback function for the first trial that was obtained from the observed time allocation and capture rate during Day 1.
³ The analysis would be equally possible for random foraging behavior. In this case, the slope of the feedback function would also depend on the number of food items that have already been retrieved, yielding a negatively accelerated change in expected feedback. Reanalysis of the data under the assumption of random foraging changed the quantitative estimates of the MLBS but did not change the overall qualitative patterns or the group-level effects.
The observed data from the first trial are indicated by the filled black circle that lies on the dashed black line. The feedback function and data for the last trial (Day 4) are indicated by the solid black line and another filled black circle, respectively. The null model for each behavioral measure was constructed by replacing the observed capture rate on Day 4 by the observed capture rate on Day 1, thereby constraining the change of the quantitative fitness predictor to zero. In Figure 2, this constraint is illustrated by the horizontal dashed red line that indicates the capture rate on Day 1. The point where the red dashed line intersects with the feedback function from Day 4 (solid black line) designates the data predicted from the null model for Day 4 (marked with a filled red circle). The corresponding value on the horizontal axis for predicted time allocation on Day 4 is then compared with the observed time allocation on Day 4. The difference between these two values quantifies the amount of behavior change that can be attributed to operant selection and is indicated by a horizontal arrow in Figure 2. The selection estimates for the other two behavioral measures were calculated analogously using the respective feedback functions for peck speed and peck accuracy.⁵
⁴ This identity can easily be verified because, by definition, capture rate may be decomposed such that $C = R/T = (T_+/T)(B_+/T_+)(R/B_+) = AVS$.
To test whether the theory-based selection estimates were significantly different from zero, we performed two-sided exact permutation tests (Edgington & Onghena, 2007). We also tested the absolute change values observed for each behavioral measure for significant deviations from zero using two-sided exact permutation tests to evaluate whether the model-based estimates provided any information beyond the raw data. All statistical analyses were conducted using the software R, version 4.2.2 (R Core Team, 2022).
TABLE 1 Comparison of first and last training trial (Day 1 and Day 4) with respect to time allocation (time at the food region divided by total foraging time), peck speed (number of pecks divided by time at the food region), peck accuracy (number of retrieved food items divided by number of pecks at the food region), and capture rate (number of retrieved food items divided by total foraging time).
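The null-model calculation described under Data analysis can be sketched in a few lines of Python. The sketch assumes linear feedback functions through the origin with slopes given by the product of the two nontarget measures on the last trial, and it defines the selection estimate as the observed value minus the null-model prediction. The function names and the Day 1 and Day 4 values are hypothetical and are not the data of any individual pigeon; the original analyses were run in R.

```python
TARGETS = ("A", "V", "S")  # time allocation, peck speed, peck accuracy


def capture_rate(measures):
    """Capture rate from the identity C = A * V * S."""
    return measures["A"] * measures["V"] * measures["S"]


def selection_estimates(first, last):
    """Estimate selection on each target behavior between two trials.

    For each target, the null model holds the capture rate at its value in
    the first trial and reads off the target value that the feedback function
    of the last trial would require to produce that capture rate.
    """
    c_first = capture_rate(first)
    estimates = {}
    for target in TARGETS:
        # Slope of the last-trial feedback function: product of the other two.
        slope = 1.0
        for other in TARGETS:
            if other != target:
                slope *= last[other]
        predicted = c_first / slope                    # null-model prediction
        estimates[target] = last[target] - predicted   # observed minus predicted
    return estimates


# Hypothetical measures for one individual on the first and last training day.
day1 = {"A": 0.80, "V": 0.50, "S": 0.075}
day4 = {"A": 0.75, "V": 0.80, "S": 0.10}

estimates = selection_estimates(day1, day4)
print(estimates)
```

In this made-up example, time allocation decreases from Day 1 to Day 4 in the raw data, yet the selection estimate for allocation is positive because the capture rate increased. Each estimate treats the other two measures as fixed at their last-trial values.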

RESULTS
Table 1 summarizes the pigeons' behavior during the first and the last trial of training, respectively. The average time allocation during the first trial was 0.8 (SD = 0.09), indicating that pigeons already spent most of their foraging time at the food region during Day 1. The average time allocation during the last trial was 0.76 (SD = 0.14). Thus, average time allocation decreased over training trials. The average peck speed at the food region was 0.53 (SD = 0.27) on Day 1 and was 0.71 (SD = 0.28) on Day 4, suggesting an increase in mean peck speed. Average peck accuracy also increased from 0.07 (SD = 0.05) on Day 1 to 0.13 (SD = 0.07) on Day 4 but was unexpectedly low even after 4 days of training. Visual inspection of the video material revealed that it often took the pigeons several attempts (sometimes up to 10 pecks or more) to retrieve a food item from a hole in the board.
The difficulty of the task thus appears to be related to motor skills rather than failure of food detection. Of the three behavioral measures, the exact paired-samples permutation test was only significant for the change in accuracy (p = .004).⁶ Average capture rate increased from 0.03 (SD = 0.01) on Day 1 to 0.06 (SD = 0.04) on Day 4. The corresponding paired-samples permutation test indicated that the observed increase in capture rate was significantly different from zero (p = .004).
Table 2 shows the amounts of selection on time allocation, peck speed, and peck accuracy that were estimated from the corresponding null models by calculating the difference between the predicted and the observed values during the last trial (see Data analysis and Appendix for details). All three behavioral measures yielded positive selection estimates for all nine pigeons. Average change was largest for time allocation (M = 0.38, SD = 0.25) and peck speed (M = 0.38, SD = 0.29) and less pronounced for peck accuracy (M = 0.06, SD = 0.04). However, given that accuracy was very low throughout all sessions, this difference appears to express the overall difficulty of food retrieval rather than a lower selection pressure. The exact two-sided permutation tests revealed that the selection estimates differed significantly from zero for all three behavioral measures (p = .004 for each test). Note, however, that the significance tests are not independent, as the three selection estimates are positively correlated to a considerable degree (correlation coefficients ranging between .77 and .9). Figure 3 presents a graphical comparison between the mean values and standard deviations for observed change and behavioral selection for time allocation, peck speed, and peck accuracy, which supports the conclusion that selection significantly differs from zero in all three behaviors.
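The reported p value of .004 follows from the structure of an exact test with nine individuals. The Python sketch below implements a two-sided exact sign-flip permutation test on paired differences, which is assumed here to correspond to the paired-samples procedure used in the analyses (the original computations were run in R; the function name and the nine example values are hypothetical). With nine positive differences, only the all-positive and the all-negative sign patterns are at least as extreme as the observed one, giving 2/512.

```python
from itertools import product


def exact_sign_flip_test(differences):
    """Two-sided exact permutation test of H0: no change.

    Enumerates all 2**n sign patterns of the paired differences and returns
    the proportion whose absolute sum is at least as large as the observed
    absolute sum.
    """
    observed = abs(sum(differences))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(differences)):
        statistic = abs(sum(s * d for s, d in zip(signs, differences)))
        # Small tolerance guards against floating-point rounding in ties.
        if statistic >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


# Hypothetical selection estimates for nine individuals, all positive.
example_estimates = [0.12, 0.45, 0.30, 0.08, 0.61, 0.25, 0.39, 0.52, 0.70]
p_value = exact_sign_flip_test(example_estimates)
print(p_value)
```

Because 2/512 is the smallest p value that a two-sided test of this kind can return for nine individuals, identical p values for several measures indicate only that all nine differences had the same sign in each case.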

CONCLUSION
In this article, we proposed a new method to quantify the amount of operant selection in behavioral experiments by means of model-based estimation. The method builds on a formal theory of operant selection, the MLBS, which provides an explicit definition of operant selection in terms of an extended Price equation. Applying the MLBS to empirical data, we showed how selection may be inferred for different behavioral measures such as time allocation, peck speed, and peck accuracy. The rationale was to use the MLBS to construct a null model that predicts the expected change in behavior in the absence of selection. The difference between the observed behavior change and the predicted behavior change yields an estimate of the selection component of operant behavior.
⁶ As there are nine individuals, there are $2^9 = 512$ possible permutations from which the test distribution is constructed. Consequently, a p value of .004 means that only two out of 512 permutations deviate at least as much from the null hypothesis ("no change") as the observed data. For a two-sided test, this means that the observed test statistic was the most extreme deviation in the observed direction out of all possible permutations (see Edgington & Onghena, 2007, for a detailed exposition).
The method allowed the estimation of different selection effects (i.e., selection on time allocation, peck speed, and peck accuracy) using data from a published foraging experiment. In contrast to the observed raw behavior changes, the selection estimates all significantly differed from zero, indicating that selection was effective even in cases where it was not obvious from the raw data alone. The data further revealed that the selection estimates of allocation, speed, and accuracy were not independent of one another. This latter result is hardly surprising, as the estimation procedure for any of the three behavioral components assumes that the other two behavioral components remain unchanged. However, actual changes in allocation, speed, and accuracy are likely to affect each other. For example, if a pigeon learns where food can be found, this might speed up pecking activity at the relevant area. Therefore, the selection estimates are not to be interpreted as independent additive effects. Instead, they tell us how much change in a certain behavior would be attributable to operant selection if selection were acting exclusively on this behavior.
We presented the first empirical application of the MLBS in the context of a behavioral experiment. The experiment was chosen to be as simple as possible so that it could serve as a minimal example for the method proposed in this article. Of course, there are various limitations with respect to the data because they were originally collected for a different purpose. For example, the foraging board was constructed in a way that made it impossible to unequivocally identify the retrieval of food in the videos. Making the floor below the cover transparent might have solved this problem (Baum, 1987). However, this would possibly have enabled the pigeons to see where the food was (because of the light emerging from the holes in the absence of food), a situation likely to affect their foraging behavior. Despite the limitations of the experimental application, our results show that the MLBS in combination with the model-based estimation approach provides a feasible theoretical foundation for the experimental analysis of behavior.
The general methodology of model-based selection analysis can be applied to many other experimental settings that involve behavior change over time. There are probably thousands of unused training data sets waiting to be analyzed. We hope that this article contributes to the foundations of behavioral selection as a general theory of behavior and encourages other researchers to put the behavioral selection perspective into practice.
In the Price equation,
$$\bar{w}\,\Delta\bar{b} = \mathrm{Cov}_i(w_i, b_i) + \mathrm{E}_i(w_i\,\Delta b_i),$$
$w_i$ designates an individual's evolutionary fitness, $b_i$ designates the value of an arbitrary evolving character, and $\bar{w}$ and $\bar{b}$ designate the population averages of $w_i$ and $b_i$, respectively. The terms $\mathrm{Cov}_i$ and $\mathrm{E}_i$ are the population covariance and the expected value of the population, $\Delta b_i$ is the individual-level change in character value (usually thought of as change between parent and offspring), and $\Delta\bar{b}$ is the population-level change in character value.
The term $\mathrm{Cov}_i(w_i, b_i)$ captures the effects of natural selection, whereas the term $\mathrm{E}_i(w_i\,\Delta b_i)$ refers to changes in the population average of the target characteristic $b$ that are not due to natural selection. If the time frame is chosen sufficiently small and individuals are treated as their own offspring, $\mathrm{E}_i(w_i\,\Delta b_i)$ captures changes that occur within individuals.
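The partition into a covariance term and an expectation term can be checked numerically. The following is a minimal sketch, assuming the standard Price conventions: the covariance is a population covariance (dividing by the number of individuals), and the next-step population average is the fitness-weighted average of $b_i + \Delta b_i$. The function name `price_partition` and all numbers are hypothetical illustrations, not data from the experiment.

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)


def price_partition(w, b, delta_b):
    """Return (covariance term, expectation term, w_bar * change in b_bar).

    Assumptions: population covariance (divide by n) and a fitness-weighted
    average of b_i + delta_b_i as the next-step population average.
    """
    w_bar, b_bar = mean(w), mean(b)
    # Cov_i(w_i, b_i) = E(w * b) - E(w) * E(b)
    selection = mean([wi * bi for wi, bi in zip(w, b)]) - w_bar * b_bar
    # E_i(w_i * delta_b_i)
    transmission = mean([wi * di for wi, di in zip(w, delta_b)])
    # Fitness-weighted average character value after the change.
    b_bar_next = sum(wi * (bi + di) for wi, bi, di in zip(w, b, delta_b)) / sum(w)
    total = w_bar * (b_bar_next - b_bar)
    return selection, transmission, total


# Hypothetical fitness values, character values, and individual-level changes.
w = [1.0, 2.0, 3.0]
b = [0.2, 0.5, 0.9]
delta_b = [0.1, -0.05, 0.02]
selection, transmission, total = price_partition(w, b, delta_b)
print(abs(total - (selection + transmission)) < 1e-12)
```

The check at the end only confirms that the two terms add up to the fitness-weighted total change for these made-up numbers; it says nothing about how fitness would be measured in practice.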
In the MLBS, the fitness-weighted within-individual change, $w_i\,\Delta b_i$, is itself partitioned into a covariance term and an expectation term:
$$w_i\,\Delta b_i = \mathrm{Cov}_j(w_{ij}, b_{ij}) + \mathrm{E}_j(w_{ij}\,\Delta b_{ij}).$$
Whereas in the original Price equation the covariance and expectation are taken over the individuals $i$ of a population, in the MLBS they are taken over a collection of recurring contexts $j$ (so-called behavioral episodes) that are themselves nested within individuals. Consequently, the covariance term $\mathrm{Cov}_j(w_{ij}, b_{ij})$ refers to the part of within-individual change that can be attributed to selection at the individual level (i.e., reinforcement), whereas the expectation term $\mathrm{E}_j(w_{ij}\,\Delta b_{ij})$ captures all sources of within-individual change that are not selection.
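Given an arbitrary fitness predictor $p$ (e.g., food), evolutionary fitness can be predicted by a linear regression of the form $w = \beta_{wp}\,p + \varepsilon$. Substituting this regression into the covariance term, we can now rearrange to obtain
$$w_i\,\Delta b_i = \beta_{wp}\,\mathrm{Cov}_j(p_{ij}, b_{ij}) + \mathrm{Cov}_j(\varepsilon_{ij}, b_{ij}) + \mathrm{E}_j(w_{ij}\,\Delta b_{ij}).$$
Collecting the terms that do not involve the fitness predictor into a single nonselection term, $\delta = \mathrm{Cov}_j(\varepsilon_{ij}, b_{ij}) + \mathrm{E}_j(w_{ij}\,\Delta b_{ij})$, we get the covariance-based law of effect:
$$w_i\,\Delta b_i = \beta_{wp}\,\mathrm{Cov}_j(p_{ij}, b_{ij}) + \delta.$$
Estimating operant selection from molar feedback functions
For each behavioral measure $b$ (which may be either time allocation, peck speed, or peck accuracy), the average change in capture rate $C$ per unit change in $b$ (holding everything else constant) can be expressed by a linear function of the form
$$C = \beta\,b,$$
where $\beta$ is the slope of the feedback function. For two different Trials 1 and 2, the observed change in behavior ($\Delta b$) is defined as
$$\Delta b = b_2 - b_1.$$
Given the slopes of the feedback functions for the two trials ($\beta_1$ and $\beta_2$, respectively), the observed change becomes
$$\Delta b = \frac{C_2}{\beta_2} - \frac{C_1}{\beta_1},$$
with $C_1$ and $C_2$ being the capture rates observed in Trials 1 and 2, respectively. In the null model, the slopes of the two feedback functions remain unchanged, whereas the capture rate is fixed to the one observed in the first trial. Consequently, the change predicted by the null model ($\delta$) is given by
$$\delta = \frac{C_1}{\beta_2} - \frac{C_1}{\beta_1}.$$
Because the predicted change from the null model corresponds to the amount of change that would be expected in the absence of selection (i.e., the nonselection term, $\delta$, in the CLOE), it follows that the change in behavior due to selection ($\Delta_s b$) can be calculated by taking the difference between the observed and the predicted changes:
$$\Delta_s b = \Delta b - \delta = \frac{C_2 - C_1}{\beta_2}.$$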
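The null-model logic described above can be sketched in a few lines. This is an illustration under a stated assumption, not the authors' implementation: the feedback function is taken to pass through the origin, $C = \beta\,b$, so that behavior can be recovered as $b = C/\beta$. The function name `selection_estimate` and the numeric inputs are hypothetical.

```python
def selection_estimate(c1, c2, beta1, beta2):
    """Estimate the selection component of a behavior change.

    Assumption: linear feedback function through the origin, C = beta * b,
    so that b = C / beta. c1, c2 are capture rates in Trials 1 and 2;
    beta1, beta2 are the corresponding feedback-function slopes.
    """
    b1_observed = c1 / beta1
    b2_observed = c2 / beta2
    # Null model: slopes as observed, capture rate fixed at its Trial 1 value.
    b2_predicted = c1 / beta2
    observed_change = b2_observed - b1_observed
    predicted_change = b2_predicted - b1_observed
    # Selection estimate: observed change minus change expected without selection.
    return observed_change, predicted_change, observed_change - predicted_change


# Hypothetical capture rates and slopes (not data from the experiment).
observed, predicted, selection = selection_estimate(c1=2.0, c2=3.0, beta1=4.0, beta2=5.0)
print(observed, predicted, selection)
```

Under this assumption the selection estimate reduces to the difference between the observed and the predicted behavior in the second trial, which is the comparison between the two filled circles described in the caption of Figure 2.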

FIGURE 1 Experimental setting. Pigeons were put in a 120 × 70 cm experimental chamber with 60 covered holes. In the food region, each hole contained a grain. In the no-food region, the holes were empty.
FIGURE 2 Calculation of selection estimate for time allocation from the feedback functions of Days 1 and 4 (data obtained from Pigeon 118). The difference between the predicted time allocation (filled red circle) and the observed time allocation on Day 4 (filled black circle) equals the estimated amount of operant selection acting on time allocation (see text for details).

FIGURE 3 Group means (filled circles) and standard deviations (error bars) for observed change and behavioral selection. The vertical axis depicts the change from Day 1 to Day 4 (delta) for the raw data (Panel a) and the model-based selection estimates (Panel b).
Selection estimates for time allocation (time at the food region divided by total foraging time), peck speed (number of pecks divided by time at the food region), and peck accuracy (number of retrieved food items divided by number of pecks at the food region). The corresponding group-level p values were obtained from two-sided exact permutation tests.⁴
Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/jeab.924, Wiley Online Library on [03/06/2024].
[...] derivation of the calculations is provided in the Appendix.