gauseR: Simple methods for fitting Lotka‐Volterra models describing Gause’s “Struggle for Existence”

Abstract Point 1: The ecological models of Alfred J. Lotka and Vito Volterra have had an enormous impact on ecology over the past century. Some of the earliest—and clearest—experimental tests of these models were famously conducted by Georgy Gause in the 1930s. Although well known, the data from these experiments are not widely available and are often difficult to analyze using standard statistical and computational tools. Point 2: Here, we introduce the gauseR package, a collection of tools for fitting Lotka‐Volterra models to time series data of one or more species. The package includes several methods for parameter estimation and optimization, and includes 42 datasets from Gause's species interaction experiments and related work. Additionally, we include with this paper a short blog post discussing the historical importance of these data and models, and an R vignette with a walk‐through introducing the package methods. The package is available for download at github.com/adamtclark/gauseR. Point 3: To demonstrate the package, we apply it to several classic experimental studies from Gause, as well as two other well‐known datasets on multi‐trophic dynamics on Isle Royale, and in spatially structured mite populations. In almost all cases, models fit observations closely and fitted parameter values make ecological sense. Point 4: Taken together, we hope that the methods, data, and analyses that we present here provide a simple and user‐friendly way to interact with complex ecological data. We are optimistic that these methods will be especially useful to students and educators who are studying ecological dynamics, as well as researchers who would like a fast tool for basic analyses.


| INTRODUC TI ON
A century ago, Alfred J. Lotka (1920Lotka ( , 1925 and Vito Volterra (1926) published their canonical works concerning the mathematics of species interactions. These theories were among the first to suggest that the dynamics of complex ecological systems could be described through simple underlying equations (Israel, 1988;Scudo & Ziegler, 1978). Inspired by the possibility of understanding natural systems through mathematics, Georgy Gause stated in his "Struggle for Existence" (Gause, 1934b): […] in order to penetrate deeper into the nature of these phenomena we must combine the experimental method with the mathematical theory, a possibility which has been created by the brilliant researches of Lotka and Volterra. To match the strong assumptions and narrow scope of Lotka and Volterra's basic equations, Gause constructed simple experimental microcosms, in which he was able to show that interactions between pairs of species influenced their abundances and dynamics in ways that accorded with theoretical predictions (Gause, 1934a, b). These experiments laid the groundwork for a century of experiments based on mathematics as a conceptual framework. It has since become a primary goal of ecology to explain empirical observations with mathematical theory.
Despite their simplicity, the Lotka-Volterra equations are still the primary models used to describe species interactions. At their most basic, these models describe the changes per unit time in the abundance of a species, N i , as.
where r i is the intrinsic growth rate, which describes the growth of a species at low density and in the absence of other species, a ii describes the effect of species i on its own growth, and a ij describes the effect of species j on the growth of species i. The form that we use here differs somewhat from that often portrayed in some textbooks, as we leave r i inside the parenthesis and therefore do not standardize interaction coefficients by the growth rate (thus, we signify these as a rather than α).
If we divide both sides of Equation (1a) by N i to calculate the "per-capita" growth rate of species i, the model can be written as a simple linear equation.
This form is particularly useful for analyzing empirical data, as the parameters can be estimated using ordinary least squares regression of per-capita growth rates against species abundances, where r i is the y-intercept, a ii is the slope with respect to species i's own abundance, and a ij is the slope with respect to other species (Lehman et al., 2020).
Note that in a single species, this relationship can also be expressed in terms of the carrying capacity, K i , which describes the equilibrium abundance reached by a species growing in the absence of other species and falls on the x-intercept ( Figure 1). For those who are less familiar with regression or dynamical systems modeling, we have included a more in-depth discussion of this approach in the vignette accompanying our R package. We can solve for K i in terms of a ii and r i by setting Equation (1b) equal to zero, and solving for N i , which yields.
Because the Lotka-Volterra models make no specific assumptions about the mechanisms underlying species interactions, they can be parameterized in ways that approximate any combination of underlying mechanisms, at least locally around equilibrium (Chesson, 2000;Letten & Stouffer, 2019;MacArthur, 1970). For example, in a system with two species, if both a ij and a ji are less than zero, species suppress each other's growth, indicating competition.
If both parameters are positive, then both species benefit from the other's presence, indicating mutualism. Alternatively, if a ij < 0 and a ji > 0, then species i is suppressed by species j whereas species j benefits from the presence of species i, indicating a predator-prey relationship. Note, however, that in systems with more than two species, interpreting the net effects of species interactions is somewhat less straightforward-for example, if two species jointly respond to a shared predator, they may have a net negative effect on one another's abundances even if they do not interact directly (i.e., "apparent competition", Holt (1977)).
Gause's original experimental work still provides some of the best examples of these different kinds of interspecific interactions. Figure comparing per-capita growth rate (following equation 1b) versus abundance for Paramecium aurelia grown in monoculture. Note that when plotted in this manner, the parameter values for the Lotka-Volterra equations can be easily "read" off the graph, which r as the y-intercept, K as the x-intercept, and a ii as the slope. In multi-species systems, interaction coefficients a ij (i.e., effect of species j on species i) can be similarly computed based on the slope of dN i /N i dt versus N j However, because of the limited analytical tools available to him in the early twentieth century, Gause was only able to fit models to a small subset of his experimental data. Moreover, because of the journal formatting conventions of his time, much of Gause's data was never officially published. In the years since Gause's death, his data and results have been widely used as teaching tools and are frequently cited as canonical examples of competitive and predator-prey interactions, and have inspired theoretical work in many other fields (Takeuchi, 1996). For more details on this body of work, please see the blog post accompanying this article. Many groundbreaking studies have also gone on to reanalyze parts of his experimental data and extend some components of his theoretical models (e.g., Vandermeer, 1969). Nevertheless, to this day, only a small portion of Gause's experimental data has ever been quantitatively analyzed using the Lotka-Volterra equations.
Here, we introduce the gauseR package, which includes a collection of functions for confronting empirical observations with the classical Lotka-Volterra interaction models. The package combines methods for data formatting, model simulation, and parameter estimation into a simple-to-use, automated wrapper function. The package also acts as a digital repository for Gause's experiments on the "Struggle for Existence" and includes all available data from his 1934 book and associated publications. Below, we introduce the functions and basic workflow for the package, and explain the underlying theory. To demonstrate the package's functionality, we analyze two well-known datasets from Gause (1934a), as well as classic examples of multi-species interactions from McLaren and Peterson (1994) and Huffaker et al. (1963). Additionally, we discuss some limitations and possible extensions of the optimization methods used in the package. We foresee that gauseR will be especially helpful for researchers who seek simple tools for analyzing their data, as well as a teaching aid for introducing students to the Lotka-Volterra equations and to Gause's experiments.

| Functions and workflow
The general purpose of the functions in the gauseR package is to fit the general Lotka-Volterra model in Equation (1a-b) to observational time series data of one or more interacting species. The package, along with a walk-through vignette, is available at github.com/ adamtclark/gauseR. Functions are described in detail in Table 1. In general, this procedure will require estimating values for three types of parameters: starting abundances (N 0i ), intrinsic growth rates (r i ), and interaction coefficients (a ij ).
For a single species system, dynamics can be directly fit to the Logistic growth model using the get _ logistic function. For systems with two or more species, fitting proceeds in four steps. First, time-lagged abundances for each species are calculated using the get _ lag function, yielding pairs of abundance values separated by a fixed time interval for each species. Second, these time-lagged abundances are used to estimate per-capita growth rates at different time points, using the percap _ growth function. This function approximates dynamics following the linearized equation.
Third, linear regressions are used to test how per-capita growth rates vary as a function of conspecific and heterospecific species abundances, from which the parameters in Equation (1b) can be estimated. Finally, because parameter estimates from linearized growth rates often do not match observations for long or complex time series, the lv _ interaction and lv _ optim function are used to directly simulate dynamics and tune parameters to match model outputs to observations, which can greatly improve model performance (Carrara et al., 2015;Maynard et al., 2019;Rosenbaum & Rall, 2018).
All four of these steps are automated in the gause _ wrapper function, which takes observed species abundances as an input, and returns fitted parameter estimates. We suggest this option for most users.
To facilitate optimization, we include two additional functions.
First, the lv _ interaction _ log function simulates dynamics for log-transformed species abundances. This approach is helpful in systems where abundances approach zero, which can cause integration issues (e.g., predictions of negative abundances). This function is the default for gause _ wrapper. Lastly, the ode _ prediction function simulates dynamics for n species given a set of input parameter values and returns estimates of species abundances as a single column vector. This function is potentially useful for users who wish to apply other, more sophisticated optimization routines.

| Data and example analyses
The gauseR package includes a wide range of digitized data from Gauses' experimental work, as well as some other well-known multi-  (Rohatgi, 2015).
To demonstrate how to apply our functions, we also conduct example analyses using four of these datasets. First, gause _ 1934 _ science _ f02 _ 03 shows Gause's classic competition experiment between Paramecium aurelia and P. caudatum, including both species grown in monoculture and in mixture (Gause, 1934a, b (Huffaker et al., 1963). Data are available in huffaker _ 1963. Source code for the analyses of the McLaren and Huffaker datasets is available in the Appendix S1.

| RE SULTS
In most cases, the simple Lotka-Volterra model in Equation ( For predator-prey interactions between D. nasutum and P. caudatum, we found that parameter estimates from regressions fitted to the linearized growth rates failed to match observed dynamics ( Figure 4; after using the optimizer to tune parameters. For the wolf-moose-fir system of McLaren and Peterson (1994), results from the optimizer also matched observed dynamics ( Figure 5). Because we expected no direct interactions between wolves and fir trees, we set these interactions to zero prior to analyses. Additionally, because McLaren et al. hypothesized that moose populations are subject to top-down control in this system, we set self-limitation for moose to zero (Table 2d).
Lastly, for the Huffaker mite data, we were not able to fit the general Lotka-Volterra model to match observed dynamics ( Figure 6; Table 2e). In this system, dynamics follow sustained oscillations, but with periods and amplitudes that vary over time. However, our fitted models could only achieve either sustained oscillations of a fixed period and amplitude (top panel), or damped oscillations where the amplitude declined over time (bottom panel), but not both.

| Interpreting the models
In the monoculture experiments, our results demonstrate that P. caudatum followed logistic growth. As expected by theory, the species was able to increase from low abundance (i.e., r i > 0), limited its own growth as its abundance increased (i.e., a ii < 0), and approached a fixed carrying capacity (K) over time (Gause, 1934a, b;Lotka, 1925;Volterra, 1926). For competition between P. caudatum and P. aurelia, fitted parameters indicated self-limitation by both species, and inhibition of each species by the other (i.e., a ii , a jj , a ij and a ji < 0).

Again, these values match theoretical expectations from Lotka and
Volterra, and accord with Gause's original analyses of the data.
For the system with D. nasutum and P. caudatum, fitted models predicted that prey abundance increased in the absence of predators (i.e., r i > 0), and that the predators declined exponentially in the absence of prey (i.e., r i < 0,). Similarly, parameter estimates indicated that predators had a negative effect on prey, and prey had a positive effect on predators (i.e., a ij < 0 and a ji > 0, respectively). Jointly, these parameter values match biological expectations-that is, that predators consume prey, and starve when prey are absent. However, in contrast to the classic formulation of the Lotka-Volterra predator-prey model, we also identified self-limitation by both species (i.e., a ii < 0 and a jj < 0). These parameter values resulted in damped oscillations in species abundances, as predicted by theory (Lehman et al., 2020). Results were similar for the wolf-moose-fir tree example, for which fitted parameters

F I G U R E 4 Predator-prey interactions between
Didinium nasutum and Paramecium caudatum grown in mixture over 17 days, from Gause (1934a). Additional individuals of both species were added to the mixture periodically to prevent local extinction. Points show observations, and lines show fitted growth curves. Top panel shows results for model fitted using linear regressions of species abundances versus per-capita growth rate, whereas bottom panel shows results for parameters fitted using the simulated differential equations using the lv _ optim() function. Note that goodness of fit is varies substantially between the two methods. See Table 2c for parameter values  In contrast, none of the models that we fitted were able to match observations from the Huffaker et al. mite experiments.
Importantly, this limitation is a general characteristic of the Lotka-Volterra models and not a peculiarity of our fitting methods (Hatton et al., 2015;Lehman et al., 2020;Lotka, 1925). When self-limitation is absent for both species, Lotka-Volterra predator-prey models display "neutrally stable" oscillations, meaning that oscillations maintain a fixed amplitude and period. When self-limitation exists for either the prey species or the predator, or both, then oscillations become damped, meaning that over time they decrease in amplitude and period until they reach stable equilibrium values. Because the Huffaker data include persistent oscillations for which the amplitude and period vary over time, they cannot be matched by any parameterization of the Lotka-Volterra equations. Potentially, these complex dynamics are a result of the spatial structure in the Huffaker experiment, which often cannot be described with simple equations (Leibold & Chase, 2018).
Alternatively, they might be indicative of nonlinear or time-varying interactions among species-for example, that result from density-dependent interaction parameters (Murdoch, 1994).
Importantly, many modern modeling tools exist that are potentially more suitable for fitting these data than are the simple Lotka-Volterra methods that we present here (e.g., Karakoç et al., 2020).

| Notes on optimization
For the predator-prey examples that we analyze, our findings show that parameter estimates generated from the linearized growth rates in Equation (1b) failed to capture observed dynamics. This result accords with many other studies, which show that model performance is typically much better when parameters are tuned to match fully simulated dynamics, that is, as is accomplished with the lv _ optim function (Carrara et al., 2015;Maynard et al., 2019;Rosenbaum & Rall, 2018). In general, the difference in fit between these two approaches is driven by the sensitivity of the Lotka-Volterra equations to noisy parameter estimates-even very small changes in parameter values can lead to large changes in model predictions (Clark & Neuhauser, 2018;Maynard et al., 2019). We therefore suggest that in most cases, the gause _ wrapper function, which automatically applies the optimization routine to simulated dynamics, will provide the best fits to empirical data.
Importantly, the gause _ wrapper function applies several optimization "tricks" that help improve model performance.  from 1960to 1994, from McLaren and Peterson (1994. Points show observations, and lines show growth curves fitted using the lv _ optim() function. The left axis shows wolf and moose abundances, separated by a ten-fold scaling difference for easier visualization, whereas the right axis shows tree growth increments. Two methods for testing goodness of fit are shown. In the legend, values show univariate tests for each species, whereas "Total GOF" shows fit when considered across all observations simultaneously. Note that these two methods of comparison can lead to very different conclusions. See Table 2d for parameter values Year 1960 1965 1970 1975 1980 1985 1990 1995

F I G U R E 6 Predator-prey interactions between
Eotetranychus sexmaculatus and Typhlodromus occidentalis in a spatially structured experiment carried out on a grid of oranges over 60 weeks, from Huffaker et al. (1963). Points show observations, and lines show growth curves fitted using the lv _ optim() function. Top panel shows dynamics for a system where neither species directly inhibits its own growth (i.e., a ii = a jj =0), whereas bottom panel is for a system where only the predator directly inhibits its own growth. Note that neither option is able to fully capture the realized dynamics, and that goodness of fit is relatively low in both panels. See Table 2e for parameter values comparisons to observations. This approach helps prevent computational errors in the integration from erroneously driving abundances negative. Third, we also log-transform parameter values during the optimization process, after determining their sign a priori from the linearized functions. This makes it possible to limit the number of positive interaction coefficients, which otherwise can cause the system to generate unrealistically high abundance values (e.g., via run-away mutualism) and crash the optimizer (Lehman et al., 2020).
A trade-off that comes with this added stability is that this method that vary interaction strengths as a function of community size or composition (Case & Bender, 1981;Mayfield & Stouffer, 2017;Tuck et al., 2018). Users who wish to apply more sophisticated optimization and modeling methods should see the documentation and examples for the ode _ prediction function.

| Future directions
Given the strong correspondence between the models of Lotka (1920,1925) and Volterra (1926) and the experimental data of Gause (1934a, b), it is perhaps not surprising that these works have had such an enormous impact on ecology over the past century.
Together, these concepts have helped ecologists better describe and model their systems, and have greatly advanced conceptual understanding of how different arrangements of species and interaction types are likely to play out. Nevertheless, as the type of systems that ecologists study grow ever more complex, it seems likely that we may need to move beyond these basic models to new frameworks that allow more nuanced and dynamic interaction types (Letten & Stouffer, 2019). We therefore hope that the gauseR package is useful in better exploring both the classic insights and operational limitations of Lotka, Volterra, and Gause's framework.

ACK N OWLED G M ENTS
We thank the Harpole and Chase lab groups for helpful comments on earlier drafts of this manuscript and are grateful to our associate editor and two anonymous reviewers for helping us clarify the theoretical scope of our article. Open access funding enabled and organized by ProjektDEAL.

CO N FLI C T S O F I NTE R E S T
The authors declare no conflicts related to the research, code, or data presented in this paper.

DATA AVA I L A B I L I T Y S TAT E M E N T
All data, and corresponding metadata, for the analyses in this manuscript are available through the gauseR package, available at github.