The CV is dead, long live the CV!

Biology has an increasing need to reconsider the tools used to assess the variability of measurements, in addition to their central tendency. More than 100 years after Pearson's publication, most biologists still use the “good old” Pearson's coefficient of variation, PCV, despite its documented flaws, such as sensitivity to excess zero values and/or irrelevant low mean values, which may compromise its use in some biological applications. A new statistic, KCV, was developed in 2017 by Kvålseth and is easy to implement. Unlike PCV, KCV is bounded (between 0 and 1), and it can be computed from PCV, ensuring backward compatibility with past studies. In addition to simulated data, we used the recent MASTREE+ database, which comprises time series of the fruiting dynamics of perennial plants worldwide, to compare the properties of PCV and KCV. Using as a benchmark the loose hump‐shaped relationship between the interannual variability of fruiting and latitude, KCV led to a significant increase in statistical power, as it required almost half as many time series as PCV to detect the relationship. Perhaps most importantly, simulated data showed that KCV allows huge reductions in the length of time series required to estimate the true variability of the population, saving more than half the duration of long‐term monitoring if fruiting fluctuations are very large, which is common in perennial plant species. Compared to the widely used PCV, KCV has great accuracy for estimating and analysing variability in biology, while strongly increasing statistical power. Selecting appropriate tools to assess the variability of measurements is crucial, particularly where the variability is of primary biological interest. Using Kvålseth's KCV is a promising avenue to circumvent the well‐known issues of Pearson's PCV; its properties remain to be explored in other fields of biology, for purposes other than purely statistical ones (e.g. estimating heritability or evolvability of traits).


1 | INTRODUCTION
Biology has an increasing need to reconsider the tools used to assess the variability of measurements, in addition to their central tendency. This is particularly important in the fields of ecology and evolution, especially in the context of ongoing global change. For example, it is necessary to properly quantify the variability of population abundance in order to compare population dynamics and assess extinction risk. The dynamics and evolution of populations also depend strongly on the degree of variability of the environment or of individual phenotypes, which must be carefully assessed.
There is general agreement that the appropriate statistic to estimate variability has to be scaled to the mean to facilitate comparisons (Gaston & McArdle, 1994; Inchausti & Halley, 2002; McArdle & Gaston, 1995; Pélabon et al., 2020; Pimm, 1991). In this respect, Pearson's coefficient of variation, PCV (Pearson, 1896), is the statistic almost exclusively used in biological studies to date. PCV is computed from a series x of n non-negative elements as the ratio of the sample standard deviation to the sample mean. Note that PCV is unaffected by the order of elements in x, so that if you are interested in this aspect you should use order-dependent statistics (see for example Bogdziewicz et al., 2023), and that there is a loss of information if your interest is the variance-mean relationship (see for instance fig. 2 in Pélabon et al., 2020).
PCV is commonly used as a convenient dimensionless statistic of variability; for instance, in repeatability experiments it facilitates comparisons between laboratories that may use different units.
Another use is when groups have widely different means, for example in finance to compare the variability of securities in a stock exchange. In biology, it is perhaps the scale invariance, PCV(λx) = PCV(x), where λ is a strictly positive constant, that may explain its success.
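The scale invariance mentioned above follows directly from the definition of the statistic as a ratio. As a short check, writing s(x) for the sample standard deviation and x̄ for the sample mean (notation assumed here for illustration, not taken from the original):

```latex
\[
  \mathrm{PCV}(\lambda x)
  = \frac{s(\lambda x)}{\overline{\lambda x}}
  = \frac{\lambda\, s(x)}{\lambda\, \bar{x}}
  = \frac{s(x)}{\bar{x}}
  = \mathrm{PCV}(x),
  \qquad \lambda > 0 .
\]
```

The factor λ cancels because both the standard deviation and the mean are multiplied by it, which is why a change of unit (grams to kilograms, for example) leaves the statistic unchanged.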
Non-negative series are common, since all extensive variables (e.g. mass, length, surface, volume, and meristic variables such as seed counts) belong to this class. Calculating variability in the case of intensive variables (e.g. speed, pressure, density, temperature) can be trickier: for example, the same set of temperature data (Kimber, 1991) gives different PCV values when expressed in degrees Celsius or in degrees Fahrenheit (Eisenhauer, 1993). PCV also has other well-known issues (Gaston & McArdle, 1994; Kvålseth, 2017; Lewontin, 1966; McArdle et al., 1990; McArdle & Gaston, 1995; Pélabon & Hansen, 2008; Silveira & Siqueira, 2022), such as its sensitivity to outliers and the fact that it is strongly affected by small variations in the mean, or by errors in the estimation of the mean.
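The temperature example can be reproduced with a few lines of code. The following is a minimal sketch, not the authors' code: the temperature values are hypothetical, the function name `pcv` is ours, and the standard deviation is assumed to use the divisor n (`statistics.pstdev`), whereas R's `sd()` divides by n − 1.

```python
import math
import statistics


def pcv(values):
    """Pearson's coefficient of variation: standard deviation over mean.

    Assumption: divisor-n standard deviation (statistics.pstdev).
    """
    return statistics.pstdev(values) / statistics.fmean(values)


# Hypothetical daily temperatures in degrees Celsius (illustrative values only).
celsius = [12.0, 15.5, 9.0, 20.0, 17.5]

# Celsius -> Fahrenheit is a change of scale PLUS a shift of the origin.
fahrenheit = [c * 9.0 / 5.0 + 32.0 for c in celsius]

# A pure change of scale, for comparison.
tripled = [3.0 * c for c in celsius]

pcv_celsius = pcv(celsius)
pcv_fahrenheit = pcv(fahrenheit)
pcv_tripled = pcv(tripled)

print(f"PCV in Celsius:    {pcv_celsius:.4f}")
print(f"PCV in Fahrenheit: {pcv_fahrenheit:.4f}")
print(f"PCV of 3 x data:   {pcv_tripled:.4f}")
print("scale invariant:", math.isclose(pcv_celsius, pcv_tripled))
```

Multiplying the data by a constant leaves the value unchanged, while shifting the origin (as the Fahrenheit conversion does) changes the mean without changing the spread in the same proportion, so the ratio differs.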
A few attempts have been made to find alternatives to PCV. Lewontin (1966) proposed working with the standard deviation of log-transformed values, which can easily be applied in the context of allometric studies, where sample values are strictly positive.
This does not work, however, for counts that include zero values, and correcting the problem by adding an arbitrary constant to each value x_i before taking logarithms is inappropriate because the scale invariance property is lost in the process. The proportional variability statistic, PV, was proposed (Heath, 2006; Heath & Borowski, 2013) to address these challenges, but it has major weaknesses of its own (see Table 1).
Recently, a new coefficient of variation has been proposed (Kvålseth, 2017), called hereafter Kvålseth's coefficient of variation, KCV, which has gone largely unnoticed by biologists. KCV is as easy to compute as PCV, since it is the sample standard deviation divided by the square root of the mean of squared values. What is more, KCV can be seen as a variance stabilization transformation of PCV: KCV² = PCV² / (1 + PCV²). This relationship allows us to compute KCV from formerly reported PCV values, even if the original dataset is no longer available. This relationship also shows that when PCV tends to infinity, KCV remains bounded by 1. The other advantages of KCV over PCV are theoretically demonstrated in Kvålseth's paper. For instance, KCV can be used with a signed ratio-type scale mixing positive and negative values since, unlike PCV, it is not undefined when the mean is 0.

TABLE 1 Comparison of PV (Heath, 2006), PCV (Pearson, 1896) and KCV (Kvålseth, 2017) values on the same time series. The first issue is that the same PV values are obtained for time series composed mainly of very low values with occasional high values and for their mirror series, mostly composed of high values with occasional low values (sets 1 and 2). The second issue is that very different PV values are obtained for time series that are nearly identical from a biological perspective (sets 3 and 4). The differences between sets 3 and 4 are minute or even meaningless, yet commonly encountered, as they may arise from sampling fluctuations. In these case studies, both PCV and KCV are sensitive to meaningful differences and insensitive to artifactual differences.

Here, we highlight the interest of KCV on the basis of a practical case study and of simulations used to compare the gain in statistical power or in sampling effort associated with the use of KCV vs PCV. For that purpose, we used annual seed production in perennial plant populations as a case study. These populations show diverse fruiting dynamics, ranging from nearly constant annual production, through extreme interannual variation (masting), to semelparity in some species such as the mainland Chinese bamboo Phyllostachys bambusoides with its seeding cycle of about 120 years (Janzen, 1976). This may represent the greatest variation ever recorded among biological variables in terrestrial ecosystems, providing an ideal example of the challenges of measuring variability.
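To make the definition of KCV and its link with PCV concrete, here is a minimal sketch. The function names and the seed-count series are hypothetical, and the standard deviation is assumed to use the divisor n (`statistics.pstdev`). Under that assumption the mean of squares equals the variance plus the squared mean, so KCV² = PCV² / (1 + PCV²) holds exactly; with the n − 1 divisor used by R's `sd()` it holds only approximately on short series.

```python
import math
import statistics


def pcv(values):
    """Pearson's CV: standard deviation divided by the mean (divisor n assumed)."""
    return statistics.pstdev(values) / statistics.fmean(values)


def kcv(values):
    """Kvalseth's CV: standard deviation divided by the root mean square."""
    mean_of_squares = statistics.fmean([v * v for v in values])
    return statistics.pstdev(values) / math.sqrt(mean_of_squares)


def kcv_from_pcv(pcv_value):
    """Convert a previously reported PCV value into KCV."""
    return pcv_value / math.sqrt(1.0 + pcv_value ** 2)


def pcv_from_kcv(kcv_value):
    """Inverse conversion, defined for 0 <= KCV < 1."""
    return kcv_value / math.sqrt(1.0 - kcv_value ** 2)


# Hypothetical annual seed counts with a few mast years (illustrative only).
seeds = [0, 2, 0, 150, 1, 0, 3, 200, 0, 5, 0, 120]

print(f"PCV          = {pcv(seeds):.4f}")
print(f"KCV (direct) = {kcv(seeds):.4f}")
print(f"KCV from PCV = {kcv_from_pcv(pcv(seeds)):.4f}")
```

Because the conversion is monotonic, it can be applied to PCV values taken from earlier publications without access to the raw series, which is the backward compatibility described in the text.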

2 | MATERIALS AND METHODS
The demonstration of the statistical power gain associated with KCV rather than PCV in a biological context is carried out in two complementary steps, one based on biological data quantified in the field and the other on simulated data whose true parameters are known a priori.
In the first step, we used the numerous time series describing fruiting dynamics quantified at the scale of perennial plant populations.
In the second step, once the single parameter of the simulated lognormal distribution is fixed, the true PCV and KCV are known (Kvålseth, 2017). From these simulation experiments, the gain in statistical power associated with the use of KCV can be estimated by the savings in sampling effort (number of years saved) needed to estimate the true CVs with a chosen degree of precision. To illustrate the approach, we initially used sdlog = 1.010768 in the rlnorm() function, which corresponds to variability at the boundary between the "large" and "very large" ranges for KCV, typical of masting studies; the theoretical time series then have true PCV and KCV of 1.33 and 0.8, respectively. For each length of time series, 10,000 replicates were sampled and both statistics were calculated on the same time series.
Zero-inflated distributions were simulated by forcing a given fraction of the smallest values to zero. Then we generalized the procedure by using lognormal distributions to generate true KCV ranging from 0.4 to 0.95 (in steps of 0.25).
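The simulation design described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: it uses Python's `random.lognormvariate` in place of `rlnorm()`, a meanlog of 0 (the CVs of a lognormal do not depend on it), the divisor-n standard deviation, and far fewer replicates than the 10,000 used in the paper. All function names are ours. The true values follow from the standard lognormal moments: PCV = sqrt(exp(sdlog²) − 1) and KCV = sqrt(1 − exp(−sdlog²)).

```python
import math
import random
import statistics


def true_pcv(sdlog):
    """True PCV of a lognormal distribution with log-scale sd `sdlog`."""
    return math.sqrt(math.exp(sdlog ** 2) - 1.0)


def true_kcv(sdlog):
    """True KCV of the same distribution."""
    return math.sqrt(1.0 - math.exp(-sdlog ** 2))


def pcv(values):
    return statistics.pstdev(values) / statistics.fmean(values)


def kcv(values):
    mean_of_squares = statistics.fmean([v * v for v in values])
    return statistics.pstdev(values) / math.sqrt(mean_of_squares)


def zero_inflate(values, fraction):
    """Force the given fraction of the smallest values to zero."""
    n_zero = int(round(fraction * len(values)))
    order = sorted(range(len(values)), key=values.__getitem__)
    inflated = list(values)
    for index in order[:n_zero]:
        inflated[index] = 0.0
    return inflated


def mean_relative_estimates(length, sdlog, replicates, rng):
    """Mean sample PCV and KCV, each divided by its true value."""
    pcv_sum = 0.0
    kcv_sum = 0.0
    for _ in range(replicates):
        series = [rng.lognormvariate(0.0, sdlog) for _ in range(length)]
        pcv_sum += pcv(series)  # both statistics on the same series
        kcv_sum += kcv(series)
    return (pcv_sum / replicates / true_pcv(sdlog),
            kcv_sum / replicates / true_kcv(sdlog))


SDLOG = 1.010768  # value quoted in the text
print(f"true PCV = {true_pcv(SDLOG):.4f}, true KCV = {true_kcv(SDLOG):.4f}")

rng = random.Random(1)  # fixed seed so the sketch is repeatable
for length in (5, 10, 20):
    rel_pcv, rel_kcv = mean_relative_estimates(length, SDLOG, 2000, rng)
    print(f"n = {length:2d}: PCV/true = {rel_pcv:.2f}, KCV/true = {rel_kcv:.2f}")
```

Expressing each mean estimate as a fraction of its true value puts the two statistics on a common footing, which is what allows the lengths of series needed by each to be compared.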
All computations were done under the R statistical software (R Team, 2013).

3.2 | Comparison of PCV and KCV power with actual data
The level of variability in population-scale fruit production has recently been examined over a large range of plant species and spatial scales in the Northern Hemisphere (Pearse et al., 2020), and the time-series variability exhibits a loose hump-shaped relationship with latitude (Figure 2). This is a perfect benchmark because the small part of total variability accounted for by the model (r² = 0.0481) means that a lot of data is required to obtain a significant relationship.
With PCV, the relationship is questionable because there is an overrepresentation of data in the intermediate latitude range [35°-55°], likely including by chance most of the outliers (anomalously high PCV values), which could be responsible for an artificial quadratic relationship. Using KCV, the hump-shaped relationship is much more convincing because its values are bounded between 0 and 1, so that no heavy-tailed distributions are possible.
In this way, using KCV is similar to using a log scale when dealing with highly skewed data, but avoids the need for data transformations.
In this case, the advantage of a bounded statistic is obvious: it prevents the highly skewed distributions observed for PCV values (Figure 1) and helps, in Kelly's words (Kelly, 2023), in "fighting the urge to put things in bins".
Based on these data and on sub-sampling simulations, we found a massive gain in statistical power when using KCV instead of PCV (Figure 3): about 40% of the sampling effort can be saved to reach a significant result. The advantage of using KCV instead of PCV is worth considering, given the logistical difficulties of long-term field monitoring of seed production, which pose a major obstacle to progress in the field (Clark et al., 2021; Koenig, 2021). Even cutting-edge technologies (e.g. Jones & Allen, 2002) are unlikely to improve the situation in the near future.

3.3 | In silico comparison of PCV and KCV power
Using simulated data allows us to study the impact of sampling effort (the length of time series) on the sampling fluctuation of statistics values, from both a central tendency and a dispersion point of view (Figure 4). The dispersion of sampling fluctuations decreases with sampling effort with KCV but is almost unchanged with PCV, an undesirable property. Examining the central tendencies, the convergence to the true value is faster with KCV than with PCV (Figure 4).
For example, reaching 80% of the true population value requires 22 years with PCV while it takes only 9 years with KCV, corresponding to a 13-year gain (i.e. more than 50% of years saved). A similar gain was observed with zero-inflated time series (not shown). At no extra cost, with the same dataset, we are always closer to the true population value with KCV. Crucially in the case of masting analyses, this enables a substantial reduction in the number of years of monitoring needed before the intensity of masting can be accurately measured (Figure 4).

FIGURE 2 Quantifying the relationship between variability and latitude using the two methods of calculating CV. Points show a subset of 1138 time series from the Northern Hemisphere showing the relationship between PCV (left) or KCV (right) and latitude. The red line is the quadratic fit that minimizes the sum of squared residuals.

FIGURE 3 Sub-sampling simulation showing how KCV dramatically reduces the number of samples required to detect a significant relationship between CV and latitude, as shown in Figure 2. Lines show the percentage of simulations for a given sub-sample size that produce a significant (p < 0.05) quadratic relationship between CVs and latitude, based on 10,000 replicates for each sub-sample size. Sub-samples were randomly selected from the 1138 MASTREE+ time series. Detecting the relationship using KCV saved 43% of the sampling effort as compared with PCV.
The amount of sampling effort saved when shifting from PCV to KCV was also found to increase with the degree of variability in the data series (Figure 5). For instance, to reach 80% of the true population value, 13 years could theoretically be saved when KCV = 0.8, 25 years when KCV = 0.85 and even 56 years when KCV = 0.9. To summarize, whatever the length of the time series, KCV always outperforms PCV, and the reduction in the length of the time series allowed by KCV increases with the intrinsic variability level of the dataset.
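The notion of "years saved" used above can be sketched as the difference between the shortest series lengths at which the mean of each statistic reaches a given fraction of its true value. The code below is an illustration only: function names are ours, the divisor-n standard deviation is assumed, and it uses 1,000 replicates per length instead of 10,000, so the lengths it reports are not expected to match the published figures exactly.

```python
import math
import random
import statistics


def pcv(values):
    return statistics.pstdev(values) / statistics.fmean(values)


def kcv(values):
    mean_of_squares = statistics.fmean([v * v for v in values])
    return statistics.pstdev(values) / math.sqrt(mean_of_squares)


def shortest_length_reaching(statistic, true_value, fraction, sdlog,
                             max_length, replicates, rng):
    """Shortest series length whose mean estimate reaches fraction * true value.

    Returns None if the target is not reached up to max_length.
    """
    for length in range(2, max_length + 1):
        total = 0.0
        for _ in range(replicates):
            series = [rng.lognormvariate(0.0, sdlog) for _ in range(length)]
            total += statistic(series)
        if total / replicates >= fraction * true_value:
            return length
    return None


SDLOG = 1.010768  # value quoted in the Methods
TRUE_PCV = math.sqrt(math.exp(SDLOG ** 2) - 1.0)
TRUE_KCV = math.sqrt(1.0 - math.exp(-SDLOG ** 2))

rng = random.Random(42)  # fixed seed so the sketch is repeatable
years_kcv = shortest_length_reaching(kcv, TRUE_KCV, 0.8, SDLOG, 60, 1000, rng)
years_pcv = shortest_length_reaching(pcv, TRUE_PCV, 0.8, SDLOG, 60, 1000, rng)

print("years needed with KCV:", years_kcv)
print("years needed with PCV:", years_pcv)
if years_kcv is not None and years_pcv is not None:
    print("years saved:", years_pcv - years_kcv)
```

Repeating the same search over a grid of sdlog values, and hence of true KCV values, gives the kind of curve summarised in Figure 5.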

4 | CONCLUSION
Kvålseth concludes his article by stating that "except for a long tradition of the use of PCV, there appears to be no reason not to prefer the use of KCV over PCV". The double negation in Kvålseth's delicate wording appears to us as an understatement: at least in studies devoted to understanding the temporal or spatial variability of biological quantities, we do have good reasons to shift from PCV to KCV as a scale-invariant statistic to properly quantify variability. Other applications of KCV deserve to be explored, such as in evolutionary biology, where inferring the evolvability of a trait, its phenotypic plasticity, or its selective value relies on accurate, and still debated, measures of variability (Hansen et al., 2011; Houle, 1992; Houle et al., 2011; Pélabon et al., 2020). Nonetheless, while KCV has a number of advantages for focal applications, as presented in our paper, the choice of statistics will depend on the questions being asked.

AUTHOR CONTRIBUTIONS
All authors contributed critically to the drafts and gave final approval for publication. Jean R. Lobry coordinated the ideas from all authors, wrote the R code for analyses and led the writing of the manuscript.
The idea for the title is from Marie-Claude Bel-Venner.
FIGURE 4 Simulation experiment using a lognormal distribution to demonstrate that while PCV and KCV both underestimate the true population value, KCV converges more rapidly than PCV, reducing the number of years of observation required to estimate its value. The dotted lines are the true population values for PCV and KCV. The x-axis scale is representative of the length of the masting series available in MASTREE+, whose median is 10 years, and 50% of the time series are between 4 and 17 years (indicated by the grey shading). The black point is at the mean, and the bars represent plus or minus one standard deviation (not a confidence interval for the mean) to illustrate the dispersion of the sample statistics. The red point indicates the time-series length where 80% of the true population value is reached.
These time series cover populations and species in various localities around the world. Data recently made available in MASTREE+ (Hacket-Pain et al., 2022) offer a great opportunity to compare the behaviour of PCV and KCV because the series cover a very wide range of variability. This is a libre database available under a CC-BY-4.0 licence. We used the initial (2022-03-03) version. Quantitative time series with at least 12 documented values were selected (n = 1433 time series). From this database, we describe the relationship between PCV and KCV and then analyse the gain in statistical power associated with using KCV (compared to PCV), using a test to detect a previously published relationship between the degree of variability in fruiting and latitude (Pearse et al., 2020). To do this, we sub-sample the MASTREE+ database by randomly drawing time series. For each sub-sample size, we simulate 10,000 independent tests (either with KCV or with PCV) and determine the proportion of tests that detected a significant (p < 0.05) quadratic relationship between CVs and latitude. The power gain of using KCV instead of PCV is quantified by the difference between the sub-sample sizes needed by each statistic to detect a significant relationship in 95% of the tests.
In a second step, we use a simulation experiment based on a lognormal distribution to generate the fruiting dynamics over a longer or shorter time series. The use of a lognormal distribution has two advantages: (i) it allows the generation of fruiting dynamics consistent with observations, and (ii) it requires the use of only one parameter.
Non-parametric confidence intervals for statistics were computed in R (R Team, 2013) with the boot package (Canty & Ripley, 2021; Davison & Hinkley, 1997) using the adjusted bootstrap percentile (BCa) method (Efron, 1987) and 9999 replicates. The R code to reproduce the analyses is available in the file CVisDead.zip at pbil.univ-lyon1.fr/R/donnees/ in the form of an RMarkdown document (Allaire et al., 2020; Xie et al., 2018, 2020) compiled with knitr (Xie, 2014, 2015, 2020).

3 | RESULTS AND DISCUSSION
3.1 | Comparison of PCV and KCV general properties based on true datasets
Paired calculations of PCV and KCV over a large dataset of field time series of fruiting dynamics of perennial plant species show that they are essentially the same up to the moderate variability range, but for greater variability, PCV tends to stretch values to infinity. This is a common situation in masting studies, since 74% of time series in MASTREE+ are in the large or very large variability range (Figure 1).

FIGURE 1 Comparison of PCV and KCV statistics for 1433 masting time series with at least 12 observations from MASTREE+ (Hacket-Pain et al., 2022). The grey lines are the 95% bootstrap confidence intervals (Efron, 1987). The vertical blue lines are the boundaries of Kvålseth's ranges for verbal interpretation of variability. The red curve is the theoretical relationship (y² = x² / (1 − x²)) between PCV and KCV.

The KCV estimates are accurate enough (with confidence intervals for KCV typically ±0.1) to consider as relevant the 5-class categorization of the [0,1] range proposed by Kvålseth for verbal interpretation. The consistency of results when switching from PCV to KCV is ensured by their monotonic relationship; for instance, all non-parametric rank-based tests are equivalent since ranks are preserved. The "too many zeros issue", common in masting studies, is solved neither by PCV nor by KCV, but it can at least be detected by confidence intervals that include zero, meaning that the corresponding dataset does not allow us to reject the null hypothesis "H0: CV = 0".
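The confidence intervals discussed above were obtained by the authors with the BCa method of the R boot package. As a rough illustration of the idea only, the sketch below uses the simpler percentile bootstrap, which is a different and less refined technique, so its intervals will not coincide with BCa intervals. The function names and the seed-count series are hypothetical, the divisor-n standard deviation is assumed, and resamples made only of zeros (for which KCV is undefined) are simply skipped, which is our own choice.

```python
import math
import random
import statistics


def kcv(values):
    """Kvalseth's CV (divisor-n sd); NaN when every value is zero."""
    mean_of_squares = statistics.fmean([v * v for v in values])
    if mean_of_squares == 0.0:
        return math.nan
    return statistics.pstdev(values) / math.sqrt(mean_of_squares)


def percentile_bootstrap_ci(values, statistic, replicates=1999,
                            level=0.95, rng=None):
    """Percentile bootstrap interval (NOT the BCa method used in the paper)."""
    rng = rng or random.Random()
    estimates = []
    for _ in range(replicates):
        # Resample the series with replacement, keeping its length.
        resample = rng.choices(values, k=len(values))
        estimate = statistic(resample)
        if not math.isnan(estimate):
            estimates.append(estimate)
    estimates.sort()
    alpha = (1.0 - level) / 2.0
    last = len(estimates) - 1
    lower = estimates[int(math.floor(alpha * last))]
    upper = estimates[int(math.ceil((1.0 - alpha) * last))]
    return lower, upper


# Hypothetical annual seed counts (illustrative only).
seeds = [0, 2, 0, 150, 1, 0, 3, 200, 0, 5, 0, 120]

low, high = percentile_bootstrap_ci(seeds, kcv, rng=random.Random(7))
print(f"KCV = {kcv(seeds):.3f}, 95% percentile interval = [{low:.3f}, {high:.3f}]")
```

Because resampling treats the years as exchangeable, an interval of this kind reflects sampling variability only and ignores any temporal structure in the series.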
FIGURE 5 The reduction in sampling effort (years) when switching from PCV to KCV to measure variability of reproduction. Simulations plotted show the reduction in effort required to reach, on average, a given fraction (0.5, 0.6, …, 0.9 as indicated in the top-left box) of the true population value as a function of the variability level (KCV). The red arrow highlights the example shown in Figure 4. The vertical blue lines are the boundaries of Kvålseth's ranges for verbal interpretation. The grey area is the interquartile range for KCV in our subset of MASTREE+: 50% of quantitative time series with at least 12 observations are between 0.59 and 0.82.
PCV is not invariant by translation. Scale types that are meaningless for PCV are given in table 1 in Pélabon et al. (2020). Even if the scale type is appropriate, there are still other well-known issues with PCV.