The ZAX Herbivory Trainer: Free software for training researchers to visually estimate leaf damage

1. Plants lose a remarkable amount of energy to herbivorous animals, and this damage has substantial impacts on plant fitness and species' distributions. There are many ways ecologists can measure leaf damage, with some methods being more time-consuming than others. Due to a high variance in herbivory, accurate quantification of damage at the population level requires sampling of many leaves. A simple yet effective solution to this problem is to estimate leaf damage visually.
2. Visually estimating leaf damage may be less accurate than scanning methods, but visual estimates of leaf damage are much faster than digital measurements. Using simulations, we show that gathering larger quantities of data at a slightly higher level of inaccuracy gives a more accurate estimate of a population's overall leaf damage than fewer, exact measurements.


| INTRODUCTION
In natural ecosystems, herbivores consume approximately 15% of plant production each year (Cyr & Pace, 1993). The energy lost to herbivores can have substantial effects on plant fitness, growth, resource allocation and mortality (Hulme, 1996; Marquis, 1984).
Measuring leaf damage gives scientists insight into the impact of herbivores on plant health, crop yield and invasion success (Roloff et al., 2004; Shimazaki & Miyashita, 2002; Vasquez & Meyer, 2011). However, herbivore damage is highly variable at all levels: within individual plants (Coley, 1983), among neighbouring plants (Edwards et al., 1993), between populations (Moreira et al., 2015), across species (Marquis et al., 2001) and through time and space (Herrera & Bazaga, 2011; Reynolds & Crossley, 1997). This variability makes it very difficult to obtain accurate estimates of loss to herbivores. In this paper, we provide new information about the accuracy of visual herbivory estimates versus digital image analysis, and introduce the ZAX Herbivory Trainer (https://zaxherbivorytrainer.com/), new software that will make the study of herbivory faster, more accurate and more comparable across different studies and sites.
Ecologists have used three main approaches to measure leaf damage: manual, digital and visual. A search through 50 recent ecology papers shows visual estimates to be the most commonly used method for measuring herbivory (68% of studies), followed by digital methods (28% of studies; including estimates made using scanners with image analysis software or leaf area meters) and manual techniques (4% of studies; Appendix S1). Manual herbivory estimates involve placing transparent grids on top of leaves and counting the undamaged squares, which can be tedious and time-consuming (Benjamin et al., 1968; Cronin et al., 1998; Rudgers, 2004; Turcotte et al., 2014). Digital applications like BioLeaf (Machado et al., 2016) are useful for quick semi-automated estimations; however, the app is designed for estimating herbivory in agricultural crops only and is currently not available on iOS devices. LeafByte (Getman-Pickering et al., 2020), another phone application, is convenient for fast, accurate estimation of leaf area and herbivory, but cannot detect damage from some leaf miners and galling insects. LeafByte is also sensitive to phone tilt and shadowing (from ruffled leaves or poor light), which distort herbivory measurements, and has difficulty correctly estimating herbivory on leaves that are almost totally consumed (Getman-Pickering et al., 2020). Other digital methods include scanning leaves using a desktop scanner and calculating damage percentage using a computer program such as ImageJ (Moles & Westoby, 2000; Neves et al., 2014). Although highly accurate, this method is extremely time-consuming, as analysing one leaf takes ~110 s, or up to 5 min when collecting and scanning are included (Getman-Pickering et al., 2020; Schaffer et al., 1997).
Leaf area meters (e.g. LI-3100 LI-COR) can be used to digitally estimate leaf damage, but have trouble estimating herbivory on leaf borders and are very expensive to purchase and maintain (Bergström et al., 2000; O'Neal et al., 2002). Due to a high variance in herbivory, accurate quantification of damage at both the individual and population level requires sampling of many leaves. This problem can easily be solved by visually estimating leaf damage, as each measurement takes ~10 s to complete (Getman-Pickering et al., 2020). Most visual estimates of herbivory are conducted by classifying leaves into pre-defined categories (e.g. undamaged, up to 5%, 6%-12%; e.g. Kuźmiński et al., 2015), while others record a specific estimate of leaf damage (e.g. Kim & Underwood, 2015).
Some argue that visually estimating leaf damage is less accurate than scanning methods (Kogan & Turnipseed, 1980). Others have shown that measuring herbivory using any method will yield similar results (Kozlov & Zvereva, 2018). Visual estimates of leaf damage are much faster than digital measurements, which means that ecologists can sample many more leaves for a given amount of effort (Schaffer et al., 1997). It is well known that small sample sizes can lead to unreliable results (Button et al., 2013; Varoquaux, 2018). However, the effect of reducing measurement accuracy while increasing sample size has not been assessed for herbivore damage.
It has often been recommended that ecologists train themselves to make herbivory estimates using sets of images with known amounts of damage. However, there is no guidance on how long to train for, or on how to determine one's average level of accuracy (Johnson et al., 2016), and there are no freely available resources.
Here, we present the ZAX Herbivory Trainer, a free web application that uses scanned leaf images with known percentages of damage to train researchers world-wide and so reduce the inaccuracy of visual estimates of herbivory.
The key goals of this study are to:
1. Determine the accuracy of herbivory estimates across a range of sample sizes and measurement errors, to ask whether at some point more data measured less accurately are better than fewer data measured more accurately. We do this using simulations, with empirical data as a reference.
2. Assess the efficacy of our online training tool, including whether it decreases users' estimate inaccuracy, how often it should be used by researchers, and the average time and number of images it takes to complete. We use data collected from voluntary participants of the ZAX Herbivory Trainer to answer these questions.

| MATERIALS AND METHODS
We began by downloading empirical herbivory data from a field study by Valoy et al. (2020) to quantify the relationship between the amount of measurement error and the amount of data (sample size) using simulations. The data recorded the proportion of each leaf eaten and contained a large number of zeros.
We simulated new datasets under a range of scenarios where sample size ($n$) and measurement error ($\sigma_{\mathrm{error}}$) could be changed, while three parameters controlling the amount of herbivory ($p_1$, $p_2$, $\sigma_{\mathrm{herbivory}}$) were fixed across simulations. These fixed parameters were estimated from the empirical data to ensure the simulated data were realistic. $p_1$ is the probability a leaf will sustain damage, for which we used the proportion of leaves in the empirical data with non-zero damage. $\operatorname{logit} p_2$ and $\sigma_{\mathrm{herbivory}}$ are the mean and standard deviation of the amount of damage (logit transformed), conditional on the damage being non-zero. These were calculated by logit transforming the non-zero herbivory observations and taking the mean and standard deviation. This approach simulated the mass of zeros in the empirical data, and the logit transform constrained the amount of herbivory to be between zero and one.
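As an illustration of the parameter estimation described above, the following is a minimal sketch rather than the archived analysis code. The function names and the toy data are hypothetical, and it assumes that damage is recorded as a proportion strictly below one.

```python
import math
import statistics


def logit(x: float) -> float:
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(x / (1.0 - x))


def estimate_parameters(damage):
    """Return (p1, mu_logit, sd_logit) from proportions of leaf area eaten.

    p1       : proportion of leaves with non-zero damage
    mu_logit : mean of logit(damage) over damaged leaves (logit p2 in the text)
    sd_logit : standard deviation of logit(damage) over damaged leaves
    Assumes every value lies in [0, 1); a value of exactly 1 would need
    separate handling because its logit is undefined.
    """
    nonzero = [d for d in damage if d > 0.0]
    p1 = len(nonzero) / len(damage)
    logits = [logit(d) for d in nonzero]
    return p1, statistics.mean(logits), statistics.stdev(logits)


# Made-up proportions for illustration only (not the Valoy et al. data).
toy_damage = [0.0, 0.0, 0.0, 0.02, 0.10, 0.25, 0.0, 0.05]
p1, mu_logit, sd_logit = estimate_parameters(toy_damage)
print(f"p1={p1:.2f}  mean logit={mu_logit:.3f}  sd logit={sd_logit:.3f}")
```

Separating the zeros from the non-zero observations in this way is what allows the two parts of the model to be simulated independently.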
We first simulated ones and zeros under a binomial$(n, p_1)$ distribution. Then we simulated the logit-amount of herbivory (for those leaves with non-zero herbivory) under a Gaussian distribution with mean $\log\left(\frac{p_2}{1 - p_2}\right)$ and standard deviation $\sigma_{\mathrm{error}} + \sigma_{\mathrm{herbivory}}$. We tested sample sizes of 50, 100, 250, 750 and 1,000 and $\sigma_{\mathrm{error}}$ of 0.001, 0.01 and 0.05. To determine the uncertainty on the combined estimate, each scenario was simulated 1,000 times. We then re-estimated $p_1$ and $p_2$ from the simulated data to estimate the amount eaten ($p = p_1 p_2$). We then calculated the root mean squared error (RMSE) (Barnston, 1992), comparing the model estimates of mean leaf damage to the 'true' values used to generate the data. Monte Carlo standard errors of the RMSE were estimated using a jack-knife technique by recalculating the RMSE 1,000 times, each time using all but one of the simulated datasets, that is, MCSE(RMSE).
3. Repeat step 1 immediately after being trained.
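The simulation, RMSE and jack-knife steps described above for the herbivory model could be sketched as follows. This is an illustrative sketch under stated assumptions, not the archived code: the parameter values are hypothetical placeholders, $p_2$ is assumed to be recovered by back-transforming the mean logit, the jack-knife uses the standard leave-one-out standard error formula, and only 200 replicates are run to keep the example quick.

```python
import math
import random
import statistics


def inv_logit(x: float) -> float:
    """Map a logit-scale value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_estimate(n, p1, mu_logit, sd_herbivory, sd_error, rng):
    """Simulate n leaves once and return the re-estimated mean damage p1 * p2."""
    # Stage 1: n Bernoulli(p1) draws decide which leaves are damaged.
    n_damaged = sum(rng.random() < p1 for _ in range(n))
    if n_damaged == 0:
        return 0.0
    # Stage 2: logit-scale damage, sd = sd_error + sd_herbivory as in the text.
    sd = sd_error + sd_herbivory
    logits = [rng.gauss(mu_logit, sd) for _ in range(n_damaged)]
    p1_hat = n_damaged / n
    # Assumption: p2 is recovered by back-transforming the mean logit.
    p2_hat = inv_logit(statistics.fmean(logits))
    return p1_hat * p2_hat


def rmse(estimates, truth):
    """Root mean squared error of the estimates around the true value."""
    return math.sqrt(statistics.fmean((e - truth) ** 2 for e in estimates))


def jackknife_mcse(estimates, truth):
    """Leave-one-out (jack-knife) standard error of the RMSE."""
    sq = [(e - truth) ** 2 for e in estimates]
    total, r = sum(sq), len(sq)
    loo = [math.sqrt(max(total - s, 0.0) / (r - 1)) for s in sq]
    centre = statistics.fmean(loo)
    return math.sqrt((r - 1) / r * sum((x - centre) ** 2 for x in loo))


# Hypothetical parameter values, not those estimated from Valoy et al. (2020).
P1, MU_LOGIT, SD_HERBIVORY = 0.6, -3.0, 1.2
TRUTH = P1 * inv_logit(MU_LOGIT)
REPLICATES = 200  # the text uses 1,000; fewer here to keep the sketch quick

rng = random.Random(1)
results = {}
for n in (50, 100, 250):
    for sd_error in (0.001, 0.01, 0.05):
        est = [simulate_estimate(n, P1, MU_LOGIT, SD_HERBIVORY, sd_error, rng)
               for _ in range(REPLICATES)]
        results[(n, sd_error)] = (rmse(est, TRUTH), jackknife_mcse(est, TRUTH))

for (n, sd_error), (value, mcse) in sorted(results.items()):
    print(f"n={n:4d}  error={sd_error:.3f}  RMSE={value:.4f}  MCSE={mcse:.4f}")
```

Because the measurement error enters only through the standard deviation of the logit-scale draws, its effect on the RMSE can be compared directly with the effect of changing `n`.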
Our app recorded participants' average accuracy across the 50 leaf images they estimated in steps 1 and 3 (where accuracy (%) is the difference between the actual leaf damage (%) and the user's estimated damage (%)). Because the distributions of our raw and log-transformed accuracy data deviated from normality, we used inaccuracy (i.e. 100 minus accuracy) as our unit of measure, which we $\log_{10}$-transformed prior to analysis to address heteroscedasticity. We then performed a paired-samples t test comparing users' average estimate inaccuracy pre- and post-training.
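The pre- versus post-training comparison described above can be sketched as follows. This is a hedged illustration only: it assumes that inaccuracy corresponds to the mean absolute difference (in percentage points) between actual and estimated damage, the function names are hypothetical, and the scores are invented rather than taken from the study.

```python
import math
import statistics


def mean_inaccuracy(actual, estimated):
    """Mean absolute difference between actual and estimated damage (%)."""
    return statistics.fmean(abs(a - e) for a, e in zip(actual, estimated))


def paired_t_log10(pre, post):
    """Paired-samples t statistic and degrees of freedom on log10 inaccuracy.

    Every value must be strictly positive, because log10(0) is undefined.
    """
    diffs = [math.log10(a) - math.log10(b) for a, b in zip(pre, post)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Invented average inaccuracy (%) for five users, before and after training.
pre_training = [6.0, 4.5, 8.0, 5.0, 7.5]
post_training = [3.0, 2.5, 3.5, 3.0, 2.5]
t_stat, df = paired_t_log10(pre_training, post_training)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

A p-value would then be read from the t distribution with `df` degrees of freedom; SciPy's `scipy.stats.ttest_rel` performs the same test and returns the p-value directly.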
To inform how often training should be undertaken we recruited 11 participants to:

| RESULTS
Our simulations show that when studying herbivory, the effect of sample size far outweighs the effect of estimate inaccuracy. That is, analysing a larger sample with slightly more measurement error yields a substantially more accurate result than analysing a smaller sample with almost no measurement error (Figure 1). For example, a sample of 100 leaves measured at 5% inaccuracy is almost twice as close to the real-world values as a sample of 50 leaves measured at 0.1% inaccuracy (Figure 1). If a researcher wanted to be within 1% of the true herbivory mean of a population (which requires ~250 measurements for the empirical dataset we analysed; Figure 1), this would take just 42 min using visual estimates (~10 s per estimate), compared to over 7 and a half hours using a program such as ImageJ (not including the time taken to photograph the leaf and upload to a computer). The time difference between these methods would allow c. 2,500 more samples to be visually estimated, returning results that are within 0.1% of the actual population mean.
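The time comparison above follows from simple arithmetic on the per-leaf timings quoted in the text (~10 s per visual estimate and ~110 s per leaf in ImageJ). A short worked version, with hypothetical variable names:

```python
SECONDS_VISUAL = 10    # ~10 s per visual estimate
SECONDS_DIGITAL = 110  # ~110 s per leaf analysed in ImageJ
N_LEAVES = 250         # measurements needed to be within 1% of the true mean

visual_minutes = N_LEAVES * SECONDS_VISUAL / 60
digital_hours = N_LEAVES * SECONDS_DIGITAL / 3600

# Time saved by estimating visually, expressed as additional visual estimates.
seconds_saved = N_LEAVES * (SECONDS_DIGITAL - SECONDS_VISUAL)
extra_visual_samples = seconds_saved // SECONDS_VISUAL

print(f"visual: {visual_minutes:.1f} min, digital: {digital_hours:.2f} h")
print(f"extra visual estimates in the time saved: {extra_visual_samples}")
```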
On average, users took 8 min and 43 s, and estimated 48 images, to reach an estimate accuracy above 98% using the ZAX Herbivory Trainer (Figure 4a,b). Ninety per cent of users had completed training within 17 and a half minutes and 85 image estimations.

| DISCUSSION
We have shown that an ecologist with a limited amount of time to sample would get a more accurate approximation of mean herbivory levels by making visual estimates of herbivory than by using more time-consuming but precise digital methods. A similar principle likely applies to other fields. For example, imperfectly sampled data collected by thousands of amateur citizen scientists could lead to broader, more reliable results than fewer, perfectly sampled data collected by professional researchers (Kirchhoff et al., 2020; Rowley et al., 2019).
It has been argued that overestimation when visually estimating herbivory results from a lack of proper training (Getman-Pickering et al., 2020). The ZAX Herbivory Trainer is the first tool that provides real-time feedback on user estimates to allow for continued improvement in their estimate accuracy with each attempt. Similarly, Newton and Hackett (1994) demonstrated an improvement in assessor accuracy when participants used comparable software to standardise the measurement of mildew on barley.
FIGURE 1 Simulated relationship between sample size (number of leaves) and the difference between simulated herbivory values and the real population mean (RMSE) across different amounts of measurement inaccuracy (0.1%, 1% and 5%) in comparison to the true values from Valoy et al. (2020) herbivory data. As sample size increases, simulated herbivory means become closer to real population means (based on empirical data) across all amounts of measurement inaccuracy. Error bars show Monte Carlo standard errors (MCSEs) and are vertical, although they appear horizontal because the error values are small.
FIGURE 2 Violin plot of users' average inaccuracy (%) when estimating damage on 50 random leaf images without feedback before and after using the ZAX Herbivory Trainer. Points represent means and error bars represent standard deviation.
There are far fewer limitations when visually estimating herbivory than when using other methods (Getman-Pickering et al., 2020; Machado et al., 2016). The assessor should be able to identify and estimate all types of damage on any type of leaf, as long as they have completed training beforehand. Assessing herbivory quantitatively with a trained estimator will likely result in substantially more accurate outcomes than those gained by coarser categories, increasing our ability to detect smaller effects, and furthering our understanding of plant-animal interactions.
Based on our results (Figure 3), we recommend that ecologists use our trainer once just before starting their herbivory measurements, with re-training every 3 months to ensure estimate accuracy is held to the highest standard. We also suggest that if a person takes more than 17.5 min or 85 images to complete training, they may not be suited to collecting herbivory data for research projects, as they fall above the threshold by which 90% of users have completed training (Figure 4a,b).
The level of user inaccuracy was higher in the group used to study pre- and post-training inaccuracy (Figure 2) than in the group used to study training retention (Figure 3). This difference is likely related to the demographic of the participants. For the analysis comparing pre- and post-training inaccuracy, we targeted undergraduate students, who may not have paid as careful attention to detail when estimating leaf damage because their scores remained anonymous. For our accuracy retention analysis (Figure 3), we recruited PhD and Honours students who could commit to re-testing their accuracy at each time interval and whose identities were known. The lack of anonymity when sharing their scores with the research team could have encouraged these participants to perform better when estimating damage.
Our trainer will be useful to ecologists of any experience level, studying herbivory on any scale, anywhere in the world. Students in ecology classes and citizen scientists can also benefit from our app, as they can train themselves to estimate herbivory for projects without needing guidance or oversight from a supervisor.
Our application makes measuring herbivory faster and more accurate, and this brings within reach many herbivory-related research questions previously deemed unfeasible. International collaborations will be made easier and more reliable, as the ZAX Herbivory Trainer can be accessed across the world and provides a standardised way of ensuring each assessor is properly trained.

ACKNOWLEDGEMENTS
We wish to acknowledge the Bedegal people who are the Traditional Owners of the land where this research was conducted. We also wish to thank the people who participated in our study. Kozlov, whose advice greatly improved this manuscript.

CONFLICT OF INTEREST
We have no conflict of interest to declare.

DATA AVAILABILITY STATEMENT
Code and data for all analyses (Xirocostas et al., 2021) and the ZAX Herbivory Trainer (Debono, 2021) are archived on Zenodo.