Assessing the value of citizen scientist observations in tracking the abundance of marine fishes

The state of biodiversity for most of the world is largely enigmatic due to a lack of long‐term population monitoring data. Citizen science programs could substantially contribute to resolving this data crisis, but there are noted concerns on whether methods can overcome the biases and imprecision inherent to aggregated opportunistic observations. We explicitly test this question by examining the temporal correlation of population time‐series estimated from opportunistic citizen science data and a rigorous fishery‐independent survey that concurrently sampled populations of coral‐reef fishes (n = 87) in Key Largo, Florida, USA, over 25 years. The majority of species exhibited positive temporal correlations between population time‐series, but survey congruence varied considerably amongst taxonomic and trait‐based groups. Overall, these results suggest that citizen scientists can be effective sentinels of ecological change, and that there may be substantial value in leveraging their observations to monitor otherwise data‐limited marine species.


INTRODUCTION
Ecological change is unfolding rapidly in many ecosystems (Blowes et al., 2019; Dornelas et al., 2014), yet our understanding of how and why changes are occurring remains highly limited for most regions and taxa on Earth due to a paucity of long-term monitoring data. Although thousands of population monitoring programs have been conducted on various taxa across the world, only a small fraction of species are known to have any time-series of population abundance (McRae et al., 2017; Moussy et al., 2021). Given this, adopting and adapting to novel sources of ecologi[...] concern, however, is that data from decentralized citizen science programs are inherently unstructured and therefore driven by variable human behavior, effort, and skill, which introduce potential biases and stochastic errors in the observation process that could skew our perspective on how biodiversity is changing (Bird et al., 2014; Dobson et al., 2020). Statistical techniques could potentially compensate for this variable observation process using survey metadata (e.g., observer identity, search time) to still yield relatively accurate estimates of population and community change from unstructured data collection (Bird et al., 2014; Isaac et al., 2014; Kelling et al., 2019; Van Strien et al., 2013). If this is indeed the case, then citizen science programs could complement, if not radically transform, our capability to monitor the changing state of Earth's biodiversity.
Direct multispecies comparisons of both population trends and trajectories inferred from citizen science data and structured ecological surveys have now been made for a number of taxa, including birds (Boersch-Supan et al., 2019; Horns et al., 2018; Neate-Clegg et al., 2020; Walker & Taylor, 2017, 2020) and insects (Dennis et al., 2017; Van Strien et al., 2013), with variable success. Though some programs exhibit considerable agreement with reference surveys (Dennis et al., 2017; Van Strien et al., 2013), in other cases the majority of species may show divergent population trends (Neate-Clegg et al., 2020). To date, most of these comparisons have occurred at very broad spatial scales (e.g., national or global), where spatial mismatches in sampling between reference datasets can readily obscure any tests of true comparability. Critically testing this question requires the rare conditions of citizen science and structured multispecies monitoring programs operating concurrently in the same assemblage with fine-scale spatiotemporal overlap relevant to local population dynamics.
The Florida Keys region hosts one of the largest reef complexes in the Americas, and is fairly unique for its long history of biodiversity monitoring through the Reef Visual Census (RVC), a multidecadal fishery-independent structured visual survey of coral-reef fishes that samples the breadth of the Florida Reef Tract to aid in the sustainable management of the multispecies reef fisheries (Ault et al., 1998; Brandt et al., 2009; Smith et al., 2011). The RVC survey provides a long record of the ecological changes that have unfolded as reef ecosystems in the Florida Keys have shifted considerably from climate-driven mass coral bleaching events (Somerfield et al., 2008), disease outbreaks (Palandro et al., 2008), and escalating exploitation pressure on the reef-fish community (Ault et al., 2005, 2014). Within this same region, the Reef Environmental Education Foundation (REEF) also began its Volunteer Fish Survey Project (VFSP) in 1993, which collates species encounters and relative abundance observations from trained citizen scientists during roving dive surveys. The REEF VFSP has now amassed ∼10 million survey records globally (REEF, 2023), representing one of the largest temporal datasets on marine biodiversity in existence. The high degree of spatiotemporal overlap of the REEF VFSP and RVC programs in this region therefore provides a rare opportunity to directly compare opportunistic citizen science data with a rigorously designed structured survey to critically assess the value of aggregated citizen science observations for tracking the population dynamics of a diverse fish assemblage.
Here, we test whether structured and opportunistic surveys yield similar or divergent population trajectories for 87 coral-reef fish species in the Florida Keys, using a multivariate state-space time-series approach that jointly models the observed data for each survey as manifestations of shared underlying population states. Assessing whether ecological data from various sources offer comparable or divergent views of ongoing biodiversity change is fundamental for overcoming the current biodiversity data crisis and for understanding the complex responses of biodiversity to human stressors.

Reef Visual Census (RVC)
The RVC program has been ongoing in the Florida Keys since 1979 (Bohnsack & Bannerot, 1986), first by the National Oceanic and Atmospheric Administration Southeast Fisheries Science Center, and subsequently as a collaborative effort by several government agencies and academic partners (Brandt et al., 2009; Smith et al., 2011). The survey consists of visual counts, and estimates of size distributions, for all reef fishes that are observed by a pair of trained stationary divers in a 15 m diameter circular plot over a 20-min time-frame. Plots are selected through a probabilistic sampling design in a two-stage scheme. The "primary sample units" (PSUs) are 200 × 200 m grid cells defined on a digital map of the bathymetry and benthic habitats of the Florida Keys (Smith et al., 2011). Each year, PSUs are randomly selected for sampling within each reef stratum across the region (e.g., inshore patch reefs, mid-channel patch reefs, offshore patch reefs, and fore reef, at varying depths). The allocation of sampling effort among these strata is optimized based on stratum area within the sampling domain and the variance of density estimates for target species (Smith et al., 2011). Within each PSU in this stratified random sample, a random sample of between 2 and 4 "secondary sample units" (SSUs), consisting of visual counts in the circular plots by stationary diver pairs (Bohnsack & Bannerot, 1986), was conducted.
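The two-stage design described above can be illustrated with a minimal sketch. It assumes a Neyman-style allocation, in which effort per stratum is proportional to stratum area times the standard deviation of density; this is one common way to allocate effort by area and variance, and is not necessarily the exact scheme used by the RVC program. The function names, stratum labels, and PSU identifiers are hypothetical.

```python
import random


def neyman_allocation(n_total, stratum_area, stratum_sd):
    """Split n_total PSUs among strata in proportion to area * standard deviation.

    Assumption: Neyman-style allocation; the RVC program's actual
    optimization may differ.
    """
    weights = {h: stratum_area[h] * stratum_sd[h] for h in stratum_area}
    total = sum(weights.values())
    # Rounding means the allocations may not sum exactly to n_total.
    return {h: max(1, round(n_total * w / total)) for h, w in weights.items()}


def two_stage_sample(psus_by_stratum, allocation, seed=None):
    """Stage 1: draw PSUs at random within each stratum.
    Stage 2: assign 2-4 SSUs (stationary counts) to each selected PSU.

    psus_by_stratum maps a stratum label to a list of PSU identifiers.
    """
    rng = random.Random(seed)
    design = {}
    for stratum, psus in psus_by_stratum.items():
        n_psu = min(allocation[stratum], len(psus))
        for psu in rng.sample(psus, n_psu):
            design[psu] = rng.randint(2, 4)  # number of SSUs in this PSU
    return design
```

For example, with two equally sized strata whose density standard deviations differ threefold, three quarters of the PSUs would go to the more variable stratum under this assumed allocation.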

Reef Environmental Education Foundation Volunteer Fish Survey Project (REEF VFSP)
The REEF VFSP collates data from recreational scuba divers and snorkelers reporting their fish encounters during roving-diver surveys (Pattengill-Semmens & Semmens, 2003). REEF VFSP surveys are distinct from RVC surveys in that observers are mobile and surveys vary in duration and area searched. Sampling effort also varies substantially among sites and habitats (Tables S1 and S2), as it is driven by diver preferences. Rather than reporting direct counts, REEF surveyors score the total number of individuals encountered for each species as one of four rank-abundance categories (1, 2-10, 11-100, or > 100), along with accompanying survey metadata such as diver identity and surveyor experience, dive site location, date, total dive time, visibility, habitat type, and current.
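As a small illustration of how counts map onto these categories, the sketch below bins a hypothetical raw count into an ordered category index. The function name is illustrative, and coding a non-sighting as category 0 is an assumption made here so that the four reported categories sit alongside an explicit absence class.

```python
def reef_abundance_category(count):
    """Map a raw count of individuals to an ordered abundance category.

    0 = species not sighted (assumed coding), 1 = a single individual,
    2 = 2-10, 3 = 11-100, 4 = more than 100.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    if count == 1:
        return 1
    if count <= 10:
        return 2
    if count <= 100:
        return 3
    return 4
```

Because the categories are wide and unevenly spaced, very different counts (e.g., 12 and 95) collapse to the same score, which is why an ordinal rather than a count model suits these data.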

Merging data sources
We intersected the REEF and RVC datasets in space and time, including observations recorded in both surveys from 1993 to 2018. We retained REEF surveys from dive sites that overlapped with the benthic habitat map and sampling grid of the RVC program (Smith et al., 2011), and from this we retained all REEF dive sites and RVC PSUs that fell within the bounds of the Key Largo subregion, which had the greatest overlap in sampling throughout the Florida Keys sampling domain (Figure S1). Each REEF dive site was also assigned a benthic habitat class and reef stratum based on its spatial location. We only included REEF surveys from sites and divers with at least 10 and 5 observations, respectively, throughout the period. In total, we identified 3368 RVC surveys conducted at 2434 sampling sites and 9512 REEF surveys at 60 dive sites in Key Largo from 1993 to 2018. We used these datasets to construct population time-series for species that were sighted in at least 70% of years in both surveys, had a total sighting frequency over 1%, and were reef-associated and not commonly misidentified. In total, there were 87 species from a range of taxonomic groups that met these criteria (Table S3).
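The effort and sighting thresholds above can be sketched as two simple filters. This is a minimal sketch under stated assumptions: the record layout (dictionaries with `site`, `diver`, `year`, and a `species` set) and the function names are hypothetical, and site and diver counts are tallied once over the unfiltered data rather than iteratively.

```python
from collections import Counter


def filter_surveys(surveys, min_site_obs=10, min_diver_obs=5):
    """Keep surveys from sites with >= min_site_obs surveys and from
    divers with >= min_diver_obs surveys (single-pass tally, an assumption)."""
    site_n = Counter(s["site"] for s in surveys)
    diver_n = Counter(s["diver"] for s in surveys)
    return [
        s for s in surveys
        if site_n[s["site"]] >= min_site_obs and diver_n[s["diver"]] >= min_diver_obs
    ]


def species_passes(surveys, species, min_year_frac=0.70, min_sight_freq=0.01):
    """Check the two sighting criteria for one species within one survey program."""
    if not surveys:
        return False
    years = {s["year"] for s in surveys}
    sighted = [s for s in surveys if species in s["species"]]
    years_seen = {s["year"] for s in sighted}
    year_frac = len(years_seen) / len(years)      # share of years with a sighting
    sight_freq = len(sighted) / len(surveys)      # share of surveys with a sighting
    return year_frac >= min_year_frac and sight_freq > min_sight_freq
```

In the analysis a species would need to pass `species_passes` in both the REEF and RVC datasets; the additional criteria (reef-associated, not commonly misidentified) require expert judgment and are not represented here.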

Multivariate state-space models
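To assess whether the RVC and REEF reef-fish surveys yield comparable population insights over time, we used a multivariate state-space model that estimates underlying population trajectories based on the observations from both surveys (Holmes et al., 2012; Ward et al., 2010). The state-space approach to population time-series assumes that the true population abundance (on a log$_e$ scale; $x_t = \log(N_t)$) changes from year to year according to a first-order autoregressive process with annual log-normal deviations ($w_t$) driven by demographic processes (i.e., process variance, $\sigma_w^2$):
$$x_t = x_{t-1} + w_t, \quad w_t \sim \mathrm{N}(0, \sigma_w^2)$$
The annual abundance estimate that we actually observe in a survey ($\alpha_t$) emerges from the true unobserved population state in a given year ($x_t$) with additional stochastic white noise that represents the sampling variance in these annual estimates ($\sigma_v^2$):
$$\alpha_t = x_t + v_t, \quad v_t \sim \mathrm{N}(0, \sigma_v^2)$$
Extending this state-space approach to two surveys, each with their own time-series of relative abundance ($\alpha_{\mathrm{RVC},t}$ and $\alpha_{\mathrm{REEF},t}$), we can directly estimate the degree of interannual synchrony between their latent population states ($x_{\mathrm{RVC},t}$ and $x_{\mathrm{REEF},t}$) if we treat their true population deviations ($w_{\mathrm{RVC},t}$ and $w_{\mathrm{REEF},t}$) as jointly arising from a correlated stochastic process:
$$\begin{bmatrix} x_{\mathrm{RVC},t} \\ x_{\mathrm{REEF},t} \end{bmatrix} = \begin{bmatrix} x_{\mathrm{RVC},t-1} \\ x_{\mathrm{REEF},t-1} \end{bmatrix} + \begin{bmatrix} w_{\mathrm{RVC},t} \\ w_{\mathrm{REEF},t} \end{bmatrix}, \quad \begin{bmatrix} w_{\mathrm{RVC},t} \\ w_{\mathrm{REEF},t} \end{bmatrix} \sim \mathrm{MVN}(\mathbf{0}, \Sigma), \quad \Sigma = \begin{bmatrix} \sigma_{w,\mathrm{RVC}}^2 & \rho\,\sigma_{w,\mathrm{RVC}}\,\sigma_{w,\mathrm{REEF}} \\ \rho\,\sigma_{w,\mathrm{RVC}}\,\sigma_{w,\mathrm{REEF}} & \sigma_{w,\mathrm{REEF}}^2 \end{bmatrix}$$
From the variance-covariance matrix ($\Sigma$) describing the within-survey variances ($\sigma_{w,\mathrm{RVC}}^2$ and $\sigma_{w,\mathrm{REEF}}^2$) and their covariance ($\rho\,\sigma_{w,\mathrm{RVC}}\,\sigma_{w,\mathrm{REEF}}$), we can directly estimate the degree of interannual synchrony ($\rho$) in abundance fluctuations ($w_{\mathrm{RVC},t}$ and $w_{\mathrm{REEF},t}$) in the population states from each survey. We then assume the observed time-varying year effects in each survey (i.e., $\alpha_{\mathrm{RVC},t}$ and $\alpha_{\mathrm{REEF},t}$) arise from the population states with independent measurement errors in each survey ($v_{\mathrm{RVC},t}$ and $v_{\mathrm{REEF},t}$):
$$\alpha_{\mathrm{RVC},t} = x_{\mathrm{RVC},t} + v_{\mathrm{RVC},t}, \quad v_{\mathrm{RVC},t} \sim \mathrm{N}(0, \sigma_{v,\mathrm{RVC}}^2)$$
$$\alpha_{\mathrm{REEF},t} = x_{\mathrm{REEF},t} + v_{\mathrm{REEF},t}, \quad v_{\mathrm{REEF},t} \sim \mathrm{N}(0, \sigma_{v,\mathrm{REEF}}^2)$$
For our analysis, we estimate the mean abundance through time ($\alpha_t$; t = 1993, ..., 2018) in each survey directly from diver observations in each dataset, but also condition these abundance estimates on multiple other survey-level factors that we expect to influence the true abundance (e.g., site or habitat effects) or the observations of a species' abundance (e.g., diver identity or dive length) on a given survey. These modeled factors additionally account for the uneven sampling that occurs over space, environments, and time within and among years when estimating annual expected abundance in both surveys.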
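To make the role of the latent states concrete, the following minimal sketch simulates two survey time-series whose underlying states share correlated process deviations. It assumes a random walk for the states (a first-order autoregressive process with the coefficient fixed at 1) and Gaussian errors on the log scale; the function names and default standard deviations are illustrative choices, not values from the fitted models.

```python
import math
import random


def simulate_two_surveys(n_years, rho, sd_proc=(0.2, 0.2), sd_obs=(0.1, 0.1), seed=1):
    """Simulate latent log-abundance states for two surveys with process
    deviations correlated at rho, plus independent measurement error."""
    rng = random.Random(seed)
    x1, x2 = [0.0], [0.0]
    for _ in range(n_years - 1):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        w1 = sd_proc[0] * z1
        # Second deviation built so that corr(w1, w2) = rho.
        w2 = sd_proc[1] * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        x1.append(x1[-1] + w1)
        x2.append(x2[-1] + w2)
    # Observed annual estimates = latent state + independent measurement error.
    a1 = [x + rng.gauss(0, sd_obs[0]) for x in x1]
    a2 = [x + rng.gauss(0, sd_obs[1]) for x in x2]
    return x1, x2, a1, a2


def increments(series):
    """Year-to-year changes in a series."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)
```

In simulations of this kind, the correlation between year-to-year changes in the observed series is attenuated toward zero relative to the correlation between the latent states, because measurement error adds independent noise to each survey. This is the motivation for estimating ρ on the latent states rather than on the raw annual estimates.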
For the RVC surveys, we conditioned annual abundance estimates on the spatial location and environmental features of each stationary count survey. The spatial clustering of multiple SSUs within a broader PSU grid cell is captured with a PSU varying effect, $a_{psu}$, with similar varying terms for persistent abundance differences among benthic habitat types, $a_h$, reef strata, $a_{st}$, and the month of sampling in a given year, $a_{my}$. We also included a quadratic effect for depth of the survey (in covariate matrix, $X$). Counts in each SSU were reported as averages between the diver pairs; to treat these as true counts that allow for overdispersion, we rounded each noninteger value upward to its nearest whole value. We model these counts ($n_i$) from a log-link negative binomial distribution with a quadratic parameterization using the inverse shape parameter ($\phi$) for overdispersion:
$$n_i \sim \mathrm{NegBinomial}(\mu_i, \phi), \quad \log(\mu_i) = \alpha_t + a_{psu} + a_h + a_{st} + a_{my} + X_i\beta$$
For the REEF surveys, observations of a species ($y_i$) are reported as one of K = 5 ordered abundance categories (Y*: 0, 1, 2-10, 11-100, or > 100), with the probability of observing a particular category on a given survey i emerging from the latent (unobserved) abundance ($\lambda_i$). The latent abundance ($\lambda_i$) for a given survey i will depend on a variety of mediating factors that we expect to shape the true abundance or the observation process on a survey beyond just the population abundance in a given year t ($\alpha_t$). We included varying effect terms for persistent abundance differences among survey sites ($a_s$), site benthic habitat classes ($a_h$) and reef strata ($a_{st}$), and calendar months ($a_m$). We also include terms that we expect to shape a diver's observations, including diver identity ($a_d$), whether that survey occurred with other divers at the same sites on the same day ($a_{dc}$) or month of a given year ($a_{mc}$), and covariates estimated for each survey (in design matrix $X$), including total dive time (in minutes), current during that dive (ranked qualitatively on a 3-point scale), water visibility (ranked qualitatively on a 7-point scale), the average depth (ranked qualitatively on a 14-point scale; and depth squared), and whether the observing diver was of "expert" (divers with a minimum of 35 surveys in the region and a 90% success rate in identifying 100 species from that region) or "novice" experience. We modeled the probability of observing each abundance category for a species on a dive survey as an ordinal logistic regression, where cutpoints ($c_1, \ldots, c_{K-1}$), estimated with an induced Dirichlet prior (see Appendix), categorize the latent abundance into probability intervals based on a set of linear predictors for $\lambda_i$:
$$\lambda_i = \alpha_t + a_s + a_h + a_{st} + a_m + a_d + a_{dc} + a_{mc} + X_i\beta$$
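The way cutpoints turn a latent predictor into category probabilities can be sketched as follows. This is a minimal sketch assuming the standard ordered-logit form, in which the cumulative probability of category k or lower is the inverse logit of the k-th cutpoint minus the latent predictor; the function names and the example cutpoint values are illustrative only and are not estimates from the fitted models.

```python
import math


def inv_logit(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probs(latent, cutpoints):
    """Category probabilities P(Y = k), k = 0..K-1, under an ordered-logit model.

    latent    : linear predictor for one survey (latent abundance scale)
    cutpoints : K-1 strictly increasing thresholds
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities P(Y <= k); the last category closes at 1.
    cdf = [inv_logit(c - latent) for c in cutpoints] + [1.0]
    probs, prev = [], 0.0
    for p in cdf:
        probs.append(p - prev)
        prev = p
    return probs
```

With K = 5 categories there are four cutpoints; raising the latent predictor (for example, through a higher year effect or a longer dive) shifts probability mass toward the higher abundance categories while the probabilities still sum to one.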
We estimated the parameters of this joint model using Stan (Carpenter et al., 2017), as implemented in cmdstanr (Gabry & Češnovar, 2022). The full model, prior specifications, and diagnostic criteria are included in the Appendix. Each model was run for 1000 iterations across 6 chains, with the first 200 samples of each chain discarded as burn-in, for a total of 4800 posterior estimates.

Post hoc analyses
We used the posterior distribution of the correlation coefficient in abundance fluctuations between surveys (ρ) as a direct estimate of population time-series congruence for each species. We examined the posterior estimates of ρ among all species, and compared group-level differences in terms of taxonomy (Family, for n > 3; Table S3), coloration ("drab" or "colorful"), behavior ("conspicuous" or "cryptic"), aggregation ("solitary," "shoaling," or "schooling"), and body size (in 3 bins, based on the upper/lower 20% of our sample: "small" (< 16.4 cm), "medium" (16.4-72.4 cm), or "large" (> 72.4 cm)). We also examined the linear correlation among species' estimates of ρ and their mean abundance in each survey. We interpret the strength of evidence for differences directly based on the summary statistics of the aggregated posterior parameter estimates and their credible intervals (CI).
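As an illustration of the posterior summaries used here, the sketch below computes a median and credible intervals from a set of posterior draws of ρ. It assumes equal-tailed (quantile-based) intervals with linear interpolation between order statistics; whether the original analysis used equal-tailed or highest-density intervals is not stated here, and the function names are hypothetical. The posterior mode is omitted because it requires a density estimate.

```python
import math


def quantile(sorted_vals, q):
    """Quantile of pre-sorted values using linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def summarize_posterior(draws, levels=(0.80, 0.90)):
    """Median and equal-tailed credible intervals for posterior draws."""
    s = sorted(draws)
    summary = {"median": quantile(s, 0.5)}
    for level in levels:
        tail = (1 - level) / 2
        key = "ci{}".format(round(level * 100))
        summary[key] = (quantile(s, tail), quantile(s, 1 - tail))
    return summary
```

A group-level difference (e.g., Δρ between solitary and schooling species) could be summarized in the same way by passing the draw-by-draw differences to `summarize_posterior` and checking whether the interval excludes zero.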
To determine whether our inferences on survey agreement would be robust under lower REEF sampling effort in the Key Largo region, we also implemented a sensitivity test in which we randomly omitted 75% of all REEF observations in each year. We repeated this process for 20 random subsamples for the 9 species with the highest median estimates of ρ to test for overall changes in the posterior estimates of ρ.
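The subsampling step can be sketched as a draw without replacement within each year. This is a minimal sketch under stated assumptions: the record layout (a `year` key per survey) and the function name are hypothetical, and retaining at least one survey per year is an assumption made so that no year drops out of the time-series.

```python
import random
from collections import defaultdict


def subsample_by_year(surveys, keep_frac=0.25, seed=None):
    """Randomly retain keep_frac of surveys within each year
    (i.e., omit 75% when keep_frac = 0.25)."""
    rng = random.Random(seed)
    by_year = defaultdict(list)
    for survey in surveys:
        by_year[survey["year"]].append(survey)
    kept = []
    for year in sorted(by_year):
        group = by_year[year]
        # Assumption: always keep at least one survey per year.
        n_keep = max(1, round(keep_frac * len(group)))
        kept.extend(rng.sample(group, n_keep))
    return kept


def make_replicates(surveys, n_reps=20, keep_frac=0.25):
    """Build independent subsampled datasets, one per replicate seed."""
    return [subsample_by_year(surveys, keep_frac, seed=i) for i in range(n_reps)]
```

Each replicate would then be refit with the same model, and the resulting posterior distributions of ρ compared against the estimate from the full dataset.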

RESULTS
The temporal correlation between population time-series estimated from RVC and REEF datasets varied considerably among species (Figure 1a). There was evidence for highly complementary population states in many species (e.g., Figure 2a-c), while others showed uncorrelated or even divergent trajectories (e.g., Figure 2d). Median temporal correlations between surveys ranged from −0.55 for Goldspot Goby (Gnatholepis thompsoni) to 0.93 for Brown Chromis (Chromis multilineata), with 62% of species (54/87) having a median estimate of ρ > 0.25 (Figure 1a) and 75% (64/87) with a posterior mode ρ > 0.25 (Figure 1a). Species' mean abundances throughout the sampling period were also broadly correlated (r = 0.67, 95% CI: 0.57 to 0.79) between surveys (Figure 3), suggesting they are also recovering similar assemblage-wide patterns of abundance. There were considerable differences in the mean temporal correlation between surveys across taxonomic groups (Figure 1b). [...] among species from other groups (Figure 1b). These taxonomic differences appear to be shaped in part by certain species' traits (Figure 4). There was some evidence that small to medium-sized species had lower estimates of ρ relative to the largest species in our sample (Δρ = −0.11 [80% CI: −0.28 to 0.06] and −0.12 [80% CI: −0.27 to 0.02], respectively). Species that tend to be solitary had higher agreement than those that gather in schools (Δρ = 0.24 [80% CI: 0.04 to 0.43]) and slightly higher than shoals (Δρ = 0.10 [80% CI: −0.05 to 0.25]). Colorful species had slightly lower agreement compared to drab species (Δρ = −0.09 [80% CI: −0.21 to 0.03]), while conspicuous and cryptic species exhibited no difference (Δρ = 0.06 [80% CI: −0.13 to 0.24]). We found no evidence for a relationship between congruence and mean abundance (Figure S2). For the 9 species with the highest posterior estimates of ρ, we found that iteratively removing 75% of observations in the REEF dataset by year resulted in less certain but otherwise unchanged posterior modes in almost every case (Figures S3 and S4), indicating that population indices built on far less data would still show similar levels of congruence, but with greater uncertainty.

DISCUSSION
Overall, we found strong agreement in abundance time-series estimated from opportunistic citizen science data when compared to concurrent structured surveys for the majority, but not all, of the coral-reef fish species we examined. Species with higher agreement between datasets tended to be larger-bodied and solitary, but taxonomic differences in survey congruence were far greater than any individual species trait we examined. This taxonomic structuring in agreement suggests that there are likely shared ecological or behavioral characteristics of clades that we did not examine influencing their enumeration in stationary fixed-area counts in RVC versus roving counts by REEF divers. The RVC sampling protocol was principally designed to quantify the abundance and biomass of the snappers (Lutjanidae) and groupers (Epinephelidae) that are central targets of the Florida Keys fisheries (Ault et al., 1998, 2005), but the accuracy of these methods for enumerating other incidentally sampled fish species is likely to vary widely. It is therefore promising that survey congruence was particularly high amongst the snappers and groupers, where we expect the comparison between surveys to be the most appropriate.
After substantially reducing sampling effort in the citizen science dataset, we found that estimates of temporal correlation were largely unchanged relative to the full dataset for species with high agreement (Figures S3 and S4). This suggests that even in regions with substantially less survey effort than the Florida Keys we may expect population insights from citizen scientists to still be robust. Disagreement in population trajectories may also emerge in part from the spatial distribution of sampling in each survey. REEF surveys tend to be highly aggregated toward high-relief spur-and-groove reef sites in the Key Largo region (Tables S1 and S2), which are popular with divers due to the overall abundance and diversity of fishes. The abundance fluctuations observed at these particular sites may or may not be representative of the broader population context, which likely depends on species' habitat preferences and distribution throughout the reef complex. Though this spatial aggregation of effort in the citizen science dataset poses clear biases with spatial representation, the repeated sampling of the same sites also provides an advantage for disentangling persistent spatial differences in abundance from potential temporal fluctuations. The Florida Reef Tract exhibits considerable fine-scale spatial variation in coral composition and cover (Somerfield et al., 2008), and many reef-associated fishes exhibit strong site fidelity and fine-scale habitat preferences (Sale, 1991). This would suggest a high heterogeneity in abundance across sites, and indeed spatial components of variance in abundance are far greater than temporal variance in both surveys (Figures S5 and S8). That most species showed some level of agreement between surveys despite these large differences in spatial sampling representation suggests that at some level these high-value habitats must be important bellwethers for the overall population and that citizen science programs are especially suited to monitoring economically and ecologically important sites within coastal regions.
Overall, our results suggest that opportunistic citizen science can be a valuable source for ecological insights into changing coastal ecosystems and that these data have the potential to strengthen our ability to manage biodiversity in the face of human impacts. Estimates of changes in relative abundance are vital for evidence-based management and developing effective policy to safeguard populations and biodiversity generally. Exploited marine species that have data-rich stock assessments are more likely to be sustainably managed relative to their data-deficient counterparts (Hilborn et al., 2020). Yet the availability of species-specific data to enable these assessments is generally limited to only large commercial stocks (Ovando et al., 2021b), while the vast majority of smaller-scale fisheries frequently have limited data available to inform management (Carruthers et al., 2014; Costello et al., 2012). Fisheries-independent monitoring data are often a crucial supplement to catch estimates for assessing stock status (Dennis et al., 2015; Ovando et al., 2021a), but these data require significant time, financial investment, and effort to obtain and are unlikely to be cost-effective for many exploited species. In this regard, marine citizen science programs could be a useful complement to or surrogate for independent monitoring for the many data-limited fisheries that provide a vital source of food security for coastal populations around the world.
Taken together, our findings demonstrate that citizen scientists can be effective ecological sentinels in capturing the population dynamics of fish in coastal ecosystems when analytical frameworks are designed to account for lurking biases in the data that emerge from collective opportunistic sampling. The coming decades are poised to bring about substantial changes to practically all of Earth's ecosystems, yet only a limited few are currently being broadly monitored. The expansion and adoption of citizen science programs holds considerable potential to expand biodiversity monitoring on a global scale and support local management efforts in the face of ongoing environmental change.

ACKNOWLEDGMENTS
This study would not have been possible without the numerous citizen scientists who have contributed their observations to the REEF Volunteer Fish Survey Project for nearly three decades. Funding to REEF in support of the VFSP by the Lenstra Fund, the Paul M. Angell Family Foundation, The Henry Foundation, and The Curtis and Edith Munson Foundation has made this dataset and research possible. Thanks to Jeremiah Blondeau, Jay Grove, and Jerald Ault for their discussions and advice on using the Reef Visual Census. Thanks to Christopher Brown, Leo Polansky, and two anonymous reviewers for their helpful feedback on an earlier draft of this manuscript.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
All data and code for the presented analysis are included in the GitHub repository: https://github.com/dagreenberg/reef_timseries_comp.

FIGURE 1 (A) Species-level estimates of the temporal correlation of abundance change among surveys (ρ), colored by taxonomic affiliation in B, indicating the posterior median (circle), mode (triangle), 80% (thick line) and 90% (thin line) credible intervals. (B) Posterior distributions of the mean correlation estimate in year-to-year population fluctuations between surveys (ρ) for each major taxonomic group (Families or sub-Families) examined. Points indicate the median and bars represent the 80/90% credible intervals of the distribution.
FIGURE 2 Examples of species' relative abundance time-series estimated from structured RVC surveys (in blue) and opportunistic REEF citizen scientist surveys (in red). Solid lines and points represent median estimates of the annual expected abundance (α_t) with both process and measurement deviance, while dashed lines indicate the model-estimated population state (x_t) with only process deviance. Shaded areas represent 90% Bayesian credible intervals for the latent population states. Time-series plots for all species comparisons are available in the Appendix.
FIGURE 3 Comparison of species' geometric mean relative abundance (on a log10 scale) across all years between surveys. Estimates of species' average abundance from structured Reef Visual Census (RVC) surveys (x-axis) and opportunistic citizen science surveys (y-axis) collated by the Reef Environmental Education Foundation (REEF) in Key Largo, Florida, USA. Each observation is colored by the posterior mode estimate of the temporal correlation between surveys (ρ) with tails representing 90% credible intervals. The diagonal indicates the 1:1 line. The median correlation (r) in species' mean abundance across the posterior is reported with 95% credible intervals in the parentheses.
FIGURE 4 Posterior distributions of the mean correlation in year-to-year population fluctuations between surveys (ρ) among species with different traits (in order from top): species that are either colorful or drab; species that tend to be conspicuous or more cryptic in their behavior; species that tend to be solitary, may form temporary small groups, or form coordinated schools; and size-based categories indicating small (< 16.4 cm total length, smallest 20% of species sample), medium (16.4 to 72.4 cm; middle 60%), or large species (> 72.4 cm; largest 20%). Points indicate the median of the mean parameter estimates and tails indicate the 80/90% credible intervals.