Integrating ocean observations across body-size classes to deliver benthic invertebrate abundance and distribution information


metadata standards such as Darwin Core, and (5) making data available through internationally recognized access points. These practices enable broader-scale analysis supporting research and sustainable development, such as assessments of indicator taxa, biodiversity, biomass, and the modeling of carbon stocks and flows that are contiguous over time and space.
The need to understand and manage risks around changes in benthic invertebrate abundance, distribution, and biodiversity is increasing in relation to natural variation, climate change, and growing ocean industrial activities. This is further driving efforts to better harmonize the collection of ocean information. Expansions in the types and scope of human impacts are being mitigated in part through the designation and management of marine spaces, including areas defined for specific industrial use, and marine protected areas (MPAs) to improve living-resource sustainability. For example, the Post-2020 Global Biodiversity Framework of the United Nations Convention on Biological Diversity (CBD) has proposed a target to protect 30% of the land and ocean areas by 2030.
Biology and ecosystem variables are at the core of regulations that mandate the management of these areas (Miloslavich et al. 2018; Muller-Karger et al. 2018), as well as many facets of research and applied science to inform the United Nations Decade for Ocean Science for Sustainable Development. However, several issues have limited the capacity to scale seafloor biological and ecosystem information beyond individual studies through to broad-scale, gridded, spatiotemporally relevant ecological assessments. For example, how can datasets that sample differing parts of the body-size spectrum, or that use different sampling and observing methods, be used in integrated assessments and modeling to advance our understanding of seafloor biomass, energy flow, and related macroecological variables?
There has long been debate in benthic ecology about what organism size-class ranges should be measured, what the best ways are to measure and sample those classes, and how these best address questions about ecological function or various policy and management needs (Moore and Bett 1989; Danovaro et al. 2020, 2021; Ingels et al. 2021). The Global Ocean Observing System (GOOS) has set out the Essential Ocean Variable (EOV) concept to identify variables that have both high feasibility and impact (Tanhua et al. 2019). Other related concepts include the Group on Earth Observations Biodiversity Observing Network (GEO BON) Essential Biodiversity Variables (EBVs), which further outline variables of global interest (Miloslavich et al. 2018; Muller-Karger et al. 2018). EOVs also have supporting variables that enable specific uses, such as the need for temperature information to understand changes in hard coral cover or metabolism. EOVs can also have derived subvariables, such as transforming the biomass of a benthic community into estimates of carbon stocks and flows. What we outline here is not the de facto or exclusive constitution of this EOV, but rather part of a body of work that is generally inclusive of EOV concepts that meet GOOS standards for feasibility and impact in the context of its Framework for Ocean Observing (Tanhua et al. 2019). In this contribution, we consider benthic invertebrate abundance, as may be quantified in terms of numerical density, or biomass density, per unit area of seafloor, and/or for some defined fraction of the fauna (e.g., taxonomically and/or by body size). What we outline is not exclusive of other EOV delivery concepts, such as areal cover and the use of environmental DNA to assess distribution.
Size-based information can be used to improve the value of individual datasets, both in understanding what has been sampled well and in informing on parts of the size spectrum that are not sampled. Here, we show how bringing together several benthic invertebrate data collection and processing steps facilitates the generation of coherent and robust estimates of abundance, and derived variables including the stock and flow of carbon and the determination of secondary production, sensu a GOOS EOV for benthic invertebrate abundance and distribution. This harnesses existing capability and capacity to more effectively deliver size-specific biomass data for specific locations and times, as well as model estimations over gridded areas in hindcasts, nowcasts, and forecasts. These outputs can then be processed into indicator and scorecard information in the context of understanding the status and trends of key variables needed to manage industry use of marine spaces, MPAs, and other applications including fundamental research.

Size matters
It is a truism that the smallest organisms in any ecological unit are, in relative terms, extremely numerous and that the very largest specimens are extremely rare. Relationships between body size and abundance appear to follow power laws in size spectra (Fig. 1; Table 1; Mohr 1940; Sheldon et al. 1972; Damuth 1981; Brown et al. 2004; Bett 2013; Kelly-Gerreyn et al. 2014; Benoist 2020; Marchais et al. 2020). Consequently, the apparent numerical abundance of any invertebrate assemblage is critically dependent on the body size of the smallest entities included in the count. Similarly, the apparent biomass abundance of the assemblage may be very substantially impacted by the largest entities included in the measurement. Typically, the size of the largest entity encountered is highly dependent on the extent of a particular sample, whether that is measured as the number of specimens censused or the total seafloor time and space domain examined (Sanders 1960).

Formalizing
Power-law distributions of animal body size have a long history of application in ecological studies of abundance: terrestrial mammals (Mohr 1940; Damuth 1981), pelagic ecosystems (Sheldon et al. 1972; Marchais et al. 2020), benthic ecosystems (Bett 2013; Kelly-Gerreyn et al. 2014), and have been generalized in the Metabolic Theory of Ecology (Brown et al. 2004). The nomenclature and means of describing these power-law distributions have been somewhat inconsistent (Vidondo et al. 1997; Edwards et al. 2017), but can be unified by considering the power-law exponents: $\text{abundance} \propto (\text{body mass})^{\alpha}$, where $\alpha = -1.75$ in the case of Damuth's rule (White et al. 2007), and $\alpha = -2$ in the case of the Sheldon spectrum (Blanchard et al. 2017; Fig. 1; Table 1). Inconsistencies also arise depending on whether the nomenclature refers to the exponent of a continuous distribution, or to the slope of body-size spectra constructed with logarithmic classes. The Metabolic Theory of Ecology and related macroecological patterns have limited links to the structure and function of assemblages in terms of the numerical density, to their biomass and the flux of energy and mass through those assemblages, and other variables including species richness (Brown et al. 2004; Marchais et al. 2020). While particular taxonomic groups can exhibit systematic differences in their individual metabolic rates (Hughes et al. 2011), there is a clear central tendency toward a 0.75-power mass scaling of metabolism at the macroecological level (Brey 2010).
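The distinction between the exponent of a continuous distribution and the slope of a logarithmically binned spectrum can be made explicit with a short derivation. The following is a sketch only, assuming an idealized continuous number density $n(M) = kM^{\alpha}$ and size classes $[M_j, cM_j)$ of constant ratio $c > 1$ (e.g., $c = 2$ for doubling classes); the symbols $n$, $k$, $c$, $N_j$, and $B_j$ are introduced here for illustration and are not taken from the cited works.

```latex
% Numerical abundance in logarithmic class j (valid for \alpha \neq -1)
N_j = \int_{M_j}^{cM_j} k M^{\alpha}\, dM
    = \frac{k\,(c^{\alpha+1}-1)}{\alpha+1}\, M_j^{\alpha+1}
    \;\propto\; M_j^{\alpha+1}

% Biomass in logarithmic class j (valid for \alpha \neq -2)
B_j = \int_{M_j}^{cM_j} k M^{\alpha+1}\, dM
    = \frac{k\,(c^{\alpha+2}-1)}{\alpha+2}\, M_j^{\alpha+2}
    \;\propto\; M_j^{\alpha+2}

% Special case \alpha = -2: biomass is equal in every logarithmic class
B_j = \int_{M_j}^{cM_j} k M^{-1}\, dM = k \ln c
```

Under these assumptions, a continuous exponent of $\alpha = -1.75$ appears as a slope of $-0.75$ when numerical abundance is tallied in logarithmic classes, and $\alpha = -2$ corresponds to a flat biomass spectrum across logarithmic classes, which is one reason why the same pattern can be reported with different exponents.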

Limits
These relationships appear to be robust in benthic invertebrates across large ranges of body mass, taxonomies, and environments (Woolley et al. 2016; Górska et al. 2020; Marchais et al. 2020; Mazurkiewicz et al. 2020). These relationships nonetheless have limitations and are often less robust for smaller subsections of the body-size spectrum. While deviations from these relationships have been reported, some of these departures may be related to sampling artifacts and/or data analysis methods, such as examining small sections of the size spectra or sampling too few individuals (Bett 2013; Bett 2014; Edwards et al. 2017, 2020). Deviations from such standard scaling can also be informative about other aspects of life history at specific levels of taxonomy, spatiotemporal scales, and niche dynamics (McClain et al. 2020). For example, some species have clear competitive advantages in certain locations and/or times, or body composition such as calcareous tissue that relates to shifts in the relationships between mass, carbon content, and metabolism.

Utility
While the underlying mechanisms behind these relationships are the subject of debate, their usefulness in quantitative ecology is commensurate with their apparent ubiquity in nature. These relationships provide an opportunity to (1) assess the effectiveness of sampling, and potentially to (2) infer or impute un- or undersampled components of the body mass spectrum (Fig. 1). These relationships run across the meio-, macro-, and megabenthos size classes that are often used to compartmentalize benthic invertebrate communities (Wei et al. 2010; Danovaro et al. 2020). There are, however, detailed considerations with respect to various sampling issues that arise from different life cycles and field techniques, as well as objective issues concerning defining the limits of the "well-sampled" region of the size spectrum; and when inferring a result for the total system, what the true range of individual body masses might be.

Fig. 1. (A) … 1), bounded at a maximum body mass of c. 4 kg (as observed in field samples). Macrobenthos are sampled with a 250 μm (0.25 mm) sieve mesh, producing a lower bound body mass of c. 9 μg; megabenthos with a 32 mm mesh net, producing a lower bound body mass of c. 18 g. In each case, three random samples are drawn of 150 (green), 200 (red), and 250 (blue) individuals to mimic natural variability in numerical density (see Bett et al. 2023 for additional details). (B) Field sample data from the Porcupine Abyssal Plain Sustained Observatory (PAP-SO; see Hartman et al. 2021). Macrobenthos determined from core samples sieved on a 250 μm sieve mesh collected in 2 yr (red, green; Benoist 2020); megabenthos determined from c. 65,000 seafloor photographs, divided to central (red) and northern (green) abyssal plain and abyssal hill (blue; see Benoist 2020; Durden et al. 2020b). (C) Inset, comparative data for three megabenthos trawl catches (red, green, blue) from the PAP-SO (1989 samples; see Billett et al. 2001). Note that the point of inflection between "well" sampled and "under" sampled in the field data is indicated with an arrowhead, that the un- or undersampled body mass range is indicated by fine dashed lines, and that large rarities (LR) were encountered in both the randomly sampled and field sampled (isopod, 118 mg) macrobenthos (see Sanders 1960). Data and metadata are available at https://doi.org/10.5281/zenodo.7725189.

Table 1. Parameters for describing the relationship $\text{abundance} \propto (\text{body mass})^{\alpha}$ in Damuth's rule (White et al. 2007) …
A major issue with interpreting biomass data is that the presence or absence of large rare individuals has a very significant impact on the estimation of total biomass density (Sanders 1960). For power-law body-size distributions where $\alpha \geq -2$, the mean and variance of estimated biomass density are likely to increase with the number of samples or specimens examined (Newman 2005); that is, the apparent biomass density of the assemblage under study will systematically increase until the largest individuals in the system are well censused.
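The sensitivity of biomass estimates to large rare individuals can be explored numerically. The sketch below is illustrative only: it assumes a bounded continuous power law with probability density proportional to mass raised to $\alpha$, uses inverse-transform sampling, and borrows bounds of 9 μg and 4 kg (expressed in grams) from the Fig. 1 caption purely as example values. The function names and the choice of sampling efforts are hypothetical.

```python
import random


def sample_body_mass(rng, alpha, m_min, m_max):
    """Draw one body mass from a bounded power law (pdf proportional to m**alpha).

    Uses inverse-transform sampling; assumes alpha != -1.
    """
    k = alpha + 1.0
    u = rng.random()
    return (m_min ** k + u * (m_max ** k - m_min ** k)) ** (1.0 / k)


def true_mean_mass(alpha, m_min, m_max):
    """Analytic mean individual mass of the bounded power law.

    Assumes alpha is neither -1 nor -2.
    """
    k1, k2 = alpha + 1.0, alpha + 2.0
    return (k1 / k2) * (m_max ** k2 - m_min ** k2) / (m_max ** k1 - m_min ** k1)


def mean_mass_by_effort(alpha=-1.75, m_min=9e-6, m_max=4000.0,
                        efforts=(100, 1000, 10000, 100000), seed=1):
    """Mean individual mass (g) after censusing the first n specimens.

    All defaults are illustrative assumptions, not values from a real survey.
    """
    rng = random.Random(seed)
    masses = [sample_body_mass(rng, alpha, m_min, m_max)
              for _ in range(max(efforts))]
    return {n: sum(masses[:n]) / n for n in efforts}


if __name__ == "__main__":
    print("analytic mean (g):", true_mean_mass(-1.75, 9e-6, 4000.0))
    for n, mean in mean_mass_by_effort().items():
        print(n, "specimens -> sample mean (g):", mean)
```

Comparing the sample means at low effort with the analytic mean gives a feel for how strongly the estimate depends on whether the rare, large individuals happen to have been encountered; results will vary with the seed and bounds chosen.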
The relationships entailed in power laws, such as those of the Metabolic Theory of Ecology, provide predictable macroecological relationships that offer a robust quantitative framework to facilitate integrating data from various methods.Such allometric relationships allow for the formation and tuning of body mass-based models that can account for the slope(s), range, and y-intercepts of each respective equation, as well as variable resources and temperature.
What follows is a step-by-step illustration of the capabilities for, and value of, using body size to facilitate integrating data from different body size classes (meio- to megabenthos) and sampling methods (physical sediment core samples and seafloor imagery), and using size-spectra data to support the formation and evolution of biological and ecosystem analyses. This full data lifecycle and value chain approach provides a means for improving the quality, consistency, and amount of benthic invertebrate abundance and distribution data and its onward use in modeling and information product development.

Global seafloor invertebrate life
Benthic invertebrates are found across the world's seafloor areas from the chill of the Arctic to the warmth of the Red Sea, and from the coasts to the greatest depths of the Challenger Deep. These organisms play important roles in the carbon cycle and nutrient regeneration (Muller-Karger et al. 2005; Ruhl et al. 2008; Snelgrove et al. 2018; Priede et al. 2022). The invertebrate marine fauna, kingdom Animalia excluding subphylum Vertebrata, encompass 27 phyla of free-living organisms and span a body-size range from 20 ng to ∼20 kg. They inhabit diverse environments from fluid mud to granitic bedrock. Consequently, we have focused our attention on the best means of integrating the inevitably disparate data generated by variant field methods and the different fractions of the seafloor ecosystem assessed. Inevitably, no single methodology is capable of capturing or otherwise enumerating this group in totality. The group is typically divided into three broad size categories: the meiobenthos, macrobenthos, and megabenthos. Some of the smallest invertebrate members potentially fall into the micro- or nanobenthos (Burnett and Thiel 1988), but these are very few relative to the larger groups. For the three primary groups, no universally accepted definition of these size ranges is available, despite many decades of argument and many pleas for standardization (Higgins and Thiel 1988; SCOR 1994).
Meiobenthos are broadly defined as invertebrates that pass through a 1 mm mesh size and are retained on a 32 μm sieve, although upper and lower size boundaries vary by practitioner (Giere 2009; Danovaro 2010; Schratzberger and Ingels 2018). This class is often dominated by Nematoda and Arthropoda among 18 phyla. Benthic foraminifera (single-celled eukaryotes) also make up a notable component of life in this size class, with taxa extending into the macrobenthos. Quantification is impacted by the choice of defining sieve mesh and sampler type (Bett et al. 1994; Gage and Bett 2005). Macrobenthos are generally defined as retained on sieve mesh sizes from 250 or 300 to 500 μm. Dominant taxa include Annelida and Arthropoda. Megabenthos are widely considered to be approximately 1 cm in size or larger (Grassle et al. 1975; Bett 2019). Depending on substratum and depth, megabenthos can be dominated by soft (Alcyonacea) or stony (Scleractinia) corals, anemones (Actiniaria), sponges (Porifera), or sea cucumbers and their relatives (Echinodermata).
The lack of size range standardization is complicated by two factors: (1) The use of a taxonomic restriction on data recording and/or reporting, forcing the introduction of the additional qualifiers "sensu lato" and "sensu stricto," the former representing the "pure" size-based categorization, the latter a taxonomically restricted size-based categorization. For example, macrobenthos sensu stricto data often discount the abundant occurrence of nematodes on the grounds that they are a "meiobenthic" taxon and, vice versa, taxonomic identification of larval macrobenthos observed within the meiobenthic size class may not be pursued. (2) The use, or not, of an upper size limit in any sampling exercise, for example, the use of two sieve meshes in the processing of sediment samples. For example, pooled data on the macrobenthos sensu stricto from two USNEL Mk II-type box core samples collected from 1900 m in the Rockall Trough (Northeast Atlantic) were processed using multiple sieve mesh sizes (Gage et al. 2002). When assessed via a single 250 μm sieve, the results indicate 1383 ind. 0.5 m⁻² and 1.003 g wwt 0.5 m⁻²; when processed to < 1000 μm > 250 μm, apparent biomass density dropped by 94% and numerical density dropped by 19%. This problematic effect has been noted for decades (Sanders 1960) and remains an issue today, along with other related issues of sampling bias associated with specific gear types (McIntyre 1956; Bett et al. 1994; Gage and Bett 2005; Benoist et al. 2019).

Sensing and sampling
Much of the existing abundance data for megabenthos is rather limited in terms of quantitative skill, having been derived from towed samplers, trawls, and sledges, where the seafloor area effectively sampled is approximately estimated at best. This is due, in part, to the challenges with quantifying the largest benthos, where many surveying techniques are likely to undersample the largest sizes (Marchais et al. 2020). Recent decades have seen a rapid expansion in the use of mass photography to quantify megabenthos abundance and diversity more rigorously (Durden et al. 2016; Benoist et al. 2019; Simon-Lledó et al. 2020). This has enabled quantification of the potential undersampling by trawls (e.g., 20-60-fold underestimation of numerical density, Morris et al. 2014; 20-200-fold underestimation of biomass density, Benoist et al. 2019). When considering that some burrowing fauna are not easily quantified from photography, but have some limited recovery in trawls, these differences are conservative. Quantification of the burrowing fauna continues to pose a significant challenge (Bett 2019); typically they are too rare to be appropriately sampled in cores, are inefficiently sampled by trawls, and are only partially censused in seabed photography. While many functional traits scale with size (Peters 1983), many are also taxon specific and manifold in their combinations. Where functions are known, they can provide useful information for modeling and onward use of information to understand specific habitat uses, trophic interactions, food web dynamics, behaviors, life stages, and ecosystem services. Key examples include functions of suspension feeding, deposit feeding, bioturbation, predation, and habitat formation (Durden et al. 2019).
The quantification of benthic invertebrate abundance and distribution typically occurs through either sediment sampling or image-based techniques (Clark et al. 2016; Thompson et al. 2017). Uses include research and applications for baseline assessment and environmental monitoring purposes for various industries, with the potential to be processed into EOV data. Other examples come from the Western Arctic Shelf Seas (Dunton et al. 2005; Grebmeier et al. 2006), the Porcupine Abyssal Plain Sustained Observatory (PAP-SO) in the northeast Atlantic (Hartman et al. 2021), and the Sta. M time-series study site in the northeast Pacific (Marchais et al. 2020), with many initiatives globally that regularly collect such samples and/or imagery (Glover et al. 2010; Levin et al. 2019; e.g., Ocean Networks Canada, the Ocean Observatories Initiative, and the European Multidisciplinary Seafloor and water column Observatory).
Various other forms of towed net sampling also have valued applications but are generally considered to be semi- or non-quantitative. Sediment coring provides the physical collection of samples over a known area of seafloor from a variety of devices, which may range in physical sample size between 20 cm² and 3 m² and may be variously subject to subsampling following collection (Gage and Bett 2005). Samples from these systems can be sieved using a variety of screen sizes, 250-500 μm for macrobenthos and 32-63 μm for meiobenthos (Kaariainen and Bett 2006). These systems perform poorly in sandy and mixed sediments and become nonfunctional on hard substrata.
Both microscopy and image analysis involve identifying individuals to some taxonomic level, typically to the most detailed level given available information and expertise. For example, it has long been common practice to estimate the individual mass of small infaunal organisms (e.g., meiobenthos) via length measurements and body volume estimation (Andrassy 1956; Giere 2009; Mazurkiewicz et al. 2016; Llopis-Belenguer et al. 2018; Ärje et al. 2020). Imaging, along with sediment sampling, has become part of many research and monitoring efforts (Narayanaswamy et al. 2006; Howell et al. 2007; Durden et al. 2016; Przeslawski and Foster 2018). Seafloor time-lapse photography and video have been ongoing at many locations globally, producing time-series of many invertebrate variables (Bett et al. 2001; Durden et al. 2020a). Broad-scale photographic ecological mapping by remotely operated vehicle (ROV), towed camera, and autonomous underwater vehicle (AUV) has also become more routine (Howell et al. 2010; Morris et al. 2016; Durden et al. 2016; Thornton et al. 2022). The resulting data have been used to estimate biomass, community composition, and other variables (Davies et al. 2015; Durden et al. 2020a,b), as well as for the development of machine learning-based object classification systems to streamline data processing steps (Piechaud et al. 2019; Durden et al. 2021).
There is a variety of specimen dimension-to-biomass estimation techniques, for example, empirical length-to-weight relationships or more generalized volumetric methods (Robinson et al. 2010; Durden et al. 2016; Benoist et al. 2019; Marchais et al. 2020). These estimates typically refer to fresh or preserved wet weight and can be converted to carbon mass using relationship formulas (Brey 2010). These conversions, and standardized use of them, may be key to the onward use of such data in several forms of modeling. However, dimension and volumetric measurements should be vetted against direct biomass data using representative numbers of specimens and taxon diversity where possible to increase the accuracy of the conversion factors currently in use. The Ocean Best Practices System, and numerous other initiatives, form and organize information on how to collect samples and process the data for robust onward uses. The Ocean Biodiversity Information System (OBIS) is a repository in which to organize databases of such biomass conversion data and practices, including via cooperation with the World Register of Marine Species. Maintaining the raw form of conversion data can enable the construction of conversion libraries, including factors such as time, space, and environmental variation (Robinson et al. 2010; Benoist et al. 2019).
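As a minimal sketch of how a conversion library might be applied, the example below chains an empirical length-to-weight power function with a wet-weight-to-carbon factor. The functional form W = a·L^b is a common convention, but every coefficient, the morphotype label, and the function names here are hypothetical placeholders and should be replaced with vetted, taxon-specific values.

```python
# Hypothetical conversion library; the coefficients are placeholders only
# and are NOT taken from Brey (2010) or any other cited source.
CONVERSION_LIBRARY = {
    "example_morphotype_A": {"a": 1.0e-4, "b": 2.5, "carbon_fraction": 0.03},
}


def wet_weight_from_length(length_mm, a, b):
    """Estimate wet weight (g) from body length (mm) as W = a * L**b."""
    if length_mm <= 0:
        raise ValueError("length must be positive")
    return a * length_mm ** b


def carbon_from_wet_weight(wet_weight_g, carbon_fraction):
    """Convert wet weight (g) to carbon mass (g C) with a simple fraction."""
    return wet_weight_g * carbon_fraction


def specimen_carbon(length_mm, taxon, library=CONVERSION_LIBRARY):
    """Chain both conversions for one measured specimen."""
    factors = library[taxon]
    wet = wet_weight_from_length(length_mm, factors["a"], factors["b"])
    return carbon_from_wet_weight(wet, factors["carbon_fraction"])


if __name__ == "__main__":
    print(specimen_carbon(50.0, "example_morphotype_A"))
```

Keeping the coefficients in a separate, versioned table rather than in the functions themselves makes it easier to re-run conversions when improved factors become available.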

Data curation
Once samples are processed, where objects and regions of interest are classified and annotated, the fundamental step of data curation remains to facilitate the production of data that is more findable, accessible, interoperable, and reusable. At this point, data can be considered as GOOS EOV data (Benson et al. 2021). The taxon-specific data generated by sampling and surveying can be formatted to be joined with related environmental and other relevant data using the OBIS ENV-DATA approach, functional group or other attribute data, and they can then be appended with metadata using the Darwin Core metadata standard (De Pooter et al. 2017; Horton et al. 2020; Schoening et al. 2022). Recent work has outlined the need for greater standardization in image-based analyses (Howell et al. 2019) and the application of open nomenclature signs to reconcile issues of species concepts with morphotype concepts that apply to image-based identification of benthic invertebrates (Horton et al. 2021). The OBIS ENV-DATA approach can also be used to capture body size to mass conversion information with the Extended Measurement or Fact extension (De Pooter et al. 2017), linking multiple biometric measurements (weight, biomass) to a particular occurrence record or (sub)sample. In addition to sample geographic coordinates, multiple abiotic measurements can be associated with these records, such as water depth, sediment type, and bottom water temperature.
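The linkage between an occurrence record and its biometric measurements can be sketched as two small tables that share an identifier. The field names below are standard Darwin Core and Extended Measurement or Fact terms; the helper functions, identifier, coordinates, depth, date, and measurement values are hypothetical and serve only to show the structure.

```python
import csv
import io


def build_occurrence(occurrence_id, scientific_name, event_date,
                     lat, lon, depth_m):
    """Assemble one Darwin Core occurrence row (illustrative subset of terms)."""
    return {
        "occurrenceID": occurrence_id,
        "scientificName": scientific_name,
        "eventDate": event_date,
        "decimalLatitude": lat,
        "decimalLongitude": lon,
        "minimumDepthInMeters": depth_m,
        "maximumDepthInMeters": depth_m,
        "basisOfRecord": "MachineObservation",  # e.g., a seafloor photograph
        "occurrenceStatus": "present",
    }


def build_emof(occurrence_id, measurements):
    """Assemble eMoF rows; measurements is a list of (type, value, unit)."""
    return [
        {
            "occurrenceID": occurrence_id,
            "measurementType": m_type,
            "measurementValue": m_value,
            "measurementUnit": m_unit,
        }
        for m_type, m_value, m_unit in measurements
    ]


def to_csv(rows):
    """Serialize a non-empty list of row dictionaries to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # All values below are made up for illustration.
    occ = build_occurrence("demo-occ-001", "Amperima rosea",
                           "2011-07-15", 49.0, -16.5, 4850)
    emof = build_emof(occ["occurrenceID"],
                      [("body length", 42.0, "mm"), ("wet weight", 1.7, "g")])
    print(to_csv([occ]))
    print(to_csv(emof))
```

Because every measurement row carries the occurrenceID, body size, mass, and any conversion factors remain traceable to the individual observation when the tables are later merged or queried.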

Quantitative processing
Depending on the application, accounting for issues arising from various sampling approaches can improve the analytical value of data. Among the important factors to consider in this process is the areal extent surveyed and its implication for how well the largest/rarest individuals have been sampled. For example, when size is plotted on an x-axis, how far to the right does the size spectrum extend, and how imprecise does it become? It is key to identify the well-sampled part of the size spectrum for each sampling and observing method. Typically, sample data exhibit a positively skewed lognormal body mass distribution, where the right side of the distribution reflects the reliably captured fraction of the assemblage, and the left side of the distribution represents the increasingly poorly captured body sizes (Fig. 1). For example, when the macro- and megabenthos data are plotted on a continuous scale, the left side of the megabenthos data reveals a section of the size spectrum that can be interpreted as undersampled, as smaller and smaller objects become harder and eventually impossible to detect in seafloor images. A similar effect is apparent in body mass data obtained from sieved physical sediment samples and trawl samples (Fig. 1B,C). Both sieved samples and visual assessments from seafloor images are thus subject to a "fuzzy" lower boundary on the minimum body mass reliably sampled (Bett 2013; Marchais et al. 2020).
The largest megabenthos are systematically subject to higher variance due to their rarity, or are not quantitatively observed at all (due to the limited extent sampled). Macroecological analysis can focus on the "reliable region" of each component spectrum. The potential fit to a single underlying body-size distribution across the conventional invertebrate body-size categories shows how these data could be integrated and indeed "fill in" knowledge gaps about body sizes that were not sampled. Size-based analysis of this type has been similarly applied to improve the usefulness and comparability of fisheries data (Millar and Fryer 1999; Stepputtis et al. 2016).
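One simple way to operationalize the "reliable region" is sketched below. It assumes that the modal logarithmic size class marks the boundary between under- and well-sampled body masses, and it fits an ordinary least-squares slope to the width-normalized abundances from that class upward. This is an illustrative heuristic with hypothetical function names and thresholds, not the procedure used in the cited studies; maximum-likelihood approaches (see Edwards et al. 2017) may be preferable for formal inference.

```python
import math
from collections import Counter


def log2_spectrum(masses):
    """Count individuals per doubling class; class j spans [2**j, 2**(j+1))."""
    return Counter(math.floor(math.log2(m)) for m in masses)


def fit_well_sampled_slope(masses, min_count=10):
    """Return (modal_class, slope) for the right-hand side of the spectrum.

    Classes below the modal class are treated as undersampled and ignored,
    as are sparse classes with fewer than min_count individuals. The slope
    is from log10(count / class width) against log10(class mid-point), so it
    estimates the exponent of the continuous distribution.
    """
    counts = log2_spectrum(masses)
    modal_class = max(counts, key=counts.get)
    xs, ys = [], []
    for j, n in sorted(counts.items()):
        if j >= modal_class and n >= min_count:
            width = 2.0 ** j             # 2**(j+1) - 2**j
            midpoint = 2.0 ** (j + 0.5)  # geometric mid-point of the class
            xs.append(math.log10(midpoint))
            ys.append(math.log10(n / width))
    if len(xs) < 2:
        raise ValueError("need at least two well-sampled classes to fit a slope")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return modal_class, sxy / sxx
```

Applied separately to the macrobenthos and megabenthos components, the fitted slopes and intercepts can then be compared to judge whether a single underlying distribution is a plausible description of both datasets.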

Modeling and indicators
Seafloor ecosystem model frameworks include variety in geographical or temporal scale, degree of coupling with physics, and ecosystem representation (Peck et al. 2018). Model types range from those that are purely statistical through to those that explicitly couple physical, biogeochemical, and ecological components. In terms of ecological detail, models vary from those based on the most easily measured benthic properties (e.g., abundance, biomass), through to those that consider the ecosystem services performed by functional groups (e.g., remineralization), and nearly all reference organism size in some way. For example, the Benthic Organisms Resolved In Size model (BORIS; Kelly-Gerreyn et al. 2014; Yool et al. 2017) represents 16 size classes of meio- to macrobenthos animals with allometrically calculated metabolic and ecological parameters. These animals share a common detrital food source that is fed by the flux of particulate organic carbon (POC) to the seafloor. When forced by the seafloor flux of POC from the UK Earth System Model, BORIS can produce gridded, time-variant projections of seafloor biomass (SSP1-2.6, i.e., sustainability, low greenhouse gas emissions; and SSP5-8.5, i.e., fossil-fueled development, very high greenhouse gas emissions) (Fig. 2A,B UKESM1 model; Cooley et al. 2022). The results show geographical changes across the 21st century in seafloor biomass. There are corresponding changes in the size spectra of global biomass, and the relative change in seafloor biomass to different depth horizons (Fig. 2C,D). An updated version of BORIS is currently in development that explicitly includes habitat temperature, an important metabolic factor as benthic boundary layer temperature ranges over > 20°C globally. Importantly, the BORIS model can be tuned using sediment samples of meio- and macrobenthos, and megabenthos estimates from imaging. The input POC flux can be derived from sediment trap samples, satellite-based algorithms (Lutz et al. 2008), or models, such as MEDUSA (Yool et al. 2013; Fig. 2).
In addition to size classes, there are other ways to partition and assess benthic invertebrates in models or other applications, for instance, via functional types. Models using mechanistic process formulations can simulate stocks and flows independently of body size, such as with the European Regional Seas Ecosystem Model (ERSEM, Butenschön et al. 2016), random forest modeling (Wei et al. 2010), and linear inverse models (LIM, Stratmann et al. 2018). The use of both size-based and functional-type models can have issues where the size-based models require size-specific estimates of mass, or where functional-type models are not well defined with respect to the size classes of included fauna. However, the collection of size information, as well as the classification of organisms into known functional groups, can improve the interoperability of data between these two types of models.
Biodiversity is widely used to assess ecosystem state and as a proxy for resilience of communities to perturbation (Zerebecki et al. 2022). Some aspects of diversity can now be modeled, including via macroecological relationships (Peck et al. 2018). A catalog of marine biodiversity indicators, consisting of over 600 examples, demonstrated that many have common assessment elements and inputs (Teixeira et al. 2016). Observational data can also be used for generating more robust statistical estimations of biodiversity that account for issues such as sampling effort and the number of observed individuals, both of which are known to affect biodiversity estimation. For example, rarefaction and extrapolation methods can be used to better quantify biodiversity (Hsieh et al. 2016).
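To illustrate how the number of observed individuals can be accounted for, the sketch below implements classical individual-based rarefaction, that is, the expected number of species in a random subsample of n individuals drawn without replacement. This is a simpler, older formulation than the Hill-number interpolation and extrapolation framework of Hsieh et al. (2016), and it is shown only as an assumed minimal example; the function name is hypothetical.

```python
from math import comb


def rarefied_richness(abundances, n):
    """Expected species richness in a random subsample of n individuals.

    abundances: per-species counts of individuals in the full sample.
    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)), where N is the total
    number of individuals and N_i the count for species i.
    """
    total = sum(abundances)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total number of individuals")
    all_subsamples = comb(total, n)
    return sum(
        1.0 - comb(total - count, n) / all_subsamples
        for count in abundances
        if count > 0
    )


if __name__ == "__main__":
    # Two hypothetical samples with different numbers of individuals,
    # compared at a common subsample size of 20 individuals.
    sample_a = [30, 10, 5, 3, 1, 1]
    sample_b = [8, 7, 6, 5, 4]
    print(rarefied_richness(sample_a, 20), rarefied_richness(sample_b, 20))
```

Comparing samples at a common subsample size avoids attributing differences in richness to differences in the number of individuals examined.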
Indicator taxa are often used in environmental management as a means of monitoring change in ecosystems (Gillett et al. 2015). Indicators can also be modeled or applied to broader scales, but they typically work by being tailored to a narrow taxonomic or functional group within a specific region. Useful indicator taxa for monitoring are sensitive to environmental change and should meet a number of other criteria that demonstrate their value as an indicator of broader change, pollution, or other factor(s) of interest. These include a well-known and stable taxonomy, natural history, ease of survey and data handling, broad geographic distribution of higher taxonomic levels, and patterns of biodiversity reflected in other taxa (Pearson 1994). Species or assemblage distributions and abundance can also be modeled with hindcasting and forecasting, providing an ability to predict shifts in species ranges or in environmental niches.

Drivers
In a global context, there are several initiatives that use benthic invertebrate data for research, observing, and statutory monitoring including GEO BON, the International Seabed Authority (ISA) and related needs for observing and model data to understand baseline and impact assessment for its Reserved Areas for prospective seafloor mining (Stratmann et al. 2018), the Intergovernmental Panel on Climate Change (IPCC; Yool et al. 2017; Tittensor et al. 2021; Cooley et al. 2022), the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES), the CBD, the Coastal and Estuarine Research Federation (CERF), the OSPAR Commission, the International Council for the Exploration of the Sea (ICES), the Deep Ocean Stewardship Initiative (DOSI), and those implementing the Deep Ocean Observing Strategy (DOOS; Levin et al. 2019). National and regional initiatives are also evaluating the status and trends in marine invertebrate variables, including the European Marine Strategy Framework Directive (MSFD), the US Magnuson-Stevens Fishery Conservation and Management Act and related Integrated Ecosystem Assessments, and the US National Marine Sanctuaries Act and related Condition Reports. While each has differing combinations of observing needs, size-specific information on benthic invertebrate biomass, including via a gridded time-variant format, can address needs across these groups.

A data lifecycle approach
While the above tools have all been used previously to scale up information, their broader and more harmonized application is needed to mark a step change in the quantity, quality, and impact of benthic invertebrate assessments. Here, we outline elements of a data stream plan for delivering science service workflows. This plan links the above concepts to evolving processes and key steps for meeting user needs. To illustrate how such an integrative workflow can operate, we follow the steps from drivers, data collection, and integration through to the provision of benthic invertebrate information to user-driven information products and applications (Fig. 3).

Getting to EOV data
Engagement with drivers of ocean observing, data, and information users continues to develop and refine the selection of platforms and methods, including those for the collection of sediment samples, video, and still image data (Fig. 3A). Best practices in such data acquisition include the requirement to quantify the sampled or viewed area and to facilitate the means to determine body size through examination of physical samples and/or photogrammetry and modeling. After some initial curation, including making backups of digital data, there are several key sample/data processing steps. The choices in processing, (semi-)automation, and later steps depend in part on the type of sample/data and the use case needs. For physical samples, this can include screening with sieves, microscopy, and sample sorting. Alternatively, with video and still images, this can include image processing such as improving image quality, registering image data geospatially, and identifying and measuring individuals in the imagery. This will include the growing application of machine learning and artificial intelligence tools and their curation, for example, BIIGLE (Langenkämper et al. 2017) and FathomNet (Katija et al. 2021). Visually observed morpho-species can be registered into a standard taxon referencing framework (Howell et al. 2019; Horton et al. 2021). Placing processed in situ observation and sample data in machine-readable forms for discovery and reuse will also be critical. When metadata have been applied, the EOV data are ready to be included in data archives and services such as OBIS, THREDDS, or ERDDAP, including in size-, functional-type-, and taxon-specific forms. Indicator species and related concepts can then be extracted from these machine-readable sources.
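As an illustration of placing an observation in machine-readable form, the following minimal Python sketch builds a single occurrence record keyed by Darwin Core term names and derives a numerical density from the quantified sampled area. The function names, the identifier, and all values (taxon, count, area, position, date) are hypothetical and chosen for illustration only; a real submission to an archive such as OBIS would need further terms and quality control.

```python
from datetime import date


def to_darwin_core(taxon, count, area_m2, lat, lon, when, occurrence_id):
    """Build one occurrence record keyed by Darwin Core term names.

    The function itself is a hypothetical helper; only the dictionary keys
    are Darwin Core terms.
    """
    if area_m2 <= 0:
        raise ValueError("sampled or viewed area must be positive")
    return {
        "occurrenceID": occurrence_id,
        "basisOfRecord": "MachineObservation",  # e.g., seafloor photograph
        "scientificName": taxon,
        "individualCount": count,
        "sampleSizeValue": area_m2,             # quantified sampled/viewed area
        "sampleSizeUnit": "square metre",
        "decimalLatitude": lat,
        "decimalLongitude": lon,
        "eventDate": when.isoformat(),
    }


def density_per_m2(record):
    """Numerical density (individuals per square metre) from one record."""
    return record["individualCount"] / record["sampleSizeValue"]


# Illustrative values only; not taken from any dataset.
record = to_darwin_core(
    "Amperima rosea", 12, 48.0, 48.98, -16.42, date(2011, 7, 15), "demo:occ:0001"
)
print(record["eventDate"], density_per_m2(record))
```

Recording the sampled area alongside the count is what allows densities from cores, trawls, and imagery to be compared on a common per-area basis.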

Adding value
Benthic invertebrate EOV data can support modeling and data product development to understand industry impact and management practices, each in the context of a changing climate (e.g., seafloor mining, MPAs; Fig. 3B; Levin et al. 2019; Jones et al. 2020; Hofmann et al. 2021). Climate change has been shown to drive shifts in ocean temperatures, circulation, primary production, acidification, deoxygenation, and a myriad of other impacts (Cooley et al. 2022). These long-term shifts are expected to cause major reductions in seafloor biomass, change the size distribution of benthic invertebrates (Yool et al. 2017), and alter community composition over time (Sweetman et al. 2017). They will act concurrently with ongoing impacts from ocean trawling, energy industry operation and development, litter and pollution, and prospective seafloor mining that could affect vast areas of the seafloor (Levin et al. 2019; Jones et al. 2020).
Once quantitatively processed to identify the well-sampled part of the size spectrum, data can be used in models such as BORIS, where the invertebrate EOV data can then be used to optimize the model parameters for the intended application (Fig. 3B). Sets of models that provide hindcast, nowcast, and forecast capability allow for the inclusion of in situ and remote observations to improve model realism, as well as accounting for one or more scenarios such as the different climate change management outcomes used by the Coupled Model Intercomparison Project (Eyring et al. 2016; Cooley et al. 2022). This can provide point-location or regional to gridded estimates of past, present, and future seafloor biomass using the best information available, under different combinations of impact factors. Size-based, functional-type, and biodiversity and indicator-taxa models can also be parameterized using existing and newly processed invertebrate EOV data, or forced by IPCC emission scenarios or other drivers and supporting variables. Observing system simulation experiments (OSSEs) and related analyses can also identify system improvements and management adaptations.
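The step from point-location to gridded estimates can be sketched in a simple form. The Python example below averages point biomass values within regular latitude/longitude cells. It is a minimal illustration only: the function name, the 1 degree cell size, and the input values are assumptions, and it does not represent BORIS or any other model named above, which would also account for uncertainty, time, and environmental forcing.

```python
import math
from collections import defaultdict


def grid_mean_biomass(observations, cell_deg=1.0):
    """Average point biomass estimates within regular lat/lon cells.

    observations: iterable of (lat, lon, biomass) tuples, with biomass in
    grams per square metre. Returns a dict mapping the south-west corner of
    each occupied cell, (lat, lon), to the mean biomass in that cell.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, biomass in observations:
        # Floor to the cell's south-west corner (works for negative values).
        key = (math.floor(lat / cell_deg) * cell_deg,
               math.floor(lon / cell_deg) * cell_deg)
        sums[key] += biomass
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical point estimates: (latitude, longitude, biomass in g per m2).
points = [(48.9, -16.4, 2.0), (48.2, -16.9, 4.0), (49.5, -16.1, 1.0)]
grid = grid_mean_biomass(points)
for cell, value in sorted(grid.items()):
    print(cell, value)
```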

Broad impact for users
Once dynamic model, statistical, and/or other outputs are produced, these data can then be used to form interpretations tailored to specific user needs (Fig. 3C). These can be delivered via online curated data views and as content for various assessment reports (Cooley et al. 2022), such as percent change in the context of climate change (Fig. 2A,B,D). They can be distilled into time series indicating status and trend, products for specific managed areas such as MPAs, seafloor mining and energy industry development areas, and fishing management zones, and indices such as the Living Planet Index used by the CBD (McRae et al. 2017). These model frameworks, together with data collection, (semi-)automated data processing, the application of metadata standards, and other practices, comprise an overall workflow in which data from different sources can be joined together. Finally, the provision of observations can then inform the collection of new biological data, and/or the further evolution of models and supporting variables, through system improvements and management adaptations.

Implementing
A fully common method for quantifying the entire invertebrate size spectrum is not possible. However, broader adoption of key practices for a GOOS EOV for Benthic Invertebrate Distribution and Abundance will make the goal of integrating data from various approaches readily achievable, including by: (1) quantifying individual body size, (2) identifying the well-quantified portions of sampled body-size spectra, (3) taking advantage of (semi-)automated information processing including machine learning and artificial intelligence, (4) applying data standards such as Darwin Core, and (5) making data available through discovery points such as OBIS. Using data from multiple sources and methods, such as sediment sampling and imaging, requires data integration. These informatics are critical in evolving the way biology and ecosystem data can be applied to societal challenges. Such a systems approach can be achieved through iterative engagement and codesign that entails system improvements and management adaptations.
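Practices (1) and (2) can be illustrated with a short Python sketch. It bins individual body masses into log2 size classes and treats the modal class and all larger classes as the well-quantified portion of the sampled spectrum. This modal-bin rule is a simple heuristic assumed here for illustration, not necessarily the criterion used in any particular study; the function name and the example masses are likewise hypothetical.

```python
import math
from collections import Counter


def well_quantified_bins(masses_g):
    """Bin body masses (g) into log2 classes and locate the modal class.

    Returns (counts, modal_bin), where counts maps each log2 bin index to
    the number of individuals, and modal_bin is the most populated bin
    (ties resolved toward the larger, more conservative bin). Bins at or
    above modal_bin are treated as well quantified under this heuristic.
    """
    counts = Counter(math.floor(math.log2(m)) for m in masses_g if m > 0)
    modal_bin = max(counts, key=lambda b: (counts[b], b))
    return dict(counts), modal_bin


# Hypothetical individual body masses in grams.
masses = [0.3, 0.6, 0.7, 1.1, 1.5, 1.9, 1.2, 2.5, 3.0, 5.0]
counts, modal_bin = well_quantified_bins(masses)

# Keep only individuals in the well-quantified part of the spectrum.
kept = [m for m in masses if math.floor(math.log2(m)) >= modal_bin]
print(modal_bin, len(kept))
```

Restricting each dataset to its well-quantified size range before merging avoids treating undersampled small individuals from one method as evidence of low abundance.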
Each of the concepts outlined in Fig. 3 is important in the context of GOOS and how its members might be collecting and processing data to produce information for users. Several programs of the United Nations Decade for Ocean Science for Sustainable Development are working to refine, apply, and share the data stream plans described here, including OceanPractices, DOOS, Marine Life 2030, and Challenger 150. For example, this work is, in part, a result of data stream planning facilitated by the DOOS program, which itself includes input from several other networks including the US Marine Biodiversity Observation Network (MBON) and GEO BON. These data stream plans provide a mechanism to guide further specification of system attributes, highlighting where and how best practices are applied. Finally, critical to continuous improvement of the data lifecycle, engagement with the specific user bodies informs practitioners as to whether user needs are being met, if and how systems can be improved, and where new investments might best be prioritized.

Fig. 3. An integrative science service workflow. This illustrates a full data lifecycle, showing a data stream that links drivers and methodologies with key practices and data handling steps that facilitate data integration and the creation of broadscale gridded, time-variant invertebrate EOV estimates, that is, to deliver invertebrate abundance and distribution EOV information at scales needed for management and decision making. This data stream planning highlights (A) policy drivers and reporting needs (blue-gray), methods and platforms (blue), sample and data processing steps including size/mass estimation concepts (orange), metadata and quality control standards (brown), EOV and supporting variable types (green), and (B) adding value including via modeling tools such as those using body size and other forms of functional and diversity concepts (plum). Critical to this is that all steps have specifications that enable (C) delivery of information suitable for specific stakeholders driving the collection of data, as well as a means for continuous improvement including feedback from information users and best practice experts. Abbreviations used here but not in the main text include Quality Assurance/Quality Control of Real-Time Oceanographic Data (QARTOD), the National Centers for Environmental Information (NCEI), Thematic Real-time Environmental Distributed Data Services (THREDDS) servers, the Data Observation Network for Earth (DataONE), and the European Marine Observation and Data Network (EMODnet).

Fig. 1. Body size distribution of benthic invertebrates in the macro- to megabenthos size range. (A) Theoretical distributions based on random sampling of a single underlying power law distribution having a slope exponent of 1.75 (see Table 1), bounded at a maximum body mass of c. 4 kg (as observed in field samples). Macrobenthos are sampled with a 250 μm (0.25 mm) sieve mesh, producing a lower bound body mass of c. 9 μg; megabenthos with a 32 mm mesh net, producing a lower bound body mass of c. 18 g. In each case, three random samples are drawn of 150 (green), 200 (red), and 250 (blue) individuals to mimic natural variability in numerical density (see Bett et al. 2023 for additional details). (B) Field sample data from the Porcupine Abyssal Plain Sustained Observatory (PAP-SO; see Hartman et al. 2021). Macrobenthos determined from core samples sieved on a 250 μm sieve mesh collected in 2 yr (red, green; Benoist 2020); megabenthos determined from c. 65,000 seafloor photographs, divided to central (red) and northern (green) abyssal plain and abyssal hill (blue; see Benoist 2020; Durden et al. 2020b). (C) Inset, comparative data for three megabenthos trawl catches (red, green, blue) from the PAP-SO (1989 samples; see Billett et al. 2001). Note that the point of inflection between "well" sampled and "under" sampled in the field data is indicated with an arrowhead, that the un- or undersampled body mass range is indicated by fine dashed lines, and that large rarities (LR) were encountered in both the randomly sampled and field sampled (isopod, 118 mg) macrobenthos (see Sanders 1960). Data and metadata are available at https://doi.org/10.5281/zenodo.7725189.
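The random sampling described for panel (A) can be approximated with inverse-transform sampling from a bounded power law. The Python sketch below is a minimal illustration under stated assumptions: the exponent of 1.75 is taken to apply to the probability density of individual body mass, masses are expressed in grams (9 μg = 9e-6 g, 4 kg = 4000 g), and the function name and random seed are hypothetical. The procedure used to produce the figure may differ (see Bett et al. 2023).

```python
import random


def sample_bounded_power_law(n, exponent, m_min, m_max, rng):
    """Draw n masses from a density proportional to m**(-exponent) on [m_min, m_max].

    Uses the inverse of the truncated power law's cumulative distribution:
    m = (m_min**k + u * (m_max**k - m_min**k)) ** (1 / k), with k = 1 - exponent
    and u uniform on [0, 1).
    """
    if exponent == 1.0:
        raise ValueError("this inverse-transform form needs exponent != 1")
    k = 1.0 - exponent
    lo, hi = m_min ** k, m_max ** k
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / k) for _ in range(n)]


rng = random.Random(42)   # fixed seed so the sketch is repeatable
MAX_G = 4000.0            # upper bound, c. 4 kg
SAMPLE_SIZES = (150, 200, 250)

# Macrobenthos: lower bound c. 9 micrograms; megabenthos: lower bound c. 18 g.
macro = {n: sample_bounded_power_law(n, 1.75, 9e-6, MAX_G, rng) for n in SAMPLE_SIZES}
mega = {n: sample_bounded_power_law(n, 1.75, 18.0, MAX_G, rng) for n in SAMPLE_SIZES}

for n in SAMPLE_SIZES:
    print(n, max(macro[n]), max(mega[n]))
```

Because large individuals are rare under such a distribution, small samples drawn with the macrobenthos lower bound will seldom contain the largest body masses, which is the undersampling effect the figure is intended to convey.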