The Coastal Carbon Library and Atlas: Open source soil data and tools supporting blue carbon research and policy

Quantifying carbon fluxes into and out of coastal soils is critical to meeting greenhouse gas reduction and coastal resiliency goals. Numerous ‘blue carbon’ studies have generated, or benefitted from, synthetic datasets. However, the community those efforts inspired does not have a centralized, standardized database of disaggregated data used to estimate carbon stocks and fluxes. In this paper, we describe a data structure designed to standardize data reporting, maximize reuse, and maintain a chain of credit from synthesis to original source. We introduce version 1.0.0 of the Coastal Carbon Library, a global database of 6723 soil profiles representing blue carbon‐storing systems including marshes, mangroves, tidal freshwater forests, and seagrasses. We also present the Coastal Carbon Atlas, an R‐shiny application that can be used to visualize, query, and download portions of the Coastal Carbon Library. The majority (4815) of entries in the database can be used for carbon stock assessments without the need for interpolating missing soil variables, 533 are available for estimating carbon burial rate, and 326 are useful for fitting dynamic soil formation models. Organic matter density significantly varied by habitat with tidal freshwater forests having the highest density, and seagrasses having the lowest. Future work could involve expansion of the synthesis to include more deep stock assessments, increasing the representation of data outside of the U.S., and increasing the amount of data available for mangroves and seagrasses, especially carbon burial rate data. We present proposed best practices for blue carbon data including an emphasis on disaggregation, data publication, dataset documentation, and use of standardized vocabulary and templates whenever appropriate. To conclude, the Coastal Carbon Library and Atlas serve as a general example of a grassroots F.A.I.R. (Findable, Accessible, Interoperable, and Reusable) data effort demonstrating how data producers can coordinate to develop tools relevant to policy and decision‐making.


| INTRODUCTION
Globally, wetland soils are estimated to contain 500-680 petagrams of carbon (Poulter et al., 2021), equivalent to 60%-82% of all carbon present in the atmosphere (Ciais et al., 2014). Coastal wetlands such as tidal marshes, tidal freshwater forests (referred to throughout as swamps), mangroves, and seagrasses are known to sequester carbon on decadal to millennial time scales (Chmura et al., 2003; Howard et al., 2017; Mcleod et al., 2011; Ouyang & Lee, 2014). So-called blue carbon habitats have been garnering attention in recent years because of the capacity for management interventions to avoid greenhouse gas emissions by conserving wetlands or re-establish greenhouse gas removals by restoring them (Eagle et al., 2022; Kroeger et al., 2017; Lovelock & Duarte, 2019; Wylie et al., 2016).
Coastal wetland soils are unique in that they store carbon in situ as a dynamic response to sea-level rise (Morris et al., 2002) via mineral trapping and organic soil mass production (Morris et al., 2002; Turner et al., 2006). These dynamics mean that factors such as relative sea-level rise (Rogers et al., 2019), plant community (Doughty et al., 2015; Schile et al., 2017), and elevation (Callaway et al., 2012; Peck et al., 2020b) are of great importance to predicting carbon stocks and burial rates. Given the fact that tidal wetlands were underrepresented in previous efforts (Holmquist et al., 2018b), the requirements of specialist and interdisciplinary knowledge represented in tidal wetland datasets, and their distinctness from terrestrial soils, we propose that soils from blue carbon habitats require their own dedicated synthesis effort for disaggregated data.
Data syntheses for coastal wetland ecosystems have been undertaken previously. An original synthesis by Chmura et al. (2003) established coastal wetland carbon storage as a major carbon sink. A subsequent expansion by Ouyang and Lee (2014) documented covariates associated with carbon burial. These foundational studies provided data reported or estimated from literature values but did not present original disaggregated information (original measurements rather than summary statistics), limiting their reuse for other purposes. Later synthesis datasets greatly expanded the disaggregated carbon stocks data available for seagrasses (Fourqurean et al., 2012), tidal marshes (Holmquist et al., 2018b), tidal swamps (Krauss et al., 2018), and mangroves (Donato et al., 2011; Sanderman, 2017). These studies have many elements that a comprehensive database would require, but none were designed to be a living resource for the communities they helped establish.
Sharing of disaggregated data is vital to discovery in coastal carbon studies. Morris et al. (2016) documented a relationship between soil organic content and density which limits vertical accretion capacity. Holmquist et al. (2018b) independently validated soil carbon stock mapping strategies, uncovering an overestimate in carbon stock when relying on conventional maps. Rogers et al. (2019) utilized an earlier version of the database presented herein to show that the geography of relative sea-level rise explained variability in carbon density at a global scale. None of these important studies would have been possible without the sharing of disaggregated data from researcher to researcher. However, data sharing between researchers is much rarer in practice than is implied through data availability statements in academic peer-reviewed journal articles (Tedersoo et al., 2021).
In this paper, we describe the creation and implementation of a set of standards and introduce the Coastal Carbon Library v1.0.0. This database is unique from previous efforts in that it spans multiple habitats, pays special attention to gathering site characteristics and methodological metadata required to re-interpret data for new uses, and contains disaggregated data rather than summary statistics.
We then present the Coastal Carbon Atlas (https://shiny.si.edu/coastal_carbon_atlas), a web tool for visualizing, querying, and downloading the Coastal Carbon Library. We also discuss the strengths of this database in the face of rapidly changing coastal ecosystems. Finally, we discuss potential future directions and proposed best practices.

| Goals and scope
F.A.I.R. (Findable, Accessible, Interoperable, and Reusable) data principles are a set of best practices that seek to maximize the reusability of data. To meet F.A.I.R. goals, the aims of the data structure were four-fold and grounded in best practices for data management (Wilkinson et al., 2016; Wilson et al., 2017). First, we preserved data in as much detail as is practical to maximize versatility. Second, we were transparent, relying on stable repositories. Third, we explicitly connected individual datasets within the synthesis to their original sources through one or more associated citations. Fourth, we emphasized simplicity.
For scope, we focused on vegetated coastal tidal wetlands. We focused on emergent vegetation (marsh), scrub/shrub, and forested wetlands including mangroves and tidal swamps. We included seagrasses and tidal flats, but not kelp beds, coral reefs, or oyster reefs (Howard et al., 2017). These included saltwater, brackish, and freshwater ecosystems in the tidal zone and subtidal seagrasses. In addition to current tidal wetlands, we also included former tidal wetlands that had been modified by human impact, and uplands or non-tidal wetlands if that comparison was a key aspect of the original study.
Finally, while we did classify datasets based on their completeness or utility, we did not exclude any datasets based on methods used, origins, or lack of attributes beyond a few key requirements such as basic positional and depth information, as well as at least one relevant measured depth series attribute (Data S1).

| Data structure
Coastal Carbon Library data formatting (Holmquist et al., 2023) follows practices (Wilson et al., 2017) for maximizing data reuse such as descriptive attribute names, consistently formatted 'no data' values, multiple quality control attributes with consistent formatting, and multiple places for explanatory comments written in prose. Data files are long rather than wide, designating each row as a single observation, each column as a single variable, and each table as a set of observations and variables (Wickham, 2014). Data files are 'flat': all information is stored as simple, uniform text. The structure consists of seven data tables (Figure 1):
1. Site: contains site-level information (Table 1).
2. Core: contains core-level information (Table 2).
3. Depth series: provides the disaggregated information collected across depth increments of each soil profile (Table 3).
4. Methods: contains information about the methodology used for a collection of sampling and measurement events (Table 4).
5. Species: provides the plant species observed proximal to the sampling location (Table 5).
6. Impacts: provides a controlled classification of anthropogenic impacts associated with the sampling location (Table 6).
7. Bibliography: contains information on the sources associated with datasets (Table 7).
All tables are relational with common linking variables study_id, site_id, methods_id, core_id, and bibliography_id. The site, core, depth series, species, impacts, and methods tables are hierarchical. Depth increments are nested within cores. Cores are nested within sites. Species and impacts are nested within cores, or within sites, if site-level data are the highest resolution provided by the study (Figure 1). Categorical variables have defined vocabulary listed in Table 8.
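As a minimal sketch of how the linking variables support joins across this hierarchy, the Python example below attaches core-level attributes to depth increments and counts cores per site. The rows are invented for illustration, and every attribute name other than study_id, site_id, and core_id (for example depth_min, depth_max, and habitat) is an assumption rather than a confirmed column of the Library.

```python
from collections import defaultdict

# Hypothetical rows; only study_id, site_id, and core_id are names taken from the text.
cores = [
    {"study_id": "study_A", "site_id": "site_1", "core_id": "core_1", "habitat": "marsh"},
    {"study_id": "study_A", "site_id": "site_1", "core_id": "core_2", "habitat": "marsh"},
]
depth_series = [
    {"study_id": "study_A", "site_id": "site_1", "core_id": "core_1", "depth_min": 0, "depth_max": 10},
    {"study_id": "study_A", "site_id": "site_1", "core_id": "core_1", "depth_min": 10, "depth_max": 20},
    {"study_id": "study_A", "site_id": "site_1", "core_id": "core_2", "depth_min": 0, "depth_max": 5},
]


def join_on(left, right, keys):
    """Inner-join two lists of row dictionaries on the shared key columns."""
    index = defaultdict(list)
    for row in right:
        index[tuple(row[k] for k in keys)].append(row)
    joined = []
    for row in left:
        for match in index[tuple(row[k] for k in keys)]:
            # Values from the left row win if a non-key column name collides.
            joined.append({**match, **row})
    return joined


# Attach core-level attributes to every depth increment.
increments_with_cores = join_on(depth_series, cores, ["study_id", "site_id", "core_id"])

# Cores are nested within sites: count cores per (study_id, site_id).
cores_per_site = defaultdict(int)
for core in cores:
    cores_per_site[(core["study_id"], core["site_id"])] += 1
```

Keeping the tables long and flat means a join like this needs nothing beyond the shared identifiers.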

| Data sourcing
Data entered the Coastal Carbon Library through two tracks, with and without curatorial assistance by Coastal Carbon Network personnel. If a data source was already available through a data publication, then the original data and associated metadata were downloaded to the Library repository (Figure 2). Ingestion scripts were then written in R (R Core Team, 2021) to reshape and reformat the data so that datasets could be merged (Figure 2).

| Data quality control and quality assessment
FIGURE 1 Relationships between the tables that make up the Coastal Carbon Library. Attributes tracked in each table are listed. Bold and italicized text indicates common attributes that can be used to join tables. The structure of the library is mostly hierarchical (a site contains multiple cores, and a core contains multiple depth increments in a series). One coring location can also have multiple anthropogenic impacts and plant species associated with it. A methods table links directly to depth increments as more than one set of methods may be used to analyze depth increments within a core.

Data were quality-controlled using automated and visual checks. Automated quality tests were performed at both the individual study and whole synthesis levels and included (but were not limited to):
• checking that the controlled attributes and variable names match guidelines.
• checking for the presence of mandatory and conditionally mandatory attributes (Data S1).
• verifying the uniqueness of core identifiers for a site and study.
• verifying the uniqueness of core locations in the overall database.
• verifying that numeric attributes expressed as fractions were bounded between 0 and 1.
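Three of the automated checks listed above can be sketched in a few lines of Python. This is an illustrative approximation only: the Library's own tests are written in R, and the function names and the attribute name fraction_organic_matter are assumptions made for the example.

```python
from collections import Counter


def check_fraction_bounds(rows, fraction_attributes):
    """Return (row index, attribute, value) for fractions outside the range 0 to 1."""
    problems = []
    for i, row in enumerate(rows):
        for attr in fraction_attributes:
            value = row.get(attr)
            if value is not None and not (0 <= value <= 1):
                problems.append((i, attr, value))
    return problems


def check_unique_core_ids(core_rows):
    """Return (study_id, site_id, core_id) keys that occur more than once."""
    counts = Counter((r["study_id"], r["site_id"], r["core_id"]) for r in core_rows)
    return sorted(key for key, n in counts.items() if n > 1)


def check_mandatory(rows, mandatory_attributes):
    """Return (row index, attribute) pairs where a mandatory attribute is missing."""
    return [
        (i, attr)
        for i, row in enumerate(rows)
        for attr in mandatory_attributes
        if row.get(attr) is None
    ]


# Hypothetical example data containing one problem of each kind.
cores = [
    {"study_id": "study_A", "site_id": "site_1", "core_id": "core_1"},
    {"study_id": "study_A", "site_id": "site_1", "core_id": "core_1"},  # duplicated identifier
]
depth_series = [
    {"core_id": "core_1", "fraction_organic_matter": 0.35},
    {"core_id": "core_1", "fraction_organic_matter": 35.0},  # percent entered instead of a fraction
    {"core_id": "core_1", "fraction_organic_matter": None},  # 'no data' value
]

duplicate_cores = check_unique_core_ids(cores)
out_of_bounds = check_fraction_bounds(depth_series, ["fraction_organic_matter"])
missing = check_mandatory(depth_series, ["core_id", "fraction_organic_matter"])
```

Each check returns the offending records rather than raising an error, so a curator can review all problems in a dataset at once.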

During visual checks, Coastal Carbon Network personnel mapped the provided coordinates of sampling locations and plotted the relationships between dry bulk density and organic matter and between organic matter and organic carbon (Callaway et al., 2012; Craft et al., 1991; Morris et al., 2016), as well as depth profiles of any soil measurements or radioisotope values (Arias-Ortiz et al., 2018).

| Data post-processing
We built a post-processing workflow that added geography attributes and detailed habitat classifications, and classified profiles based on the data's utility, completeness, and quality. Following initial processing, cores were assigned geographic units using spatial joins in the sf package (Pebesma, 2018). Habitats were classified using the following steps:
1. We detected whether the entry originates from a habitat-specific synthesis for mangroves (Sanderman, 2017) or seagrasses (Fourqurean et al., 2012).
2. Non-forested wetlands were classified into marshes, seagrasses, or tidal flats based on vegetation_class.
3. If the vegetation class was not descriptive, then we cross-referenced species present at the coring location with the USDA Plants database (USDA, NRCS, 2022) to determine the growth form of the plant. If multiple growth forms were reported for a species, then we classified it based on the tallest reported. We used habitat-specific lists of taxa (Larkum et al., 2006; Tomlinson, 2016) to determine whether or not a species had any special designation as seagrass or mangrove. Any graminoids, forbs, or ferns not otherwise classified as seagrasses were classified as marshes. Any trees not otherwise classified as mangroves were classified as swamps.
4. If previous steps were inconclusive, we used partial string matching to detect whether any descriptive language such as 'mangrove' or 'seagrass' was present in site_id or core_id.
5. A small number of estuarine forested wetland locations from the United States (Nahlik & Fennessy, 2016) could not be sorted into swamps and mangroves using the previously outlined methods. We assumed sites were mangrove if they occurred south of 29.75° N latitude, the maximum northern extent of mangroves observed on the Atlantic coast of the U.S. (Cavanaugh et al., 2014). Conversely, we assumed estuarine forested sites north of this threshold were tidal swamps.
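The fallback logic in steps 3 to 5 can be sketched as follows. This is a simplified Python illustration, not the authors' R workflow: the function names, the growth-form labels, and the boolean flags are hypothetical, and the handling of a site lying exactly on the 29.75° threshold is an assumption because the text does not specify it.

```python
# Northern limit of U.S. Atlantic coast mangroves, as stated in the text.
MANGROVE_NORTHERN_LIMIT = 29.75  # degrees north latitude


def habitat_from_growth_form(growth_form, is_seagrass=False, is_mangrove=False):
    """Step 3: classify from plant growth form plus special taxon designations."""
    if is_seagrass:
        return "seagrass"
    if is_mangrove:
        return "mangrove"
    if growth_form in {"graminoid", "forb", "fern"}:
        return "marsh"
    if growth_form == "tree":
        return "swamp"
    return None  # inconclusive; fall through to the next step


def habitat_from_ids(site_id, core_id):
    """Step 4: partial string matching on the site and core identifiers."""
    text = f"{site_id} {core_id}".lower()
    for keyword in ("mangrove", "seagrass"):
        if keyword in text:
            return keyword
    return None


def habitat_from_latitude(latitude):
    """Step 5: latitude rule, intended only for U.S. estuarine forested wetlands.

    Assumption: a site exactly at the threshold is treated as swamp.
    """
    return "mangrove" if latitude < MANGROVE_NORTHERN_LIMIT else "swamp"
```

Returning None from the earlier steps makes the ordering explicit: a later, coarser rule is consulted only when a more specific one is inconclusive.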
We developed an automated workflow to classify cores based on the type of study the data can be used for, including carbon stock assessments, calculation of carbon burial rates, and fitting of models of future carbon sequestration and wetland resilience. If dry bulk density and either organic matter or organic carbon were present in the depth series, then the core met the minimal inclusion criteria for carbon stocks (C). If the core was confirmed to reach the bottom of the profile, then it was considered a high-quality core (C1). If not, it was classified as of lower utility (C2). Here, the wetland sediment profile is considered complete if the contact point between wetland sediment and bedrock or non-wetland sediment is reached.
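The carbon stock rule described above reduces to a small decision function. The sketch below is a Python illustration with hypothetical argument names; only the codes C1 and C2 and the criteria themselves come from the text.

```python
def classify_stock_utility(has_dry_bulk_density, has_organic_matter,
                           has_organic_carbon, reaches_profile_bottom):
    """Return 'C1', 'C2', or None for a single core.

    None means the core does not meet the minimal inclusion criteria:
    dry bulk density plus either organic matter or organic carbon.
    """
    meets_minimum = has_dry_bulk_density and (has_organic_matter or has_organic_carbon)
    if not meets_minimum:
        return None
    # C1: confirmed to reach the contact with bedrock or non-wetland sediment.
    return "C1" if reaches_profile_bottom else "C2"
```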
If any profile age information or disaggregated data associated with dating techniques were present in a core, then the core met the minimal inclusion criteria for calculating carbon burial rates (B). If any radioisotope activity levels were reported, then we required that the associated instrument error be reported for dating information to be considered complete (B1). This included 137Cs, total 210Pb, excess 210Pb, and 14C, as well as any isotopes used to estimate supported 210Pb: 226Ra via its proxies 214Bi and/or 214Pb in gamma analysis. If the depths of 137Cs peaks were reported, then we required a 137Cs activity depth profile for completeness. If excess 210Pb was reported, then we required that total 210Pb be reported as well for the dataset to be considered complete. For 14C ages, the material dated needed to be specified for the dataset to be considered complete. Any missing radioisotope errors or conditional attributes resulted in the dataset being classified as having incomplete data reporting (B2).
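These completeness rules can be expressed as a set of conditional checks. The Python sketch below is a simplified reading of the paragraph above: the attribute names (for example cs137_activity or c14_material) are hypothetical stand-ins, and only the B1 and B2 codes and the conditions themselves are taken from the text.

```python
# Hypothetical attribute names for measurements that need a reported error.
MEASURED = [
    "cs137_activity", "total_pb210_activity", "excess_pb210_activity",
    "ra226_activity", "bi214_activity", "pb214_activity", "c14_age",
]


def classify_dating_utility(reported):
    """Return 'B1', 'B2', or None given the attribute names reported for a core."""
    reported = set(reported)
    measured = [m for m in MEASURED if m in reported]
    if not measured and "cs137_peak_depth" not in reported:
        return None  # no dating information at all

    # Every reported measurement needs its associated error.
    complete = all(f"{m}_error" in reported for m in measured)
    # A 137Cs peak depth requires the 137Cs activity profile.
    if "cs137_peak_depth" in reported and "cs137_activity" not in reported:
        complete = False
    # Excess 210Pb requires total 210Pb.
    if "excess_pb210_activity" in reported and "total_pb210_activity" not in reported:
        complete = False
    # 14C ages require the dated material to be specified.
    if "c14_age" in reported and "c14_material" not in reported:
        complete = False
    return "B1" if complete else "B2"
```

For example, a core reporting only a 137Cs peak depth would be graded B2 under this sketch, because the activity profile behind the peak is missing.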
Cores with any age-depth data and elevation data met the minimum inclusion criteria for accretion modeling (A; Schile et al., 2014). We differentiated between elevations that were interpolated from remotely sensed or spatially interpolated digital elevation models (A3) and those that were directly measured using precise real-time kinematic GPS data or better (A1 or A2), which typically only have errors of a few centimeters. Cores with both a full suite of dating information and precise elevation were classified as the highest utility for data-model integration (A1), and cores with precise elevation but some missing dating information were graded lower (A2).
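The accretion modeling grades combine the dating grade with the source of the elevation measurement. The following Python sketch assumes a dating grade of 'B1' or 'B2' has already been assigned to the core; the function and argument names are hypothetical.

```python
def classify_accretion_utility(dating_code, has_elevation, elevation_is_precise):
    """Return 'A1', 'A2', 'A3', or None for a single core.

    dating_code: 'B1' (complete), 'B2' (incomplete), or None (no age-depth data).
    elevation_is_precise: True for direct measurement with real-time kinematic
    GPS or better; False for elevation taken from a digital elevation model.
    """
    if dating_code not in ("B1", "B2") or not has_elevation:
        return None  # needs both age-depth data and elevation
    if not elevation_is_precise:
        return "A3"
    return "A1" if dating_code == "B1" else "A2"
```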

| Versioning, data use policy, and citation
The version of the database presented herein is v1.0.0 and is current as of January 12, 2023 (Coastal Carbon Network, 2023). Preliminary versions of this database were summarized in previous publications (Malhotra et al., 2019; Todd-Brown et al., 2022). New versions of this database will adopt standardized semantic versioning, with the first digit indicating major changes to database structure, the second indicating minor changes and additions of new datasets, and the third indicating backward-compatible changes. We anticipate regular updates to the Coastal Carbon Library as new datasets are submitted, approximately quarterly. The database has a digital object identifier and a web link that will automatically route to the most up-to-date version.
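A user tracking Library releases could compare version strings as in the minimal Python sketch below. The helper names are hypothetical, and the interpretation of each digit follows the description above rather than any tooling shipped with the database.

```python
def parse_version(text):
    """Turn a string such as 'v1.0.0' or '1.0.0' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return major, minor, patch


def changed_level(old, new):
    """Return which digit differs first between two versions, or None if equal."""
    levels = ("major", "minor", "patch")
    for level, a, b in zip(levels, parse_version(old), parse_version(new)):
        if a != b:
            return level
    return None


# Tuples compare element by element, so 1.10.0 correctly sorts after 1.9.0.
is_newer = parse_version("v1.10.0") > parse_version("v1.9.0")
```

Comparing parsed tuples instead of raw strings avoids the mistake of ranking '1.10.0' before '1.9.0'.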
The Coastal Carbon Library is licensed under CC-BY 4.0, meaning that it can be used without restriction other than attribution. Full attribution of the Coastal Carbon Library requires three things. First, the original primary datasets, papers, or data publications should be cited when they are reused for other purposes. In cases in which an original [...] and this manuscript should be cited in addition to primary and synthesis sources. Importantly, when data are downloaded and analyzed for new purposes, original primary sources always need to be cited to comply with the policy; citation of the Coastal Carbon Library and this paper alone would not be sufficient.

Because relationships between original data sources and the Coastal Carbon Library are complex, we include a relational table connecting study_id values to bibliographic information (Figure 1). The Coastal Carbon Atlas (Figure 3) automatically formats bibliographic information based on the subset of datasets downloaded.
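The lookup from a downloaded subset to its required citations can be sketched as a two-step join. This Python example uses invented records and a hypothetical citation field; it illustrates the idea of the relational table and is not the Atlas's actual formatting code.

```python
# Hypothetical relational table linking study_id to bibliography_id.
study_citations = [
    {"study_id": "study_A", "bibliography_id": "ref_1"},
    {"study_id": "study_A", "bibliography_id": "ref_2"},
    {"study_id": "study_B", "bibliography_id": "ref_2"},
    {"study_id": "study_C", "bibliography_id": "ref_3"},
]
# Hypothetical bibliography entries keyed by bibliography_id.
bibliography = {
    "ref_1": "Author A (2020). Primary dataset.",
    "ref_2": "Author B (2018). Synthesis paper.",
    "ref_3": "Author C (2021). Data release.",
}


def citations_for(study_ids, study_citations, bibliography):
    """Return the sorted, de-duplicated citations needed for a set of studies."""
    wanted = set(study_ids)
    ref_ids = {row["bibliography_id"] for row in study_citations if row["study_id"] in wanted}
    return sorted(bibliography[ref_id] for ref_id in ref_ids)


required = citations_for(["study_A", "study_B"], study_citations, bibliography)
```

Because one study can map to several sources and one source to several studies, the table is many-to-many, which is why de-duplication is needed before formatting.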

| Quantification of data coverage
We compared the makeup of cores in the Coastal Carbon Library to the global allocation of blue carbon area based on habitat and countries. For habitats, we referenced literature for global area estimates for seagrasses, tidal marshes, and mangroves (Bunting et al., 2022; McKenzie et al., 2020; Murray et al., 2022; Table 9). For countries, we used a probabilistic map of intertidal area (Murray et al., 2022), counting tidal wetland pixels with a greater than 50% chance of being classified as tidal flat, tidal marsh, or mangrove in 1999. Country borders included both the land borders (ESRI Data and Maps, 2015b) and exclusive economic zones (Flanders Marine Institute, 2019).

| ACCESSING DATA VIA THE COASTAL CARBON ATLAS
The Coastal Carbon Atlas (https://shiny.si.edu/coastal_carbon_atlas; Figure 3) is an R-shiny web application that interfaces with the Coastal Carbon Library, allowing users to explore, query, and directly download data. This tool has been adapted to suit user needs through community feedback. The Coastal Carbon Atlas consists of a map interface (Figure 3a) displaying sampling locations. The tabular view allows for a direct visual review of the site, core, and depth series tables (Figure 3b). The user can specify custom depth intervals to download quantitative attributes summarized as depth-weighted averages of the depth increments specified (Figure 3c).
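One plausible way to compute a depth-weighted average over a custom interval is to weight each increment's value by the thickness of its overlap with the requested interval. The Python sketch below is an assumption about the general technique, not a description of the Atlas's exact implementation; in particular, averaging only over the sampled portion of the interval is a choice made for this example.

```python
def depth_weighted_average(increments, top, bottom):
    """Average a soil attribute over the interval from top to bottom (same depth units).

    increments: iterable of (depth_min, depth_max, value) tuples for one core.
    Each value is weighted by the thickness of its overlap with the interval.
    Returns None if no increment with data overlaps the interval.
    """
    weighted_sum = 0.0
    total_overlap = 0.0
    for depth_min, depth_max, value in increments:
        overlap = min(depth_max, bottom) - max(depth_min, top)
        if overlap > 0 and value is not None:
            weighted_sum += value * overlap
            total_overlap += overlap
    if total_overlap == 0:
        return None
    return weighted_sum / total_overlap


# Hypothetical core: 0-10 cm at 0.2 and 10-30 cm at 0.5, summarized over 0-20 cm.
example = depth_weighted_average([(0, 10, 0.2), (10, 30, 0.5)], top=0, bottom=20)
```

In the example, each increment contributes 10 cm to the 0-20 cm interval, so the result is the simple mean of 0.2 and 0.5.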
Also, 93.6% of cores in the database are either classified as 'natural' or do not specify any anthropogenic impacts.
The most common types of soils data for marsh, mangrove, seagrass, and swamp ecosystems included dry bulk density, fraction organic matter, and fraction carbon measurements (Figure 4). Soil cores with a suite of basic soil carbon stock information included 4815 profiles (Figure 5). However, 90.4% were not confirmed to reach a contact point between wetland sediment and bedrock or non-wetland sediment. Only 19.5% of cores were greater than or equal to 1 m long, the target depth of soil maps used in some organic carbon stock assessments (Holmquist et al., 2018b).
Of the marsh, mangrove, seagrass, and swamp cores with sufficient data to calculate carbon stocks (4815), a subset of those had data sufficient to calculate carbon accumulation rates (533, Figure 5).Of these, the majority (77.8%) have fully transparent age-depth information traceable back to the original measurements (Figure 5).Of these dated cores, 326 also have associated elevation data, with 317 having both high-quality elevation and age-depth information (Figure 5)., attribute and variable definitions in the 'data dictionary' file, the bibliographic information in three file formats, the site, core, depth series, methods, impacts, and species table, as well as the derived normalized weighted averages depth series table in the 'standardized depth series' table.The interactive application is available online (https:// shiny.si.edu/ coast al_ carbon_ atlas ).Map lines do not necessarily depict accepted national boundaries.

TA B L E 9
We compared global area estimates for three major blue carbon ecosystems, tidal marshes, mangroves, and seagrasses, compared with the representativeness of data appropriate for carbon stock assessments and burial rates when only the subset of those habitats is considered.Multiple estimates were found, so we present ranges based on the maximal scenario for mangroves and minimal for marshes, as well as maximal for marshes and minimal for mangroves.Despite variation in estimates of areas based on source, the current iteration of the Coastal Carbon Library clearly overrepresents marshes.Indonesia, Canada, Brazil, and Papua New Guinea were undersampled despite having relatively large areas (Figure 6).
Median organic matter density was highest in swamps and marshes, lower in mangroves, and lowest in seagrasses (Table 10).
Dry bulk density, organic matter fraction, and their products, organic matter density, were non-normally distributed (Figure 7).All varied significantly based on habitat type according to a non-parametric Kruskal-Wallis rank-sum test (p < 2.2e-16).

| Strengths of current effort
The Coastal Carbon Library and Atlas's strengths are in provid- F I G U R E 5 Summary of data quality and completeness for three different purposes: carbon stock assessments, estimating carbon burial rates, and parameterizing models of future carbon sequestration.We define sediment profiles as complete if reaching the contact point between wetland sediment and bedrock or lower non-wetland sediment.
The ability to independently replicate derivative calculations and link measurements to original studies are both important to the validity of greenhouse gas inventories (Crooks et al., 2018) and carbon market verification (Needelman et al., 2018).
When collating data for use by a broad swathe of researchers and managers, the importance of grassroots approaches cannot be understated.The fact that this was a domain-specific effort allowed us to grow the database by leveraging professional networks, generate enthusiasm with a community that understood the immediate utility of the synthesis, and build trust based on relationships.Building trust took an understanding of both academic incentives around sharing data and the personal nature of many datasets.In shifting the burden of curation from data producers to dedicated staff this effort has helped data creators comply with open data standards (Tedersoo et al., 2021), and rescued data that otherwise would have never been made public (Todd-Brown et al., 2022).Our ambition is to continue providing some data curatorial services depending on the availability of funding through new projects.

| Future improvements
While there are myriad strengths to the current effort, there are also limitations to estimating deep carbon stocks and assessing carbon burial rates outside of marsh ecosystems.Most of the dated or deep Future efforts could also focus on improving the representativeness of data in the repository.The majority of the data were from the U.S. and future efforts could improve the representation of tropical and developing countries (Wylie et al., 2016).Poor representation in some cases is due to lack of data, and other times due to the fact that data has not yet been integrated.Efforts like this allow TA B L E 1 0 Summary statistics of organic matter fraction, dry bulk density (g cm −3 ), and organic matter density (g cm −3 ).Summary statistics include mean, median, lower and upper 95% quantile, standard deviation (SD) and data point count (n).F I G U R E 7 Summary of global blue carbon data shown as the frequency distribution of dry bulk density (g cm −3 ), fraction organic matter, and organic matter density (g cm −3 ).Note, the y-axes are different for each type of ecosystem in the rows.
the community to take stock of what data are already available and focus on new data collection and synthesis where it would be most impactful.
Further, future efforts could include increasing the representation of mangroves, and seagrasses, especially for carbon burial data.
Finally, future efforts could analyze whether the Coastal Carbon Library is biased towards pristine wetlands and underrepresents degraded or restored wetlands.

| Proposed best practices for blue carbon data dissemination and reuse
This data synthesis effort has resulted in proposed guidelines for both individual research groups as well as journal editors, reviewers, and funders.We suggest that best practices for reporting blue carbon soils data should include distributing disaggregated data associated with the summary statistics presented in journal articles and reports.We propose reporting at the level of original measurements, for example, loss-on-ignition and dry bulk density measurements reported for individual depth increments, and individual radioisotope measurements determining supported and unsupported 210Pb activity profiles in addition to derived age-depth models.We propose that positional information, elevation, wetland management history, salinity, and vegetation composition are all vital to data reinterpretation; we provide multiple ways for coding data resolution and methodology (Tables 1 and 2) in order to represent the original studies with fidelity.We propose that data producers should provide separate files for original measurements and derivative calculations, ideally with an open-source scripting workflow documenting how derivations are made.We propose utilizing dataset templates and consistent vocabulary whenever possible, such as those provided as Data S1 (Holmquist et al., 2023).The templates we provide can be modified and added to.We propose metadata accompanying data releases should define attributes, specify units, and detail study methods.
We also propose best practices for data producers include Having a synthetic dataset can allow researchers to develop additional best practices, and disentangle the effect different methodological choices have on carbon burial and accretion estimates (Holmquist et al., 2021).Fusing data with soil formation models (Morris et al., 2002;Schile et al., 2014) could allow for more standardization surrounding concepts such as accretion, burial, accumulation, and sequestration.
The adoption of open data policies has the potential to improve equity for researchers and communities in the global south (Serwadda et al., 2018).Although adopting open data is not free from ethical risks on its own, the protocol we propose does provide a way to avoid 'parachute science', in which data are collected from a middle-income or developing country, but not made available to the communities from which it came.Data and journal publications offer an avenue for credit towards researchers from the global south by recognizing a taxonomy of roles, including for local people who consulted and physically collected the data (Allen et al., 2014;Serwadda et al., 2018).
We finally propose some best practices for those reusing data for new purposes.Although the CC-BY 4.0 open data allows unrestricted use with attribution, we encourage data users to interact with the Coastal Carbon Library and Atlas products not only as a database but also as the community that built it.This could also include reaching out to original data producers when local knowledge is warranted for data interpretation.Data users can offer co-authorship when consultations result in more substantial contributions (Allen et al., 2014), or recommend data providers as reviewers, notify them of open-peer review periods, notify them of preprints, and provide copies of journal articles when a dataset they provided is vital to a new study.
Our hope is that by providing a centralized open database, building the practice of data publication, and implementing a data reuse policy that centers original data producers, we will contribute towards democratizing the development of coastal ecosystem service science.

| CON CLUS IONS
We present the Coastal Carbon Library, an open-source database for disaggregated global tidal wetland soil carbon stock, and accumulation rate data.It is made up of 6723 soil profiles, from 64 countries.
In addition to the data itself, the vocabulary, structures, and metadata are all discussed in depth.The strength of these data products lies in their commitment to F.A.I.R. data principles and their transparency.The addition of the Coastal Carbon Atlas, which allows for data visualization, subsetting, and limited post-processing, increases the accessibility of the data for non-specialists.Future work is needed especially to increase the amount of deep carbon stock data across wetlands and calculate carbon burial rates consistently.
While the database is global, new efforts are needed to increase the representation of countries outside the U.S. To conclude, we think that any scientific synthesis effort can learn important lessons from the grassroots nature of the Coastal Carbon Library.Data producers were incentivized to be involved by providing data templates, shifting the burden of curation from data producers to dedicated staff, and generating trust through a data use policy that rewards data producers through citation of primary material.

AUTH O R CO NTR I B UTI O N S
James R. Holmquist: Conceptualization; data curation; formal analysis; funding acquisition; investigation; methodology; project administration; software; supervision; visualization;

The Coastal Carbon Atlas: (a) The map view, with warmer colors indicating denser clusters of data. Clusters change and become more detailed as a user zooms in on a particular region or location. When zoomed all the way in, users can identify individual coring locations. (b) An example of the Atlas's tabular view, which allows users to browse the raw data stored in the Coastal Carbon Library and accessible for download. (c) The download screen, including summary statistics of a query and options for post-processing depth series to output as depth-weighted averages. (d) A file tree of downloads, including file summary in 'readMe' …

…ing an example of how to build F.A.I.R. data in the earth and environmental sciences (Wilkinson et al., 2016). Providing support in issuing persistent digital object identifiers helps make data more findable. Accessibility was increased by preparing new data releases and distributing the Coastal Carbon Library, all under Creative Commons open-source licenses. Accessibility was further improved by creating the Coastal Carbon Atlas so that data could be queried by those without specialized data management and coding skills. The data were made interoperable by the creation and adoption of a controlled data structure and vocabulary. The data were made reusable by providing them in a highly disaggregated form. For example, most profiles available for calculating accretion rates and carbon burial rates are fully reported, meaning that these metrics can be derived using existing publicly available age-depth modeling software (Aquino-López et al., 2018; Blaauw & Christen, 2011). Beyond enacting F.A.I.R. data principles, the strengths of the Coastal Carbon Library and Atlas are in enhancing transparency.

FIGURE 4 Summary of the count of cores with key measured attributes associated with them. Dry bulk density, fraction organic matter, and fraction carbon were most commonly measured. Data associated with core elevation and stratigraphic dating were less common.

FIGURE 6 (a) Points show, country by country, the makeup of the Coastal Carbon Library as percentages, with respect to the global area of tidal ecosystems (Murray et al., 2022). Countries in the top 6 of database representation and/or tidal area are labeled. The black line represents an idealized one-to-one relationship, with countries above the line being over-sampled and those below under-sampled. Note the log-10 scale. Countries with no representation in the Coastal Carbon Library are plotted at the bottom of the y-axis. The United States is over-represented, while Indonesia, Brazil, Papua New Guinea, and Canada are under-represented. (b) Maps show the same information as (a), with colors visualizing the degree of over- or under-representativeness. Here we define over- or under-representation as actual minus ideal representation. Ideal representation is based on tidal habitat coverage (x-axis of a), actual on database representation (y-axis of a). Map lines do not necessarily depict accepted national boundaries.

…cores in the Coastal Carbon Library were from tidal marshes. For carbon stocks, we showed that the majority of datasets did not represent full profiles reaching a contact point between the wetland and a deeper non-wetland layer; in most cases, soil datasets were limited by the depth of the coring device. So, while the characterization of shallow carbon stock is the most widely available application of the data, future work could quantify the effect that profile depth has on total carbon stock assessments, how spatially predictable that contact point is from existing data, and how predictable deeper carbon stocks are from shallower ones.
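The over- or under-representation measure defined in the Figure 6 caption (actual minus ideal representation) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `representation_gap`, the dictionary inputs, and the example numbers are hypothetical, and it assumes that actual representation is a country's percentage of cores in the database while ideal representation is its percentage of global tidal habitat area.

```python
def representation_gap(core_counts, tidal_area):
    """Return actual-minus-ideal representation per country, in percentage points.

    core_counts: hypothetical mapping of country -> number of cores in the database
    tidal_area:  hypothetical mapping of country -> tidal habitat area (any consistent unit)
    """
    total_cores = sum(core_counts.values())
    total_area = sum(tidal_area.values())
    gaps = {}
    # Include countries with habitat but no cores, and vice versa.
    for country in sorted(set(core_counts) | set(tidal_area)):
        actual = 100.0 * core_counts.get(country, 0) / total_cores
        ideal = 100.0 * tidal_area.get(country, 0.0) / total_area
        gaps[country] = actual - ideal  # > 0 over-represented, < 0 under-represented
    return gaps


# Made-up illustration values, not data from the Coastal Carbon Library.
cores = {"A": 80, "B": 20}
area = {"A": 25.0, "B": 25.0, "C": 50.0}
gaps = representation_gap(cores, area)
```

Because both sets of percentages sum to 100, the gaps sum to zero across countries: any over-representation in one country is balanced by under-representation elsewhere.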
…making data freely available in public data repositories with open-source Creative Commons licenses. We encourage journal editors and funders to require data publication as part of publication and end-of-project reporting, and for reviewers to ensure data releases are analysis-ready and well-documented. Funders have a role in supporting both project-specific data curation, as well as community-wide aggregation efforts. This effort shows the value of dedicated staff in helping data producers meet their open data ambitions.

TABLE 1 Attribute information for site table.
Columns: Attribute name; Definition; Data type; Unit.
study_id: Unique identifier for the study made up of the first author's family name, as well as the second author's family name or et al. if more than three, then publication year separated by underscores.

…in R. Geographic units include coun…

Attribute information for the core table.
Attribute information for depth-series table.
Unique identifier for the study made up of the first author's family name, as well as the second author's family name or et al. if more than three, then publication year separated by underscores.
This is a sample identification unique to the core. This should be used in the case that there are relevant lab specific sample codes, or in the case that there are multiple replicate samples.
This is the mass of organic matter relative to sample dry mass. Ash-free bulk density should not be used here but should be expressed as a loss on ignition fraction.
…reported here, rather than ra226_activity, either if the author also measures bismuth-214 and ra226_activity is a composite of the two measurements, or if they want to specify the proxy used for 226Ra.
pb214_unit: Reported unit for sample interval's 214Pb activity measurements. Free text.
This is the radioactivity counts per unit dry weight for bismuth-214 (214Bi), a proxy for 226Ra, also referred to as supported 210Pb. Radioactivity should be reported here, rather than ra226_activity, either if the author also measures lead-214 and ra226_activity is a composite of the two measurements, or if they want to specify the proxy used for 226Ra.
The standard error of the age of any other dated depth horizon such as an artificial marker, pollen horizon, pollution horizon, etc.

…source entered the Coastal Carbon Library via another synthesis, for example, Sanderman et al. (2018) and Fourqurean et al. (2012), both the original source and synthesis should be cited. Second and third, the Coastal Carbon Library itself (Coastal Carbon Network, 2023)…

Attribute information for methods table.
Unique identifier for the study made up of the first author's family name, as well as the second author's family name or et al. if more than three, then publication year separated by underscores.
This is the temperature at which samples were dried to measure dry bulk density. This can include either samples that were freeze dried or oven dried.
The Coastal Carbon Atlas allows for the sub-setting of datasets by habitat type, geography, data availability, core depth, study, human impacts, plant species present, and salinity descriptor. Finally, the Coastal Carbon Atlas provides options for limited post-processing and for downloading the data (Figure …

fraction_carbon_method

Attribute information for species table.
Unique identifier for the study made up of the first author's family name, as well as the second author's family name or et al. if more than three, then publication year separated by underscores.

Attribute information for impacts table.
TABLE 7