Global Root Traits (GRooT) Database

Motivation: Trait data are fundamental to quantitatively describe plant form and function. Although root traits capture key dimensions related to plant responses to changing environmental conditions and effects on ecosystem processes, they have rarely been included in large-scale comparative studies and global models. For instance, root traits remain absent from nearly all studies that define the global spectrum of plant form and function. Thus, to overcome conceptual and methodological roadblocks preventing a widespread integration of root trait data into large-scale analyses, we created the Global Root Trait (GRooT) Database. GRooT provides ready-to-use data by combining the expertise of root ecologists with data mobilization and curation. Specifically, we (i) determined a set of core root traits relevant to the description of plant form and function based on an assessment by experts, (ii) maximized species coverage through data standardization within and among traits, and (iii) implemented data quality checks.
Main types of variables contained: GRooT contains 114,222 trait records on 38 continuous root traits.
Spatial location and grain: Global coverage with data from arid, continental, polar, temperate and tropical biomes. Data on root traits were derived from experimental studies and field studies.
Time period and grain: Data were recorded between 1911 and 2019.
Major taxa and level of measurement: GRooT includes root trait data for which taxonomic information is available. Trait records vary in their taxonomic resolution, with subspecies or varieties being the highest and genera the lowest taxonomic resolution available. It contains information for 184 subspecies or varieties, 6,214 species, 1,967 genera and 254 families. Owing to variation in data sources, trait records in the database include both individual observations and mean values.
Software format: GRooT includes two csv files. A GitHub repository contains the csv files and a script in R to query the database.

Conceptual and methodological challenges have deterred widespread integration of root trait data into global trait databases. Conceptually, the functional importance of some root traits has yet to be established formally, which might preclude their use in large-scale analyses (Aubin et al., 2016). Methodologically, quantification of root traits is labour intensive, and there are technical difficulties in obtaining accurate measurements (e.g., Delory et al., 2017). Furthermore, large variation in methodologies precludes data standardization and integration within traits. Specifically, although traits are characteristics measurable at the level of the individual plant (Violle et al., 2007), root traits can be measured in different ways, increasing the number of trait variables. For example, data for root nitrogen uptake are separated into eight trait variables (Iversen et al., 2017).

Although coordinated initiatives, such as the Fine-Root Ecology Database (FRED; Iversen et al., 2017) and the Plant Trait Database (TRY; Kattge et al., 2011, 2020), have compiled valuable root trait data, these databases still face many of the conceptual and methodological challenges associated with root traits. FRED has been essential in terms of mobilization of fine-root trait data and is the largest contributor of root trait data to TRY (Kattge et al., 2020). FRED contains ~300 root trait variables; this high resolution of root variables allows users to investigate a broad set of research questions. However, barriers remain when using these root trait data in the context of large-scale comparative studies. For example, the number of trait variables can be overwhelming, particularly for non-root specialists. Furthermore, a large number of trait variables have few data records, limiting data-quality checks. For example, TRY performs data standardization and intensive data-quality checks for traits with > 1,000 records (Kattge et al., 2020), but most root traits have fewer records than this threshold. In addition, some trait variables that are not directly comparable in terms of definitions and units have been aggregated by type on TRY, such as root type/root architecture. Therefore, using these data requires that one first disaggregates these traits (e.g., by establishing links between trait names and definitions) and then standardizes trait values. Finally, accurate global assessments of root trait data availability, in terms of geographical or phylogenetic coverage, are essential to identify data gaps and to work towards increasing representativeness in large-scale comparative studies and dynamic global vegetation models.
To overcome these roadblocks, we have created the Global Root Trait (GRooT) Database. The main objective of GRooT is to make root trait data ready to use, particularly in the context of large-scale analyses. To do so, we first provide a set of core root traits that are considered to be relevant for describing plant form and function. Trait selection builds on the compilation of standardized trait measurements in a new handbook on root traits (Freschet et al., 2020) and an assessment by experts on root traits. In addition, we improve data coverage by compiling information from existing databases, mobilizing new data and standardizing data across methodologies within and among traits. Furthermore, we curate and perform data quality checks for each root trait in GRooT and make these data publicly available. Secondly, we provide within GRooT a unique overview of global root trait availability in terms of geographical and phylogenetic coverage. We envision that our advanced root trait database will be informative to global trait-based models and help to guide future measurement initiatives.

KEYWORDS
Belowground ecology, functional biogeography, macroecological studies, plant form and function, publicly-available database, root traits
GRooT was assembled by initially determining which root traits are most relevant in terms of describing plant form and function (Table 1; Supporting Information Tables S1 and S2). To build towards an ontology of root traits, we standardized trait names across data sources (Supporting Information Table S3) and matched them with names from the new handbook of root traits (Freschet et al., 2020). Subsequently, we checked available trait variables (> 700) to establish: (a) which variables were associated with preselected root traits relevant to the description of plant form and function; (b) which variables would be the most pertinent for each root trait in terms of available methodologies, standardized definitions and units, which were based mostly on the handbook of root traits (Freschet et al., 2020); and (c) which variables could be standardized across methodologies within or among traits.
Within traits, we aggregated comparable trait variables into a single unique trait (e.g., specific root respiration was combined into a unique trait, independent of it being measured as O2 consumption or CO2 release; Supporting Information Table S3). Among traits, we recalculated values for traits that could be standardized, such as: (a) data on the root-to-shoot ratio for the calculation of root mass fraction (RMF); and (b) data on stele diameter for the calculation of the stele fraction (Supporting Information Table S3; Figure S1). After this process, we retained those relevant traits with data for > 50 plant species in the database (Table 1), because traits with lower species coverage seemed less helpful for large-scale analyses involving many species. Traits below this threshold, but still relevant, are currently excluded from GRooT (Supporting Information Table S2).
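The among-trait recalculation and the species-coverage threshold can be illustrated with a minimal Python sketch. It assumes that the root-to-shoot ratio is root dry mass divided by shoot dry mass and that whole-plant mass is the sum of the two, in which case RMF = R / (1 + R) follows from the definitions. The exact procedure used in GRooT is documented in Supporting Information Table S3 and is not reproduced here; the function names and the record layout are hypothetical.

```python
from collections import defaultdict


def rmf_from_root_shoot_ratio(rs_ratio):
    """Convert a root-to-shoot mass ratio R into a root mass fraction.

    Assumes R = root mass / shoot mass and total mass = root + shoot,
    so that RMF = root / (root + shoot) = R / (1 + R).
    """
    if rs_ratio < 0:
        raise ValueError("root-to-shoot ratio cannot be negative")
    return rs_ratio / (1.0 + rs_ratio)


def traits_above_species_threshold(records, min_species=50):
    """Return the traits that have data for more than `min_species` species.

    `records` is an iterable of (trait_name, species_name) pairs; this layout
    is an assumption made for the sketch, not the structure of the GRooT files.
    """
    species_by_trait = defaultdict(set)
    for trait, species in records:
        species_by_trait[trait].add(species)
    return sorted(
        trait for trait, species in species_by_trait.items()
        if len(species) > min_species
    )
```

For instance, a plant with equal root and shoot mass (R = 1) has half of its mass in roots, and a trait observed in exactly 50 species would fall below the "> 50 species" criterion.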
In GRooT, we included only trait records for which taxonomic information was available and excluded trait records where data were taken at the community level (i.e., from species mixtures). Trait records varied in their taxonomic resolution, with subspecies or varieties and genera being the highest and lowest taxonomic resolution available, respectively. We used the generic term "root", which includes any type of root entity (e.g., established using diameter cut-offs, orders or functionality). Although analysing root entities separately (e.g., separating between fine and coarse roots; root orders or diameter cut-offs; or absorptive and transport roots) is generally recommended by a range of recent syntheses (Freschet & Roumet, 2017; McCormack et al., 2015), which entity is most suitable can vary greatly depending on the research question (Freschet & Roumet, 2017). Therefore, we have included information in GRooT that allows one to select data based on root entities (Supporting Information Table S4). We urge future data contributors to provide information about root entities and data users to consider this issue carefully.
GRooT includes selected meta-data for each trait record, when available, such as taxonomic information, experimental conditions, sampling procedure, geographical location and date, in addition to climatic and soil variables (Supporting Information Table S5). Moreover, we have included additional information for each trait record, such as species growth form, photosynthetic pathway and woodiness (Supporting Information Table S5). We extracted this information from TRY and the Global Inventory of Floras and Traits (GIFT; http://gift.uni-goettingen.de/home; Weigelt et al., 2019) or, when the information was not available in these databases, from general Web research [e.g., Flora of China (www.efloras.org), SEINet (swbiodiversity.org), United States Department of Agriculture (USDA; plants.usda.gov) and Southwest Desert Flora (southwestdesertflora.com)]. We also included, at the species level, the presence or absence of the ability to grow clonally and bud-bearing information, based on the CLO-PLA Database (CLO-PLA; http://clopla.butbn.cas.cz/; Klimešová & Bello, 2009; Klimešová et al., 2017, 2019). For data collected in field conditions, biome classification according to Köppen-Geiger was included using the "kgc" R package (Bryant et al., 2017).
We added information on qualitative root traits, such as mycorrhizal association type and nitrogen (N2)-fixing capacity, by interconnecting existing databases. For mycorrhizal type, we extracted data from the "FungalRoot: Global online database of plant mycorrhizal associations" (Soudzilovskaia et al., 2020). Mycorrhizal assignments were made at the genus level for plant species for which the mycorrhizal status is, according to current knowledge, conserved at this level (Soudzilovskaia et al., 2020). We included both standardized mycorrhizal types (named: mycorrhizalAssociationTypeFungalRoot) and mycorrhizal type from the original source (named: mycorrhizalAssociationType) in the database. For N2-fixation capacity, we extracted data from the "Global database of plants with root-symbiotic nitrogen fixation: NodDB Database" (v.1.3a; Tedersoo et al., 2018) at the genus level.
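A genus-level assignment of this kind amounts to a lookup keyed on the first word of the binomial. The Python sketch below shows the idea under simplifying assumptions: species names are plain "Genus epithet" strings, and the lookup table is a small illustrative dictionary, not data extracted from FungalRoot or NodDB. The function names are hypothetical.

```python
def genus_of(species_name):
    """Return the genus, taken as the first word of a binomial name."""
    parts = species_name.strip().split()
    if not parts:
        raise ValueError("empty species name")
    return parts[0]


def annotate_by_genus(species_names, genus_table):
    """Map each species to the value recorded for its genus.

    Species whose genus is absent from `genus_table` are mapped to None,
    which mirrors leaving the assignment empty when the status is not
    known to be conserved at the genus level.
    """
    return {name: genus_table.get(genus_of(name)) for name in species_names}


# Illustrative table only; entries are placeholders for this sketch.
example_table = {"Quercus": "EcM", "Trifolium": "AM"}
example = annotate_by_genus(
    ["Quercus robur", "Trifolium repens", "Carex nigra"], example_table
)
```

Keeping the original-source value alongside the genus-level value, as GRooT does with its two mycorrhizal columns, lets users see where the two disagree.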

Data curation and quality control
We cross-checked references associated with each dataset to avoid data redundancy, which was mostly generated by: (a) a dataset being […]. In addition, we obtained plant taxonomic order from the "taxize" R package v.0.9.4 (Chamberlain & Szöcs, 2013).

TABLE 1 Root traits included in GRooT. For each trait, standardized units, number of species and number of species-by-site mean values are presented. Traits are categorized based on McCormack et al. (2017) and Freschet et al. (2020). See the Supporting Information (Table S1) for trait definitions.
(a) This information can be used to calculate theoretical root specific hydraulic conductance (Valenzuela-Estrada et al., 2009).
(b) This information needs to be interpreted with caution, because the included total root length can vary across studies.
(c) This trait can be measured via minirhizotrons or ingrowth cores, and both measurements lead to contrasting results.
(d) Lateral spread by clonal growth; although this trait is not categorized as a trait of the root system per se, it was included because of its influence on root growth (Klimešová & Bello, 2009).
(e) Qualitative microbial association traits, including mycorrhizal association type and nitrogen-fixing capacity, are included in GRooT (see Supporting Information Table S5).
(f) Mycorrhizal colonization intensity is based mostly on data for arbuscular mycorrhizal colonization.

We checked trait records to ensure that data were in standardized units. Potential mistakes were checked in the original sources, corrected when possible or excluded when values were unreasonable (e.g., negative values for nutrient concentrations or percentages > 100). We calculated the error risk as the number of mean standard deviations (across all species within trait) from the respective species mean (named: errorRisk), following the TRY protocol (TRY; Kattge et al., 2011, 2020), but not implemented across root traits in TRY. We reported the number of data entries used to calculate the error risk per species (named: ErrorRiskEntries), with error risk robustness increasing when based on multiple replicates (preferably > 10 data entries). Normal distribution was checked for each trait, and logarithmic transformations were used before calculating error risk scores when required. Large error risk scores can indicate potential measurement errors, but they can also reflect intraspecific variation. Thus, we did not use error risk scores to remove trait records from the database but provide them to be used at the users' discretion.
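One reading of the error risk definition can be sketched in Python as follows. For a single trait, the sketch computes the standard deviation within each species, averages these standard deviations across species, and expresses each record's distance from its species mean in units of that average. Details that the text does not specify are assumptions here: species with a single entry do not contribute to the mean standard deviation, the optional transformation is log10, and the function name and data layout are hypothetical. The GRooTExtraction R script remains the authoritative implementation.

```python
import math
from statistics import mean, stdev


def error_risk(values_by_species, log_transform=False):
    """Compute error risk scores for one trait.

    `values_by_species` maps a species name to its list of trait values.
    Returns a dict mapping each species to a list of tuples
    (original value, error risk, number of entries for that species).
    """
    # Optionally log-transform, as described for non-normal traits.
    data = {
        sp: [math.log10(v) for v in vals] if log_transform else list(vals)
        for sp, vals in values_by_species.items()
    }
    # Within-species SDs; a sample SD needs at least two entries (assumption).
    sds = [stdev(vals) for vals in data.values() if len(vals) >= 2]
    if not sds:
        raise ValueError("need at least one species with two or more entries")
    mean_sd = mean(sds)
    if mean_sd == 0:
        raise ValueError("mean standard deviation is zero")

    scores = {}
    for sp, vals in data.items():
        sp_mean = mean(vals)
        scores[sp] = [
            (raw, abs(x - sp_mean) / mean_sd, len(vals))
            for raw, x in zip(values_by_species[sp], vals)
        ]
    return scores
```

The third element of each tuple plays the role of ErrorRiskEntries: a score based on only two or three entries deserves less weight than one based on more than ten.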

Data use guidelines and data availability
GRooT contains two csv files and an R script. The first csv file, named GRooTFullVersion.csv, provides root trait data at the highest resolution available (either trait values from individual replicates or mean values per study), information to filter data by entities (Supporting Information Table S4), meta-data (Supporting Information Table S5) and error risk scores. The second file, named GRooTAggregateSpeciesVersion.csv, provides the mean, median and quantiles (.25 and .75) of species values. The R script, named GRooTExtraction, includes code to calculate error risk and the steps to calculate the mean, median and quantiles (.25 and .75) of species values. The code of the R script is customizable, including options to calculate mean values by excluding trait records based on the error risk, and to select data based on root entities (Supporting Information Table S4) or relevant covariables, such as root vitality (McCormack et al., 2015).

GRooT is publicly available but should be referenced by citing this paper. We suggest citing the original data sources that contributed a substantial proportion to the analysis. GRooT is located and will be maintained and updated in a GitHub repository (https://groot-database.github.io/GRooT/). We encourage users to report mistakes and suggestions to improve the database and to contribute data.

DESCRIPTION OF THE DATA

GRooT includes 38 root traits, with 38,276 species-by-site mean values based on 114,222 trait records (Table 1; Supporting Information Figures S2-S14). GRooT includes > 1,000 species with data on the following nine traits: root mass fraction, root carbon and nitrogen concentration, lateral spread, root mycorrhizal colonization intensity, mean root diameter, root tissue density, specific root length and maximum rooting depth. Data were collected in experimental microcosm studies (20.2 and 1.3% of species-by-site mean values from potted and hydroponic experiments, respectively) or field studies (71.4% of species-by-site mean values, including field observations and field or common garden experiments) or were unspecified (7.1% of species-by-site mean values). Root trait coverage from field studies varies geographically across the globe (Figure 1). Regions such as North America, Europe and Asia are well covered, whereas there are consistent gaps in other regions, such as Africa and South America. These geographical patterns are observed in terms of the number of species and the number of traits measured per site.

FIGURE 1 Maps depicting all georeferenced data from field studies included in the Global Root Trait (GRooT) Database. Circles indicate the range of species per site (e.g., one or two species, two to four species, successively) or traits per site (e.g., one or two traits, two to four traits, successively).

Phylogenetically, data in GRooT cover all major clades of vascular plants (i.e., pteridophytes, gymnosperms, basal angiosperms, monocots, magnoliids, basal eudicots, superrosids and superasterids; Figure 2a), with data for 254 families. However, phylogenetic gaps are observed for traits related to key categories, such as anatomy, architecture, dynamics and physiology. When accounting for the number of vascular species included in GRooT (n = 6,214 species across 254 families), the average number of traits per species within family ranges between two and 14, with an overall average of four traits for species across the phylogeny. When accounting for the number of vascular species accepted globally (based on The Plant List; n = 316,110 species across n = 442 families), the average number of traits per species within family ranges from zero to eight traits, with an overall average of less than one trait for species (Figure 2b).

FIGURE 2 Phylogenetic coverage of root traits in the Global Root Trait (GRooT) Database. Panel (a) shows the average distribution of root traits per species in GRooT across the phylogeny (n = 6,214 species across n = 254 families) and panel (b) shows GRooT phylogenetic coverage based on the accepted species by The Plant List (n = 316,110 species across n = 442 families). Tip and inner ring colour depict the mean number of traits per species in a family, with dark blue indicating families with a lower number of traits per species. The outer ring represents major clades of vascular plants, and the bars in this ring represent the family size (proportional to the logarithm base 10), based either on the number of species per family included in GRooT or on the number of accepted species per family globally (panels a and b, respectively).

DISCUSSION

GRooT is a uniquely important step towards the inclusion of root traits in large-scale comparative studies and global models by integrating expert knowledge, data mobilization, standardization, curation and open accessibility. In terms of geo-referenced data from field studies, GRooT has highest coverage in North America, Europe and Asia, especially for chemical and morphological traits, reflecting the capability for large-scale studies in these regions. In terms of phylogenetic coverage, data in GRooT include the major clades of vascular plants with, on average, four traits included per species. Thereby, phylogenetic coverage in GRooT provides the possibility of using the data in large-scale phylogenetic studies, such as analyses of trait conservatism (Averill et al., 2019; Valverde-Barrantes et al., 2017) or assessments of trait relationships and trade-offs across the phylogeny.

GRooT also helps to highlight the remaining barriers to integration of root trait data in global analyses. In particular, data for certain relevant but hard-to-measure root traits related to physiology, mechanical properties and root dynamics generally remain scarce (Supporting Information Table S2). Moreover, although GRooT contains global data with a wide geographical range, the species coverage in South America and Africa remains limited irrespective of trait type, reflecting overall biases in global ecological observations (Cornwell et al., 2019; Martin et al., 2012). Thus, targeted initiatives in these regions, such as that by Addo-Danso et al. (2019), are fundamental. Although GRooT includes ~6,500 species, initiatives to increase the representativeness of species for families with the highest species richness, such as Fabaceae, Fagaceae, Orchidaceae and Poaceae, are also required.

GRooT can be used for (but is not restricted to) studying macroecological and functional biogeography (Violle et al., 2014), assessing global belowground trait-environmental relationships as known from aboveground approaches (Bruelheide et al., 2018), and detecting fundamental ecological patterns, such as the root economic space (Bergmann et al., 2020) or trade-offs and coordination among organs in the plant economic spectrum (Freschet et al., 2010). Furthermore, GRooT facilitates the integration of root traits into studies of related scientific disciplines, such as soil science and agronomy (Martin & Isaac, 2018; Wood et al., 2015). The completion of this standardized, curated and publicly available database provides immediate benefit to the research community from ready-to-use data (Gallagher et al., 2020) and provides additional direction, helping experts to identify gaps that need to be filled to increase completeness of global root trait data.

[Table: columns are Trait, Units, Number of species, Number of species by site, Mean, Quantile (.25), Median and Quantile (.75)]
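The species-level aggregation that the R script GRooTExtraction is described as performing (mean, median and .25 and .75 quantiles of species values, optionally excluding records by error risk) can be sketched in Python as below. This is an illustration under stated assumptions and not a translation of the R script: the dictionary keys "species", "trait" and "value" are hypothetical, "errorRisk" follows the name given in the text, and the quantile interpolation rule is a choice that may differ from the one used for GRooTAggregateSpeciesVersion.csv.

```python
from collections import defaultdict
from statistics import mean, median, quantiles


def aggregate_species(records, max_error_risk=None):
    """Summarize trait values per (species, trait) pair.

    `records` is an iterable of dicts with the hypothetical keys
    'species', 'trait', 'value' and, optionally, 'errorRisk'.
    Records whose error risk exceeds `max_error_risk` are skipped;
    records without an error risk are always kept.
    """
    grouped = defaultdict(list)
    for rec in records:
        risk = rec.get("errorRisk")
        if max_error_risk is not None and risk is not None and risk > max_error_risk:
            continue
        grouped[(rec["species"], rec["trait"])].append(rec["value"])

    summary = {}
    for key, vals in grouped.items():
        if len(vals) >= 2:
            # Quartiles by linear interpolation between data points.
            q25, _, q75 = quantiles(vals, n=4, method="inclusive")
        else:
            # A single value is its own quartile.
            q25 = q75 = vals[0]
        summary[key] = {
            "n": len(vals),
            "mean": mean(vals),
            "q25": q25,
            "median": median(vals),
            "q75": q75,
        }
    return summary
```

Leaving `max_error_risk` unset reproduces the default position of the database, in which error risk scores are reported but no records are removed; any cut-off is the user's decision.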