Scientific floras can be reliable sources for some trait data in a system with poor coverage in global trait databases

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. Journal of Vegetation Science published by John Wiley & Sons Ltd on behalf of International Association for Vegetation Science. 1School of Geography, University of Nottingham, Nottingham, UK 2Biogeography & Biodiversity Lab, Institute of Physical Geography, Goethe University Frankfurt, Frankfurt, Germany 3Biodiversity, Macroecology & Biogeography, University of Goettingen, Göttingen, Germany 4Department of Biology, Lakehead University, Thunder Bay, ON, Canada 5Sport Ecology, Department of Sport Science & Bayreuth Center of Ecology and Environmental Research (BayCEER), Bayreuth, Germany 6Department of Biological Sciences, University of Bergen, Bergen, Norway 7Island Ecology and Biogeography Group, Instituto Universitario de Enfermedades Tropicales y Salud Pública de Canarias, Universidad de La Laguna, La Laguna, Spain


| INTRODUCTION
Functional trait-based approaches in ecological research have, in recent years, enhanced our understanding of biodiversity and how traits relate to ecosystem functioning. Functional traits are morphological, physiological or phenological features of organisms, measurable at the individual level, that impact individual performance and fitness (Violle et al., 2007). While the classification of species into functional groups has a long tradition (Raunkiaer, 1934; Weiher et al., 1999), the definition of a "trait" has shifted from a simple grouping towards a more quantitative categorisation, allowing more predictive science within ecology (McGill et al., 2006). Trait-based approaches are now widely used to answer research questions across a variety of topics including community ecology (Mouillot et al., 2013; Satdichanh et al., 2015), species diversity gradients (Lamanna et al., 2014; Whittaker et al., 2014; Si et al., 2017; Costa et al., 2018b), responses to environmental change (Bjorkman et al., 2018; Liu et al., 2018; Winchell et al., 2020), and niche dynamics (Reif et al., 2016; Costa et al., 2018b).

Functional traits have been particularly important in understanding the role of plant diversity in ecosystem functioning, and efforts have been made to identify trait-trait correlations and trade-offs to develop an economic spectrum for plant traits (Wright et al., 2004; Chave et al., 2009; Reich, 2014; Díaz et al., 2016; Kong et al., 2019; Shen et al., 2019). This, in turn, has aided the quantification of trait-environment relationships to understand how abiotic factors influence functional characteristics (Ordoñez et al., 2009; Bruelheide et al., 2018). Recognising the importance of plant functional traits in ecology has increased the demand for plant trait data (Kattge et al., 2020). However, acquiring such data is a challenge. The fundamental source of trait data is the direct measurement of plant individuals, either in the field or under experimental conditions. A major disadvantage of these direct methods of data collection is their intensiveness: they require a significant amount of time, equipment and money. Even if resources are abundant, accessibility to field sites can be difficult and field work can be disrupted. This can lead to biased data collection, whereby field sites that are easier to access, such as those at low elevations or near roads, are preferentially chosen. As a result, the data may be limited in geographic or taxonomic coverage. Furthermore, measuring traits in the field can be destructive: collecting leaf and stem samples can be detrimental to an individual's survival. This is important to consider when studying rare or endangered species, for which non-destructive methods should be preferred (if acquiring a collection permit is even possible).
An alternative source of trait information is data that have been sampled in the past and made available via global databases (Kleyer et al., 2008; Kattge et al., 2020). This has benefited trait-based research by making plant trait data accessible to more researchers, and it has allowed recent studies to examine plant trait variation across larger geographic and phylogenetic scales (e.g. Díaz et al., 2016; Bjorkman et al., 2018; Bruelheide et al., 2018). For plants, the TRY database is the largest collection of functional traits and holds trait records for almost 280,000 species (Kattge et al., 2020). Despite efforts to update and improve trait databases, they are still incomplete (Jetz et al., 2016) and large taxonomic and geographic gaps remain. These knowledge gaps are non-randomly distributed, such that some species and regions are underrepresented (Jetz et al., 2016; Cornwell et al., 2019). There are also biases towards certain traits and trait values. Easily measured traits are more likely to be reported than those that are difficult, or require more resources, to measure. In addition, bias towards higher or lower trait values has been found for frequently measured traits in the TRY database (Sandel et al., 2015), and certain trait values may go unreported (but see Scheffer et al., 2015).
Outside of these databases, a wealth of information about plant form and function exists in the literature that is yet to be digitised.
Information on plant species has been assembled and published in thousands of scientific floras (Floras hereafter) and taxonomic monographs for centuries. In fact, attempts to assemble botanical knowledge were made in ancient times and date as far back as AD 77 (see Pliny & Healey, 2004). Floras catalogue all known plant species in a given geographic region and represent some of the oldest collections of plant information in the botanical literature. They contain detailed taxonomic descriptions, keys, illustrations and sometimes distribution maps, geographical and ecological information that can be used for locating and identifying species (Frodin, 2001).
Trait values extracted from Floras have the potential to be used for ecological purposes (Whittaker et al., 2000; Hawkes, 2007; Kissling et al., 2008; Kissling et al., 2010), and there is a growing effort to mobilise and integrate them into global biodiversity databases (Weigelt et al., 2020). Data from Floras and checklists provide highly representative and complete data from large regions, which is beneficial to macroecological research, but this data type is currently underutilised compared to fine-scale, high-resolution data, such as site-specific trait measurements (König et al., 2019). Comparing data quality with systematically collected field data is necessary to understand how data from Floras can be successfully applied in trait-based research. Thus, the aim of our study is to compare trait data obtained via three different methods of collection: (a) Floras, where trait information is extracted from species descriptions and identification keys; (b) field work, where established quantitative plant traits are measured directly in the field, specific to the geographic location of interest; and (c) the TRY database, where a species list of the focal region is used to download data for the focal traits.
We use the islands of Tenerife and La Palma in the Canary Islands (Spain) as the study system, for which an up-to-date, comprehensive and modern Flora is available (Muer et al., 2016). Oceanic islands are an appropriate study system for trait-based research (Ottaviani et al., 2020). Island systems have the potential to answer fundamental questions in functional ecology (Patiño et al., 2017) but the use of trait-based research on islands remains underexploited (Ottaviani et al., 2020) and readily available trait data for island species are rare.
Leaves are at the core of plant functional ecology due to their role in carbon acquisition and transpiration, which influences biochemical cycling and ecosystem functioning (Press, 1999). Thus we specifically focus on two commonly used traits: leaf area and specific leaf area (SLA), for which precise measurements are not usually recorded in Floras. We estimate leaf area and SLA using simpler trait measurements recorded in Floras and evaluate how well these estimates reflect leaf area and SLA measured directly from specimens collected in the field. We expected that leaf area estimated using leaf length and leaf width would be strongly positively correlated with field-measured leaf area, and that SLA estimated using leaf thickness would be positively correlated with field-measured SLA. We also tested the ability of traits from Floras to predict field traits using independent data by using trait data from one island to predict trait values on another.

| Field data
We studied traits of native vascular plant species of the islands of Tenerife and La Palma, Canary Islands, Spain. The latest plant checklist of the Canary Islands classifies species into six categories: definitely native (either endemic or not), probably native, possibly native, probably introduced, introduced non-invasive and introduced invasive (Arechavaleta et al., 2009). We focused on species within the definitely native category only. Leaf traits were measured using standardised protocols for measurement of plant functional traits (Pérez-Harguindeguy et al., 2013): leaf area is the one-sided area of a fresh adult leaf, and SLA is the leaf area divided by its dry mass. We aimed to measure these traits for five adult individuals per species but, due to logistical constraints and the rarity of certain species, this was not always possible. When sampling more than one individual per species, we took samples from different locations across the islands where possible, to account for environmental variation in trait values. Species were sampled where botanical experts or the Flora indicated they were located. We collected between 10 and 100 adult leaves per individual, depending on the species: for most species we collected 10-20 leaves but for species with small leaves we collected up to 100 to accurately measure their mass.
Where possible, we sampled leaves that were not in the shade.
Leaves were cut from the stem and the petiole was removed. Up to 10 leaves were scanned per individual using an A4 scanner and leaf area was calculated for each leaf using WinFOLIA software (version: 2016b Pro; Regent Instruments Inc., Québec, Canada, 2016) for Tenerife specimens and ImageJ software (version 1.52a; Schneider et al., 2012) for La Palma specimens. We used the mean value for leaf area per species. The two software packages produced near-identical average values for leaf area per species (paired t-test: t = 1.32, df = 44, p = 0.19; Pearson's r = 0.99). The leaf samples were weighed, then oven-dried and weighed again to calculate both fresh mass and dry mass per leaf. For compound leaves, we kept the entire leaf intact for scanning. SLA was calculated by dividing the leaf area by its oven-dried mass (Pérez-Harguindeguy et al., 2013). We calculated leaf dry matter content (LDMC) of a single leaf by dividing the oven-dry mass by its fresh mass.
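The two ratios defined above (SLA as leaf area over oven-dry mass, LDMC as oven-dry mass over fresh mass) can be illustrated with a minimal Python sketch. The `LeafSample` record, its field names and the units are assumptions made for illustration only; they are not the data format used in the study, and the per-leaf averaging shown here is one possible way to summarise values per species.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class LeafSample:
    """One scanned leaf (hypothetical record layout, not the study's format)."""
    area_mm2: float       # one-sided area of the fresh leaf, from the scan
    fresh_mass_mg: float  # mass before oven-drying
    dry_mass_mg: float    # mass after oven-drying


def sla(leaf: LeafSample) -> float:
    """Specific leaf area: leaf area divided by oven-dry mass (mm^2 per mg)."""
    return leaf.area_mm2 / leaf.dry_mass_mg


def ldmc(leaf: LeafSample) -> float:
    """Leaf dry matter content: oven-dry mass divided by fresh mass."""
    return leaf.dry_mass_mg / leaf.fresh_mass_mg


def species_means(leaves: list) -> dict:
    """Average the per-leaf values to get one value per species (assumed summary)."""
    return {
        "leaf_area_mm2": mean(leaf.area_mm2 for leaf in leaves),
        "sla_mm2_per_mg": mean(sla(leaf) for leaf in leaves),
        "ldmc": mean(ldmc(leaf) for leaf in leaves),
    }
```

Calling `species_means` on a list of `LeafSample` records returns one mean leaf area, SLA and LDMC for that species, which is the level at which the field and Flora values are compared.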

| Flora data
We sourced plant trait data from the most recent and comprehensive guide to the Canarian flora (Muer et al., 2016). The information in the Flora is based on expert knowledge and contains species from all islands in the archipelago. These data were supplemented using other Floras to increase data coverage (Bramwell & Bramwell, 1974; Hohenester & Welß, 1993; Schönfelder & Schönfelder, 2018).
In some instances, we recorded data for subspecies when the trait values were known to differ between subspecies found on different islands. This ensured the field and Flora data matched as precisely as possible, according to our aim throughout: that the data we obtained would be those typically used in trait-based research using the data source in question. We extracted the following leaf traits: leaf length, leaf width and leaf thickness (information on SLA was not provided). Maximum and minimum values were often reported for these traits but we calculated and used the mean values. We used leaf length and leaf width to estimate leaf area using the following formula:
$$\mathrm{LA} = \pi \times \frac{\mathrm{LL}}{2} \times \frac{\mathrm{LW}}{2}$$
where LA = leaf area, LL = leaf length, LW = leaf width. This equation assumes elliptical-shaped leaves. SLA is normally calculated by dividing leaf area by its dry mass. Dry mass will depend on the volume and density of the leaf. In the absence of information on dry mass or leaf density, we cannot estimate SLA directly. However, it still may be possible to obtain a proxy for SLA in the absence of dry mass data if variation in volume has a greater influence. Given that leaf volume, LV = LA × Lth, where Lth is leaf thickness, then:
$$\mathrm{SLA} = \frac{\mathrm{LA}}{\mathrm{LV} \times \mathrm{LD}} = \frac{\mathrm{LA}}{\mathrm{LA} \times \mathrm{Lth} \times \mathrm{LD}} = \frac{1}{\mathrm{Lth} \times \mathrm{LD}}$$
where LD is leaf density (dry mass per unit volume; Poorter et al., 2009). Thus, assuming invariant LD across species, SLA will vary as a function of Lth:
$$\mathrm{SLA} \propto \frac{1}{\mathrm{Lth}}$$
Following this reasoning, we test whether SLA, measured in the field, can be estimated from the Lth values in the Flora. As a test of concept, we also test whether SLA varies with 1/Lth using only our field data. Lastly, leaf thickness has also been shown to correlate reasonably well with SLA × LDMC (Vile et al., 2005). We tested this by regressing leaf thickness from the Flora with SLA × LDMC as calculated from field data.
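The elliptical leaf-area approximation and the 1/Lth proxy for SLA described above can be sketched in a few lines of Python. The function names are hypothetical, and taking the midpoint of the reported minimum and maximum is an assumption about how a mean is obtained from a Flora range; the proxy is only proportional to SLA under the stated assumption of invariant leaf density, so it carries no meaningful units.

```python
import math


def midpoint(minimum: float, maximum: float) -> float:
    """Mean of a reported range, e.g. 'leaves 3-5 cm long' gives 4.0 (assumed summary)."""
    return (minimum + maximum) / 2.0


def ellipse_leaf_area(length: float, width: float) -> float:
    """Leaf area assuming an elliptical leaf: pi * (LL / 2) * (LW / 2)."""
    return math.pi * (length / 2.0) * (width / 2.0)


def sla_proxy(thickness: float) -> float:
    """Proxy proportional to SLA if leaf density is constant: 1 / Lth."""
    return 1.0 / thickness
```

For a species described with leaves 3-5 cm long and 1-3 cm wide, `ellipse_leaf_area(midpoint(3, 5), midpoint(1, 3))` gives the Flora-estimated leaf area in cm², which can then be compared with the field-measured value.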

| TRY data
Species names in TRY, our species list and the Flora were resolved using the Taxonomic Name Resolution Service (Boyle et al., 2013).
We used the resolved species list to download the following traits from the freely available data: leaf length, leaf width, leaf thickness, leaf area and SLA. To ensure consistency with field data, TRY data were filtered to include only measurements from living adult individuals in their natural environments.

| Statistical analyses
Simple linear regressions were carried out with field data as the dependent variable and Flora data as the independent variable. We removed Kunkeliella retamoides from the analysis: this species has tiny ephemeral leaves that are reduced to scales, making it difficult to define the functional equivalent of the leaf, which led to different definitions across data sources, and thus non-comparable values between field and Flora datasets. We regressed field-measured leaf area against Flora-estimated leaf area and field-measured SLA against Flora-estimated SLA. We also regressed field-measured leaf area and SLA against leaf length and leaf width obtained from the Flora to determine how well each measurement predicted leaf area and SLA by itself. Furthermore, to scrutinise our method of estimating SLA using Flora data, we regressed field-measured SLA against field-measured 1/Lth. We compared these models with a second set of models that included leaf type (simple vs compound) and leaf shape (broad-leaved vs needle-like) as interaction variables in order to determine if the regression slope differed between these groups (see Supporting Information). We also compared leaf thickness from field data and Flora data. All variables were log_e-transformed to improve the residuals of the regressions. In addition, we compared trait values obtained from the Flora with those from TRY using

| Data coverage
We measured traits for 451 definitely native species in the field. To maintain consistency among data sources, we focus primarily on definitely native species occurring on La Palma and Tenerife, as these were the species measured for the field data. However, for informative purposes, in Table 2 we also report Flora and TRY data for all species, including exotics, occurring across the entire Canary Island archipelago. We considered probably introduced, introduced non-invasive and introduced invasive as exotic species.
Field-measured SLA was not significantly correlated with estimated SLA for the overall dataset (r² = 0.11, p = 0.17, df = 16; Figure 2), nor for Tenerife only (r² = 0.20, p = 0.08, df = 14). We did not analyse La Palma separately because too few species from La Palma had trait values for both leaf thickness and SLA. No significant relationship was found between SLA and either leaf length or leaf width for Tenerife or La Palma (Table 3).
When testing this using only field data, we found the r² value to be extremely low (df = 382, r² = 0.07, p < 0.001; Appendix S1). The addition of leaf type and shape as interaction terms did not improve the regression model (r² = 0.08; Appendix S3; Appendices S7 and S8).

| Cross-island predictions
We used the linear regression models to predict leaf area outside the geographical range of input data (i.e. the other island), using Flora data. We then correlated these predicted values with the ob- from Tenerife models (r² = 0.85). Again, leaf width had a higher predictive power than leaf length (Table 4). For leaf area predictions on both La Palma and Tenerife, the slope and intercept were very close to, and not significantly different from, 1 and 0 respectively (i.e. the 1:1 line: Table 4; Figure 2). For leaf length, the slope differed significantly from 1 but the intercept did not differ from 0 for both islands.
For leaf width, the slope and intercept differed significantly from 1 and 0 for both islands.
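The cross-island check described in this section can be sketched as follows: fit a log-log regression on one island, predict the other island from its Flora values, then regress observed on predicted values and compare the slope and intercept with the 1:1 line. This is a minimal illustration using NumPy and synthetic numbers, not the analysis code or data of the study; all function names are hypothetical, and the one-sample t tests against 1 and 0 are omitted.

```python
import numpy as np


def fit_loglog(x, y):
    """Ordinary least squares of ln(y) on ln(x); returns (slope, intercept)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return float(slope), float(intercept)


def predict_ln(x, slope, intercept):
    """Predicted ln(y) for new x values from a fitted log-log model."""
    return intercept + slope * np.log(x)


def compare_with_one_to_one(observed_ln, predicted_ln):
    """Regress observed on predicted values (both on the ln scale).

    Returns r^2 plus the slope and intercept, which would be 1 and 0
    if the points fell exactly on the 1:1 line.
    """
    slope, intercept = np.polyfit(predicted_ln, observed_ln, deg=1)
    r = np.corrcoef(predicted_ln, observed_ln)[0, 1]
    return {"r2": float(r ** 2), "slope": float(slope), "intercept": float(intercept)}


def demo(seed=0, n=40):
    """Synthetic illustration: fit on 'island A', predict 'island B'."""
    rng = np.random.default_rng(seed)
    # Made-up Flora-estimated leaf areas and noisy 'field' values.
    flora_a = np.exp(rng.uniform(0.0, 5.0, n))
    flora_b = np.exp(rng.uniform(0.0, 5.0, n))
    field_a = flora_a * np.exp(rng.normal(0.0, 0.2, n))
    field_b = flora_b * np.exp(rng.normal(0.0, 0.2, n))

    slope, intercept = fit_loglog(flora_a, field_a)
    predicted_b = predict_ln(flora_b, slope, intercept)
    return compare_with_one_to_one(np.log(field_b), predicted_b)
```

Regressing observed on predicted values, rather than only reporting a correlation, separates how tightly the predictions track the observations (r²) from whether they are biased (slope and intercept departing from 1 and 0).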

| DISCUSSION
We have demonstrated that a combination of easily obtained leaf parameters (leaf length and leaf width) can be used to estimate leaf area as a non-destructive alternative to field sampling. Furthermore, we were able to successfully predict independent field-measured data on leaf area across islands in the Canaries, indicating that the reliability of Floras as sources of trait data may be transferable to new regions.
Our estimates of leaf area correlated strongly with field-measured leaf area on both La Palma and Tenerife despite assuming an elliptical shape. Other studies using leaf length and width to estimate leaf area have found similar results (Kraft et al., 2008; Pandey & Singh, 2011; Shi et al., 2019). Accounting for the differences in leaf type (simple vs compound) and leaf shape (broad-leaved vs needle-like) did not improve our models. In fact, we find that the species that diverge furthest from the 1:1 line are a mix of species with simple or compound leaves. Thus, the variation in leaf type and leaf shape does not necessarily correspond to variations of leaf area (leaf shape probably relates more closely to leaf perimeter). Therefore, the additional variance in leaf area due to leaf shape that is not accounted for in the model (e.g. from compound or severely lobed leaves) does not have a sufficient effect on leaf area to render a parsimonious model uninformative.
To evaluate the performance of the leaf area model, we used it to make predictions on a different island. The success of the predictions could be driven by the climatic overlap between islands as leaf area is linked to climate and microclimate (Byars et al., 2007; Peppe et al., 2011; Guerin et al., 2012; Sumida et al., 2018). Also, the phylogenetic relatedness within the Canary Island flora means that many species occurring on different islands belong to the same genera and are morphologically similar, such as Argyranthemum, which might contribute to the strong predictive ability. Nonetheless, despite considerable overlap, the climates of Tenerife and La Palma are different in some areas: La Palma receives the highest levels of precipitation in the archipelago due to the northeasterly trade winds, and is cooler and wetter than Tenerife in some places, whereas Tenerife, being taller, reaches lower temperatures than La Palma at its summits.
Also, although many of the closely related species are morphologically similar, some genera have radiated into species that are morphologically quite different (Jorgensen & Olesen, 2001). Therefore, despite both environmental and trait differentiation, the model predicts well across islands. Whether or not this can be translated beyond the Canary Island archipelago is a subject for further study.
Intraspecific trait differences could be present in native species occurring on both the islands and the continent and could potentially have an island-continental gradient.
Despite our expectation, and considering that SLA is a function of leaf thickness (Witkowski & Lamont, 1991; Pérez-Harguindeguy et al., 2013), we only found a weak and non-significant relationship between field-measured SLA and Flora-estimated SLA. Accounting for differences between leaf groups only slightly improved these estimations. Perhaps a more complex model is required: assuming a constant volume to mass ratio for leaves is simplistic, because plants invest more or less in structural elements based on their ecological strategies (Westoby et al., 2002). Therefore, accounting for different leaf strategies might reveal different relationships.
However, Vendramini et al. (2002) found a clear association between SLA and leaf thickness, but when accounting for leaf strategies (succulent, sclerophyllous and tender-leaved) this relationship disappeared. SLA is also a function of LDMC (Vile et al., 2005); thus, future research could examine how the relationship differs across LDMC values. Our attempt to estimate SLA using leaf thickness from available Flora data was unsuccessful. Leaf thickness seems to be scarcely reported in Floras, perhaps due to the difficulty of making precise measurements, resulting in little variation. Furthermore, it is possible that leaf thicknesses from Floras are obtained from dried herbarium specimens, which would not be comparable to measurements from fresh leaves. This might account for the unexplained variation in the relationship between
TABLE 3 Univariate linear regressions with field-measured traits as the response variables (LA field = field-measured leaf area, SLA field = field-measured specific leaf area, Lth field = field-measured leaf thickness) and Flora-measured traits as the explanatory variables (LA flora = Flora-estimated leaf area, LL flora = leaf length from Flora, LW flora = leaf width from Flora, SLA flora = Flora-estimated specific leaf area, Lth flora = leaf thickness from Flora). SLA field-est = SLA estimated using 1/Lth from field data.
Weigelt et al., 2020).
This provides a standardised way of digitising and presenting the data in Floras and checklists worldwide.
A promising avenue for future research would be to evaluate digitised herbarium specimens as a source of trait data. There are some clear advantages to using herbarium specimens to gather trait data, namely that the measurements are precise and the geographical and temporal origins of the specimens are known. However, this type of data may be biased if the most appealing specimens are preferentially collected, in which case they may not accurately represent a species mean for a given trait.

| Concluding remarks
We have demonstrated that Floras can provide some valuable data for the Canary Islands, whereas the TRY database currently cannot, a situation that we expect will affect other insular systems with high numbers of endemic species. This points towards a need for more field work to fill in gaps and reduce bias. However, due to the high cost and typically destructive nature of field sampling, it may not be feasible to sample rare and endangered species if we are to protect them. Thus, Floras remain an important resource in the emerging field of functional island biogeography, for which a lot of new data are required.

ACKNOWLEDGEMENTS
We would like to thank Félix Medina and Rüdiger Otto for their botanical expertise; Jana Alonso, Manuel Nogales, Nora Straßburger and Mercedes Vidal-Rodríguez for their assistance with lab and fieldwork; and Franziska Schrodt for her help with online data extraction and commenting on the final manuscript.
Table note: SE = standard error. All regressions were significant at p < 0.001. "Slope p" and "Intercept p" are p-values from one-sample t tests comparing slopes with 1 and intercepts with 0. All data were log_e-transformed.