• Open Access

TRY – a global database of plant traits


Jens Kattge, Max-Planck-Institute for Biogeochemistry, Hans-Knöll Straße 10, 07745 Jena, Germany, tel. +49 3641 576226, e-mail: jkattge@bgc-jena.mpg.de


Plant traits – the morphological, anatomical, physiological, biochemical and phenological characteristics of plants and their organs – determine how primary producers respond to environmental factors, affect other trophic levels, influence ecosystem processes and services and provide a link from species richness to ecosystem functional diversity. Trait data thus represent the raw material for a wide range of research from evolutionary biology, community and functional ecology to biogeography. Here we present the global database initiative named TRY, which has united a wide range of the plant trait research community worldwide and gained an unprecedented buy-in of trait data: so far 93 trait databases have been contributed. The data repository currently contains almost three million trait entries for 69 000 out of the world's 300 000 plant species, with a focus on 52 groups of traits characterizing the vegetative and regeneration stages of the plant life cycle, including growth, dispersal, establishment and persistence. A first data analysis shows that most plant traits are approximately log-normally distributed, with widely differing ranges of variation across traits. Most trait variation is between species (interspecific), but significant intraspecific variation is also documented, up to 40% of the overall variation. Plant functional types (PFTs), as commonly used in vegetation models, capture a substantial fraction of the observed variation – but for several traits most variation occurs within PFTs, up to 75% of the overall variation. In the context of vegetation models these traits would better be represented by state variables rather than fixed parameter values. 
The improved availability of plant trait data in the unified global database is expected to support a paradigm shift from species to trait-based ecology, offer new opportunities for synthetic plant trait research and enable a more realistic and empirically grounded representation of terrestrial vegetation in Earth system models.


Plant traits – morphological, anatomical, biochemical, physiological or phenological features measurable at the individual level (Violle et al., 2007) – reflect the outcome of evolutionary and community assembly processes responding to abiotic and biotic environmental constraints (Valladares et al., 2007). Traits and trait syndromes (consistent associations of plant traits) determine how primary producers respond to environmental factors, affect other trophic levels and influence ecosystem processes and services (Aerts & Chapin, 2000; Grime, 2001, 2006; Lavorel & Garnier, 2002; Díaz et al., 2004; Garnier & Navas, 2011). In addition, they provide a link from species richness to functional diversity in ecosystems (Díaz et al., 2007). A focus on traits and trait syndromes therefore provides a promising basis for a more quantitative and predictive ecology and global change science (McGill et al., 2006; Westoby & Wright, 2006).

Plant trait data have been used in studies ranging from comparative plant ecology (Grime, 1974; Givnish, 1988; Peat & Fitter, 1994; Grime et al., 1997) and functional ecology (Grime, 1977; Reich et al., 1997; Wright et al., 2004) to community ecology (Shipley et al., 2006; Kraft et al., 2008), trait evolution (Moles et al., 2005a), phylogeny reconstruction (Lens et al., 2007), metabolic scaling theory (Enquist et al., 2007), palaeobiology (Royer et al., 2007), biogeochemistry (Garnier et al., 2004; Cornwell et al., 2008), disturbance ecology (Wirth, 2005; Paula & Pausas, 2008), plant migration and invasion ecology (Schurr et al., 2005), conservation biology (Ozinga et al., 2009; Römermann et al., 2009) and plant geography (Swenson & Weiser, 2010). Access to trait data for a large number of species allows testing levels of phylogenetic conservatism, a promising principle in ecology and evolutionary biology (Wiens et al., 2010). Plant trait data have been used for the estimation of parameter values in vegetation models, but only in a few cases based on systematic analyses of trait spectra (White et al., 2000; Kattge et al., 2009; Wirth & Lichstein, 2009; Ziehn et al., 2011). Recently, plant trait data have been used for the validation of a global vegetation model as well (Zaehle & Friend, 2010).

While there have been initiatives to compile datasets at regional scale for a range of traits [e.g. LEDA (Life History Traits of the Northwest European Flora: http://www.leda-traitbase.org), BiolFlor (Trait Database of the German Flora: http://www.ufz.de/biolflor), EcoFlora (The Ecological Flora of the British Isles: http://www.ecoflora.co.uk), BROT (Plant Trait Database for Mediterranean Basin Species: http://www.uv.es/jgpausas/brot.htm)] or at global scale focusing on a small number of traits [e.g. GlopNet (Global Plant Trait Network: http://www.bio.mq.edu.au/~iwright/glopian.htm), SID (Seed Information Database: http://data.kew.org/sid/)], a unified initiative to compile data for a large set of relevant plant traits at the global scale was lacking. As a consequence, studies on trait variation have so far either focussed on the local to regional scale, including a range of different traits (e.g. Baraloto et al., 2010), or, at the global scale, been restricted to individual aspects of plant functioning, e.g. the leaf economic spectrum (Wright et al., 2004), the evolution of seed mass (Moles et al., 2005a, b) or the characterization of the wood economic spectrum (Chave et al., 2009). Only a few analyses at the global scale have combined traits from different functional aspects, and then only for a limited number of plant species (e.g. Díaz et al., 2004).

In 2007, the TRY initiative (TRY – not an acronym, rather an expression of sentiment: http://www.try-db.org) started compiling plant trait data from the different aspects of plant functioning at the global scale to make the data available in a consistent format through one single portal. Based on broad acceptance in the plant trait community (so far 93 trait databases have been contributed, Table 1), TRY has accomplished an unprecedented coverage of trait data and is now working towards a communal global repository for plant trait data. The new database initiative is expected to contribute to a more realistic and empirically based representation of plant functional diversity at the global scale, supporting the assessment and modelling of climate change impacts on biogeochemical fluxes and terrestrial biodiversity (McMahon et al., 2011).

Table 1.  Databases currently contributing to the TRY database
No. | Name of the Database | Contact person(s) | Reference(s)
  1. Databases are grouped according to whether they are at a final stage or still under continuous development, and whether or not they are publicly available as an electronic resource on the Internet. Databases that are already integrated databases, pooling a range of original databases (e.g. LEDA, GLOPNET), are marked by asterisks (*). Contributions are sorted alphabetically by principal contact person. A database can consist of several datasets (268 individual files have currently been imported to the TRY database). Most of the nonpublic databases contain unpublished as well as published data.

Databases public, maintained on the Internet
 1 | Seed Information Database (SID)* | J. Dickie, K. Liu | Royal Botanic Gardens Kew Seed Information Database (SID), (2008)
 2 | Ecological Flora of the British Isles* | A. Fitter, H. Ford | Fitter & Peat (1994)
 3 | VegClass CBM Global Database | A. Gillison | Gillison & Carpenter (1997)
 4 | PLANTSdata* | W. A. Green | Green (2009)
 5 | The LEDA Traitbase* | M. Kleyer | Kleyer et al. (2008)
 6 | BiolFlor Database* | I. Kühn, S. Klotz | Klotz et al. (2002), Kühn et al. (2004)
 7 | BROT plant trait database* | J. G. Pausas, S. Paula | Paula & Pausas (2009), Paula et al. (2009)
Databases public, fixed
 8 | Tropical Respiration Database | J. Q. Chambers | Chambers et al. (2004, 2009)
 9 | ArtDeco Database* | W. K. Cornwell, J. H. C. Cornelissen | Cornwell et al. (2008)
 10 | The Americas N&P database | B. J. Enquist, A. J. Kerkhoff | Kerkhoff et al. (2006)
 11 | ECOCRAFT | B. E. Medlyn | Medlyn & Jarvis (1999), Medlyn et al. (1999, 2001)
 12 | Tree Tolerance Database* | Ü. Niinemets | Niinemets & Valladares (2006)
 13 | Leaf Biomechanics Database* | Y. Onoda | Onoda et al. (2011)
 14 | BIOPOP: Functional Traits for Nature Conservation* | P. Poschlod | Poschlod et al. (2003)
 15 | BIOME-BGC Parameterization Database* | M. White, P. Thornton | White et al. (2000)
 16 | GLOPNET – Global Plant Trait Network Database* | I. J. Wright, P. B. Reich | Wright et al. (2004, 2006)
 17 | Global Wood Density Database* | A. E. Zanne, J. Chave | Chave et al. (2009), Zanne et al. (2009)
Databases not-public, fixed in the majority of cases
 18 | Plant Traits in Pollution Gradients Database | M. Anand | Unpublished data
 19 | Plant Physiology Database | O. Atkin | Atkin et al. (1997, 1999), Loveys et al. (2003), Campbell et al. (2007)
 20 | European Mountain Meadows Plant Traits Database | M. Bahn | Bahn et al. (1999), Wohlfahrt et al. (1999)
 21 | Photosynthesis Traits Database | D. Baldocchi | Wilson et al. (2000), Xu & Baldocchi (2003)
 22 | Photosynthesis and Leaf Characteristics Database | B. Blonder, B. Enquist | Unpublished data
 23 | Wetland Dunes Plant Traits Database | P. M. van Bodegom | Bakker et al. (2005, 2006), van Bodegom et al. (2005, 2008)
 24 | Ukraine Wetlands Plant Traits Database | P. M. van Bodegom | Unpublished data
 25 | Plants Categorical Traits Database | P. M. van Bodegom | Unpublished data
 26 | South African Woody Plants Trait Database (ZLTP) | W. J. Bond, M. Waldram | Unpublished data
 27 | Australian Fire Ecology Database* | R. Bradstock | Unpublished data
 28 | Cedar Creek Plant Physiology Database | D. E. Bunker, S. Naeem | Unpublished data
 29 | Floridian Leaf Traits Database | J. Cavender-Bares | Cavender-Bares et al. (2006)
 30 | Tundra Plant Traits Databases | F. S. Chapin III | Unpublished data
 31 | Global Woody N&P Database* | G. Esser, M. Clüsener-Godt | Clüsener-Godt (1989)
 32 | Abisko & Sheffield Database | J. H. C. Cornelissen | Cornelissen (1996), Cornelissen et al. (1996, 1997, 1999, 2001, 2003a, 2004), Castro-Diez et al. (1998, 2000), Quested et al. (2003)
 33 | Jasper Ridge Californian Woody Plants Database | W. K. Cornwell, D. D. Ackerly | Cornwell et al. (2006), Preston et al. (2006), Ackerly & Cornwell (2007), Cornwell & Ackerly (2009)
 34 | Roots Of the World (ROW) Database | J. M. Craine | Craine et al. (2005)
 35 | Global 15N Database | J. M. Craine | Craine et al. (2009)
 36 | CORDOBASE | S. Díaz | Díaz et al. (2004)
 37 | Sheffield-Iran-Spain Database* | S. Díaz | Díaz et al. (2004)
 38 | Chinese Leaf Traits Database | J. Fang | Han et al. (2005), He et al. (2006, 2008)
 39 | Costa Rica Rainforest Trees Database | B. Finegan, B. Salgado | Unpublished data
 40 | Plant Categorical Traits Database | O. Flores | Unpublished data
 41 | Subarctic Plant Species Trait Database | G. T. Freschet, J. H. C. Cornelissen | Freschet et al. (2010a, b)
 42 | Climbing Plants Trait Database | R. V. Gallagher | Gallagher et al. (2011)
 43 | The VISTA Plant Trait Database | E. Garnier, S. Lavorel | Garnier et al. (2007), Pakeman et al. (2008, 2009), Fortunel et al. (2009)
 44 | VirtualForests Trait Database | A. G. Gutiérrez | Gutiérrez (2010)
 45 | Dispersal Traits Database | S. Higgins | Unpublished data
 46 | Herbaceous Traits from the Öland Island Database | T. Hickler | Hickler (1999)
 47 | Global Wood Anatomy Database | S. Jansen, F. Lens | Unpublished data
 48 | Global Leaf Element Composition Database | S. Jansen | Watanabe et al. (2007)
 49 | Leaf Physiology Database* | J. Kattge, C. Wirth | Kattge et al. (2009)
 50 | KEW African Plant Traits Database | D. Kirkup | Kirkup et al. (2005)
 51 | Photosynthesis Traits Database | K. Kramer | Unpublished data
 52 | Traits of Bornean Trees Database | H. Kurokawa | Kurokawa & Nakashizuka (2008)
 53 | Ponderosa Pine Forest Database | D. Laughlin | Laughlin et al. (2010)
 54 | New South Wales Plant Traits Database | M. Leishman | Unpublished data
 55 | The RAINFOR Plant Trait Database | J. Lloyd, N. M. Fyllas | Baker et al. (2009), Fyllas et al. (2009), Patiño et al. (2009)
 56 | French Grassland Trait Database | F. Louault, J.-F. Soussana | Louault et al. (2005)
 57 | The DIRECT Plant Trait Database | P. Manning | Unpublished data
 58 | Leaf Chemical Defense Database | T. Massad | Unpublished data
 59 | Panama Leaf Traits Database | J. Messier | Messier et al. (2010)
 60 | Global Seed Mass Database* | A. T. Moles | Moles et al. (2004, 2005a, b)
 61 | Global Plant Height Database* | A. T. Moles | Moles et al. (2004)
 62 | Global Leaf Robustness and Physiology Database | Ü. Niinemets | Niinemets (1999, 2001)
 63 | The Netherlands Plant Traits Database | J. Ordoñez, P. M. van Bodegom | Ordonez et al. (2010a, b)
 64 | The Netherlands Plant Height Database | W. A. Ozinga | Unpublished data
 65 | Hawaiian Leaf Traits Database | J. Peñuelas, Ü. Niinemets | Peñuelas et al. (2010a, b)
 66 | Catalonian Mediterranean Forest Trait Database | J. Peñuelas, R. Ogaya | Ogaya & Peñuelas (2003, 2006, 2007, 2008), Sardans et al. (2008a, b)
 67 | Catalonian Mediterranean Shrubland Trait Database | J. Peñuelas, M. Estiarte | Peñuelas et al. (2007), Prieto et al. (2009)
 68 | ECOQUA South American Plant Traits Database | V. Pillar, S. Müller | Pillar & Sosinski (2003), Overbeck (2005), Blanco et al. (2007), Duarte et al. (2007), Müller et al. (2007), Overbeck & Pfadenhauer (2007)
 69 | The Tansley Review LMA Database* | H. Poorter | Poorter et al. (2009)
 70 | Categorical Plant Traits Database | H. Poorter | Unpublished data
 71 | Tropical Rainforest Traits Database | L. Poorter | Poorter & Bongers (2006), Poorter (2009)
 72 | Frost Hardiness Database* | A. Rammig | Unpublished data
 73 | Reich-Oleksyn Global Leaf N, P Database | P. B. Reich, J. Oleksyn | Reich et al. (2009)
 74 | Global A, N, P, SLA Database | P. B. Reich | Reich et al. (2009)
 75 | Cedar Creek Savanna SLA, C, N Database | P. B. Reich | Willis et al. (2010)
 76 | Global Respiration Database | P. B. Reich | Reich et al. (2008)
 77 | Leaf and Whole-Plant Traits Database: Hydraulic and Gas Exchange Physiology, Anatomy, Venation Structure, Nutrient Composition, Growth and Biomass Allocation | L. Sack | Sack et al. (2003, 2005, 2006), Sack (2004), Nakahashi et al. (2005), Sack & Frole (2006), Cavender-Bares et al. (2007), Choat et al. (2007), Cornwell et al. (2007), Martin et al. (2007), Coomes et al. (2008), Hoof et al. (2008), Quero et al. (2008), Scoffoni et al. (2008), Dunbar-Co et al. (2009), Hao et al. (2010), Waite & Sack (2010), Markesteijn et al. (2011)
 78 | Tropical Traits from West Java Database | S. Shiodera | Shiodera et al. (2008)
 79 | Leaf And Whole Plant Traits Database | B. Shipley | Shipley (1989, 1995), Shipley & Meziane (2002), Shipley & Parent (1991), McKenna & Shipley (1999), Meziane & Shipley (1999a, b, 2001), Pyankov et al. (1999), Shipley & Lechowicz (2000), Shipley & Vu (2002), Vile (2005), Kazakou et al. (2006), Vile et al. (2006)
 80 | Herbaceous Leaf Traits Database Old Field New York | A. Siefert | Unpublished data
 81 | FAPESP Brazil Rain Forest Database | E. Sosinski, C. Joly | Unpublished data
 82 | Caucasus Plant Traits Database | N. A. Soudzilovskaia, V. G. Onipchenko, J. H. C. Cornelissen | Unpublished data
 83 | Tropical Plant Traits From Borneo Database | E. Swaine | Swaine (2007)
 84 | Plant Habit Database* | C. Violle, B. H. Dobrin, B. J. Enquist | Unpublished data
 85 | Midwestern and Southern US Herbaceous Species Trait Database | E. Weiher | Unpublished data
 86 | The Functional Ecology of Trees (FET) Database – Jena* | C. Wirth, J. Kattge | Wirth & Lichstein (2009)
 87 | Fonseca/Wright New South Wales Database | I. J. Wright | Fonseca et al. (2000), McDonald et al. (2003)
 88 | Neotropic Plant Traits Database | I. J. Wright | Wright et al. (2007)
 89 | Overton/Wright New Zealand Database | I. J. Wright | Unpublished data
 90 | Categorical Plant Traits Database | I. J. Wright | Unpublished data
 91 | Panama Plant Traits Database | S. J. Wright | Wright et al. (2010)
 92 | Quercus Leaf C&N Database | B. Yguel | Unpublished data
 93 | Global Vessel Anatomy Database* | A. E. Zanne, D. Coomes | Unpublished data

For several traits the data coverage in the TRY database is sufficient to quantify the relative amount of intra- and interspecific variation, as well as variation within and between different functional groups. Thus, the dataset allows us to examine two basic tenets of comparative ecology and vegetation modelling which, owing to lack of data, had not been quantified so far:

  1. On the global scale, the aggregation of plant trait data at the species level captures the majority of trait variation. This central assumption of plant comparative ecology implies that, while there is variation within species, this variation is smaller than the differences between species (Garnier et al., 2001; Keddy et al., 2002; Westoby et al., 2002; Shipley, 2007). This is the basic assumption for using average trait values of species to calculate indices of functional diversity (Petchey & Gaston, 2006; de Bello et al., 2010; Schleuter et al., 2010), to identify ecologically important dimensions of trait variation (Westoby, 1998) or to determine the spatial variation of plant traits (Swenson & Enquist, 2007; Swenson & Weiser, 2010).
  2. On the global scale, basic plant functional classifications capture a sufficiently important fraction of trait variation to represent functional diversity. This assumption is implicit in today's dynamic global vegetation models (DGVMs), used to assess the response of ecosystem processes and composition to CO2 and climate changes. Owing to computational constraints and lack of detailed information these models have been developed to represent the functional diversity of >300 000 documented plant species on Earth with a small number (5–20) of basic plant functional types (PFTs, e.g. Woodward & Cramer, 1996; Sitch et al., 2003). This approach has been successful so far, but limits are becoming obvious and challenge the use of such models in a prognostic mode, e.g. in the context of Earth system models (Lavorel et al., 2008; McMahon et al., 2011).
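Both tenets amount to asking how much of the total trait variation lies between groups (species or PFTs) rather than within them. A minimal sketch of such a partitioning is given here, assuming a simple sum-of-squares decomposition on log10-transformed values; the function name `partition_variation` and the example values are hypothetical, and this is not necessarily the exact procedure used in the analysis.

```python
import math
from collections import defaultdict


def partition_variation(entries):
    """Split log10-scale trait variation into between- and within-group parts.

    entries: iterable of (group, value) pairs with value > 0, where group is
    a species name or a PFT label.
    Returns (between_fraction, within_fraction) of the total sum of squares.
    """
    logs_by_group = defaultdict(list)
    for group, value in entries:
        logs_by_group[group].append(math.log10(value))

    all_logs = [x for logs in logs_by_group.values() for x in logs]
    grand_mean = sum(all_logs) / len(all_logs)
    ss_total = sum((x - grand_mean) ** 2 for x in all_logs)

    # Within-group sum of squares: deviations from each group's own mean.
    ss_within = 0.0
    for logs in logs_by_group.values():
        group_mean = sum(logs) / len(logs)
        ss_within += sum((x - group_mean) ** 2 for x in logs)

    ss_between = ss_total - ss_within
    return ss_between / ss_total, ss_within / ss_total


# Hypothetical trait entries for three species (illustrative values only).
example = [
    ("species A", 12.0), ("species A", 15.0), ("species A", 11.0),
    ("species B", 25.0), ("species B", 30.0),
    ("species C", 6.0), ("species C", 7.5), ("species C", 5.0),
]
between, within = partition_variation(example)
print(f"between species: {between:.2f}, within species: {within:.2f}")
```

Passing PFT labels instead of species names as the grouping key gives the corresponding partitioning for the second tenet.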

This article first introduces the TRY initiative and presents a summary of data coverage with respect to different traits and regions. For a range of traits, we characterize general statistical properties of the trait density distributions, a prerequisite for statistical analyses, and provide mean values and ranges of variation. For 10 traits that are central to leading dimensions of plant strategy, we then quantify trait variation with respect to species and PFT and thus examine the two tenets mentioned above. Finally, we demonstrate how trait variation within PFT is currently represented in the context of global vegetation models.

Material and methods

Types of data compiled

The TRY data compilation focuses on 52 groups of traits characterizing the vegetative and regeneration stages of the plant life cycle, including growth, reproduction, dispersal, establishment and persistence (Table 2). These groups of traits were collectively agreed to be the most relevant for plant life-history strategies, vegetation modelling and global change responses on the basis of existing shortlists (Grime et al., 1997; Weiher et al., 1999; Lavorel & Garnier, 2002; Cornelissen et al., 2003b; Díaz et al., 2004; Kleyer et al., 2008) and wide consultation with vegetation modellers and plant ecologists. They include plant traits sensu stricto, but also ‘performances’ (sensu Violle et al., 2007), such as drought tolerance or phenology.

Table 2.  Summary of data coverage in the TRY data repository (March 31, 2011) for the 52 groups of focus traits and one group lumping all other traits (53)
Group of traits | Traits per group | Datasets | Species | Entries | Geo-referenced | Location | Soil
  • * Qualitative traits assumed to have low variability within species.

  • Traits that address one plant characteristic but are expressed differently are summarized in groups, e.g. the group ‘leaf nitrogen content’ consists of the three traits: leaf nitrogen content per dry mass, leaf nitrogen content per area and nitrogen content per leaf. In the case of respiration, the database contains 105 related traits: different organs, different reference values (e.g. dry mass, area, volume, nitrogen) or characterizing the temperature dependence of respiration (e.g. Q10). Specific information for each trait is available on the TRY website (http://www.try-db.org). Datasets: number of contributed datasets; Species: number of species characterised by at least one trait entry; Entries: number of trait entries; Geo-referenced, Location, Soil: number of trait entries geo-referenced by coordinates, or with information about location or soil, respectively.

  • Bold: qualitative traits standardized and made publicly available on the TRY website.

1Plant growth form*76239 715130 52745 68348 35519 630
2Plant life form*19787064 94955 47658 57553 008
3Plant resprouting capacity*47324852194103192462
4Plant height156318 071105 42243 35150 15434 325
5Plant longevity423819818 844370923365109
6Plant age of reproductive maturity33150620240240
7Plant architectural relationships724310 227356 188340 540340 390332 608
8Plant crown size482764180145084633
9Plant surface roughness113131000
10Plant tolerance to stress4014827562 362877128633 799
11Plant phenology1016763026 765290088166868
12Leaf type*11533 51949 668626144902511
13Leaf compoundness*11534 52350 50213 49513 558230
14Leaf photosynthetic pathway*12931 64140 807630544425495
15Leaf phenology type*13515 51265 53636 57937 88824 900
16Leaf size176716 877205 165158 066138 10574 424
17Leaf longevity4181080195317051515551
18Leaf angle26469341 88241 84841 80539 820
19Leaf number per unit shoot length14413510 751134020071265
20Leaf anatomy4110107626 64924 01423 9500
21Leaf cell size14631011963394620
22Leaf mechanical resistance717420611 64556086295227
23Leaf absorbance141373630061
24Specific leaf area (SLA)1389875187 06463 73053 83018 149
25Leaf dry matter content535309833 77726 12519 7676919
26Leaf carbon content332302818 88715 29511 9387857
27Leaf nitrogen content462712258 06443 41741 84425 857
28Leaf phosphorus content235487026 06519 02221 0957390
29Tissue carbon content (other plant organs)19186594273272620401093
30Tissue nitrogen content (other plant organs)5540484832 43824 59822 31721 904
31Tissue phosphorus content (other plant organs)1618376317 05810 11512 5192445
32Tissue chemical composition (apart from C,N,P)13628503184 74326 27274 07625 152
33Photosynthesis4934204919 7939446998011 127
34Stomatal conductance762391811 811438664094729
35Respiration1051863314 898642312 5193621
36Litter decomposability28972217220131568968
37Pollination mode*110421116 571780853299
38Dispersal mode*619972843 50254106357341
39Seed germination stimulation*67340770741122064437
40Seed size173026 839158 88113 22567803755
41Seed longevity35186211 4663973
42Seed morphology592326381156712530
43Stem bark thickness13521831831830
44Wood porosity*1152217059000
45Woodiness*12344 38574 89124 95726 23719 609
46Wood anatomy77138506252 07212624965
47Wood density103411 90743 87119 42231 5223121
48Modifications for storage*47409010 410405240543747
49Mycorrhiza type*15245314 93510 48110 50010 481
50Nitrogen fixation capacity*32210 64236 02318 66316 82617 627
51Rooting depth15613629451453280
52Defence/allelopathy/palatability1512333313 3882489266310 936
 Additional traits25713235 286496 383123 068135 052179 577
 Sum1146268 (total)69 296 (total)2 884 8201 267 5131 318 5801 029 715

Quantitative traits vary within species as a consequence of genetic variation (among genotypes within a population/species) and phenotypic plasticity. Ancillary information is necessary to understand and quantify this variation. The TRY dataset contains information about the location (e.g. geographical coordinates, soil characteristics), environmental conditions during plant growth (e.g. climate of natural environment or experimental treatment), and information about measurement methods and conditions (e.g. temperature during respiration or photosynthesis measurements). Ancillary data also include primary references.

Individual measurements are compiled in the database by preference, e.g. single respiration measurements or the wood density of a specific individual tree. The dataset therefore includes multiple measurements for the same trait, species and site. For some traits, e.g. leaf longevity, such data are only rarely available for single individuals (e.g. Reich et al., 2004), and data are expressed per species per site instead. Different measurements on the same plant (or organ) are linked to form observations that are hierarchically nested. The database structure ensures that (1) the direct relationship between traits and ancillary data, and between different traits that have been measured on the same plant (or organ), is maintained and (2) conditions (e.g. at the stand level) can be associated with the individual measurements (Kattge et al., 2010). The structure is consistent with the Extensible Observation Ontology (OBOE; Madin et al., 2008), which has been proposed as a general basis for the integration of different data streams in ecology.
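The nesting of observations described above can be pictured with a small data structure. The following is only an illustrative sketch: the class names `Measurement` and `Observation`, their fields and the example values are assumptions made for this example and do not reproduce the actual TRY schema or the OBOE ontology.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Measurement:
    name: str              # trait or ancillary variable (hypothetical labels)
    value: float
    unit: str
    is_trait: bool = True  # False for ancillary data such as temperature


@dataclass
class Observation:
    entity: str                                   # e.g. "stand", "tree", "leaf"
    measurements: list = field(default_factory=list)
    parent: Optional["Observation"] = None        # enclosing observation, if any

    def context(self):
        """Collect ancillary data from this observation and all its ancestors."""
        result = {}
        node = self
        while node is not None:
            for m in node.measurements:
                if not m.is_trait:
                    # The closest level wins if a variable appears twice.
                    result.setdefault(m.name, (m.value, m.unit))
            node = node.parent
        return result


# Illustrative values only: a leaf measured on a tree growing in a stand.
stand = Observation("stand", [Measurement("mean annual temperature", 9.5, "degC", is_trait=False)])
tree = Observation("tree", [Measurement("wood density", 0.6, "g cm-3")], parent=stand)
leaf = Observation(
    "leaf",
    [
        Measurement("respiration", 1.2, "umol m-2 s-1"),
        Measurement("measurement temperature", 25.0, "degC", is_trait=False),
    ],
    parent=tree,
)
print(leaf.context())
```

Traits measured on the same organ stay together in one observation, while stand-level conditions are reached through the parent links, which corresponds to points (1) and (2) above.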

The TRY dataset combines several preexisting databases based on a wide range of primary data sources, which include trait data from plants grown in natural environments and under experimental conditions, obtained by a range of scientists with different methods. Trait variation in the TRY dataset therefore reflects natural and potential variation on the basis of individual measurements at the level of single organs, and variation due to different measurement methods and measurement error (random and bias).

Data treatment in the context of the TRY database

The TRY database has been developed as a Data Warehouse (Fig. 1) to combine data from different sources and make them available for analyses in a consistent format (Kattge et al., 2010). The Data Warehouse provides routines for data extraction, import, cleaning and export. Original species names are complemented by taxonomically accepted names, based on a checklist developed by IPNI (The International Plant Names Index: http://www.ipni.org) and TROPICOS (Missouri Botanical Garden: http://www.tropicos.org), which had been made publicly available on the TaxonScrubber website by the SALVIAS (Synthesis and Analysis of Local Vegetation Inventories Across Sites: http://www.salvias.net) initiative (Boyle, 2006). Trait entries and ancillary data are standardized and errors are corrected after consent from data contributors. Finally, outliers and duplicate trait entries are identified and marked (for method of outlier detection, see Appendix S1). The cleaned and complemented data are moved to the data repository, whence they are released on request.
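Two of the cleaning steps, complementing original names with accepted names and marking rather than deleting duplicates, can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the record fields, the placeholder species names and the duplicate criterion (same accepted species, trait, value and site) are all hypothetical, and the outlier detection of Appendix S1 is not reproduced.

```python
def standardize_and_flag(entries, accepted_names):
    """Add accepted species names and mark duplicate trait entries.

    entries: list of dicts with keys "species", "trait", "value" and
             optionally "site" (hypothetical record layout).
    accepted_names: dict mapping original names to accepted names.
    Original names are kept; entries are flagged, never removed.
    """
    seen = set()
    cleaned = []
    for entry in entries:
        original = entry["species"]
        accepted = accepted_names.get(original, original)
        # Assumed duplicate criterion, chosen for illustration only.
        key = (accepted, entry["trait"], entry["value"], entry.get("site"))
        cleaned.append({**entry, "accepted_species": accepted, "duplicate": key in seen})
        seen.add(key)
    return cleaned


# Placeholder names and values, not real taxonomy or data.
checklist = {"Genus oldname": "Genus newname"}
raw = [
    {"species": "Genus oldname", "trait": "SLA", "value": 12.0, "site": "A"},
    {"species": "Genus newname", "trait": "SLA", "value": 12.0, "site": "A"},
    {"species": "Genus newname", "trait": "SLA", "value": 14.5, "site": "A"},
]
for row in standardize_and_flag(raw, checklist):
    print(row["species"], "->", row["accepted_species"], "| duplicate:", row["duplicate"])
```

Flagging instead of deleting keeps the original contribution intact, so that an analysis can decide for itself whether to exclude the marked entries.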

Figure 1.

     The TRY process of data sharing. Researcher C contributes plant trait data to TRY (1) and becomes a member of the TRY consortium (2). The data are transferred to the Staging Area, where they are extracted and imported, dimensionally and taxonomically cleaned, checked for consistency against all other similar trait entries and complemented with covariates from external databases [3; Tax, taxonomic databases, IPNI/TROPICOS accessed via TaxonScrubber (Boyle, 2006); Clim, climate databases, e.g. CRU; Geo, geographic databases]. Cleaned and complemented data are transferred to the Data Repository (4). If researcher C wants to retain full ownership, the data are labelled accordingly. Otherwise they obtain the status ‘freely available within TRY’. Researcher C can request her/his own data – now cleaned and complemented – at any time (5). If she/he has contributed a minimum amount of data (currently >500 entries), she/he automatically is entitled to request data other than her/his own from TRY. In order to receive data she/he has to submit a short proposal explaining the project rationale and the data requirements to the TRY steering committee (6). Upon acceptance (7) the proposal is published on the Intranet of the TRY website (title on the public domain) and the data management automatically identifies the potential data contributors affected by the request. Researcher C then contacts the contributors who have to grant permission to use the data and to indicate whether they request coauthorship in turn (8). All this is handled via standard e-mails and forms. The permitted data are then provided to researcher C (9), who is entitled to carry out and publish the data analysis (10). To make trait data also available to vegetation modellers – one of the pioneering motivations of the TRY initiative – modellers (e.g. modeller E) are also allowed to directly submit proposals (11) without prior data submission provided the data are to be used for model parameter estimation and evaluation only. We encourage contributors to change the status of their data from ‘own’ to ‘free’ (12) as they have successfully contributed to publications. With consent of contributors this part of the database is being made publicly available without restriction. So far look-up tables for several qualitative traits (see Table 2) have been published on the website of the TRY initiative (http://www.try-db.org). Meta-data are also provided without restriction (13).

    Selection of data and statistical methods in the context of this analysis

    For the analyses in this manuscript, we chose traits with sufficient coverage from different aspects of plant functioning. The data were standardized and checked for errors, and duplicates were excluded. Maximum photosynthetic rates and stomatal conductance were filtered for temperature (15–30 °C), light (PAR >500 μmol m−2 s−1) and atmospheric CO2 concentration during measurements (300–400 ppm); data for respiration were filtered for temperature (15–30 °C). A temperature range of 15–30 °C for respiration will add variability to trait values. Nevertheless, the immediate response of respiration to temperature is balanced by an opposite adaptation of basal respiration rates to long-term temperature changes. More detailed analyses will have to take both the short- and long-term impacts of temperature into account. With respect to photosynthetic rates the problem is similar, but less severe. Statistical properties of density distributions of trait data were characterized by skewness and kurtosis on the original scale and after log-transformation. The Jarque–Bera test was applied to assess departure from normality (Bera & Jarque, 1980). Finally, outliers were identified (see supporting information, Appendix S1). The subsequent analyses are based on standardized trait values, excluding outliers and duplicates.
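The filtering thresholds and the normality check described above can be sketched in a few lines. The block below is a minimal illustration, assuming inclusive temperature and CO2 bounds and moment-based (population) estimates of skewness and kurtosis; the function names and example values are hypothetical, and the statistic uses the standard form JB = n/6 · (S² + (K − 3)²/4), with a p-value from its asymptotic chi-square distribution with two degrees of freedom.

```python
import math


def measured_under_filter_conditions(temp_c, par, co2_ppm):
    """Filter used for photosynthesis and stomatal conductance entries.

    Thresholds are taken from the text; treating the range limits as
    inclusive is an assumption of this sketch.
    """
    return 15 <= temp_c <= 30 and par > 500 and 300 <= co2_ppm <= 400


def skewness_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def jarque_bera(values):
    """Jarque-Bera statistic; larger values mean stronger departure from normality."""
    n = len(values)
    s, k = skewness_kurtosis(values)
    return n / 6.0 * (s ** 2 + (k - 3.0) ** 2 / 4.0)


def jarque_bera_pvalue(values):
    """Asymptotic p-value: chi-square survival function with 2 df is exp(-x/2)."""
    return math.exp(-jarque_bera(values) / 2.0)


# Illustrative values that are symmetric on the log10 scale.
log_values = [-2, -1, -1, 0, 0, 0, 1, 1, 2]
raw_values = [10.0 ** x for x in log_values]
print("JB original scale:", round(jarque_bera(raw_values), 2))
print("JB log scale:     ", round(jarque_bera(log_values), 2))
```

For data of this kind the statistic is much smaller after log-transformation, which is the pattern one would expect for an approximately log-normal trait. The asymptotic p-value is unreliable for small samples, so a sample of this size is only suitable for demonstrating the arithmetic.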

    PFTs were defined similarly to those used in global vegetation models (e.g. Woodward & Cramer, 1996; Sitch et al., 2003; see Table 5), based on standardized tables for the qualitative traits ‘plant growth form’ (grass, herb, climber, shrub, tree), ‘leaf type’ (needle-leaved, broad-leaved), ‘leaf phenology type’ (deciduous, evergreen), ‘photosynthetic pathway’ (C3, C4, CAM) and ‘woodiness’ (woody, nonwoody).
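How a PFT label might be derived from these qualitative traits is sketched below. The decision rules are an assumption made for illustration, loosely following the PFT names that appear in Table 5 (for example "Grass C4" or "Shrub broadleaved evergreen"); the function name `assign_pft`, the handling of incomplete records and the labels for trees are hypothetical and do not reproduce the classification actually used.

```python
def assign_pft(growth_form, leaf_type=None, phenology=None, pathway=None, woodiness=None):
    """Return a PFT label from qualitative traits, or None if traits are insufficient.

    Assumed rules (illustrative only):
      grass, herb   -> distinguished by photosynthetic pathway (C3 or C4)
      climber       -> distinguished by woodiness
      shrub, tree   -> distinguished by leaf type and leaf phenology type
    """
    if growth_form in ("grass", "herb"):
        if pathway not in ("C3", "C4"):
            return None  # CAM or unknown pathway is left unclassified in this sketch
        return f"{growth_form.capitalize()} {pathway}"

    if growth_form == "climber":
        if woodiness not in ("woody", "nonwoody"):
            return None
        return f"Climber {woodiness}"

    if growth_form in ("shrub", "tree"):
        # Accept both "broad-leaved" and "broadleaved" spellings.
        leaf = (leaf_type or "").replace("-", "")
        if leaf not in ("broadleaved", "needleleaved"):
            return None
        if phenology not in ("deciduous", "evergreen"):
            return None
        return f"{growth_form.capitalize()} {leaf} {phenology}"

    return None


print(assign_pft("grass", pathway="C4"))
print(assign_pft("shrub", leaf_type="broad-leaved", phenology="evergreen"))
print(assign_pft("climber", woodiness="nonwoody"))
```

Species lacking one of the required qualitative traits return `None` here, so they would simply be left out of a PFT-level summary rather than being assigned by guesswork.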

    Table 5.  Variation within and between species and within and between plant functional types (PFT)
     Seed mass | Plant height | LL | SLA | Nm | Pm | Na | Photosynthetic rate per leaf area | Photosynthetic rate per dry mass | Photosynthetic rate per leaf N
    1. SD is based on log10-transformed trait data, after exclusion of duplicates and outliers, including data derived under experimental growth conditions. Numbers in brackets along with names of plant functional types characterize the numbers of species attributed to the respective PFT. Plant species were selected to provide examples from different functional types and with entries for each of the 10 traits.

    2. SD, standard deviation within group; SD between, standard deviation between groups; n, number of entries; nsp, number of species; n/sp and n/PFT, mean number of entries per species and per PFT; Mean, mean values, calculated as arithmetic mean on log-scale and retransformed to original scale; Sign. P, significance level for difference between means for PFTs and species. Traits: seed mass (mg); plant height, maximum plant height (m); LL, leaf lifespan (month); SLA, specific leaf area (mm2 mg−1); Nm, leaf nitrogen content per dry mass (mg g−1); Pm, leaf phosphorus content per dry mass (mg g−1); Na, leaf nitrogen content per area (g m−2); light saturated photosynthetic rate per leaf area (μmol m−2 s−1); light saturated photosynthetic rate per dry mass (μmol g−1 s−1); light saturated photosynthetic rate per leaf nitrogen content (μmol g−1 s−1).

    All data (n, mean, SD) | 49 837, 2.38, 1.08 | 26 624, 1.84, 0.78 | 1540, 9.40, 0.41 | 45 733, 16.60, 0.26 | 33 880, 17.40, 0.18 | 17 056, 1.23, 0.24 | 12 860, 1.59, 0.19 | 3145, 10.11, 0.25 | 2919, 0.12, 0.33 | 3074, 6.23, 0.28
    PFT summary
     Mean (mean, SD) | 5.27, 0.79 | 2.67, 0.43 | 11.42, 0.25 | 15.08, 0.20 | 17.46, 0.16 | 1.24, 0.21 | 1.53, 0.17 | 10.22, 0.22 | 0.10, 0.24 | 5.72, 0.23
     SD between | 0.90 | 0.69 | 0.40 | 0.18 | 0.10 | 0.14 | 0.11 | 0.16 | 0.24 | 0.27
     n/PFT | 2623 | 1401 | 91 | 2407 | 1783 | 898 | 677 | 208 | 198 | 194
     Sign. P | *** | *** | *** | *** | *** | *** | *** | *** | *** | ***
    Species summary
     Mean (mean, SD) | 2.12, 0.13 | 3.06, 0.18 | 9.09, 0.03 | 18.84, 0.09 | 18.37, 0.08 | 1.22, 0.11 | 1.48, 0.10 | 10.13, 0.14 | 0.12, 0.14 | 5.79, 0.14
     SD between | 1.03 | 0.81 | 0.40 | 0.22 | 0.16 | 0.23 | 0.16 | 0.22 | 0.33 | 0.25
     nsp | 2707 | 882 | 363 | 2423 | 1250 | 649 | 519 | 168 | 120 | 121
     n/sp | 11 | 10 | 3 | 16 | 18 | 16 | 15 | 13 | 11 | 13
     Sign. P | *** | *** | *** | *** | *** | *** | *** | *** | *** | ***
    Plant functional types
     Fern (218)30.080.833290.750.471328.480.2564718.860.2214314.770.19910.720.21501.140.2029.150.1820.090.1241.770.39
     Grass C3 (594)39350.610.7012420.440.31813.850.22503320.120.20266917.840.1614351.430.2310751.140.1734113.250.212320.200.242159.250.27
     Grass C4 (248)6350.580.603830.640.3361.680.1858319.230.22112814.140.151501.360.232320.930.169719.780.20700.250.178018.810.22
     Herb C3 (3129)15 5060.770.8234040.380.382153.490.2518 83022.830.19489323.310.1618702.020.2127981.290.18101512.810.256630.210.266948.490.20
     Herb C4 (63)1830.490.53360.250.55 1.000.0021220.200.258718.780.24471.860.251271.310.1410221.870.22330.150.298915.420.24
     Climber nonwoody (233)75115.250.572681.050.48178.990.3594923.400.2029525.340.171431.380.261541.330.192910.040.24300.120.39265.740.28
     Climber woody (73)10215.160.43763.740.51716.680.3544314.730.1915721.340.141011.620.23421.320.201311.210.2130.090.2034.100.19
     Shrub broadleaved deciduous (596)15736.670.9912213.590.491674.680.19383815.360.18222321.500.1412091.560.207481.450.182339.970.172420.150.232286.020.18
     Shrub broadleaved evergreen (1162)19114.020.9816941.610.5528415.880.2632168.990.21262313.730.1815040.840.2510331.900.193908.960.233450.080.293824.570.23
     Shrub needleleaved (83)2562.551.281213.530.581736.660.253037.430.1522310.110.151230.740.26891.830.17198.030.24190.040.16174.020.25
     Tree broadleaved deciduous (699)160633.801.09147120.820.282405.830.17396315.400.17434321.320.1322251.440.2017231.570.165399.340.185200.120.233606.280.17
     Tree broadleaved evergreen (2136)148727.641.07197316.560.3636016.830.2938599.460.19592116.890.1631770.860.2027231.870.156527.790.234840.070.275644.630.22
     Tree needleleaved deciduous (16)646.880.578832.980.20126.080.0112910.090.0924819.370.101551.830.15371.800.13116.900.20120.060.18134.170.17
     Tree needleleaved evergreen (134)88913.770.6388227.200.306339.710.2115175.000.13555812.090.1036221.230.169842.620.141969.450.241210.050.261243.140.25
    Plant species (exemplary)
     Carex bigelowii230.470.30460.230.13723.620.0031412.190.1244120.320.107161.940.18671.650.059315.160.10730.170.00338.970.059
     Dactylis glomerata880.810.154390.730.15332.750.12513924.580.1095024.670.128221.980.183111.320.098713.450.16070.310.19479.820.189
     Poa pratensis570.260.139220.500.14013.01 16923.960.1316317.360.172112.280.17861.190.184813.750.20060.170.187810.100.170
     Trifolium pratense611.530.117450.390.277   14122.850.0843438.650.086142.070.12371.650.090516.940.06140.430.116310.990.113
     Prunus spinosa22165.010.244142.920.21635.600.0248614.540.0911628.050.114132.150.099111.870.081311.170.04830.130.07436.320.101
     Acacia doratoxylon315.400.00076.090.268319.800.00334.570.000720.370.01260.830.00334.380.001214.510.00220.070.00323.340.001
     Phyllota phylicoides62.830.02660.670.345222.430.00167.440.059512.940.025   21.490.00228.350.00320.050.00324.870.001
     Pultenaea daphnoides53.980.14132.860.03629.360.002313.760.192619.400.00450.350.01331.830.00329.580.00220.100.00125.060.001
     Lepechinia calycina412.350.18622.790.17424.390.003511.230.075518.380.13931.200.00031.480.153212.560.00120.130.00126.910.001
     Leptospermum polygalifolium40.180.05634.000.00027.380.003210.930.002613.350.01450.490.04831.200.00138.560.00220.110.00037.620.024
     Banksia marginata78.510.07335.450.326336.360.001115.720.072118.300.05040.340.05181.410.032219.520.00120.100.001212.760.000
     Grevillea buxifolia746.390.11461.350.271215.070.00348.180.09467.160.00620.290.00030.780.00128.680.00220.060.00229.740.000
     Persoonia levis3206.270.06863.600.130245.590.00265.680.06865.870.00420.300.00031.080.00128.160.00220.050.00027.660.000
     Dodonaea viscosa286.890.189262.630.32069.290.054186.610.1071919.230.058161.200.09992.610.071611.640.05110.090.00064.530.046
     Pimelea linifolia52.850.11461.190.134212.640.002413.760.121614.390.02250.500.03430.850.00337.910.00220.110.00238.570.030
     Quercus ilex72241.030.0681417.410.285122.75 2836.240.10944914.000.0702970.880.129301.890.129207.240.181180.050.110112.680.209
     Quercus robur83219.440.1553326.480.23326.010.00110314.070.09022723.350.0971901.780.151481.670.15337.400.00120.080.01035.570.035
     Fagus sylvatica16194.920.1202330.960.18926.010.00127315.390.16126022.610.0781481.420.1082051.210.14965.180.160100.080.19036.770.010
     Simarouba amara5221.990.243334.280.020211.630.04068.400.183520.080.10940.730.09432.300.132113.840.00010.080.00014.520.000
     Synoum glandulosum6197.770.126103.800.307211.750.0011011.680.065616.220.01450.870.02231.460.00226.460.00020.070.00234.540.011
     Eucalyptus socialis40.810.03176.940.186228.780.00163.490.0121510.830.059140.540.09693.670.024216.230.00020.050.00024.450.001
     Brachychiton populneus6108.170.21787.760.221313.210.00188.700.0701116.990.045100.910.04062.130.04648.490.07040.060.10343.850.044
     Larix decidua96.420.0992037.650.18456.010.001909.730.0638919.810.072761.790.156122.100.11255.420.16150.060.21253.130.194
     Picea abies236.370.0782440.020.246388.850.1091464.450.13495412.400.0818121.420.1341093.070.11657.670.07150.030.01752.070.117
     Pinus sylvestris297.320.1333125.380.244527.710.0164304.920.103142213.060.08812451.300.1173592.800.121610.970.03160.040.02162.730.046
     Pseudotsuga menziesii2511.360.0542961.790.184264.680.001106.300.15310512.290.079821.690.13851.580.135359.120.15840.030.10442.990.091

    The evaluation of the two tenets of comparative ecology and vegetation modelling focuses on 10 traits that are central to leading dimensions of trait variation or that are physiologically relevant and closely related to parameters used in vegetation modelling (Westoby et al., 2002; Wright et al., 2004): plant height, seed mass, specific leaf area (one-sided leaf area per leaf dry mass, SLA), leaf longevity, leaf nitrogen content per leaf dry mass (Nm) and per leaf area (Na), leaf phosphorus content per leaf dry mass (Pm) and maximum photosynthetic rate per leaf area (inline image), per leaf dry mass (inline image) and per leaf nitrogen content (inline image). The 10 traits were selected for the following reasons: plant height is relevant for vegetation carbon storage capacity; seed mass for plant regeneration strategy; leaf longevity for the trade-off between leaf carbon investment and gain; SLA for the link between light capture (area based) and plant growth (mass based); leaf N and P content for the link between the carbon cycle and the respective nutrient cycle; and photosynthetic rates expressed per leaf area, dry mass and N content for the links of carbon gain to light capture, growth and nutrient cycle. Although we recognize the relevance of traits related to plant–water relations, we did not include traits such as maximum stomatal conductance or leaf water potential in the analyses, because coverage was insufficient for a substantial number of species. For each of the 10 traits, we quantified variation across species and PFTs in three ways: (1) Differences between mean values of species and PFTs were tested using one-way ANOVA. (2) Variation within species, in terms of standard deviation (SD), was compared with variation between species (and likewise for PFTs). (3) The fraction of variance explained by species and by PFT (R2) was calculated as one minus the residual sum of squares divided by the total sum of squares.
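The R2 calculation in step (3) can be illustrated with a short sketch. It computes one minus the residual sum of squares (around group means) divided by the total sum of squares (around the grand mean) for a one-way grouping by species or PFT. The function name and the choice to work on log10-transformed values are assumptions for this example; the original analysis was carried out in R and its exact code is not given in the text.

```python
import math
from collections import defaultdict


def variance_explained(values, groups):
    """Return R2 = 1 - RSS/TSS for a one-way grouping of trait values.

    values: positive trait values on the original scale.
    groups: group label (e.g. species or PFT) for each value.
    Assumption: values are log10-transformed before partitioning.
    """
    logs = [math.log10(v) for v in values]
    grand_mean = sum(logs) / len(logs)

    by_group = defaultdict(list)
    for x, g in zip(logs, groups):
        by_group[g].append(x)

    # Total sum of squares around the grand mean.
    tss = sum((x - grand_mean) ** 2 for x in logs)

    # Residual sum of squares around each group's own mean.
    rss = 0.0
    for xs in by_group.values():
        group_mean = sum(xs) / len(xs)
        rss += sum((x - group_mean) ** 2 for x in xs)

    return 1.0 - rss / tss
```

A value near 1 means that almost all variation lies between groups; a value near 0 means that the grouping explains little and most variation occurs within groups. The sketch does not guard against the degenerate case in which all values are identical (total sum of squares of zero).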

    We observed large variation in SD within species if the number of observations per species was small (see funnel plot in Appendix S1). With an increasing number of observations, SD within species approached an average, trait-specific level. To avoid confounding effects due to cases with very few observations per species, only species with at least five trait entries were used in statistical analyses (with the exception of leaf longevity, where two entries per species were taken as the minimum number because species with multiple entries were very rare). The number of measurements per PFT was sufficient in all cases. Statistical analyses were performed in R (R Development Core Team, 2009).
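As an illustration of the minimum-entries rule, the following sketch summarizes one trait per species, keeping only species with at least `min_entries` values. Means are computed on the log10 scale and retransformed, and SD is reported on the log10 scale, as described in the table notes. The function name, the input layout and the use of the sample standard deviation (n − 1 denominator) are assumptions for the example.

```python
import math
import statistics
from collections import defaultdict


def species_summary(entries, min_entries=5):
    """Summarize one trait per species.

    entries: iterable of (species_name, value) pairs, values > 0.
    Returns {species: (n, mean_on_original_scale, sd_on_log10_scale)}
    for species with at least `min_entries` entries.
    """
    logs = defaultdict(list)
    for species, value in entries:
        logs[species].append(math.log10(value))

    summary = {}
    for species, xs in logs.items():
        if len(xs) < min_entries:
            continue  # too few observations for a stable within-species SD
        mean_log = statistics.fmean(xs)
        sd_log = statistics.stdev(xs)  # assumption: sample SD (n - 1)
        summary[species] = (len(xs), 10.0 ** mean_log, sd_log)
    return summary
```

For leaf longevity the same function could be called with `min_entries=2`, matching the exception noted above.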


    Data coverage in the TRY database

    As of March 31, 2011, the TRY data repository contains 2.88 million trait entries for 69 000 plant species, accompanied by 3.0 million ancillary data entries [not all data from the databases listed in Table 1 and summarized in Table 2 could be used in the subsequent analyses, because some recently contributed datasets were still being checked and cleaned in the data staging area (see Fig. 1)]. About 2.8 million of the trait entries were measured in natural environments and <100 000 under experimental conditions (e.g. glasshouse, climate or open-top chambers). About 2.3 million trait entries are for quantitative traits, while 0.6 million entries are for qualitative traits (Table 2). Qualitative traits, like plant growth form, are often treated as distinct and invariant within species (even though in some cases they are more variable than studies suggest, e.g. flower colour or dispersal mode), and they are often used as covariates in analyses, as when comparing evergreen vs. deciduous (Wright et al., 2005) or resprouting vs. nonresprouting plants (Pausas et al., 2004). The qualitative traits with the highest species coverage in the TRY dataset are the five traits used for PFT classification and leaf compoundness: woodiness (44 000 species), plant growth form (40 000), leaf compoundness (35 000), leaf type (34 000), photosynthetic pathway (32 000) and leaf phenology type (16 000), followed by N-fixation capacity (11 000) and dispersal syndrome (10 000). Resprouting capacity is noted for 3000 species. (Description of qualitative traits: plant dispersal syndrome, dispersed by wind, water or animal; N-fixation capacity, able/not able to fix atmospheric N2; leaf compoundness, simple vs. compound; resprouting capacity, able/not able to resprout.)

    The quantitative traits with the highest species coverage are seed size (27 000 species), plant height (18 000), leaf size (17 000), wood density (12 000), SLA (9000), plant longevity (8000), leaf nitrogen content (7000) and leaf phosphorus content (5000). Leaf photosynthetic capacity is characterized for more than 2000 species. Some of these traits are represented by a substantial number of entries per species, e.g. SLA has on average 10 entries per species, leaf N, P and photosynthetic capacity have about eight resp. five entries per species, with a maximum of 1470 entries for leaf nitrogen per dry mass (Nm) for Pinus sylvestris.

    About 40% of the trait entries (1.3 million) are georeferenced, allowing trait entries to be related to ancillary information from external databases such as climate, soil, or biome type. Although latitude and longitude are often recorded with high precision, their accuracy is unknown. The georeferenced entries are associated with 8502 individual measurement sites, which fall in 746 of the 4200 2 × 2° land grid cells of, for example, a typical climate model (Fig. 2). Europe has the highest density of measurements, and there is good coverage of some other regions, but there are obvious gaps in boreal regions, the tropics, northern and central Africa, parts of South America, and southern and western Asia. In tropical South America, the sites fall in relatively few grid cells, but there are high numbers of entries per cell. This is an effect of systematic sampling efforts by long-term projects such as LBA (The Large Scale Biosphere-Atmosphere Experiment in Amazonia: http://www.lba.inpa.gov.br/lba) or RAINFOR (Amazon Forest Inventory Network: http://www.geog.leeds.ac.uk/projects/rainfor). For two individual traits, the spatial coverage is shown in Fig. 3. Here we additionally provide coverage in climate space, identifying biomes for which we lack data (e.g. temperate rainforests). More information about data coverage of individual traits is available on the website of the TRY initiative (http://www.try-db.org).
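Counting sites and trait entries per 2 × 2° grid cell, as summarized in Fig. 2, can be sketched as below. The input layout (one tuple of latitude, longitude and number of entries per site), the function names and the cell-indexing convention (rows counted from the South Pole, columns from 180° W) are assumptions for illustration; the sketch does not distinguish land from ocean cells.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=2.0):
    """Map a coordinate to a (row, col) index on a regular lat/lon grid."""
    n_rows = int(round(180.0 / cell_deg))
    n_cols = int(round(360.0 / cell_deg))
    # Clamp lat = +90 into the top row; wrap lon = +180 onto column 0.
    row = min(int(math.floor((lat + 90.0) / cell_deg)), n_rows - 1)
    col = int(math.floor((lon + 180.0) / cell_deg)) % n_cols
    return row, col


def density_per_cell(sites, cell_deg=2.0):
    """Count sites and trait entries per grid cell.

    sites: iterable of (lat, lon, n_entries), one tuple per measurement site.
    Returns (sites_per_cell, entries_per_cell) as Counters keyed by (row, col).
    """
    sites_per_cell = Counter()
    entries_per_cell = Counter()
    for lat, lon, n_entries in sites:
        cell = grid_cell(lat, lon, cell_deg)
        sites_per_cell[cell] += 1
        entries_per_cell[cell] += n_entries
    return sites_per_cell, entries_per_cell
```

The number of occupied cells is then `len(sites_per_cell)`, which corresponds to the kind of figure reported above (746 occupied cells), although the exact grid definition used for that count is not given in the text.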

    Figure 2.

       Data density of georeferenced trait entries. Top, number of sites per 2 × 2° grid cell; bottom, number of trait entries per grid cell.

      Figure 3.

         Data density for (a) specific leaf area (SLA) (1862 sites) and (b) leaf nitrogen content per dry mass (3458 sites), and data density in climate space: (c) SLA and (d) leaf nitrogen content per dry mass (Nm). Red: geo-referenced measurement sites in the TRY database; dark grey: distribution of entries in the GBIF database (Global Biodiversity Information Facility, http://www.gbif.org) for species characterized by entries of SLA or leaf nitrogen content per dry mass in the TRY database; light grey: continental shape, respectively, all entries in the GBIF database in climate space. Mean annual temperature and mean annual precipitation are based on CRU gridded climate data (CRU: Climate Research Unit at the University of East Anglia, UK: http://www.cru.uea.ac.uk). Climate space overlaid by major biome types of the world following Whittaker et al. (1975): Tu, Tundra; BF, Boreal Forest; TeG, Temperate Grassland; TeDF, Temperate Deciduous Forest; TeRF, Temperate Rain Forest; TrDF, Tropical Deciduous Forest; TrRF, Tropical Rain Forest; Sa, Savanna; De, Desert. Biome boundaries are approximate.

        General pattern of trait variation: test for normality

        For 52 traits, the coverage of database entries was sufficient to quantify general patterns of density distributions in terms of skewness and kurtosis, and to apply the Jarque–Bera test for normality (Table 3). On the original scale all traits but one are positively skewed, indicating distributions tailed towards high values. After log-transformation, the distributions of 20 traits are still positively skewed, while 32 traits show slightly negative skewness. For 49 of the 52 traits, the Jarque–Bera test indicates an improvement of normality by log-transformation of trait values; for only three traits did normality deteriorate (leaf phenolics, tannins and carbon content per dry mass; Table 3). The distributions of leaf phenolics and tannins content per dry mass lie between normal and log-normal: positively skewed on the original scale, negatively skewed on log-scale. Leaf carbon content per dry mass has a theoretical range from 0 to 1000 mg g−1. The mean value, about 476 mg g−1, is in the centre of the theoretical range, and the variation of trait values is small (Table 4).
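The quantities reported in Table 3 can be illustrated with a small sketch. It uses the standard Jarque–Bera statistic, JB = n/6 · (S² + K²/4), with S the sample skewness and K the excess kurtosis, and the asymptotic P-value from a chi-square distribution with two degrees of freedom, for which the upper-tail probability is exp(−JB/2). The use of simple moment estimators and the function names are assumptions; the authors' R implementation may differ in detail.

```python
import math


def skewness_and_excess_kurtosis(xs):
    """Moment-based sample skewness and excess kurtosis."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def jb_statistic(n, skewness, excess_kurtosis):
    """Jarque-Bera statistic from sample size, skewness and excess kurtosis."""
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


def jarque_bera(xs):
    """Return (JB, asymptotic P-value) for a sample."""
    s, k = skewness_and_excess_kurtosis(xs)
    jb = jb_statistic(len(xs), s, k)
    p_value = math.exp(-jb / 2.0)  # chi-square upper tail, 2 degrees of freedom
    return jb, p_value


def normality_change(xs):
    """JB on the original scale minus JB after log10-transformation.

    Positive values indicate that log-transformation improves normality.
    """
    jb_original, _ = jarque_bera(xs)
    jb_log, _ = jarque_bera([math.log10(x) for x in xs])
    return jb_original - jb_log
```

As a plausibility check, `jb_statistic(758, 6.53, 59.82)` with the rounded skewness and kurtosis tabulated for leaf tear resistance comes out close to the tabulated JB value for that trait, the small difference being consistent with rounding of the inputs.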

        Table 3.  Statistical properties for the density distributions of 52 traits with substantial coverage and a test for deviation from normality, on the original scale and after log-transformation of trait values
        Trait | Number of entries | Original scale: Skewness | Kurtosis | JB test | P-value | Logarithmic scale: Skewness | Kurtosis | JB test | P-value | Change of normality
        1. Results based on the dataset after excluding obvious errors, but before detection of outliers. Skewness, measure of the asymmetry of the density distribution (0 in case of normal distribution; <0, left-tailed distribution; >0, right-tailed distribution); Kurtosis, measure of the ‘peakedness’ of the density distribution (here presented as excess kurtosis: 0 in case of normal distribution; <0, wider peak around the mean; >0, a more acute peak around the mean); JB test, result of Jarque–Bera test for departure from normality (0 for normal distribution; >0 for deviation from normal distribution); P-value, probability of obtaining a test statistic at least as extreme as the observed one, assuming that the null hypothesis (here: the data are normally distributed) is true, on the original scale or after log-transformation (>0.05 in case normality is accepted at 95% confidence); change of normality, difference between results of Jarque–Bera test on the original scale and after log-transformation of trait data (>0, improvement of normality by log-transformation; <0, deterioration of normality by log-transformation); RMSE, root mean squared error; bold: traits for which we quantified the fraction of variance explained by species and PFT.

        Seed dry mass | 53 744 | 123.02 | 19 457.16 | 8.E+11 | <2.20E−16 | 0.53 | 0.42 | 2915 | <2.20E−16 | 8.E+11
        Leaf dry mass | 26 220 | 161.48 | 26 118.88 | 7.E+11 | <2.20E−16 | −0.45 | 0.90 | 1748 | <2.20E−16 | 7.E+11
        Leaf area | 76 883 | 65.47 | 6990.13 | 2.E+11 | <2.20E−16 | −0.54 | 0.02 | 3798 | <2.20E−16 | 2.E+11
        Conduit (vessel and tracheid) density | 5454 | 68.93 | 4968.04 | 6.E+09 | <2.20E−16 | −0.03 | −0.43 | 43 | <2.20E−16 | 6.E+09
        Leaf Fe content per dry mass | 3128 | 31.84 | 1084.72 | 2.E+08 | <2.20E−16 | 1.51 | 8.78 | 11 229 | <2.20E−16 | 2.E+08
        Releasing height | 19 668 | 13.86 | 292.85 | 7.E+07 | <2.20E−16 | 0.70 | 2.33 | 6068 | <2.20E−16 | 7.E+07
        Leaf Mn content per dry mass | 3273 | 12.04 | 222.70 | 6 842 757 | <2.20E−16 | −0.02 | −0.51 | 35 | 2.41E−08 | 6 842 722
        Seed length | 9336 | 7.41 | 89.35 | 3 191 250 | <2.20E−16 | 0.31 | 0.47 | 239 | <2.20E−16 | 3 191 011
        Whole leaf nitrogen content | 1006 | 12.84 | 248.60 | 2 618 135 | <2.20E−16 | −0.53 | 0.08 | 48 | 4.06E−11 | 2 618 087
        Leaf Na content per dry mass | 3180 | 9.55 | 126.32 | 2 162 452 | <2.20E−16 | 0.19 | 0.79 | 100 | <2.20E−16 | 2 162 352
        Specific leaf area (SLA) | 48 142 | 2.85 | 27.49 | 1 581 085 | <2.20E−16 | −0.54 | 1.06 | 4555 | <2.20E−16 | 1 576 530
        Leaf phosphorus content per dry mass (Pm) | 17 920 | 3.58 | 42.89 | 1 412 132 | <2.20E−16 | −0.38 | 0.98 | 1155 | <2.20E−16 | 1 410 977
        Leaf phosphorus content per area | 5290 | 5.33 | 71.12 | 1 139 938 | <2.20E−16 | −0.04 | 0.75 | 125 | <2.20E−16 | 1 139 813
        Leaf Zn content per dry mass | 3278 | 8.04 | 84.86 | 1 018 873 | <2.20E−16 | 1.35 | 2.55 | 1880 | <2.20E−16 | 1 016 993
        Maximum plant longevity | 2006 | 7.31 | 97.69 | 815 546 | <2.20E−16 | −0.91 | 1.40 | 442 | <2.20E−16 | 815 104
        Leaf lifespan (longevity) | 1654 | 7.26 | 91.59 | 592 617 | <2.20E−16 | 0.31 | −0.35 | 34 | 4.30E−08 | 592 583
        Whole leaf phosphorus content | 444 | 10.23 | 141.53 | 378 307 | <2.20E−16 | −0.27 | −0.34 | 7 | 0.02529 | 378 299
        Leaf K content per dry mass | 4144 | 4.09 | 33.47 | 204 954 | <2.20E−16 | 0.09 | 0.33 | 24 | 6.64E−06 | 204 930
        Leaf Al content per dry mass | 3448 | 5.14 | 35.08 | 191 974 | <2.20E−16 | 1.13 | 1.01 | 876 | <2.20E−16 | 191 098
        Leaf nitrogen/phosphorus (N/P) ratio | 11 612 | 3.03 | 17.65 | 168 595 | <2.20E−16 | 0.25 | 0.41 | 199 | <2.20E−16 | 168 396
        Seed terminal velocity | 1178 | 3.91 | 50.26 | 126 989 | <2.20E−16 | −0.45 | −0.77 | 69 | 9.99E−16 | 126 920
        Leaf mechanical resistance: tear resistance | 758 | 6.53 | 59.82 | 118 402 | <2.20E−16 | 0.86 | 1.11 | 132 | <2.20E−16 | 118 270
        Leaf thickness | 2934 | 4.24 | 29.88 | 117 951 | <2.20E−16 | 0.77 | 0.71 | 351 | <2.20E−16 | 117 600
        Maximum Plant height | 28 248 | 2.35 | 6.99 | 83 464 | <2.20E−16 | 0.11 | −0.89 | 983 | <2.20E−16 | 82 481
        Leaf respiration per dry mass | 2234 | 4.28 | 24.65 | 63 393 | <2.20E−16 | 0.29 | 0.62 | 66 | 4.77E−15 | 63 327
        Wood phosphorus content per dry mass | 1056 | 4.93 | 35.87 | 60 888 | <2.20E−16 | 0.71 | 0.31 | 94 | <2.20E−16 | 60 794
        Leaf nitrogen content per area (Na) | 13 528 | 1.73 | 8.25 | 45 047 | <2.20E−16 | −0.27 | 0.34 | 224 | <2.20E−16 | 44 823
        Leaf Mg content per dry mass | 3485 | 2.55 | 15.68 | 39 460 | <2.20E−16 | −0.14 | 0.13 | 14 | 0.001098 | 39 446
        Conduit (vessel and tracheid) area | 3050 | 3.31 | 15.89 | 37 636 | <2.20E−16 | −0.24 | −0.09 | 31 | 2.15E−07 | 37 605
        Leaf S content per dry mass | 1092 | 4.60 | 24.78 | 31 788 | <2.20E−16 | 1.45 | 4.21 | 1189 | <2.20E−16 | 30 600
        Leaf Ca content per dry mass | 3755 | 2.11 | 10.09 | 18 721 | <2.20E−16 | −0.83 | 1.19 | 656 | <2.20E−16 | 18 065
        Leaf nitrogen content per dry mass (Nm) | 35 862 | 1.21 | 2.33 | 16 905 | <2.20E−16 | −0.22 | −0.38 | 407 | <2.20E−16 | 16 498
        Vessel diameter | 3209 | 2.61 | 9.61 | 15 977 | <2.20E−16 | 0.27 | −0.35 | 54 | 1.83E−12 | 15 923
        Conduit lumen area per sapwood area | 2280 | 2.41 | 9.75 | 11 243 | <2.20E−16 | −0.37 | 0.97 | 140 | <2.20E−16 | 11 102
        Canopy height observed | 40 510 | 1.25 | 1.04 | 12 416 | <2.20E−16 | −0.15 | −1.22 | 2654 | <2.20E−16 | 9762
        Leaf dry matter content (LDMC) | 17 339 | 1.10 | 2.68 | 8693 | <2.20E−16 | −0.46 | 0.85 | 1141 | <2.20E−16 | 7551
        Leaf respiration per dry mass at 25°C | 1448 | 2.70 | 9.24 | 6907 | <2.20E−16 | 0.49 | 0.63 | 82 | <2.20E−16 | 6825
        Stomatal conductance per leaf area | 1093 | 2.39 | 10.69 | 6250 | <2.20E−16 | −0.73 | 1.27 | 171 | <2.20E−16 | 6079
        Photosynthesis per leaf dry mass (inline image) | 2549 | 2.09 | 6.01 | 5699 | <2.20E−16 | −0.36 | 0.13 | 58 | 2.85E−13 | 5642
        Leaf Si content per dry mass | 1057 | 2.35 | 9.82 | 5219 | <2.20E−16 | −0.54 | 0.84 | 82 | <2.20E−16 | 5137
        Vessel element length | 3048 | 1.63 | 5.12 | 4668 | <2.20E−16 | −0.28 | 0.35 | 55 | 9.89E−13 | 4613
        Wood nitrogen content per dry mass | 1259 | 2.22 | 8.24 | 4591 | <2.20E−16 | 0.33 | 0.15 | 24 | 5.93E−06 | 4567
        Photosynthesis per leaf area (inline image) | 3062 | 1.49 | 3.20 | 2436 | <2.20E−16 | −0.63 | 1.32 | 422 | <2.20E−16 | 2014
        Leaf K content per area | 240 | 3.12 | 12.28 | 1898 | <2.20E−16 | 0.37 | 0.55 | 9 | 0.01393 | 1890
        Leaf carbon/nitrogen (C/N) ratio | 2615 | 0.95 | 1.99 | 824 | <2.20E−16 | −0.12 | −0.18 | 10 | 0.008102 | 815
        Wood density | 26 414 | 0.44 | −0.15 | 887 | <2.20E−16 | −0.17 | −0.40 | 298 | <2.20E−16 | 589
        Leaf density | 1463 | 1.01 | 2.59 | 655 | <2.20E−16 | −0.56 | 0.79 | 115 | <2.20E−16 | 540
        Root nitrogen content per dry mass | 1263 | 1.33 | 1.35 | 466 | <2.20E−16 | −0.05 | −0.54 | 16 | 0.0003217 | 450
        Leaf respiration per area | 1303 | 1.22 | 2.00 | 542 | <2.20E−16 | −0.79 | 1.80 | 312 | <2.20E−16 | 230
        Leaf phenolics content per dry mass | 471 | 0.52 | 0.21 | 22 | 1.90E−05 | −1.16 | 1.41 | 144 | <2.20E−16 | −123
        Leaf carbon content per dry mass | 8140 | −0.07 | 0.03 | 7 | 2.67E−02 | −0.32 | 0.08 | 144 | <2.20E−16 | −137
        Leaf tannins content per dry mass | 409 | 1.40 | 2.87 | 274 | <2.20E−16 | −2.10 | 6.89 | 1109 | <2.20E−16 | −835
        Average | | 12.25 | 1165.87 | | | −0.05 | 0.83 | | |
        RMSE | | 2.44 | 13.37 | | | 0.29 | 0.40 | | |
        Table 4.  Mean values and ranges for 52 traits with substantial coverage, based on individual trait entries, after exclusion of outliers and duplicates
        Trait | Number of entries | Unit | Mean value | SDlg | 2.5% Quantile | Median | 97.5% Quantile
        • * Mean values for leaf phenolics, tannins and carbon content were calculated on the original scale; the SD is provided on log-scale for comparability.

        • Values for inline image were calculated based on database entries for Amax and leaf N content per area, resp. dry mass. Mean values have been calculated as arithmetic means on a logarithmic scale and retransformed to original scale. SD, standard deviation on log10-scale. Traits are sorted by decreasing SD. Bold: traits for which we quantified the fraction of variance explained by species and PFT (cf. Table 5, Fig. 5).

        Seed dry mass49 837mg2.381.080.021.95526
        Canopy height observed37 516m1.620.920.041.530
        Whole leaf phosphorus content426mg0.06850.830.00180.081.96
        Leaf area71 929mm21404.00.8125202536 400
        Maximum plant height26 625m1.840.780.11.2540
        Leaf dry mass24 663mg38.90.780.9643.51063.9
        Whole leaf nitrogen content961mg1.310.770.031.6927.6
        Conduit (vessel and tracheid) area2974mm20.003490.630.000240.00320.04
        Leaf Mn content per dry mass3159mg g−10.1890.580.010.192.13
        Maximum plant longevity1854year155.80.556.221751200
        Leaf Al content per dry mass3203mg g−10.1280.550.020.14.49
        Leaf Na content per dry mass3086mg g−10.2000.550.010.23.24
        Conduit (vessel and tracheid) density5301mm−237.60.54438380
        Seed terminal velocity1108m s−11.080.420.171.44.69
        Releasing height18 472m0.3470.420.050.352
        Leaf lifespan (longevity)1540month9.400.4128.560
        Leaf tannins content per dry mass*394%2.010.410.192.358.04
        Wood phosphorus content per dry mass1016mg g−10.07690.370.020.050.56
        Leaf respiration per dry mass2005μmol g−1 s−10.00970.360.00250.00970.04
        Seed length8770mm1.800.340.41.89
        Photosynthesis per leaf dry mass (inline image)2384μmol g−1 s−10.1150.340.020.120.49
        Leaf mechanical resistance: tear resistance722N mm−10.8140.340.190.765.11
        Leaf Ca content per dry mass3594mg g−19.050.341.579.8334.7
        Vessel diameter3102μm51.40.321550220
        Stomatal conductance per leaf area1032mmol m−2 s−1241.00.3152.4243.7895.7
        Root nitrogen content per dry mass1158mg g−19.670.312.69.336.1
        Leaf Si content per dry mass1027mg g−10.1630.
        Leaf Zn content per dry mass3080mg g−10.02260.280.00650.020.1
        Leaf respiration per dry mass at 25°C1305μmol g−1 s−10.00920.280.00350.00820.03
        Leaf K content per dry mass3993mg g−18.440.272.568.328.2
        Photosynthesis per leaf N content (inline image)3074μmol g−1 s−
        Leaf phenolics content per dry mass*454%
        Specific leaf area (SLA)45 733mm2 mg−
        Leaf K content per area231g m−20.7600.260.240.722.60
        Leaf Mg content per dry mass3360mg g−12.610.250.832.648.0
        Leaf Fe content per dry mass3040mg g−10.0770.
        Photosynthesis per leaf area (inline image)2883μmol m−2 s−
        Leaf respiration per area1201μmol m−2 s−
        Leaf phosphorus content per dry mass (Pm)17 057mg g−
        Leaf thickness2815mm0.2110.
        Conduit lumen area per sapwood area2210mm2 mm−20.1370.
        Leaf phosphorus content per area5083g m−20.1040.
        Vessel element length2964μm549.50.212005551350
        Leaf nitrogen/phosphorus (N/P) ratio11 200g g−
        Leaf nitrogen content per area (Na)12 860g m−21.590.190.641.633.6
        Wood nitrogen content per dry mass1210mg g−
        Leaf S content per dry mass1023mg g−11.660.180.781.594.75
        Leaf nitrogen content per dry mass (Nm)33 880mg g−
        Leaf dry matter content (LDMC)16 185g g−10.2130.
        Leaf density1372g cm−30.4260.150.20.430.77
        Leaf carbon/nitrogen (C/N) ratio2498g g−123.40.1412.3923.542.2
        Wood density26 391mg mm−30.5970.120.330.60.95
        Leaf carbon content per dry mass*7856mg g−1476.10.03404.5476.3540.8

        Nevertheless, according to the Jarque–Bera test, all traits still show some degree of deviation from normal distributions on a logarithmic scale (indicated by small P-values, Table 3). Seed mass, for example, is still positively skewed after log-transformation (Table 3). This is due to substantial differences in the number of database entries and in seed masses between grasses/herbs, shrubs and trees (Fig. 4a). Maximum plant height in the TRY database has a strong negative kurtosis after log-transformation (Table 3). This is due to a bimodal distribution: one peak for herbs/grasses and one for trees (Fig. 4b). The number of height entries for shrubs is comparatively small, which may be due to a small number or abundance of shrub species in situ (i.e. a real pattern) but is more likely due to a relative ‘undersampling’ of shrubs (i.e. an artefact of data collection). Within the growth forms herbs/grasses and shrubs, the height distribution is approximately log-normal. For trees the distribution is skewed to low values, because mechanical constraints prevent trees from growing taller than 100 m. The distribution of SLA after log-transformation is negatively skewed with positive kurtosis (Table 3), an imprint of needle-leaved trees and shrubs alongside the majority of broadleaved plants (Fig. 4c). The distribution of leaf nitrogen content per dry mass after log-transformation has small skewness but negative kurtosis (Table 3): the data are less concentrated around the mean than in a normal distribution (Fig. 4d). In several cases, sample size is sufficient to characterize the distribution at different levels of aggregation, down to the species level. Again we find approximately log-normal distributions (e.g. SLA and Nm for Pinus sylvestris; Fig. 4c and d).

        Figure 4.

           Examples of trait frequency distributions for four ecologically relevant traits (Westoby, 1998; Wright et al., 2004). Upper panels: (a) seed mass and (b) plant height for all data and three major plant growth forms (white, all database entries; light grey, herbs/grasses; dark grey, trees; black, shrubs). Rug-plots provide data ranges hidden by overlapping histograms. Lower panels: (c) Specific leaf area (SLA) and (d) leaf nitrogen content per dry mass [Nm, white, all database entries excluding outliers (including experimental conditions); light grey, database entries from natural environment (excluding experimental conditions); medium grey, growth form trees; dark grey, PFT needle-leaved evergreen; black, Pinus sylvestris].

          Ranges of trait variation

          There are large differences in variation across traits (Table 4). The standard deviation (SD) expressed on a logarithmic scale ranges from 0.03 for leaf carbon content per dry mass (corresponding to about 8% on the original scale) to 1.08 for seed mass (corresponding to −95% and +1100% on the original scale). Note two characteristics of SD on the logarithmic scale: (1) it corresponds to an asymmetric distribution on the original scale, with a small range towards low values and a large range towards high values; (2) it can be compared directly across traits. For more information, see supporting information Appendix S2. Leaf carbon content per dry mass, stem density and leaf density show the lowest variation, followed by the concentration of macronutrients (nitrogen, phosphorus), fluxes and conductance (photosynthesis, stomatal conductance, respiration), the concentration of micronutrients (e.g. aluminium, manganese, sodium), traits related to length (plant height, plant and leaf longevity), and traits related to leaf area. Mass-related traits show the highest variation (seed mass, leaf dry mass, N and P content of the whole leaf, in contrast to concentration per leaf dry mass or per leaf area). The observations reveal a general tendency towards higher variation with increasing trait dimensionality (length < area < mass; for more information, see Appendix S3).
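The asymmetry of a log-scale SD on the original scale can be made concrete with a short sketch. It assumes that the percentages quoted in the text correspond to one SD below and one SD above the mean on the log10 scale, i.e. to the factors 10^(−SD) and 10^(+SD); the function name is hypothetical, and Appendix S2 may define the conversion differently.

```python
def log_sd_to_percent_range(sd_log10):
    """Convert an SD on the log10 scale to a (lower, upper) percent range.

    Assumption: the range is one SD below and above the mean on the log
    scale, expressed as percent deviation from the retransformed mean.
    """
    lower = (10.0 ** (-sd_log10) - 1.0) * 100.0
    upper = (10.0 ** sd_log10 - 1.0) * 100.0
    return lower, upper


# Example: SD = 0.088 (reported for Nm of Pinus sylvestris) gives a range
# that rounds to -18% and +22%, matching the values quoted for that case.
low, high = log_sd_to_percent_range(0.088)
```

Because the upper factor grows exponentially with SD while the lower factor can never fall below −100%, the range becomes increasingly asymmetric as the SD increases.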

          Tenet 1: Aggregation at the species level represents the major fraction of trait variation

          There is substantial intraspecific variation for each of the 10 selected traits (Table 5): for some species the standard deviation is above 0.3 on the logarithmic scale, e.g. SD=0.34 for maximum plant height of Phyllota phylicoides (−55% and +121% on the original scale; based on only six observations) and SD=0.32 for Dodonaea viscosa (n=26). The SD of Nm for Poa pratensis is 0.17 (n=63), which is almost equal to the range of all data reported for this trait, but this is an exceptional case. The trait and species with the most observations is nitrogen content per dry mass for Pinus sylvestris with 1470 entries (SD=0.088, −18% and +22%). The variation in this species spans almost half the overall variation observed for this trait (SD=0.18), covering the overall mean (Fig. 4d). For several trait-species combinations, the number of measurements is high enough for detailed analyses of the variation within species (e.g. on an environmental gradient).

          The mean SD at the species level is highest for plant height (0.18) and lowest for leaf longevity (0.03, but with few observations per species; Table 5). For all ten traits the mean SD within species is smaller than the SD between species mean values (Table 5). Based on ANOVA, mean trait values are significantly different between species: at the global scale 60–98% of trait variance is interspecific (between species, Fig. 5). Nevertheless, for three traits (Pm, Na, inline image) almost 40% of the variance is intraspecific (within species, Fig. 5).

          Figure 5.

             Fraction of variance explained by plant functional type (PFT) or species for 10 relevant and well-covered traits. R2, fraction of explained variance; Traits: Seed mass, seed dry mass; Plant height, maximum plant height; LL, leaf longevity; SLA, specific leaf area; Nm, leaf nitrogen content per dry mass; Pm, leaf phosphorus content per dry mass; Na, leaf nitrogen content per area; inline image, maximum photosynthesis rate per leaf area; inline image, maximum photosynthesis rate per leaf dry mass; inline image, maximum photosynthesis rate per leaf nitrogen content.

            Tenet 2: Basic PFTs capture a sufficiently important fraction of trait variation to represent functional diversity

            For all 10 traits, the PFT mean values are significantly different between PFTs (Table 5). Four traits show larger variation between PFT mean values than within PFTs (plant height, seed mass, leaf longevity, inline image), and two traits show similar variation between PFT means and within PFTs (SLA, inline image). As a consequence, more than 60% of the observed variance occurs between PFTs for plant height and leaf longevity, and about 40% of the variance occurs between PFTs for seed mass, SLA, inline image and inline image (Fig. 5). The high fraction of explained variance for these six traits reflects the definition of PFTs based on closely related qualitative traits: plant growth form, leaf phenology (evergreen/deciduous), leaf type (needle-leaved/broadleaved) and photosynthetic pathway (C3/C4). For these traits, PFTs such as those commonly used in vegetation models capture a considerable fraction of observed variation with relevant internal consistency. However, for certain traits the majority of variation occurs within PFTs: four traits show smaller variation between than within PFTs, causing substantial overlap across PFTs (Nm, Na, Pm, inline image). In these cases only about 20–30% of the variance is explained by PFT, and about 70–80% of the variance occurs within PFTs.

            Representation of trait variation in the context of global vegetation models

            To demonstrate how the observed trait variation is represented in global vegetation models, we first compare observed trait ranges of SLA to parameter values for SLA used in 12 global vegetation models; then we compare observed trait ranges of Nm with state variables of nitrogen concentration calculated within the dynamic global vegetation model O-CN (Zaehle & Friend, 2010).

            Some vegetation models separate PFTs along climatic gradients into biomes, for which they assign different parameter values. A rough analysis of SLA along the latitudinal gradient (as a proxy for climate) indicates no major impact on SLA within PFTs (Fig. 6), and we therefore analyse SLA data jointly by PFT. However, the range of observed trait values for SLA per PFT is remarkably large, except for the PFT ‘needle-leaved deciduous trees’ (Figs 6 and 7). The parameter values from most of the 12 models fall where the density of SLA observations is moderately high, but most are clearly different from the mean, and some lie in low-probability regions of the observed distribution, surprisingly far from the mean value of the observations.

            Figure 6.

               Worldwide range in specific leaf area (SLA) along a latitudinal gradient for the main plant functional types. Grey, all data; black, data for the plant functional type (PFT) under scrutiny.

              Figure 7.

                 Frequency distributions of specific leaf area (SLA, mm² mg⁻¹) values (grey histograms) compiled in the TRY database and parameter values for SLA (red dashes) published in the context of the following global vegetation models: Frankfurt Biosphere Model (Ludeke et al., 1994; Kohlmaier et al., 1997), SCM (Friend & Cox, 1995), HRBM (Kaduk & Heimann, 1996), IBIS (Foley et al., 1996; Kucharik et al., 2000), Hybrid (Friend et al., 1997), BIOME-BGC (White et al., 2000), ED (Moorcroft et al., 2001), LPJ-GUESS (Smith et al., 2001), LPJ-DGVM (Sitch et al., 2003), LSM (Bonan et al., 2003), SEIB–DGVM (Sato et al., 2007). n, number of SLA data in the TRY database per PFT.

                The range of observed trait values for Nm per PFT is also high (Fig. 8), except for the PFT ‘needle-leaved evergreen trees’. Modelled state variables are in most cases within the range of frequently observed trait values – model values for the PFT ‘needle-leaved evergreen trees’ match the observed distribution almost perfectly. Nevertheless, there are considerable differences between modelled and observed distributions: the modelled state variables are approximately normally distributed on the original scale, while the observed trait values are log-normally distributed; the range of modelled values is substantially smaller than the range of observations; and the highest densities are shifted. Apart from possible deficiencies of the O-CN model, the deviation between observed and modelled distributions may be due to inconsistencies between compiled traits and modelled state variables: trait entries in the database are not abundance-weighted with respect to natural occurrence, and they represent the variation of single measurements, while the model produces ‘community’ measures. The distribution of observed data presented here is therefore likely wider than the abundance-weighted leaf nitrogen content of communities in a given model grid cell.
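The difference between single trait entries and ‘community’ measures can be made concrete with an abundance-weighted mean. The sketch below is only an illustration: the species values, the abundances and the function name are hypothetical, and the weighting is done on the original scale as a simplifying assumption.

```python
def community_weighted_mean(trait_values, abundances):
    """Abundance-weighted mean of species trait values for one community."""
    total = sum(abundances)
    return sum(t * a for t, a in zip(trait_values, abundances)) / total


# Hypothetical leaf nitrogen contents (mg/g) of three co-occurring species
nm_values = [10, 20, 40]

# Hypothetical relative abundances (%) in two different communities
cwm_1 = community_weighted_mean(nm_values, [70, 20, 10])
cwm_2 = community_weighted_mean(nm_values, [10, 30, 60])

print(f"single entries span {min(nm_values)}-{max(nm_values)} mg/g")
print(f"community means span {cwm_1:.0f}-{cwm_2:.0f} mg/g")
```

Even with strongly contrasting abundances, the two community means lie inside the range of the individual entries, which is the sense in which the compiled distribution is expected to be wider than the distribution of grid-cell values.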

                Figure 8.

                   Frequency distributions of leaf nitrogen content per dry mass for major plant functional types as compiled in the TRY database compared with frequency distributions of the respective state variable calculated within the O-CN vegetation model (Zaehle & Friend, 2010). n, number of entries in the TRY database (left) and number of grid elements in O-CN with given PFT (right).


                  The TRY initiative and the current status of data coverage

                  The TRY initiative has been developed as a Data Warehouse to integrate different trait databases. Nevertheless, TRY does not aim to replace existing databases, but rather provides a complementary way to access these data consistently with other trait data – it facilitates synergistic use of different trait databases. Compared with a Meta Database approach, which would link a network of separate databases, the integrated database (Data Warehouse) provides the opportunity to standardize traits, add ancillary data, provide accepted species names and identify outliers and duplicate entries. A disadvantage of the Data Warehouse approach is that some of the databases contributing to TRY are continuously being developed (see Table 2), so the copies held in TRY can lag behind their sources. To limit this, these contributions to TRY are regularly updated.

                  The list of traits in the TRY database is not fixed, and it is anticipated that additional types of data will be added to the database in the future. Examples include sap-flow measurements, which are fluxes from which trait values can be calculated, just as photosynthesis measurements can be used to determine parameter values of the Farquhar model (Farquhar et al., 1980), and leaf venation, which has recently been defined in a consistent way and appears to be correlated with other leaf functional traits (Sack & Frole, 2006; Brodribb et al., 2007; Blonder et al., 2011). Ancillary data, contributed with the trait data, may include images. There is also room for expansion of the phylogenetic range of the data incorporated in the database. There is currently little information on nonvascular autotrophic cryptogams (i.e. bryophytes and lichens) in TRY, despite their diversity in species, functions and ecosystem effects, and the growing number of trait measurements being made on species within these groups.

                  The qualitative traits with greatest coverage (more than 30 000 species for woodiness, plant growth form, leaf compoundness, leaf type, photosynthetic pathway) represent about 10% of the estimated number of vascular plant species on land. The quantitative traits with most coverage (5000–20 000 species for e.g. seed mass, plant height, wood density, leaf size, leaf nitrogen content, SLA) approach 5% of named plant species. Although they represent a limited set of species (5–10%), most probably they include the most abundant (dominant) species. The high number of characterized species opens up the possibility of identifying the evolutionary branch points at which large divergences in trait values occurred. Such analyses will improve our understanding of trait evolution at both temporal and spatial scales. They highlight the importance of including trait data for autotrophs representing very different branches of the Tree of Life (Cornelissen et al., 2007; Lang et al., 2009) in the TRY database.

                  For some traits, we know that many more data exist, which could potentially be added to the database. Nevertheless, for some traits the lack of data reflects difficulties in data collection. Table 2 shows some traits where species coverage is thin, most probably because the measurements are difficult or laborious. Root measurements fall into this category. Rooting depth (or more exactly, maximum water extraction depth) is among the most influential plant traits in global vegetation models, yet we have estimates for only about 0.05% of the vascular plant species. Data for other root traits are even scarcer. However, many aboveground traits correlate with belowground traits (see Kerkhoff et al., 2006), so the data in TRY do give some indication about belowground traits. Apart from this, root traits are the focus of current studies (Paula & Pausas, 2011). Anatomical traits also have weak coverage in general. Quantifying anatomy from microscopic cross-sections is slow and painstaking work, and there is currently no consensus on which are the most valuable variables to quantify in leaf sections, apart from standard variables such as tissue thicknesses and cell sizes, which show important correlations with physiological function, growth form and climate (Givnish, 1988; Sack & Frole, 2006; Markesteijn et al., 2007; Dunbar-Co et al., 2009; Hao et al., 2010). An exception is wood anatomy, where TRY contains conduit densities and sizes for many species (about 7000 and 3000 species, respectively). Finally, allometric or architectural relationships that describe relative biomass allocation to leaves, stems, and roots through the ontogeny of individual plants are presently scattered across 72 different traits, each with low coverage. These traits are essential for global vegetation models and this is an area where progress in streamlining data collection is needed.

                  Many trait data compiled in the database were not necessarily collected according to similar or standard protocols. Indeed, many fields of plant physiology and ecology lack consensus definitions and protocols for key measurements. However, progress is being made both towards a posteriori data consolidation (e.g. Onoda et al., 2011) and towards standardizing trait definitions and measurement protocols, e.g. via a common plant trait thesaurus (Plant Trait Thesaurus: http://trait_ontology.cefe.cnrs.fr:8080/Thesauform/), and a handbook and website (PrometheusWiki: http://prometheuswiki.publish.csiro.au/tiki-custom_home.php) of standard definitions and protocols (Cornelissen et al., 2003b; Sack et al., 2010).

                  Information about the abiotic and biotic environment in combination with trait data is essential to allow an assessment of environmental constraints on the variation of plant traits (Fyllas et al., 2009; Meng et al., 2009; Ordoñez et al., 2009; Albert et al., 2010b; Poorter et al., 2010). Some of this information has been compiled in the TRY database. However, the information about soil, climate and vegetation structure at measurement sites is not well structured, because there is no general agreement on what kind of environmental information is most useful to report in addition to trait measurements. A consensus on these issues would greatly improve the usefulness of ancillary environmental information. Geographic references should be a priority for nonexperimental data.

                  The number of observations or species with data for all traits declines rapidly with an increasing number of traits: the more traits are considered jointly, the fewer species have data for every one of them (see Appendix S3). In cases where multivariate analyses rely on completely sampled trait-species matrices, this issue poses a significant constraint on the number of traits and/or species that can be included. Gap-filling techniques, e.g. hierarchical Bayesian approaches or filtering techniques (Shan & Banerjee, 2008; Su & Khoshgoftaar, 2009), offer a potential solution. On the other hand, simulation work in phylogenetics has shown that missing data are not by themselves problematic for phylogenetic reconstruction (Wiens, 2003, 2005). Similar work could be performed in trait-based ecology, and the emerging field of ecological informatics (Recknagel, 2006) may help to identify representative trait combinations while taking incomplete information into account (e.g. Mezard, 2007).
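The shrinking of the completely sampled trait-species matrix can be sketched by counting the species that have a value for every trait in a growing trait set. The table below is entirely hypothetical (it is not drawn from TRY), and `None` marks a missing entry.

```python
# Hypothetical species-by-trait table; None marks a missing value.
traits = ["seed_mass", "height", "sla", "leaf_n"]
table = {
    "species_a": {"seed_mass": 1.2, "height": 0.5, "sla": 15.0, "leaf_n": 20.0},
    "species_b": {"seed_mass": 0.3, "height": 2.0, "sla": None, "leaf_n": 18.0},
    "species_c": {"seed_mass": 5.0, "height": None, "sla": 9.0, "leaf_n": None},
    "species_d": {"seed_mass": 0.8, "height": 1.1, "sla": 22.0, "leaf_n": None},
}


def complete_species(table, selected_traits):
    """Species that have a non-missing value for every selected trait."""
    return [
        species
        for species, row in table.items()
        if all(row.get(trait) is not None for trait in selected_traits)
    ]


# Number of complete species as traits are added one at a time
counts = [
    len(complete_species(table, traits[:k]))
    for k in range(1, len(traits) + 1)
]
print(counts)
```

In this toy table each added trait removes one species from the complete set, so an analysis requiring all four traits could use only one of the four species.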

                  General pattern and ranges of trait distribution

                  Based on the TRY dataset, we characterized two general patterns of trait density distributions: (1) plant traits are closer to log-normally than to normally distributed and (2) the range of variation tends to increase with trait dimensionality. Here the analysis benefited from compiling large numbers of trait entries for several traits from different aspects of plant strategy. Based on the rich sampling, we could quantify simple general rules for trait distributions and still identify deviations in individual cases. The approximately log-normal distributions confirm prior reports for individual traits (e.g. Wright et al., 2004) and are in agreement with general observations in biology (Kerkhoff & Enquist, 2009), although we also observe deviations from the log-normal distribution, e.g. as an imprint of plant growth form or leaf type. The approximately log-normal distribution is most probably due to the fact that plant traits often have a lower bound of zero but no upper bound relevant for the data distribution. This log-normal distribution has several implications: (1) On the original scale, relationships are expected to be multiplicative rather than additive (Kerkhoff & Enquist, 2009; see also Appendix S2). (2) Log- or log–log scaled plots are not sophisticated techniques to hide huge variation, but the appropriate presentation of the observed distributions (e.g. Wright et al., 2004). On the original scale, bivariate plots of trait distributions are expected to be heteroscedastic (e.g. Kattge et al., 2009). (3) Trait-related parameters and state variables in vegetation models can be assumed to be log-normally distributed as well, e.g. Figs 7 and 8 (Knorr & Kattge, 2005). For more details, see Appendix S2.
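Why the logarithmic scale is the appropriate presentation can be illustrated with a simulated sample. The sketch below draws values from a log-normal distribution with arbitrary, assumed parameters (they are not estimates from TRY) and compares mean and median on the original and on the log10 scale.

```python
import math
import random
from statistics import mean, median

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Simulated trait values: bounded below by zero, no upper bound.
# mu=0.0 and sigma=0.5 (natural-log units) are arbitrary assumptions.
sample = [rng.lognormvariate(0.0, 0.5) for _ in range(10_000)]
log_sample = [math.log10(x) for x in sample]

# On the original scale the distribution is right-skewed: mean > median.
skew_original = mean(sample) / median(sample)

# On the log scale it is symmetric: mean and median nearly coincide.
shift_log = mean(log_sample) - median(log_sample)

print(f"mean/median on original scale: {skew_original:.2f}")
print(f"mean - median on log10 scale:  {shift_log:.3f}")
```

On the original scale the mean is pulled well above the median by the long upper tail, whereas on the log scale the two nearly coincide, so summary statistics such as the SD are interpretable only after log-transformation.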

                  For several traits, we quantified ranges of variation: overall variation, intra- and interspecific variation, and variation with respect to different functional groups. Most of the trait data compiled within the TRY database have been measured within natural environments and only a small fraction comes from experiments. Therefore, the impact of experimental growth conditions on observed trait variation is probably small in most cases and the observed trait variation in the TRY database comprises primarily natural variation at the level of single organs, including variation due to different measurement methods and, of course, measurement errors. However, systematic sampling of trait variation at single locations is a relatively new approach (Albert et al., 2010a, b; Baraloto et al., 2010; Hulshof & Swenson, 2010; Jung et al., 2010b; Messier et al., 2010), and future work may therefore show that trait variability under natural conditions is underestimated in the current dataset.

                  Tenets revisited

                  The results presented here are a first step to illuminate two basic tenets of plant comparative ecology and vegetation modelling at a global scale: (1) The aggregation of trait data at the species level represents the major fraction of variation in trait values. At the same time, we have shown surprisingly high intraspecific variation – for some traits responsible for up to 40% of the overall variation (Table 5, Figs 4 and 5). This variation reflects genetic variation (among genotypes within a population/species) and phenotypic plasticity. Through the TRY initiative, a relevant amount of data is available to quantify and understand trait variation beyond aggregation at the species level. The analysis presented here is only a first step to disentangle within- and between-species variability. It is expected that in combination with more detailed analyses the TRY database will support a paradigm shift from species to trait-based ecology.

                  (2) Basic PFTs, such as those commonly used in vegetation models, capture a considerable fraction of observed variation with relevant internal consistency. However, for certain traits the majority of variation occurs within PFTs – responsible for up to 75% of the overall variation (Table 5, Figs 4–8). This variation reflects the adaptive capacity of vegetation to environmental constraints (Fyllas et al., 2009; Meng et al., 2009; Ordoñez et al., 2009; Albert et al., 2010b; Poorter et al., 2010) and it highlights the need for refined plant functional classifications for Earth system modelling. The current approach to vegetation modelling, using a few basic PFTs and one single fixed parameter value per PFT (even if this value equals the global or regional mean), does not account for the rather wide range of observed values for related traits and thus does not account for the adaptive capacity of vegetation. A more empirically based representation of functional diversity is expected to contribute to an improved prediction of biome boundary shifts in a changing environment.

                  There are new approaches in Earth system modelling to better account for the observed variability: suggesting more detailed PFTs, modelling variability within PFTs or replacing PFTs by continuous trait spectra. In the context of this analysis we focused on a basic set of PFTs. This scheme is not immutable and there is no single given functional classification scheme. In fact, PFTs are very much chosen and defined according to specific needs and the availability of information. For example, the PFTs used in an individual-based forest simulator (e.g. Chave, 1999) are by necessity very different from those used for DGVMs. The TRY dataset will be as important for allowing the definition of new, more detailed PFTs as for parameterizing the existing ones. Some recent models represent trait ranges as state variables along environmental gradients rather than as fixed parameter values. Examples of this new generation of vegetation models are the O-CN model (Zaehle & Friend, 2010), the NCIM model (Esser et al., 2011) and, in combination with an optimality approach, the VOM model (Schymanski et al., 2009). Finally, functional diversity may be represented by model ensemble runs with continuous trait spectra and without PFT classification (Kleidon et al., 2009). Compared with current vegetation models, these new approaches will be more flexible with respect to the adaptive capacity of vegetation. The TRY database is expected to contribute to these developments, which will provide a more realistic, empirically grounded representation of plants and ecosystems in Earth system models.

                  A unified database of plant traits in the context of global biogeography

                  The analyses presented here are only a first step to introduce the TRY dataset. To better understand, separate, and quantify the different contributions to trait variation observed in TRY, more comprehensive analyses could be carried out, e.g. variance partitioning accounting for phylogeny and disentangling functional and regional influences, or analysis of (co-)variance of plant traits along environmental gradients. An integrative exploration of ecological and biogeographical information in TRY is expected to benefit substantially from progress in the science of machine learning and pattern recognition (Mjolsness & DeCoste, 2001). In principle, we are confronted with a challenge similar to the one genomics faced after large-scale DNA sequencing techniques had become available. Instead of thousands of sequences, our target is feature extraction and novelty detection in thousands of plant traits and ancillary information. Nonlinear relations among items and the treatment of redundancies in trait space have to be addressed. Nonlinear dimensionality reduction (Lee & Verleysen, 2007) may shed light on the inherent structures of data compiled in TRY. Empirical inference of this kind is expected to stimulate and strengthen hypothesis-driven research (Golub, 2010; Weinberg, 2010) towards a unified ecological assessment of plant traits and their role for the functioning of the terrestrial biosphere.

                  The representation of trait observations in a spatial or climate context in the TRY database is limited (Figs 2 and 3). This situation can be overcome using complementary data streams: trait information can be spatially expanded with comprehensive compilations of species occurrence data, e.g. from GBIF or herbarium sources. For SLA and leaf nitrogen content we provide an example for combining trait information with species occurrence data from the GBIF database and with climate reconstruction data derived from the CRU database (Fig. 3). Given that the major fraction of variation is between species, the variation of species mean trait values may be used – but with caution – as a proxy for trait variation, as has already been performed in recent studies at regional and continental scales (Swenson & Enquist, 2007; Swenson & Weiser, 2010). Ollinger et al. (2008) derived regional maps of leaf nitrogen content and maximum photosynthesis from trait information in combination with eddy covariance fluxes and remote sensing data. Based on these approaches and advanced spatial interpolation techniques (Shekhar et al., 2004), a unified global database of plant traits may permit spatial mapping of key plant traits at a global scale (Reich, 2005).
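The idea of spatially expanding trait information with occurrence data can be sketched as a simple join: each occurrence record inherits the mean trait value of its species, and values are then averaged per grid cell. Everything in the sketch is an assumption made for illustration (species names, coordinates, the 1° cell size, and the unweighted presence-only average); it is not the procedure used for Fig. 3.

```python
import math
from collections import defaultdict
from statistics import mean

# Hypothetical species mean trait values (log10 SLA) and occurrence records
species_mean_log_sla = {"species_a": 1.0, "species_b": 1.2, "species_c": 1.4}
occurrences = [  # (species, latitude, longitude)
    ("species_a", 50.3, 11.2),
    ("species_b", 50.7, 11.9),
    ("species_a", 50.1, 11.5),   # same species again in the same cell
    ("species_c", 10.2, -60.4),
    ("species_x", 10.8, -60.1),  # no trait data available: skipped
]


def grid_cell(lat, lon, size=1.0):
    """Lower-left corner of the grid cell containing the coordinate."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def map_trait(occurrences, species_means, size=1.0):
    """Mean of species mean trait values per grid cell (presence only,
    each species counted once per cell, no abundance weighting)."""
    species_per_cell = defaultdict(set)
    for species, lat, lon in occurrences:
        if species in species_means:
            species_per_cell[grid_cell(lat, lon, size)].add(species)
    return {
        cell: mean(species_means[s] for s in members)
        for cell, members in species_per_cell.items()
    }


trait_map = map_trait(occurrences, species_mean_log_sla)
print(trait_map)
```

Because each record carries only the species mean, intraspecific variation is ignored by construction, which is one reason the text advises caution when using species means as a proxy for trait variation.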

                  The relationship between plant traits (organism-level) and ecosystem or land surface functional properties is crucial. Recent studies have built upon the eddy covariance network globally organized as FLUXNET (a network of regional networks coordinating observations from micrometeorological tower sites: http://www.fluxnet.ornl.gov) and inferred site specific ecosystem-level properties from the covariation of meteorological drivers and ecosystem-atmosphere exchange of CO2 and water (Baldocchi, 2008). These include inherent water-use efficiency (Reichstein et al., 2007; Beer et al., 2009), maximum canopy photosynthetic capacity (Ollinger et al., 2008), radiation use efficiency and light response curve parameters (Lasslop et al., 2010). How species traits relate to these ecosystem-level characteristics has not been investigated, but should be possible via a combined analysis of FLUXNET and TRY data. For example, it is possible to test the hypothesized correlation between SLA, P, and N content of dominant species with radiation use efficiency and inherent water-use efficiency at the ecosystem level (as implicit in Ollinger et al., 2008). Similarly, patterns of spatially interpolated global fields of biosphere–atmosphere exchange (Beer et al., 2010; Jung et al., 2010a) may be related to spatialized plant traits in order to detect a biotic imprint on the global carbon and water cycles. Such increased synthetic understanding of variation in plant traits is expected to support the development of a new generation of vegetation models with a better representation of vegetation structure and functional variation (Lavorel et al., 2008; Violle & Jiang, 2009).

                  Conclusions and perspectives

                  The TRY database provides unprecedented coverage of information on plant traits and will be a permanent communal repository of plant trait data. The first analyses presented here confirm two basic tenets of plant comparative ecology and vegetation modelling at global scale: (1) the aggregation of trait data at the species level represents the major fraction of variation and (2) PFTs cover a relevant fraction of trait variation to represent functional diversity in the context of vegetation modelling. Nevertheless, at the same time these results reveal for several traits surprisingly high variation within species, as well as within PFTs – a finding which poses a challenge to large-scale biogeography and vegetation modelling. In combination with improved (geo)-statistical methods and complementary data streams, the TRY database is expected to support a paradigm shift in ecology from being based on species to a focus on traits and trait syndromes. It also offers new opportunities for research in evolutionary biology, biogeography, and ecology. Finally, it allows the detection of the biotic imprint on global carbon and water cycles, and fosters a more realistic, empirically grounded representation of plants and ecosystems in Earth system models.


                  We would like to thank the subject editor, the publisher for caution and patience, and two anonymous reviewers for supportive comments. The TRY initiative and database are hosted, developed and maintained at the Max-Planck-Institute for Biogeochemistry (MPI-BGC) in Jena, Germany. TRY is or has been supported by DIVERSITAS, IGBP, the Global Land Project, the UK Natural Environment Research Council (NERC) through its programme QUEST (Quantifying and Understanding the Earth System), the French Foundation for Biodiversity Research (FRB), and GIS ‘Climat, Environnement et Société’ France. We wish to thank John Dickie and Kenwin Liu for making the data from the KEW Seed Information Database available in the context of the TRY initiative, Alastair Fitter, Henry Ford and Helen Peat for making the Ecological Flora of the British Isles available, and Andy Gillison for the VegClass database. We wish to thank Brad Boyle and the SALVIAS project for building and making available a global checklist of plant species names, and GBIF (Andrea Hahn) for making the species occurrence data available. The authors thank the NSF LTER program DEB 0620652 and the NSF LTREB program DEB 0716587 for making data on plant traits available.