Competitors and ruderals go to town: plant community composition and function along an urbanisation gradient

Urbanisation is expected to function as a filter for plant species by changing the physicochemical environment, causing species turnover along an urbanisation gradient. Analyses of the functional traits of species characteristic of different urbanisation levels allow for comparisons across studies, irrespective of exact species composition. This study aims to examine how plant species and functional traits vary with urbanisation. An independent dataset obtained through the Global Biodiversity Information Facility (GBIF) was used to validate the identified indicator species. The study was performed in Trondheim, Norway. Indicator species of two different urbanisation levels were identified from a structured plot vegetation survey, and their functional traits were contrasted. The functional trait patterns were compared to patterns identified from an independent GBIF dataset. Changes in species composition along the urbanisation gradient reflected a shift in environmental and anthropogenic variables, filtering species with different functional traits. Indicators of urban areas displayed higher specific leaf area (SLA), nitrogen affinity and disturbance tolerance than indicators from non-urban areas. Not all functional trait differences observed between the indicator species from the vegetation survey were recognised in the independent dataset from GBIF. Nevertheless, the overall trends were consistent. Urbanisation favours species adhering to different trait syndromes than species outside of urban areas: disturbance tolerance over stress tolerance, and adaptation for rapid resource acquisition rather than resource conservation.


Introduction
Land-use modifications are extensive on global, national and local scales, resulting in greatly altered environmental conditions. Urbanisation has been increasing over the last decades due to the increasing world population and the fact that a greater proportion of the population currently lives in cities than in rural areas (IPBES 2019). Both human population size and urbanisation are expected to increase further in the years to come (United Nations 2018).
Urbanisation heavily modifies the physical environment (Kaye et al. 2006), with effects ranging from increased temperatures (the urban heat island effect; Oke 1988, Forman 2014a) to dry and alkaline soils (Forman 2014b) and increased nutrient levels due to both nitrogen deposition through air pollution and fertilisation (Pellissier et al. 2008). Such dissimilarities favour plant species adapted to different environments; increasing urbanisation filters plants according to their functional traits, and an extensive species turnover is expected along an urbanisation gradient. Indeed, previous studies have shown such a species turnover between urban and non-urban areas (Lososová et al. 2011), including changes in functional trait states of the involved plant species (Knapp et al. 2009).
Multiple processes have been theorised to underpin community assembly; specifically, the species-sorting paradigm dictates that community dynamics are mainly determined by environmental differences between habitat patches, and by species' adaptations to those environmental conditions (Leibold et al. 2004, Graae et al. 2018). Thus, abiotic conditions will filter which species are capable of persisting in a habitat patch based on their functional traits. Several functional traits are associated with plants responding positively or negatively to urbanisation, but a single 'urban trait syndrome' has not been identified (Aronson et al. 2016). Different studies have shown contrasting patterns, as reviewed by Williams et al. (2015); e.g. light affinity, life strategy (C: competitive, S: stress tolerant and R: ruderal, disturbance tolerant (Grime 1974)) and life form did not show unidirectional patterns in response to urbanisation. Plant functional traits can give insights into which environmental filters are at work in a given community. Thus, shifts in functional traits and trait states along an urbanisation gradient can elucidate the changes in environmental conditions in more detail than changes in species composition on their own. Furthermore, analyses of functional traits allow for comparisons across studies, regardless of species composition per se. The expectations are that, with increasing urbanisation, plant species will tend to display traits associated with resource acquisition: large, thin leaves (high SLA), short lifespan, high alkalinity and nitrogen affinities, and great drought and disturbance tolerance (Lososová et al. 2006, Knapp et al. 2008, Kalusová et al. 2017, Palma et al. 2017). Such adaptations will 'pass through the urban filter' of increased temperatures and low moisture availability, alkaline and nutrient-polluted soils and generally high disturbance levels (Forman 2014a, b).
In contrast, plant species dominant at the low end of an urbanisation gradient are expected to be adapted for resource conservation and relatively nutrient-poor conditions.
When characterising biological communities, indicator species analysis is a frequently used tool, using a species' association with a group of sites to identify diagnostic species (De Cáceres and Legendre 2009). Using indicator species rather than full species lists to characterise communities might reveal patterns and filters otherwise obscured by widespread, generalist species. Adding a functional component to traditional indicator species analysis can be beneficial, as more mechanistic underpinnings of the site-species relationship can be inferred (Ricotta et al. 2015). Thus, assessing the functional trait variation in indicator species rather than entire communities might amplify the patterns in plant functional responses, revealing underlying environmental conditions and filters. Functional traits of indicator species in comparison to other, more generalist species have been explored in other studies: for example, Hermy et al. (1999) compared traits of ancient forest specialists to other forest species; Ricotta et al. (2015) incorporated functional traits to improve the diagnostic values of indicator species; Conradi and Kollmann (2016) compared specialist species of old grasslands to species found in young, recovering grasslands; Auffret et al. (2017) studied spatial turnover in abandoned grasslands based on functional traits and biogeographical variables; and Ladouceur et al. (2019) examined functional trait variation of European temperate grasslands. Whether the functional trait variation in indicator species reflects the trait variation displayed by entire communities is a question warranting attention.
An increasing amount of species occurrence data is becoming available through open databases, such as the Global Biodiversity Information Facility (GBIF) (Gaiji et al. 2013, GBIF.org 2019). These databases combine data from professional scientists (both opportunistic recordings and structured surveys) with opportunistic amateur recordings ('citizen science'). Despite biases in space, time and taxonomy inherent in such predominantly opportunistic data sets (Speed et al. 2018), they offer access to data quantities unfeasible through structured surveys alone (Theobald et al. 2015), and they are increasingly used in research. If open-access, compiled datasets are to serve as proxies for field surveys, it is crucial to know whether the two approaches reflect the same ecological patterns and mechanisms.
The aims of this study are twofold. The first aim relates to assessing how species and functional traits vary along an urbanisation gradient by addressing the following questions.
1) What species are indicative of different urbanisation levels, and do these species adhere to different trait syndromes in response to differences in environmental conditions?
The second aim is to validate the identified differences in functional traits along an urbanisation gradient by comparing them to potential patterns observed in an independent, structurally different dataset, addressing the following question.
2) Are the patterns in functional traits observed in a structured field survey recognisable in data from openly available portals, including opportunistic citizen science recordings?

Material and methods
All analyses were performed in R ver. 3.6.1 (<www.r-project.org>). Statistical significance was assumed at p ≤ 0.05.
The study was carried out within Trondheim Municipality administrative borders (Norway), around 63°42'N, 10°38'E (Fig. 1a-b). It is a southern-boreal (Moen 1999), coastal municipality with an area of ≈ 322 km², a population of ≈ 196 000 people (Statistics Norway 2020a, Trondheim Kommune 2020), and annual mean temperature and precipitation of approximately 5°C and 884 mm (Climate-Data.org 2020). Note that the population size is as of 1 January 2019, prior to the merger with Klaebu Municipality on 1 January 2020.
Trondheim is an administrative centre, dominated by education and service businesses and with limited industrial activities (Statistics Norway 2020b).

Sampling design
The data used in the study were part of a previous unpublished investigation, conducted by (now retired) academic staff at the NTNU University Museum in 2001-2002 and reanalysed here. The description of sampling methodology is based on the original (Norwegian) reports and interviews with retired staff members. Fifty 100 × 100 m plots (as defined by the UTM grid) were selected to represent a transect from the city centre to surrounding semi-natural areas. The plots were selected by the original investigators using stratified random sampling to represent the variation within the transect with regard to history, types of buildings and land-cover types (Fig. 1). The semi-random sampling of UTM grid cells along the gradient can be justified, as completely random sampling could have included grid cells not allowing for sufficient vegetation surveys (e.g. grid cells consisting primarily of private gardens and/or buildings).
In such cases, nearby UTM cells with adequate vegetation cover were selected.
During the field seasons of 2001 and 2002, a trained botanist from the NTNU University Museum (E. Fremstad) inventoried the plots, registering vascular plant species in accessible areas. Only self-seeded species were counted; thus, planted species were not included, but potentially escaped garden plants were included. Each species was registered on a Braun-Blanquet rank-abundance scale: rare (1), scattered (2), common (3) and dominant (4). A total of 317 taxa were identified to species; 11 taxa were only identified to genus level, but are assumed not to belong to an already identified species.

Land-cover and environmental variables
Land-cover and environmental variables were retrieved from multiple sources: data on land-cover comparable to the 2001-2002 datasets were retrieved from official digitised land-cover maps from 2003 ('Digitalt Markslagskart', DMK, 1:5000) from NIBIO (Norwegian Inst. of Bioeconomy Research 2019). The relatively detailed land-cover classifications are largely based on productivity of the land. DMK maps are no longer updated, and have been replaced by the AR5 system (Land Resource map 1:5000). More recent land-cover maps were needed for the dataset used for evaluation of the analyses (see the following section: 'Evaluation with unrelated dataset'). Later land-cover was based on the Norwegian AR5 maps from NIBIO (Norwegian Inst. of Bioeconomy Research 2018). Shapefiles of the land-cover maps were provided by Trondheim Municipality in April 2018. The AR5 maps are both continually and periodically updated, and provide the most complete data on national land resources (Kartverket 2019). To reduce the number of variables and ensure comparability between the different data sets, similar land-cover categories were merged (Supporting information).
The digital land-cover maps were overlaid with the sampling plot polygons, and the area of each land-cover category within the plots was calculated, using the R-packages 'rgdal', 'sp', 'raster' and 'rgeos', functions 'intersect()' and 'gArea()' (Pebesma and Bivand 2005, Bivand et al. 2013, Bivand and Rundel 2020, Hijmans 2020). The original investigation included field-based registration of 'multi-layered forest' (m²), which can be interpreted as a measure of (semi-)natural forest cover. This categorisation could be found within all forest categories described in the Supporting information. This area of semi-natural forest was included as a variable in the analyses. The original calculations were done in ArcMAP ver. 10.6 (ESRI 2018). For each plot, the approximate age (years) of built structures had been assigned in the original unpublished data. These approximate age classes were based on Skogrand (1990), historical maps and local knowledge (Petersen et al. unpubl.). If the plot had no built structures, the age of the buildings was set as zero. Structures built before 1900 were assigned the age 100 years.
Mean aspect of the terrain was included as a land-cover variable. Data on aspect was taken from a digital terrain model raster with a resolution of 25 × 25 m, retrieved from the Dept of Natural History at the NTNU University Museum (pers. comm.). For each plot-polygon, the mean aspect of the included raster cells was calculated (degrees).

Cluster analysis
Community and distance matrices (species-by-site) were constructed, using Gower's dissimilarity index (package 'cluster', function 'daisy()'). The plots were divided into two groups using hierarchical cluster analysis based on the dissimilarity matrix with 'ward.D' as the agglomeration method (function 'hclust()'). Differences in distributions of the environmental variables (mean aspect, age of built structures, proportion of developed area and area of multi-layered forest) between the clusters were assessed using Mann-Whitney U tests (due to non-normality).
Overlaps in species composition between the clusters were assessed visually through a Venn diagram, and species richness was compared using generalised linear models (GLM) with a negative binomial distribution (due to overdispersion) (package 'MASS', function 'glm.nb()'): species richness was thus modelled as a function of cluster.

Indicator species analysis
To identify the associations of species with the different clusters, indicator species analysis was performed using the function 'multipatt()' from the package 'indicspecies', using 9999 permutations (De Cáceres and Legendre 2009). This analyses the association between species' occurrences and the classification of sites. Plant community data were transformed to species presence/absence prior to analysis, due to the ordinal nature of the Braun-Blanquet-like inventory. Pearson's phi coefficient of association was used as the association statistic; this correlation index is used to determine the ecological preferences of species among sets of alternative site groups. Thus, identified indicator species were highly correlated with the indicated site group. p-values of the correlation coefficients were adjusted for multiple comparisons, using the Benjamini-Hochberg method (function 'p.adjust()').
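The two quantities named above can be illustrated with a minimal sketch. The study itself used R ('multipatt()' and 'p.adjust()'); the Python below is only an illustration under stated assumptions, with hypothetical function names, and it omits the permutation test that produces the raw p-values.

```python
import math

def phi_coefficient(presence, in_group):
    """Pearson's phi between a species' presence/absence and site-group membership.

    Both arguments are equal-length sequences of 0/1 values, one entry per plot.
    """
    pairs = list(zip(presence, in_group))
    a = sum(1 for p, g in pairs if p and g)          # present, inside the group
    b = sum(1 for p, g in pairs if p and not g)      # present, outside the group
    c = sum(1 for p, g in pairs if not p and g)      # absent, inside the group
    d = len(pairs) - a - b - c                       # absent, outside the group
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum so that
    # adjusted values never increase as the raw p-values decrease.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top                    # 1-based ascending rank
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A species would then be retained as an indicator when its adjusted p-value is at or below the 0.05 threshold stated for this study.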
The following functional traits of the determined indicator species were requested through the TRY database (Kattge et al. 2011) on 24 September 2019, and downloaded on 27 September 2019 (request no. 7284): Ellenberg indicator values (EIV) for light (EIV L), moisture (EIV F), nitrogen (EIV N), pH (EIV R) and temperature (EIV T), generative and vegetative height, seed bank longevity, seed dry mass, specific leaf area, life span and life strategy (C-S-R, Grime 1974) (Supporting information). For numerical traits, the mean across all measurements was assigned as the trait value of the species. For categorical traits, the trait state reported most frequently was assigned as the trait state of the species. For example, the life strategy of Molinia caerulea was reported as C (n = 1), CS (n = 2), R (n = 1) and S (n = 1) in the TRY database; in this analysis CS was thus used as the trait state.
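The aggregation rule described above (mean for numerical traits, most frequent state for categorical traits) can be sketched as follows. This is an illustrative Python sketch, not the study's R code; the function names are hypothetical, and returning all tied states is an assumption, since the text does not say how ties were resolved.

```python
from collections import Counter
from statistics import mean

def aggregate_numeric(measurements):
    """Species-level value for a numerical trait: mean of all measurements."""
    return mean(measurements)

def aggregate_categorical(states):
    """Species-level state for a categorical trait: the most frequent state.

    Returns a sorted list so that ties are visible rather than silently broken
    (tie handling is an assumption, not described in the text).
    """
    counts = Counter(states)
    top = max(counts.values())
    return sorted(state for state, n in counts.items() if n == top)

# The Molinia caerulea example from the text: C (1), CS (2), R (1), S (1).
molinia = ["C", "CS", "CS", "R", "S"]
print(aggregate_categorical(molinia))  # ['CS']
```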
Indicator values are not functional traits per se, but will be referred to as such for simplicity. Ellenberg indicator values are not strictly numerical, but rather ordinal in nature. However, it has been shown that the indices can be treated as numerical in the case of multiple measurements (Bartelheimer and Poschlod 2016). Differences in trait value distribution between the indicator species of the clusters were tested with a non-parametric Mann-Whitney U-test for numerical variables, except for SLA (unspecified petiole), which was tested with a t-test. The distribution of categorical variables was tested with Fisher's exact test, due to zeros in the observed counts in the contingency tables.
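For readers unfamiliar with the test statistic, the Mann-Whitney U for the first of two samples can be computed by counting pairwise comparisons. The sketch below is an illustrative Python version with a hypothetical function name and made-up trait values; it returns only the statistic, not the p-value, and is not the study's R code.

```python
def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y.

    Counts the pairs (a, b) with a from x and b from y in which a exceeds b,
    with tied pairs contributing one half.
    """
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in x
        for b in y
    )

# Hypothetical indicator values for two small groups of species.
urban = [7, 8, 6, 7]
non_urban = [3, 5, 6, 4, 2]
u_urban = mann_whitney_u(urban, non_urban)
u_non_urban = mann_whitney_u(non_urban, urban)
# The two statistics always sum to the number of pairs compared.
assert u_urban + u_non_urban == len(urban) * len(non_urban)
```

A large U relative to the number of pairs indicates that values in the first group tend to exceed those in the second.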

Evaluation with unrelated dataset
All occurrence records including spatial information from within Trondheim municipality published on the Global Biodiversity Information Facility (GBIF) (GBIF.org 2019) were downloaded. These were used to evaluate whether the patterns identified from the inventoried dataset are visible in openly available citizen science datasets, and to validate the indicator species. Data filtering and cleaning were performed based on the following criteria: 1) only records belonging to the kingdom 'Plantae'; 2) only occurrence records with no spatial issues; 3) only records including information on species; 4) only records with a coordinate uncertainty of ≤ 354 m; 5) only records observed between 1 January 2001 and 31 December 2018. Potential duplicate records were removed (according to species name, date, basis of record, coordinates and coordinate uncertainty). The filtered dataset consisted of 19 974 records; the records thus reflect individual sampling events. Using the GBIF backbone taxonomy, a subset of the data set was constructed by only including the identified indicator species (4292 records).
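The five filtering criteria and the duplicate removal can be expressed as a short pipeline. The following Python sketch is illustrative only: the record keys ('kingdom', 'has_spatial_issue', 'species', 'uncertainty_m', 'date', 'basis', 'lat', 'lon') are hypothetical names rather than the actual GBIF download columns, and dropping records with a missing coordinate uncertainty is an assumption.

```python
from datetime import date

START, END = date(2001, 1, 1), date(2018, 12, 31)
MAX_UNCERTAINTY_M = 354

def passes_filters(rec):
    """True if a record satisfies criteria 1-5 listed in the text."""
    return (
        rec["kingdom"] == "Plantae"                  # 1) plants only
        and not rec["has_spatial_issue"]             # 2) no spatial issues
        and bool(rec["species"])                     # 3) identified to species
        and rec["uncertainty_m"] is not None         # assumption: missing = drop
        and rec["uncertainty_m"] <= MAX_UNCERTAINTY_M  # 4) uncertainty <= 354 m
        and START <= rec["date"] <= END              # 5) observation period
    )

def clean(records):
    """Apply the filters, then drop duplicates on the key fields from the text."""
    seen, kept = set(), []
    for rec in records:
        if not passes_filters(rec):
            continue
        key = (rec["species"], rec["date"], rec["basis"],
               rec["lat"], rec["lon"], rec["uncertainty_m"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept
```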
Trondheim municipality was divided into a 500 × 500 m grid, overlaid on DMK land-cover last updated in 2003, AR5 land-cover last updated in 2012 and AR5 land-cover last updated in 2018 (packages 'sp' and 'raster', functions 'SpatialGrid()' and 'intersect()'). For each grid cell, the mean area of each land-cover type across all time periods was calculated, thus using developed area as a proxy for level of urbanisation. Grid cells entirely covered by water (i.e. marine or entirely limnic cells) were removed, and potential records within these were assumed to be errors. Each grid cell was denoted as either 'urban' or 'non-urban', depending on the percentage of developed area within the cell. The cut-off was set at ≥ 20%, reflecting 'high' and 'moderate' levels of urbanisation as described by McKinney (2008).
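The classification rule can be sketched in a few lines. This Python sketch is illustrative and not the study's R code; the function names are hypothetical, and expressing developed area as a percentage of the full 500 × 500 m cell (rather than of its land area only) is an assumption.

```python
URBAN_CUTOFF_PERCENT = 20.0
CELL_AREA_M2 = 500 * 500

def mean_developed_percent(developed_m2_by_map, cell_area_m2=CELL_AREA_M2):
    """Mean developed area across the land-cover maps, as a percentage of the cell.

    developed_m2_by_map holds one value per map version (here 2003, 2012, 2018).
    """
    mean_area = sum(developed_m2_by_map) / len(developed_m2_by_map)
    return 100.0 * mean_area / cell_area_m2

def classify_cell(developed_percent):
    """'urban' at or above the 20% cut-off, otherwise 'non-urban'."""
    return "urban" if developed_percent >= URBAN_CUTOFF_PERCENT else "non-urban"

# Hypothetical cell with 40 000, 50 000 and 60 000 m2 developed in the three maps.
pct = mean_developed_percent([40_000, 50_000, 60_000])
print(classify_cell(pct))
```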

Validation of indicator species
To validate the status of the indicator species, two logistic models were constructed predicting the probability of presence of an 'urban' and 'non-urban' indicator species, respectively, as the response variable. Both used percentage of developed area within a grid cell (a proxy of urbanisation) as the predictor variable. To account for the differences in sampling intensity, the total number of records within each grid cell was centred and scaled (subtracting the mean value and dividing by standard deviation, function 'scale()'), and included as a covariate (799 out of 1494 grid cells). Due to a high number of grid cells with no records of the indicator species, the models were fitted with a complementary log-log link ('cloglog'). As hump-shaped relationships were expected, all models were fitted with a quadratic term. Preliminary models showed spatial autocorrelation in the model residuals when testing the observed Moran's I against a Monte-Carlo simulation of randomly distributed values (999 permutations). To account for this, Matérn correlation functions were included as random effects (package 'spaMM', function 'fitme()'). The final models were thus of the form: Presence Indicator = % Developed Area + (% Developed Area)² + No. records + Matérn(1|longitude + latitude). Subsequent stepwise backwards model selection was based on ∆AIC.
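The fixed-effect part of this model form can be made concrete with a short sketch. The Python below is illustrative only: the models were fitted in R with 'spaMM', the coefficient names b0 to b3 are hypothetical placeholders rather than fitted values, and the Matérn spatial random effect is omitted.

```python
import math
from statistics import mean, stdev

def scale(values):
    """Centre and scale: subtract the mean and divide by the sample SD (n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def inv_cloglog(eta):
    """Inverse complementary log-log link: maps a linear predictor to a probability."""
    return 1.0 - math.exp(-math.exp(eta))

def presence_probability(developed_pct, n_records_scaled, b0, b1, b2, b3):
    """Fixed-effect prediction with linear and quadratic urbanisation terms.

    The spatial random effect of the fitted models is not included here.
    """
    eta = b0 + b1 * developed_pct + b2 * developed_pct ** 2 + b3 * n_records_scaled
    return inv_cloglog(eta)
```

With a negative quadratic coefficient (b2 < 0) and a positive linear one, the predicted probability is hump-shaped in developed area; dropping the quadratic term leaves a monotonic response.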

Comparison of functional traits
For comparison of the functional traits of the indicator species and the species registered in GBIF, the distributions of functional traits of species occurring in urban and non-urban grid cells were compared, as was done for the indicator species. In total, 1297 unique species names in the dataset had accepted synonyms and available trait data in the TRY database. The species were assigned occurrence status in either urban (800 species) or non-urban (1116 species) grid cells. Occurrence in both categories was thus possible. Six hundred and nineteen species occurred in both urban and non-urban cells, 181 only in urban cells and 497 only in non-urban cells. The differences in distribution of (numerical) functional traits of the species found in urban and non-urban grid cells were tested with a Mann-Whitney test. The categorical traits were tested with a χ² test (life strategy) or Fisher's exact test (Raunkiaer life form and life span) (due to zero expectations and observed counts < 5). The results were compared with the results for indicator species.
Species composition overlapped between the two clusters (160 species), but both clusters had unique species (76 unique 'urban' species and 81 unique 'non-urban' species) (Supporting information). Species richness differed significantly between the clusters (Supporting information).

Indicator species analysis
The indicator species analysis for the plant communities, performed on a presence-absence community matrix, resulted in 57 indicator species (12 urban, 45 non-urban) according to adjusted p-values (Table 1). The identified urban indicators include locally and regionally widespread and common species, generally viewed as indicative of cultivated/managed fields and meadows, and anthropogenic habitats. One species (Lepidotheca suaveolens) is alien to Norway (Norwegian Biodiversity Information Centre 2018). The indicators of non-urban areas are common/widespread native species, generally associated with forests and/or wetlands (Norwegian Biodiversity Information Centre 2018).
Urban indicator species had higher EIV L (Mann-Whitney U = 418, p < 0.001), higher EIV N (Mann-Whitney U = 451, p < 0.001), higher vegetative height (Mann-Whitney U = 371, p = 0.048), lower seed mass (Mann-Whitney U = 104, p = 0.007) and higher SLA (excl. petiole) (Mann-Whitney U = 404, p = 0.004), compared to non-urban indicator species (Fig. 2, 3). For the categorical traits, Fisher's exact test showed significant differences for life strategy (p = 0.005) and lifespan (p = 0.003). The differences are likely driven by the urban indicator species having relatively many R- and CR-strategists, and relatively few S- and CS-strategists, whereas the opposite was true for the non-urban indicator species. The urban indicator species likewise had relatively many annual species and few perennial species, whereas the opposite was the case for the non-urban indicator species.
Table 1. Plant indicator species (Lid and Lid 2005). Species recognised as indicator species for the two clusters according to Pearson's phi coefficient of correlation (Stat) and the Benjamini-Hochberg adjusted p-value. Nomenclature is based on GBIF's backbone taxonomy (package: …
Figure caption (fragment): … = specific leaf area, not defined whether petiole was included or excluded in the measurements. The abbreviations used in the …

Validation of indicator species
For the models of the probability of presence of an indicator species in 500 × 500 m grid cells, based on GBIF data and predicted by the proportion of developed area, the best models did not contain the same predictors for urban and non-urban indicators. For the model of urban indicator species, the quadratic term was retained during model selection, whereas a linear model provided the best fit for the non-urban indicators (Fig. 5, Table 2). Large confidence intervals around the model predictions for urban indicators, however, undermine this model.

Functional traits in urban versus non-urban grid cells
Species found in urban grid cells had lower affinity for moisture than species found in non-urban grid cells (W = 276 318, p < 0.001). Species in urban grid cells had higher nitrogen affinity (W = 192 300, p < 0.001) and alkaline affinity (W = 220 799, p = 0.023) compared to species observed in non-urban grid cells. Species in urban grid cells were taller, both when comparing generative height (W = 135 166, p = 0.027) and vegetative height (W = 192 100, p = 0.038), and they had larger specific leaf area (petiole incl.) (W = 79 536, p = 0.024) compared to species from non-urban grid cells (Supporting information, Table 3). Strategy differed between urban and non-urban grid cells (χ² = 12.002, p = 0.035), as did life form (Fisher's exact test with simulated p-values, p < 0.001); however, only a surplus of CR-species among the species found in urban grid cells had a standardised residual > 2. Urban grid cells had more species with a transient seed bank than expected by chance (χ² = 20.012, p < 0.001) (Supporting information).

Discussion
Urbanisation is identified as a major driver of plant community composition, regarding both species composition and differences in functional traits. By examining vegetation data along an urbanisation gradient, we found that plant species associated with urban areas had higher affinities for light and nitrogen, were taller and had smaller seeds and larger specific leaf area than species associated with non-urban areas. More urban species were annuals and adhered to ruderal and competitive/ruderal strategies than expected by chance, whereas non-urban species were more often stress-tolerant or competitive/stress-tolerant strategists than expected by chance. The identified patterns reflected a shift in environmental and anthropogenic variables, indicating how urbanisation filters species displaying different trait syndromes. The probability of presence of non-urban indicator species records, based on an unrelated dataset from the Global Biodiversity Information Facility (GBIF.org 2019), showed a linear negative response to increasing urbanisation, validating their status as indicator species. The probability of presence of the urban indicator species records from GBIF showed a hump-shaped relationship with increasing urbanisation, as expected; large uncertainty around the predictions, however, undermines the validity of this group. The functional trait differences observed in the structured survey were partially validated by comparison to independent data from the publicly available GBIF datasets, registered in either urban or non-urban areas.
The results showed that a species turnover happens along an urbanisation gradient, driven by a filtering of species with contrasting functional traits. The filtering shows how species capable of surviving in urban areas adhere to a general trait syndrome with trade-offs in favour of disturbance tolerance over stress tolerance, and of rapid resource acquisition, advantageous in nutrient-rich environments, over resource conservation, beneficial in nutrient-poor sites.

Functional traits of indicator species
Abiotic conditions (such as climate, environment and disturbance regime) determine which plant species are capable of persisting in a habitat; the conditions filter species based on their adaptations and functional traits (Diaz et al. 1998). The differences in functional traits between the indicator species from urban and non-urban areas, respectively, reflect the filtering effects of urbanisation, favouring different trait syndromes along an urbanisation gradient (Fig. 4). Non-urban indicators displayed lower nitrogen affinity (EIV N values) and specific leaf area (SLA) on average compared to the urban indicators. Urban habitats present rather extreme environments, characterised by repeated disturbance and eutrophication. This is in part caused by nitrogen deposition and relatively complex pollution. Urbanisation thus favours species tolerant of or adapted to high soil nutrient levels (Pellissier et al. 2008), which is seen here in the high EIV N values of urban indicators. SLA is a trait generally responding to environmental gradients, with high SLA values generally indicating a 'disposable' strategy (Cornwell and Ackerly 2009). The higher SLA in urban indicators seen in this study reflects that nitrogen-rich soils favour plants with large and thin leaves (Knapp et al. 2009, Ordoñez et al. 2009). An increase in nitrogen affinity and specific leaf area with increasing urbanisation was also seen by Knapp et al. (2008, 2009, 2010) and Vallet et al. (2010). The two groups also differed in strategy, with urban indicators having a high representation of R- and CR-strategists, similar to what was seen by Kalusová et al. (2017), compared to more S- and CS-strategists in the non-urban group than could be expected by chance (Pellissier et al. 2008). The frequent disturbance of urban habitats favours species tolerant of such disturbance rather than stress-tolerant or competitive species. Thus, the indicator species of urban areas are overall short-lived, ruderal species, adapted for nutrient-rich habitats. Regarding differences in seed mass, the results are not as easily interpretable. The lower seed mass of urban indicators is potentially an artefact of the overrepresentation of ruderal species, which generally have relatively small seeds (Grime 1988). The higher affinities for light of the urban indicator species can be related to the characteristics of the non-urban indicator species: several of these are forest species, and are therefore assumed to be adapted to shady conditions. This can further be coupled to the differences seen in environmental conditions within the plots: the urban plots have only small areas of multi-layered forest, and thus favour species tolerant of full light exposure.
[Figure: trait values of urban versus non-urban indicator species. Panels: SLA pet. excl. *, SLA pet. incl., SLA und., EIV T, Height generative, Height vegetative *, Seed dry mass *, EIV L *, EIV F, EIV N *, EIV R. X-axis categories: Urban, Non-urban.]
The differences in lifespan between urban and non-urban indicators are driven by a high number of annual species among the urban ones (similar to the results of Knapp et al. 2008). This relates to the generally rapid life cycle of ruderal species, and the abundance of these within the urban indicators. This repeats previously observed patterns, favouring specific trade-offs in functional traits: urbanisation filters out long-lived, stress-tolerant species capable of growing in nutrient-sparse environments, instead favouring species tolerant of the disturbed, nutrient-rich conditions within urban areas. Interestingly, two of the identified urban indicators (Capsella bursa-pastoris and Stellaria media) were found to be cosmopolitan species occurring in ≥ 94% of the 110 investigated cities in the synthesis by Aronson et al. (2014), even though their study did not include Scandinavian cities.

Indicator species' response to urbanisation
The probability of a non-urban indicator being registered at any location in GBIF decreased linearly with increasing percentage of developed land cover (a proxy for urbanisation). This unidirectional response to increasing urbanisation validates their status as indicator species, with a probability decreasing from approximately 0.56 at 0% developed area to 0.14 at 97% developed area. Interestingly, the quadratic term was retained in the model of urban indicator species, peaking at intermediate levels of urbanisation. This suggests that these species tolerate the rather extreme urban conditions (frequent disturbance, drought, nutrient-rich and alkaline soils), rather than respond positively to high levels of urbanisation per se. This is also in concordance with previous studies showing plant species richness peaking at intermediate levels of urbanisation (likely due to intermediate levels of disturbance) (McKinney 2008), and values of certain functional traits (e.g. SLA) also peaking at intermediate urbanisation levels (Thompson and McCarthy 2008). However, the large uncertainty around the predictions (as illustrated by the large confidence intervals in Fig. 5) undermines the validity of this group as reliable indicators of urbanisation, at least according to the GBIF data. Differences in absolute values of the probabilities between the groups are not directly comparable, as they include different numbers of species. Comparisons should thus be limited to the shape of the curves. The difference in number of species is likely to contribute, at least in part, to the difference in uncertainty in the model predictions: fewer urban species in the data lead to fewer data points on which to base the models, and thus to larger uncertainty. Furthermore, the spatial scales of the original investigation and the validation dataset differ by a factor of 25 in area (100 × 100 m plots versus 500 × 500 m grid cells).
As species' responses to environmental conditions are scale-dependent (Pautasso 2007), the discrepancy between the response of the urban indicator species in the two datasets can relate to the difference in scale;  i.e. the direct response from the urban indicators to urbanisation are likely determined on a smaller scale than the one used for the GBIF data.
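The two response shapes discussed above can be illustrated with a minimal sketch. It assumes that the registration probabilities come from a binomial regression with a logit link, which is not stated in this section. The non-urban curve is drawn as a straight line on the logit scale through the two reported probabilities (0.56 at 0% and 0.14 at 97% developed area). The coefficients for the urban curve are hypothetical and chosen only to produce a peak at intermediate urbanisation; they are not the fitted values. All function names are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Map a linear predictor back to the probability scale."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(developed_pct, b0, b1, b2=0.0):
    """Registration probability at a given percentage of developed land.

    Assumed form: logit(p) = b0 + b1 * x + b2 * x**2, where b2 = 0 gives
    a unidirectional response and b2 < 0 a hump-shaped one.
    """
    eta = b0 + b1 * developed_pct + b2 * developed_pct ** 2
    return inv_logit(eta)


def peak_location(b1, b2):
    """Developed-land percentage at which a hump-shaped response peaks."""
    if b2 >= 0:
        raise ValueError("a peak requires a negative quadratic coefficient")
    return -b1 / (2.0 * b2)


# Non-urban indicators: line on the logit scale through the two
# probabilities reported in the text (0.56 at 0 %, 0.14 at 97 %).
b0_non = logit(0.56)
b1_non = (logit(0.14) - logit(0.56)) / 97.0

# Urban indicators: HYPOTHETICAL coefficients, for illustration only.
b0_urb, b1_urb, b2_urb = -2.0, 0.06, -0.0006

if __name__ == "__main__":
    for x in (0, 25, 50, 75, 97):
        p_non = predicted_probability(x, b0_non, b1_non)
        p_urb = predicted_probability(x, b0_urb, b1_urb, b2_urb)
        print(f"{x:3d}% developed: non-urban {p_non:.2f}, urban {p_urb:.2f}")
    print("urban peak at", round(peak_location(b1_urb, b2_urb), 1), "% developed")
```

With a negative quadratic coefficient the predicted probability rises and then falls, so the peak sits at -b1 / (2 * b2) on the urbanisation axis. This is the sense in which only the shapes of the two curves, not their absolute heights, are compared.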

Functional traits in urban versus non-urban grid cells
An overall difference in trait syndrome between GBIF registrations in urban and non-urban areas was visible as well. On average, plant species registered in urban areas (not limited to the indicator species) had lower moisture affinity (EIV F) values. This reflects how soils in urban areas are generally drier than soils outside of cities (Forman 2014b), requiring drought tolerance for plants to persist in the urban environment. Species observed in urban grid cells had higher nitrogen (EIV N) and alkaline (EIV R) affinities, again reflecting the (nutrient) pollution seen in urban areas. The relatively large SLA (incl. petiole) of species in urban grid cells matches the patterns seen in other studies (Knapp et al. 2008, Vallet et al. 2010), reflecting the resource-acquisitive strategy of species thriving in urban areas. The greater average height of urban species has been observed in other studies as well (Palma et al. 2017, Cochard et al. 2019), but the underlying reason is not immediately evident. It may relate to taller, more conspicuous vegetation being preferred by humans, and thus being more likely to be preserved (Duncan et al. 2011).
A surplus of annual species in urban areas was also seen by Knapp et al. (2008), Williams et al. (2015) and Palma et al. (2017), and likely relates to the overall short-lived, resource-acquisitive character of the urban species. This strategy is also reflected in the relatively many therophytes (versus fewer chamaephytes) within the GBIF records from urban grid cells (Knapp et al. 2010, Concepción et al. 2016). The many chamaephytes observed in non-urban grid cells likely reflect that a large share of the species registered in Trondheim are classified as chamaephytes (561 out of 1297 species), and that more species were registered in non-urban areas (1116 species) than in urban ones (800 species). More species registered in urban areas had a transient seed bank, relative to species registered in non-urban areas (Vallet et al. 2010, Kalusová et al. 2017). Williams et al. (2005) found a persistent soil seed bank to increase the probability of local extinction in urban areas. A transient seed bank could be sufficient for the persistence of plants in a disturbed environment. In addition, the presence or absence of a persistent seed bank is likely associated with other functional trade-offs characteristic of different syndromes.
Thus, some patterns in functional traits observed in the indicator species were reflected in the GBIF data: increasing nitrogen affinity and SLA, more annual species and more species with a ruderal or ruderal-competitive life strategy with increasing urbanisation. Likewise, the high frequency of CR-species in urban grid cells is somewhat reflected in the R- and CR-strategists among the urban indicator species. In contrast, some differences between urban and non-urban species within the GBIF records were not observed among the indicator species (taller species, higher alkaline affinity, greater drought tolerance in urban areas). Although lifespan did not differ significantly between the two GBIF groups, and life form did not differ between the urban and non-urban indicator species, the relatively high number of therophytes in the urban GBIF records is mirrored in the abundance of annuals among the urban indicators.
Nevertheless, the identified traits fit within the same 'disturbance-tolerant, resource-acquisitive' strategy displayed by the urban indicators, as opposed to the 'stress-tolerant, conservative' strategy of the non-urban indicators. As similar overall trait syndromes are thus seen in both independent datasets, the generality of the pattern is emphasised.
In conclusion, plant species characteristic of urbanised areas tend to adhere to a disturbance-tolerant, resource-acquisitive life strategy, compared to the more stress-tolerant, conservative strategy seen among plant species indicating non-urban areas. The functional traits that most clearly demonstrate these differences along an urbanisation gradient are nitrogen affinity (EIV N), specific leaf area (SLA) and CSR strategy. These differences were also detected in the GBIF dataset used for evaluation. In addition, the evaluation dataset showed urban species to have a lower moisture affinity (EIV F) and higher alkaline affinity (EIV R), to be taller, and to more often have a transient seed bank. Thus, although not all differences in functional trait states exhibited by the indicator species determined from a professionally collected dataset are reflected in data from open-source datasets, the overall tendencies are congruent. It can therefore be argued that while such unstructured data cannot be used in place of professional surveys, it can provide a quantity of data impossible to collect otherwise and can serve as additional data when looking for coarse-scale patterns.
All other relevant species occurrence data is available from a public repository (GBIF Occurrence Download 10.15468/dl.aarqnj, accessed via GBIF.org on 2019-05-24).