The observed link between urbanization and invasion can depend on how invasion is measured

Cities are thought to promote biological invasions because invasive species are more often introduced in urban areas and because they are more successful in disturbed environments. However, the association is not as strongly supported by the literature as is generally assumed and might depend on how urbanization and invasion are measured. In this study, we test if the type of data used to assess the link between urbanization and invasion can affect a study's conclusions.


| INTRODUCTION
It is generally assumed that urbanization promotes biological invasions (Borden & Flory, 2021; Gaertner et al., 2017). This assumption is supported by multiple studies showing that invasive species richness increases along urbanization gradients (Kühn et al., 2017), that invasive populations reach higher densities in urban areas, and that invasive species are in better physical condition and have greater reproductive output in urban than in nonurban areas (Marques et al., 2020). Two nonexclusive processes can lead to these positive associations between urbanization and invasion.
First, invasive species are more often introduced in urban areas because cities concentrate human activities and thus introduction events (Padayachee et al., 2017). Second, invasive species are more successful in cities because urban-modified environmental conditions increase their probability of surviving and reproducing as well as their secondary spread through landscapes (Kowarik & von der Lippe, 2011; Marques et al., 2020).
However, the positive relationship between urbanization and biological invasions might be neither as systematic nor as strongly supported by the literature as is generally assumed. For instance, the abundance (i.e. number of individuals) of invasive species increases with urbanization in invertebrates but not in plants and vertebrates. In addition, the proportion of urban invasions might be overestimated because the detection of invasive populations is likely biased towards urban areas, which contain higher densities of people and thus offer more opportunities for detection (Adams et al., 2020; Fithian et al., 2015; Hughes et al., 2021). More generally, wildlife is recorded more often in areas that are easily accessible to humans (Petersen et al., 2021). For example, among over 700 million animal occurrences on GBIF (www.gbif.org), Hughes et al. (2021) found that only 11% of Earth's land surface was sampled, that 80% of records were within 2.5 km of roads, and that 22%-47% of records were within urban areas. Finally, urbanization gradients are often poorly defined, sometimes as the distance to a city centre, or grouped in broad urban versus nonurban classes. Therefore, the general link between urbanization and biological invasions is still unclear, and the type of data used to assess invasive species distribution across urbanization gradients might affect our ability to characterize this link.
Assessing how the type of data used to measure invasion shapes our understanding of the link between urbanization and invasion is important for modelling invasive species distributions at local to global spatial scales. Presence-absence data are generally (but not always; Gormley et al., 2011) better than presence-only data for modelling species distributions because presence-only data often suffer from a strong, and difficult to correct, spatial bias in sampling effort (Leroy et al., 2018; Phillips et al., 2009). Furthermore, presence-only and presence-absence data might misrepresent the spatial distribution of species by exaggerating the importance of small marginal sink populations, in contrast to more accurate, but also more costly and geographically limited, data such as abundance data (Ashcroft et al., 2017; Jarnevich et al., 2021). Thus, the type of data used to measure species occurrence might affect our ability not only to model and predict species distributions but also to understand species-environment relationships (Inman et al., 2021).
So far, it remains unknown whether different measures of invasion (e.g. presence-only, presence-absence or abundance) lead to different conclusions about the link between urbanization and invasion.
To address this issue, we studied how different measures of invasive species occurrence influence the association between invasion and urbanization. Using the ongoing invasion of the ant Lasius neglectus in Europe as a model system, we tested the link between invasion and urbanization using three measures of invasion (i.e. presence-only, presence-absence and population area) and two measures of urbanization (i.e. urban/nonurban land cover classification and proportion of impervious surfaces, such as buildings and roads, per spatial unit).
First, we tested whether L. neglectus occurrences were more frequently recorded in urban environments throughout Europe using 180 presence-only records. Second, we tested whether L. neglectus populations were detected more often in urban environments while controlling for sampling effort, using a presence-absence dataset of 1870 sampled locations across the Rhône Valley, France. Finally, by measuring 33 populations occurring along an urbanization gradient, we tested whether the area invaded by L. neglectus populations (hereafter "population area") was positively associated with urbanization.

| Study species
The ant Lasius neglectus (Hymenoptera, Formicidae) is thought to originate from Asia Minor and is a widespread invader in Europe (Blatrix et al., 2018; Ugelvig et al., 2008). It is the least climatically limited invasive ant in central and northern Europe (Bertelsmeier & Courchamp, 2014). Unlike most other ant species invasive in Europe (e.g. Linepithema humile, Paratrechina longicornis, Wasmannia auropunctata), the genus Lasius occurs naturally throughout Europe, which might partially explain why L. neglectus is not climatically limited in this region (Charrier et al., 2020; Espadaler et al., 2018; Janicki et al., 2016; Rabitsch, 2011). Like most invasive ant species, L. neglectus has limited natural dispersal abilities because its winged reproductive females do not perform nuptial flights (Espadaler et al., 2007). Natural dispersal occurs through an incremental expansion of colonies (20-100 m per year; Espadaler et al., 2007), often resulting in large populations of interconnected nests with numerous queens and low intraspecific aggression. These colonies are often referred to as "supercolonies" (Espadaler et al., 2004). However, to simplify the terminology, we use the term "population" to refer to a colony throughout the manuscript. Finally, in L. neglectus (as in most other invasive ants; Rabitsch, 2011), regional spread occurs via human-mediated dispersal when contaminated materials such as soil or potted plants are transported for landscaping, construction or horticultural trade (Schultz & Seifert, 2005; Van Loon et al., 1990).

| Indices of urbanization
We compared two commonly used measures of urbanization: urban/nonurban land cover classification and proportion of impervious surfaces. The European land cover classification (i.e. categorical: urban/nonurban) was obtained from the Corine Land Cover (CLC) classification 2018 (100-m resolution cells; from land.copernicus.eu). Corine Land Cover was simplified into two categories: urban (CLC class 1) and nonurban (CLC classes 2 and 3, i.e. agricultural areas (CLC class 2) and forests and seminatural areas (CLC class 3)). Land cover classifications are frequently used to characterize urbanization but, as coarse discrete variables, they might oversimplify the urbanization gradient. To obtain a more precise and continuous representation of the urbanization gradient, we used the proportion of impervious surfaces (calculated as the percentage of soil sealed with impervious materials such as asphalt and cement per spatial unit) from the European Settlement Map 2015 (20-m resolution cells; from land.copernicus.eu).
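The two-category simplification of Corine Land Cover can be sketched as follows. This is a minimal illustration in Python rather than the code used in the study: the function name `urban_class` is hypothetical, and the sketch assumes that each cell carries a three-digit CLC code whose first digit is the level-1 class. The text does not state how CLC classes 4 and 5 (wetlands and water bodies) were treated, so the sketch simply returns `None` for them.

```python
def urban_class(clc_code):
    """Collapse a three-digit CLC code into 'urban' or 'nonurban'.

    Assumption: the first digit of the code is the CLC level-1 class.
    """
    level1 = int(clc_code) // 100
    if level1 == 1:          # artificial surfaces
        return "urban"
    if level1 in (2, 3):     # agricultural; forests and seminatural areas
        return "nonurban"
    return None              # classes whose treatment is not described in the text


# Hypothetical cell codes, for illustration only.
cells = [112, 211, 311, 512]
labels = [urban_class(code) for code in cells]
```

Returning `None` instead of forcing a label keeps cells with unstated treatment out of both categories until a decision about them is made explicitly.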

| Presence-only data
We obtained 230 occurrence locations of L. neglectus across its current introduced range in Europe from the CREAF database (Espadaler & Bernal, 2020). Lasius neglectus occurrences located outside of the area covered by Corine Land Cover (50 out of 230) were removed from the dataset before the analysis (180 remaining occurrences; Figure 1a). In addition, to characterize urbanization in the area where L. neglectus can occur, we constrained our analyses to the smallest rectangular area including all 180 L. neglectus occurrences and ignored areas with obviously unsuitable climatic conditions (such as extremely cold, high-elevation areas). We therefore excluded areas where the minimum temperature of the coldest month (bio06) and the maximum temperature of the warmest month (bio05) were more than 1°C outside the range of values observed at L. neglectus occurrences for these two climatic variables (from worldclim.org).
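The climatic filter described above can be illustrated with a short Python sketch. It is not the code used in the study: the function names, the dictionary layout and the temperature values are hypothetical, and the sketch assumes one reading of the rule, namely that a cell is dropped as soon as either bio05 or bio06 falls more than 1°C outside the range observed at occurrences.

```python
def envelope(values, tolerance=1.0):
    """Range of values observed at occurrences, widened by `tolerance` on each side."""
    return min(values) - tolerance, max(values) + tolerance


def is_climatically_plausible(cell, bio05_range, bio06_range):
    """True when both temperature variables of `cell` fall inside their ranges."""
    low5, high5 = bio05_range
    low6, high6 = bio06_range
    return low5 <= cell["bio05"] <= high5 and low6 <= cell["bio06"] <= high6


# Hypothetical temperatures (degrees Celsius) at occurrence locations.
occurrences = [
    {"bio05": 24.0, "bio06": -6.0},
    {"bio05": 31.0, "bio06": 4.0},
]
bio05_range = envelope([o["bio05"] for o in occurrences])
bio06_range = envelope([o["bio06"] for o in occurrences])

# Hypothetical background cells: the second one is too hot to be retained.
background = [
    {"bio05": 32.0, "bio06": 0.0},
    {"bio05": 33.5, "bio06": 0.0},
]
retained = [c for c in background if is_climatically_plausible(c, bio05_range, bio06_range)]
```

A cell lying exactly 1°C outside the observed range is retained here, because the text excludes only cells that are more than 1°C outside it.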
We extracted the land cover information for Europe (N ~ 350 million pixels at 100-m resolution) and for locations where L. neglectus occurs (N = 180 locations; Figure 1b). We also calculated the proportion of impervious surfaces (averaged at 100-m resolution, because L. neglectus populations often spread over several hundred square meters; Espadaler et al., 2007; Gippet et al., 2021) for the 180 locations where L. neglectus occurs and for 9999 sets of 180 randomly selected locations (Data S1-S3). We then tested whether L. neglectus occurrences were more often located in urban areas than expected from a random distribution across Europe using a two-sample test for equality of proportions (function "prop.test" from the stats package in R v.3.6.2, R Core Team, 2019). Finally, we tested whether L. neglectus occurred in more urbanized locations than randomly sampled locations across Europe by comparing the median proportion of impervious surfaces between the 180 L. neglectus occurrences and each of the 9999 sets of 180 random locations with Wilcoxon rank sum tests (function "wilcox.test" from the stats package in R; Figure S1).
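The two comparisons described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the R code used in the study: `two_proportion_test` is a continuity-corrected chi-square test for two proportions (the same family of test as "prop.test"), `rank_sum_w` returns only the rank sum statistic without a p-value (unlike "wilcox.test"), and `compare_with_random_sets` assumes that each random set is drawn without replacement from a list of background values. All function names and numbers are hypothetical.

```python
import math
import random


def two_proportion_test(x1, n1, x2, n2):
    """Continuity-corrected chi-square test that two proportions are equal.

    Returns (chi_square, p_value) with 1 degree of freedom.
    Requires a pooled proportion strictly between 0 and 1.
    """
    pooled = (x1 + x2) / (n1 + n2)
    if not 0 < pooled < 1:
        raise ValueError("pooled proportion must lie strictly between 0 and 1")
    # Yates correction, capped so it never exceeds the observed deviation.
    yates = min(0.5, abs(x1 / n1 - x2 / n2) / (1 / n1 + 1 / n2))
    chi_square = 0.0
    for x, n in ((x1, n1), (x2, n2)):
        for observed, expected in ((x, n * pooled), (n - x, n * (1 - pooled))):
            chi_square += (abs(observed - expected) - yates) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


def rank_sum_w(sample_x, sample_y):
    """Rank sum statistic W: number of pairs with x > y, ties counting 0.5."""
    w = 0.0
    for x in sample_x:
        for y in sample_y:
            if x > y:
                w += 1.0
            elif x == y:
                w += 0.5
    return w


def compare_with_random_sets(observed, background, n_sets, seed=0):
    """W of `observed` against each of `n_sets` random draws from `background`.

    Assumption: each draw has the same size as `observed` and is made
    without replacement from the list `background`.
    """
    rng = random.Random(seed)
    size = len(observed)
    return [rank_sum_w(observed, rng.sample(background, size)) for _ in range(n_sets)]


# Hypothetical counts: 15 of 20 occurrences urban versus 5 of 20 random cells.
chi_square, p_value = two_proportion_test(15, 20, 5, 20)
```

With real data, `observed` would hold the proportion of impervious surfaces at the occurrence locations and `background` the values of the candidate random locations; the quadratic loop in `rank_sum_w` is chosen for readability and would be slow for very large samples.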

| Presence-absence data
We performed a detection survey at a regional scale (~5000 that are easy to detect on trees, shrubs and on the ground, even by nonspecialist observers (Espadaler & Bernal, 2020;Gippet et al., 2018). In addition, to make sure that small L. neglectus populations were not confused with native Lasius species (typically L. alienus), all ants from the genus Lasius were sampled (one sample corresponds to one nest or one foraging trail) and stored in 96% ethanol at −20°C.
Ants were then identified to the species level using morphological criteria (Seifert, 2007) and, for morphologically ambiguous Lasius individuals, molecular identifications were performed by sequencing the cytochrome oxidase I (COI) mitochondrial gene (one individual per population; see Gippet et al., 2017). Thus, we were confident in our ability to determine whether L. neglectus was present or absent at sampled sites. Land cover information and proportion of impervious surfaces were extracted for each sampling location (N = 1870), and for the study landscape using a minimum bounding polygon (N ~ 400,000 pixels at 100-m resolution; Figure 2a, Data S4-S6).
Using two-sample tests for equality of proportions (function "prop.test" from the stats package in R), we tested ( using a Wilcoxon rank sum test (function "wilcox.test" from the stats package in R).

| Population area
We measured the surface occupied by 33 L. neglectus populations found by the regional survey (i.e. middle Rhône valley, France;

| Presence-absence data
Urban areas covered 16% of our sampling area. Among the 1870 locations sampled in this area, 63% were in urban areas, indicating that our sampling survey was biased towards urban areas (3.9 times more frequent than random; Proportion test: χ² = 2871.3, p < .0001; agricultural and urban areas) were invaded. Finally, the proportion of impervious surfaces did not differ between invaded and noninvaded locations (W = 75,650, p = .12; Figure 2c).

| DISCUSSION
We have shown that the type of data used to measure invasive species distribution can easily bias our understanding of the link between invasion and urbanization. In L. neglectus, this link was positive according to the presence-only data, nonexistent according to the presence-absence data and negative according to the population area data.
Our findings support the idea that presence-only data overes-  (Hertzog et al., 2014). The consequences of not accounting for sampling bias in species distribution models are generally assessed in terms of model performance (Chauvier et al., 2021; Leroy et al., 2018; Milanesi et al., 2020; Phillips et al., 2009), but the effect of sampling bias on species-environment relationships is rarely investigated (Inman et al., 2021). Therefore, this study provides valuable empirical evidence that the type of occurrence data used to study species-environment relationships can have a profound impact on the conclusions of a study.
Presence-absence data could also be used to better assess the link between urbanization and invasion because they can account for unbalanced sampling effort. However, compared to presence-only data, this type of data is much more expensive to collect and generally limited geographically. In this study, we only sampled one region (i.e. the middle Rhône valley, France) and found that L. neglectus was equally present in urban and nonurban areas, but it is possible that the link between urbanization and the invasion of L. neglectus differs in other regions. The climatic context could, for instance, greatly affect the suitability of urban areas relative to adjacent nonurban areas (Pyšek et al., 2020). For example, the invasive ant Tetramorium immigrans is found only in urban areas at high latitudes while it is distributed across the urban-rural gradient at lower latitudes, most likely because the species benefits from urban heat island effects in colder environments (Cordonnier et al., 2020; Gippet et al., 2017).
Interestingly, the negative association between urbanization and invasion found in the population area data was not detected using the presence-absence data. This is consistent with the idea that presence-absence data might overestimate the importance of small marginal populations (Ashcroft et al., 2017). This result is particularly important because it suggests that even studies using high-quality presence-absence data might miss important species-environment relationships. This is especially problematic when studying invasive species because abundance is strongly linked to the survival (e.g. Allee effect; Taylor & Hastings, 2005) and impacts of invasive species (Bradley et al., 2019).
Overall, our findings do not contradict the idea that cities and their surroundings are hotspots for the introduction of invasive species, but they challenge the assumption that urbanized environments are systematically more suitable for invasive species. Furthermore, in our study landscape, L. neglectus was as likely to occur in seminatural areas as in anthropized areas (i.e. agricultural and urban areas), suggesting that its distribution (in the middle Rhône valley) is completely independent of land use and that most suitable habitats can be colonized, most likely by local human-mediated dispersal events (Espadaler et al., 2007; Gippet et al., 2019). Lasius neglectus might, therefore, be a serious threat to local biodiversity hotspots (Liu et al., 2020) and its impacts in these habitats should be further assessed as existing evidence is limited to disturbed habitats in an urbanized area of Budapest, Hungary (Nagy et al., 2009).
It is possible that most invasions start inside or near cities as the direct result of increased human density and activity, but the subsequent spread of invasive species might often be independent of, or even negatively affected by, urban environments. The link between urbanization and invasion differs among invasive species and cities (Gippet et al., 2017; Perez & Diamond, 2019), and future research is needed to quantify the proportion of invasive species that are favoured by urban conditions and to identify the ecological traits associated with this advantage.

ACKNOWLEDGEMENTS
We thank Stéphanie Mermet and all the great interns who participated in sampling the ants and J-P Lena for his helpful advice on the early stages of this work. We also thank Olivia K. Bates, Alan Andersen and two anonymous reviewers for helpful comments and suggestions on previous versions of the manuscript.

CONFLICT OF INTEREST
The authors declare that they have no conflict of interest.

DATA AVAILABILITY STATEMENT