### Introduction

Human activities continue to increase the number of species introduced to areas beyond their native geographical distributions, giving these species the potential to become biological invaders. To be a successful biological invader, a species must clear several hurdles (Williamson 1996; Davis & Thompson 2000; Richardson *et al*. 2000; Daehler 2001; Kolar & Lodge 2001; Duncan, Blackburn & Sol 2003). First, the species must be transported to the non-native region. Second, it must be released into the environment there. Third, the species must successfully establish in the new location. Finally, it must spread from the point of introduction. Only if a species successfully transits all four stages (termed transport, release, establishment and spread) can it become invasive. Most non-indigenous species fail to establish self-sustaining populations, and of those that do establish, only a small fraction cause some harm (Williamson 1996). However, the few non-indigenous species that do cause harm can have devastating effects on recipient environments and economies (Vitousek *et al*. 1996; Mack *et al*. 2000). Ecologists are thus confronted with the task of finding predictive tools that are powerful enough to distinguish benign or non-invasive introductions from invasive pests. One investigative approach to this formidable task is to analyse data from historical introductions, an example of an ‘experiment in nature’ (*sensu* Diamond 1986).

Historical data are particularly good for studies of establishment success in birds, as is reflected in global catalogues of avian introduction events (Long 1981; Lever 1987; Cassey 2002a) that have contributed to more than 50 such studies (see recent reviews by Kolar & Lodge 2001; Cassey 2002a; Duncan *et al*. 2003). These data comprise at least 1920 introduction attempts involving 416 species from 44 families (Cassey 2002b). They include the deliberate introduction of species to 125 mainland states and 218 oceanic islands over four centuries (1600–1980). Studies of the determinants of establishment success and failure have considered the relationship of more than 80 ecological, environmental, and life-history traits to these bird data at a variety of regional and phylogenetic scales. Much of our knowledge of causes of establishment success derives from these studies (Kolar & Lodge 2001). They have identified a variety of factors associated with success in different geographical locations and taxa, including introduction effort, climate, environmental suitability, behavioural flexibility and body size. For example, Duncan *et al*. (2001) showed that birds introduced to Australia were more likely to establish if introduced in higher numbers and at more sites, if they had a greater area of climatically suitable habitat in Australia, and if they had a larger native range size. Brooke, Lockwood & Moulton (1995) showed morphological overdispersion amongst finch species successfully introduced to St Helena, consistent with the effects of competition in determining establishment success or failure. Sol & Lefebvre (2000) found that birds introduced to New Zealand were more likely to establish if they were introduced in larger numbers, had large brains relative to their body size, were partial migrants and were nidifugous.

The diversity of studies, and the range of variables analysed, raise the question of the relative importance of different factors for establishment success in all non-indigenous species. Unfortunately, the very diversity of studies makes this question difficult to answer. Variables studied may describe characteristics of the species introduced (‘species-level’ effects, e.g. relative biological traits), the location where species are introduced (‘location-level’ effects, e.g. latitude), the individual introduction event (‘event-level’ effects, e.g. number of individuals introduced), or some combination of the three (Duncan *et al*. 2003). Further, even within taxonomic groups, different authors define what constitutes an introduction in different ways, and then apply different analytical techniques to different sets of variables for different subsets of species. Each of these variables may also be defined or measured in a variety of different ways (e.g. for morphological overdispersion, see Moulton & Pimm 1983, 1986a, 1987; Moulton, Sanderson & Labisky 2001). The importance of any single variable is thus difficult to judge. Hence, the question remains: what do these data and analyses actually tell us about non-indigenous species establishment?

Recent reviews have attempted to assess the influence of different factors in determining establishment success in birds (Kolar & Lodge 2001; Cassey 2002a; Duncan *et al*. 2003; more generally see Williamson 1996). However, all have relied on simple narrative reviews or ‘vote counting’ approaches, whereby the influence of a variable is assessed by weighing the number of studies that show significant results for its effect against those that do not (Gurevitch & Hedges 1999; Gates 2002). Such methods are qualitative and subjective. They do not take account of the fact that different studies may have different statistical power, that the strength of relationships will differ amongst variables, or that results may be non-significant while still showing consistent effects of a variable (Osenberg *et al*. 1999). Here, we use a meta-analytical approach to synthesize quantitatively the results of multiple studies and assess the influence of predictor variables invoked to explain the variation in establishment success in birds. In particular, we explicitly assess the role of different regions as well as species-level, location-level, and event-level effects in the successful establishment of non-native bird species. Meta-analysis allows the influence of these different classes of variables to be assessed in a rigorous and quantitative manner.

### Results

The weighted average effect size (*Z*_{+}) for the population of transformed Pearson correlation coefficients is 0·04. The 95% confidence interval around this effect (0·01, 0·08) does not include zero, which thus indicates a significant positive common correlation. The frequency distribution of correlation coefficients was not significantly different from normal and the most common coefficient was close to the mean value (Fig. 1; Shapiro–Wilk *W* = 0·98, *P* = 0·25). The skewness and kurtosis of the distribution of coefficients were −0·32 and −0·29, respectively, neither of which differed significantly from zero (Fig. 1; *P* > 0·05). These results suggest that the large sample normal approximation to the distribution of *Z*_{+} is appropriate, and hence that our test of the null hypothesis of a common correlation of zero is robust.
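The calculation described above can be sketched as follows. This is a minimal illustration that assumes the conventional fixed-effect approach, in which each correlation is Fisher *z*-transformed and weighted by *n*_{i} − 3; the equation actually used (equation 2) is not reproduced in this section, and the function names and input values are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's z-transformation of a Pearson correlation coefficient."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def weighted_mean_effect(correlations, sample_sizes, z_crit=1.96):
    """Return (Z_plus, lower, upper) for a fixed-effect weighted mean.

    Assumption: weights are w_i = n_i - 3, the inverse of the large-sample
    variance of z_i. The paper's own equation 2 may differ in detail.
    """
    z_values = [fisher_z(r) for r in correlations]
    weights = [n - 3 for n in sample_sizes]
    total_weight = sum(weights)
    z_plus = sum(w * z for w, z in zip(weights, z_values)) / total_weight
    # Standard error of the weighted mean under the normal approximation.
    se = 1.0 / math.sqrt(total_weight)
    return z_plus, z_plus - z_crit * se, z_plus + z_crit * se


# Hypothetical correlations and sample sizes, for illustration only.
example_r = [0.30, -0.10, 0.05, 0.20]
example_n = [25, 40, 60, 18]
z_plus, ci_low, ci_high = weighted_mean_effect(example_r, example_n)
print(f"Z+ = {z_plus:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

A confidence interval that excludes zero, as reported here, corresponds to rejecting the null hypothesis of a common correlation of zero at the 5% level.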

The relationship between transformed effect size (*Z*_{i}) and sample size (*n*_{i}) across all comparisons is shown in Fig. 2. If associations between success and independent variables were random, most points would fall within the limits defined by the significance lines at α = 0·05. Clearly, a reasonable proportion of the points in Fig. 2 fall outside these limits. The rank correlation test between standardized effect size and sample size was not significant (*r* = −0·04, *n* = 96). This indicates that the number of expected negative and positive effects at low sample sizes is not significantly biased, and hence it can be inferred that the pattern of effect sizes is not due to publication bias.
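The two checks in this paragraph can be illustrated with a short sketch. It rests on several assumptions not confirmed by this section: that the significance lines lie at ±1·96/√(*n*_{i} − 3), that the effect size is standardized as *Z*_{i}√(*n*_{i} − 3), and that the rank correlation is Spearman's (Kendall's tau is a common alternative). All names and data are hypothetical.

```python
import math


def outside_significance_lines(z, n, z_crit=1.96):
    """True if a transformed effect size lies outside +/- z_crit / sqrt(n - 3)."""
    return abs(z) > z_crit / math.sqrt(n - 3)


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical transformed effect sizes and sample sizes.
z_values = [0.35, -0.12, 0.05, 0.22, -0.30]
sample_sizes = [20, 45, 80, 15, 30]
standardized = [z * math.sqrt(n - 3) for z, n in zip(z_values, sample_sizes)]
print("rho =", round(spearman_rho(standardized, sample_sizes), 3))
print("outside lines:",
      [outside_significance_lines(z, n) for z, n in zip(z_values, sample_sizes)])
```

A rank correlation near zero, as reported here, is what would be expected if small studies were no more likely than large ones to report effects in one direction.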

Although we find a significant positive common correlation with introduction success across all the variables in our data, it is of relatively low magnitude (0·04). Moreover, the range of different locations chosen for introduction, and the range of traits that researchers have subsequently compared, suggests that it is inappropriate to expect a common fixed effect size for the population of correlation coefficients across all variables, locations, and studies. However, the test for homogeneity shows that the variance in effect sizes is not significant across all studies (α = 0·05, Table 2), contrary to this expectation. Nevertheless, we strongly believe that it is sensible not to interpret this result over-conservatively and thus potentially commit a Type II statistical error. One reason why this might happen is that the test statistic in equation 5 is strongly dependent on *n*_{i} whereas Fig. 2 reveals that all but two of the sample sizes available for the study of avian introduction success are relatively small (< 100). Moreover, the parameter variance component of the population of correlation coefficients in the random effects model is relatively large (0·38), while the significance value of the heterogeneity test is relatively low (approximately 0·10). Finally, a test for outliers reveals that seven parameters are not consistent with the homogeneous effect size model. These include effect sizes for species-level (*K* = 1), location-level (*K* = 2), and event-level (*K* = 4) traits (Fig. 2). Therefore, we believe that investigating possible causes of heterogeneity in the distribution of correlations is clearly worthwhile.
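The dependence of the homogeneity test on sample size can be seen in a small sketch. It assumes the standard form of the statistic, *Q* = Σ(*n*_{i} − 3)(*Z*_{i} − *Z*_{+})², compared against a chi-square distribution with *K* − 1 degrees of freedom; the exact equation used in the Methods is not shown in this section, and the function name and inputs are hypothetical.

```python
def homogeneity_q(z_values, sample_sizes):
    """Return (Q, degrees_of_freedom) for Fisher z-transformed effect sizes.

    Assumption: weights are w_i = n_i - 3 and Q is the weighted sum of
    squared deviations from the weighted mean effect.
    """
    weights = [n - 3 for n in sample_sizes]
    z_plus = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    q = sum(w * (z - z_plus) ** 2 for w, z in zip(weights, z_values))
    return q, len(z_values) - 1


# Identical effect sizes, different (hypothetical) sample sizes:
# Q grows in proportion to the weights, so small studies yield a weak test.
small_q, df = homogeneity_q([0.0, 1.0], [13, 13])
large_q, _ = homogeneity_q([0.0, 1.0], [23, 23])
print(small_q, large_q, df)
```

Because every term is multiplied by *n*_{i} − 3, the same spread of effect sizes produces a smaller statistic when sample sizes are small, which is the reason given above for not over-interpreting a non-significant result.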

Table 2. Effect size statistics for the treatment groups within the 24 studies and 96 moderator traits. The overall mean effect is calculated using equation 2. Population variance is the variance component for the population of correlation coefficients (effect sizes) calculated using a random model. The homogeneity test statistic for the population variance is defined in equation 4, and its significance is indicated by the *P*-value

| Comparison | Overall mean effect | Confidence interval | Population variance | Homogeneity test statistic | *P*-value |
|---|---|---|---|---|---|
| All introduction studies | 0·04 | 0·01, 0·08 | 0·38 | 108·11 | 0·10 |
| Islands of New Zealand | 0·05 | 0·00, 0·09 | 0·34 | 48·04 | 0·15 |
| Islands of Hawaii | 0·01 | −0·06, 0·08 | 0·37 | 13·13 | 0·87 |
| Oceanic islands | −0·02 | −0·13, 0·09 | 0·48 | 18·21 | 0·15 |
| Australia | 0·10 | 0·03, 0·17 | 0·29 | 13·93 | 0·38 |
| Species-level traits | 0·03 | −0·02, 0·07 | 0·32 | 31·72 | 0·87 |
| Location-level traits | −0·07 | −0·13, 0·00 | 0·35 | 26·18 | 0·75 |
| Event-level traits | 0·21 | 0·14, 0·28 | 0·12 | 6·31 | 0·99 |

One potential source of variance in the effect sizes is location. Over two-thirds of the collected variables come from studies on the islands of New Zealand and Hawaii (68%). However, these two locations contribute only 38% of all island introduction attempts (Cassey *et al*. 2004). We examined the variability in effect sizes for four categories of study locations: the islands of New Zealand (*K* = 42), the islands of Hawaii (*K* = 22), the remaining oceanic study islands (*K* = 15), and the land mass of Australia (*K* = 15). For both New Zealand and Australia, the mean effects were significantly different from zero (Table 2). In all four cases, the 95% confidence interval for the mean effect overlapped with the overall mean. In two sets of studies, those for the islands of New Zealand and the remaining oceanic islands, the test for heterogeneity was approximately *P* = 0·15. Although clearly not statistically significant, these are two cases with a large detectable degree of heterogeneity among their effect size estimates (Table 2).

Another obvious source of potential variation in effect sizes is the difference in moderator variables used across studies. We examined the variance in effect sizes for species-, location- and event-level traits. The mean effect for species-level traits was not significantly different from zero, and its 95% confidence interval overlapped with the overall mean effect (Table 2). However, the 95% confidence intervals for location- and event-level traits did not overlap with the overall mean effect and the mean effect for event-level traits was highly significantly different from zero (*Z* = 6·14). This strongly suggests that the overall positive common correlation is driven largely by event-level traits. The test for homogeneity shows that effect sizes within the species-, location- and event-level categories are all highly homogeneous, indicating that the effects in each case are not drawn from multiple distributions. In addition, event-level is the only set of traits for which the fail-safe number (315) is large enough that it is unlikely that missing results could nullify the significance of the effect. Notably, the fail-safe number of ‘missing’ publications is more than three times the value suggested by Rosenthal (1991) as being evidence of a robust average effect size (5*K* + 10).
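The fail-safe comparison can be sketched as below. The 5*K* + 10 threshold is taken from the text; the fail-safe formula itself is assumed to be Rosenthal's usual one, (Σ*Z*_{i})²/*z*_{α}² − *K* with a one-tailed *z*_{α} = 1·645, which this section does not state explicitly. Function names and input values are hypothetical.

```python
def rosenthal_fail_safe(z_scores, z_alpha=1.645):
    """Number of unpublished null results needed to make the combined
    result non-significant (assumed Rosenthal formulation)."""
    k = len(z_scores)
    return (sum(z_scores) ** 2) / (z_alpha ** 2) - k


def rosenthal_threshold(k):
    """Rosenthal's rule of thumb for a robust effect: 5K + 10."""
    return 5 * k + 10


# Hypothetical standard normal deviates from K = 3 comparisons.
example_z = [2.0, 2.0, 2.0]
fail_safe = rosenthal_fail_safe(example_z)
threshold = rosenthal_threshold(len(example_z))
print(f"fail-safe N = {fail_safe:.1f}, threshold = {threshold}")
print("robust by rule of thumb:", fail_safe > threshold)
```

An effect is usually regarded as robust to unpublished null results when the fail-safe number exceeds the threshold, which is the comparison made above for event-level traits.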

### Discussion

Our analyses quantitatively summarize the findings of many such comparative studies for bird introductions to islands, and reveal the following patterns. First, the range of predictor variables analysed shows a small but significant positive common correlation with introduction success. This pattern in the effect sizes is not due to publication bias. Second, although a test for homogeneity shows that the effect sizes are not significantly heterogeneous in their distribution, they do show relatively high variance. This suggests that the magnitude of the common correlation may be influenced by the combination of effects from heterogeneous groups. We thus felt justified in searching for the source of this variance. Third, grouping the predictor variables by location of study did not sufficiently explain the heterogeneity in effect sizes. The mean effect size in each group did not differ significantly from the overall mean, while the variance component for the population of effect sizes was only markedly reduced for one of the four locations (Australia; Table 2). Fourth, and in contrast, grouping the predictor variables invoked to explain avian establishment on islands as species-, location- or event-level factors did identify sources of heterogeneity. Of the three groups, event-level variables were the most consistent predictors of establishment success for introduced birds (Table 2; see also below).

Although the event-level class includes several different predictor variables, the most frequently considered such factors are measures of introduction effort or ‘propagule pressure’. Theory suggests that introduction effort should be a key determinant of establishment success (Williamson 1996). In a recent review of avian introductions, Duncan *et al*. (2003) noted that all studies that have addressed the question ‘have shown that species introduced with greater effort … have a higher probability of establishment’, and that ‘introduction effort remains by far the strongest independent explanation for establishment success in birds’. These were qualitative observations, but are supported by the quantitative meta-analyses presented here. Notably, the only event-level trait that produced detectable outliers (i.e. deviated from the overall mean effect size) was introduction effort (the four event-level outliers in Fig. 2). Moreover, the mean effect size for the event-level variables (0·212) is relatively high by biological standards (Møller & Jennions 2002).

The overall mean effect for species-level factors did not differ from zero, which therefore suggests that there is no consistent set of traits that distinguish species that are ‘good’ at establishing from those that are ‘poor’. Note that while a non-significant effect could arise because consistent positive and negative relationships within this category cancelled each other out, this would give rise to heterogeneity in the estimate of the overall mean effect if the population of correlation coefficients tended to be statistically meaningful. There is no evidence for such heterogeneity in the population of species-level factors, as grouping variables in this way resulted in very low heterogeneity in effect sizes within this group (Table 2). Although it has frequently been argued that some characteristics will predispose species to invade, such as generalist habits or high intrinsic population growth rates (Duncan *et al*. 2003), it seems more likely that features of the environment and species will interact to determine establishment success. The general weakness of species-level effects (Table 2) fits with studies that show most variation in establishment success to be clustered at the species level (Blackburn & Duncan 2001a; Sol, Timmermans & Lefebvre 2002): species tend either to have high or low establishment success, but closely related species often differ substantially in their probability of establishment. Since closely related species tend to share many life-history, behavioural and ecological characteristics, it is unlikely that any such traits will be general drivers of the differences between such species in establishment success.

There is a trend in our analyses for a negative effect of location-level traits. This implies that some locations (e.g. smaller islands and islands with fewer exotic species; Fig. 2) are easier to invade than others. However, the influence of location-level traits on establishment success is clearly not as strong as the effect of event-level factors (Table 2). This is perhaps not surprising, as the influence of the location seems likely to depend on the characteristics of the species exposed to it: an environment conducive to the establishment of some species may be inimical to others. That said, most of the introduction locations in the studies included in our analyses are islands, traditionally considered to be relatively benign environments for invaders. It is possible that by including mostly island systems in our analyses we may be excluding much of the variation in invasibility associated with location, and so biasing our study against finding location-level effects. However, Sol (2000), Blackburn & Duncan (2001a) and Cassey (2003) have all shown that islands are no more susceptible to colonization by exotic birds than continental regions. Recently, Cassey *et al*. (2004) showed that significant effects of location-level variables on establishment success in a global analysis of birds were artefacts of their relationship with introduction effort. We suspect that the interaction between location- and event-level traits will prove to be a rewarding avenue for future studies of invasion biology.

One criticism of the importance for establishment success attributed to introduction effort is that a large number of studies use data from just a single location, the New Zealand archipelago (Duncan *et al*. 2003). Thus, they may say much about the influence of introduction effort in one specific system, but little about its influence in general. However, we obtained data for introduction effort from studies of the Hawaiian archipelago, Australia, St Helena and New Zealand. We found strong evidence for homogeneity in the population of effect sizes for event-level factors, as the significance level for the homogeneity statistic is greater than 0·99 and the highest of our analyses (Table 2). In addition, there is no statistical evidence to suggest that the magnitude of the effect sizes in New Zealand is any different from the other locations (two-sample Wilcoxon test; *Z* = 0·69, *n* = 19, *P* = 0·48). Thus, there is nothing to suggest that the results from analyses of New Zealand introductions differ in kind or degree from those of other locations in our data. The strong implication is that the clear importance of introduction effort in New Zealand is likely to generalize beyond that archipelago. Indeed, the primary influence of introduction effort and other event-level factors on establishment success is the principal message of the analyses we present. Importantly, we show that this is a strong consistent effect, recovered from studies across a range of locations. Finally, we note that despite the differences in the number and type of variables included *a priori* in the three hypothesized categories, the test for homogeneity in each case suggests that the range of variables is not producing heterogeneous effect sizes in any category.

Our finding, that event-level factors are a critical determinant of establishment success in island bird introductions, has important implications for understanding establishment success in other taxa. In particular, the most important likely predictor of establishment success will not be available for most introduction studies. Few taxa or locations have documented introductions in sufficient detail that information on the number of individuals released is known, even on a broad comparative basis. This has two further consequences. First, it means that the effect of any other predictor variable on establishment success must be considered provisional if there is any reason to believe that it may be correlated with introduction effort (cf. Moulton, Sanderson & Labisky 2001; Duncan & Blackburn 2002; Cassey *et al*. 2004). Of course, if there are grounds to believe that this correlation is likely to be strong, this does suggest a way to account for the influence of effort in a comparative study when that variable is not directly available (see, for example, Blackburn & Duncan 2001a). Second, the lack of information on effort means that many, perhaps most, comparative studies of introduction data will be unable to explain the majority of variance in establishment success (cf. Williamson 1996, 1999), if establishment is largely driven by event-level factors, and data for these factors are usually unavailable. Whereas we are undoubtedly attaining an extremely good understanding of the general likelihood of introduction success (Blackburn & Duncan 2001a; Cassey *et al*. 2004), the outcome of any given event is likely to remain inherently difficult to predict (Williamson 1999).