Lessons from the establishment of exotic species: a meta-analytical case study using birds


Phillip Cassey, School of Biosciences, Birmingham University, Edgbaston, Birmingham, B15 2TT, UK. Tel: + 44 121 4145893. Fax: + 44 1214 145925. E-mail: p.cassey@bham.ac.uk


  1. The establishment of species outside their natural geographical ranges is an important driver of changes in global biodiversity. This creates an imperative to understand why some species are more successful than others at establishing viable populations following introduction.
  2. Historical data are particularly useful in this regard, and those for birds especially comprehensive. This has resulted in the publication of regional-scale studies that have used these data to attempt to quantify relationships between establishment success and characteristics of bird introductions.
  3. We use a meta-analytical approach to summarize quantitatively the results of these studies, and to assess the influence of variables invoked to explain the variation in establishment success in birds.
  4. We find that variables describing characteristics specific to the individual introduction event (i.e. event-level variables), such as introduction effort (or ‘propagule pressure’), are the most consistent predictors of establishment success.


Human activities continue to increase the number of species introduced to areas beyond their native geographical distributions, giving the potential for these species to become biological invaders. To be a successful biological invader, a species must clear several hurdles (Williamson 1996; Davis & Thompson 2000; Richardson et al. 2000; Daehler 2001; Kolar & Lodge 2001; Duncan, Blackburn & Sol 2003). First, the species must be transported to the non-native region. Second, it must be released into the environment there. Third, the species must successfully establish in the new location. Finally, it must spread from the point of introduction. Only if a species successfully transits all four stages (termed transport, release, establishment and spread) can it become invasive. Most non-indigenous species fail to establish self-sustaining populations, and of those that do establish, only a small fraction cause some harm (Williamson 1996). However, the few non-indigenous species that do cause harm can have devastating effects on recipient environments and economies (Vitousek et al. 1996; Mack et al. 2000). Ecologists are thus confronted with the task of finding predictive tools that are powerful enough to distinguish benign or non-invasive introductions from invasive pests. One investigative approach to this formidable task is to analyse data from historical introductions, an example of an ‘experiment in nature’ (sensu Diamond 1986).

Historical data are particularly good for studies of establishment success in birds, as is reflected in global catalogues of avian introduction events (Long 1981; Lever 1987; Cassey 2002a) that have contributed to more than 50 such studies (see recent reviews by Kolar & Lodge 2001; Cassey 2002a; Duncan et al. 2003). These data comprise at least 1920 introduction attempts involving 416 species from 44 families (Cassey 2002b). They include the deliberate introduction of species to 125 mainland states and 218 oceanic islands over four centuries (1600–1980). Studies of the determinants of establishment success and failure have considered the relationship of more than 80 ecological, environmental, and life-history traits to these bird data at a variety of regional and phylogenetic scales. Much of our knowledge of causes of establishment success derives from these studies (Kolar & Lodge 2001). They have identified a variety of factors associated with success in different geographical locations and taxa, including introduction effort, climate, environmental suitability, behavioural flexibility and body size. For example, Duncan et al. (2001) showed that birds introduced to Australia were more likely to establish if introduced in higher numbers and at more sites, if they had a greater area of climatically suitable habitat in Australia, and if they had a larger native range size. Brooke, Lockwood & Moulton (1995) showed morphological overdispersion amongst finch species successfully introduced to St Helena, consistent with the effects of competition in determining establishment success or failure. Sol & Lefebvre (2000) found that birds introduced to New Zealand were more likely to establish if they were introduced in larger numbers, had large brains relative to their body size, were partial migrants and were nidifugous.

The diversity of studies, and the range of variables analysed, raise the question of the relative importance of different factors for establishment success in all non-indigenous species. Unfortunately, the very diversity of studies makes this question difficult to answer. Variables studied may describe characteristics of the species introduced (‘species-level’ effects, e.g. relative biological traits), the location where species are introduced (‘location-level’ effects, e.g. latitude), the individual introduction event (‘event-level’ effects, e.g. number of individuals introduced), or some combination of the three (Duncan et al. 2003). Further, even within taxonomic groups, different authors define what constitutes an introduction in different ways, and then apply different analytical techniques to different sets of variables for different subsets of species. Each of these variables may also be defined or measured in a variety of different ways (e.g. for morphological overdispersion, see Moulton & Pimm 1983, 1986a, 1987; Moulton, Sanderson & Labisky 2001). The importance of any single variable is thus difficult to judge. Hence, the question remains: what do these data and analyses actually tell us about non-indigenous species establishment?

Recent reviews have attempted to assess the influence of different factors in determining establishment success in birds (Kolar & Lodge 2001; Cassey 2002a; Duncan et al. 2003; more generally see Williamson 1996). However, all have relied on simple narrative reviews or ‘vote counting’ approaches, whereby the influence of a variable is assessed by weighing the number of studies that show significant results for its effect against those that do not (Gurevitch & Hedges 1999; Gates 2002). Such methods are qualitative and subjective. They do not take account of the fact that different studies may have different statistical power, that the strength of relationships will differ amongst variables, or that results may be non-significant while still showing consistent effects of a variable (Osenberg et al. 1999). Here, we use a meta-analytical approach to synthesize quantitatively the results of multiple studies and assess the influence of predictor variables invoked to explain the variation in establishment success in birds. In particular, we explicitly assess the role of different regions as well as species-level, location-level, and event-level effects in the successful establishment of non-native bird species. Meta-analysis allows the influence of these different classes of variables to be assessed in a rigorous and quantitative manner.



We conducted a meta-analysis of all independent variables from 24 studies of regional establishment success (dependent variable) among introduced bird species (Table 1). The data were collated from exhaustive searches and bibliographic reviews of the primary scientific literature published since 1980 (Duncan et al. 2003). Each data point is a relationship between establishment success and an independent variable, as reported in the papers cited (Table 1). Thus, the exact definition of success varies among studies, although a successful introduction is usually taken to be one for which there is evidence of a viable population established in a non-native location (all other introduction outcomes are regarded as failures). No data that had previously been collected and analysed by one study and subsequently re-analysed by another study were included twice. However, if the same variable was measured using ‘independent’ data collection methods then the analyses were included. The statistical approaches adopted in our analyses are described by Hedges & Olkin (1985; see also Gurevitch, Curtis & Jones (2001) for an accessible overview, and Møller & Jennions (2001) for a discussion of publication bias). Our analyses and test statistics were calculated in SAS v 8·02.

Table 1. References and transformed effect size estimates for the studies included in our analyses

Reference                              Region            Average sample size   Number of variables   |Average effect size|
Moulton & Pimm (1983)                  Hawaii            61                    2                     0·08
Moulton (1985)                         Hawaii            15                    1                     0·24
Moulton & Pimm (1986a)                 Hawaii            54                    2                     0·02
Moulton & Pimm (1986b)                 Hawaii            72                    1                     0·07
Simberloff & Boecklen (1991)           Hawaii            28                    3                     0·22
Moulton & Lockwood (1992)              Hawaii            20                    3                     0·15
Lockwood et al. (1993)                 Tahiti            41                    1                     0·19
Lockwood & Moulton (1994)              Bermuda           17                    1                     0·17
Brooke et al. (1995)                   Saint Helena      70                    5                     0·27
McLain, Moulton & Redfearn (1995)      Oceanic Islands   21                    2                     0·18
Case (1996)                            Global islands    21                    4                     0·50
Veltman, Nee & Crawley (1996)          New Zealand       78                    9                     0·08
Duncan (1997)                          New Zealand       78                    7                     0·15
Green (1997)                           New Zealand       40                    10                    0·12
Eguchi & Amano (1999)                  Japan             42                    2                     0·14
Sorci, Møller & Clobert (1998)         New Zealand       79                    3                     0·05
Duncan & Young (1999)                  Oceanic Islands   82                    1                     0·11
Sol & Lefebvre (2000)                  New Zealand       32                    4                     0·34
Cassey (2001)                          New Zealand       11                    8                     0·18
Duncan et al. (2001)                   Australia         55                    15                    0·14
Moulton, Miller & Tillman (2001)       Hawaii            81                    3                     0·14
Moulton, Sanderson & Labisky (2001)    Hawaii            21                    6                     0·08
Moulton, Sanderson & Labisky (2001)    New Zealand       20                    1                     0·23
Duncan & Blackburn (2002)              New Zealand       19                    2                     0·26

We classified all moderator variables into both their regional location and one of the three categories suggested by Blackburn & Duncan (2001a, 2001b) for predictors of introduction success: location-level, species-level or event-level factors. Location-level variables describe characteristics of the location where the species is introduced, species-level variables describe characteristics of the species introduced, and event-level variables describe characteristics that are specific to the individual introduction event (i.e. characteristics that are entirely independent of either species or location, such as introduction effort (number of individuals released, or ‘propagule pressure’)). The categorization of variables was as follows (many of these variables appeared in more than one study):

Location-level variables: area (km²), climate matching, community at introduction location, congeneric pairs (presence of a congener), extinct species, great circle distance (distance from native to introduced range), hemisphere, human habitat use, introduced assemblage, island size, mean latitude, number of exotic species, number of predator species, overdispersion (pattern of morphological similarity amongst species introduced to a site), species density.

Species-level variables: all-or-none pattern of success, body size, brain size, clutch size, generation time, geographical range size, geographical latitude diversity, incubation period, mating system, migration, mode of development, number of broods, number of diet types, number of foraging innovations, number of habitat types, parental care, plumage dichromatism, population size, presence of flocking behaviour.

Event-level variables: introduction effort, introduction history, mean introduction date, number of releases at a location, year of release.

We classified variables into these three classes, rather than considering each variable separately, for the following reasons. First, these categories have been previously hypothesized to characterize the variability in establishment success. Second, the number of studies reporting each variable separately is small. Thus, statistically it would be impossible to draw any meaningful conclusions about the effect of each variable in isolation. Third, even when multiple studies nominally examine the same variable, that variable may be defined in a range of very different ways. Fourth, our aim in this analysis is synthesis. Thus, we are interested in trying to identify generalities in the causes of introduction success, rather than become mired in the details of each individual analysis. Broad comparative studies have scored notable recent successes in understanding introduction success in birds (see Duncan et al. 2003). Our aim here is to extend that success by using rigorous meta-analytical techniques to explore whether any of three general classes of predictor variable identified by previous authors (Blackburn & Duncan 2001a; Cassey 2002b; Duncan et al. 2003) have consistent effects on success.


The main objective of a meta-analysis is to summarize estimates of the standardized magnitude of an ecological response (i.e. the ‘effect size’) relative to a given correlation or manipulation variable. The effect size, ei from each of the i = 1, … , k study variables included, is simply the magnitude of the change in the response re-expressed to remove the dependence on the arbitrary scale factor σ (the sample variance). There are no preconditions on the definition of ei, and our measure of effect size was Pearson's correlation coefficient. If the original sources did not provide a correlation coefficient, we transformed the published test statistics into a correlation coefficient. Rosenthal (1994) provides examples of basic formulae for the transformation of common test statistics. Because the individual observations (i.e. establishment outcomes to introduction attempts) are assumed to be distributed binomially, multiple logistic regression type approaches (univariate chi-square and F-test test statistics) are the most common method of analysis. In a number of cases, the published studies did not provide coefficient estimates or test statistics for non-significant results. In these cases the published data were either re-analysed or the authors were contacted for their missing statistics. The mean absolute effect size from each paper is shown in Table 1.
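The conversion of published test statistics into correlation coefficients can be sketched as follows. This is a minimal illustration, assuming the widely used conversions for a 1 d.f. chi-square, a t statistic and an F statistic with 1 numerator d.f.; the function names are hypothetical and the code is not the original SAS implementation. The chi-square and F conversions return only the magnitude of r, so the sign has to be assigned from the reported direction of the effect.

```python
import math


def r_from_chi2(chi2: float, n: int) -> float:
    """Magnitude of r from a chi-square statistic with 1 d.f. on n observations."""
    return math.sqrt(chi2 / n)


def r_from_t(t: float, df: int) -> float:
    """r from a t statistic with df degrees of freedom (the sign of t is kept)."""
    return t / math.sqrt(t * t + df)


def r_from_f(f: float, df_error: int) -> float:
    """Magnitude of r from an F statistic with 1 numerator d.f."""
    return math.sqrt(f / (f + df_error))


# Hypothetical inputs, chosen only to show the calls.
print(r_from_chi2(4.0, 100))
print(r_from_t(-2.0, 96))
print(r_from_f(4.0, 96))
```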

A plot of estimated effect size, ei against sample size should produce a ‘funnel’ shape symmetric around the ‘true’ average effect size when the effect sizes are derived from a random sample of similar research studies. Purely due to sampling error (i.e. the larger the sample the more accurate the estimate) the variance in estimates of the ‘true’ effect size is higher for studies with smaller samples. Consider now what the form of the relationship between effect size and N might be if there was a genuine association between variables across the set of studies. First, the mean estimate of the overall effect size might not be distributed around zero, and hence indicate a consistently positive (or negative) effect. Consistent positive or negative effects of independent variables on establishment success can be identified by testing whether the ‘common correlation’ among effect sizes differs from zero. Second, some genuine positive effects and some genuine negative effects might cause the mean effect size to equal zero. For example, positive effects of event-level variables on success and negative effects of location-level variables might cancel out to give a common correlation that does not differ from zero. However, these effect sizes would be drawn from multiple distributions, and hence the variance of the population of effects would show significant heterogeneity. This would also be true if the genuine effects were mixed with a set of random associations (e.g. if just some event-level variables were significantly positive, and just some location-level variables significantly negative). Evidence for such situations can be provided by a test for homogeneity of variance, which would demonstrate significant heterogeneity across studies. Third, the effect size for a genuine association between two variables should not be dependent on sample size (although at very small sample sizes the effect may not be strong enough to achieve statistical significance). Hence, plots of effect size vs. N would reveal parameters that were not characteristic of the expected funnel shape. These can be identified as outliers from the funnel plot.
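The narrowing of the funnel with increasing sample size can be illustrated numerically. The sketch below assumes the large-sample variance 1/(n − 3) of a Fisher-transformed correlation and a true effect of zero; it approximates the envelope expected from sampling error alone and is not the method used to draw the significance lines in the figures. The function name `funnel_limits` is hypothetical.

```python
import math


def funnel_limits(n: int, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate limits for r around a true effect of zero at sample size n.

    Assumes the Fisher-transformed correlation has variance 1 / (n - 3).
    """
    half_width_z = z_crit / math.sqrt(n - 3)
    r_limit = math.tanh(half_width_z)  # back-transform from z to r
    return (-r_limit, r_limit)


# The limits narrow as n grows, which produces the funnel shape.
for n in (10, 20, 50, 100):
    lower, upper = funnel_limits(n)
    print(f"n = {n:3d}: {lower:.3f} < r < {upper:.3f}")
```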

To test whether the common correlation among effect sizes in our data was zero, Pearson correlation variates were first transformed for analysis by Fisher's Z-transformation:

$$z_i = \frac{1}{2}\ln\left(\frac{1 + r_i}{1 - r_i}\right) \qquad (\text{eqn 1})$$

The measure of effect size was weighted by sample size, based on the assumption that a larger sample size should provide a more reliable estimate of a true relationship. We estimated the common weighted average of the Z-transformed effect size variates as:

$$Z_+ = \frac{\sum_{i=1}^{K} (n_i - 3)\, z_i}{N - 3K} \qquad (\text{eqn 2})$$

where K is the number of study samples in the analysis, n_i is the size of sample i, and N = Σ n_i. The large sample normal approximation to the distribution of Z+ can then be used to test the hypothesis that the common correlation is zero (Kraemer 1983) using the test statistic:

$$Z = Z_+ \sqrt{N - 3K} \qquad (\text{eqn 3})$$

which is compared to the α = 0·05 two-tailed critical value of the standard normal distribution. In the same manner we constructed 95% confidence intervals around the mean effect size:

$$Z_+ \pm \frac{1.96}{\sqrt{N - 3K}} \qquad (\text{eqn 4})$$

where 1·96 is the 95% two-tailed value of the standard normal distribution.
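The calculation of the common correlation, its test statistic and its confidence interval can be sketched in a few lines. This is an illustrative sketch rather than the SAS code used for the analyses: it assumes the Hedges & Olkin (1985) weights (n − 3) for each Fisher-transformed correlation, and the function names and input values are hypothetical.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher's Z-transformation of a correlation coefficient (requires |r| < 1)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def common_correlation(rs, ns, z_crit=1.96):
    """Weighted mean of transformed correlations, its test statistic and 95% CI.

    Assumes weights (n_i - 3), so the weights sum to N - 3K.
    """
    zs = [fisher_z(r) for r in rs]
    weight_sum = sum(ns) - 3 * len(ns)  # N - 3K
    z_plus = sum((n - 3) * z for n, z in zip(ns, zs)) / weight_sum
    test_stat = z_plus * math.sqrt(weight_sum)  # compare with +/- z_crit
    half_width = z_crit / math.sqrt(weight_sum)
    return z_plus, test_stat, (z_plus - half_width, z_plus + half_width)


# Hypothetical correlations and sample sizes, for illustration only.
z_plus, test_stat, ci = common_correlation([0.30, 0.10, -0.05], [28, 53, 103])
print(z_plus, test_stat, ci)
```

A common correlation is judged to differ from zero when the absolute value of the test statistic exceeds the critical value, or equivalently when the confidence interval excludes zero.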

Because the studies used in our analyses obviously differ in many characteristics, such as data collection, species and environmental traits, and researcher effects, we used a random model to estimate the variance of the population of correlations. The expected values of the mean squares are thus expressed as variance components of the mean effect sizes. We then tested whether the variance of the population of correlations differs from zero, using the large sample test for homogeneity of correlations given by Hedges & Olkin (1985):

$$Q = \sum_{i=1}^{K} (n_i - 3)\,(z_i - Z_+)^2 \qquad (\text{eqn 5})$$

which is subsequently compared with the critical value from the chi-square distribution with K – 1 degrees of freedom. Clearly, Q depends on both effect size and sample size, and is a conservative test when sample sizes are small, as is the case for all the studies considered here.
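The homogeneity test can be sketched as follows, assuming that Q takes the Hedges & Olkin (1985) form of a weighted sum of squared deviations of the transformed correlations from their weighted mean, with weights (n − 3). To keep the example free of external libraries, the chi-square tail probability uses the Wilson-Hilferty normal approximation, which is a substitute for an exact chi-square routine and not part of the original analysis. All names and data are hypothetical.

```python
import math
from statistics import NormalDist


def homogeneity_q(zs, ns):
    """Q statistic: weighted squared deviations of z_i from the weighted mean."""
    weights = [n - 3 for n in ns]
    z_plus = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return sum(w * (z - z_plus) ** 2 for w, z in zip(weights, zs))


def chi2_upper_tail(q: float, df: int) -> float:
    """Approximate P(chi-square with df d.f. >= q) (Wilson-Hilferty)."""
    if q <= 0:
        return 1.0
    c = 2.0 / (9.0 * df)
    x = ((q / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 1.0 - NormalDist().cdf(x)


# Hypothetical transformed effect sizes and sample sizes.
zs = [0.31, 0.10, -0.05, 0.22]
ns = [28, 53, 103, 40]
q = homogeneity_q(zs, ns)
print(q, chi2_upper_tail(q, df=len(zs) - 1))
```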

The homogeneity statistic determines whether the effect sizes from a series of studies exhibit any variability beyond that which could be expected due to sampling error. In the case of significant heterogeneity, then post hoc multiple comparison methods can be used to partition the heterogenous populations into more homogenous groups in which the effect sizes within clusters are close, but the effect sizes between clusters are separated. Methods for post hoc disjoint cluster analysis follow Hedges & Olkin (1985). If the homogenous model is consistent with the data, it implies that each study in essence conforms to or replicates the findings of the other studies. Nevertheless, individual outliers, or aberrant results, are likely to remain and these can be identified by calculating standardized weighted residuals:

$$\tilde{e}_i = \frac{e_i}{\sigma(e_i)} \qquad (\text{eqn 6})$$

where e_i is the difference between the effect size of the ith study and the weighted mean with the ith study omitted, and σ(e_i) is the square root of the approximate estimated variance. The residuals that have large absolute values (> 2) indicate a set of parameters with different effect sizes from the overall mean (Hedges & Olkin 1985).
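The outlier screen can be sketched as a leave-one-out calculation. The variance used here, 1/(n_i − 3) plus the reciprocal of the summed weights of the remaining studies, is an assumption based on large-sample theory for Fisher-transformed correlations; the text does not give the exact expression, and the function name and data are hypothetical.

```python
import math


def standardized_residuals(zs, ns):
    """Leave-one-out standardized residuals for transformed effect sizes.

    Needs at least two studies with n > 3. The variance of each residual is
    assumed to be 1 / (n_i - 3) + 1 / (sum of the other studies' weights).
    """
    weights = [n - 3 for n in ns]
    total_w = sum(weights)
    total_wz = sum(w * z for w, z in zip(weights, zs))
    residuals = []
    for w, z in zip(weights, zs):
        rest_w = total_w - w
        mean_without_i = (total_wz - w * z) / rest_w
        e_i = z - mean_without_i
        sd_e = math.sqrt(1.0 / w + 1.0 / rest_w)
        residuals.append(e_i / sd_e)
    return residuals


# Hypothetical data: the third effect size sits far from the other two.
res = standardized_residuals([0.0, 0.0, 1.0], [53, 53, 28])
outliers = [i for i, r in enumerate(res) if abs(r) > 2]
print(res, outliers)
```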

Meta-analysis is based on the assumption that the literature reviewed is unbiased (Rosenthal 1991). Alternatively, publication bias occurs whenever the strength or direction of the results of published and unpublished studies differ. To examine publication bias we plotted effect size (zi) against sample size (ni) and conducted the rank correlation test of Begg & Mazumdar (1994) to investigate the relationship between the two. Publication bias is inferred if this relationship is significant such that there are fewer than expected studies with either negative or positive effects at low sample sizes (Møller & Jennions 2001). Where we found effect size to be different from zero, we estimated the number of unpublished null results that would be required to nullify the observed significant effect. This ‘fail-safe’ number was estimated following Rosenthal (1991) as:

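$$X = \frac{\left(\sum_{i=1}^{K} Z_i\right)^2}{2.706} - K \qquad (\text{eqn 7})$$

where Z_i is the standard normal deviate associated with the ith result, K is the number of results combined, and 2·706 is the square of the one-tailed critical z-value (1·645) for α = 0·05.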
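The fail-safe calculation can be sketched as below, assuming Rosenthal's formulation in which the squared sum of the standard normal deviates is divided by 2·706 and the number of results K is subtracted. The robustness check uses the 5K + 10 criterion attributed to Rosenthal (1991) in the Results. The function names and input values are hypothetical.

```python
def fail_safe_number(z_scores):
    """Number of unpublished null results needed to nullify a significant effect."""
    k = len(z_scores)
    return sum(z_scores) ** 2 / 2.706 - k


def is_robust(fail_safe: float, k: int) -> bool:
    """True when the fail-safe number exceeds the 5K + 10 criterion."""
    return fail_safe > 5 * k + 10


# Hypothetical standard normal deviates from four results.
deviates = [2.1, 1.7, 2.8, 1.2]
x = fail_safe_number(deviates)
print(x, is_robust(x, len(deviates)))
```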


The weighted average effect size (Z+) for the population of transformed Pearson correlation coefficients is 0·04. The 95% confidence interval around this effect (0·01, 0·08) does not include zero, which thus indicates a significant positive common correlation. The frequency distribution of correlation coefficients was not significantly different from normal and the most common coefficient was close to the mean value (Fig. 1; Shapiro–Wilk W = 0·98, P = 0·25). The skewness and kurtosis of the distribution of coefficients were −0·32 and −0·29, respectively, neither of which differed significantly from zero (Fig. 1; P > 0·05). These results suggest that the large sample normal approximation to the distribution of Z+ is appropriate, and hence that our test of the null hypothesis of a common correlation of zero is robust.

Figure 1.

Frequency distribution of Pearson's product-moment correlation coefficients for the relationship between the outcome of introduction success and predicted ecological moderator variables.

The relationship between transformed effect size (Zi) and sample size (ni) across all comparisons is shown in Fig. 2. If associations between success and independent variables were random, most points would fall within the limits defined by the significance lines at α = 0·05. Clearly, a reasonable proportion of the points in Fig. 2 fall outside these limits. The rank correlation test between standardized effect size and sample size was not significant (r = −0·04, n = 96). This indicates that the number of expected negative and positive effects at low sample sizes is not significantly biased, and hence it can be inferred that the pattern of effect sizes is not due to publication bias.

Figure 2.

The relationship between transformed effect sizes (Zi) and sample sizes (ni) for the 24 studies and 96 moderator variables. Overlain significance lines are calculated for the levels of α = 0·05 from the relationship between R² and n in Sutton (1990). Highlighted points (hollow) indicate study parameters with large standardized residuals (outliers), and are as follows: (a) introduction area (location-level); (b) nestling type (species-level); (c) number of introduced species (location-level); and (d–g) number of individuals released (event-level).

Although we find a significant positive common correlation on introduction success across all the variables in our data, it is of relatively low magnitude (0·04). Moreover, the range of different locations chosen for introduction, and the range of traits that researchers have subsequently compared, suggests that it is inappropriate to expect a common fixed effect size for the population of correlation coefficients across all variables, locations, and studies. However, the test for homogeneity shows that the variance in effect sizes is not significant across all studies (α = 0·05, Table 2), contrary to this expectation. Nevertheless, we strongly believe that it is sensible not to interpret this result over-conservatively and thus potentially commit a Type II statistical error. One reason why this might happen is that the test statistic in equation 5 is strongly dependent on ni whereas Fig. 2 reveals that all but two of the sample sizes available for the study of avian introduction success are relatively small (< 100). Moreover, the parameter variance component of the population of correlation coefficients in the random effects model is relatively large (0·38), while the significance value of the heterogeneity test is relatively low (approximately 0·10). Finally, a test for outliers reveals that seven parameters are not consistent with the homogenous effect size model. These include effect sizes for species-level (K = 1), location-level (K = 2), and event-level (K = 4) traits (Fig. 2). Therefore, we believe that investigating possible causes of heterogeneity in the distribution of correlations is clearly worthwhile.

Table 2. Effect size statistics for the treatment groups within the 24 studies and 96 moderator traits. The overall mean effect is calculated using equation 2. Population variance is the variance component for the population of correlation coefficients (effect sizes) calculated using a random model. The homogeneity test statistic for the population variance is defined in equation 5, and its significance is indicated by the P-value

Comparison                 Overall mean effect   Confidence interval   Population variance   Homogeneity test statistic   P-value
All introduction studies   0·04                  0·01, 0·08            0·38                  108·11                       0·10
Islands of New Zealand     0·05                  0·00, 0·09            0·34                  48·04                        0·15
Islands of Hawaii          0·01                  −0·06, 0·08           0·37                  13·13                        0·87
Oceanic islands            −0·02                 −0·13, 0·09           0·48                  18·21                        0·15
Australia                  0·10                  0·03, 0·17            0·29                  13·93                        0·38
Species-level traits       0·03                  −0·02, 0·07           0·32                  31·72                        0·87
Location-level traits      −0·07                 −0·13, 0·00           0·35                  26·18                        0·75
Event-level traits         0·21                  0·14, 0·28            0·12                  6·31                         0·99

One potential source of variance in the effect sizes is location. Over two-thirds of the collected variables come from studies on the islands of New Zealand and Hawaii (68%). However, these two locations contribute only 38% of all island introduction attempts (Cassey et al. 2004). We examined the variability in effect sizes for four categories of study locations: the islands of New Zealand (K = 42), the islands of Hawaii (K = 22), the remaining oceanic study islands (K = 15), and the land mass of Australia (K = 15). For both New Zealand and Australia, the mean effects were significantly different from zero (Table 2). In all four cases, the 95% confidence interval for the mean effect overlapped with the overall mean. In two sets of studies, those for the islands of New Zealand and the remaining oceanic islands, the test for heterogeneity was approximately P = 0·15. Although clearly not statistically significant, these are two cases with a large detectable degree of heterogeneity among their effect size estimates (Table 2).

Another obvious source of potential variation in effect sizes is the difference in moderator variables used across studies. We examined the variance in effect sizes for species-, location- and event-level traits. The mean effect for species-level traits was not significantly different from zero, and its 95% confidence interval overlapped with the overall mean effect (Table 2). However, the 95% confidence intervals for location- and event-level traits did not overlap with the overall mean effect and the mean effect for event-level traits was highly significantly different from zero (Z = 6·14). This strongly suggests that the overall positive common correlation is driven largely by event-level traits. The test for homogeneity shows that effect sizes within the species-, location- and event-level categories are all highly homogenous, indicating that the effects in each case are not drawn from multiple distributions. In addition, event-level is the only set of traits for which the fail-safe number (315) is large enough that it is unlikely that missing results could nullify the significance of the effect. Notably, the fail-safe number of ‘missing’ publications is more than three times the value suggested by Rosenthal (1991) as being evidence of a robust average effect size (5K + 10).


The publication of catalogues of avian introductions by Long (1981) and Lever (1987) has provided the raw material for a plethora of comparative studies of establishment success. While the resulting studies have undoubtedly improved our understanding of the introduction process, they have also spawned disagreement regarding the relative importance to establishment, or lack thereof, of specific factors or processes (e.g. compare Moulton & Pimm (1983, 1986a, 1987) with Simberloff & Boecklen (1991); Green (1997) with Sol & Lefebvre (2000); Moulton & Sanderson (1997, 1999) with Duncan & Young (1999); and Moulton, Sanderson & Labisky (2001) with Duncan & Blackburn (2002)).

Our analyses quantitatively summarize the findings of many such comparative studies for bird introductions to islands, and reveal the following patterns. First, the range of predictor variables analysed shows a small but significant positive common correlation with introduction success. This pattern in the effect sizes is not due to publication bias. Second, although a test for homogeneity shows that the effect sizes are not significantly heterogenous in their distribution, they do show relatively high variance. This suggests that the magnitude of the common correlation may be influenced by the combination of effects from heterogenous groups. We thus felt justified in searching for the source of this variance. Third, grouping the predictor variables by location of study did not sufficiently explain the heterogeneity in effect sizes. The mean effect size in each group did not differ significantly from the overall mean, while the variance component for the population of effect sizes was only markedly reduced for one of the four locations (Australia; Table 2). In contrast, fourth, grouping the predictor variables invoked to explain avian establishment on islands as species-, location- or event-level factors did identify sources of heterogeneity. Of the three groups, event-level variables were the most consistent predictors of establishment success for introduced birds (Table 2; see also below).

Although the event-level class includes several different predictor variables, the most frequently considered such factors are measures of introduction effort or ‘propagule pressure’. Theory suggests that introduction effort should be a key determinant of establishment success (Williamson 1996). In a recent review of avian introductions, Duncan et al. (2003) noted that all studies that have addressed the question ‘have shown that species introduced with greater effort … have a higher probability of establishment’, and that ‘introduction effort remains by far the strongest independent explanation for establishment success in birds’. These were qualitative observations, but are supported by the quantitative meta-analyses presented here. Notably, the only event-level trait that produced detectable outliers (i.e. deviated from the overall mean effect size) was introduction effort (the four event-level outliers in Fig. 2). Moreover, the mean effect size for the event-level variables (0·212) is relatively high by biological standards (Møller & Jennions 2002).

The overall mean effect for species-level factors did not differ from zero, which therefore suggests that there is no consistent set of traits that distinguish species that are ‘good’ at establishing from those that are ‘poor’. Note that while a non-significant effect could arise because consistent positive and negative relationships within this category cancelled each other out, this would give rise to heterogeneity in the estimate of the overall mean effect if the population of correlation coefficients tended to be statistically meaningful. There is no evidence for such heterogeneity in the population of species-level factors, as grouping variables in this way resulted in very low heterogeneity in effect sizes within this group (Table 2). Although it has frequently been argued that some characteristics will predispose species to invade, such as generalist habits or high intrinsic population growth rates (Duncan et al. 2003), it seems more likely that features of the environment and species will interact to determine establishment success. The general weakness of species-level effects (Table 2) fits with studies that show most variation in establishment success to be clustered at the species level (Blackburn & Duncan 2001a; Sol, Timmermans & Lefebvre 2002): species tend either to have high or low establishment success, but closely related species often differ substantially in their probability of establishment. Since closely related species tend to share many life-history, behavioural and ecological characteristics, it is unlikely that any such traits will be general drivers of the differences between such species in establishment success.

There is a trend in our analyses for a negative effect of location-level traits. This implies that some locations (e.g. smaller islands and islands with fewer exotic species; Fig. 2) are easier to invade than others. However, the influence of location-level traits on establishment success is clearly not as strong as the effect of event-level factors (Table 2). This is perhaps not surprising, as the influence of the location seems likely to depend on the characteristics of the species exposed to it: an environment conducive to the establishment of some species may be inimical to others. That said, most of the introduction locations in the studies included in our analyses are islands, traditionally considered to be relatively benign environments for invaders. It is possible that by including mostly island systems in our analyses we may be excluding much of the variation in invasibility associated with location, and so biasing our study against finding location-level effects. However, Sol (2000), Blackburn & Duncan (2001a) and Cassey (2003) have all shown that islands are no more susceptible to colonization by exotic birds than continental regions. Recently, Cassey et al. (2004) showed that significant effects of location-level variables on establishment success in a global analysis of birds were artefacts of their relationship with introduction effort. We suspect that the interaction between location- and event-level traits will prove to be a rewarding avenue for future studies of invasion biology.

One criticism of the importance for establishment success attributed to introduction effort is that a large number of studies use data from just a single location, the New Zealand archipelago (Duncan et al. 2003). Thus, they may say much about the influence of introduction effort in one specific system, but little about its influence in general. However, we obtained data for introduction effort from studies of the Hawaiian archipelago, Australia, St Helena and New Zealand. We found strong evidence for homogeneity in the population of effect sizes for event-level factors, as the significance level for the homogeneity statistic is greater than 0·99 and the highest of our analyses (Table 2). In addition, there is no statistical evidence to suggest that the magnitude of the effect sizes in New Zealand is any different from the other locations (two-sample Wilcoxon test; Z = 0·69, n = 19, P = 0·48). Thus, there is nothing to suggest that the results from analyses of New Zealand introductions differ in kind or degree from those of other locations in our data. The strong implication is that the clear importance of introduction effort in New Zealand is likely to generalize beyond that archipelago. Indeed, the primary influence of introduction effort and other event-level factors on establishment success is the principal message of the analyses we present. Importantly, we show that this is a strong consistent effect, recovered from studies across a range of locations. Finally, we note that despite the differences in the number and type of variables included a priori in the three hypothesized categories, the test for homogeneity in each case suggests that the range of variables is not producing heterogeneous effect sizes in any category.
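The homogeneity argument above can be illustrated with a minimal sketch. It assumes that effect sizes are correlation coefficients, that they are combined after Fisher's z-transformation with fixed-effect weights of n - 3, and that homogeneity is assessed with Cochran's Q; these are common meta-analytical conventions and may differ in detail from the procedure used in our analyses. The function name `homogeneity_q` and the example values are hypothetical and are not data from the studies analysed here.

```python
import math


def homogeneity_q(rs, ns):
    """Fixed-effect weighted mean effect size and Cochran's Q.

    rs: correlation coefficients (effect sizes), one per result
    ns: corresponding sample sizes (each must exceed 3)
    Returns (mean_r, q, df).
    """
    # Fisher's z-transform puts correlations on an approximately normal scale.
    zs = [math.atanh(r) for r in rs]
    # The sampling variance of z is 1 / (n - 3), so the weight is n - 3.
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    # Q sums the weighted squared deviations from the weighted mean.
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    # Back-transform the mean to the correlation scale.
    return math.tanh(z_bar), q, len(rs) - 1


# Hypothetical effect sizes and sample sizes, for illustration only.
example_rs = [0.20, 0.25, 0.18, 0.22]
example_ns = [40, 55, 30, 80]
mean_r, q, df = homogeneity_q(example_rs, example_ns)
print(f"mean r = {mean_r:.3f}, Q = {q:.3f}, df = {df}")
```

Under the null hypothesis of a single common effect, Q is compared with a chi-squared distribution on df degrees of freedom. A small Q relative to df gives a large P-value, which is the sense in which a significance level above 0·99 indicates homogeneity.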

Our finding, that event-level factors are a critical determinant of establishment success in island bird introductions, has important implications for understanding establishment success in other taxa. In particular, the most important likely predictor of establishment success will not be available for most introduction studies. For few taxa or locations have introductions been documented in sufficient detail for the number of individuals released to be known, even on a broad comparative basis. This has two further consequences. First, it means that the effect of any other predictor variable on establishment success must be considered provisional if there is any reason to believe that it may be correlated with introduction effort (cf. Moulton, Sanderson & Labisky 2001; Duncan & Blackburn 2002; Cassey et al. 2004). Of course, if there are grounds to believe that this correlation is likely to be strong, this does suggest a way to account for the influence of effort in a comparative study when that variable is not directly available (see, for example, Blackburn & Duncan 2001a). Second, the lack of information on effort means that many, perhaps most, comparative studies of introduction data will be unable to explain the majority of variance in establishment success (cf. Williamson 1996, 1999), if establishment is largely driven by event-level factors, and data for these factors are usually unavailable. Whereas we are undoubtedly attaining an extremely good understanding of the general likelihood of introduction success (Blackburn & Duncan 2001a; Cassey et al. 2004), the outcome of any given event is likely to remain inherently difficult to predict (Williamson 1999).


This work was conducted as part of the ‘Phylogeny and Conservation’ Working Group supported by the National Center for Ecological Analysis and Synthesis, a Center funded by NSF (Grant no. DEB-0072909), the University of California, and the Santa Barbara campus. P.C. acknowledges the assistance of the French Ministry of Education and Research, Action Concertee Incitative ‘Jeunes Chercheurs 2000’, awarded to the group ‘Eco-Evolution Mathematique’, the French Ministry for Environment, Action Concertee Incitative ‘Invasions Biologiques’, and the Leverhulme Trust (grant F/00094/AA). T.M.B. thanks R. Ferriere (Ecole Normale Supérieure, Paris) and M. Bergman for their kind hospitality.