## Introduction

Dispersal, the movement from a natal or breeding site to a new breeding site, is probably *the* most important and least understood life history trait (Clobert *et al*. 2001). It is of core importance to the ecological understanding of landscapes, populations and organisms, and it connects ecology to evolution through life history theory, biogeography and population genetics. Population geneticists have long understood that dispersal can act both as a source of genetic variation for evolutionary change (e.g. Wright 1982; Bohonak 1999) and as a limit to local adaptation (e.g. Dhondt *et al*. 1990; Hendry *et al*. 2001; Lenormand 2002). Population ecologists have also recently rediscovered the importance of dispersal in metapopulation and range dynamics (Andrewartha & Birch 1954, cf. Durrett & Levin 1994; Tilman & Kareiva 1997; King & With 2002): dispersal is the glue that binds together the components of a metapopulation, and it effects source–sink dynamics and the demographic interconnection that is essential to metapopulation dynamics. Without dispersal, the dramatic range extensions that we see today (Shaw 1995; Veit & Lewis 1996), and that we infer in the recent past (Mila *et al*. 2000), would not have been possible. Finally, demographers know dispersal as the principal confounding factor in estimating survival rates (e.g. Lebreton *et al*. 1992; Lindberg *et al*. 2001; Blums *et al*. 2002), as a bird that disappears from a marked open population can only be known with certainty to have dispersed or died if it is found again after leaving.

Because birds are so vagile, avian dispersal has been studied most effectively to date where potential dispersal destinations are constrained to discrete ‘islands’: in the ocean (Pärt 1994, 1995, 1996; Spear *et al*. 1998; Wheelwright & Mauck 1998; Young 1998), in seas of unfavourable habitat (e.g. Matthysen *et al*. 1995; Stith *et al*. 1996) or where social behaviour dictates either a very clumped distribution of breeders (e.g. colonial breeders; Brown & Brown 1992; Pradel 1996; Negro *et al*. 1997; Hafner *et al*. 1998; Lindberg *et al*. 1998; Schjørring 2001) or limited dispersal options (e.g. cooperative breeders; Walters *et al*. 1988; Zack 1990; Young 1998; Koenig *et al*. 2000). For all the tractability that these choices of systems provide, the majority of avian species occupy more continuous habitat, where the distribution of dispersal distances is expected to be continuous; or the movements of dispersers, while impacted by interactions with competitors, are not constrained by habitat availability *per se*. By ‘continuous habitat’ we do not imply that there are not patches of unsuitable habitat, simply that these patches are not arrayed so as to bound dispersal distributions.

There have been few studies of passerines in continuous habitat (Plissner & Gowaty 1996; Verhulst *et al*. 1997), and especially of obligate migrants (Payne 1990, 1991; Shutler & Clark 2003). The biggest problem in studying dispersal empirically in these habitats is that distributions of dispersal distances are confounded by the unequal probabilities of detecting dispersal movements of differing length, i.e. that the dispersal distances actually observed are dictated largely by the dispersal distances that *could* be observed (e.g. Porter & Dooley 1993; van Noordwijk 1995; Koenig *et al*. 1996). There is also the fundamental problem of distinguishing between mortality and dispersal to a breeding site outside the study area. These and other biases inherent in most estimates of dispersal are now widely acknowledged, and there has been considerable recent interest in developing computational methods that quantify dispersal distances and survival more accurately (e.g. Barrowclough 1978; Manly & Chatterjee 1993; Baker *et al*. 1995; Pradel 1996; Thomson *et al*. 2003). These methods still involve extremely simplifying assumptions that may not apply in most systems. Regardless, no matter how sophisticated our corrections for bias may be, we cannot measure dispersal accurately until almost all the potential dispersal distances are sampled (Baker *et al*. 1995; Koenig *et al*. 1996).

Despite the availability of these and other methods, in thinking about dispersal most ornithologists are still guided by generalizations that arise from early spatially constrained studies of dispersal (e.g. Murray 1967; Greenwood 1980; Greenwood & Harvey 1982; Clarke *et al.* 1997): that female birds disperse further than males, that dispersal frequency declines geometrically with distance from the natal site, etc. We examine here the validity of these generalizations in a large-scale study of dispersal in a continuous mainland environment. We explore the natal dispersal distance distributions (DDDs) of a widely distributed Neotropical migrant bird, the tree swallow (*Tachycineta bicolor*, Vieillot 1808). This paper focuses on natal dispersal, the movement from the natal site to the first breeding site. Such movements are of larger scale than ‘breeding dispersal’ movements among successive breeding sites, both in general (Greenwood & Harvey 1982) and in tree swallows (Winkler *et al*. 2004), and they encompass the largest component of the spatial ecology of these birds. We compare the observed DDDs to those that could have been observed in our study area on the basis of the distribution of recapture effort. These comparisons help to weigh the ‘true’ DDD free of constraining study area boundaries. We explore further the potential costs and benefits of dispersal decisions by investigating the relationship between dispersal distances and the density of breeding opportunities, sex of the disperser and the relative timing of the disperser's fledging and its first breeding attempt.

### Study system and methods

Like most other North American passerines, tree swallows are Neotropical migrants. They fly every year between breeding grounds throughout North America and wintering areas on the Gulf Coast of North America, the Caribbean and Central America (Robertson *et al*. 1992). Their dispersal distances are thus not constrained by any limitation of movement. Tree swallows are secondary cavity nesters that rely on woodpeckers (or humans) to create the tree holes (or nest-boxes) that they require for nesting.

Our studies of tree swallows around Ithaca were begun in ‘UNIT’ study areas with the erection of 105 nestboxes in 1985 at Cornell University's Experimental Ponds Unit 1. Boxes were established at Experimental Ponds Unit 2 (128 boxes) in 1989, and on Cornell farm land at the top of Mt Pleasant (Unit 4: 60 boxes) in 1991 and along Hanshaw Road (Unit 5: 22 boxes) in 1993 (see map in Winkler *et al*. 2004). Boxes at each of these UNIT sites are 20 m from the nearest neighbouring box. In the late 1980s we began monitoring variable numbers of boxes erected by others on private property surrounding our intensive study areas on the UNITs. By searching the roads of Tompkins County and creating a database of the locations, conditions, occupants, owners and permissions to visit each of these boxes, we had built by 1993 a network (dubbed TOCO) for exploring the dispersal of swallows all around Tompkins County.

We extended the reach of our recapture efforts further by recruiting participants to a dispersal study that was part of the Cornell Nest Box Network (CNBN). Through this network we recruited and trained a subset of CNBN participants in New York and surrounding states to band bluebirds and swallows. In addition to these subpermittees on our master banding permits, several independent banders were also recruited to participate in the study. Although CNBN transformed into the Birdhouse Network (birds.cornell.edu/birdhouse/) in the late 1990s, the Swallow/Bluebird Dispersal Study (SDS) continued to function from Winkler's laboratory at Cornell through the 2003 breeding season. CNBN began reaching cooperators outside Tompkins County in 1994, with eight banders trained, and increased through the next 2 years to between 66 and 73 active banders state-wide from 1997 to 1999. The inclusion of banders throughout New York and surrounding states allowed us to conceive of our dispersal study area as a circle of 400 km radius around our original study site at Unit 1 (Fig. 1).

The central preoccupation of studying dispersal is gathering a collection of line segments, each of which represents the connection between the natal site from which a bird fledged and its first breeding site. To ensure the accuracy of these line segments, we took great pains to record both capture locations, for chicks when they were banded in the nest and for adults when they were recaptured as breeders elsewhere, as accurately as possible. Box locations were mapped to an accuracy of less than 100 m using USGS 7·5 minute topographic quads. The other critical information available from our database is the distribution of nestboxes in which an adult swallow was captured. This distribution offers an integrated estimate of the ‘eyes’ of our project, as it incorporates not only the distribution of boxes, but also the distribution of adult-capturing effort.
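The length of each such line segment can be illustrated with a short sketch. The text does not state how distances were computed from the mapped box locations, so the haversine (great-circle) formula, the mean Earth radius constant and the function name `haversine_km` below are all assumptions made for illustration only.

```python
import math

# Mean Earth radius in km (an assumed constant, not taken from the study).
EARTH_RADIUS_KM = 6371.0088

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical natal and first-breeding box coordinates (not real study data).
natal_box = (42.50, -76.45)
breeding_box = (42.60, -76.30)
segment_length_km = haversine_km(*natal_box, *breeding_box)
```

Over a study circle of 400 km radius, a planar approximation on projected map coordinates would probably serve equally well; the great-circle form is used here only because it needs no projection.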

It is the nature of birds that nest in continuous habitat that a check of all possible nesting sites is impossible. Thus, raw recapture rates cannot be taken as estimates of survival rates, and our study aimed not to capture every bird dispersing but rather to sample the dispersal distances across as wide a range of distances as possible. We conducted randomization tests with S-PLUS (2002) to evaluate the deviations of the observed DDD from those expected under various null distributions. Taking the natal box of each dispersal event as a starting-point, we calculated the distance from the natal box to every other box (henceforth ‘capture-boxes’) in the study area at which we captured an adult the following year. (The same qualitative results were obtained if we evaluated capture-boxes in the year of fledging.) Then, to judge the extent to which the observed DDD was dictated by the distribution of dispersal events that could have been observed, we conducted randomization tests on the distribution of all capture-boxes to see whether the observed DDD represented a significantly different distribution.

The first randomization test was based on a uniform null distribution, with an equal probability of a fledgling settling to breed in any capture-box. One draw was taken from the distribution of capture-boxes for each of the natal nests that was the origin of a dispersal line segment. This process was repeated 1000 times to produce an estimate of the median and range of the expected DDD for all dispersal events. The uniform null model assumed that returning birds were equally aware of all the nesting opportunities in our entire 400 km-radius study circle. One alternative to this null model is that the birds search for available nesting sites starting at their natal site and working outwards from there until they find an unoccupied site. Random-walk local searches produce a geometric decline in frequency with distance (e.g. Murray 1967; Waser 1985), and we created a similar exponential null distribution by regressing the overall observed log probabilities of capture on distance and using the slope and intercept of this regression to parameterize the null distribution. Note that in this paper, in the interest of comparability, we use a one-parameter exponential model, with a steeper drop-off in probability of settlement with distance than in the two-parameter exponential null used in Winkler *et al*. (2004). Finally, we used a very similar procedure to generate a half-Cauchy distribution with its shape parameter derived from the observed data by non-linear regression. The Cauchy distribution is the distribution of the ratio of two independent, zero-mean normal random variables, and it has the heavy tails that characterize what we know of other empirical DDDs (Sutherland *et al*. 2000; Paradis *et al*. 2002).
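The randomization procedure can be sketched roughly as follows. This is a minimal illustration in Python rather than the S-PLUS code actually used, and several details are assumptions: box positions are taken to be planar (x, y) coordinates in km, the expected DDD is summarized as counts per distance bin, the natal box is not excluded from the candidate capture-boxes, and the exponential and half-Cauchy nulls are implemented by weighting each capture-box by a kernel of its distance from the natal box, which is only one plausible reading of the description. All function names and the `scale_km` parameter are hypothetical.

```python
import bisect
import math
import random
import statistics

def distance_km(a, b):
    """Straight-line distance between two (x, y) points expressed in km."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def uniform_kernel(d):
    """Uniform null: every capture-box is equally likely, whatever its distance."""
    return 1.0

def exponential_kernel(scale_km):
    """One-parameter exponential decline in settlement weight with distance."""
    return lambda d: math.exp(-d / scale_km)

def half_cauchy_kernel(scale_km):
    """Heavy-tailed half-Cauchy decline in settlement weight with distance."""
    return lambda d: 1.0 / (1.0 + (d / scale_km) ** 2)

def simulate_null_ddd(natal_sites, capture_boxes, bin_edges,
                      kernel=uniform_kernel, n_reps=1000, seed=0):
    """Return (median, min, max) of simulated counts for each distance bin."""
    rng = random.Random(seed)
    n_bins = len(bin_edges) - 1
    # Distances and weights depend only on the natal site, so compute them once.
    per_site = []
    for natal in natal_sites:
        dists = [distance_km(natal, box) for box in capture_boxes]
        per_site.append((dists, [kernel(d) for d in dists]))
    counts_per_rep = []
    for _ in range(n_reps):
        counts = [0] * n_bins
        for dists, weights in per_site:
            # One draw of a settlement box per observed dispersal event.
            d = rng.choices(dists, weights=weights, k=1)[0]
            idx = bisect.bisect_right(bin_edges, d) - 1
            if 0 <= idx < n_bins:
                counts[idx] += 1
        counts_per_rep.append(counts)
    summary = []
    for b in range(n_bins):
        column = [rep[b] for rep in counts_per_rep]
        summary.append((statistics.median(column), min(column), max(column)))
    return summary
```

Observed bin counts falling outside the simulated range for a bin would then indicate a departure from that null. Passing `kernel=exponential_kernel(10.0)` or `kernel=half_cauchy_kernel(10.0)` switches the null model, where the scale of 10 km is an arbitrary placeholder and not a fitted value.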

In analyses relating dispersal distance to breeding phenology, we standardized for annual variations in laydates by subtracting the mean laydate for all nests (not just those that produced dispersal recaptures) in each year from each laydate. We then added either this standardized natal lay date for each disperser or a three-state (early, mid, late) laydate code for each to the mixed-model analyses. With the same methods, we also standardized the laydates of dispersing females in their first breeding year.
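The laydate standardization amounts to centring each laydate on its year's mean, as in the sketch below. The data layout (tuples of year and laydate as day of year) and the function names are assumptions; the cut-offs used for the three-state code are not given in the text, so that coding is omitted.

```python
from collections import defaultdict
from statistics import mean

def yearly_mean_laydates(all_nests):
    """all_nests: iterable of (year, laydate) for every nest, not only dispersers."""
    by_year = defaultdict(list)
    for year, laydate in all_nests:
        by_year[year].append(laydate)
    return {year: mean(days) for year, days in by_year.items()}

def standardized_laydate(year, laydate, yearly_means):
    """Laydate expressed as days before (-) or after (+) that year's mean."""
    return laydate - yearly_means[year]

# Hypothetical records (not real study data).
nests = [(1993, 130), (1993, 140), (1994, 120)]
means = yearly_mean_laydates(nests)
early_nest = standardized_laydate(1993, 130, means)  # 5 days before the 1993 mean
```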

We tested for nest density effects in the sample of known first breeders from natal years 1993 onwards, as it is only in these latter years that large numbers of birds were being recaptured from all three networks (Table 1). Within this sample, we divided the area around each natal site into a series of concentric bands of increasing radius. We then related the observed dispersal distances to the numbers of capture-boxes in these bands, tallied for the year of breeding. In a mixed-model analysis with year as a random effect we also included the effect of sex and destination network, along with interactions of all these with each other and the distance rings, as fixed effects.
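Tallying capture-boxes in concentric bands around a natal site might look like the sketch below. The band edges are hypothetical, since the radii used in the study are not stated here, and positions are assumed to be planar (x, y) coordinates in km.

```python
import bisect
import math

def boxes_per_band(natal, capture_boxes, band_edges_km):
    """Count capture-boxes in each band [edge_i, edge_i+1) around the natal site."""
    counts = [0] * (len(band_edges_km) - 1)
    for box in capture_boxes:
        d = math.hypot(box[0] - natal[0], box[1] - natal[1])
        idx = bisect.bisect_right(band_edges_km, d) - 1
        if 0 <= idx < len(counts):  # ignore boxes beyond the outermost band
            counts[idx] += 1
    return counts

# Hypothetical layout: four capture-boxes around a natal site at the origin.
boxes = [(0.0, 1.0), (0.0, 5.0), (3.0, 4.0), (0.0, 15.0)]
band_counts = boxes_per_band((0.0, 0.0), boxes, [0, 2, 10, 20])
```

With these inputs the boxes lie 1, 5, 5 and 15 km from the natal site, so the three bands hold one, two and one boxes, respectively.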

| Natal Year | CNBN | TOCO | UNITs | Total |
|---|---|---|---|---|
| 1985 | 0 | 0 | 1 | 1 |
| 1986 | 0 | 0 | 9 | 9 |
| 1987 | 0 | 0 | 15 | 15 |
| 1988 | 1 | 0 | 3 | 4 |
| 1989 | 0 | 0 | 16 | 16 |
| 1990 | 1 | 1 | 26 | 28 |
| 1991 | 2 | 1 | 46 | 49 |
| 1992 | 0 | 0 | 26 | 26 |
| 1993 | 7 | 0 | 64 | 71 |
| 1994 | 9 | 10 | 90 | 109 |
| 1995 | 20 | 27 | 99 | 146 |
| 1996 | 33 | 31 | 61 | 125 |
| 1997 | 30 | 33 | 74 | 137 |
| 1998 | 22 | 9 | 24 | 55 |
| Total | 125 | 112 | 554 | 791 |

Mixed-model analyses were conducted using the MIXED procedure in SAS statistical software version 8·2 (Littell *et al*. 1996). Model selection proceeded from a fully parameterized model, from which interaction terms with *P* > 0·25 were eliminated, weakest first. Natal year was included as a random effect in all mixed models. To moderate the effect of rare very long-distance dispersal events, we conducted the analyses with log<sub>e</sub>-transformed distance. In interpreting the fixed-effect coefficients, one cannot merely take the antilog of a coefficient to estimate the mean effect of a change in the predictor on the distance dispersed. It is more direct to think about the median, because log(median distance) = median of log(distance), an equality that does not hold for the mean. To focus, for example, on only the effect of sex on dispersal distance: median of log(distance) = intercept + beta × sex and, taking antilogs, median distance = exp(intercept + beta × sex) = exp(intercept) × exp(beta × sex). Because we coded sex as 1 for females and 0 for males, the expressions for females and males are exp(intercept) × exp(beta) and exp(intercept), respectively. The ratio of female to male median distance is thus median(females)/median(males) = exp(beta).
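The back-transformation argument can be checked numerically with a small sketch. The distances, intercept and beta below are made-up illustrative values, not estimates from the study, and `predicted_median_km` is a hypothetical helper name.

```python
import math
from statistics import mean, median

# Made-up dispersal distances in km (odd sample size, so the median is a data point).
distances = [1.0, 10.0, 100.0]
logs = [math.log(d) for d in distances]

# The median commutes with the log transform ...
median_commutes = math.isclose(median(logs), math.log(median(distances)))
# ... but the mean does not: the antilog of the mean log is the geometric mean.
geometric_mean = math.exp(mean(logs))   # about 10 km
arithmetic_mean = mean(distances)       # 37 km

def predicted_median_km(intercept, beta, sex):
    """Median distance implied by a log-scale model; sex is 1 for females, 0 for males."""
    return math.exp(intercept + beta * sex)

# Illustrative coefficients only.
intercept, beta = 1.0, 0.5
female_to_male_ratio = (predicted_median_km(intercept, beta, 1)
                        / predicted_median_km(intercept, beta, 0))
```

Here `female_to_male_ratio` equals exp(beta) whatever the intercept, which is why the sex coefficient can be read directly as a ratio of medians.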