The parameters (*θ*) of each model can be fit using maximum likelihood. Our survey data provide us with observations of the number of trip outcomes *S*_{nj} for each boater *n* to a given lake *j*. From these observations, we can write the log-likelihood for model *M* as:

- (eqn 10)

We fit the parameters of each model, including reduced models, by maximum likelihood, implemented in the R statistical programming environment (R Development Core Team 2008). Reduced models were those in which we removed the boater-level parameters pertaining to boat type. We then compared the relative performance of the models using two separate metrics. The first was the Akaike Information Criterion (AIC) (Burnham & Anderson 2002). The second was the simple coefficient of determination (*R*^{2}) between the predicted and observed total number of trips taken to each lake in our study system. This second metric allowed us to compare the proportion of total variation in the number of visits across all lakes explained by each model.
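The two comparison metrics can be illustrated with a minimal Python sketch. It is not the R code used in the study. It assumes a multinomial-type log-likelihood (the sum of *S*_{nj} ln *P*_{nj}), which may differ from eqn 10, and it takes *R*^{2} to be the squared Pearson correlation between observed and predicted lake totals. The function names and the toy data are hypothetical.

```python
import math

def log_likelihood(trips, probs):
    """Assumed multinomial-type form: sum over boaters n and lakes j of S_nj * ln(P_nj)."""
    total = 0.0
    for s_row, p_row in zip(trips, probs):
        for s, p in zip(s_row, p_row):
            if s > 0:  # cells with no trips contribute nothing
                total += s * math.log(p)
    return total

def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_lik

def predicted_totals(trips, probs):
    """Expected trips to each lake: each boater's trip total times the model probability."""
    n_lakes = len(probs[0])
    return [sum(sum(s_row) * p_row[j] for s_row, p_row in zip(trips, probs))
            for j in range(n_lakes)]

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted totals (an assumption)."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)

# Toy data (hypothetical): 2 boaters, 3 lakes.
trips = [[3, 1, 0], [0, 2, 2]]              # observed trips S_nj
probs = [[0.6, 0.3, 0.1], [0.2, 0.4, 0.4]]  # model probabilities P_nj

ll = log_likelihood(trips, probs)
observed = [sum(col) for col in zip(*trips)]  # observed trips per lake
r2 = r_squared(observed, predicted_totals(trips, probs))
print(ll, aic(ll, n_params=2), r2)
```

A lower AIC indicates a better trade-off between fit and number of parameters, whereas *R*^{2} here measures only how well the lake-level totals are reproduced.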

##### Spread Simulations: Examining theoretical model behaviour and interactions with population demographics

Ultimately, we construct our models of human movement between discrete patches in order to predict the spread of species dispersed across this network of patches. While spread is a stochastic process, in which introductions lead to viable population establishment at a given site in a probabilistic manner, we can use repeated simulations to characterize the expected trajectory of a given invasion process (Peck 2004). By simulating the spread process under each of our competing models, we can compare the predicted trajectories to make inferences about the consequences of model specification for spread prediction. Differences in predicted spread rates, as well as in predicted invasion risk at the individual site level, may affect management decisions regarding mitigation and control.

To conduct these simulations, we followed the procedure outlined in Leung & Delaney (2006). We model the stochastic spread process as

- (eqn 11)

Where the probability of invasion is given as a function of the number of propagules *Q* arriving at time *t* to site *j*. The function is described by two shape parameters. The first, *α*, is a per propagule multiplier proportional to −ln(1 − *p*), where *p* is the per propagule probability of establishment. The additional parameter *c* allows us to describe an Allee effect, where the per propagule establishment probability is disproportionately lower at low propagule pressures (Dennis 2002). An Allee effect is present when *c* > 1, and its strength increases as *c* increases above 1. Non-negligible Allee effects have been observed in some aquatic invasives. This parameter has been estimated as 1·86 (*P* < 0·0001; H_{0}: *c* = 1) for zebra mussels using an invasion time series (Leung, Drake & Lodge 2004). Wittmann *et al*. (2011) also detected an Allee effect using a stage-structured model of the invasive zooplankton *Bythotrephes*.
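To make the roles of *α* and *c* concrete, the following minimal Python sketch assumes a Weibull-type establishment function, 1 − exp(−(*αQ*)^{*c*}). This form is an assumption chosen to be consistent with the parameter descriptions above and may differ in detail from eqn 11. The function name and the example values are hypothetical.

```python
import math

def establishment_probability(q, alpha, c):
    """Assumed form: 1 - exp(-(alpha * q) ** c), with q the propagule number."""
    return 1.0 - math.exp(-((alpha * q) ** c))

# With c = 1 and alpha = -ln(1 - p), the function reduces to independent
# per-propagule establishment, 1 - (1 - p) ** q.
p = 0.001
alpha_from_p = -math.log(1.0 - p)
print(establishment_probability(10, alpha_from_p, 1.0), 1.0 - (1.0 - p) ** 10)

# With c > 1, establishment is disproportionately less likely at low
# propagule pressure (alpha * q < 1), i.e. an Allee effect.
for c in (1.0, 1.86, 2.5):
    print(c, establishment_probability(1000, 1e-4, c))
```

Under this assumed form, setting *c* = 1 recovers the case with no Allee effect, which is why the null hypothesis above is stated as *c* = 1.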

To calculate the number of propagules *Q* arriving at site *j*, we sum across the probability distribution of each boater having visited an invaded lake before arriving at site *j*. To do this, we first calculate the proportion of boaters at each source location that have visited an invaded lake as:

- (eqn 12)

Where *O*_{i} is the number of boaters at source location *i*, and *P*^{M}(*T*_{ih}) is the probability of a boater at source location *i* visiting lake *h* as given by the model *M* under which simulations are being carried out. *X*_{i,t} is the number of boaters in source location *i* having visited an invaded lake in time step *t*. We derived *O*_{i} from data obtained from the Ontario Ministry of Natural Resources on the number of registered boaters in Ontario in each of 526 postal regions identified by the first three characters of the postal code. The next step is to calculate the propagule pressure *Q* arriving at lake *j* in time *t* as:

- (eqn 13)

Which is the total boater traffic from all invaded sources to lake *j* in time step *t*. For more details, see Gertzen & Leung (2011). While each human vector model predicts a unique trip distribution matrix, the total number of boater trips taken, or the overall magnitude of traffic flow in the system as a whole, is constant across both models. Any difference in the observed rates of spread in our simulations is therefore a result of the dispersal network *structure*, and not of the absolute magnitude of between-lake movement.
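The two-step calculation of propagule pressure can be sketched in Python as follows. The exact forms of eqns 12 and 13 are assumed here: *X*_{i,t} is taken as *O*_{i} multiplied by the summed probability of visiting currently invaded lakes, and *Q*_{j,t} as the sum over sources of *X*_{i,t} multiplied by the probability of visiting lake *j*. The function names and toy values are hypothetical, and the published equations may differ.

```python
def boaters_from_invaded(o, p, invaded):
    """Assumed X_i,t: O_i times the total probability of visiting an invaded lake."""
    return [o_i * sum(p_ih for p_ih, inv in zip(p_i, invaded) if inv)
            for o_i, p_i in zip(o, p)]

def propagule_pressure(x, p):
    """Assumed Q_j,t: sum over sources i of X_i,t times P(T_ij)."""
    n_lakes = len(p[0])
    return [sum(x_i * p_i[j] for x_i, p_i in zip(x, p)) for j in range(n_lakes)]

# Toy system (hypothetical): 2 source locations, 3 lakes, lake 0 invaded.
o = [100, 50]                           # boaters per source location, O_i
p = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]  # trip probabilities P(T_ih); rows sum to 1
invaded = [True, False, False]

x = boaters_from_invaded(o, p, invaded)  # approximately [50, 5]
q = propagule_pressure(x, p)             # approximately [25.5, 18.0, 11.5]
print(x, q)
```

Because each row of the trip probability matrix sums to one, total traffic is fixed by *O*_{i} regardless of which model supplies the probabilities; only the distribution of that traffic among lakes changes.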

While there are roughly 250 000 lakes and rivers in Ontario, to render our simulations computationally feasible, we simulated spread across only those lakes with a surface area larger than 10 hectares. Additionally, we removed lakes above 52° N latitude, as no roadways connect these lakes to the southern lakes. This left us with 781 lakes in our simulation set. Each independent simulation began with a seed invasion in Lake Ontario and was run forward 30 years. By seeding the invasion in Lake Ontario, we recreate the most likely invasion scenario for Ontario inland lakes. As of 2006, the Great Lakes were known to have been invaded by at least 182 species (Ricciardi 2006), making them the most likely source from which a novel species would spread to inland lakes.

To analyse potential interactions between population dynamics and the human vector model, we ran repeated simulations across a range of values of the population establishment parameters *α* (7·5e−05, 1·0e−04, 1·25e−04, 1·5e−04) and *c* (1, 1·5, 2, 2·5). For each simulation, we used either the best fitting GM or RUM of boater behaviour. As our metric of invasion progress, we retained the cumulative number of lakes invaded for each run. An example realization of our simulated spread procedure can be seen in Fig. 2. Additionally, we compared the relative invasion risk at each of three selected sites. Lakes Simcoe, Nipissing and Nipigon were selected because of their large size, which places them at greater risk of invasion, and because of their relative distances from the source location of invasion. While these lakes by no means represent a random sample, they provide a convenient gradient of baseline risk along which to observe the rate at which deviations between models occur. For these lakes, we retained the time to invasion across every simulation for every parameter combination. We calculated the risk to a given lake as the proportion of simulation realizations in which the site became invaded before the end of the 30-year time horizon.
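The simulation and risk calculation described above can be illustrated with a minimal, self-contained Python sketch. It is not the code used in the study. It assumes annual time steps, the Weibull-type establishment function 1 − exp(−(*αQ*)^{*c*}), and the propagule pressure calculation described in the comments. The toy system, the value of *α* and all names are hypothetical and chosen only so that the example runs quickly.

```python
import math
import random

def simulate(o, p, alpha, c, seed_lake=0, years=30, rng=None):
    """Run one realization; return the year each lake was invaded (None if never)."""
    if rng is None:
        rng = random.Random()
    n_lakes = len(p[0])
    invaded_at = [None] * n_lakes
    invaded_at[seed_lake] = 0  # seed invasion at the start of the run
    for t in range(1, years + 1):
        # Invasion status is fixed at the start of each time step.
        invaded = [year is not None for year in invaded_at]
        # Assumed X_i,t: boaters at source i that visited an invaded lake.
        x = [o_i * sum(p_ih for p_ih, inv in zip(p_i, invaded) if inv)
             for o_i, p_i in zip(o, p)]
        for j in range(n_lakes):
            if invaded[j]:
                continue
            # Assumed Q_j,t: traffic from invaded lakes arriving at lake j.
            q = sum(x_i * p_i[j] for x_i, p_i in zip(x, p))
            prob = 1.0 - math.exp(-((alpha * q) ** c))
            if rng.random() < prob:
                invaded_at[j] = t
    return invaded_at

def invasion_risk(runs, lake):
    """Proportion of realizations in which the lake was invaded within the horizon."""
    return sum(1 for run in runs if run[lake] is not None) / len(runs)

def lakes_invaded(run):
    """Cumulative number of lakes invaded by the end of one realization."""
    return sum(1 for year in run if year is not None)

# Toy system (hypothetical): 2 source locations, 3 lakes, lake 0 is the seed.
o = [100, 50]
p = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]
rng = random.Random(1)  # fixed seed for reproducibility
runs = [simulate(o, p, alpha=2e-3, c=1.5, rng=rng) for _ in range(200)]
print([invasion_risk(runs, j) for j in range(3)])
print(sum(lakes_invaded(run) for run in runs) / len(runs))
```

Repeating this over a grid of *α* and *c* values, once for each human vector model, would yield the per-lake risk estimates and cumulative invasion counts that are compared between models.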