## Introduction

Statisticians have long debated the relative merits of systematic vs. random sampling (Barbacki & Fisher 1936; Gossett 1938; Yates 1939; Greenberg 1951). In the early 20th century, two statistical pioneers took opposing sides on this issue: R.A. Fisher was a staunch proponent of strictly randomized designs (Fisher 1926; Barbacki & Fisher 1936), whereas W.S. ‘Student’ Gossett criticized Fisher's views and argued that systematic approaches were superior (Gossett 1936, 1938). Hurlbert (1984) reviewed this debate and concluded that Gossett's (1938) basic arguments regarding the strengths of systematic designs seemed irrefutable. However, Hurlbert (1984) also noted that Fisher's views (e.g. Barbacki & Fisher 1936; Fisher 1971), despite being somewhat irrational in their intolerance for any departure from strict randomization, had a more profound influence on the statistical literature on the subject, partly because he outlived Gossett and published much more extensively. A central argument against systematic approaches is that they may lead to sampling bias if there is a systematic pattern (periodicity) in the variation being studied (Yates 1939; Greenberg 1951). Currently, environmental variation is often sampled using Geographic Information Systems (GIS), where fine-scale sampling is more feasible than it was with traditional field-based sampling, making the potential for confounding periodicity less of a concern. Indeed, with GIS it is often possible to assess environmental variation systematically at such a fine scale that it can hardly be considered sampling at all, because complete knowledge of the variation of interest is attained at the appropriate resolution.

Quantifying animal habitat selection by comparing habitats used with those available is a fundamental analytical framework for studies of wildlife-habitat relationships (e.g. Neu, Byers & Peek 1974; Johnson 1980; Aebischer, Robertson & Kenward 1993; Manly *et al*. 2002; Beyer *et al*. 2010). Conner & Plowman (2001) and Conner, Smith & Burger (2003) proposed a Euclidean distance-based approach for comparing animal habitat use with habitat availability to determine which specific habitats are selected and/or avoided. Use of Euclidean Distance Analysis (EDA) has since become widespread in practical and theoretical studies of animal habitat selection (Appendix S1), likely because of several analytical advantages of distance-based analyses relative to classification-based techniques (Conner & Plowman 2001; Conner, Smith & Burger 2003, 2005; but see Dussault, Ouellet & Courtois 2005; Bingham, Brennan & Ballard 2010). As originally described, EDA relies on random sampling in a GIS environment to determine mean distances to each habitat, which are then compared with distances from animal locations to these habitats to assess whether animal locations are closer (implying selection) and/or farther (implying avoidance) than random locations (Conner & Plowman 2001; Conner, Smith & Burger 2003). The mean distances from random locations to each habitat are used as the estimate of habitat availability (in a distance-based context) and are referred to as expected distances (i.e. distances expected under a null hypothesis of no selection; Conner & Plowman 2001; Conner, Smith & Burger 2003). Conner & Plowman (2001) stated that numerous random locations should be generated throughout the appropriate scale of selection and that ‘numerous’ should be defined by stability in the mean distances to each habitat type. Clearly, these mean distances will vary when small numbers of random points are used, depending on the random placement of the points in relation to habitat features, but they should stabilize at some unknown threshold number of points, above which the distances will not change appreciably if more points are used.
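The dependence of expected distances on the number of random points can be illustrated with a minimal sketch. It assumes a hypothetical 10 x 10 study area containing a single habitat patch built from unit raster cells; the function names, extent, patch layout, animal locations and replicate counts are all illustrative assumptions and are not taken from Conner & Plowman (2001) or any published EDA implementation.

```python
import math
import random

def nearest_distance(point, habitat_cells):
    """Euclidean distance from a point to the nearest unit cell of a habitat.

    Each cell is given by its lower-left corner (cx, cy); the distance is 0
    when the point falls inside a cell.
    """
    x, y = point
    best = float("inf")
    for cx, cy in habitat_cells:
        dx = max(cx - x, 0.0, x - (cx + 1.0))
        dy = max(cy - y, 0.0, y - (cy + 1.0))
        best = min(best, math.hypot(dx, dy))
    return best

def expected_distance_random(habitat_cells, extent, n_points, rng):
    """Mean distance from n_points uniformly random locations to the habitat."""
    xmin, ymin, xmax, ymax = extent
    total = 0.0
    for _ in range(n_points):
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        total += nearest_distance(p, habitat_cells)
    return total / n_points

# Hypothetical study area and habitat patch (assumed values, not real data).
extent = (0.0, 0.0, 10.0, 10.0)
habitat_cells = [(cx, cy) for cx in range(2, 5) for cy in range(6, 9)]

# Repeat the random draw 20 times at each sample size and record how much
# the resulting expected distance varies between replicates.
rng = random.Random(42)
estimates = {
    n: [expected_distance_random(habitat_cells, extent, n, rng) for _ in range(20)]
    for n in (10, 100, 1000)
}
spread = {n: max(values) - min(values) for n, values in estimates.items()}

# Hypothetical animal locations: a mean distance smaller than the expected
# distance would imply selection, a larger one avoidance.
animal_locations = [(3.2, 7.1), (5.5, 6.0), (1.0, 9.5)]
used_mean = sum(nearest_distance(p, habitat_cells) for p in animal_locations) / len(animal_locations)

for n in sorted(spread):
    print(n, round(spread[n], 3))
```

In this sketch, the range of the replicate estimates is a simple proxy for the stability of the mean distance: if the range at a given sample size is large relative to the difference between used and expected distances, the outcome of the comparison could depend on which random draw happened to be used.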

Any sampling regime will require some minimum level of effort to adequately capture the variation of interest; however, the importance of determining the number of random points needed to estimate expected distances for EDA has apparently not been recognized by most researchers. I conducted a literature search yielding 41 studies published from 2003 to 2011 (see Appendix S1) that used EDA to study animal habitat use. Only 5 (12%) reported testing whether a sufficient number of random points was used (Van Etten, Wilson & Crabtree 2007; Moyer, McCown & Oli 2008; Obbard *et al*. 2010; Onorato *et al*. 2011) or otherwise accounted for variability due to random sampling (Parra 2006), whereas 9 (22%) did not report the number of random locations used at all (Appendix S1). Most researchers seem to have arbitrarily selected a number of random points for the analysis and assumed that this number was ‘numerous’ *sensu* Conner & Plowman (2001). Thus, the potential sensitivity of EDA to sampling error associated with low and arbitrary numbers of random locations has not been adequately addressed by the majority of researchers employing the technique. Exhaustive testing (i.e. for each animal and all habitat types) is necessary to ensure that robust (i.e. repeatable) expected distances are produced, but such testing is computationally intensive and extremely time-consuming, which probably explains why most studies have not done it.

Herein, I propose a systematic approach for calculating habitat availability for habitat selection studies that eliminates the uncertainty associated with random sampling and the need for time-consuming testing. To demonstrate the sensitivity of use-availability analyses to the number of random points used, investigate the potential for spurious results with insufficient random sampling, and evaluate the systematic approach, I re-analysed data from a previous study by Kautz *et al*. (2006) that employed EDA to study habitat selection of the highly endangered Florida panther (*Puma concolor coryi*). Specifically, I conducted EDA with the same panther location data, but calculated expected distances with increasing numbers of random points and with the systematic approach. I hypothesized that: (i) expected distances based on different sets and sample sizes of random points would vary, and this variation would lead to different statistical results in EDA; (ii) mean expected distances from random sampling would stabilize (*sensu* Conner & Plowman 2001) at greater numbers of points and would approach expected distances calculated with the proposed systematic approach; and (iii) statistical results with greater numbers of random points (i.e. once random sampling was sufficient) would be similar to those obtained using the systematic approach. My results will provide guidance to future studies to ensure that rigorous, repeatable results can be efficiently obtained when employing use-availability habitat selection analyses such as EDA.
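As a rough illustration of hypothesis (ii), the sketch below compares random-point estimates of an expected distance with a systematic estimate. It assumes that "systematic" means evaluating the distance at every node of a regular lattice covering the study area, which is a simplifying assumption made here for illustration; the study area, habitat patch, lattice spacing and function names are likewise hypothetical.

```python
import math
import random

def nearest_distance(x, y, habitat_cells):
    """Distance from (x, y) to the nearest unit cell (lower-left corner given)."""
    best = float("inf")
    for cx, cy in habitat_cells:
        dx = max(cx - x, 0.0, x - (cx + 1.0))
        dy = max(cy - y, 0.0, y - (cy + 1.0))
        best = min(best, math.hypot(dx, dy))
    return best

def expected_distance_systematic(habitat_cells, extent, spacing):
    """Mean distance over a regular lattice of points (assumed systematic design)."""
    xmin, ymin, xmax, ymax = extent
    nx = int(round((xmax - xmin) / spacing))
    ny = int(round((ymax - ymin) / spacing))
    total = 0.0
    for i in range(nx):
        for j in range(ny):
            # Lattice points sit at the centre of each spacing x spacing square.
            x = xmin + (i + 0.5) * spacing
            y = ymin + (j + 0.5) * spacing
            total += nearest_distance(x, y, habitat_cells)
    return total / (nx * ny)

def expected_distance_random(habitat_cells, extent, n_points, rng):
    """Mean distance from n_points uniformly random locations."""
    xmin, ymin, xmax, ymax = extent
    total = 0.0
    for _ in range(n_points):
        total += nearest_distance(rng.uniform(xmin, xmax),
                                  rng.uniform(ymin, ymax),
                                  habitat_cells)
    return total / n_points

# Hypothetical study area and habitat patch (assumed values, not real data).
extent = (0.0, 0.0, 10.0, 10.0)
habitat_cells = [(cx, cy) for cx in range(2, 5) for cy in range(6, 9)]

# The systematic estimate involves no random draw, so it is repeatable.
systematic = expected_distance_systematic(habitat_cells, extent, spacing=0.1)

# Mean absolute deviation of 10 random-point estimates from the systematic one.
rng = random.Random(7)
errors = {}
for n in (10, 100, 1000):
    reps = [expected_distance_random(habitat_cells, extent, n, rng) for _ in range(10)]
    errors[n] = sum(abs(r - systematic) for r in reps) / len(reps)

for n in sorted(errors):
    print(n, round(errors[n], 3))
```

Because the lattice estimate is deterministic for a given spacing, repeating the calculation returns the same value, whereas each random-point estimate differs between draws and is expected to approach the lattice value only as the number of points grows.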