#### Study material

*Calochortus albus* is herbaceous, with one ribbonlike basal leaf developing early each spring from a perennial bulb. Later in the spring, a single flowering stalk is produced, from which hang nodding, globe-shaped flowers (Fig. 1) that vary regionally in color from white to pale pink to red (Ownbey 1940). *C. albus* does not spread vegetatively. It is protandrous, with anthers of a given flower dehiscing before the stigma is receptive. Individuals are visited and pollinated exclusively by bees (mostly *Bombus*) and, although they are self-compatible, protandry and pollinator observations by Jokerst (1981) suggest that selfing is rare.

*Calochortus albus* is a member of the Bay Area clade, a group of 10 species centered around San Francisco Bay, but ranging into more distant portions of the Coast Ranges, the Sierra Nevada, and the Cascades (Patterson and Givnish 2004). *C. albus* is the most widespread member of the Bay Area clade in California, and occurs in oak woodlands in the Coast Ranges south of San Francisco Bay, the foothills of the northern and central Sierras, the western Transverse Ranges, and the northern Peninsular Ranges.

#### Sampling

During spring 2006, 20 populations of *C. albus* were sampled at Henry Coe State Park (37°11′N, 122°33′W), a 350-km^{2} California state park located southeast of San Jose in the interior South Coast Ranges (Fig. 2). Sixteen sample sites were located in oak woodlands on north-facing slopes throughout the western section of the park along Flat Frog Trail, Forest Trail, and Poverty Flat Road; four other sites were located in the southern section, along Grizzly Gulch Trail and Hunting Hollow Road. Although the 16 northern sites form a nearly one-dimensional array (Fig. 2), oak-woodland habitat for *C. albus* is strongly two-dimensional there, covering large areas of the north-facing slopes. Distances between populations ranged from 0.07 to 14.4 km, and distances between plants within populations varied from 0.2 to ca. 50 m. Within this area, *C. albus* was extensively distributed and especially common on north-facing slopes. In each population, leaf material was collected from 10 to 20 individuals and preserved in silica gel, sampling a total of 254 plants. Universal Transverse Mercator coordinates of one individual per population were determined using a high-precision GPS (Leica SR530, St. Gallen, Switzerland; 1 cm horizontal precision). Coordinates of all other individuals were determined based on their bearing and distance from the focal plant using a compass and a sonic range-finder (Sonic Multi-Measure™ ComboPro #10300, Charlotte, NC; 10 cm horizontal precision). In each study population, and at 10 sites randomly located between populations, reproductive and nonreproductive individuals were counted along three 1 m × 10 m transects, to produce estimates of the density of reproductives and nonreproductives.
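The conversion from a compass bearing and a range-finder distance to plant coordinates can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes bearings are measured clockwise from grid north (ignoring magnetic declination and grid convergence), treats the ground as flat, and uses a hypothetical function name and made-up focal coordinates.

```python
import math

def offset_utm(easting_m, northing_m, bearing_deg, distance_m):
    """Return (easting, northing) of a plant located at a given bearing and
    horizontal distance from a focal plant with known UTM coordinates.

    Assumption: bearing is in degrees clockwise from grid north.
    """
    theta = math.radians(bearing_deg)
    return (easting_m + distance_m * math.sin(theta),
            northing_m + distance_m * math.cos(theta))

# Hypothetical focal plant; a second plant lies 10 m due east of it.
east, north = offset_utm(600000.0, 4100000.0, 90.0, 10.0)
```

Because positional error accumulates from both the compass and the range-finder, within-population coordinates derived this way are less precise than the GPS fix of the focal plant.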

#### AFLP data generation

High-quality genomic DNA was extracted using a DNeasy^{®} 96-well plant extraction kit (Qiagen, Valencia, CA). AFLP data were generated following the protocols of Myburg et al. (2001), with minor modifications for optimization with an ABI 3100 capillary sequencer (Applied Biosystems, Foster City, CA). Genomic DNA samples were digested with the restriction endonucleases *Eco*RI and *Mse*I in a 10 μL reaction containing 83 ng DNA, 0.05 μL 100 ng/100 μL BSA, 5 U *Eco*RI, and 5 U *Mse*I and incubated at 37°C for 2 h. Double-stranded adapters were then ligated to the resulting digestion fragments. Double-stranded adapters were produced by mixing equal volumes of equimolar solutions of two oligos for both the M (5′ GAC GAT GAG TCC TGA G 3′ and 5′ TAC TCA GGA CTC AT 3′) and the E (5′ CTC GTA GAC TGC GTA CC 3′ and 5′ ATT TGG TAC GCA GTC TAC 3′) adapters. These solutions were then incubated at 95°C for 5 min and allowed to cool 1°C per minute at room temperature. A volume of 0.19 μL of both the M and the E adapters, along with 3.52 μL ddH_{2}O, 1 μL 10 × ligase buffer, and 0.4 U T4 DNA ligase were added to the 5 μL digestion reaction and incubated at 16°C overnight. Ligation products were then diluted at a ratio of 17 μL ligation product to 70 μL ddH_{2}O.

Pre-selective amplification reactions were carried out using the dilute ligation products and primers complementary to the adapters but extending one additional, specified base in the 3′ direction. These 25 μL reactions contained 2.5 μL 10 × buffer, 1.5 μL 25 mmol/L MgCl_{2}, 2 μL 2.5 mmol/L (each) dNTPs, 0.38 μL M+C primer, 0.38 μL E+A primer, 1.5 U *Taq*, 5 μL dilute ligation product, and 13 μL ddH_{2}O. Reactions were then cycled under the following conditions: 72°C for 60 sec; 20 cycles of 94°C for 50 sec, 56°C for 60 sec, 72°C for 120 sec; 72°C for 120 sec. Pre-selective amplification products were then diluted with ddH_{2}O at a ratio of 40 μL:720 μL.
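For readers setting up many pre-selective reactions at once, the per-reaction volumes listed above can be scaled into a master mix. The sketch below is illustrative only: the 10% pipetting overage is an assumption not stated in the protocol, *Taq* is omitted because it is specified in units rather than volume, and the template is added to each tube separately.

```python
# Per-reaction volumes (uL) taken from the pre-selective recipe in the text.
PER_REACTION_UL = {
    "10x buffer": 2.5,
    "25 mmol/L MgCl2": 1.5,
    "2.5 mmol/L dNTPs": 2.0,
    "M+C primer": 0.38,
    "E+A primer": 0.38,
    "ddH2O": 13.0,
}
TEMPLATE_UL = 5.0    # dilute ligation product, added per tube
REACTION_UL = 25.0   # stated final reaction volume

def master_mix(n_reactions, overage=0.10):
    """Scale each component to n reactions plus a fractional overage.

    The default overage of 0.10 is an assumption, not part of the protocol.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}

# Volume not accounted for by the listed components and template.
unaccounted_ul = REACTION_UL - TEMPLATE_UL - sum(PER_REACTION_UL.values())
plate = master_mix(96, overage=0.0)
```

The listed components and template sum to 24.76 μL, so roughly 0.24 μL of the 25 μL reaction remains, presumably the enzyme volume.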

Three rounds of selective amplifications were performed using dilute pre-selective amplification products and three different primer pair combinations. All of the primers used in this final selective amplification were complementary to those used in the pre-selective amplification but extended an additional two or three bases in the 3′ direction. The primer pair combinations used were as follows: M+CCAG and E+ATT, M+CTT and E+ACT, and M+CCCG and E+AGC. In each primer pair the E+A– primer was fluorescently labeled. These 25 μL reactions contained 2.5 μL 10 × buffer, 1.5 μL 25 mmol/L MgCl_{2}, 3 μL 2.5 mmol/L (each) dNTPs, 0.5 μL deionized formamide, 1.25 μL M+C– primer, 0.25 μL labeled E+A– primer, 1.25 U *Taq*, and 10.75 μL ddH_{2}O. The reactions were then exposed to the following cycling conditions: 10 cycles of 94°C for 50 sec, 65°C for 60 sec (decreasing by 1°C each cycle), 72°C for 120 sec; then 20 cycles of 94°C for 50 sec, 56°C for 60 sec, 72°C for 120 sec; then 72°C for 10 min.
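The touchdown cycling program described above can be written out step by step, which makes the annealing schedule explicit. This is a sketch under one assumption: the first touchdown cycle anneals at 65°C and each later cycle drops by 1°C, so the tenth cycle anneals at 56°C and matches the plateau phase. Function and variable names are illustrative.

```python
def touchdown_program(start_anneal_c=65, final_anneal_c=56,
                      touchdown_cycles=10, plateau_cycles=20):
    """Return the selective-amplification program as (temp_C, seconds) steps."""
    steps = []
    for cycle in range(touchdown_cycles):
        # Annealing drops 1 C per cycle (assumes cycle 1 anneals at 65 C).
        anneal = start_anneal_c - cycle
        steps += [(94, 50), (anneal, 60), (72, 120)]
    for _ in range(plateau_cycles):
        steps += [(94, 50), (final_anneal_c, 60), (72, 120)]
    steps.append((72, 600))  # final 10 min extension
    return steps

program = touchdown_program()
# Every second step within each three-step cycle is the annealing step.
anneal_temps = [t for i, (t, _) in enumerate(program[:-1]) if i % 3 == 1]
hold_seconds = sum(s for _, s in program)  # excludes ramp times
```

Under these assumptions the programmed hold time is 7500 sec (about 2 h 5 min), not counting temperature ramps.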

Selective amplification products were cleaned using magnetic beads (CleanSeq™, Agencourt, Beverly, MA) and run on an ABI 3100 capillary sequencer using a fluorescent internal standard in each lane (Geneflow™ 625, Chimerx, Milwaukee, WI). Chromatograms were analyzed using GeneMarker (SoftGenetics LLC, State College, PA) to generate 0/1 matrices of fragments 100–300 bp in length for the M+CCAG/E+ATT and M+CCCG/E+AGC primer pair combinations and 100–400 bp in length for the M+CTT/E+ACT primer pair combination.

#### Analysis of spatial genetic structure

Individual AFLP bands were each assumed to represent one locus with two alleles. The presence of a band thus indicated either a heterozygote or dominant homozygote at that locus, while the absence of a band indicated a recessive homozygote. Spatial genetic structure was assessed by calculating the slope of pairwise kinship coefficients (Hardy 2003) against the logarithm of distance between individuals, using the software program *SPAGeDi* 1.3 (Hardy and Vekemans 2009). The kinship coefficient was developed for dominant genetic markers and thus requires an estimate of the inbreeding coefficient, but is robust to moderate errors in that coefficient (Hardy 2003). Given the strong protandry seen in *C. albus,* we conducted calculations assuming Hardy–Weinberg conditions and an inbreeding coefficient of zero. Pairs of samples were binned into nine logarithmically spaced distance classes: 0–3 m, 3–9 m, 9–27 m, 27–81 m, 81–243 m, 243–729 m, 729–2187 m, 2187–6561 m, and 6561–19683 m. For each of these classes, average pairwise kinship values were plotted against ln distance to create a kinship-distance plot (Hardy 2003). Least-squares regressions were used to determine the slope of the regression in the kinship-distance plot using average values for distance classes across all 20 sites, and for the 16 northern sites only. For all pairwise comparisons of individual plants, Mantel tests based on 999 permutations of the data were used to determine whether regression slopes differed significantly from zero for plants from all 20 sites, and for those from the 16 northern sites only (Hardy 2003).
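The binning and regression steps can be illustrated with a short sketch. It is not the *SPAGeDi* implementation: it assumes the class upper bounds (powers of 3, in meters) are inclusive, takes precomputed pairwise kinship values as input, and fits an ordinary least-squares line to class means. All function names are hypothetical.

```python
import bisect
import math

# Upper bounds of the nine distance classes: 3, 9, 27, ..., 19683 m.
UPPER_BOUNDS_M = [3 ** k for k in range(1, 10)]

def distance_class(d_m):
    """Index (0-8) of the class containing d_m; upper bounds inclusive (assumption)."""
    if not 0 < d_m <= UPPER_BOUNDS_M[-1]:
        raise ValueError("distance outside 0-19683 m")
    return bisect.bisect_left(UPPER_BOUNDS_M, d_m)

def class_means(pairs):
    """pairs: iterable of (distance_m, kinship).
    Returns [(mean ln distance, mean kinship)] for each non-empty class."""
    sums = {}
    for d, f in pairs:
        s = sums.setdefault(distance_class(d), [0.0, 0.0, 0])
        s[0] += math.log(d)
        s[1] += f
        s[2] += 1
    return [(s[0] / s[2], s[1] / s[2]) for _, s in sorted(sums.items())]

def ols_slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy / sxx

# Synthetic pairs whose kinship declines linearly with ln distance.
demo = [(d, 0.10 - 0.02 * math.log(d)) for d in (2, 5, 20, 50, 200, 500)]
slope = ols_slope(class_means(demo))
```

With real data, the slope from this regression plays the role of b_{F} in the *Sp* statistic.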

To permit comparisons with results from other studies and to estimate neighborhood size (*N*_{b}), we calculated the *Sp* statistic (Vekemans and Hardy 2004). *Sp* is a measure of the strength of SGS, with high values indicating strong fine-scale structure. *Sp* is defined as

$$Sp = \frac{-b_{F}}{1 - F_{(1)}} \qquad (1)$$

where b_{F} = the slope of the regression of kinship on ln geographic distance and F_{(1)} = the average kinship between adjacent plants. The average kinship between plants falling into the first distance category (0–3 m) was used to estimate F_{(1)}. The *Sp* statistic was then used to estimate the root-mean-square distance of gene dispersal (σ) as

$$\sigma = \sqrt{\frac{1}{4\pi D_{e}\,Sp}} \qquad (2)$$

where *D*_{e} is effective population density and σ is the root-mean-square distance of gene dispersal (Vekemans and Hardy 2004). Neighborhood size (*N*_{b}) (Wright 1943, 1969) was calculated as the inverse of *Sp* (Vekemans and Hardy 2004). Estimates of σ and *N*_{b} were made using an iterative procedure to estimate σ based on the genetic structure over a restricted distance range. Equations (1) and (2) hold best over distances between σ and 20σ (Vekemans and Hardy 2004). Therefore, *SPAGeDi* applies an iterative regression procedure within this range, first calculating an *Sp* value from the slope of the regression of the kinship coefficient on ln distance over an arbitrarily chosen initial range of distances, and then using this *Sp* value to calculate σ according to equation (2). A restricted regression is then calculated over distances between σ and 20σ, and a new *Sp* value obtained based on the slope over this range. This procedure is repeated 100 times or until estimates of σ converge on a stable value (Hardy and Vekemans 2009), thus providing an estimate of the scale of gene dispersal at a given effective density as well as *N*_{b}. We confirmed, in each case, that the same estimates of σ, *Sp*, and *N*_{b} resulted when the iterative procedure was started using interplant distances of 10–200 m as when starting at interplant distances of 100–2000 m.
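The iterative estimation can be sketched as follows. This is an illustration, not the *SPAGeDi* code: it assumes the two-dimensional relation *N*_{b} = 4π*D*_{e}σ² = 1/*Sp* (Vekemans and Hardy 2004), with *Sp* = −b_{F}/(1 − F_{(1)}), and it expects a caller-supplied function (hypothetical here) that returns the regression slope of kinship on ln distance over a restricted distance range.

```python
import math

def sp_statistic(b_f, f1):
    """Sp from the regression slope b_f and first-class mean kinship f1."""
    return -b_f / (1.0 - f1)

def sigma_from_sp(sp, d_e):
    """Gene dispersal distance (m) assuming Nb = 4*pi*De*sigma^2 = 1/Sp,
    with d_e in plants per square meter."""
    return math.sqrt(1.0 / (4.0 * math.pi * d_e * sp))

def iterate_sigma(slope_over_range, f1, d_e,
                  start_range=(10.0, 200.0), max_iter=100, tol=1e-6):
    """Repeat the restricted regression over [sigma, 20*sigma] until sigma
    stabilises. Returns (sigma, Sp, Nb), or None if no stable estimate."""
    lo, hi = start_range
    sigma = None
    for _ in range(max_iter):
        sp = sp_statistic(slope_over_range(lo, hi), f1)
        if sp <= 0:
            return None  # kinship does not decline with distance
        new_sigma = sigma_from_sp(sp, d_e)
        if sigma is not None and abs(new_sigma - sigma) < tol:
            return new_sigma, sp, 1.0 / sp
        sigma = new_sigma
        lo, hi = sigma, 20.0 * sigma
    return None  # did not converge within max_iter

# Toy slope function: the same slope regardless of range (hypothetical values).
result = iterate_sigma(lambda lo, hi: -0.01, f1=0.05, d_e=0.1)
```

Running the procedure from two different starting ranges, as described above, is a simple check that the estimate does not depend on the arbitrary initial range.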

We calculated the mean ± SD of the densities of reproductive and nonreproductive individuals across all 20 sites surveyed. We estimated effective population density *D*_{e} as the density *D* of reproductive individuals times 0.5, 0.3, and 0.1, given that effective densities of natural plant populations often fall within this range (Husband and Barrett 1992; De-Lucas et al. 2009). We estimated σ, *Sp*, and *N*_{b} for a total of nine estimates of *D*_{e}, based on the mean density *D* of reproductives observed, plus or minus one standard deviation, multiplied by the factors 0.5, 0.3, or 0.1.
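The nine effective-density scenarios follow from crossing three census densities with three scaling factors. A minimal sketch, using hypothetical density values (the observed mean and SD are not given in this section):

```python
from itertools import product

def effective_densities(mean_d, sd_d, factors=(0.5, 0.3, 0.1)):
    """Return (census density, factor, effective density) for each of the
    3 x 3 combinations of mean - SD, mean, mean + SD and the scaling factors."""
    census = (mean_d - sd_d, mean_d, mean_d + sd_d)
    return [(d, f, d * f) for d, f in product(census, factors)]

# Hypothetical values in plants per square meter, for illustration only.
scenarios = effective_densities(mean_d=2.0, sd_d=0.5)
```

If the standard deviation exceeds the mean, the lowest census density becomes zero or negative and the corresponding scenarios cannot be used to estimate σ.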

We compared the values of the *Sp* statistic for *C. albus* with those of other herbaceous plants in the meta-analysis of Vekemans and Hardy (2004), to determine whether *C. albus* showed exceptionally short dispersal distances. Comparisons included the placement of taxa into one of four categories based on pollination mechanism and mode of seed dispersal: (1) animal pollination/gravity dispersal; (2) animal pollination/animal dispersal; (3) wind pollination/gravity dispersal; and (4) animal pollination/mixed animal and gravity dispersal. This classification permitted us to assess whether *C. albus* had exceptionally short dispersal distances given its ecology of pollen and seed dispersal.

#### Cluster analyses

We employed the Bayesian clustering algorithms in STRUCTURE v. 2.3.4 (Pritchard et al. 2000; Falush et al. 2003; Hubisz et al. 2009) to infer population structure and to assign individuals to clusters, based on multi-locus genetic data and minimization of Hardy–Weinberg disequilibrium within clusters. The estimation analyses assume different numbers of clusters *K*, and then compare the estimated log probability of data under each *K*, ln Pr(X|*K*). We conducted 20 replicate runs for all proposed values of *K* between 1 and 10, assuming dominant AFLP markers, admixture among clusters and individuals (α = 1), default allele frequency distribution (λ = 1), and correlated allele frequencies. Each run used 10^{4} iterations following a burn-in period of 5 × 10^{4} iterations. We estimated the number of clusters as the value of *K* with the greatest ln Pr(X|*K*), and then tested that using the Δ*K* procedure of Evanno et al. (2005). We compiled color-coded STRUCTURE plots of plant membership in individual cluster(s) to assess spatial population structure, plotting sample sites in order from west to east. We conducted a parallel set of analyses restricting attention solely to the northern 16 sample sites, excluding the large distances to the two pairs of southern sample sites (Fig. 2).
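The Δ*K* comparison can be sketched from the replicate ln Pr(X|*K*) values. The version below follows one common reading of Evanno et al. (2005): the absolute second-order difference of the mean ln Pr(X|*K*) across replicates, divided by the standard deviation of ln Pr(X|*K*) at that *K*. It is an illustration with made-up values, not the code used for the analysis.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """lnp_by_k: dict mapping K to a list of ln Pr(X|K) from replicate runs.
    Returns a dict of Delta K for every K that has both neighbours K-1 and K+1.
    Values of K with zero variance among replicates are skipped."""
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    out = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            sd = stdev(lnp_by_k[k])
            if sd > 0:
                second_diff = means[k + 1] - 2 * means[k] + means[k - 1]
                out[k] = abs(second_diff) / sd
    return out

# Made-up replicate values for K = 1..4 (two replicates each).
demo_lnp = {1: [-1000, -1002], 2: [-900, -902], 3: [-890, -892], 4: [-889, -891]}
dk = delta_k(demo_lnp)
best_k = max(dk, key=dk.get)
```

Note that Δ*K* is undefined at the smallest and largest *K* tested, so it cannot by itself support *K* = 1; that case has to be judged from ln Pr(X|*K*) directly.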