METHODOLOGICAL INSIGHTS: Rapid genetic delineation of provenance for plant community restoration



    Corresponding author
    1. Kings Park and Botanic Garden, Botanic Gardens and Parks Authority, Fraser Ave, West Perth, Western Australia, 6005, Australia, and School of Plant Biology, Faculty of Natural and Agricultural Sciences, The University of Western Australia, Nedlands, Western Australia, 6907, Australia

    1. Environmental Department, Alcoa World Alumina Australia, PO Box 252, Applecross, Western Australia, 6953, Australia

S. Krauss, Kings Park and Botanic Garden, Botanic Gardens and Parks Authority, Fraser Ave, West Perth, Western Australia, 6005, Australia (tel. +61 89480 3673; fax +61 89480 3641; e-mail


  1. Best practice in native plant community restoration and/or revegetation recognizes the importance of using material of local provenance. At the practical level, various guidelines exist but these have limitations. The challenge is to deliver accurate provenance information rapidly to the restoration industry.
  2. We demonstrate a novel approach to the rapid delineation of genetic provenance by utilizing minimal sampling, the power and efficiency of the AFLP DNA fingerprinting technique and a multivariate spatial autocorrelation analysis for four species with high priority for minesite revegetation in south-west Western Australia.
  3. A significantly positive genetic correlation was found between individuals in the smallest distance class for three species. Correlograms then stabilized, with no significant genetic correlation between individuals at any other distance class. A local genetic provenance distance was here defined as the distance where the correlogram goes from significant to non-significant, which was approximately 26 km, 20 km and 20 km for these three species.
  4. In contrast, for the fourth species, no significant genetic correlation was seen at any distance class, suggesting a very broad genetic provenance (up to 100 km), although the possibility of significant structure below the smallest distance class used here (20 km) cannot be dismissed.
  5. Whilst spatial autocorrelation has identified significant spatial genetic structure for 3 of 4 species assessed, a robust delineation of provenance distance, or patch size, is problematic because it is dependent on properties of sampling and analysis such as the scale of sampling and the choice of distance class size, especially when sample sizes are small.
  6. Synthesis and applications. The determination of seed collection zones for revegetation projects is a complex problem. We demonstrate a new approach for the rapid delivery of genetic provenance delineation for native plant community restoration, and provide recommendations for seed collection zones for each of four species. This approach can be applied to other species and other areas. We discuss the limitations of the approach, and conclude that further research is required to assess an appropriate minimal sampling strategy that leads to a more robust delineation of provenance distance. We also note that revegetation programmes provide an opportunity to experimentally assess the biological significance of the local provenance as defined, through an assessment of the relative performance of plants sourced from within and beyond defined provenances.


Best practice in native plant restoration and/or revegetation identifies the importance of using seed collected from the local provenance (Coates & van Leeuwen 1997; Mortlock 1999, 2000; Krauss et al. 2000; Sackville Hamilton 2001; although see Wilkinson 2001 for an argument against provenance). An introduction of non-local material can have significant consequences for the short- and/or long-term success of restoration activities. First, non-local genotypes may be at a severe disadvantage if genetic differences between provenances are the result of local adaptation (Knapp & Dyer 1998; van Andel 1998; Jones, Hayes & Sackville Hamilton 2001). For example, mortality of non-local plants of Nassella pulchra, a grass widely used in restoration projects in California, was as much as four times that of local plants after 3 years (Knapp & Dyer 1998). Secondly, this maladaptation can be transferred from introduced non-local plants to local populations and expressed through outbreeding depression in subsequent generations (Fenster & Dudash 1994; Keller, Kollman & Edwards 2000). These potential outcomes lead to restoration inefficiencies and possibly failure. A third issue is the potential for genetic swamping of a local gene pool by the introduction of a non-local gene pool, leading to a loss of biodiversity (Rieseberg 1991; Sackville Hamilton 2001). These provenance issues are particularly important in areas of exceptionally high plant diversity, such as south-west Western Australia (Hopper 1998), an international plant biodiversity hotspot (Myers et al. 2000). Here, a high turnover of species and marked variation within species is a feature of the landscape, due in part to the persistence of the flora, probably well into the Cretaceous, without any large scale extinction episodes associated with glaciation (Hopper et al. 1996; Coates & Atkins 2001). While glaciation and permafrost associated with Quaternary ice ages have produced significant changes in species distributions in Europe and North America (Williams et al. 1998; Hewitt 2000), the impacts were much less severe in Australia, albeit with substantially drier and cooler conditions at glacial maxima (Kershaw, Moss & van der Kaars 2003).

Consequently, restorationists often ask ‘How far from my restoration site can I collect seed or other plant material before it compromises the success (as defined above) of revegetation effort at my site?’. How do restoration/revegetation practitioners identify the local provenance for seed collection? Various guidelines exist to direct practitioners in seed collection strategies restricted to local provenance (Mortlock 1999, 2000). These guidelines stress the importance of, for example, habitat matching, life-history variables, known taxonomic boundaries and geographical proximity, which can all be critical considerations. However, these guidelines still only provide a vague ‘best guess’, and strict application can lead to error in provenance delineation. For example, habitat matching may be difficult in highly, yet subtly, variable landscapes. Rather than a simple correlation between geographical and genetic distance, many species show complex relationships, particularly in ancient landscapes such as Western Australia (Hopper et al. 1996). Ultimately, the history of a population, and the landscape within which it exists, are critical factors influencing the genetic relationships of populations that are not necessarily predictable from these guidelines.

The use of genetic markers, in combination with the above, is currently the most accurate approach available for the rapid delineation of provenance (Krauss et al. 2000). Numerous genetic marker techniques are available to achieve this objective (Karp, Isaac & Ingram 1998; Glaubitz & Moran 2000). Of these markers, one with particular utility for the practical delineation of provenance is Amplified Fragment Length Polymorphism (AFLP) (Vos et al. 1995). AFLP is based on the highly stringent selective PCR amplification of restriction fragments from a total digest of genomic DNA. In principle, AFLP can rapidly generate a DNA fingerprint for any plant specimen without prior information on the genome. In most cases, the DNA fingerprints generated are unique to individual genotypes. AFLP achieves a relatively wide sampling of the genome with minimal marker development time and is sensitive to the detection of population genetic differentiation (Mueller & Wolfenbarger 1999). Consequently, AFLP has now been widely used in the assessment of population genetic variation in many plant species (Mueller & Wolfenbarger 1999). However, AFLP is technically demanding and excellent quality of DNA is critical, which can create significant problems when attempting a mass screening effort of plants from the wild, especially for many Australian natives (Byrne 2001). Large genomes can also cause difficulties for AFLP, although this difficulty can often be overcome through the use of larger primers (Cato, Corbett & Richardson 1999).

For the spatial analysis of intraspecific genetic variation, a common approach is to sample a sufficiently large number of individuals from each of many populations from the range of a species to identify levels and partitioning of genetic variation and the genetic distance among populations (Wright 1978). The phenetic relationships and extent of differentiation among populations can then be visualized graphically by cluster analysis or, preferably, by ordination techniques such as multidimensional scaling (Lessa 1990). However, these detailed studies can be relatively labour intensive, time consuming and costly. To fulfil the demands of a restoration industry operating in a highly diverse landscape for practical yet powerful provenance information, an alternative approach was required. In this study, we combine the power of AFLP with a minimal sampling strategy and analysis of the spatial genetic variation by way of spatial autocorrelation to identify a provenance distance for key species targeted for revegetation.

Briefly, spatial autocorrelation analysis assesses the similarity between samples for a given variable as a function of spatial distance (Sokal & Oden 1978; Heywood 1991; Epperson 1993; Diniz-Filho & Telles 2002, and references therein). Results are then presented graphically in a correlogram, which shows the genetic correlation between individuals as a function of distance. Because the correlogram is a summary of spatial genetic variation, it can be used to delineate a geographical distance below which the similarity of samples is high relative to the scale of sampling. For example, the first x-axis intercept can be considered to approximate a ‘patch’ size diameter and, especially in the case of a stabilizing profile, can be considered a single genetic unit for conservation or management (Diniz-Filho & Telles 2002). The patch size diameter is the distance at which samples become genetically independent, which is here interpreted as the delineation of provenance distance. Spatial autocorrelation analysis can be applied to any scale and from continuously or discretely distributed populations (Escudero, Iriondo & Torres 2003). Whilst many studies have employed spatial autocorrelation analysis at a scale of metres to assess genetic variation within populations, it has been validly applied at scales of tens to thousands of kilometres to assess conservation units across a wide species range (Diniz-Filho & Telles 2002).

The spatial autocorrelation analysis performed here follows the new procedures of Smouse & Peakall (1999), which allow an intrinsically multivariate analysis of spatial genetic structure specifically for multilocus data sets. The procedure treats the genetic data set as a whole, strengthening the spatial signal and reducing the stochastic (allele to allele, locus to locus) noise (Smouse & Peakall 1999). As such, this approach overcomes a lack of sensitivity in the more familiar Moran I-statistic (Cliff & Ord 1981).

The species analysed here are the first of many targeted for use in minesite revegetation programmes that ultimately aspire to maximize the re-creation of pre-mining species composition. These species have been prioritized in part due to the difficulty of collecting sufficient numbers of seed locally. The restoration industry urgently requires accurate data on provenance for many species. In response to this need, we apply here a powerful molecular technique with an efficient sampling methodology and a multivariate approach to spatial autocorrelation analysis to expeditiously address the question of how far from the restoration site seed collections can be made whilst remaining within the local provenance.


the restoration site landscape

This study was commissioned by Alcoa World Alumina Australia and Worsley Alumina Pty Ltd. Alcoa operates two bauxite mines at Willowdale and Huntly in the Darling Range of south-western Western Australia, 80–140 km south of Perth. Worsley's mine at Mt Saddleback is 130 km south-east of Perth. The mines are all in State forest which is managed by State Government authorities. The current rehabilitation objective of both companies is to re-establish a stable forest ecosystem planned to maintain agreed forest values, which include conservation, timber production, hydrology and recreation. To reinstate the conservation value and recreation value it is necessary to re-establish the biodiversity of the jarrah forest.

The region has a Mediterranean climate with cool, wet winters and hot, dry summers. Average annual rainfall is 1200 mm at both of Alcoa's mines and 740 mm at Mt Saddleback. Average summer monthly maximum temperatures are approximately 28 °C at Alcoa's sites and 31 °C at Mt Saddleback. Average winter minima are 5 °C at all sites. Typically the mining areas and the surrounding forest soils are ferruginous gravels with sandy clay subsoils (Churchward & McArthur 1980).

The sclerophyll vegetation of the areas within the mining envelopes at each mine is generally dominated by jarrah Eucalyptus marginata Sm. with varying amounts of marri Corymbia calophylla (Lindl.) K.D. Hill & L.A.S. Johnson. In addition, there is a small-tree component, of which bull banksia Banksia grandis Willd., sheoak Allocasuarina fraseriana (Miq.) L.A.S. Johnson, snottygobble Persoonia longifolia R. Br. and woody pear Xylomelum occidentale R. Br. are the most common species. The undergrowth consists of sclerophyllous shrubs to 3 m high, predominantly from the families Liliaceae sensu lat., Leguminosae, Orchidaceae, Apiaceae, Ericaceae, Asteraceae, Proteaceae, Restionaceae and Cyperaceae. A strong feature of the jarrah forest is the relative homogeneity of both the overstorey and the understorey species. Many of the common jarrah forest plant species have a contiguous distribution over the entire extent of the forest (c. 300 km north to south and 100 km west to east).

The bauxite ore is relatively shallow, averaging 4–5 m deep at Willowdale and Huntly and 6–8 m deep at Mt Saddleback. The mine pits range in size from one to tens of hectares and are surrounded by relatively intact forest. Integration of the restored sites with the surrounding populations of native flora is therefore a critical issue. After timber harvest, the mining sequence involves: clearing the remaining vegetation, removing the soil, blasting the cemented bauxite layer or ripping it with a bulldozer, and removing and crushing the bauxite before transporting it to the refineries.

The rehabilitation process starts with shaping the mine pit to produce a landscape that blends with the surrounding forest. Soil is returned to the mine pit and the pit is ripped using a bulldozer – ripping breaks up compacted ground which reduces the risk of erosion and improves tree growth. Seeds of local plants are spread throughout the rehabilitated mine pit. Planting of nursery grown plants is also carried out for species where seed is not a viable method of establishment. These are grown from seed in pots, from cuttings or for the most difficult or ‘recalcitrant’ species, by tissue culture. A fertiliser mix is then applied in late winter or early spring using a helicopter.

The four species targeted in this study were Dryandra lindleyana Meisn. In Lehm., Lomandra hermaphrodita (C.R.P. Andrews) C.A. Gardner, Bossiaea ornata (Lindl.) Benth. and Lechenaultia biloba Lindl. Dryandra lindleyana (Proteaceae) is a common insect-, bird- and mammal-pollinated shrub that grows to 3 m high, and is widespread throughout the jarrah forest on a range of soils. Lomandra hermaphrodita (Dasypogonaceae) is a dioecious, rhizomatous, caespitose, wind-pollinated perennial herb, 0·06–0·2 m high growing over sand or laterite, with a distribution extending throughout the jarrah forest. Bossiaea ornata (Papilionaceae) is a common and widespread upright, weakly stemmed, insect-pollinated shrub growing to 1 m high. Lechenaultia biloba (Goodeniaceae) is a diffuse, ascending shrub, 0·15–1 m high with distinctive large blue flowers that are insect-pollinated. It is common on lateritic or granitic soils, and has a wide distribution throughout the south-west of Western Australia.


For each species, one plant from each of up to 36 locations was randomly sampled in November 2001. Sampling involved the collection of plant material for DNA extraction and recording of geographical co-ordinates. Locations were approximately evenly spaced throughout a 110 km (north–south) by 50 km (east–west) area, bounded approximately by Boddington, Dwellingup, Willowdale and Jarrahdale. Linear geographical distances between pairs of locations ranged from 4 km to 106 km, and averaged 42 km. Plant material was collected and kept fresh in wet newspaper over ice, or stored in vials with silica gel. All locations occurred up to 60 km from the minesites, and represented a potential source population of seed for revegetation.


Upon returning to the laboratory, fresh material was stored at 4 °C until extraction of DNA. Within 1 week of delivery, DNA was extracted from fresh leaf samples or from silica dried samples using the Qiagen DNeasy extraction kit (Qiagen, Valencia, USA), according to the manufacturer's instructions, or a modified CTAB procedure (Bossiaea ornata). The purity and quantity of the extracted DNA were assessed visually following agarose gel electrophoresis and ethidium bromide staining, and from the 260 nm/280 nm absorbance ratio measured using a GeneQuant spectrophotometer (Gene Quant, Foster City, USA). Multi-locus DNA fingerprints were generated for each sample using AFLP with fluorescently labelled primers and an ABI (Foster City, USA) Prism 377 automated sequencer. AFLP procedures were as set out previously (Krauss 1999), except that the primer pairs used were m-cac/e-agg for Dryandra lindleyana and Lechenaultia biloba, m-ctg/e-agg for Bossiaea ornata and m-cac/e-act for Lomandra hermaphrodita. These final primer pairs were chosen as generating the best fingerprints from a larger trial set of primer pairs. AFLP profiles were scored for the presence/absence of strong and consistent markers between 70 and 500 base pairs using GeneScan® software (GeneScan, Foster City, USA). Replicate DNA fingerprints were generated for a subset of samples from each species to confirm the reliability of markers scored. For each species, the genetic distance between each pair of samples was calculated as a simple count of the number of presence/absence polymorphisms (Manhattan distance).
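The pairwise genetic distance described above can be sketched in a few lines. This is a minimal illustration only, not the scoring pipeline used in the study; the function names and the toy presence/absence profiles are hypothetical.

```python
from itertools import combinations


def manhattan_distance(profile_a, profile_b):
    """Count markers whose presence/absence state differs between two samples."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be scored for the same markers")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def pairwise_distances(profiles):
    """Return {(i, j): distance} for every unordered pair of samples."""
    return {
        (i, j): manhattan_distance(profiles[i], profiles[j])
        for i, j in combinations(range(len(profiles)), 2)
    }


# Hypothetical toy data: three samples scored for five markers (1 = present).
toy_profiles = [
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]
toy_distances = pairwise_distances(toy_profiles)
```

For 0/1 data this count equals the squared Euclidean distance between the two profiles, because each mismatched marker contributes exactly 1 to both.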

multi-locus spatial autocorrelation analysis

Spatial genetic structure was assessed using spatial autocorrelation analysis in the GenAlEx program (Peakall & Smouse 2003). Geographic distance between each pair of locations was assessed from x/y co-ordinates expressed as kilometres from the south-west corner of the total area sampled (Table 1). The spatial autocorrelation analysis follows the procedures of Smouse & Peakall (1999), which allow the multivariate analysis of individual spatial genetic structure for multilocus data sets. Distance classes were chosen so that there was a minimum of 30 pairs of individuals in each distance class. For all pairs of individuals within a distance class, a correlation coefficient r was calculated according to formula 15 in Smouse & Peakall (1999). The coefficient r is a proper correlation coefficient, with a mean of 0 when there is no autocorrelation and bounded by −1 and 1 (Smouse & Peakall 1999), and is closely related to Moran's I. By treating the genetic data set as a whole, the spatial signal is strengthened by reducing the stochastic (allele to allele, locus to locus) noise (Smouse & Peakall 1999). Results are presented graphically in a correlogram, which shows the genetic correlation as a function of distance.
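To make the calculation of r concrete, the sketch below follows our reading of the Smouse & Peakall (1999) approach: the matrix of squared genetic distances is double-centred into a covariance matrix, and r for a distance class is the sum of covariances over in-class pairs divided by the matching sum of variances. It is an illustration under these assumptions, not a transcription of GenAlEx; the function names, coordinates and toy distances are hypothetical, and the original formula should be consulted before any real use.

```python
import math


def covariance_matrix(sq_dist):
    """Double-centre a squared genetic distance matrix into covariances."""
    n = len(sq_dist)
    row_mean = [sum(row) / n for row in sq_dist]
    grand_mean = sum(row_mean) / n
    return [
        [0.5 * (-sq_dist[i][j] + row_mean[i] + row_mean[j] - grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


def autocorrelation_r(sq_dist, coords_km, lower_km, upper_km):
    """Correlation r for pairs separated by lower_km <= distance < upper_km."""
    cov = covariance_matrix(sq_dist)
    n = len(coords_km)
    numerator = 0.0    # sum of c_ij over ordered in-class pairs
    denominator = 0.0  # c_ii counted once per in-class pairing of individual i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            separation = math.hypot(coords_km[i][0] - coords_km[j][0],
                                    coords_km[i][1] - coords_km[j][1])
            if lower_km <= separation < upper_km:
                numerator += cov[i][j]
                denominator += cov[i][i]
    if denominator == 0.0:
        raise ValueError("no informative pairs in this distance class")
    return numerator / denominator


# Hypothetical toy data: two tight geographic clusters, each genetically uniform.
toy_coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 10.0), (1.0, 10.0)]
toy_sq_dist = [
    [0, 0, 4, 4],
    [0, 0, 4, 4],
    [4, 4, 0, 0],
    [4, 4, 0, 0],
]
r_near = autocorrelation_r(toy_sq_dist, toy_coords, 0.0, 5.0)
r_far = autocorrelation_r(toy_sq_dist, toy_coords, 5.0, 20.0)
```

In this toy case nearby individuals are identical and distant ones maximally different, so r reaches its bounds of 1 in the near class and −1 in the far class.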

Table 1.  Sampled locations and their north and east co-ordinates (from the south-west corner of the total area sampled) for each of four species assessed for provenance in the jarrah forest south-east of Perth, Australia
Location      North (km)   East (km)   Dryandra lindleyana   Lechenaultia biloba   Lomandra hermaphrodita   Bossiaea ornata
Trees               0        29·68
Godfrey             0·63     39·37
Stockyard           6·53     34·74
Stene               9·68     42·53
Bednall            12·42     30·11
Kent               15·58      1·89
Hoffman            22·74      3·58
Quindanning        23·58     45·47
Tumlo              26·95      9·47
Driver             27·37      3·16
Saddleback         29·68     40·63
George             32·84     26·74
Federal            33·05      3·79
Taree              38·32     20·63
Nanga              41·47      5·89
Marradong          41·68     35·79
Plavins            45·26     12·84
Amphion            47·16     17·47
Hedges             48·00     25·89
Holmes             54·95      0
Wells              56·42     24·63
Holyoake           57·44      5·30
Turner             64·84      1·68
Gyngoorda          65·68     42·53
Urbrae             68·42      3·79
O’Neill            71·58     14·95
Boonerring         74·32     28·00
Wilson             74·74      4·63
Lang               80·42     18·11
Clinton            81·68      6·11
Karnet             86·11      6·11
Cooke              86·74     25·68
Gibbs              89·68     40·00
Serpentine         92·63      6·74
Cobiac             93·47     15·16
Mundlimup          97·05      6·74
Gordon            104·21      4·21
Churchman         109·89      6·95

Upper and lower confidence limits, as generated by 1000 random permutations of the data, bound the 95% confidence interval about the null hypothesis of no spatial structure. These permutations are equivalent to shuffling the individual genotypes among locations and recomputing r. This generates an estimate of r about the null hypothesis of no spatial structure. After 1000 permutations, the permuted values are sorted and the 25th and 975th values taken to define the lower and upper values of the 95% confidence interval. Correlation values outside the confidence interval are considered to be statistically significant at P < 0·05. In addition, 95% confidence intervals about r were estimated by bootstrapping, obtained by drawing with replacement from pairwise comparisons within a given distance class. For each of 1000 bootstraps, the r-value for the bootstrap was calculated for each distance class. These were sorted from smallest to largest and the 25th and 975th ranked values were taken to define the bounds of the 95% confidence interval. When the 95% confidence interval contains zero, the null hypothesis of no spatial structure is not rejected. This bootstrap test is less powerful than permutational tests, because the number of samples per distance class is much smaller than the n(n – 1)/2 comparisons used during permutation. Therefore, for small sample sizes, bootstrap errors tend to be larger than the permutational errors, and consequently bootstrap tests are more conservative than permutational tests (Peakall et al. 2003).
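The two resampling schemes can be sketched generically as follows. This is a minimal illustration, assuming the statistic of interest is supplied as a function; it is not GenAlEx code, and the function names, the toy transect statistic and the toy pair values are all hypothetical.

```python
import random


def interval_95(values):
    """Take the 25th and 975th ranked values per 1000 sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    lower_rank = max(1, n * 25 // 1000)
    upper_rank = max(lower_rank, n * 975 // 1000)
    return ordered[lower_rank - 1], ordered[upper_rank - 1]


def permutation_null_interval(statistic, n_individuals, n_perm=1000, seed=0):
    """Null interval: shuffle genotypes among locations and recompute."""
    rng = random.Random(seed)
    null_values = []
    for _ in range(n_perm):
        order = list(range(n_individuals))
        rng.shuffle(order)  # order[k] = genotype now placed at location k
        null_values.append(statistic(order))
    return interval_95(null_values)


def bootstrap_interval(pairs, estimator, n_boot=1000, seed=0):
    """Interval about the estimate: resample pairs with replacement."""
    rng = random.Random(seed)
    boot_values = []
    for _ in range(n_boot):
        resample = [rng.choice(pairs) for _ in pairs]
        boot_values.append(estimator(resample))
    return interval_95(boot_values)


# Hypothetical toy statistic: genotypes 0..9 lie along a transect and
# neighbours are similar, so the statistic is high in the observed order.
def neighbour_similarity(order):
    gaps = [abs(order[k] - order[k + 1]) for k in range(len(order) - 1)]
    return -sum(gaps) / len(gaps)


observed = neighbour_similarity(list(range(10)))
null_low, null_high = permutation_null_interval(neighbour_similarity, 10)

# Hypothetical per-pair values within one distance class, summarised by a mean.
toy_pairs = [0.2, 0.3, 0.25, 0.4, 0.35]
boot_low, boot_high = bootstrap_interval(toy_pairs, lambda xs: sum(xs) / len(xs))
```

An observed value above `null_high` would be read as significant positive structure under the permutation test, whereas the bootstrap interval is compared against zero.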

A single correlogram may not necessarily reveal the true extent of non-random genetic structure, because somewhat arbitrary decisions are required in defining the number and size of distance classes (Escudero et al. 2003; Peakall et al. 2003). A general methodological criterion includes a minimum of 30 pairwise comparisons within each distance class (Legendre & Fortin 1989), and a first distance class size less than approximately 1·5 times the square root of the inverse of the sampling density (Epperson & Chung 2001). Sampling at intervals greater than the true (but unknown) extent of genetic structure will fail to detect genetic structure at all, while sampling at intervals well below the scale of genetic structure may be associated with unnecessarily small sample sets and limited statistical power. Consequently, we performed numerous spatial autocorrelation analyses on each data set with various permutations of number and size of distance classes, while still consistent with the minimum criteria, and show here those correlograms that best detected significant genetic structure. We also show the variation in correlograms when distance size classes are changed for Lomandra hermaphrodita, as an example.
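As a rough worked example of these two criteria, the sketch below computes the upper bound on the first distance class and the number of pairwise comparisons available. It assumes, for illustration only, that sampling density is taken as the number of locations divided by the area of the 110 km by 50 km sampling rectangle; the function names are hypothetical.

```python
import math


def max_first_distance_class_km(area_km2, n_locations):
    """1.5 times the square root of the inverse of the sampling density."""
    sampling_density = n_locations / area_km2  # locations per square km
    return 1.5 * math.sqrt(1.0 / sampling_density)


def n_pairs(n_samples):
    """Number of unordered pairwise comparisons, n(n - 1)/2."""
    return n_samples * (n_samples - 1) // 2


# Assumption: 36 locations spread over the 110 km x 50 km sampled area.
bound_km = max_first_distance_class_km(110 * 50, 36)

# Pairs available for the sample sizes reported in Table 2.
pairs_by_sample_size = {n: n_pairs(n) for n in (28, 32, 33, 36)}
```

Under these assumptions the bound is roughly 18.5 km, and the 378 to 630 available pairs must be divided among distance classes holding at least 30 pairs each.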

The first x-axis intercept of the correlogram has been widely considered to approximate the diameter of a ‘patch’ size (Sokal & Wartenberg 1983; Diniz-Filho & Telles 2002; Escudero et al. 2003). This can be defined, relative to the total collection of genotypes spread over the spatial limits of the study, as the distance where individuals are no more correlated than an average pair of genotypes would be, if drawn at random and without regard to their distance apart. This distance can be considered to represent a genetic unit for conservation or management (Diniz-Filho & Telles 2002). In practical terms, the intercept can be defined in different ways. The most conservative approach is to define the patch size as the largest distance class with a significant correlation. Traditionally, the first x-axis intercept itself has been considered to approximate the patch size. Alternatively, when estimating the error about r, it is possible to define the point where the correlogram goes from significantly positive to non-significant as the patch size diameter. We use the latter to define a provenance distance, although the largest significant distance class and the x-intercept provide more and less conservative estimates, respectively.
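The three definitions can be compared on a single correlogram with a short sketch. It assumes that crossing points are located by linear interpolation between adjacent distance classes, which is our assumption rather than a procedure stated in the text; the function names and the toy correlogram values are hypothetical.

```python
def first_crossing(distances_km, values):
    """Distance at which values first pass from positive to zero or below."""
    for k in range(len(values) - 1):
        if values[k] > 0 and values[k + 1] <= 0:
            span = distances_km[k + 1] - distances_km[k]
            return distances_km[k] + span * values[k] / (values[k] - values[k + 1])
    return None  # starts non-positive, or never crosses


def patch_size_estimates(distances_km, r, null_upper):
    """Return the three patch size estimates discussed in the text."""
    # r minus the upper null limit is positive where r is significant.
    excess = [ri - ui for ri, ui in zip(r, null_upper)]
    largest_significant = None
    for distance, e in zip(distances_km, excess):
        if e <= 0:
            break
        largest_significant = distance
    return {
        "largest_significant_class": largest_significant,
        "significant_to_nonsignificant": first_crossing(distances_km, excess),
        "first_x_intercept": first_crossing(distances_km, r),
    }


# Hypothetical correlogram: distance classes, r, and upper permuted limit.
toy_estimates = patch_size_estimates(
    distances_km=[10, 20, 30, 40],
    r=[0.10, 0.04, -0.02, 0.00],
    null_upper=[0.05, 0.05, 0.05, 0.05],
)
```

On the toy values the estimates are ordered 10 km, about 18.3 km and about 26.7 km, matching the ordering from most to least conservative described above.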

Additionally, we follow Peakall et al. (2003) and calculate r (along with associated errors about r and the null hypothesis) for increasing distance class sizes for each of the four species. Graphs provide a cumulative estimate of r with increasing distance classes. That is, r is calculated for all pairs of individuals in the smallest distance class, for all pairs of individuals in the two smallest distance classes, for all pairs of individuals in the three smallest distance classes, etc., up to the largest distance class. When significant positive structure is present, the estimated value of r will decrease with increasing size of the distance class. The distance class size at which the estimate of r is no longer significant is an estimate of the true extent of detectable positive spatial genetic structure (Peakall et al. 2003).
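The cumulative calculation can be sketched as below. The sketch assumes that the covariance terms for each pair have already been computed from the genetic distance matrix, and that r for a set of pairs is twice the sum of the pair covariances divided by the sum of the two individuals' variances over those pairs; the input layout, function name and toy values are hypothetical.

```python
def cumulative_r(pairs, thresholds_km):
    """r over all pairs separated by no more than each threshold.

    pairs: iterable of (distance_km, c_ij, c_ii, c_jj) tuples, where c_ij is
    the genetic covariance of the pair and c_ii, c_jj the two variances.
    """
    results = []
    for limit in thresholds_km:
        numerator = sum(2 * c_ij for d, c_ij, c_ii, c_jj in pairs if d <= limit)
        denominator = sum(c_ii + c_jj for d, c_ij, c_ii, c_jj in pairs if d <= limit)
        results.append(numerator / denominator if denominator else None)
    return results


# Hypothetical toy data: two similar close pairs, four dissimilar distant pairs.
toy_pairs = [
    (1.0, 1.0, 1.0, 1.0),
    (1.0, 1.0, 1.0, 1.0),
    (10.0, -1.0, 1.0, 1.0),
    (10.0, -1.0, 1.0, 1.0),
    (10.05, -1.0, 1.0, 1.0),
    (10.05, -1.0, 1.0, 1.0),
]
toy_cumulative = cumulative_r(toy_pairs, thresholds_km=[5.0, 20.0])
```

In the toy case r falls from 1 for the closest pairs to −1/3 once all pairs among the four individuals are included, illustrating the decline in r as the distance class widens.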


Between 24 and 47 consistently strong and reliable markers were scored for each species, of which between 71% and 87% were polymorphic (Table 2). Genetic distance, measured as the number of polymorphic markers, between each pair of samples varied from 4 to 24 (out of 46 markers scored) for Dryandra lindleyana, 1–19 (out of 47 markers scored) for Lomandra hermaphrodita, 2–19 (out of 47 markers scored) for Bossiaea ornata and 0–10 (out of 24 markers scored) for Lechenaultia biloba.

Table 2.  Summary of AFLP marker statistics for each of four jarrah forest species
Species                   AFLP primer pair   Number of samples   Number of markers scored   Percentage of markers polymorphic
Dryandra lindleyana       m-cac/e-agg        33                  46                         87
Lomandra hermaphrodita    m-cac/e-act        32                  47                         71
Bossiaea ornata           m-ctg/e-agg        28                  47                         87
Lechenaultia biloba       m-cac/e-agg        36                  24                         75

A significantly positive genetic correlation between individuals, when compared to permuted r, was found in the smallest distance class for Lomandra hermaphrodita (P (r < permuted r) = 0·01), Dryandra lindleyana (P (r < permuted r) = 0·01), and Bossiaea ornata (P (r < permuted r) = 0·01) (Fig. 1). However, bootstrap estimates of the 95% error about r showed that r was significantly greater than 0 (P < 0·05) for the smallest distance class for Lomandra hermaphrodita only (Fig. 1). The largest distance class with a significantly positive r, the distance where the correlogram goes from significant to non-significant, and the first x-axis intercept were 13 km, 20 km and 42 km for Lomandra hermaphrodita, 20 km, 26 km and 38 km for Dryandra lindleyana, and 15 km, 20 km and 45 km for Bossiaea ornata, respectively. For these species, there was no significant genetic correlation at any other distance class, indicating a stabilizing profile (Diniz-Filho & Telles 2002). In contrast, Lechenaultia biloba showed no significant genetic correlation between individuals at any distance class, indicating no detectable genetic structure at this scale and sampling intensity (Fig. 1).

Figure 1.

Correlograms showing the genetic correlation coefficient r as a function of distance, 95% CI about the null hypothesis of a random distribution of genotypes, and 95% confidence error bars about r as determined by bootstrapping, for each of four species.

Correlograms generated by defining different distance class sizes, or distance classes defined by equal sample pairs within each distance class, generated different results for patch size. For example, correlograms generated for Lomandra hermaphrodita for four different distance class sizes all identified significant genetic structure at the smallest distance class and a stabilizing profile not significantly different from r = 0 across all other distance classes (Fig. 2). However, the first x-axis intercept and the distance where the correlogram goes from significant to non-significant varied between correlograms, substantially so for the correlogram with a distance class size of 14 km (Fig. 2). The distance where the correlogram goes from significant to non-significant was generally more similar across correlograms than was the first x-intercept, supporting its use for delineating a patch size.

Figure 2.

Correlograms showing the genetic correlation coefficient r as a function of distance, 95% CI about the null hypothesis of a random distribution of genotypes, and 95% confidence error bars about r as determined by bootstrapping, for Lomandra hermaphrodita for equal distance class sizes of 14 km, 15 km, 17 km and 19 km.

Graphs showing r calculated for increasing distance classes show at least three important results (Fig. 3). First, they confirm the contrasting results between Lechenaultia biloba (no detectable genetic structure at this scale) and all other species. Secondly, for these other species, they confirm a positive significant genetic correlation between pairs of individuals at the smallest distance class, and no significant genetic correlation (or only weakly so) at larger distances. This result supports the conclusions of patch sizes at the distances determined. This result differs from that of Peakall et al. (2003), who found significant positive genetic structure at distances much larger than the first intercept in their correlogram. Thirdly, r tended to decrease with increasing distance, suggesting isolation by distance, in contrast to the stabilizing pattern of the correlograms. However, the greatest decrease in each case was from the smallest to second smallest distance class, again supporting the significance of the smallest distance class (Fig. 3).

Figure 3.

Graphs showing the genetic correlation coefficient r for increasing distance class sizes, 95% CI about the null hypothesis of a random distribution of genotypes, and 95% confidence error bars about r as determined by bootstrapping, for each of four species.


The native plant restoration industry, as a matter of urgency, requires accurate, powerful and practical provenance information to guide seed collection strategies. Current published guidelines (e.g. Mortlock 1999, 2000) are an important contribution, but can be in error for individual species. Ultimately, a rapid yet powerful genetic approach can best generate the required guidelines within realistic timeframes. We suggest that the combination of the efficiency of AFLP with a minimal sampling approach and data analysis by multivariate spatial autocorrelation is, in principle, capable of filling this need.

delineation of provenance

For a total sample of genotypes, spatial autocorrelation analysis addresses the question ‘Are genotypes in closer physical proximity any more similar than those with greater physical separation?’. For Dryandra lindleyana, Lomandra hermaphrodita and Bossiaea ornata, a spatial autocorrelation analysis of multilocus AFLP data has identified a positive genetic correlation between individuals, significantly greater than r permuted over all samples, at the smallest distance classes of 20 km, 13 km and 15 km, respectively, within the sampled jarrah forest distribution. Additionally, ‘patch’ size diameter, interpreted here as the distance where the correlogram goes from positively significant to non-significant, was 26 km, 20 km and 20 km, respectively. It follows that the patch radii were 13 km, 10 km and 10 km, respectively. The radius of the patch size is here interpreted as the maximum provenance distance to delineate the seed collection locality from the revegetation site. Beyond these distances, no significant genetic structure was observed. These correlograms are examples of a stabilizing profile (Diniz-Filho & Telles 2002). Stabilizing profiles indicate that there is a combination of high and low genetic divergence between samples at the larger distance classes, due to the stochasticity of the evolutionary processes involved (Epperson 1993).

Somewhat surprisingly, no evidence for genetic structuring was found for Lechenaultia biloba at any distance class, suggesting a broad provenance encompassing the entire sampled area. Alternatively, significant genetic structuring may exist at a spatial scale below the resolution of the current sampling, in which case the provenance distance would be shorter than the smallest distance class used here. Indeed, this might be expected given the life-history characteristics of L. biloba, which is known to generate very little seed, is characteristically long-lived, highly successful at re-sprouting after a fire and exhibits a marked ability to form clones. Poor sexual reproduction may in part contribute to low genetic variation and an absence of detectable population structuring at the scales assessed in this study. The apparent absence of genetic structure over this spatial scale may be a consequence of fewer polymorphic markers than detected for the other species, and therefore a reduction in power to detect a positive relationship where present. The number of markers generated depends on several variables, including the amount of variation present within the species, the size of the genome and the primer pairs used. However, the weakness of any relationship suggests that even with more markers, a positive relationship between genetic distance and geographical distance at this scale appears unlikely. We are currently conducting further studies to address these issues, through an assessment of clonality and genetic divergence among populations.

practical implications

These results have important practical implications. The estimated provenance distances for Dryandra lindleyana, Lomandra hermaphrodita and Bossiaea ornata are approximately half those currently applied by the revegetation practitioners who commissioned this work. Past experience has shown that it will be extremely difficult for the mining companies to collect sufficient seed in these relatively small areas to successfully revegetate at the necessary scale (c. 500 ha year⁻¹). In the early 1990s, both Alcoa and Worsley recognized the importance of conserving genetic diversity in their restored mines and drew up seed collecting zones for each mine. No genetic information was available at that time, so the boundaries of these seed provenances were selected conservatively and based on practical boundaries. For Alcoa, the boundaries were about 20 km radius from each mine. The Huntly seed collection zone, for example, is approximately 900 km². For Worsley, the areas agreed with government from which seed is collected are up to 30 km from the mine. The results of this study, if applied strictly, would reduce the collecting zone for Lomandra hermaphrodita, for example, to 314 km² (the area of a circle of 10 km radius). Availability of seed is affected by many factors, especially burning history, and it is already difficult to collect sufficient seed of many species in the current seed collecting zones. Also, the possible impact of more intense seed collecting in smaller areas needs to be considered. The geographical extent of each mine is also larger than the smaller of these zones. All of these factors need to be addressed when reviewing seed collection provenance.
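The arithmetic behind these collecting-zone areas can be checked with a short calculation. This sketch assumes an idealized circular zone centred on the revegetation site, which is the assumption implied by the 314 km² figure; actual zones follow practical boundaries and need not be circular. The radii are the provenance distances reported above and the 900 km² value is the approximate size of the Huntly zone; the function name is illustrative.

```python
import math


def zone_area_km2(radius_km):
    """Area of an idealized circular seed collecting zone."""
    return math.pi * radius_km ** 2


# Provenance radii reported in this study (km)
provenance_radius_km = {
    "Dryandra lindleyana": 13,
    "Lomandra hermaphrodita": 10,
    "Bossiaea ornata": 10,
}

HUNTLY_ZONE_KM2 = 900  # approximate current zone, from the text

for species, radius in provenance_radius_km.items():
    area = zone_area_km2(radius)
    share = area / HUNTLY_ZONE_KM2
    print(f"{species}: {area:.0f} km2 ({share:.0%} of the Huntly zone)")
```

On this basis, a 10 km radius corresponds to roughly one-third of the current Huntly zone, which illustrates the scale of the reduction in collecting area.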

methodological caveats

We have defined a local provenance distance on the basis of an assumed linear connection between the significantly positive distance class and the subsequent non-significant distance class, which may be in error. A more conservative approach is to interpret the upper limit of the largest significantly positive distance class as the provenance distance (Diniz-Filho & Telles 2002). These diameters were 20 km, 13 km and 15 km (radii of 10 km, 6·5 km and 7·5 km) for Dryandra lindleyana, Lomandra hermaphrodita and Bossiaea ornata, respectively. This was the smallest distance class in each case, which was determined by the intensity of sampling and the minimum criterion of at least 30 pairs of individuals in each distance class. The size of this smallest distance class is clearly arbitrary, which suggests that at least two significant distance classes should be a minimum criterion for the application of this definition of provenance.
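The constraint that each distance class contains at least 30 pairs of individuals can be checked before analysis. The sketch below is illustrative only: it assumes planar sample coordinates in kilometres, equal-width distance classes, and a class k that covers distances from k × width up to, but not including, (k + 1) × width. The function names and coordinates are hypothetical.

```python
from itertools import combinations
import math


def pairs_per_class(coords_km, class_width_km):
    """Count pairs of samples falling in each equal-width distance class."""
    counts = {}
    for a, b in combinations(coords_km, 2):
        k = int(math.dist(a, b) // class_width_km)  # class index for this pair
        counts[k] = counts.get(k, 0) + 1
    n_classes = max(counts) + 1 if counts else 0
    return [counts.get(k, 0) for k in range(n_classes)]


def underpowered_classes(counts, min_pairs=30):
    """Indices of distance classes with fewer than min_pairs pairs."""
    return [k for k, n in enumerate(counts) if n < min_pairs]


# Hypothetical sample locations (km), not the sites used in this study
coords_km = [(0, 0), (3, 0), (0, 4), (30, 0)]
counts = pairs_per_class(coords_km, class_width_km=10)
print(counts)
print(underpowered_classes(counts))
```

Widening the classes until no class is flagged gives the smallest class width that the sampling can support, which is why the smallest distance class depends on sampling intensity.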

Alternatively, patch size can be determined from the first x-intercept (Diniz-Filho & Telles 2002; Escudero et al. 2003). However, in a profile that stabilizes close to r = 0, the exact intercept can be ambiguous. For example, the correlogram for Lomandra hermaphrodita with equal distance classes of 13 km shows a near intercept at 26 km, considerably smaller than the actual intercept at 42 km. Re-analysing with equal distance classes of 14 km shows a first x-intercept of 21 km. Re-analysing with equal distance class sizes of 15 km, 17 km and 19 km revealed first x-intercepts, and distances where the correlograms go from significant to non-significant, that were similar (but not identical) to those of the original correlogram.
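The ambiguity of the x-intercept in a profile that stabilizes near r = 0 can be illustrated with a small sketch. It assumes linear interpolation between distance classes and an arbitrary tolerance for what counts as a near intercept; the function names, the tolerance and all values are hypothetical and are not the data from this study.

```python
def first_x_intercept(distances_km, r):
    """First distance at which r reaches or crosses zero, by linear interpolation."""
    for i in range(len(r) - 1):
        if r[i] > 0 and r[i + 1] <= 0:
            frac = r[i] / (r[i] - r[i + 1])
            return distances_km[i] + frac * (distances_km[i + 1] - distances_km[i])
    return None  # r stays positive (or starts non-positive) over the sampled range


def near_intercepts(distances_km, r, tol=0.005):
    """Distance classes where r is positive but within tol of zero."""
    return [d for d, ri in zip(distances_km, r) if 0 < ri < tol]


# Hypothetical stabilizing correlogram (illustrative values only)
distances_km = [10, 20, 30, 40]
r = [0.08, 0.004, 0.01, -0.03]

print(near_intercepts(distances_km, r))    # classes that almost touch r = 0
print(first_x_intercept(distances_km, r))  # where r actually changes sign
```

In this example the correlogram almost reaches zero at the second class but only changes sign between the third and fourth, so a small shift in r at one class, such as that produced by a different class width, can move the estimated intercept substantially.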

Consequently, a robust delineation of patch size distance by spatial autocorrelation analysis, as opposed to merely identifying the existence of genetic spatial structure, is tenuous because it is dependent on the properties of sampling and analysis. Our results showed no significant genetic structuring beyond the smallest distance classes, suggesting that sampling of populations at larger distances is inefficient. Additionally, a more precise provenance distance, or the suggestion of a very narrow provenance distance, could not be adequately resolved due to limited sampling at these smaller distances. Increased sampling should lead to a finer division of the continuous space into more discrete distance classes, more than one significant distance class in the presence of structure, greater stability in the shape of the correlogram and the delineation of patch size, as well as reducing the error around r. We are currently conducting further research to more carefully assess an appropriate minimal sampling strategy, given that an ultimate objective is to expeditiously generate provenance information for many species. However, the results of this study suggest that future sampling will be more efficient by doubling sampling intensity at the smaller geographical scales and reducing the maximum distance between sample locations to approximately 60 km. The variation in correlograms generated from genetic data from different AFLP primer pairs is another way of assessing the robustness of a single correlogram. Further discussion of the limitations and benefits of spatial autocorrelation in the context of identifying operational units across continuous populations can be found in Diniz-Filho & Telles (2002).

Spatial genetic variation is highly scale dependent, and conclusions drawn from spatial autocorrelation (and indeed any spatial analysis) are in the context of the scale of sampling and the spatial limits of the study, rather than in an absolute sense (Heywood 1991; Escudero et al. 2003). For example, spatial autocorrelation has typically been used to identify relationships between genetic and geographical distance over much smaller (metres rather than kilometres) scales (Smouse & Peakall 1999; Gonzalez-Martinez et al. 2002; Miyamoto, Kuramoto & Yamada 2002). Patch size on this scale is often interpreted as an isolation by distance effect due to the restricted dispersal of seed and pollen (Hardy & Vekemans 1999). It is therefore likely that a patch size in metres rather than kilometres may be found following a spatial autocorrelation analysis over a scale of tens to hundreds of metres for the four species studied here. Similarly, reducing the spatial limits of this study will probably reduce the patch size. Consequently, there are intriguing issues regarding the identification, and biological significance, of patch size over different scales, and the consequences for provenance, that require further research and a balance between practical outcomes and conservation in achieving best practice in native plant community restoration. New tools for the spatial analysis of genetic diversity for plant conservation and restoration (Escudero et al. 2003) offer novel opportunities for addressing these issues.

While we have interpreted our results in the context of delineating a local genetic provenance, it is important to qualify the interpretation of these results. Genetic similarity between any two populations within the local provenance distance cannot necessarily be assumed. This may especially be so if a species is distributed over an environmental mosaic (e.g. variation in substrate, aspect and altitude). In this case, habitat matching may be critical, and should be applied in conjunction with these genetic results. To identify whether any two specific populations are genetically differentiated, a suitably large sample of plants per population is required. This has been our principal approach to provenance delineation to date (Krauss et al. 2000). However, the approach adopted here, where individual samples from many sites over a wide geographical range were analysed, is a compromise that provides a rapid and cost-effective assessment of the general relationship between genetic distance and geographical distance for a first assessment of genetic provenance delineation. This approach is open to the criticisms levelled at a simple geographical distance approach. However, in a relatively homogeneous environment, such as the jarrah forest within which this study was conducted, such an approach is justifiable, especially when used with careful habitat matching within the delineated provenance distance. While we have not specifically addressed the causes of the significant spatial structure detected for three of four species, there were no obvious edaphic differences between sampled population sites. The importance of a significant east–west rainfall gradient (1200–740 mm) warrants further attention.

Beyond the practical issues associated with delineating provenance, there may be occasions where strict adherence to the local genetic provenance is undesirable (Lesica & Allendorf 1999; Wilkinson 2001). For example, Wilkinson (2001) questions the importance of using material of the local provenance in areas such as Europe and North America, which have been severely impacted by Quaternary climate change resulting in a dynamic recent vegetation history. Although the Quaternary history of the jarrah forest has not been studied in detail, the impacts of Quaternary climate change have been much less severe in south-west Australia, with no glaciation, generally cooler and drier conditions, but still significant winter rainfall during glacial maxima (Williams et al. 1998; Kershaw et al. 2003), suggesting a persistence of the sclerophyllous jarrah forest within much of its current range. Practical restoration provides a novel and powerful opportunity to assess these and other issues associated with the use of non-local provenance material (Holl et al. 2003). For example, we are now in a position to assess the consequences of using local vs. non-local seed, as defined in this study, as part of mine-site revegetation that will in turn provide the opportunity to assess the utility of the approach and results from this study.

While further research is required in a provenance context to address these issues, the continuation of the rapid delivery of provenance information to the native plant restoration industry, to achieve best practice in restoration and conservation objectives, is critical. Despite the problems discussed, the approach outlined here is an important and novel contribution to this objective.


This work was commissioned by Alcoa World Alumina Australia and Worsley Alumina Pty Ltd. Thanks go to Stephen Vlahos (Worsley) for his support and contributions to the study. Robyn Taylor, Liliane Gerhardt, Erika Alacs and Janet Anthony provided technical assistance on this project, and Bill Freeman collected the samples. Thanks to Rod Peakall for making available the GenAlEx program and to Kingsley Dixon, Andrew Dyer, Steve Hopper, Rod Peakall, Peter Smouse, Stephen Vlahos, David Wilkinson and an anonymous referee for helpful discussion and comments on an earlier draft of this manuscript.