Present address: Ifremer, Centre de Brest BP70, Department DEEP, 29280 Plouzané, France
Standardizing methods to address clonality in population studies
Version of Record online: 17 OCT 2007
DOI: 10.1111/j.1365-294X.2007.03535.x
How to Cite
ARNAUD-HAOND, S., DUARTE, C. M., ALBERTO, F. and SERRÃO, E. A. (2007), Standardizing methods to address clonality in population studies. Molecular Ecology, 16: 5115–5139. doi: 10.1111/j.1365-294X.2007.03535.x
Box 1 Genotypic vs. clonal membership
a) Assessing whether all replicates of the same MLG are part of the same clone
The probability of a given multilocus genotype (MLG) under the assumption of Hardy–Weinberg equilibrium can be estimated as:
$$p_{gen} = 2^{h} \prod_{i=1}^{l} f_{i}\, g_{i}$$
where l is the number of loci, f_{i} and g_{i} are the frequencies of the two alleles carried at the i^{th} locus (with f_{i} = g_{i} for homozygotes; frequencies are estimated using the round-robin method, see text), and h is the number of heterozygous loci in the genotype.
When taking into account departures from Hardy–Weinberg equilibrium (using F_{IS}), this equation becomes:
$$p_{gen}(F_{IS}) = 2^{h} \prod_{i=1}^{l} f_{i}\, g_{i} \left(1 + z_{i}\, F_{IS(i)}\right)$$
where l is the number of loci, h is the number of heterozygous loci, f_{i} and g_{i} are the frequencies of the alleles f and g at the i^{th} locus (with f and g identical for homozygotes), F_{IS(i)} is the F_{IS} estimated for the i^{th} locus (using allelic frequencies estimated with the round-robin method), and z_{i} = 1 if the i^{th} locus is homozygous (f_{i} = g_{i}) and z_{i} = –1 if the i^{th} locus is heterozygous.
When the same genotype is detected n times in a sample of N sampling units, the probability that the repeated genotypes originate from distinct sexual reproductive events (i.e. from different zygotes, thus being different genets), derived from the binomial expression, is:
$$p_{sex} = \sum_{i=n}^{N} \frac{N!}{i!\,(N-i)!}\, (p_{gen})^{i}\, (1 - p_{gen})^{N-i}$$
In this calculation, the probability of the genotype p_{gen} can be replaced by p_{gen}(F_{IS}) to account for possible departures from Hardy–Weinberg equilibrium, in order to obtain a more conservative estimate of p_{sex}.
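The three probabilities above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that p_{gen} is the product over loci of f_{i} g_{i} times 2^{h}, that the F_{IS} correction multiplies each locus term by (1 + z_{i} F_{IS(i)}), and that p_{sex} is the upper binomial tail from n to N. The function names and the `(f, g, is_het)` input layout are hypothetical.

```python
from math import comb

def p_gen(loci, f_is=None):
    """Probability of a multilocus genotype.

    loci: list of (f, g, is_het) tuples, one per locus, where f and g are the
          frequencies of the two alleles carried (f == g for homozygotes).
    f_is: optional list of per-locus F_IS values (hypothetical input layout).
    """
    prob = 1.0
    for idx, (f, g, is_het) in enumerate(loci):
        term = f * g
        if is_het:
            term *= 2.0  # one factor of 2 per heterozygous locus, i.e. 2^h overall
        if f_is is not None:
            z = -1.0 if is_het else 1.0  # z_i = 1 for homozygous, -1 for heterozygous loci
            term *= 1.0 + z * f_is[idx]
        prob *= term
    return prob

def p_sex(pgen, n, N):
    """Upper binomial tail: probability of observing the genotype at least n
    times among N sampling units if each copy came from a distinct zygote."""
    return sum(comb(N, i) * pgen**i * (1.0 - pgen)**(N - i)
               for i in range(n, N + 1))

# Toy genotype: one heterozygous locus (0.5, 0.5), one homozygous locus (0.2, 0.2)
toy = [(0.5, 0.5, True), (0.2, 0.2, False)]
print(p_gen(toy), p_sex(p_gen(toy), 2, 10))
```

A small p_{sex} for the observed number of copies n supports the interpretation that the repeated genotypes are ramets of a single genet.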
A Monte Carlo procedure can be applied to ensure that the set of loci used provides enough power to discriminate all MLGs present in the sample:
Fig. B1.1: Box plot describing the genotypic resolution of microsatellites in a data set of the seagrass Cymodocea nodosa containing 220 sampling units genotyped using nine microsatellites, analysed for all possible combinations of K loci (K = 1, ... , l; l is the number of loci available). The edges of the boxes show the minimum and maximum number of genotypes and the central line shows the average number of genotypes identified in the sample using K microsatellites (Alberto et al. 2005). The example illustrated here shows that a set of seven loci allows an accurate determination of the number of genotypes in the sample.
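The resolution check summarised in Fig. B1.1 can be sketched as follows. This is an illustrative version that enumerates every combination of K loci exhaustively, as described in the caption; for large numbers of loci, a random subsample of combinations could be drawn instead. The function name and the input layout (one tuple of per-locus genotypes per sampling unit) are assumptions.

```python
from itertools import combinations

def mlg_counts_by_loci(genotypes):
    """For each K, count the distinct multilocus genotypes found with every
    combination of K loci and return {K: (min, mean, max)}.

    genotypes: list of sampling units, each a tuple with one hashable
               entry per locus (hypothetical layout).
    """
    n_loci = len(genotypes[0])
    summary = {}
    for k in range(1, n_loci + 1):
        counts = []
        for subset in combinations(range(n_loci), k):
            # Restrict each unit to the chosen loci and count distinct profiles
            distinct = {tuple(unit[j] for j in subset) for unit in genotypes}
            counts.append(len(distinct))
        summary[k] = (min(counts), sum(counts) / len(counts), max(counts))
    return summary

units = [("a", "x", "m"), ("a", "x", "n"), ("a", "y", "m"), ("b", "y", "m")]
for k, stats in mlg_counts_by_loci(units).items():
    print(k, stats)
```

When the minimum, mean and maximum converge before all loci are used, adding further loci no longer reveals new genotypes in the sample.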
b) Ascertaining that each distinct MLG belongs to a distinct clone, or genet (Halkett et al. 2005a); defining clonal lineages (MLL)
This procedure can be used if the distribution of genetic distances among sampling units is not strictly unimodal but shows high peaks at low distances. Such peaks may reveal somatic mutations or scoring errors in the data set, which produce low distances among slightly distinct MLGs that actually derive from a single reproductive event. The use of the frequency distribution of distances to detect such events has been proposed four times so far, to our knowledge (Douhovnikoff & Dodd 2003; Meirmans & Van Tienderen 2004; Arnaud-Haond et al. 2005; Rozenfeld et al. 2007). In a recent work on Posidonia (Arnaud-Haond et al. 2007) we introduced the concept of MLL to designate genets represented by slightly distinct MLGs, due to mutation or scoring errors. We propose a two-step approach: (i) screening each MLG pair separated by an extremely low distance, that is, each pair contributing to a small primary peak that makes the frequency distribution of distances bimodal rather than unimodal (see the dashed line in Fig. B1.2); then (ii) using p_{sex} on the set of identical loci to estimate the likelihood that those slightly distinct MLGs are actually derived from distinct reproductive events. When this likelihood is lower than a chosen threshold (in that case 0.01), the slightly distinct MLGs may be considered as derived from the same genet, that is, as slightly distinct representatives of the same MLL. Numerous distance metrics can be chosen, such as the number of distinct alleles, Jaccard similarity, in particular for multibanding patterns (Douhovnikoff & Dodd 2003), or the number of microsatellite motifs (Arnaud-Haond et al. 2007) under the hypothesis of a stepwise mutation model for somatic mutations.
Fig. B1.2: (A) Frequency distribution of the pairwise number of allele differences between MLGs for the same sample of C. nodosa (Alberto et al. 2005), compared with (B) the frequency distribution of the pairwise distances in a set of seeds from the same location (Cadiz, Spain) in which neither identical MLGs nor somatic mutations are expected. The x-axis represents the number of allele differences and the y-axis is the frequency for each value of x. The dashed line in the adult distribution represents the threshold below which slightly distinct MLGs have a p_{sex}, estimated after excluding the differing loci, that supports their belonging to the same MLL (i.e. their origin from the same zygote).
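The frequency distribution of pairwise distances shown in Fig. B1.2 can be sketched as below, assuming the distance is the number of allele differences, counted here as the alleles of one genotype that are not matched in the other at each locus. Other metrics mentioned in the text would need a different `allele_differences` function; both function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def allele_differences(mlg_a, mlg_b):
    """Number of allele differences between two MLGs, summed over loci.
    Each MLG is a tuple of per-locus allele tuples, e.g. ((120, 124), (98, 98))."""
    total = 0
    for locus_a, locus_b in zip(mlg_a, mlg_b):
        # Multiset intersection gives the number of alleles shared at this locus
        shared = sum((Counter(locus_a) & Counter(locus_b)).values())
        total += len(locus_a) - shared
    return total

def distance_histogram(mlgs):
    """Frequency of each pairwise distance among distinct MLGs."""
    return Counter(allele_differences(a, b) for a, b in combinations(mlgs, 2))

mlgs = [((1, 2), (3, 3)), ((1, 2), (3, 4)), ((5, 6), (7, 8))]
print(sorted(distance_histogram(mlgs).items()))
```

Pairs falling in an isolated peak at very low distances are the candidates to screen with p_{sex} before merging them into a single MLL.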
Box 2 Clonal richness estimates
The index of clonal diversity proposed by Ellstrand & Roose (1987) for a sample of size N in which G genotypes are discriminated is estimated as:
$$P_{d} = \frac{G}{N}$$
This modification was proposed by Dorken & Eckert (2001):
$$R = \frac{G - 1}{N - 1}$$
such that the smallest possible value in a monoclonal stand is always 0, independently of sample size, and the maximum value is still 1, when all the different samples analysed correspond to distinct clonal lineages.
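As a quick illustration of why the modified index is preferred, the following sketch computes both richness estimates, assuming the forms G/N and (G − 1)/(N − 1); the function name is hypothetical.

```python
def clonal_richness(G, N):
    """Return (P_d, R) for G genotypes discriminated among N sampling units."""
    if N < 2 or not 1 <= G <= N:
        raise ValueError("need N >= 2 and 1 <= G <= N")
    p_d = G / N              # never reaches 0, even in a monoclonal stand
    r = (G - 1) / (N - 1)    # 0 for a monoclonal stand, 1 when all units differ
    return p_d, r

print(clonal_richness(1, 10))   # monoclonal stand
print(clonal_richness(10, 10))  # every sampling unit is a distinct lineage
```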
These indices provide an estimate of the clonal (vs. sexual) input, provided that the set of loci allows clonal membership to be assessed, as previously detailed. Otherwise, they may overestimate clonal input, because they ignore the production of the same multilocus genotype through sexual reproduction (Stoddart 1983; Uthicke et al. 1998). To estimate the extent of this possible bias in estimating sexual input, a method involving two components was developed (Stoddart 1983; Stoddart & Taylor 1988). The first is the estimate of genotypic diversity in the sample:
$$G_{o} = \frac{1}{\sum_{i=1}^{G} p_{i}^{2}}$$
where p_{i} is the observed frequency of the i^{th} of G genotypes, as described in Stoddart (1983). This first component happens to be also the inverse of the Simpson index of genotypic heterogeneity commonly used to describe clonal diversity (equation 20). It is used in a ratio with the second component, the expected genotypic diversity under Hardy–Weinberg and random assortment between all pairs of loci:
$$G_{e} = \frac{1}{D + P/N}$$
where D is the sum of p_{i}^{2} for all p_{i} where (p_{i}×N) > 1, and P is the sum of p_{i} for all (p_{i}×N) < 1. The clonal input is then estimated as:
$$\frac{G_{o}}{G_{e}}$$
When the data set consists of highly polymorphic markers providing optimal discriminating power, a very high number of genotypes may be expected and P will be negligible. The estimator (equation 19) will approximate estimator (15) as the number of multilocus lineages is more accurately estimated, and once full resolution of MLLs is reached, P_{d} (or R) then provides a reliable estimate of the clonal input.
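The two components can be sketched as follows. This is a hedged illustration only: it assumes that the observed diversity is the inverse of the sum of squared observed genotype frequencies, and that the expected diversity takes the form 1/(D + P/N), with D and P computed from genotype frequencies expected under Hardy–Weinberg equilibrium (supplied here as a ready-made list, since their calculation depends on the marker set). The function names are hypothetical, and frequencies with p_{i}×N exactly equal to 1 are ignored because the text does not assign them to D or P.

```python
def observed_diversity(counts):
    """G_o: inverse of the sum of squared observed genotype frequencies.
    counts: number of sampling units carrying each genotype."""
    N = sum(counts)
    return 1.0 / sum((n / N) ** 2 for n in counts)

def expected_diversity(expected_freqs, N):
    """G_e under the assumed form 1 / (D + P / N).
    expected_freqs: expected frequency of each possible genotype (assumed input)."""
    D = sum(p * p for p in expected_freqs if p * N > 1)  # genotypes expected more than once
    P = sum(p for p in expected_freqs if p * N < 1)      # genotypes expected less than once
    return 1.0 / (D + P / N)

g_o = observed_diversity([5, 3, 2])
g_e = expected_diversity([0.5, 0.3] + [0.02] * 10, N=10)
print(g_o, g_e, g_o / g_e)
```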
Box 3 Clonal heterogeneity and evenness estimates
Clonal heterogeneity
Simpson index:
$$\lambda = \sum_{i=1}^{G_{pop}} p_{i}^{2}$$
where p_{i} is the frequency of the i^{th} MLL in the population, and G_{pop} the number of distinct MLLs in the population. An unbiased estimator of λ for a sample of size N is:
$$L = \sum_{i=1}^{G} \frac{n_{i}\,(n_{i} - 1)}{N\,(N - 1)}$$
where G is the number of MLLs detected in the sample, and n_{i} is the number of sampled units with the i^{th} MLL.
The Simpson index can be modified to vary positively with heterogeneity (Pielou 1969), giving an index first proposed in economics (Gini 1912; Peet 1974). The resulting complement of the Simpson index describes the probability of encountering distinct MLLs when randomly taking two units in the sample:
Simpson's complement:
$$D = 1 - \lambda = 1 - \sum_{i=1}^{G_{pop}} p_{i}^{2}$$
for which the unbiased estimator from a sample of size N is D* = 1 – L, which ranges from 0 to almost 1 − (1/G).
As proposed for species heterogeneity indices, the reciprocal of Simpson index is:
Simpson's reciprocal:
$$\frac{1}{\lambda} = \frac{1}{\sum_{i=1}^{G_{pop}} p_{i}^{2}}$$
for which the unbiased estimator for a sample of size N is 1/L.
Simpson's reciprocal ranges from 1 to G, and it can be interpreted as the number of equally represented MLLs required to obtain the same heterogeneity as observed in the sample (Hurlbert 1971; Hill 1973), or as the ‘apparent number of clonal lineages in the sample’.
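The three Simpson-based estimators can be computed together from the number of sampling units per MLL. The sketch below assumes the unbiased estimator L is the sum of n_{i}(n_{i} − 1) divided by N(N − 1); the function name is hypothetical.

```python
def simpson_estimators(counts):
    """Return (L, D*, 1/L) from the number of sampling units per MLL."""
    N = sum(counts)
    if N < 2:
        raise ValueError("need at least two sampling units")
    L = sum(n * (n - 1) for n in counts) / (N * (N - 1))
    complement = 1.0 - L                                 # D* = 1 - L
    reciprocal = float("inf") if L == 0 else 1.0 / L     # undefined when every unit is unique
    return L, complement, reciprocal

print(simpson_estimators([5, 3, 2]))
```

For counts of 5, 3 and 2 units, L is 28/90, so two randomly drawn units belong to distinct MLLs with an estimated probability of 62/90.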
The Shannon-Wiener index describes clonal diversity as:
$$H' = -\sum_{i=1}^{G_{pop}} p_{i} \log p_{i}$$
using the estimator:
$$H'' = -\sum_{i=1}^{G} \frac{n_{i}}{N} \log \frac{n_{i}}{N}$$
This index quantifies the level of uncertainty regarding the MLL of a sample unit taken at random (Pielou 1966). This index of clonal diversity increases with the number of MLLs and the evenness in the assignment of individuals (ramets) to the MLLs, since this leads to a greater uncertainty in predicting the MLL of a randomly drawn sample unit.
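A minimal sketch of the sample estimator follows, assuming it is computed from the observed proportions n_{i}/N and using natural logarithms (the base is a choice made here, not one stated in the text). The function name is hypothetical.

```python
from math import log

def shannon_wiener(counts):
    """Shannon-Wiener estimator from the number of sampling units per MLL."""
    N = sum(counts)
    return -sum((n / N) * log(n / N) for n in counts)

print(shannon_wiener([5, 5]))     # two equally represented MLLs
print(shannon_wiener([9, 1]))     # same richness, lower evenness
```

The second call returns a lower value than the first, illustrating that the index responds to evenness as well as to the number of MLLs.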
Clonal evenness
A way of describing clonal equitability, which is independent of clonal richness but not explicitly described by any diversity index (see above), is to use an evenness index. So far the most widely used evenness index in clonal plant studies is the one based on Simpson's complement index (Hurlbert 1971; Fager 1972):
$$V = \frac{D - D_{min}}{D_{max} - D_{min}}$$
with D_{min} and D_{max} being the approximate minimum and maximum values of Simpson's complement index given the sample size N and the sample clonal richness G, estimated as:
$$D_{min} = \frac{(G - 1)(2N - G)}{N\,(N - 1)}, \qquad D_{max} = \frac{(G - 1)\,N}{G\,(N - 1)}$$
This evenness formulation can also be used with the Shannon-Wiener index (e.g. Hurlbert 1971), or alternatively evenness can also be estimated as V′, the ratio of observed to maximal diversity (using either heterogeneity index). In this case, when using the Shannon-Wiener index, the corresponding evenness index, sometimes called Pielou's evenness (J′, Pielou 1975) and hereafter referred to as such, can be estimated as:
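$$J' = \frac{H'}{H'_{max}}$$
where H'_{max} = \log G is the value of the Shannon-Wiener index when all G MLLs are equally represented.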
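Both evenness indices can be sketched from the counts of sampling units per MLL. The bounds used below are those obtained from the estimator D* = 1 − L when one MLL holds N − G + 1 units (minimum) or all MLLs are equally represented (maximum); Pielou's evenness is taken as the Shannon-Wiener estimate divided by log G. Function names are hypothetical, and the code refuses the degenerate cases G < 2 and G = N, where the indices are undefined.

```python
from math import log

def simpson_evenness(counts):
    """V = (D* - D_min) / (D_max - D_min) for the given MLL counts."""
    N, G = sum(counts), len(counts)
    if G < 2 or G == N:
        raise ValueError("evenness undefined for G < 2 or G == N")
    d_star = 1.0 - sum(n * (n - 1) for n in counts) / (N * (N - 1))
    d_min = (G - 1) * (2 * N - G) / (N * (N - 1))   # one dominant MLL, others unique
    d_max = (G - 1) * N / (G * (N - 1))             # all MLLs equally represented
    return (d_star - d_min) / (d_max - d_min)

def pielou_evenness(counts):
    """J' = H / log(G), with H the Shannon-Wiener estimate (natural log)."""
    N, G = sum(counts), len(counts)
    if G < 2:
        raise ValueError("evenness undefined for G < 2")
    h = -sum((n / N) * log(n / N) for n in counts)
    return h / log(G)

print(simpson_evenness([5, 3, 2]), pielou_evenness([5, 3, 2]))
```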
Box 4 Power law (Pareto) distribution of clonal membership
The distribution of elements into size classes has been shown to follow a power law for a very broad diversity of systems and phenomena, all of which (from distributions in social sciences to astrophysics and the commonality of gene expression) conform to a particular probability density distribution referred to as the Pareto distribution (e.g. Pareto 1897 in Vidondo et al. 1997; Ueda et al. 2004). A power law distribution applies to systems where the distribution of elements into classes is highly skewed, with far fewer large classes than small ones. The use of a power law distribution allows the efficient and parsimonious description of the distribution of the studied elements into classes. We therefore propose here the use of the Pareto distribution as a continuous approximation to describe the discrete distribution of sample units, or ramets (elements), into groups of clonal sizes (classes), where clonal sizes are defined by the number of sampling units belonging to that clone (MLL). This relationship is described by the equation:
$$N_{\geq X} = a\, X^{-\beta}$$
where N_{≥X} is the number of sampled ramets belonging to lineages (MLLs) containing X, or more, ramets in the sample of the population studied, and the parameters a and β are fitted by regression analysis. In practice, the power slope (–β) is derived as the slope of the fitted log-log regression equation describing the rate of decline in the relative frequency of ramets that belong to MLLs of size equal to or larger than a given number of ramets X (when both are in log scale; Fig. B4.1). The parameter β (–slope) therefore indicates the scaling of the partitioning of the ramets among MLL size classes (Fig. B4.1).
Fig. B4.1: (a) Distribution of replicates among MLLs in Cymodocea nodosa from Alfacs Bay (Alberto et al. 2005), showing the steep decline in number of MLLs with increasing clonal membership typical of power law distributions; (b) transformed into a log-log reverse cumulative distribution.
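The log-log regression described in this box can be sketched as follows, assuming the relationship N_{≥X} = a X^{−β} and an ordinary least-squares fit on log-transformed values. The function names are hypothetical and only the standard library is used.

```python
from math import exp, log

def reverse_cumulative(sizes):
    """For each observed MLL size X, the number of sampled ramets belonging
    to MLLs of size X or more. sizes: number of sampled ramets per MLL."""
    return [(x, sum(s for s in sizes if s >= x)) for x in sorted(set(sizes))]

def fit_power_law(points):
    """Least-squares fit of log(N_geX) = log(a) - beta * log(X).
    Returns (a, beta); needs at least two distinct X values."""
    lx = [log(x) for x, _ in points]
    ly = [log(y) for _, y in points]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    slope = sxy / sxx
    return exp(my - slope * mx), -slope   # beta is minus the log-log slope

sizes = [1, 1, 1, 1, 2, 2, 3, 6]          # toy sample: ramets per MLL
a, beta = fit_power_law(reverse_cumulative(sizes))
print(a, beta)
```

A larger β indicates a steeper decline, that is, a sample dominated by small MLLs with few large ones.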
Box 5 Spatial components of clonality
Edge effect
In order to test whether, for the sampling design used, apparently unique or rare MLLs are preferentially distributed towards the edges of the sampling area, thereby inducing a possible overestimation of clonal diversity, the following index can be estimated:
$$E_{e} = \frac{D_{u} - D_{a}}{D_{a}}$$
with D_{u} the average geographic distance between unique MLLs and the centre of the sampling area, and D_{a} the average geographic distance between all sampling units and the centre of the sampling area. The significance of this index is tested against the null hypothesis of random distribution of unique and multiply represented MLLs. In practice, the likelihood of the observed difference D_{u} – D_{a} being due only to chance and not to an edge effect can be tested by permuting the positions of the samples x times (i.e. randomly reassigning the sample units to the sampling coordinates), and calculating the index for each permutation to obtain an empirical distribution of E_{e}. If the observed E_{e} value lies beyond the critical value (a function of the chosen alpha) in the distribution of E_{e} in the permuted data, then a significant edge effect is present that may cause indices of clonal diversity to overestimate the population diversity.
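The permutation test can be sketched as below. Several choices here are assumptions rather than statements from the text: the index is taken as (D_{u} − D_{a})/D_{a}, the centre of the sampling area is approximated by the centroid of the sampled coordinates, the test is one-sided, and the function and parameter names are hypothetical.

```python
import random
from math import hypot

def edge_effect(coords, is_unique, n_perm=1000, seed=0):
    """Return (observed E_e, one-sided permutation p-value).

    coords: list of (x, y) sampling coordinates.
    is_unique: list of booleans, True where the unit carries a unique MLL.
    """
    cx = sum(x for x, _ in coords) / len(coords)   # centroid used as the centre
    cy = sum(y for _, y in coords) / len(coords)
    dist = [hypot(x - cx, y - cy) for x, y in coords]
    d_a = sum(dist) / len(dist)

    def index(flags):
        d_unique = [d for d, f in zip(dist, flags) if f]
        return (sum(d_unique) / len(d_unique) - d_a) / d_a

    observed = index(is_unique)
    rng = random.Random(seed)
    flags = list(is_unique)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(flags)            # reassign sample units to coordinates at random
        if index(flags) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

coords = [(0, 0), (1, 1), (-1, 1), (1, -1), (-1, -1)]
print(edge_effect(coords, [False, True, True, True, True]))
```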
Aggregation index
In order to test for the existence of spatial aggregation of clonemates, or MLGs belonging to identical MLLs, the aggregation index A_{c} can be estimated as follows:
with P_{sg} being the average probability of clonal identity of all sample unit pairs and P_{sp} the average probability of clonal identity among pairwise nearest neighbours; these are estimated from the respective observed proportions in the sample. This index will typically range from 0, when the probability between nearest neighbours does not differ on average from the global one, to 1 when all nearest neighbours preferentially share the same MLL, in a situation of spatially distant distinct clonal lineages. The statistical significance of the calculated aggregation index can be tested against the null hypothesis of spatially random distribution of samples using a resampling approach, whereby the individuals sampled are randomly assigned to the existing sampling coordinates.
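The two proportions entering the aggregation index can be estimated from the sample as sketched below. The sketch stops at P_{sg} and P_{sp} and does not combine them into A_{c}. The function name is hypothetical, and ties between equidistant neighbours are resolved by taking the first one encountered, which is a simplification.

```python
from itertools import combinations
from math import hypot

def clonal_identity_proportions(coords, mll):
    """Return (P_sg, P_sp): proportion of all pairs sharing an MLL, and
    proportion of units sharing an MLL with their nearest neighbour.

    coords: list of (x, y) sampling coordinates.
    mll: list of MLL labels, one per sampling unit.
    """
    n = len(coords)
    pairs = list(combinations(range(n), 2))
    p_sg = sum(mll[i] == mll[j] for i, j in pairs) / len(pairs)

    same_as_nearest = 0
    for i in range(n):
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: hypot(coords[i][0] - coords[j][0],
                                coords[i][1] - coords[j][1]),
        )
        same_as_nearest += mll[i] == mll[nearest]
    p_sp = same_as_nearest / n
    return p_sg, p_sp

coords = [(0, 0), (1, 0), (10, 0), (11, 0)]
print(clonal_identity_proportions(coords, ["A", "A", "B", "B"]))
```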
Publication History
- Issue online: 13 DEC 2007
- Received 15 May 2007; revision 27 July 2007
References
- (2003a) New microsatellites markers for the endemic Mediterranean seagrass, Posidonia oceanica. Molecular Ecology Notes, 3, 253–255.
- (2003b) Isolation and characterization of microsatellite markers for the seagrass, Cymodocea nodosa. Molecular Ecology Notes, 3, 397–399.
- (2005) Spatial genetic structure, neighbourhood size and clonal subrange in seagrass (Cymodocea nodosa) populations. Molecular Ecology, 14, 2669–2681.
- (1995) Clonality in soilborne, plant-pathogenic fungi. Annual Review of Phytopathology, 33, 369–391.
- (2005) Assessing genetic diversity in clonal organisms: low diversity or low resolution? Combining power and cost-efficiency in selecting markers. Journal of Heredity, 96, 434–440.
- (2007) genclone 1.0: a new program to analyse genetics data on clonal organisms. Molecular Ecology Notes, 7, 15–17.
- (2007) Vicariance patterns in the Mediterranean sea: East-West cleavage and low dispersal in the endemic seagrass Posidonia oceanica. Journal of Biogeography, 34, 963–976.
- (2000) Genotypic diversity and gene flow in brooding and spawning corals along the Great Barrier Reef, Australia. Evolution, 54, 1590–1605.
- (2001) Genetic differentiation among populations of a broadcast spawning soft coral, Sinularia flexibilis, on the Great Barrier Reef. Marine Biology, 138, 517–525.
- (2001) Consequences of clonal growth for plant mating. Evolutionary Ecology, 15, 521–530.
- (1997) Responses to severe competitive stress in a clonal plant: differences between genotypes. Oikos, 79, 581–591.
- (2004) Implications of clonal structure for effective population size and genetic drift in a rare terrestrial orchid, Cremastra appendiculata. Conservation Biology, 18, 1515–1524.
- (2002) Origins of clonal diversity in the hypervariable asexual ostracode Cypridopsis vidua. Journal of Evolutionary Biology, 15, 134–145.
- Feed-backs between genetic structure and perturbation-driven decline in seagrass (Posidonia oceanica) meadows. Conservation Genetics, doi: 10.1007/s10592-007-9288-0.
- (2001) Severely reduced sexual reproduction in northern populations of a clonal plant, Decodon verticillatus (Lythraceae). Journal of Ecology, 89, 339–350.
- (2003) Intra-clonal variation and a similarity threshold for identification of clones: application to Salix exigua using AFLP molecular markers. Theoretical and Applied Genetics, 106, 1307–1315.
- (1987) Patterns of genotypic diversity in clonal plant-species. American Journal of Botany, 74, 123–131.
- (1997) Gene dispersal and spatial genetic structure. Evolution, 51, 672–681.
- (1972) Diversity: a sampling study. American Naturalist, 106, 293–310.
- (2000) Genetic diversity of North American populations of Cristatella mucedo, inferred from microsatellite and mitochondrial DNA. Molecular Ecology, 9, 1375–1389.
- (1912) Variabilità e mutabilità. In: Studi Economico-Giuridici Facolta de Giurisprundenza dell’ Universita di Cagliari, A., Vol III, parte II.
- (2005) Testing for clonal propagation. Heredity, 94, 173–179.
- (1990) Clonal Growth in Plants: Regulation and Function. SPB Academic Publishers, The Hague, The Netherlands.
- (2005a) Admixed sexual and facultatively asexual aphid lineages at mating sites. Molecular Ecology, 14, 325–336.
- (2005b) Tackling the population genetics of clonal and partially clonal organisms. Trends in Ecology & Evolution, 20, 194–201.
- (2003) Genetic neighbourhood of clone structures in eelgrass meadows quantified by spatial autocorrelation of microsatellite markers. Heredity, 91, 448–455.
- (2002) Clonal diversity and structure within a population of the pondweed Potamogeton pectinatus foraged by Bewick's swans. Molecular Ecology, 11, 2137–2150.
- (1997) Probability of clonal identity: inferring the relative success of sexual versus clonal reproduction from spatial genetic patterns. Journal of Ecology, 85, 591–600.
- (1977) Population Biology of Plants. Academic Press, London.
- (1973) Diversity and evenness: a unifying notation and its consequences. Ecology, 54, 427–432.
- (1971) The nonconcept of species diversity: a critique and alternative parameters. Ecology, 52, 577–586.
- (2004) Genotypic variation in populations of the clonal plant Saxifraga cernua in the central and peripheral regions of the species range. Russian Journal of Ecology, 35, 413–416.
- (2003) Plant clonality, mutation, diplontic selection and mutational meltdown. Biological Journal of the Linnean Society, 79, 61–67.
- (2004) Genetic structure of the deep-sea coral Lophelia pertusa in the northeast Atlantic revealed by microsatellites and internal transcribed spacer sequences. Molecular Ecology, 13, 537–549.
- (2002) Estimating allelic richness: effects of sample size and bottlenecks. Molecular Ecology, 11, 2445–2449.
- (1995) Spatial genetic structure of a tropical understory shrub, Psychotria officinalis (Rubiaceae). American Journal of Botany, 82, 1420–1425.
- (1981) Population dynamics and local specialization in a clonal perennial (Ranunculus repens). I. The dynamics of ramets in contrasting habitats. Journal of Ecology, 69, 743–755.
- (2003) Rapid changes in clonal lines: the death of a ‘sacred cow’. Biological Journal of the Linnean Society, 79, 3–16.
- (2004) genotype and genodive: two programs for the analysis of genetic diversity of asexual organisms. Molecular Ecology Notes, 4, 792–794.
- (1990) Local genetic and clonal structure in the tropical terrestrial bromeliad, Aechmea magdalenae. American Journal of Botany, 77, 1201–1208.
- (2004) Clonal diversity, genetic structure, and mode of recruitment in a Prunus ssiori population established after volcanic eruptions. Plant Ecology, 174, 1–10.
- (2004) North Atlantic phylogeography and large-scale population differentiation of the seagrass Zostera marina L. Molecular Ecology, 13, 1923–1941.
- (1979) Ecological implications of clonal diversity in parthenogenetic morphospecies. American Zoologist, 19, 753–762.
- (1993) A study of spatial features of clones in a population of Bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 80, 537–544.
- (2006) genalex 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes, 6, 288–295.
- (1974) The measurement of species diversity. Annual Review of Ecology and Systematics, 5, 285–307.
- (1998) Identifying populations for conservation on the basis of genetic markers. Conservation Biology, 12, 844–855.
- (1966) Shannon's formulae as a measure of species diversity: its use and misuse. American Naturalist, 100, 463–465.
- (1969) An Introduction to Mathematical Ecology. Wiley-Interscience, New-York.
- (1975) Ecological diversity. New York, USA, 165pp.
- (1996) Genotypic diversity revealed by allozymes and oligonucleotide DNA fingerprinting in French populations of the aquatic macrophyte, Sparganium erectum. Molecular Ecology, 5, 251–258.
- (2001) New markers — old questions: population genetics of seagrasses. Marine Ecology — Progress Series, 211, 261–274.
- (1999) Differentiating between clonal growth and limited gene flow using spatial autocorrelation of microsatellites. Heredity, 83, 120–126.
- (1996) Estimators for pairwise relatedness and individual inbreeding coefficients. Genetical Research, 67, 175–185.
- (2000) Genetic differentiation between individuals. Journal of Evolutionary Biology, 13, 58–62.
- (2007) Spectrum of genetic diversity and networks of clonal populations. Journal of the Royal Society Interface.
- (1991) Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise. W.H. Freeman, New York.
- (1949) Measurements of diversity. Nature, 163, 688.
- (2005) Nonlinear processes in seagrass colonisation explained by simple clonal growth rules. Oikos, 108, 165–175.
- (1996) A consumer's guide to evenness measures. Oikos, 76, 70–82.
- (1995) Biometry. W.H. Freeman, New York.
- (2001) statistica (Data Analysis Software System), Version 6. http://www.statsoft.com.
- (2003) mlgsim: a program for detecting clones using a simulation approach. Molecular Ecology Notes, 3, 329–331.
- (1983) A genotypic diversity measure. Journal of Heredity, 74, 489–490.
- (1988) Genotypic diversity — estimation and prediction in samples. Genetics, 118, 705–711.
- (2004) Invasion dynamics of two alien Carpobrotus (Aizoaceae) taxa on a Mediterranean island: I. Genetic diversity and introgression. Heredity, 92, 31–40.
- (2000) Transglobal comparisons of nuclear and mitochondrial genetic structure in a marine polyploid clam (Lasaea, Lasaeidae). Heredity, 84, 321–330.
- (2002) The clonal theory of parasitic protozoa: 12 years on. Trends in Parasitology, 18, 405–410.
- (1990) A clonal theory of parasitic protozoa: the population structures of Entamoeba, Giardia, Leishmania, Naegleria, Plasmodium, Trichomonas, and Trypanosoma and their medical and taxonomical consequences. Proceedings of the National Academy of Sciences, USA, 87, 2414–2418.
- (2004) Universality and flexibility in gene expression from bacteria to human. Proceedings of the National Academy of Sciences, USA, 101, 3765–3769.
- (1998) Genetic structure of fissiparous populations of Holothuria (Halodeima) atra on the Great Barrier Reef. Marine Biology, 132, 141–151.
- (2003) Genetic structure of a population sample of apomictic dandelions. Heredity, 90, 326–335.
- (1997) Some aspects of the analysis of size spectra in aquatic ecology. Limnology and Oceanography, 42, 184–192.
- (1984) Diversity, biotic and similarity indices. A review with special relevance to aquatic ecosystems. Water Research, 18, 653–694.
- (2002) Breeding system, genetic diversity and clonal structure in the sub-alpine forb Rutidosis leiolepis F. Muell. (Asteraceae). Biological Conservation, 106, 71–78.