Geographic distance and ecosystem size determine the distribution of smallest protists in lacustrine ecosystems


  • Cécile Lepère,

    1. Laboratoire “Microorganismes: Génome et Environnement”, Clermont Université, Université Blaise Pascal, Clermont-Ferrand, France
    2. CNRS, UMR 6023, LMGE, Aubière, France
    3. INRA, UMR 42 CARRTEL, Thonon les bains, France
  • Isabelle Domaizon,

    1. INRA, UMR 42 CARRTEL, Thonon les bains, France
  • Najwa Taïb,

    1. Laboratoire “Microorganismes: Génome et Environnement”, Clermont Université, Université Blaise Pascal, Clermont-Ferrand, France
    2. CNRS, UMR 6023, LMGE, Aubière, France
  • Jean-François Mangot,

    1. Laboratoire “Microorganismes: Génome et Environnement”, Clermont Université, Université Blaise Pascal, Clermont-Ferrand, France
    2. CNRS, UMR 6023, LMGE, Aubière, France
    3. INRA, UMR 42 CARRTEL, Thonon les bains, France
  • Gisèle Bronner,

    1. Laboratoire “Microorganismes: Génome et Environnement”, Clermont Université, Université Blaise Pascal, Clermont-Ferrand, France
    2. CNRS, UMR 6023, LMGE, Aubière, France
  • Delphine Boucher,

    1. Laboratoire “Microorganismes: Génome et Environnement”, Clermont Université, Université Blaise Pascal, Clermont-Ferrand, France
    2. CNRS, UMR 6023, LMGE, Aubière, France
    Current affiliation:
    1. EA 4678 Conception, Ingénierie et Développement de l'Aliment et du Médicament, Clermont Université, Clermont-Ferrand, France
  • Didier Debroas

    Corresponding author
    1. CNRS, UMR 6023, LMGE, Aubière, France
    • Laboratoire “Microorganismes: Génome et Environnement”, Clermont Université, Université Blaise Pascal, Clermont-Ferrand, France

Correspondence: Didier Debroas, Clermont Université, Université Blaise Pascal, Laboratoire “Microorganismes : Génome et Environnement”, BP 10448, F-63000 Clermont-Ferrand, France. Tel.: +33 473407837; fax: +33 473407670; e-mail:

Abstract

Understanding the spatial distribution of aquatic microbial diversity and the mechanisms underlying differences in community composition is a central and challenging goal for ecologists. Recent insights into protistan diversity and ecology have intensified the debate over their spatial distribution. In this study, we investigate the importance of spatial and environmental factors in shaping the community structure of small protists in lakes. We analyzed small protist community composition (beta-diversity) and richness (alpha-diversity) at the regional scale with two molecular methods targeting the 18S rRNA gene (T-RFLP and 454 pyrosequencing). Our results show a distance–decay pattern for rare and dominant taxa, and the spatial distribution of the latter followed the prediction of island biogeography theory. Furthermore, geographic distance between lakes seems to be the main force shaping protist community composition in the lakes studied here. Finally, the spatial distribution of protists is discussed at the global scale (11 lakes distributed worldwide) by comparing these results with those available in public databases. UniFrac analysis showed that 18S rRNA gene OTU compositions differed significantly among most of the lakes, and this difference does not seem to be related to trophic status.

Introduction

A main objective in ecology is to understand the dynamics of biodiversity and, more particularly, the spatial distribution of species. The existence of biogeographic patterns has been highlighted in microbial ecology mainly by distance–decay or taxa–area relationships (Green & Bohannan, 2006). The spatial distribution of microbial eukaryotes (i.e. protists) has received much less attention than that of bacteria, and studies have concentrated mainly on single taxonomic groups. However, their total diversity and distribution in nature are currently the focus of active debate (Finlay & Fenchel, 2004; Foissner, 2006; Pinel-Alloul & Ghadouani, 2007; Caron, 2009; Nolte et al., 2010), and no consensus has been reached yet. Opposing views assert either that this highly diverse group of organisms has a cosmopolitan distribution (Finlay & Fenchel, 2004; Finlay et al., 2006; Pither, 2007), especially microorganisms smaller than 20 μm according to Yang et al. (2010), or that it is composed primarily of species with limited geographic distributions (Papke & Ward, 2004; Telford et al., 2006). The conclusions of biogeographic studies on microorganisms therefore seem to vary with the group targeted, but also with the methods used for delineating ‘microbial species’ and/or with the spatial scale (Lindstrom & Langenheder, 2012). The importance of the methods used was, for instance, underlined by Taylor et al. (2006) in a study on fungal strains that showed a global distribution by morphological inspection and endemicity by phylogenetic recognition. In addition, most studies might have undersampled microbial communities, and novel high-throughput sequencing technologies might provide the tools necessary to explore microbial diversity in greater depth (Pedros-Alio, 2006; Sogin et al., 2006). Microbial community surveys have likely underestimated the slope of the taxa–area relationship, and the change in the abundance of the rarest taxa with increasing area was not detectable. Finally, some works did not take environmental parameters into account (see the review by Martiny et al., 2006), and there was no clear distinction between spatial variation due to present-day environmental factors and that due to historical contingencies.

As a model to understand the spatial distribution of microbial species in nature, we studied small protists (0.2–5 μm; i.e. unicellular eukaryotic organisms from algae to heterotrophic flagellates, including unicellular fungi (Caron, 2009)) and their distribution in lakes. Like bacteria, these small protists are tiny and likely to disperse easily (e.g. dormant cells can survive transport) and could therefore have a cosmopolitan distribution, in line with the classical dictum ‘everything is everywhere, but the environment selects’ (Baas-Becking, 1934). However, protists constitute complex assemblages, and one can wonder whether the concept of biogeography is applicable to a small-sized, heterogeneous group that is diverse in physiology, life cycle, and phylogenetic position, that can reproduce sexually, and whose members probably differ in their capacity for dispersal and colonization (e.g. cysts and endospores are easily transported in the atmosphere or in aerosols).

The recent application of molecular approaches to assess the diversity of natural microbial assemblages, mainly in marine environments, has revealed an unexpected diversity, undescribed taxa, and new lineages among these small protists (e.g. Lopez-Garcia et al., 2001), suggesting that they could be as diverse as bacteria. A few recent studies conducted in lakes have also highlighted the wide diversity of 18S rRNA gene sequences affiliated with numerous phylogenetic groups involved in photosynthetic and heterotrophic processes but also in parasitism (e.g. Lefranc et al., 2005; Richards et al., 2005; Lepère et al., 2007, 2008). The application of novel high-throughput sequencing technologies (e.g. 454 pyrosequencing) has revealed a greater diversity within microbial communities (e.g. Sogin et al., 2006) and gives access to the rarest phylotypes. However, these methods have rarely been used for microbial eukaryotes, especially in lacustrine ecosystems (e.g. Monchy et al., 2011).

In the present study, the composition and richness of small protists were assessed at a regional scale (six French lakes) with two molecular methods targeting the 18S rRNA gene: T-RFLP and 454 pyrosequencing. We tested (1) how protist richness (alpha-diversity) changes with lake area and (2) how similarity in protist community composition (beta-diversity) changes with the geographic distance between lakes. Following Martiny et al. (2006), we expected the environmental factors tested (local physical, chemical, and biological parameters) to have significant effects in shaping microbial composition at the regional scale, whereas distance (dispersal limitation) should be the main structuring factor across continents.

Materials and methods

Studied lakes

The study was conducted in six lakes located in two regions of France (Massif Central and Alps) and described in Table 1 (Lakes Godivelle, Aydat, Pavin, and Bourget; Sep and Villerest reservoirs). The lakes are separated from each other by 133 km on average, and the most distant pair is 400 km apart (Aydat-Bourget). Samples from the six lakes were taken monthly during the thermal stratification period from April to August (from 2002 and 2005). Water samples from the epilimnion (1–5 m) were collected with a Van Dorn bottle at a permanent station located at the deepest point of the lake. Water samples [100 to 120 mL (the maximum volume that can be filtered without clogging the filters)] were successively filtered through 5-μm (prefiltration step) and 0.2-μm-pore-size polycarbonate filters (Millipore) and stored at −80 °C until nucleic acid extraction. T-RFLP, a fingerprinting method, was used to examine shifts in the structure of the protist community at each site over 4 months (thermal stratification period), whereas pyrosequencing was used to analyze in depth one sample from each of the six lakes. Samples for determining the abundances of microorganisms were collected and fixed immediately as described in Lepère et al. (2007). The environmental parameters listed in Table S1 were measured in the six lakes as described in Lepère et al. (2006).

Table 1. Main characteristics of the lakes sampled in this study
Lake        Trophic status       Coordinates              Area (km²)  Vol (km³)  Max depth (m)  Mean depth (m)  Altitude (m)
Godivelle   Ultra-oligotrophic   45°23′04″N, 2°55′25″E    0.138       0.003      44             a               1239
Pavin       Oligomesotrophic     45°29′45″N, 2°53′18″E    0.44        0.002      98             54.9            1197
Sep         Oligomesotrophic     46°02′51″N, 3°02′47″E    0.33        0.005      37.4           14.2            414
Bourget     Mesotrophic          45°43′55″N, 5°52′06″E    44.5        3.6        145            81              231.5
Aydat       Eutrophic            45°39′50″N, 2°59′04″E    0.65        0.004      15.5           7.4             825
Villerest   Hypereutrophic       45°59′36″N, 4°2′12″E     7           0.062      45             18              257

Molecular methods


Nucleic acid extraction was carried out as described in Lefranc et al. (2005), and extracts were stored at −20 °C until analysis.

T-RFLP analysis

PCR, enzymatic digestions (MspI and RsaI), and analyses of terminal restriction fragments (T-RFs) were performed as described in Lepère et al. (2006). Briefly, samples were analyzed in triplicate, and a T-RF (size between 48 and 560 bp, with a peak area > 50 fluorescence units) was included in the analysis if it occurred in at least two profiles. To account for small differences in running time among samples, fragments from different profiles that differed by < 1 bp were considered to be the same length. A program in Visual Basic for Excel was developed to automate these procedures and was validated previously (e.g. Lepère et al., 2006). In all lakes, MspI was the more discriminating enzyme in terms of richness; we therefore present only the data obtained with MspI. The results were expressed either as presence or absence, or as the relative percentage of each peak area compared with the total area.
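The filtering rules described above can be illustrated with a minimal Python sketch. It is not the authors' Visual Basic program: the function names, the greedy binning of fragment sizes, and the averaging of peak areas over the replicates in which a fragment occurs are assumptions made for illustration, and the replicate profiles are invented toy values.

```python
def consensus_trfs(replicates, min_size=48.0, max_size=560.0,
                   min_area=50.0, tol=1.0, min_profiles=2):
    """Return consensus T-RFs as (size_bp, area) pairs from replicate profiles.

    replicates: list of profiles, each a list of (size_bp, peak_area) tuples.
    Thresholds follow the text; the binning strategy is an assumption.
    """
    # Keep only peaks inside the size window and above the area threshold.
    peaks = []
    for idx, profile in enumerate(replicates):
        for size, area in profile:
            if min_size <= size <= max_size and area > min_area:
                peaks.append((size, area, idx))
    peaks.sort()

    # Greedy binning: a peak joins the current bin when it lies less than
    # `tol` bp from the first peak of that bin.
    bins = []
    for peak in peaks:
        if bins and peak[0] - bins[-1][0][0] < tol:
            bins[-1].append(peak)
        else:
            bins.append([peak])

    # Keep bins observed in at least `min_profiles` replicate profiles.
    consensus = []
    for members in bins:
        profiles_hit = {idx for _, _, idx in members}
        if len(profiles_hit) >= min_profiles:
            mean_size = sum(s for s, _, _ in members) / len(members)
            mean_area = sum(a for _, a, _ in members) / len(profiles_hit)
            consensus.append((round(mean_size, 1), mean_area))
    return consensus


def relative_areas(consensus):
    """Express each consensus T-RF as a percentage of the total area."""
    total = sum(area for _, area in consensus)
    return [(size, 100.0 * area / total) for size, area in consensus]


# Toy triplicate (hypothetical sizes in bp and peak areas).
replicates = [
    [(100.0, 200), (250.2, 80), (40.0, 500)],
    [(100.4, 220), (300.0, 60)],
    [(100.2, 180), (250.6, 100), (600.0, 900)],
]
for size, percent in relative_areas(consensus_trfs(replicates)):
    print(f"{size} bp: {percent:.1f}% of total area")
```

A presence/absence table is obtained by keeping only the sizes returned by `consensus_trfs`.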

454 Pyrosequencing

The V4-V5 hypervariable regions of the eukaryotic 18S rRNA gene were amplified with Ek-NSF573 (5′-CGCGGTAATTCCAGCTCCA-3′) and EK-NSR1147 (5′-CCGTCAATTYYTTTRAGTTT-3′) (Wuyts et al., 2004). To discriminate each sample, a 5-bp multiplex tag was coupled with adaptor A. The amplification mix contained 30 ng of genomic DNA, 200 μM of deoxynucleoside triphosphates (Bioline, London, UK), 2 mM MgCl2 (Bioline), 10 pmol of each primer, 1.5 U of Taq DNA polymerase (Bioline), and the PCR buffer. The cycling conditions were an initial denaturation at 94 °C for 10 min followed by 30 cycles of 94 °C for 1 min, 57 °C for 1 min, and 72 °C for 1 min 30 s, and a final 10-min extension at 72 °C. The pyrosequencing data, representing 131 869 raw sequence reads, were cleaned and analyzed by the method described in Data S1. After stringent quality filtering and the removal of OTUs (operational taxonomic units) affiliated with metazoan and Streptophyta taxa, which are not small protists, a total of 89 337 sequences averaging ≥ 200 bp in length were selected for studying small protists (raw data have been deposited in Dryad). The pyrosequencing reads were clustered into OTUs at a threshold of 95%. The cut-off used for defining an OTU is always debatable; however, the 95% threshold has been shown to be appropriate for approximating species-level distinction (Caron, 2009; Mangot et al., 2013). The OTUs were compared against our reference database (details in Data S1) with USEARCH (Edgar, 2010), and each sequence was assigned to a phyletic group, together with its five best hits, following the taxonomy of its best hit. Once assigned to phyletic groups, homologous reads were aligned with the reference sequences of the corresponding profile using HMMalign (Eddy, 1998). FASTTREE (Price et al., 2009) was used to build phylogenetic trees for each phyletic profile with the Jukes-Cantor + CAT model and a bootstrap threshold of 100 (the trees have been deposited in Dryad).

Data analyses

Lakes can be considered as islands within a ‘sea’ of land (Dodson, 1992). Therefore, to describe the inter-lake distribution of protist diversity, we tested the theory of island biogeography (MacArthur & Wilson, 1967), which proposes that the size and distance of an island (lake) determine its richness/diversity (Reche et al., 2005; Smith et al., 2005; Logue et al., 2012). In this study, the Margalef index (Hill et al., 2003), which can be computed for both rare and dominant taxa, was used to quantify richness (alpha-diversity). The taxa–area relationship (TAR) (Green & Bohannan, 2006) was calculated using S ∝ A^z, where S is the richness and A is the area (Table 1). The slope, z, was calculated after log transformation of the data. To assess the independent effect of the factors tested, partial Spearman's rank correlation analyses were performed. This test measures the degree of association between two variables (e.g. richness vs. area) while a third variable is controlled (e.g. environmental factors). Environmental factors were introduced into the partial correlation as a single variable, namely the scores on the first axis of a principal component analysis (PCA) computed with NH4-N, NO3-N, PO4-P, temperature, water clarity (Secchi disk), chlorophyll a, and the abundances of prokaryotes, heterotrophic nanoflagellates (HNF), and zooplankton (cladocerans, copepods, and rotifers) (Table S1).
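As an illustration of the two richness quantities introduced above, the following Python sketch computes a Margalef index and the TAR slope z. It assumes the usual form of the Margalef index, D = (S − 1)/ln(N), with S taxa and N individuals (reads or summed peak area), and an ordinary least-squares fit of log10(S) against log10(A); the function names and the toy data are hypothetical and are not the study's values.

```python
import math


def margalef_index(counts):
    """Margalef richness D = (S - 1) / ln(N) from per-taxon counts."""
    present = [c for c in counts if c > 0]
    return (len(present) - 1) / math.log(sum(present))


def tar_slope(areas, richness):
    """Least-squares slope z and intercept of log10(S) against log10(A)."""
    x = [math.log10(a) for a in areas]
    y = [math.log10(s) for s in richness]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    z = sxy / sxx
    return z, mean_y - z * mean_x


# Toy data generated from S = 20 * A**0.1, so the fitted slope recovers z = 0.1.
areas = [0.1, 1.0, 10.0, 100.0]
richness = [20 * a ** 0.1 for a in areas]
z, log_c = tar_slope(areas, richness)
print(f"z = {z:.3f}, c = {10 ** log_c:.1f}")
```

Because S ∝ A^z becomes log S = log c + z log A after transformation, z is simply the slope of the fitted line.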

To test the effects of geographic distance vs. environment on assemblage composition (beta-diversity), Mantel (Mantel, 1967) and partial Mantel tests were performed (Martiny et al., 2006) on the T-RFLP data from the time series and on the pyrosequencing data. Although the definition of dominant and rare taxa can differ according to the method used, we defined rare taxa as those representing < 1% (of total area or reads), following Pedros-Alio (2006). Correlations were computed between a Bray–Curtis community dissimilarity matrix (24 × 24 for T-RFLP and 6 × 6 for pyrosequencing data), a matrix of pairwise Euclidean distances between lake environments (variables included in Table S1), and a matrix of spatial proximity (distance in kilometers between pairs of lakes). The beta-diversity of protists was also assessed from the UniFrac distance (Lozupone & Knight, 2005) calculated from the phylogenies obtained with the pyrosequencing approach (Data S1). We computed Good's coverage index (CGood; Good, 1953) for the pyrosequencing data of each sample; this index varied between 96.7% (Lake Bourget) and 99.2% (Lake Pavin). These high values indicate good coverage of the diversity.
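The simple Mantel test described above can be sketched in a few lines of Python. This is a generic permutation implementation under stated assumptions (Pearson correlation between the upper triangles of the two matrices, joint permutation of rows and columns, one-sided P for a positive association); it does not cover the partial Mantel test, and the site positions and community profiles are invented toy values rather than the study's data.

```python
import math
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: observed r and permutation P value."""
    identity = list(range(len(d1)))
    y = upper_triangle(d2, identity)
    r_obs = pearson(upper_triangle(d1, identity), y)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper_triangle(d1, order), y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: six sites along a transect (positions in km) whose
# communities change with position.
positions = [0, 10, 20, 200, 210, 220]
profiles = [[10, 5, 0, 0], [9, 6, 1, 0], [8, 6, 2, 0],
            [1, 2, 9, 6], [0, 2, 10, 6], [0, 1, 10, 8]]
geo = [[abs(a - b) for b in positions] for a in positions]
com = [[bray_curtis(u, v) for v in profiles] for u in profiles]
r, p = mantel(com, geo)
print(f"Mantel r = {r:.2f}, P = {p:.3f}")
```

With only six sites there are 720 possible relabellings, so the smallest attainable P value is limited; this is worth keeping in mind when reading tests on 6 × 6 matrices.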

Results

Composition of the small protist community

In the six lakes studied, eight T-RFs on average represent more than 60% of the total area, and rare T-RFs account on average for 90% of the total number of T-RFs (Fig. S1). The pyrosequencing data showed that the number of dominant OTUs ranged from 11 (Lake Aydat) to 26 (Lake Godivelle) and the number of rare OTUs from 281 (Lake Aydat) to 405 (Lake Godivelle) (Fig. S2). For the T-RFLP data, dominant T-RFs ranged from 7.8 (Lake Pavin) to 13 (Lake Bourget) and rare T-RFs from 71 to 162 in the same lakes. These data suggest that rare taxa account for most of the diversity of small protists.
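The split into dominant and rare taxa and the coverage index used in this study can be sketched as follows. The sketch assumes that every taxon not classed as rare (< 1% of reads or area) is dominant, and uses the standard form of Good's coverage, C = 1 − n1/N, where n1 is the number of taxa seen exactly once and N the total number of reads; the function names and the read counts are hypothetical.

```python
def split_dominant_rare(counts, threshold=0.01):
    """Split taxa into dominant (>= threshold) and rare (< threshold) by relative abundance."""
    total = sum(counts.values())
    dominant = {t: c for t, c in counts.items() if c / total >= threshold}
    rare = {t: c for t, c in counts.items() if c / total < threshold}
    return dominant, rare


def goods_coverage(counts):
    """Good's coverage: 1 - (number of singleton taxa / total reads)."""
    total = sum(counts.values())
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / total


# Toy read counts per OTU (1000 reads in total).
counts = {"otu1": 600, "otu2": 380, "otu3": 9, "otu4": 5,
          "otu5": 3, "otu6": 1, "otu7": 1, "otu8": 1}
dominant, rare = split_dominant_rare(counts)
print(len(dominant), "dominant OTUs;", len(rare), "rare OTUs")
print(f"coverage = {100 * goods_coverage(counts):.1f}%")
```

In this toy sample two OTUs carry 98% of the reads while six of the eight OTUs are rare, the same qualitative picture as the one reported for the lakes.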

Protists alpha-diversity

The mean number of total T-RFs in the euphotic zone varied from 83 in Lake Godivelle (oligotrophic) to 175 in Lake Bourget (mesotrophic). Lakes Pavin, Sep, Aydat, and Villerest had mean total T-RF numbers of 134, 136, 163, and 155, respectively. The pyrosequencing results were different, because the highest richness was detected in the most eutrophic system (Godivelle = 28.94, Pavin = 21.78, Sep = 25.44, Bourget = 27.72, Aydat = 18.9, and Villerest = 32.19), with a mean range of fluctuation of 17.2%. However, a similar pattern was obtained when considering only the richness index computed from pyrosequencing data restricted to dominant OTUs (Godivelle = 3.05, Pavin = 2.41, Sep = 2.16, Bourget = 3.12, Aydat = 1.59, and Villerest = 2.52).

To test the prediction of island biogeography theory, richness data (estimated by the two molecular methods) were analyzed according to the area of the six lakes (Table 2). The inter-lake variations of total, dominant, and rare T-RFs were significantly explained by area (P < 0.001). Area was also related to the richness of OTUs determined from pyrosequencing data, but only for the dominant ones (Table 2). To test whether these relationships were due to environmental factors, we computed partial correlations, which assess the relationship while environmental factors are held constant. The environmental parameters (Table S1) were represented by the first axis of a principal component analysis (PCA), which synthesizes their variation. The first PCA axis represented 38.0% of total variance for the T-RFLP data set (24 samples) and 46.1% for the pyrosequencing data set (six samples). The partial correlations were also significant, supporting the conclusion that the effect measured can be attributed mainly to area. When the taxa–area relationships were significant, the slope ‘z’, a measure of the rate of species turnover across space, varied between 0.06 and 0.10 for the fingerprinting method and was equal to 0.14 for high-throughput sequencing (Table 2).
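The partial correlation used here can be sketched in Python with the standard first-order formula applied to Spearman coefficients, r_xy.z = (r_xy − r_xz r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)). The sketch takes the scores on the first PCA axis as a given input (the PCA itself is not reproduced), and the variable names and all values below are hypothetical toy data, not the measurements of this study.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation between x and y with z held constant."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy data: lake area, richness, and scores on a first PCA axis.
area = [0.1, 0.3, 0.4, 0.7, 7.0, 44.0]
richness = [80, 130, 135, 160, 150, 175]
pc1 = [1.2, -0.4, 0.3, -1.0, 0.8, -0.9]
print(f"rs = {spearman(area, richness):.2f}")
print(f"partial rs = {partial_spearman(area, richness, pc1):.2f}")
```

When the partial coefficient stays close to the simple one, as reported in Table 2, the controlled variable explains little of the richness–area association.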

Table 2. Relationship between the richness of small protists [from T-RFLP (T-RFs) and pyrosequencing (OTUs) data] and lake area (calculated from the six French lakes). Data are log-transformed; the degrees of freedom associated with T-RFs (six lakes × four dates) and OTUs are 22 and 4, respectively. Probability was computed from a one-sided test
Richness     Slope (z)   rs (a)   P         Partial rs (b)   P
T-RFs
  Total      0.09        0.79     < 0.001   0.79             < 0.001
  Dominant   0.06        0.26     < 0.05    0.33             < 0.05
  Rare       0.10        0.79     < 0.001   0.80             < 0.001
OTUs
  Dominant   0.14        0.83     < 0.05    0.83             < 0.01
a. rs: Spearman correlation.
b. Partial Spearman correlation was computed with environmental parameters (first axis of a PCA) as constant.

Protists beta-diversity

In terms of beta-diversity, Table 3 shows the similarity in protist community composition (PCC) between each pair of lakes for the dominant and rare populations obtained from T-RFLP and pyrosequencing. The two methods are apparently not congruent: for example, Lakes Aydat and Pavin shared 21.7% of dominant T-RFs, whereas a value of 3.3% was obtained by high-throughput sequencing. Similarly, the proportion of rare OTUs shared by each pair of lakes was lower than the proportion of shared rare T-RFs. However, the same pattern was found with both methods: lakes shared more rare populations than dominant ones. The beta-diversity assessed from the UniFrac distance (the fraction of the total branch length in the phylogeny that is unique to any one environment) shows that the distances between lakes obtained with pyrosequencing were significantly different (P < 0.001; Table S2).
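The definition of the UniFrac distance given above translates directly into code. The following Python sketch computes the unweighted distance between two environments from a tree supplied as a flat list of branches, each with its length and the set of environments that have at least one sequence below it; this input representation, the function name, and the toy tree are assumptions for illustration, and the permutation test behind the reported P value is not implemented.

```python
def unifrac(branches, env_a, env_b):
    """Unweighted UniFrac distance between two environments.

    branches: list of (length, environments) pairs, where `environments` is
    the set of environments with at least one sequence below that branch.
    Branches leading to neither environment are ignored.
    """
    unique = 0.0
    shared = 0.0
    for length, environments in branches:
        in_a = env_a in environments
        in_b = env_b in environments
        if in_a and in_b:
            shared += length
        elif in_a or in_b:
            unique += length
    return unique / (unique + shared)


# Toy tree with three environments A, B, and C.
branches = [
    (1.0, {"A"}),
    (1.0, {"B"}),
    (2.0, {"A", "B"}),
    (0.5, {"A"}),
    (0.5, {"C"}),
]
print(f"UniFrac(A, B) = {unifrac(branches, 'A', 'B'):.3f}")
```

A distance of 0 means that every branch is shared by the two environments, and a distance of 1 means that their sequences occupy entirely separate parts of the tree.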

Table 3. (a) % of common dominant (left panel) and rare (right panel) T-RFs for each pair of lakes (b) % of common dominant (left panel) and rare (right panel) OTUs for each pair of lakes
(a) T-RFs
(b) OTUs (95%) from pyrosequencing

The relative importance of local environmental factors and spatial distance for the distribution of protists in the six French lakes was analyzed with a Mantel test based on T-RFLP (time series) and pyrosequencing (single sampling) data (Table 4). The analysis showed a highly significant effect of the geographic distance between sites (spatial distribution at the regional scale) for both rare and dominant T-RFs/OTUs, whereas the environmental variables examined do not seem to be significantly involved in the distribution of protists.

Table 4. Effects of distance and environmental variables on protist community composition at regional scale
Scale                      Mantel r   P         Partial Mantel r (a)   P
Spatial distribution       0.53       < 0.01
Environmental variables    −0.15      0.87      −0.17                  0.91
Spatial distribution       0.42       < 0.001
Environmental variables    0.
Spatial distribution       0.49       < 0.01
Environmental variables    −0.11      0.76      −0.12                  0.81
OTUs from pyrosequencing
Spatial distribution       0.83       0.05
Environmental variables    −0.24      0.73      0.05                   0.42
Spatial distribution       0.35       0.03      0.35                   0.13
Environmental variables    0.19       0.23
Spatial distribution       0.84       0.03      0.18                   0.33
Environmental variables    −0.18      0.66
a. The partial Mantel test holds spatial distribution constant.
Values in bold are significant at P < 0.005 (probability based on 999 permutations).
The environmental variables used at regional scale are reported in Table S1.

Discussion

Here, we analyzed, for the first time to our knowledge, the alpha- and beta-diversity of protists in relation to geographic distances, lake areas, and habitat variables using two molecular methods, including high-throughput sequencing.

Artifacts of taxonomic lumping, undersampling, and unequal sampling can lead to incorrect conclusions about the spatial scaling of microbial biodiversity (Martiny et al., 2006). Therefore, in our study, temperate lakes characterized by stable summer stratification were selected across a regional scale and sampled in the epilimnetic zone. Moreover, our results are based on an equal sampling effort and/or a time series, which allowed us to take temporal variations in the community into account when analyzing both spatial distribution and environmental effects. Nolte et al. (2010) showed that seasonal abundance patterns of protists closely match their biogeographic distribution; temporal sampling is therefore essential for adequate estimates of diversity and species richness. Finally, to avoid biases as much as possible, all lakes were sampled in the same way, and two molecular methods were used to study the spatial pattern of these microorganisms, because different molecular methods (fingerprinting, sequencing, etc.) can give different views of spatial patterns (Cho & Tiedje, 2000). In this study, the differences between T-RFLP results and OTU determination can be due in part to the fact that a T-RF can correspond to diverse phylogenetic levels. It is clear that most studies of protist diversity have so far greatly undersampled it. The application of novel high-throughput sequencing technologies, such as 454 pyrosequencing, has revealed a great diversity within bacterial and archaeal communities and given access to the rarest OTUs (e.g. Sogin et al., 2006; Galand et al., 2009). However, these methods have so far been used for microbial eukaryotes in very few studies of lacustrine ecosystems (e.g. Monchy et al., 2011). High-throughput pyrosequencing does address methodological shortcomings by recovering uncommon and rare species, but the short read lengths of 454 sequences make it necessary to rely on existing long rRNA gene sequences to establish taxonomic identities (Edgcomb et al., 2011). Concerns also remain about the role that sequencing errors may play in producing a distorted picture of the true complexity/richness of microbial communities (Kunin et al., 2010). Analyses of pyrosequenced SSU rDNA fragments amplified from an artificial bacterial community suggested that sequencing errors result in richness estimates that are at least one order of magnitude too high (Quince et al., 2009). However, this problem has been largely alleviated by computational tools that distinguish and filter out erroneous sequences (Quince et al., 2009).

Spatial patterns of small protists were not linked to environmental factors

According to the cosmopolitan view of the microbial world, one might expect to find similar microbial community structure (richness, diversity, and composition) in similar habitats and differentiated microbial communities along an environmental gradient (Green & Bohannan, 2006). The highest richness among the dominant populations, found in the mesotrophic lake (Lake Bourget), could therefore be explained by its large area as well as by its intermediate trophic status, although most studies of protists (e.g. nanoflagellates) have not shown strong evidence of a link between diversity and trophic status (Arndt et al., 2000; Auer & Arndt, 2001). However, phytoplankton and bacterial studies have reported that oligotrophic and eutrophic lakes have lower diversity than mesotrophic lakes (Dodson et al., 2000; Horner-Devine et al., 2003). Experimental results have also shown that phytoplankton diversity follows a hump-shaped progression along a gradient of eutrophication. This variation in diversity would reflect a compromise between competition, predation, and accessibility of resources, as well as many other ecological processes (e.g. Leibold, 1999). In this study, the use of partial correlations and the Mantel test made it possible to estimate the impact of geographic distance vs. environmental conditions on assemblage composition (Martiny et al., 2006), factors that are rarely taken into account simultaneously. The results clearly showed that variations in alpha- and beta-diversity were significantly influenced by the geographic distance between lakes or by lake area rather than by the environmental factors analyzed, such as bottom-up factors (i.e. nutrients) or potential predators (i.e. HNF and metazooplankton). However, even though the parameters explored are those commonly considered when attempting to explain the spatial partitioning of aquatic microorganisms, we did not analyze all potential controlling factors. For example, Schiaffino et al. (2011) showed that light penetration and DOC had a structuring effect on microorganism populations in 45 lakes. Overall, our results contradict the hypothesis of a general microbial cosmopolitanism. Such patterns have already been observed for bacterial communities (Reche et al., 2005; Martiny et al., 2011). In addition, the analysis performed with all T-RFLP data (monthly sampling) showed that temporal variations in the composition of the small protist community do not affect the importance of geographic distance. We can therefore hypothesize that a single sampling per lake (pyrosequencing) could be enough to analyze the spatial distribution of lacustrine protists at different spatial scales.

Distribution of dominant small protists was linked to lake area and distance between lakes

MacArthur and Wilson's theory of island biogeography is among the best-known process-based explanations for the distribution of species richness (alpha-diversity) and has been applied in studies of microbial biogeography (e.g. Reche et al., 2005; Logue et al., 2012). It helps in understanding the taxa–area relationship, a fundamental pattern in ecology and an essential tool for conservation. Most of what we know of taxa–area curves is derived from analyses of terrestrial systems, but lakes and ponds can be considered as discrete habitats with definable borders that are comparable in some ways to oceanic islands (Dodson, 1992). This theory has indeed already been tested for zooplankton, phytoplankton, and bacterial communities in lakes (Dodson, 1992; Smith et al., 2005; Reche et al., 2005; Logue et al., 2012). In our study, the linear relationships observed between lake area and protist richness determined by the fingerprinting method were significant for total, dominant, and rare T-RFs, whereas the same relationships determined by pyrosequencing involved mainly the dominant species (i.e. OTUs). The TAR determined in this study varied with the molecular method used, and therefore with the taxonomic resolution, as already highlighted (e.g. Zhou et al., 2008). However, both methods suggested that the spatial distribution of dominant protists follows the TAR of island biogeography theory.

A significant TAR was recently found for the richness of phytoplankton (Smith et al., 2005) as well as bacteria (Reche et al., 2005). However, as emphasized above, this theory seems restricted to the dominant taxa (i.e. OTUs) and to a specific range of lake areas. The slopes of the TAR also varied with the method used, as already shown (Zhou et al., 2008). For comparison with published data, we focused on the ‘z’ values determined at the ‘OTU’ level. The z value determined with pyrosequencing (0.14) is quite far from the value determined for ectomycorrhizal fungi (Peay et al., 2007) but is close to those found for other planktonic organisms such as zooplankton [z = 0.094 (Dodson, 1992)] and phytoplankton (z = 0.114, Smith et al., 2005). For microorganisms, the lowest ‘z’ values were recorded for benthic ciliates (z = 0.043, Finlay et al., 1998) and bacteria (z = 0.040, Horner-Devine et al., 2004). Finally, for dominant populations, our results contradict the advocates of a cosmopolitan distribution of microorganisms, which implies that microorganisms should be characterized by a flat TAR. Moreover, although richness (alpha-diversity) seems to be constant for total OTUs (dominant and rare), PCC shows important changes. Therefore, even though the pyrosequencing data indicated that alpha-diversity did not vary with habitat size, beta-diversity was strongly associated with distance for total, dominant, and rare OTUs. Hillebrand et al. (2001) also showed a distance–decay relationship for diatoms and ciliates, but these authors did not consider the putative effects of environmental parameters.

Biogeographic patterns of the rarest protists taxa

The other component of the community, named ‘the rare biosphere’, comprises a very high number of rare species that account for most of the diversity (Sogin et al., 2006). Our results show that, while the TAR cannot be applied to rare OTUs (pyrosequencing data), composition (beta-diversity) was linked to geographic distance for the rarest T-RFs and OTUs. Thus, the composition of the rare community varies between ecosystems, but its richness does not. Similarly, Galand et al. (2009) reported that the rare biosphere of the archaeal community followed patterns similar to those of the most abundant members of the community and has a biogeographic pattern. Because rare OTUs are taxa with low abundance, we assume that rare taxa could be more affected by dispersal limitation, as their probability of immigrating and growing in a new ecosystem is lower than that of abundant taxa (Green et al., 2004; Weisse, 2008). The ‘rare biosphere’ has traditionally been thought to indicate the presence of a seed bank of potential new colonizers, according to the ‘everything is everywhere’ hypothesis. However, the Mantel test showed no correlation between environmental variables and the rare T-RFs/OTUs. Galand et al. (2009) showed that regardless of spatial or temporal scales, most of the rare phylotypes are always rare within an ecosystem, and the few rare phylotypes that are sometimes detected as abundant represent traces of phylotypes that are highly abundant in some habitats.

At a larger geographic scale: what do we learn from public databases?

To extend our discussion to a more global scale (across continents), we used sequences from the same planktonic size fraction available in public databases. These sequences were obtained by the traditional cloning–sequencing method (11 lakes worldwide). Although sampling with this approach is far from exhaustive and many more taxa (especially rare ones) might be present at the sampling sites, statistical analysis (UniFrac analysis) showed that the 18S rRNA gene OTU compositions differed significantly among most lakes, and this difference does not seem to be related to trophic status at any spatial scale (data not shown). Such results were expected because the effect of dispersal limitation would occur at the largest geographic scales (across continents) rather than at regional scales, where environmental selection could theoretically be predominant (Martiny et al., 2006).

At the global scale, a significant linear relationship with lake area (z = 0.24, P < 0.001, n = 9) is obtained, but only when using the smaller lake areas (lower than 114 km², corresponding to lakes Bourget and George; Fig. 1). Fig. 1 shows a type IV curve, typically calculated as a linear regression on a log–log scale from samples of island-like habitats (e.g. lakes), with possible decreasing properties (Scheiner, 2003). In a recent study of bacterial richness in 14 Swedish lakes, Logue et al. (2012) showed, by analyzing 454 pyrosequencing data, that taxa–area relationships were negative. These data suggest that the island model is not restricted to a positive TAR and that the range of ecosystem areas chosen must be large enough to assess the model type.

Figure 1.

Relationship between lake area and richness expressed by the Margalef index (Dmg), calculated from cloning–sequencing data (six French lakes + five worldwide lakes = Baikal (JN547261-JN547327), George (AY919677-AY919829), Tanganyika (GU290066-GU290116), Zixia (FJ939033-FJ939124), Xuanwu (FJ939033-FJ939124)).
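For readers unfamiliar with the richness measure used in Fig. 1, the Margalef index is commonly defined as Dmg = (S − 1) / ln N, where S is the number of taxa and N the total number of individuals, which for a clone library we take to be the number of OTUs and the number of clones. This is a minimal sketch under that assumption; the function name `margalef_index` and the clone counts per OTU are hypothetical and are not data from this study.

```python
import math

def margalef_index(counts):
    """Margalef richness index Dmg = (S - 1) / ln(N).

    counts: number of clones (or sequences) assigned to each OTU.
    OTUs with a zero count are ignored.
    """
    observed = [c for c in counts if c > 0]
    n_taxa = len(observed)       # S, number of OTUs detected
    n_total = sum(observed)      # N, total number of clones
    if n_total <= 1:
        raise ValueError("Margalef index needs more than one individual")
    return (n_taxa - 1) / math.log(n_total)

# Hypothetical clone library: 5 OTUs and 20 clones in total.
example_counts = [10, 5, 3, 1, 1]
dmg = margalef_index(example_counts)
print(f"Dmg = {dmg:.3f}")
```

Dividing by ln N partly corrects richness for unequal library sizes, which matters when clone libraries of different depths are compared across lakes.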


The use of several approaches (including 454 pyrosequencing) makes it possible to reduce the bias due to inadequate sampling of rare taxa (Woodcock et al., 2006) and the difficulty of delineating microbial species (Horner-Devine et al., 2004). Our study clearly shows a distance–decay pattern in terms of beta-diversity for rare and dominant small protists. The spatial distribution of dominant taxa followed both predictions of island biogeography theory, in line with the hypothesis that lakes act as ecological islands, not only for macroorganisms but also for microorganisms (Rengefors et al., 2012). Contrary to our hypothesis, the geographic effect (i.e. distance) seems to be the only one shaping the PCC at global and regional scales, suggesting that one cause of PCC differentiation is physical barriers. However, population differentiation may also be due to biological barriers. It is, however, difficult to assess speciation and extinction rates of microorganisms in situ, especially when considering a heterogeneous group of organisms such as protists, which comprise a high diversity of functional roles. There is likely no general rule for the biogeography of microorganisms, and results seem to change according to the group studied. A scenario proposed by Rengefors et al. (2012) to explain dinoflagellate differentiation in lakes is the ‘Monopolization Hypothesis’, which states that genetic differentiation can be explained by rapid population growth after historical founder events, enhanced by the presence of a large resting propagule bank providing a powerful buffer against newly invading genotypes (De Meester et al., 2002). Another hypothesis is that the richness/diversity of the smallest protists studied here is not truly independent of the diversity and structure of other (larger) planktonic compartments, owing to competition or parasitism. In this case, the high importance of parasitic groups recently highlighted among these protists in lacustrine ecosystems (e.g. Lepère et al., 2008) might explain an evolutionary drift due to (1) host–parasite co-evolution within a lake or (2) the absence or extinction of the host, which forces parasites to change their primary hosts (a common process in parasitic interactions); such events often lead to speciation (Zietaria & Lumme, 2002). For example, the highly differentiated population of freshwater diatoms described by Evans et al. (2009) could also be explained by a host–parasite relationship with chytrids (Jobard et al., 2010; Rasconi et al., 2011) or Cryptomycota (Jones et al., 2011). Therefore, we need to extend our knowledge of eukaryote diversity and species associations in lakes (e.g. host–parasite relationships, syntrophy, etc.) to highlight a biogeography of co-occurrence.


This study was supported by funding from INSU EC2CO. We would like to thank Prof. N. Melnik and Dr. O. Belykh for hosting us and organizing the cruise during the Lake Baikal sampling.