The use of long-term monitoring data for studies of planktonic diversity: a cautionary tale from two Swiss lakes


Correspondence: Dietmar Straile, Limnological Institute, University of Konstanz, 78464 Konstanz, Germany.



  1. Long-term data have been suggested as resources for investigating environmental influences on biodiversity and, in turn, the role of biodiversity for ecosystem dynamics. However, scientists analysing biodiversity patterns in long-term data need to recognise that multidecadal time series are likely to suffer from inconsistencies in methodology, which might strongly complicate the interpretation of diversity patterns. Unfortunately, such inconsistencies are usually difficult to detect, and consequently, it is not known how strongly they affect the conclusions drawn.
  2. Here, we highlight two long-term data sets sampled by one laboratory to analyse patterns in phytoplankton richness in two Swiss lakes, Lake Zurich and Lake Walen. Apparent patterns in the long-term species richness in the two lakes arise from: (i) inconsistencies in species identification (changes in taxonomic literature and/or taxonomic expertise of the counting personnel) and (ii) changes in the detection limits of taxa. Hence, bias in these two case studies was strong enough to obscure any possible effects on species richness of environmental change (oligotrophication).
  3. We show that in the case of these two data sets, inconsistency confounds estimates of phytoplankton richness not only at the species level but even at the generic and familial levels. This suggests that a solution often proposed for inconsistency, that is, reanalysis of the data after aggregation to genus or family, may be insufficient.
  4. We suggest the use of two diagnostic plots, which may be used in other studies examining richness patterns in either long-term time series or comparative studies in which several scientists/laboratories contributed to data acquisition. These plots illustrate temporal or spatial patterns in (i) the percentage of taxa identified only to genus and (ii) in the 5% percentile of the concentrations of individual algal taxa. They will help to identify inconsistency problems due to changes or differences in (i) taxonomic expertise and (ii) detection limits.


The current biodiversity crisis (Thomas et al., 2004) and the awareness of the functional role of biodiversity (Loreau, 2010; Cardinale et al., 2011) have strongly increased research on various aspects of diversity. For example, spatial and temporal patterns in diversity (Soininen, 2010), the regulation of diversity (Mittelbach et al., 2001) and biodiversity changes due to environmental change (Magurran et al., 2010) are of key interest in current ecological science. Long-term data have been suggested as potentially important and underexploited resources for studying diversity dynamics (Magurran et al., 2010). Indeed, there are many long-term data on plankton in lakes (e.g. Jeppesen et al., 2005), which span several decades and usually document the dynamics of dozens of species. Furthermore, due to the short generation times of plankton, long-term data spanning several decades cover several thousand generations. This suggests that such long-term data might indeed be particularly valuable for detailed studies of the responses of diversity to environmental changes and vice versa on resulting changes in ecosystem processes. Consequently, monitoring data have been increasingly used for biodiversity-related studies in recent years (e.g. Jeppesen et al., 2000; Bürgi & Stadelmann, 2002; Ptacnik et al., 2008, 2010; Spatharis et al., 2011; Stomp et al., 2011; Bürgi, Bührer & Keller, 2003; Pomati et al., 2012; Weyhenmeyer, Peter & Willén, 2012).

Causes for data set inconsistencies

The use of long-term records to address diversity-related questions, however, relies on consistency in the data. The latter is unfortunately compromised by difficulties in identifying many species in the phytoplankton. The most severe problems arise when there is a change in personnel analysing samples during the course of a long-term programme. This has been acknowledged in the literature; for instance, major changes in phytoplankton community structure in the Helgoland Roads phytoplankton time series (Wiltshire & Durselen, 2004), as well as in the Dutch North Sea monitoring programme (Peperzak, 2010), arose from changes in the staff or laboratory responsible.

Beyond changes in the scientists responsible, more subtle modifications in protocol are likely to occur during a monitoring programme, and their effects are less clear. First, the taxonomy of many groups has been under continuous change over the past few decades. For phytoplankton, this is evident from the publication of new identification keys (Salmaso, pers. comm.) that have appeared since the 1980s; for instance, for diatoms (e.g. Krammer & Lange-Bertalot, 1986, 1988) and cyanobacteria (e.g. Komárek & Anagnostidis, 1998). Second, technological advances (i.e. better microscopes with higher resolution) might have improved the scrutiny of samples during a programme spanning several decades. Third, the design of the sampling programme (i.e. seasonal or depth resolution of samples) might have changed, and fourth, species identifications will have improved with an increase in the experience and knowledge of those concerned.

While it is clear that these are important problems, which possibly confound interpretation of richness estimates from long-term data, it is not clear how significant these problems actually are. In particular, some of the possible changes affecting richness estimates might not occur in a stepwise way, but rather happen steadily, such as an increase in expertise. In consequence, it is very difficult for scientists using phytoplankton long-term data to avoid such bias.

To illustrate these problems, we have analysed taxon richness patterns in the long-term data sets of Lake Zurich and Lake Walen, Switzerland. Within the spectrum of biodiversity measures, richness is probably the most sensitive to consistency problems, as it ignores information about the relative abundance of taxa. However, other measures of α- and β-diversity are also strongly influenced by species richness (Patil & Taillie, 1982; Anderson et al., 2011), and consistency problems will also be relevant for those measures.

Study sites, methods and long-term data sets

Lakes Zurich (Thalwil station) and Walen have been monitored with a highly controlled and regular sampling programme by the ‘Wasserversorgung Zürich’ (WVZ). The phytoplankton community in the two lakes has been monitored with an identical sampling programme, and the phytoplankton samples have been counted and processed in the same laboratory by the same people using the same methods in specific sampling periods. In both lakes, phytoplankton from nine discrete depths between the surface and 20 m was sampled. Phytoplankton identification and counting were performed separately for each depth in sedimentation chambers under an inverted microscope (Utermöhl, 1958). Only one scientist supervised the phytoplankton monitoring throughout the study, during which only four staff were responsible for counting the samples. Furthermore, the periods of employment of these few people overlapped considerably (Köster, pers. comm.), which allowed their species identification expertise to be passed on and kept consistent. This suggests a priori confidence in the consistency of phytoplankton richness estimates using these two data sets (Anneville, Gammeter & Straile, 2005; Pomati et al., 2012).

Both lakes have gone through a history of eutrophication and re-oligotrophication (Fig. 1a) in the last century. However, the absolute phosphorus concentration in the two lakes differs strongly (Anneville et al., 2005). In the early 1980s, phosphorus concentration during winter mixing (TPMix) in Lake Walen was less than 20 μg L−1, whereas the concentration in Lake Zurich was about four times higher. With oligotrophication, TPMix declined in Lake Zurich to c. <30 μg L−1 in 2000, whereas TPMix in Lake Walen declined to around 5 μg L−1 by the end of the 1980s, with no further reduction until the end of the study period. Hence, phosphorus concentration in Lake Walen in the early 1980s corresponded approximately to that recorded in Lake Zurich after two decades of re-oligotrophication. If oligotrophication was the main driver of observed changes in algal taxon richness, we would expect the Lake Zurich phytoplankton richness in 2000 to have approached the richness in Lake Walen present in the early 1980s. On the other hand, if the main driver of observed phytoplankton richness in the two lakes was methodological change, we would expect their richness dynamics to be synchronised.

Figure 1.

(a) Long-term development of total phosphorus concentrations during winter mixing in Lakes Zurich and Walen. (b,c) Taxon richness and 5-year running means of taxon richness in the data sets of (b) Lake Zurich and (c) Lake Walen.

We have analysed taxon richness of the two lakes from a data set spanning from the early 1980s until 2000, using the data previously examined by Anneville et al. (2004, 2005). In both lakes, we calculated richness dynamics from the ‘raw’ data, that is, we did not attempt to increase data set consistency by lumping taxa, for instance. We present richness dynamics in both lakes for the individual sampling dates, but also as a 5-year running average to facilitate recognition of long-term dynamics (Pomati et al., 2012).
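The richness calculation described above can be sketched in a few lines. The following Python is a minimal illustration, not the code used for this study: the record layout (date string, taxon name), the function names and the choice of a centred window over annual means are all assumptions made for the example.

```python
from collections import defaultdict

def richness_by_date(records):
    """Count distinct taxa on each sampling date.

    records: iterable of (date, taxon) pairs, with date as 'YYYY-MM-DD'
    (an assumed layout, not the format of the WVZ data).
    """
    taxa = defaultdict(set)
    for date, taxon in records:
        taxa[date].add(taxon)
    return {date: len(names) for date, names in sorted(taxa.items())}

def annual_mean(richness):
    """Average the per-date richness values within each calendar year."""
    by_year = defaultdict(list)
    for date, value in richness.items():
        by_year[int(date[:4])].append(value)
    return {year: sum(v) / len(v) for year, v in sorted(by_year.items())}

def running_mean(annual, window=5):
    """Centred running mean; years lacking a full window are skipped."""
    half = window // 2
    smoothed = {}
    for year in annual:
        span = range(year - half, year + half + 1)
        if all(y in annual for y in span):
            smoothed[year] = sum(annual[y] for y in span) / window
    return smoothed
```

Skipping years without a full window avoids edge effects at the start and end of the series, at the cost of losing two years at each end for a 5-year window.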

Species richness dynamics and evidence for data set inconsistencies

Phytoplankton richness in the data from both lakes increased strongly, that is, approximately doubled over the study period (Fig. 1b,c). Taxon richness in Lake Zurich in 2000 did not approach the richness in Lake Walen at the start of the monitoring period, as might be expected if phytoplankton richness was a function of lake trophic status. In contrast, changes in the observed richness in the two lakes were strikingly synchronous (R² for monthly data = 0.78, R² for 5-year moving average richness = 0.98). Both lakes showed an apparently steady and similar increase in richness from the 1980s to the early 1990s and a further strong, stepwise increase in 1997, despite their distinctly different phosphorus concentrations. This synchrony provides circumstantial support for the proposition that methodological change is a driver of richness dynamics in the two lakes.
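One simple way to quantify such synchrony is the squared Pearson correlation between the two richness series. The sketch below assumes that this is how the coefficient of determination was obtained and that the two series have already been aligned by sampling date; the function name and the numbers in the usage example are invented for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy ** 2 / (sxx * syy)

# Invented richness values for two lakes on the same four dates.
lake_a = [20, 24, 23, 31]
lake_b = [14, 17, 18, 22]
print(round(r_squared(lake_a, lake_b), 2))
```

A high value indicates only that the two series move together; it does not by itself distinguish a shared methodological cause from a shared environmental one.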

The richness increase evident in the two lakes might be caused by at least three factors. First, improved taxonomic knowledge might have allowed species to be detected at a finer taxonomic resolution. This could be checked by plotting the percentage of taxa classified to genus on each sampling date against all taxa (i.e. additionally those taxa identified to species level or into different size classes). This percentage dropped in parallel in both data sets from c. 55 to 45% during the late 1980s, suggesting increasing taxonomic expertise in that period (Fig. 2a). However, the c. 10% increase in species level identification in both lakes can only partially explain the twofold increase in taxon richness. Second, a large number of new taxa have been identified in both lakes since the 1980s, again in striking synchrony (Fig. 2b). Note that the number of new taxa includes those resulting from finer taxonomic resolution and that the rate of identifications of new taxa did not increase strongly in the mid-1990s, when richness increased strongly in both lakes. Third, the number of occurrences per year in monthly samples of individual taxa increased. A plot of the number of taxa versus their number of occurrences reveals a frequency distribution with two maxima, one at the lower end of annual occurrences, that is, rare taxa that appeared only once per year, and the other of taxa that were present year-round (i.e. 12 times a year). The major difference between the number of occurrences per year in the period 1997–2000 and two other periods (1980–1990, 1991–1996) in both lakes is that the number of taxa occurring year-round (12 occurrences per year) has increased strongly (Fig. 2c,d). Hence, an increasing number of taxa occurring year-round seem to be an important factor explaining the richness increase observed after the mid-1990s.
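Both checks in this paragraph, the percentage of taxa identified only to genus and the frequency distribution of annual occurrences, are straightforward to compute. The Python below is a sketch under stated assumptions: records are (date, taxon) pairs with dates as 'YYYY-MM-DD', and a genus-level record is recognised by a trailing 'sp.' or 'spp.', a naming convention assumed here that may not match a given data set.

```python
from collections import Counter, defaultdict

def is_genus_only(taxon):
    """Assumed convention: genus-level records end in ' sp.' or ' spp.'."""
    return taxon.strip().endswith((" sp.", " spp."))

def percent_genus_only(records):
    """Percentage of taxa on each date that are identified only to genus."""
    taxa = defaultdict(set)
    for date, taxon in records:
        taxa[date].add(taxon)
    return {
        date: 100.0 * sum(is_genus_only(t) for t in names) / len(names)
        for date, names in sorted(taxa.items())
    }

def occurrence_histogram(records, year):
    """Map 'months per year in which a taxon was recorded' to number of taxa."""
    months = defaultdict(set)
    for date, taxon in records:
        if int(date[:4]) == year:
            months[taxon].add(date[5:7])
    return Counter(len(m) for m in months.values())
```

With monthly sampling, a key of 12 in the histogram corresponds to taxa recorded year-round and a key of 1 to taxa recorded only once in that year.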

Figure 2.

(a) Five-year running means of the percentage of taxa identified to genus in the Lake Zurich (black line) and Lake Walen (grey line) data sets, (b) cumulative number of newly identified taxa in the Lakes Zurich (black line) and Walen (grey line) data sets after 1981, (c) Lake Zurich data and (d) Lake Walen data taxa frequency distribution of the number of occurrences per year in monthly samples.

The taxa newly identified in both lakes were largely identical, despite the large differences in phosphorus concentration between the lakes. For example, the cyanobacterium Phormidium arcuatum was first identified in Lake Zurich on 8 June 1994 and in Lake Walen on 18 September 1995. The benthic diatom Fragilaria nitschioides was first identified in Lake Zurich and Lake Walen on 3 February 1999 and 15 February 1999, respectively. As a last example, the green alga Thorakochloris sp. was first identified in Lake Zurich on 6 October 1999 and in Lake Walen on 20 September 1999. Because this synchrony of first identification was the rule rather than the exception, the time a specific taxon was first identified in Lake Zurich is closely related to its first identification in Lake Walen (Fig. 3; R² = 0.76, n = 48, P < 0.0001). From a total of 184 taxa occurring in the Lake Zurich data set in the period 1977–2000, 81 (44%) were first identified after 1980, 48 (60%) of which were also identified in Lake Walen after 1980, with a timing of first identification differing by less than 2 years for 36 of 48 taxa in Lake Walen (75%) and 36 of 81 taxa in Lake Zurich (44%).
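The comparison of first-identification dates can be illustrated with the three example taxa named above. This is a sketch only: the function names are hypothetical, '2 years' is approximated as 730 days, and the restriction to taxa first identified after 1980 is not implemented.

```python
from datetime import date

def first_identified(records):
    """Earliest date on which each taxon appears; records are (date, taxon)."""
    first = {}
    for day, taxon in records:
        if taxon not in first or day < first[taxon]:
            first[taxon] = day
    return first

def co_identified(first_a, first_b, max_gap_days=730):
    """Taxa present in both data sets whose first dates differ by < max_gap_days."""
    shared = first_a.keys() & first_b.keys()
    return sorted(
        t for t in shared
        if abs((first_a[t] - first_b[t]).days) < max_gap_days
    )

# First-identification dates as given in the text.
zurich = first_identified([
    (date(1994, 6, 8), "Phormidium arcuatum"),
    (date(1999, 2, 3), "Fragilaria nitschioides"),
    (date(1999, 10, 6), "Thorakochloris sp."),
])
walen = first_identified([
    (date(1995, 9, 18), "Phormidium arcuatum"),
    (date(1999, 2, 15), "Fragilaria nitschioides"),
    (date(1999, 9, 20), "Thorakochloris sp."),
])
print(co_identified(zurich, walen))  # all three fall within the 2-year window
```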

Figure 3.

Relationship between the timing of first identification of phytoplankton taxa in the Lake Zurich data set relative to the timing of first identification in the Lake Walen data set. The line represents the fit of a standard major axis regression. The grey area covers ±2 years of simultaneous first identification.

Because of the large differences in trophic status between the two lakes, this coherence in the first identification of algal taxa in the two lakes is very unlikely to be caused by genuine synchrony in the colonisation of the two lakes, but is more likely to be a result of an increased ability to recognise species in the personnel responsible. Nevertheless, it might be argued that the observed synchrony results from phytoplankton dispersal. Lake Walen is located in the headwaters of Lake Zurich, and both lakes are connected via the Linth channel. However, local environmental factors usually strongly outweigh dispersal-related factors in phytoplankton communities (Vanormelingen et al., 2008; Verleyen et al., 2009), suggesting that dispersal is highly unlikely to result in almost perfect synchrony. The dispersal hypothesis further assumes a rather limited geographical range for the taxa. In contrast, microbial taxa are often thought to be widely distributed (e.g. the ‘everything is everywhere’ hypothesis; Fenchel & Finlay, 2004), and whether they occur in samples depends on local environmental conditions. In any case, the dispersal hypothesis would predict that taxa should first occur in the headwater lake, that is, Lake Walen, and only with a time lag sufficient to allow population growth and subsequent detection in Lake Zurich. This is equivalent to the prediction that the intercept of a major axis regression line (Warton et al., 2011) should be significantly larger than zero. As this was not the case (Fig. 3; slope = 0.96 (95% confidence interval: 0.83–1.1), t = −0.08, P = 0.57, intercept = 347 (95% confidence interval: −191–885), t = 0.97, P = 0.34), the dispersal hypothesis was not supported by our analysis.
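For readers unfamiliar with the regression used in Fig. 3, the point estimates of a standard major axis fit can be written out directly. The sketch below is illustrative only: it assumes first-identification times in Lake Walen on the x-axis and in Lake Zurich on the y-axis, returns slope and intercept, and does not reproduce the confidence intervals or tests reported here. The function name is hypothetical.

```python
import math

def sma_regression(x, y):
    """Standard major axis fit: slope = sign(r) * sd(y) / sd(x)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length (>= 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("slope is undefined for constant or uncorrelated data")
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Unlike ordinary least squares, this fit treats both variables symmetrically, which suits a comparison in which neither lake's first-identification date is measured without error. Under the dispersal hypothesis, the fitted intercept would be expected to be positive, reflecting a lag between first detection upstream and downstream.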

The increase in taxa occurring year-round since the mid-1990s could possibly result from a homogenisation of environmental conditions consequent upon oligotrophication, because in oligotrophic lakes, a seasonal spring period with abundant phosphorus supply disappears. This ‘seasonal homogenisation hypothesis’ would predict that the increase in taxa occurring year-round should have occurred much earlier in Lake Walen than in Lake Zurich, as phosphorus concentration in the former was much lower. This was not observed, however, and taxa occurring year-round increased in both lakes rather abruptly in 1997. Alternatively, this increase might be caused by a reduction in the detection limit of algal taxa. To test this hypothesis, we computed the distribution of concentrations of individual algal taxa at each sampling date, and Fig. 4 shows the development of the median and the 5% and 95% percentiles of algal concentrations. In April 1997, a drop of the 5% percentile occurred abruptly in both data sets, which suggests that taxa occurring at lower concentrations were detected after that date. This indicates in turn that the strong increase in taxon richness in 1997 is probably the result of a methodological change lowering the detection limit. In principle, such changes in detection limit may arise when there is a change: (i) in the number of samples counted on a specific sampling date (e.g. differences in spatial or depth resolution), (ii) in the volume of the algal sample allowed to settle in the Utermöhl chamber and/or (iii) in the area of the settling chamber that is analysed. Unfortunately, the WVZ Zurich was not able to determine which methodological change(s) resulted in the simultaneous drop in the detection limits, and increase in species richness, in the data sets of both Lakes Zurich and Walen.
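The percentile calculation behind this check can be sketched as follows. The paragraph does not state which percentile definition was used, so linear interpolation between ranked values is assumed; the record layout (date, taxon, concentration) and the function names are likewise assumptions made for the example.

```python
from collections import defaultdict

def percentile(values, q):
    """Percentile q (0-100) of a non-empty list, by linear interpolation."""
    data = sorted(values)
    if not data:
        raise ValueError("no values")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def concentration_percentiles(records, qs=(5, 50, 95)):
    """Percentiles of taxon concentrations on each sampling date.

    Only taxa actually recorded (concentration > 0) enter the distribution.
    """
    by_date = defaultdict(list)
    for day, _taxon, conc in records:
        if conc > 0:
            by_date[day].append(conc)
    return {
        day: tuple(percentile(v, q) for q in qs)
        for day, v in sorted(by_date.items())
    }
```

Plotting the first element of each tuple against date gives the lower line of the diagnostic plot; an abrupt, lasting drop in that line is the signature of a lowered detection limit.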

Figure 4.

Median, 5% and 95% percentiles of concentrations of individual algal taxa in (a) the Lake Zurich data set and (b) the Lake Walen data set. Arrows indicate a drop in detection limit occurring in April 1997 in both monitoring programmes.

Effects of preliminary inconsistency adjustments on species richness dynamics

To test the effect of synchronous ‘discovery’ of new species and the reduction in the detection limit on taxon richness, we computed richness dynamics, first by removing those 36 taxa which appeared in the Zurich and Walen data sets within a time span of 2 years. As a result, the absolute increase in smoothed taxon richness dropped from c. 30 to 13 in Lake Zurich (Fig. 5a) and from 23 to 9 in Lake Walen for the time period 1982–2000 (Fig. 5b). The strong increase in richness after the mid-1990s is still apparent in both data sets, however, even after deleting these 36 taxa. In a second step, we additionally removed all taxa in the data sets with a concentration of less than 0.1 cells mL−1 (0–20 m average). This resulted in an overall reduction in richness in Lake Zurich but, as expected, also in the disappearance of the strong richness increase after the mid-1990s (Fig. 5a). This provides additional strong evidence that the increase in richness in the late 1990s was indeed due to a greater ability to detect algae. For Lake Walen, the a posteriori adjustment of the detection limit did not reduce overall richness, but did remove the richness increase after the mid-1990s (Fig. 5b). Consequently, taxon richness in Lake Walen reached a maximum during the early 1990s. This pattern is more consistent with current theory, which predicts a peak in richness at intermediate productivity (Mittelbach et al., 2001), that is, a reduction in richness at very low phosphorus concentrations, which occurred in Lake Walen after the mid-1990s.
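The two adjustment steps amount to filtering the records before counting taxa. A minimal Python sketch follows, assuming records of the form (date, taxon, concentration in cells per mL, already averaged over 0–20 m); the function and parameter names are hypothetical and the example values are invented.

```python
def adjusted_richness(records, excluded_taxa=frozenset(), min_conc=0.0):
    """Richness per date after dropping listed taxa and low concentrations.

    excluded_taxa: taxa to ignore (e.g. the co-identified taxa).
    min_conc: a common detection limit imposed after the fact; records
    below it are ignored.
    """
    taxa = {}
    for day, taxon, conc in records:
        kept = taxa.setdefault(day, set())  # keep the date even if nothing passes
        if taxon in excluded_taxa or conc < min_conc:
            continue
        kept.add(taxon)
    return {day: len(names) for day, names in sorted(taxa.items())}

# Invented example: one sampling date, three taxa.
sample = [
    ("1998-05-01", "A", 5.0),
    ("1998-05-01", "B", 0.05),
    ("1998-05-01", "C", 2.0),
]
print(adjusted_richness(sample))                                   # raw richness
print(adjusted_richness(sample, excluded_taxa={"C"}))              # step 1
print(adjusted_richness(sample, excluded_taxa={"C"}, min_conc=0.1))  # steps 1 and 2
```

Applying one threshold to the whole series imposes the coarsest detection limit on all years, so that later samples are not credited with taxa that earlier counts could not have detected.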

Figure 5.

Five-year running means of taxon richness in the Lake Zurich (a,c) and Lake Walen (b,d) data sets before and after consistency adjustments (i.e. after deletion of 36 co-identified taxa and additionally after a posteriori adjustment of the detection limit) (a,b) and at different taxonomic levels (c,d).

We want to emphasise that the removal of 36 co-identified taxa and the adjustment of the detection limit, although providing a more consistent picture of long-term dynamics than the unmodified data set, do not eliminate all problems of consistency. Since it cannot be assumed that newly identified taxa are confined to those synchronously appearing in both Lakes Zurich and Walen (n = 36), the number of newly identified taxa (but not newly co-occurring) in both lakes, because of increased identification ability, is probably greater than 36. Consequently, it is unclear whether phytoplankton richness in Lake Zurich really increased during the study period. Such an increase might be expected from theory (Mittelbach et al., 2001); however, we are very doubtful that such a pattern could be demonstrated with the data set available from Lake Zurich.

Inconsistencies at coarser taxonomic levels

The uncertainty of algal species identification is often acknowledged in the literature (Anneville et al., 2004, 2005; Ptacnik et al., 2008, 2010; Pomati et al., 2012). Hence, a reanalysis of the data with taxa aggregated to a coarser taxonomic level (e.g. species to genus) is often proposed for testing the robustness of the results or to provide a more reliable result. To test the effects of taxonomic aggregation on richness dynamics, we calculated the richness patterns in both lakes based on the number of genera and families (Fig. 5c,d). These analyses show that the major patterns in richness dynamics are still apparent. For instance, 17 families were newly identified in the Lake Zurich data set after 1980, 13 of which also appeared in Lake Walen after 1980. In 10 cases, the time of first detection in both lakes was within less than 2 years. That is, more than 50% of the families newly described in Lake Zurich also appeared at roughly the same time in Lake Walen.
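Aggregation to genus or family before counting can be sketched as below. This is an illustration under assumptions, not the procedure used for Fig. 5c,d: the genus is taken to be the first word of the taxon name, and the genus-to-family lookup is a hypothetical, user-supplied table with placeholder names.

```python
from collections import defaultdict

def genus_of(taxon):
    """Assumed convention: the genus is the first word of the taxon name."""
    return taxon.split()[0]

def richness_at_level(records, family_of=None):
    """Richness per date at genus level, or family level if a lookup is given.

    family_of: optional dict mapping genus -> family. Genera missing from
    the lookup are kept under their own name rather than dropped.
    """
    groups = defaultdict(set)
    for day, taxon in records:
        genus = genus_of(taxon)
        label = genus if family_of is None else family_of.get(genus, genus)
        groups[day].add(label)
    return {day: len(labels) for day, labels in sorted(groups.items())}

# Placeholder names only; not real taxa.
records = [
    ("1990-03-01", "Genusa alpha"),
    ("1990-03-01", "Genusa beta"),
    ("1990-03-01", "Genusb sp."),
    ("1990-03-01", "Genusc gamma"),
]
families = {"Genusa": "Familyx", "Genusb": "Familyx", "Genusc": "Familyy"}
print(richness_at_level(records))            # genera
print(richness_at_level(records, families))  # families
```

Aggregation lowers the counts but cannot remove a newly recognised genus or family, nor a taxon that becomes detectable only after a change in detection limit, which is why the same artefacts can persist at coarser levels.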

The new occurrence of some of these families in the data set might be related to oligotrophication. For example, members of the diatom family Rhizosoleniaceae were first recorded in 1998 (Lake Zurich) and 1990 (Lake Walen) and are typical of clear oligotrophic lakes (Padisak, Crossetti & Naselli-Flores, 2009). In contrast, typical occurrence patterns of members of such newly recorded families as the Pseudanabaenaceae (Lake Walen: 1987, Lake Zurich: 1988), Phormidiaceae (Lake Walen: 1995, Lake Zurich: 1991), Ulotrichaceae (both lakes: 1994) and Katablepharidaceae (both lakes: 1996) suggest that their appearance is unlikely to be associated with oligotrophication of lakes (Padisak et al., 2009). Their new recordings might hence be due to learning effects and/or taxonomic reorganisation. In addition, the change in the detection limit in 1997 also increased the number of genera and families occurring year-round during the 1997–2000 period (relative to earlier periods in both Lake Zurich and Lake Walen). This indicates that working at coarser taxonomic levels (genus or even family) does not necessarily solve problems of inconsistency when analysing taxon richness patterns in long-term data. We stress, however, that aggregation might definitely be advantageous for studying the dynamics of plankton abundance or biovolume in contrast to using presence–absence data.

To conclude, our analyses suggest strongly that methodological changes can have large effects on richness estimates of phytoplankton, making reliable quantification of environmental effects on richness difficult if not impossible. In these two lakes, the dynamics in algal richness are very unlikely to be due to changes in the trophic status of the lake, but seem to be driven by inconsistent methodology in both plankton identification and counting. In fact, the reported pattern of taxon increase is most likely to be composed of newly identified taxa, which probably resulted in a steady increase in richness, overlaid by a methodological change associated with a lower detection limit of algae, causing the sudden increase in taxa during the late 1990s. We believe that many, if not all, decade-long phytoplankton time series probably suffer from such subtle inconsistencies, due to otherwise welcome technical improvements and increased taxonomic expertise of the counting teams. Clearly, if this is the case, it will be difficult if not impossible to distinguish between newly identified species and truly newly occurring species in a specific data set. The circumstance that two long-term time series (i.e. for Lake Zurich and Lake Walen) for lakes differing in trophic status were established by one laboratory provided a unique possibility to highlight the problem of, for instance, changes in species identification expertise for plankton richness studies.

Recently, Pomati et al. (2012) studied the increase in richness of phytoplankton in Lake Zurich over the past few decades, based on the same data set but extended to more recent years. They showed a threefold increase in phytoplankton richness from the 1980s to 2007. During the time period, which overlaps with our study, the long-term increase in richness reported by Pomati et al. (2012) is strikingly similar to the one we show, both in respect of the magnitude and pattern of increase (i.e. a continuous accumulation of new taxa overlaid by a strong richness increase in 1997; see their Fig. 3a and their Online Appendix 4 for a 5-year running mean). They attributed this increase to a combination of oligotrophication and warming, suggesting that both drivers result in increased water column heterogeneity and consequently an increased number of niches for phytoplankton. Unfortunately, Pomati et al. (2012) do not consider any effects of methodological change on their richness patterns. Our results suggest that the conclusions of Pomati et al. (2012) should be reconsidered.

One important lesson to be learned from our study is that future research analysing the plankton richness of long-term series needs to show that problems of consistency have been thoroughly considered. For example, it should be demonstrated that there are no long-term patterns present in the ratio of taxa identified to species level nor in the detection limits of individual taxa. To demonstrate this, the ‘diagnostic plots’ suggested in Figs 2a & 4 might also prove helpful for other studies. In addition, editors and reviewers should request an extensive online appendix that documents all methods and changes in species identification literature in detail. This appendix should also provide either evidence for consistency, for example, by showing a table with all taxa identified, for example, in the first and last year of the time period, and/or by explaining in detail how authors have dealt with the problem of inconsistencies. It should be noted that problems of consistency do not only arise for long-term plankton series, but also when spatial patterns in plankton richness are analysed and several laboratories identifying plankton samples are involved.

Because of the problems highlighted in this Opinion paper, we advocate more studies analysing plankton richness responses to environmental change using palaeolimnological approaches (e.g. Gregory-Eaves & Beisner, 2011). For diatoms, such studies (in concert with contemporary monitoring) could provide an assessment of the reliability of richness dynamics from monitoring data in general. Recent advances in ecological metagenomics of sediment samples might finally allow an assessment of the responses of phytoplankton richness to environmental change for all phytoplankton taxa (e.g. Chariton et al., 2010; Anderson-Carpenter et al., 2011).

Long-term time series, including those of Lakes Zurich and Walen, represent an important resource for ecological science (Jackson & Fureder, 2006), and future analyses of these time series will continue to contribute insights to, for example, community ecology (Vasseur & Gaedke, 2007; Ernest et al., 2008; Jochimsen, Kümmerlin & Straile, 2013) and life history adaptations to environmental variability (Seebens, Einsle & Straile, 2009), and will allow for a better understanding of the effects of anthropogenic change on ecosystems (Jackson et al., 2001; Jeppesen et al., 2005; Adrian et al., 2009; Posch et al., 2012). However, these data sets obviously do have limitations, especially with regard to the study of richness and diversity. Our study on the phytoplankton richness in Lakes Zurich and Walen may help to increase the awareness of these limitations.


We thank the Wasserversorgung Zürich (WVZ) for the permission to work with the long-term data from Lakes Zurich and Walen and Oliver Köster (WVZ) for providing us with details about the sampling programme at the WVZ. We gratefully acknowledge the suggestions of Alan Hildrew for improving the English. Many thanks also to Orlane Anneville, Wayne Dawson, Oliver Köster, Karl-Otto Rothhaupt, Nico Salmaso and two anonymous reviewers for discussions and comments on our manuscript.