Opening a next‐generation black box: Ecological trends for hundreds of species‐like taxa uncovered within a single bacterial >99% 16S rRNA operational taxonomic unit

Current knowledge on environmental distribution and taxon richness of free‐living bacteria is mainly based on cultivation‐independent investigations employing 16S rRNA gene sequencing methods. Yet, 16S rRNA genes are evolutionarily rather conserved, resulting in limited taxonomic and ecological resolutions provided by this marker. The faster evolving protein‐encoding gene priB was used to reveal ecological patterns hidden within a single operational taxonomic unit (OTU) defined by >99% 16S rRNA sequence similarity. The studied subcluster PnecC of the genus Polynucleobacter represents a ubiquitous group of abundant freshwater bacteria with cosmopolitan distribution, which is very frequently detected by diversity surveys of freshwater systems. Based on genome taxonomy and a large set of genome sequences, a sequence similarity threshold for delineation of species‐like taxa could be established. In total, 600 species‐like taxa were detected in 99 freshwater habitats scattered across three regions representing a latitudinal range of 3,400 km (42°N to 71°N) and a pH gradient of 4.2 to 8.6. In addition to the unexpectedly high richness, the increased taxonomic resolution revealed structuring of Polynucleobacter communities by a couple of macroecological trends, which was previously only demonstrated for phylogenetically much broader groups of bacteria. An unexpected pattern was the almost complete compositional separation of Polynucleobacter communities of Ca2+‐rich and Ca2+‐poor habitats. This compositional pattern strongly resembled the vicariance of plant species on silicate and limestone soils. The new cultivation‐independent approach presented opened a window to an incredible, previously unseen diversity, and enables investigations aiming on deeper understanding of how environmental conditions shape bacterial communities and drive evolution of free‐living bacteria.


| INTRODUC TI ON
Prokaryotes are the most numerous organisms on earth. Our current knowledge on diversity of prokaryotes and composition of prokaryotic communities is mainly based on cultivation-independent investigations employing 16S rRNA gene sequencing methods. The insights obtained by such investigations are, however, limited by the taxonomic resolution of the 16S rRNA gene sequences (Stackebrandt & Ebers, 2006). It has repeatedly been shown that prokaryotes affiliated with distinct but related species cannot be discriminated by using their full-length 16S rRNA gene sequences (Hahn et al., 2016;Jaspers & Overmann, 2004;Stackebrandt & Ebers, 2006). In addition to the limited taxonomic resolution of the gene, the majority of the current studies using high-throughput amplicon sequencing of 16S rRNA genes, do not use the entire gene length and thus not the full information content of 16S rRNA gene sequences. Some studies have even been based on <150 bp sequence fragments (García-García et al., 2019) equivalent to only about 10% of the entire gene.
Furthermore, the taxonomic ranks of the operational taxonomic units (OTUs) typically established in 16S rRNA-sequence-based investigations are unknown. Despite being frequently stated in publications, and independent of the applied sequence similarity threshold used for OTU demarcation, the established OTUs do not represent species-like taxa. Due to all these limitations, prokaryotic communities remained at the species level black boxes. A few alternative markers and primers have been developed for profiling the composition and structure of bacterial communities with higher resolution than that provided by the 16S rRNA gene (Hill et al., 2002;Ogier et al., 2019;Sánchez et al., 2014). Such approaches have to deal with the trade-off of either relying on strongly degenerated primers ensuring a taxonomically broad match of the primers (Hill et al., 2002;Ogier et al., 2019), or using less degenerated primers, which only match to genes of much more narrow taxonomic groups (Sánchez et al., 2014).
Here, we present a method targeting only a phylogenetically narrow (>99% 16S rRNA sequence similarity, OTU 99% ) but important group of freshwater bacteria. The method presented here is not intended to be a replacement of standard 16S rRNA amplicon sequencing but to be a supplemental method enabling a species level taxonomic resolution for an important fraction of freshwater bacterioplankton. For the targeted protein-encoding single-copy housekeeping gene (primosomal replication protein N, priB), involved in DNA replication, a sequence similarity threshold could be established, which largely resolves communities of the investigated OTU 99% at the species level. The establishment of this threshold is based on large sets of priB gene sequences and corresponding genome sequences obtained from a large culture collection. These two corresponding data sets enabled the search for a priB gene sequence similarity threshold equal to the 95% genome-wide average nucleotide identity (ANI) threshold used in genome taxonomy and in standard taxonomy of prokaryotes for genome-based species demarcation (Jain et al., 2018;Konstantinidis et al., 2006). The genus Polynucleobacter can be phylogenetically subdivided into four subclusters: PnecA, PnecB, PnecC and PnecD (Hahn, 2003). In contrast to the ecological diversification at the subcluster level (Jezbera et al., 2012;Newton & McLellan, 2015;Nuy et al., 2020), little is known about ecological diversification within these subclusters (Hahn et al., 2016;Jezbera et al., 2011). The method presented here targeted only subcluster PnecC (Hahn, 2003), which consists of many described and undescribed species all sharing 16S rRNA sequence similarity values ≥99%, and thus the whole subcluster PnecC represents a single OTU 99% . This subcluster harbours strains of two distinct lifestyles. The evolutionary primary lifestyle is a free-living and planktonic one, while a derived lifestyle is found in strains dwelling as obligate endosymbionts of benthic ciliates (Boscaro et al., 2013;Heckmann & Schmidt, 1987). Strains of the planktonic lifestyle have been frequently detected in diversity surveys, which together cover an ecologically broad range of freshwater systems (Allen & Cavicchioli, 2017;Bahr et al., 1996;Percent et al., 2008;Zwart et al., 2002). They represent abundant freshwater bacteria (Jezbera et al., 2012) ubiquitously appearing in the water column of freshwater habitats worldwide (Comte et al., 2016;Hahn et al., 2015;Jezberova et al., 2010;Peixoto et al., 2011). Conversely to the planktonic strains, endosymbiotic strains have not previously been reported to appear with significant abundance in the water column of freshwater systems.
Comparative genomic analyses suggested that species of the genus Polynucleobacter are biologically maintained by intensive intraspecific recombination, which is opposed by interspecific recombination barriers that separate core gene pools of related species . The coherence of the intraspecific gene pool of at least one species is even maintained across populations separated by geographic distances of up to 3,000 km, indicative of a high dispersal potential of the species .
On the other hand, microdiversification of Polynucleobacter species is influenced by horizontal acquisition of accessory genomic islands that can be transferred among different species .
Here, we present a largely exploratory study designed to reveal the breadth of diversity within the targeted OTU 99% . We investigated and compared the PnecC diversity of 99 freshwater habitats representing a broad limnological range and scattered across a latitudinal range of 3,400 km. This resulted in the detection of an astonishingly high total species richness across the investigated habitats.
We tested if the revealed diversity is structured by environmental factors and known macroevolutionary trends. Since the structuring of bacterial communities and evolutionary processes takes place at the species level and not primarily at the level of 16S rRNA OTUs, the developed method provides a supplementary tool which partially increases the resolution of diversity surveys. In addition, the method is suitable to reveal drivers structuring the community composition and the selective forces of evolutionary adaptation of a model group of planktonic freshwater bacteria.

| Investigated habitats and sampling
In total, 117 water samples from 102 freshwater habitats were obtained (Table S1 and Figure S1); however, only the results of 114 samples from 99 habitats could be analysed (see below). This included 11 habitats sampled 2-3 times. Surface water (0.1-0.5 m depths) samples were taken from the shoreline or from piers, if available. Biomasses were collected by filtration onto 0.2 µm nuclepore membrane filters, preserved by storage in absolute ethanol, and transported in a mobile refrigerator. Water temperature, pH, conductivity and oxygen were measured on location. Water samples were filtered through GF/F filters (Whatman) for determination of concentrations of major ions and measured by ion chromatography (Thermo Scientific DIONEX ICS-1100).

| Reference priB sequences of cultured strains
In total, a set of 377 priB (primosomal replication protein N) sequences of cultured Polynucleobacter strains were established, representing 254 unique sequences (Table S2). Of those priB sequences, 102 originated from genome sequences. The remaining 275 sequences were obtained by Sanger sequencing of PCR products. The sequenced amplicons were generated by using primers priBausF 5′-CGTCARATGGCTTACATGATC-3′ and priBausR 5′-CAATAACGYTTACGCTTGAAC-3′. The sequenced fragments obtained with this primer pair included the binding sites of the priBin-nFd/priBinnRd primer pair (see below), which enabled checking of the binding sites in strains without an available genome sequence.

| Amplicon sequencing and processing of reads
Genomic DNA was extracted from environmental samples as previously described (Jezbera et al., 2011), and purified using the Wizard DNA clean-up kit (Promega

| Data analyses
The OTU table exported from Qiime2 and the environmental data were analysed using R version 3.6.1 (R Core Team, 2019). The vegan package (Oksanen et al., 2019) was used for most of the analyses performed. Geographic distances between sampled habitats were determined by calculation of great-circle-distance using the haversine method, which assumes a spherical earth, from the R package "geosphere" (Hijmans, 2019). Site-specific climate data were obtained from the WorldClim data set (Fick & Hijmans, 2017) using the DIVA-GIS software (Hijmans et al., 2001). Furthermore, the R packages ggplot2 (Wickham, 2016) and maps (Minka & Deckmyn, 2018) were used. average pH of detections of P. paneuropaeus was 5.9, therefore occupancy refers to the samples of the pH range 4.9-6.9. Forty-one of the investigated samples belonged to this pH range and P. paneuropaeus was detected in 31 (with >25 reads, i.e., >0.1% of reads per sample), and thus showed a pH-specific occupancy of 73%.

| Development of PCR primers targeting a protein-encoding gene
We aimed to develop a primer pair suitable for specific amplification of a protein-encoding gene present in all Polynucleobacter bacteria. The strategy employed for primer development and the faced limitations and results are described in Supporting Information S1.
In brief, a primer pair for amplification of the primosomal replication protein N (priB) gene of Polynucleobacter bacteria affiliated with subcluster PnecC was developed. Detailed analyses suggested a sequence similarity threshold of 98% for discrimination of species-like OTUs ( Figure 1; Supporting Information S1). Discrimination based on this threshold agreed with average nucleotide identity (ANI) based species discrimination (95% identity threshold) for 99.2% of the pairwise comparisons among the 102 strains with available genome sequences ( Figure 1). Species that could not be discriminated by priB similarities of <98% were combined together to species complexes.

| Amplicon sequencing of environmental samples
In total, results from 99 freshwater habitats (114 samples) including small ponds, lakes, streams and rivers (Table S1) located in three regions (Lapland, Central Europe, Corsica; Figure S1) along a European South-North cross-section (42°N to 71°N) were obtained. The selection of habitats aimed for maximizing the covered habitat diversity in order to maximize the insights into diversity of Polynucleobacter taxa. Details on the results of the amplicon sequencing are given in Supporting Information S1. Rarefaction analyses suggested that sequencing depth after rarefication (25,230 reads) was large enough to completely cover the amplicon sequence variant (ASV) numbers in the respective samples ( Figure S2A). In total, 600 OTUs 98% were detected in the samples. Rarefaction analyses of the OTU 98% data suggested that the number of investigated samples was not high enough to completely cover the total OTU 98% richness in the investigated area ( Figure S2B). The established OTU 98% were taxonomically classified by using reference sequences and a threshold of 98% sequence similarity (Figure 1). The reference sequences represented 108 species-like taxa, seven species complexes, and two taxa representing the same species but separated by priB sequence similarities <98% (Supporting Information S1 and  The percentage of eOTUs, that is, environmental OTUs sharing <98% sequence similarity with all reference strains, increased with decreasing read numbers recruited by the respective OTUs ( Figure 2). While only 30% of the top-ten-ranked OTUs were eOTUs, the first quarter of the ranking contained 70% and the last quarter 95% eOTUs.

| Structuring of Polynucleobacter communities by environmental factors
The investigated samples and habitats represent broad ranges of environmental parameters (Table S1) The largest explained fraction of variance was explained by habitat characteristics (habitat size and type, 31%). The second largest fraction was explained by physicochemical variables (including pH) together with habitat characteristics (21% of variability in OTU richness). The rest of the variance was mainly explained by various combinations of parameters ( Figure S4).
Communities of the most acidic samples tended to be dominated by only a single OTU, that is, P. sphagniphilus. In the four samples with lowest pH (<4.6) this species represented 99.7%-99.8% of reads ( Figure 6b). When excluding rare OTUs defined by relative abundances in the samples of <1% (i.e., <1% of reads), these four samples showed an OTU richness of 1.0. By contrast, samples from streams and rivers were characterized by 66-89 OTUs when rare OTUs were excluded. In general, both the OTU richness ( Figure 6b) and the size of habitats ( Figure S5) tended to increase with pH values of the habitats. Interestingly, similar z values describing the relationship of habitat size and species (OTU) richness were observed for Polynucleobacter OTU ( Figure S5) and whole bacterioplankton communities of freshwater lakes characterized by using 16S rRNA sequence data (Reche et al., 2005).

| Occupancy of particular OTUs along the pH gradient
Only eight (including two species complexes) of the 600 detected OTUs represented common taxa with pH-specific occupancy >50%. These 1.5% of all detected OTUs frequently occurred with high average relative abundances and tended to appear with more even relative abundance across the occupied habitats. In contrast to these common taxa, 15.5% of the detected OTUs showed occupancies between 10% and 50% and appeared by average with maximum relative abundances of 15.9% of the reads (range 0.3% to 87%).
Interestingly, locally abundant taxa showed occupancy values <10% but rather high relative abundances in a few habitats or samples.
Examples for locally abundant taxa are P. wuianus and P. meluiroseus ( Figure S1). The former species was discovered in October 2002 in a small slightly acidic pond designated Pond-1 .
The priB amplicon sequencing included three samples of that pond, which were taken in August and October 2009, and in June 2010.
P. wuianus was detected in all three samples of its type locality with relative abundances ranging from 15% to 86% of the reads. Across the other 111 investigated samples, this species was only detected in six samples from two ponds located 200 m and 40 km away from Pond-1. Both ponds were sampled three times but P. wuianus was detected only with low relative abundances of 0.004% to 0.135% of the reads. The occupancy of this species was only 3.7% and the average relative abundance was despite the local abundance peak in Pond-1 only 0.56%, which was twenty-times smaller than the relative abundance of the common species P. paneuropaeus. Similarly, P. meluiroseus showed an occupancy of 6.1% with detection in nine samples representing seven habitats but this species only appeared in three samples with relative abundances of >1%. Interestingly, the two highest values of 41% and 8% relative abundance were observed for the lake from which the type strain of the species was isolated (Pitt et al., 2018). In contrast to P. wuianus, the detections of P. meluiroseus were geographically broader scattered ( Figure S1). Obviously, both species are characterized by high local relative abundances, low pH-specific occupancy and high local persistence.

F I G U R E 2
Frequencies of the detected OTUs 98% . The bar plot shows the rank-abundance distribution of the 600 operational taxonomic units (OTUs) detected in the 99 investigated habitats. Detections of repeatedly sampled habitats were downweighted in order to give detections from all habitats the same weight. Individual rank-abundance curves of each investigated sample are shown in Figure S2C. The pie charts depict shares of refOTUs (representing cultured strains) and eOTUs (sharing <98% sequence similarity with priB sequences of cultured strains). Top pie chart, shares of refOTUs and eOTUs of the total number of detected OTUs 98% . Middle, share of reads assigned to refOTUs and eOTUs. Bottom, cumulative contribution of detected OTUs sorted by increasing rank to the total number of reads. For instance, the top-ranked OTU 98% (P. paneuropaeus) recruited 11.3% of the total number of reads (habitats weighted equally) and the top-seven ranked OTUs 98% recruited in total 48.  The pH-specific occupancy of OTUs tended to decrease with increasing pH prefered by the respective taxa ( Figure S6A). This trend is linked to a trend of increasing community dissimilarities among communities of the same pH class with increasing pH (Figure 5a).
This means that differences in composition among communities present in habitats with similar pH are increasing with pH. This is also obvious in the NMDS ordination where the communities from habitats with similar pH are spread over larger ordination space if pH values of their habitats are higher ( Figure 3b). Interestingly, in the case of high-Ca 2+ communities (mainly represented by the pH class 8-9), the link between occupancy of OTUs and community dissimilarities seems to be less strict than in the low Ca 2+ communities ( Figure 5a and Fig. S6A).

| Biogeography of Polynucleobacter communities
Of the set of 123 reference taxa (Table S2)

ASVs of Polynucleobacter bacteria
We evaluated if 16S rRNA based ASVs of Polynucleobacter bacteria possess a predictive power regarding environmental preferences of ASVs. A set of 226 strains (Table S2 plus

| DISCUSS ION
The limited taxonomic and ecological resolution of the 16S rRNA marker is well known ( Stackebrandt & Ebers, 2006). An alternative "universal" marker for diversity investigations on bacterial communities is available (Hill et al., 2002) but requires the use of highly degenerated primers strongly substituted with inosine bases. This is potentially biasing comparative compositional analyses of bacterial communities. High degeneration of primers can be avoided if the taxonomic focus of diversity studies is narrowed to genus-like taxa (Pereira et al., 2018;Sánchez et al., 2014).
We The priB gene could also be used for biodiversity survey or experimental studies on other genus-like bacterial taxa. However, this would need sufficient knowledge on sequence diversity at the potential primer binding sites (Supporting Information S1) and the quality of the taxonomic resolution at the species level would largely depend on a suitable collection of reference genomes enabling a sound search for a species discrimination threshold (Figure 1).

| Enormous but still incompletely covered OTU richness
Our priB-based investigation of 99 freshwater habitats revealed an astonishing total number of 600 species-like Polynucleobacter OTUs.
We cannot be sure that the performed read processing and filtering removed all erroneous sequences; however, the strict sequence length filtering and the removal of sequences containing additional stop codons should have helped to exclude many sequences representing PCR artefacts. Especially the search for additional stop codons increased the confidence in the established sequence data, because their appearance in the single-copy, essential house-keeping gene priB clearly indicates erroneous sequences. Importantly, rather few such sequences were found and all were present in very low copy numbers (Supporting Information S1). We did not perform chimera filtering due to the lack of a suitable priB reference database.
However, we used a phylogenetic tree calculated with all reference and eOTU priB sequences to search for eOTUs displaying unusually F I G U R E 6 (a) Relative abundance of PnecC bacteria determined by FISH (data from Jezbera et al., 2012). Some of the shown data represents samples included in the priB amplicon sequencing (red dots), other samples were obtained from habitats included in the priB sequencing but taken at other dates (blue dots). (b) Polynomial regressions on relationship between pH and operational taxonomic unit (OTU) richness. OTU numbers represent only detections >1% of reads per sample. Samples from running waters were completely excluded from the analyses. Regressions were performed on all remaining samples, all remaining samples with low Ca 2+ communities, and remaining samples from habitats with mediumsized surface area (0.018-0.64 ha). The surface size of all standing water habitats is indicated by grey bubbles (log transformed data) [Colour figure can be viewed at wileyonlinelibrary.com] long branch length. Long branch lengths are expected if two sequences with low similarity contribute larger fractions to a chimeric sequence; however, suspicious long branches were not observed for any eOTU. The increasing percentage of eOTUs towards the rare species end of the rank-abundance distribution (Figure 2) could indicate erroneous sequences; however, an alternative explanation for the decreasing percentage of refOTUs could be that rare species tend to be underrepresented in collections of cultured strains.
Even if we assume that 10%, 20%, or even 30% of the detected 600 OTUs were based on erroneous sequences, an impressive number of detected OTUs would remain. In addition, rarefaction analyses suggested that not even all of the abundant taxa (>10% of priB reads of a particular sample) could be detected by the variety of samples we investigated ( Figure S2B). This indicated that further sampling would increase the detected number of both abundant and rare OTUs. This is not surprising given the observation of taxa with overall low occupancy but locally abundant populations such as P. wuianus and P. meluiroseus. In addition, our study could certainly not cover the entire diversity of freshwater systems in the investigated regions. Running water systems, for instance, were only marginally covered and anoxic hypolimnia of lakes known to be rich in Polynucleobacter bacteria (Diao et al., 2017;Jezbera et al., 2011) were completely omitted. The indicated incomplete coverage of OTUs present in the investigated area is further supported by the lack of detection of 20% of the reference taxa originating from this area ( Figure 7a). Consequently, ecologically broader sampling and inclusion of seasonal aspects are both expected to increase the total number of OTUs detected in the investigated regions.

| Biogeography of taxa mainly reflected regional differences in ecological conditions
Environmental filtering resulted in a strong biogeographic structur-

ing. For instance, OTUs abundant in limestone areas in Central Europe
were not detected in other sampled regions, which lack habitats with high Ca 2+ concentrations ( Figure S1), and low pH preferring OTUs were almost absent from the investigated habitats in Scandinavia with mainly circum-neutral pH. On the other hand, hints on biogeographic structuring caused by an isolation by distance mechanism were scarce. Partial Mantel tests controlling for environmental influences did not suggest that dissimilarity of Polynucleobacter communities increased with geographic distance (  . The detection of OTUs exclusively found in only one of the three investigated regions (Figure 7b) is well explained by the abundance-occupancy relationship documented in macroecology (Gaston et al., 2000), which predicts a positive relationship between the abundance of a taxon and its range occupancy ( Figure 7c). However, in the case of Polynucleobacter bacteria, it is not known if this relationship simply resulted from undersampling of rare OTUs, or if it really reflects restricted biogeographic distributions. Nevertheless, the former explanation seems to be more likely. The rather small fraction of reference taxa originating from lower latitudes and the southern hemisphere detected in the investigated habitats, as well as the very low relative abundance of the detected southern taxa (Figure 7a) confirmed a previously revealed biogeographic pattern (Hahn et al., 2015). Currently, it is still unknown if these patterns result from a distance mechanism (dispersal limitation), or from environmental filtering of Polynucleobacter taxa differing in thermal adaptation (Hahn et al., 2015). In any case, a pro-

nounced further increase in numbers of species-like Polynucleobacter
OTUs has to be expected if future cultivation-independent studies investigate habitats located south of 40°N.

| Complex diversity trends along environmental gradients
An uneven distribution of Polynucleobacter subclusters along pH gradients has been previously reported (Jezbera et al., 2012;Nuy et al., 2020), and based on previous investigations, even within subcluster PnecC, differences in distribution of species along pH gradients are expected (Hahn et al., 2016;Jezbera et al., 2011). Due to the pronounced increased taxonomic resolution provided by the priB marker, much deeper insights into the structuring of PnecC communities by environmental factors are possible. This has revealed a couple of diversity trends.
Importantly, the composition of the PnecC communities did not change continuously along all environmental gradients. A previous study suggested that the composition of PnecC communities is mainly controlled by pH (Jezbera et al., 2011); however, in our study we observed that the majority of OTUs preferred either low or high Ca 2+ concentrations (Figure 4). This trend seemed to be at least partially independent of pH, since alkaline habitats with low and high Ca 2+ concentrations rarely share their inhabitants. Ca 2+ concentrations were tightly correlated with conductivity (R 2 = 0.93, p < .0001), therefore, it is not known if Ca 2+ concentrations or salinity was specifically controlling the composition of the communities. However, coastal habitats with increased NaCl concentrations shared community compositions with low but not high Ca 2+ communities, suggesting that salinity is not the major driver of this distribution. The Ca 2+ concentrations of aquatic systems are mainly controlled by their geological background. Therefore, high Ca 2+ Polynucleobacter communities were restricted to habitats located in limestone areas and characterized by higher Ca 2+ concentrations ( Figure S1). However, even within limestone areas, smaller habitats with low Ca 2+ concentrations inhabited by low Ca 2+ communities were found. Such habitats are limited to systems influenced by peat bogs or, at least, influenced by peat moss (Sphagnum spp.) vegetation. In addition to Ca 2+ concentration, pH had, as expected, a strong influence on the PnecC community composition ( Figure 3); however, OTU composition changed more continuously along the pH gradient.
In botany, it is well known that silicate and limestone soils basically differ in their plant community compositions, at least regarding the nontree species (Bothe, 2015). These two soil types differ in many variables including pH and CaCO 3 content. Due to the manifold factors distinguishing these two soil types, the major drivers of the distinct differences in vegetation composition are unknown (Bothe, 2015). On the other hand, it is well known that in many plant genera pairs of vicarious species evolved, which either dwell on silicate or on limestone soils. Similar vicariances seemed to be given among species of Polynucleobacter subcluster PnecC. Another case of vicariance has been previously reported (Schauer et al., 2005) for planktonic freshwater bacteria affiliated with the two related taxa Candidatus Aquirestis calciphila (aka subcluster LD2) and Candidatus Haliscomenobacter calcifugiens (aka subcluster GKS2-217).
In soils, bacterial OTU richness shows a unimodal distribution along pH gradients with a richness peak at about neutral pH and a 3-to 4-fold change of richness across the pH gradient (Bickel et al., 2019;Fierer & Jackson, 2006). In comparison, the increase in PnecC Along with OTU richness, Shannon diversity increased with pH ( Figure 3b). This increase in diversity was accompanied by an increase of community dissimilarity among communities dwelling in habitats of similar pH (Figure 3b). This trend was obvious even if habitats with high Ca 2+ concentrations were excluded (Figure 5a). We found no suggestion of a general increase of environmental diversity between different pH classes along the gradient towards higher pH values ( Figure 5b); however, the measured environmental variables seemed to be only poor predictors of variance of composition of PnecC communities (Figure 5c). The increase of dissimilarity among communities of the same pH category is associated with a decreasing trend in occupancy of taxa ( Figure S6A). This could be explained by a more stochastic community assembly in habitats with higher pH (Nemergut et al., 2013), combined with a higher number of OTUs able to dwell in systems with higher pH (Figure 6b). Persistence of taxa with rather low occupancy in particular habitats over periods of more than one year (e.g., P. wuianus and P. meluiroseus) may be suggestive of historical contingencies (Langenheder & Lindström, 2019) combined with potential local adaptation (Kraemer & Boynton, 2017). However, the observed phenomenon could also be linked to unmeasured abiotic environmental variables or to only locally occurring specific biotic interactions (Zhou & Ning, 2017). Time series and broader sets of measured variables are necessary to obtain insights into the mechanisms responsible for this phenomenon.
Obviously, Polynucleobacter species strongly differ in ecophysiological adaptations (e.g., pH, Ca 2+ -related adaptation) but also in other ecological characteristics such as occupancy. The resolution of the priB marker was high enough to reveal these speciesspecific differences in adaptation and ecological success among Polynucleobacter bacteria, which are undetectable with 16S rRNA sequence-based methods (Figure 8).

| Conclusions
Amplicon sequencing of the priB gene provided an unprecedented insight into the diversity of Polynucleobacter bacteria and structuring of their local communities by environmental factors. The used marker gene revealed patterns and trends invisible to 16S rRNA sequence-based methods. An astonishingly high, yet incompletely covered, species richness was found in the studied area. The observed high richness could indicate a general huge underestimation of bacterial species richness by 16S rRNA-based methods, if the observed high degree of diversification is also present in other bacterial OTUs 99% . Importantly, Polynucleobacter communities showed several patterns well known from macroecological theory, which were previously only observed in phylogenetically much broader microbial taxa and communities (Horner-Devine et al., 2004;Reche et al., 2005;Sogin et al., 2006). This includes species-area and geographic abundance-occupancy relationships, as well as the organization of communities in a few abundant and many rare taxa (rank-abundance curves, (Sogin et al., 2006)). In contrast, the observed opposing trends of abundance and diversity of Polynucleobacter communities along the pH gradient, as well as differences in pH-specific occupancy of taxa along this gradient were unexpected. Obviously, priB amplicon sequencing provides a possibility to study the mechanisms of community assembly in great detail. Furthermore, this method may provide an opportunity to measure the response of some important freshwater bacteria to environmental changes caused by anthropogenic impact (Kraemer et al., 2020) with higher sensitivity than synecological methods based on ribosomal markers. However, detailed studies on the influence of various environmental factors and time series are needed to better understand the mechanisms structuring Polynucleobacter communities and influencing the occupancy of particular taxa.

ACK N OWLED G EM ENTS
We thank Ulrike Koll and Johanna Schmidt for isolation of strains, DNA isolation, and processing of half of the samples for priB amplicon sequencing, and Johanna Schmidt for determination of major ion concentrations. We thank "Le Syndicat Mixte du Parc Naturel Régional de Corse et le gestionnaire des lacs d'altitude sur son territoire" for permission to take samples in the Corsica Natural Park (France), and we thank the National Park Hohe Tauern (Austria) and the owners of lakes for sampling permissions.

DATA AVA I L A B I L I T Y S TAT E M E N T
Details on the sampled habitats and the measured environmental variable are provided in Supporting Information (Table S1 and