Locality or habitat? Exploring predictors of biodiversity in Amazonia



Background
The Amazon drainage basin (Amazonia) has the highest biodiversity of all tropical rainforests and is a global biodiversity hotspot (Hansen et al. 2013). However, large areas of Amazonia are severely understudied, and a major proportion of its biodiversity remains poorly characterized. Despite the gaps in our understanding of Amazonian biodiversity, some broad-scale patterns have been identified.
On a large scale, a west-to-east diversity gradient from highly diverse areas on the Andean slopes in western Amazonia towards relatively less diverse areas on the Guiana shield and the eastern Amazonian lowlands has been observed in many animal and plant groups (ter Steege et al. 2003, Bass et al. 2010, Hoorn et al. 2010, Zizka et al. 2018). The drivers of this gradient remain elusive, but it has been attributed to bedrock geology (Tuomisto et al. 2017), historical processes including mountain and basin formation (Hoorn et al. 2010), marine incursions (Bates 2001, Lovejoy et al. 2006, Antonelli et al. 2009), and soil fertility (Moran et al. 2000).
On a finer scale, distinct vegetation types play a major role in structuring the distribution of plant and animal species throughout Amazonia. These vegetation (habitat) types are closely linked to soil characteristics, flooding regime, and nutrient availability. The four most widespread and important habitat types are: 1) unflooded terra-firme forests, considered the most biodiverse habitat type and generally characterized by latosols (Falesi 1984; Fig. 1D); 2) várzeas, fertile forests that are seasonally flooded by nutrient-rich white-water rivers for up to 240 days per year (Junk et al. 1989; Fig. 1C); 3) igapós, less fertile forests that are seasonally flooded by nutrient-poor black-water rivers (Junk et al. 2011; Fig. 1F); and 4) naturally open areas, forest-free 'islands' in the 'sea' of forest that are dominated by grasses and shrubs (campinas) or low-canopy forest (Fine et al. 2005; Fig. 1E). Terra-firme forests cover the majority of Amazonia in terms of area, whereas várzeas and igapós jointly cover 5-7% (Peres 1997), and naturally open areas around 1.6%. These four habitat types support distinct communities of plants and animals and are often associated with differences in species richness and community composition.
The biodiversity patterns described for Amazonia have been deduced from large organisms such as birds, mammals, and trees (Bass et al. 2010, Hoorn et al. 2010, Zizka et al. 2018). In contrast, we know very little about the 'hidden biodiversity' of inconspicuous but species-rich microorganisms such as fungi, nematodes, and bacteria (but see Bates et al. 2013, Peay et al. 2013, Lentendu et al. 2017, Mahé et al. 2017). This is problematic, because these groups likely comprise the vast majority of biodiversity in terrestrial ecosystems (Mora et al. 2011) and play pivotal roles in nutrient cycles and ecosystem functioning (Dominati et al. 2010). This lack of basic understanding of biodiversity patterns is not only problematic from a scientific perspective, but may also compromise the effectiveness of conservation strategies and hinder the sustainable development of Amazonia.
The scarce knowledge of the majority of Amazonian biodiversity is mostly due to difficulties in sampling and taxonomically identifying these groups, notably insects, fungi, nematodes, and bacteria. High-throughput DNA sequencing methods overcome many of the present challenges to studying biodiversity (Biggs et al. 2015). In particular, metabarcoding (Taberlet et al. 2012) allows quantification of the genetic and, to a certain degree, taxonomic diversity of a locality without the need to spend years collecting and examining specimens (Gibson et al. 2014). Even in highly diverse and poorly sampled environments such as tropical rainforests, areas for which reference sequence databases are very thinly populated, the use of operational taxonomic units (OTUs, Blaxter et al. 2005) defined on the basis of sequence similarity makes the assessment and comparison of biodiversity across sites possible (Balmford and Whitten 2003, Giam et al. 2012, Stahlhut et al. 2013). In this study, we investigate patterns in the genetic diversity of soil and litter on a west-to-east transect across Brazilian Amazonia along the Amazon River. We analyse environmental DNA through metabarcoding, using ribosomal 16S (prokaryote) and 18S (eukaryote) gene sequences as proxies to provide one of the first large-scale biodiversity assessments across Amazonia. Specifically, we test two hypotheses on correlates of OTU richness and community structure from 39 study plots.
Hypothesis 1 - microorganism OTU richness follows the patterns previously documented for vertebrates and plants, decreasing from west to east across Amazonia. At a finer scale, richness is determined by habitat type and is highest where stress is low and nutrient availability is high (terra-firme > várzea > igapó > campinas).
Hypothesis 2 - OTU community structure is linked to geographic proximity, such that community similarity decreases with geographic distance, as observed in vertebrates. OTU community structure also reflects habitat type and is more similar within than among habitat types.

Sampling localities
We sampled 39 plots at four localities across a large longitudinal range in Brazilian Amazonia (Fig. 1). The fieldwork took place between 5 and 29 November 2015. Localities were selected to maximize geographic distance and the number of habitat types (terra-firme, igapós, várzeas, and campinas). The localities are:
A) Benjamin Constant (BC) - a municipality near the border of Brazil, Colombia, and Peru, situated approximately 1100 km west of Manaus at the upper Amazonas River (4.383°S, 70.017°W). This region is accessible by boat only and is characterized by a low human population density and relatively low rates of deforestation. The region is situated on the southern margin of the Amazon River and supports large areas of várzeas, terra-firme, and some igapó forests.
B) Jaú (JAU) - a national park that encompasses an area of 2 272 000 ha. It is located on the lower Rio Negro (1.850°S, 61.616°W), 200 km northwest of Manaus. About 70% of the forested area is covered by terra-firme forest (Borges et al. 2001). There is considerable heterogeneity in local plant communities in the terra-firme forests due to soil mosaics in the region. This heterogeneity might also be related to human disturbance (Ferreira and Prance 1999). Approximately 12% is covered by igapó forests. Jaú also includes Novo Airão (2.620°S, 60.944°W), a site with campinas of low shrubby vegetation with only a few trees reaching above 5 m in height. These open areas are close to the road AM 325 and are thus subject to relatively high anthropogenic influences.
C) Reserva do Cuieras (CUI) and Reserva da Campina (CUI) - Reserva do Cuieras is a nature reserve covering 22.7 ha, located about 70 km north of Manaus (2.609°S, 60.217°W). The vegetation is a mosaic of evergreen forest with a canopy height of about 35-40 m, with emergent trees over 45 m tall. Igapós and campinaranas cover 43% of the area, whereas terra-firme forests occupy 57% (Zanchi et al. 2011). Reserva da Campina is a nature reserve located 60 km north of Manaus (2.592°S, 60.030°W), on the right side of the Negro river. It covers approximately 900 ha, of which 6.5 ha is campinas and campinaranas. The campinas (2.6 ha) are mosaics of shrub islands surrounded by bare white sandy soil, with a canopy height of about 4-7 m.
D) Caxiuanã (CXN) - a national forest of 371 000 ha of rainforest located 350 km west of Belém (1.735°S, 51.463°W) in the lower Amazon region of northern Brazil. About 85% of the area is covered by terra-firme forest and about 10% by várzea and igapó forests, but this reserve also has some campinas (Behling and Costa 2000).

Sampling design
We installed three temporary circular plots with a 28 m radius in each habitat type at each locality, totalling 39 plots (9 each at BC, JAU, and CUI, and 12 at CXN), following the soil sampling protocol of Tedersoo et al. (2014). Inside each plot, we selected 20 trees at random and collected two litter samples and two soil cores on opposite sides of each tree, for a total of 40 litter and 40 soil samples per plot. We pooled all samples to obtain one litter and one soil sample for each plot. Litter was defined as all organic material above the mineral soil and varied from 0 to ca 50 cm in thickness. We collected the soil samples from the top 5 cm of the mineral soil using a metal probe with a 2.5 cm diameter. We used gloves and masks and changed equipment between plots to reduce the risk of cross-plot contamination. The samples were stored in sterilized white silica gel (1-4 mm), pre-treated by 2 min of microwave heating (800 W) and 15 min of UV light. We recorded GPS coordinates for all plots. All dry samples were processed at the Univ. of Gothenburg, Sweden.

DNA extraction
For total DNA extraction, we used the PowerMax® Soil DNA Isolation Kit (MO BIO Laboratories, USA), according to the manufacturer's instructions. We used 10 g (dry weight) of each soil sample and 15 ml of each litter sample (corresponding to 3-10 g of dry weight litter, depending on texture and composition, because we standardized the litter samples by volume). Each DNA sample was concentrated and cleaned following the instructions of the PowerMax® Soil DNA Isolation Kit (MO BIO Laboratories, USA).
18S: we targeted the V7 region of the 18S rRNA gene using the forward and reverse primers (5'-TTTGTCTGSTTAATTSCG-3') and (5'-TCACAGACCTGTTATTGC-3') designed by Guardiola et al. (2015) to yield fragments 100-110 bases long. Amplification was performed in a total volume of 25 μl and consisted of: 0.25 μl of AmpliTaq Gold DNA polymerase (5 U/μl), 2.5 μl of 10× Pfu polymerase buffer, 0.5 μl of dNTP (final concentration of each dNTP 200 µmol; all above mentioned reagents are from Promega®, Sweden), 0.25 μl of 50 mol of each of the forward and reverse primers, 20.25 μl of nuclease-free water, and 1 μl of DNA template. The PCR conditions were an initial denaturation step of 2 min at 95°C and then 30 cycles of denaturation at 95°C for 1 min, hybridization at 50°C for 45 s, and elongation at 72°C for 1 min, followed by a final elongation at 72°C for 10 min and a final hold at 4°C. Each sample was amplified three times and pooled to reduce biases caused by variation in amplification efficiency among species and by stochastic effects of amplification (Carew et al. 2013, Edgar 2013, Piñol et al. 2015). The quality of the amplification was checked under UV light using GelRed™ stain (1%; Biotium, USA) on a 2% agarose gel. All samples were purified using the QIAquick® PCR purification kit. Dual PCR amplifications were performed for Illumina MiSeq sequencing (Illumina, USA), using fusion primers as described in Bourlat et al. (2016). For indexing, we used the Nextera XT DNA index kit (Illumina, USA) according to the manufacturer's instructions. We checked the quality of the PCR products on a 2% agarose gel. We then performed size selection using magnetic beads and a magnetic stand, at a bead-to-PCR-product ratio of 0.9:1. We checked the DNA concentration in a Qubit 3.0® fluorimeter (Invitrogen, Sweden), and we assessed the quality and size selection of the PCR products with an Agilent 2200 TapeStation® (Agilent, USA). We normalized the PCR products to the same concentration and pooled them following the Illumina protocol. The samples were sequenced at SciLifeLab (Stockholm, Sweden) using an Illumina MiSeq 2×250 machine.
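The reaction recipe and cycling programme above can be checked with simple arithmetic. The following Python sketch is only an illustration and not part of the original protocol; it assumes that 0.25 μl is added for each of the two primers (the reading under which the listed components sum to the stated 25 μl), and the function names are hypothetical.

```python
# Per-reaction volumes in microlitres, as listed in the text.
# Assumption: 0.25 ul is added for EACH primer (forward and reverse).
REACTION_UL = {
    "AmpliTaq Gold DNA polymerase": 0.25,
    "10x polymerase buffer": 2.5,
    "dNTP mix": 0.5,
    "forward primer": 0.25,
    "reverse primer": 0.25,
    "nuclease-free water": 20.25,
    "DNA template": 1.0,
}


def total_volume(mix):
    """Sum the component volumes of one reaction."""
    return sum(mix.values())


def cycling_minutes(cycles=30):
    """Programmed hold time in minutes, excluding ramping and the 4 C hold."""
    initial_denaturation = 2.0
    per_cycle = 1.0 + 45 / 60 + 1.0  # 95 C, 50 C and 72 C steps
    final_elongation = 10.0
    return initial_denaturation + cycles * per_cycle + final_elongation


print(total_volume(REACTION_UL))  # 25.0
print(cycling_minutes())          # 94.5
```

Under these assumptions the components add up to the stated 25 μl, and the programmed steps take about 94.5 min before ramping time is counted.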

Sequence analyses and taxonomic assessment
We used the USEARCH/UPARSE ver. 9.0.2132 Illumina paired reads pipeline (Edgar 2013) to quality-filter, dereplicate, and sort reads by abundance, to infer OTUs, and to remove singletons. We filtered the sequences to discard chimeras and clustered sequences into OTUs at a minimum similarity of 97% using a 'greedy' algorithm that performs chimera filtering and OTU clustering simultaneously (Edgar 2013). We used SILVAngs 1.3 (Quast et al. 2012) to assess the taxonomic composition of the OTUs, using a representative sequence from each OTU as the query sequence. We used the SINA ver. 1.2.10 reference data for ARB SVN (revision 21008, Pruesse et al. 2012) for both markers.
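To illustrate the general idea of dereplication, singleton removal, and greedy abundance-sorted clustering at 97% similarity, a toy Python sketch follows. It is not the UPARSE implementation: it omits chimera filtering and sequence alignment, and it measures identity as the fraction of matching positions between equal-length reads, which is an assumption made only to keep the example short. The function names and the example reads are hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions; toy measure for equal-length reads."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold=0.97):
    """Dereplicate, drop singletons, then cluster greedily by abundance."""
    counts = Counter(reads)                                  # dereplication
    uniques = [(s, n) for s, n in counts.items() if n > 1]   # remove singletons
    uniques.sort(key=lambda item: (-item[1], item[0]))       # most abundant first
    centroids = []   # one representative sequence per OTU
    sizes = {}       # centroid sequence -> number of reads in the OTU
    for seq, n in uniques:
        for centroid in centroids:
            if identity(seq, centroid) >= threshold:
                sizes[centroid] += n      # join the first matching OTU
                break
        else:
            centroids.append(seq)         # no match: seq founds a new OTU
            sizes[seq] = n
    return sizes


# Hypothetical reads of 100 bases each.
base = "ACGT" * 25
variant = base[:-1] + "A"     # one mismatch to base, 99% identity
other = "T" * 100             # 25% identity to base
reads = [base] * 5 + [variant] * 2 + [other] * 3 + ["G" * 100]
otus = greedy_otus(reads)
print(len(otus), otus[base], otus[other])  # 2 7 3
```

Processing the most abundant sequences first means that abundant sequences become the OTU representatives, which is the design choice that abundance sorting serves in greedy clustering.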

Statistical analyses
We performed all statistical analyses in R ver. 3.4.2 (R Development Core Team). We pooled the litter and soil OTUs from each plot for all analyses, as our goal was to address general diversity patterns and not to compare the OTU composition of different substrates. We used the stringr ver. We rarefied all samples to equal depth, where the depth was determined by the lowest number of reads obtained from a single plot (55 111 for 16S and 10 919 for 18S; Supplementary material Appendix 1 Fig. A1). We subsequently transformed the OTU tables to presence/absence for both prokaryote (16S) and eukaryote (18S) data. Read abundances have been shown to be largely unreliable, especially for the 18S marker, due to the large variance in biomass of the study organisms (e.g. protozoa vs trees) (Carew et al. 2013, Deagle et al. 2013). Additionally, both the 16S and the 18S genes are multicopy genes, and hundreds of 18S copies per cell are known from some eukaryotes (Lindner et al. 2013).
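The rarefaction and presence/absence steps can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: it subsamples reads without replacement to the depth of the poorest plot and then keeps only which OTUs are present. The OTU table, plot names, and function names are hypothetical.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one plot."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    picked = rng.sample(pool, depth)
    out = {}
    for otu in picked:
        out[otu] = out.get(otu, 0) + 1
    return out


def rarefy_table(table, seed=1):
    """Rarefy every plot to the smallest read total found in the table."""
    rng = random.Random(seed)
    depth = min(sum(c.values()) for c in table.values())
    return {plot: rarefy(c, depth, rng) for plot, c in table.items()}, depth


def presence_absence(counts):
    """Reduce read counts to the set of OTUs detected."""
    return {otu for otu, n in counts.items() if n > 0}


# Hypothetical OTU table: plot -> {OTU: read count}.
table = {
    "plot_a": {"otu1": 50, "otu2": 30, "otu3": 1},
    "plot_b": {"otu1": 10, "otu2": 5},
}
rarefied, depth = rarefy_table(table)
print(depth)                                          # 15
print(sorted(presence_absence(rarefied["plot_b"])))   # ['otu1', 'otu2']
```

Because rare OTUs such as `otu3` may or may not survive subsampling, richness after rarefaction depends on the random draw, which is why a fixed seed is used here.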
To test hypothesis 1, we fitted a two-way ANOVA model with longitude and habitat as predictors and prokaryote (16S) and eukaryote (18S) OTU richness as response variables, respectively. We also fitted a two-way ANOVA model with locality and habitat as predictors and performed a Tukey honest significant difference (HSD) test to evaluate the significance of among-group differences. We furthermore fitted a general linear model considering the interaction of habitat with longitude and the effect of taxonomic groups (glmm = richness ~ habitat × longitude + (1|taxa)).
To test hypothesis 2, we performed a permutational multivariate analysis of variance (PERMANOVA) using habitat and longitude as predictors and Jaccard dissimilarity matrices of prokaryote (16S) and eukaryote (18S) OTUs as response variables, respectively, using the 'vegan' package ver. 2.4-3 (Oksanen et al. 2007) in R. Additionally, we constructed two-dimensional non-metric multidimensional scaling (NMDS) ordinations of the presence/absence matrices of prokaryote (16S) and eukaryote (18S) data using the Jaccard dissimilarity index, as implemented in the metaMDS function in the vegan package, to analyse community dissimilarity among all samples. We then used the 'envfit' method implemented in vegan to fit locality and environmental type onto the NMDS ordination as a measure of the correlation of these factors with the NMDS axes. We tested the isolating effect of distance with a Mantel test, and visualized the OTU community similarity among plots with the 'qgraph' function in R using a similarity index (1/Jaccard dissimilarity). Furthermore, we performed variation partitioning on the prokaryote (16S) and eukaryote (18S) communities to investigate the compositional effects of changing habitat and locality. Variation partitioning resolves the contribution of habitat and locality to the total community variation (Legendre and Legendre 1998), and also resolves the variation shared between factors (i.e. shared between habitat and locality). We used the 'varpart' function of the vegan package and assessed the significance of each section of the variation partitioning using redundancy analysis. We analysed the community matrix in models against habitat, locality, or both together as explanatory variables. Finally, we constructed Venn diagrams with the 'gplots' package in R to check the number of exclusive and shared OTUs as a complement to the community structure analyses.
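The Jaccard dissimilarity and the Mantel test described above can be sketched in a few lines. The study used the vegan package in R; the Python code below is only an illustrative stand-in with hypothetical function names, hypothetical plots placed along a line, and a simple one-sided permutation p-value.

```python
import random


def jaccard(a, b):
    """Jaccard dissimilarity between two sets of OTUs."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def lower_triangle(m, order):
    """Lower-triangle entries of matrix m after reordering plots by `order`."""
    return [m[order[i]][order[j]] for i in range(len(order)) for j in range(i)]


def mantel(d1, d2, permutations=999, seed=1):
    """Correlation between two distance matrices with a permutation p-value."""
    idx = list(range(len(d1)))
    fixed = lower_triangle(d2, idx)
    observed = pearson(lower_triangle(d1, idx), fixed)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = idx[:]
        rng.shuffle(perm)  # relabel the plots of the first matrix
        if pearson(lower_triangle(d1, perm), fixed) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Hypothetical plots: position along a transect (km) and detected OTUs.
positions = [0, 1, 10, 11]
communities = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7}, {5, 6, 8}]
n = len(positions)
community_d = [[jaccard(communities[i], communities[j]) for j in range(n)] for i in range(n)]
geographic_d = [[abs(positions[i] - positions[j]) for j in range(n)] for i in range(n)]
r, p = mantel(community_d, geographic_d)
print(r, p)
```

With only four plots the number of distinct permutations is small, so the p-value in this toy case is coarse; the point is only to show how whole plots, not individual distances, are permuted.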

Results
We obtained a total of 2 984 233 reads and 6625 OTUs for prokaryotes (16S) and 9 149 502 reads and 15 840 OTUs for eukaryotes (18S). See Supplementary material Appendix 1 Table A1 for the number of sequences and OTUs for each plot after rarefaction (see also Supplementary material Appendix 1 Fig. A2 for rarefaction OTU number by plot). The correlation between prokaryote (16S) and eukaryote (18S) OTU richness was weak overall and absent when each locality was considered separately, with only the CUI locality showing a significant correlation (Supplementary material Appendix 1 Fig. A5 and A4).

Taxonomic composition
The taxonomic composition of the prokaryote component shows that the groups with the highest number of OTUs were Proteobacteria (22% of the taxa identified in our samples, equivalent to about 1000 OTUs per habitat and locality, most of which belonged to Alphaproteobacteria; Fig. 2A and 2B) and Chloroflexi (15%, average ~600 OTUs; Fig. 2A and 2B). For eukaryotes, the group with the highest number of OTUs was Fungi (30%, ~2400 OTUs, mainly Ascomycota and Basidiomycota; Fig. 2C and 2D), followed by Cercozoa (15%, ~1100 OTUs; Fig. 2C and 2D) and Alveolata (10%, ~750 OTUs; Fig. 2C and 2D). Most of the eukaryotic OTUs for all taxa were relatively 'range-restricted', occurring in fewer than five plots. In contrast, prokaryote OTUs were generally widespread across many plots (Supplementary material Appendix 1 Fig. A6 and A7). Due to the small number of replicates (39 plots) in relation to the number of taxonomic groups, we did not explicitly test the patterns of richness or community composition across habitat types or localities for each taxonomic group at the OTU level. However, based on a qualitative (visual) inspection of the results, we did not observe any difference in taxonomic composition by locality or habitat for either prokaryotes (16S) or eukaryotes (18S). The general linear model assessing the effect of habitat and longitude by taxonomic group was non-significant (p > 0.05 for both prokaryotes and eukaryotes).

Hypothesis 1 -OTU richness by locality and habitat
We found a significant effect of longitude on OTU richness for prokaryotes (F [1,34] = 5.36, p < 0.05). The highest mean OTU richness for prokaryotes (16S) was found in BC (2076 OTUs), followed by CXN (1756) and JAU (1587), and the lowest was found in CUI (1462). The analysis of variance showed a significant effect of locality (F [3,32] = 26.63, p < 0.001). The Tukey HSD test showed a significant pairwise difference between BC and all other locations, as well as between CXN and CUI. Concerning habitat type, the mean OTU number was the highest in várzeas (1948) followed by campinas (1827), terra-firmes (1686), and finally igapós (1570). The analysis of variance showed a significant effect of habitat type (F [3,32] = 10.87, p < 0.001) on the prokaryote (16S) OTU richness (Fig. 3A). The Tukey HSD test showed a significant difference in OTU richness between campinas and both igapós and terra-firme.
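The degrees of freedom reported with the F statistics above are consistent with the sampling design of 39 plots. The short Python sketch below shows the arithmetic; it assumes additive models with an intercept and no interaction term, which is an assumption inferred from the reported values rather than a statement from the text, and the function name is hypothetical.

```python
def residual_df(n_plots, predictor_dfs):
    """Residual degrees of freedom of an additive linear model with an intercept."""
    return n_plots - 1 - sum(predictor_dfs)


N_PLOTS = 39
HABITAT_DF = 4 - 1    # terra-firme, varzea, igapo, campina
LOCALITY_DF = 4 - 1   # BC, JAU, CUI, CXN
LONGITUDE_DF = 1      # continuous predictor

# Model with longitude and habitat: matches the reported F[1,34].
print(residual_df(N_PLOTS, [LONGITUDE_DF, HABITAT_DF]))  # 34
# Model with locality and habitat: matches the reported F[3,32].
print(residual_df(N_PLOTS, [LOCALITY_DF, HABITAT_DF]))   # 32
```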

Hypothesis 2 -OTU community structure by location and habitat
The PERMANOVA results showed a significant effect of habitat (16S [R² = 0.35, p < 0.001]; 18S [R² = 0.12, p < 0.001]) and longitude (16S [R² = 0.22, p < 0.005]; 18S [R² = 0.07, p < 0.001]) for both the prokaryote (16S) and the eukaryote (18S) communities. Variation partitioning also identified significant proportions of both prokaryote (16S) and eukaryote (18S) community variation associated with habitat and locality, but with no shared variation between them. About 50% of the full prokaryote (16S) community variation was explained by the analysis, with 29% contributed by habitat and 21% by locality. A lower percentage of the total community variation was explained in the eukaryote (18S) analysis (27%), with habitat still explaining a larger proportion of the community variation (16%) than did locality (11%).
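For readers unfamiliar with how PERMANOVA yields an R² and a p-value from a dissimilarity matrix, a simplified one-factor sketch in Python follows. It is not the analysis run in the study, which used vegan in R with two predictors; the function names and the six-plot dissimilarity matrix are hypothetical, and the sums of squares follow the standard distance-based partitioning.

```python
import random


def ss_total(d, idx):
    """Total sum of squares from pairwise dissimilarities."""
    n = len(idx)
    return sum(d[idx[i]][idx[j]] ** 2 for i in range(n) for j in range(i)) / n


def ss_within(d, groups, idx):
    """Within-group sum of squares; idx maps label positions to plots."""
    members = {}
    for pos, g in enumerate(groups):
        members.setdefault(g, []).append(idx[pos])
    total = 0.0
    for plots in members.values():
        k = len(plots)
        total += sum(d[plots[i]][plots[j]] ** 2 for i in range(k) for j in range(i)) / k
    return total


def permanova(d, groups, permutations=999, seed=1):
    """One-factor PERMANOVA: pseudo-F, R2 and permutation p-value."""
    n = len(groups)
    a = len(set(groups))
    idx = list(range(n))
    sst = ss_total(d, idx)

    def pseudo_f(order):
        ssw = ss_within(d, groups, order)
        return ((sst - ssw) / (a - 1)) / (ssw / (n - a)), (sst - ssw) / sst

    f_obs, r2 = pseudo_f(idx)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = idx[:]
        rng.shuffle(perm)  # reassign plots to the fixed group labels
        if pseudo_f(perm)[0] >= f_obs:
            hits += 1
    return f_obs, r2, hits / (permutations + 1)


# Hypothetical data: dissimilarity 0.2 within a habitat, 0.8 between habitats.
groups = ["varzea"] * 3 + ["igapo"] * 3
d = [[0.0 if i == j else (0.2 if groups[i] == groups[j] else 0.8)
      for j in range(6)] for i in range(6)]
f_obs, r2, p = permanova(d, groups)
print(round(f_obs, 2), round(r2, 2))  # 46.0 0.92
```

Here R² is the share of the total sum of squares attributed to the grouping factor, which is the quantity reported as R² in the results above.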
The similarity network analysis (Fig. 4) reveals a stronger (more aggregated) community structure for prokaryotes (Fig. 4A coloured by locality and Fig. 4B coloured by habitat type) than for eukaryotes (Fig. 4C coloured by locality and Fig. 4D coloured by habitat type). Plots with the highest similarity occurred within the same locality and habitat. In both prokaryote and eukaryote communities, BC clustered tightly together and distinctly from the other localities. The results of the NMDS show similarity in community composition among the plots and the influence of locality and environmental type. The envfit test indicated significant effects of locality on both the prokaryote (R² = 0.38; p < 0.001) and eukaryote (R² = 0.33; p < 0.001) communities. The envfit test also indicated a significant effect of habitat type on the prokaryote (R² = 0.54; p < 0.001) and eukaryote (R² = 0.54; p < 0.001) communities.
The Venn diagrams show the number of unique and shared OTUs for each locality (Fig. 5A for 16S and Fig. 5D for 18S). The lowest number of unique OTUs was found in várzeas (16S = 218; 18S = 1164), followed by igapós (16S = 308; 18S = 2181). The habitat with the highest number of unique OTUs was campinas for prokaryotes (805) and terra-firme for eukaryotes. These results indicate that OTU richness varies significantly with location and habitat type in 16S, with higher richness in BC and in campinas. This effect is not observed for the 18S data.
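The counts behind a Venn diagram of this kind reduce to set operations on the presence/absence data. The Python sketch below illustrates this with hypothetical OTU identifiers and a hypothetical function name; the study itself used the 'gplots' package in R.

```python
def unique_and_shared(otus_by_group):
    """Return OTUs exclusive to each group and OTUs present in every group."""
    unique = {}
    for name, otus in otus_by_group.items():
        others = set()
        for other_name, other_otus in otus_by_group.items():
            if other_name != name:
                others |= other_otus
        unique[name] = otus - others   # found in this group and nowhere else
    shared = set.intersection(*otus_by_group.values())
    return unique, shared


# Hypothetical OTU sets per habitat.
otus_by_habitat = {
    "varzea": {"a", "b", "c"},
    "igapo": {"b", "c", "d"},
    "campina": {"c", "e", "f"},
}
unique, shared = unique_and_shared(otus_by_habitat)
print({name: len(otus) for name, otus in unique.items()})  # {'varzea': 1, 'igapo': 1, 'campina': 2}
print(sorted(shared))                                      # ['c']
```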

Discussion
Here we provide a first mapping of Amazonian biodiversity that considers not only macroscopic organisms but also the microscopic, microbial component across a large geographic scale. We found that prokaryote (16S) and eukaryote (18S) OTU richness and community composition differ significantly among localities and habitats, with habitat type being a stronger predictor of diversity and community composition than locality.

Contrasting prokaryote and eukaryote diversity
The weak correlation found between prokaryote (16S) and eukaryote (18S) OTU richness indicates that richness patterns may differ between the two groups. Previous reports found that localities with high bacterial diversity can have relatively low levels of plant diversity (Barthlott et al. 1999). In south-eastern Brazilian Amazonia, vast areas have been converted for agricultural use, and those areas are notably poor in animal and plant diversity. Surprisingly, some of them have been shown to have higher bacterial diversity than natural areas (Mendes et al. 2015). Taken together, our results add to current evidence that prokaryote and eukaryote diversity may be largely decoupled, and indicate that it would be inadequate to use one group as a proxy of diversity for the other.

Determinants of Amazonian diversity
Our results show that the OTU community composition reflects locality and habitat type, whereas OTU richness reflects locality and habitat type only for the prokaryote (16S) part of our data. This was to some extent expected, as localities with different OTU community composition can still be similar in terms of overall OTU richness. Furthermore, all localities and habitat types were found to have large numbers of unique OTUs. Our results show both locality and habitat type to be important factors in explaining Amazonia's diversity distribution, with habitat type being the strongest factor.
Contrary to our expectations based on studies of macroorganisms, we did not find a significant linear gradient of eukaryote (18S) OTU richness from west to east in Amazonia, although a trend could be observed (Fig. 3). In contrast, we did find a significant negative effect of longitude on prokaryote (16S) OTU richness. The richness pattern of localities for both prokaryotes and eukaryotes was BC > CXN > JAU > CUI. In the community analyses, we found a grouping between BC and CXN and another between JAU and CUI in the prokaryote data. We obtained a similar result for 18S, although the signal was less clear. BC and CXN also had the largest number of shared OTUs. These patterns were expected: even though BC is located in western Amazonia and CXN is situated in the easternmost part of our sampling design (representing the extremes of our longitudinal gradient), both localities are bathed by rich sediments from white waters, the Amazon river in BC and Baía de Caxiuanã in CXN, which is part of the former Anapu River (Ferreira et al. 2005). Water type has been suggested as an important factor structuring Amazonia's biodiversity (Wittmann et al. 2010), with white-water basins being generally more diverse than black-water basins. Furthermore, BC and CXN are the only localities with várzea, which is the environment with the highest OTU richness. All these factors are likely responsible for the higher OTU richness and the compositional similarity between BC and CXN. BC and CXN are also the localities with the highest number of unique OTUs, which could emphasize their importance in conservation strategies. In contrast, both JAU and CUI are bathed by acidic, sediment-poor rivers, which is characteristic of both the Negro (JAU) and Cuieras (CUI) rivers.
We found different richness patterns across the environmental types surveyed. Based on previously documented patterns for macro-organisms, we expected richness to decrease in the order terra-firme > várzea > igapó > campina. Therefore, it was surprising to observe várzea > campina > terra-firme > igapó for prokaryotes and campina > várzea > terra-firme > igapó for our eukaryote dataset. Várzeas are considered to be a stressful environment for many organisms, with long periods of flooding, but they have fertile soils that could explain the higher OTU richness. Our finding that campinas was the richest habitat for eukaryotes and the second richest for prokaryotes is puzzling. Campinas are nutrient-poor (Prance 1996, Fine et al. 2005), with scleromorphic physiognomy (Anderson 1981), and are relatively low in diversity of macro-organisms (Wüster et al. 2005, Smith et al. 2012). The third place in richness was held by terra-firme. We found this to be at least as unexpected, given that terra-firmes have been by far the most predominant vegetation type in the Amazon basin since the beginning of the Miocene and contain a very high macro-organismal diversity (Jaramillo et al. 2006, Irion and Kalliola 2010). However, such high documented richness could in part be an effect of the disproportionately larger area of terra-firmes as compared to the other habitat types, an effect that should not be evident in our sampling design. Interestingly, the várzeas and igapós are more similar to each other in terms of community composition despite their differences in OTU richness, which could be related to similar environmental filters linked to stress by flooding, potentially favouring a shared set of specialized organisms. The community similarity between campinas and terra-firmes might be linked to the fact that campinas are often small 'islands' within large 'seas' of terra-firme forest, and in this way these campinas may receive DNA from the surrounding forests (from, e.g. leaves, insects, and fungal spores) in addition to their specific 'specialized' OTU community.
Most eukaryotic (18S) OTUs were restricted to a few plots (Supplementary material Appendix 1 Fig. A8), with the majority restricted to one habitat or one locality. We found the number of unique (site-specific) prokaryote OTUs to decrease in the sequence campinas > terra-firmes > igapós > várzeas, whereas for eukaryotes the pattern was terra-firmes > campinas > igapós > várzeas. These patterns differ markedly from the results of our richness analyses. If we were to consider only richness as a conservation priority, várzeas would be the most important habitat type; but if we were to preserve the most unique environment instead, várzeas would be the least important habitat. Although these results should be viewed with some caution, not least because of the tiny sample they represent out of the enormous region covered, they showcase the difficulties in prioritizing conservation areas based on single metrics (Orme et al. 2005).

Taxonomic composition
The taxonomic resolution of metabarcoding data is limited by the availability of comprehensive reference databases (Cowart et al. 2015). Such databases are generally meagre with respect to Amazonian biodiversity, even for well-studied groups (e.g. trees; Balmford and Whitten 2003, Giam et al. 2012). This reduces our ability to identify many of the OTUs to resolved taxonomic levels, in particular those from less studied groups of organisms (e.g. Platyhelminthes and Alveolata). A further complication with metabarcoding is the compromise between taxonomic coverage and taxonomic resolution. While the universal 18S primers can capture the majority of eukaryotic organisms, this gene is not variable enough to distinguish all eukaryotes at the species level (Hartmann et al. 2010, Lindahl et al. 2013). In addition, since most amplicon studies target only a part of the 18S (e.g. ~110 bases, as in this study) rather than its full length (~2000 bases), the available information and resolution are further decreased. This means that for most plants, family-level designations are usually the most resolved level of taxonomic composition possible using 18S fragments. For many insects and fungi, the precision may be at the order level (Lindahl et al. 2013, Liu et al. 2013).
The lower the taxonomic level considered, the more idiosyncratic its distribution will generally be, adding heterogeneity to general diversity patterns. For instance, Bass et al. (2010) found a west-to-east diversity gradient for Amazonian mammals, but this gradient was not observed by Maestri and Patterson (2016) using a lower taxonomic level (rodents). Looking at individual OTUs in our data, we found striking differences in richness and community composition across sites. When OTUs were collapsed into higher taxonomic groups, such as orders and phyla, these differences were less pronounced (Supplementary material Appendix 1 Fig. A8 and A9).

Conclusions
The broad taxonomic coverage of 16S and 18S, together with the standardised sampling, extraction, and sequencing approaches applied here, allowed us to directly compare biodiversity patterns across a large spatial range in Amazonia. We stress the importance of considering the 'hidden biodiversity' (microscopic, subterranean, or otherwise inconspicuous species) to characterize a larger proportion of the total diversity patterns. We detected a different habitat gradient from what we expected initially, but as expected we found a longitudinal gradient for OTU richness (hypothesis 1) and community composition (hypothesis 2). Furthermore, we found habitat to be the strongest predictive factor of biodiversity, a pattern that was particularly strong for the prokaryote communities. Our results show that the currently accepted diversity patterns in Amazonia do not hold for all organisms, which suggests that biodiversity patterns of different groups of organisms may be largely decoupled. We also found different patterns between richness and uniqueness of OTUs across sites and environmental types, showing the pitfalls in choosing single biodiversity metrics for prioritizing conservation areas.