Microbial communities across a hillslope‐riparian transect shaped by proximity to the stream, groundwater table, and weathered bedrock

Abstract Watersheds are important suppliers of freshwater for human societies. Within mountainous watersheds, microbial communities impact water chemistry and element fluxes as water from precipitation events discharge through soils and underlying weathered rock, yet there is limited information regarding the structure and function of these communities. Within the East River, CO watershed, we conducted a depth‐resolved, hillslope to riparian zone transect study to identify factors that control how microorganisms are distributed and their functions. Metagenomic and geochemical analyses indicate that distance from the East River and proximity to groundwater and underlying weathered shale strongly impact microbial community structure and metabolic potential. Riparian zone microbial communities are compositionally distinct, from the phylum down to the species level, from all hillslope communities. Bacteria from phyla lacking isolated representatives consistently increase in abundance with increasing depth, but only in the riparian zone saturated sediments we found Candidate Phyla Radiation bacteria. Riparian zone microbial communities are functionally differentiated from hillslope communities based on their capacities for carbon and nitrogen fixation and sulfate reduction. Selenium reduction is prominent at depth in weathered shale and saturated riparian zone sediments and could impact water quality. We anticipate that the drivers of community composition and metabolic potential identified throughout the studied transect will predict patterns across the larger watershed hillslope system.


| INTRODUC TI ON
Soil microbial communities impact our environment by driving biogeochemical cycles from centimeter to global scales (Rousk & Bengtson, 2014;Schimel & Schaeffer, 2012). They expedite rock weathering (Gorbushina, 2007;Krumbein, 1988) recycle organic material in the subsurface, and facilitate the growth of vegetation by altering the availability of nutrients in the soil (Wardle et al., 2004).
These changes influence soil nutritional status and productivity and plant survival and biotic interactions.
Mountains contribute the majority of water discharge in river basins (Viviroli, Weingartner, & Messerli, 2003) and were previously considered to be the origin of much of the world's water resources (Rodda, 1994). In recent years, studies have also addressed their contribution to subsurface carbon storage and carbon cycling (Chang et al., 2014;Hagedorn et al., 2010;Wan et al., 2018).
These environments are comprised of a complex system of components, such as forests and meadows, floodplains, and glaciers.
In turn, each of these accommodates various habitats including soil, bare rock, permafrost, and snow. Development of a predictive understanding of the behavior of such a heterogeneous and interconnected set of ecosystem compartments is an extremely complicated undertaking. Employing a scale-adaptive approach in which different ecosystem compartments are considered as "systems within systems" could assist in disentangling the processes that shape overall mountain ecosystem function (Hubbard et al., 2018;Levin, 1992). A first step toward such a goal is to investigate structure and functioning within individual montane ecosystem compartments to provide a basis for future comparative studies and modeling efforts. In the long term, the "systems within systems" approach may better enable predictions accompanying natural or anthropogenic environmental perturbations.
Hillslope and floodplain compartments host the majority of soils in alpine and subalpine mountain ecosystems, and biogeochemical processes that occur there impact downstream ecosystems. Runoff and groundwater transport solutes along the elevation gradient and into aquifers, rivers, and lakes. Soils on hillslopes and in floodplains, and in general, harbor considerable microbial diversity (Donhauser & Frey, 2018;Frey et al., 2016;Rime et al., 2014). Most studies of microbial communities in mountainous soils have been concerned with the microbial community structure across different climate zones on the mountain slopes (Bardelli et al., 2017;Djukic, Zehetner, Mentler, & Gerzabek, 2010;Klimek et al., 2015;Xu et al., 2014;Zhang, Liang, He, & Zhang, 2013). However, most work has focused only on shallow soil, down to 20 cm (Bardelli et al., 2017;Yuan, Si, Wang, Luo, & Zhang, 2014;Zhang et al., 2013) and sometimes only the top 5 cm (Singh et al., 2014). The shallow layer of soil is profoundly affected by low temperatures that frequently drop below 0°C and snow cover that crucially limits biological, chemical, and physical processes, and thus microbial life (Zumsteg, Bååth, Stierli, Zeyer, & Frey, 2013).
In contrast, the deeper soils and weathered rock in mountain ecosystems have been little studied. While affected by events taking place in shallow layers, the microbial communities there are probably also influenced by moisture gradients and the geochemistry of the underlying bedrock (Tytgat et al., 2016).
The watershed has a mean annual temperature of ~0°C, with average minimum and maximum temperatures of −9.2°C and 9.8°C, respectively. The watershed receives ~600 mm of precipitation per year, the bulk of which falls as snow, and is representative of many other headwaters systems within the upper Colorado River Basin (Hubbard et al., 2018;Pribulick et al., 2016).
The present research focused on a lower montane hillslope through floodplain transect located within the East River, CO watershed, which is the focus of the Lawrence Berkeley National Laboratory-led Watershed Function Project. The intensively studied site investigated in the current study is referred to as PLM (Pump House Lower Montane). The Watershed Function Project builds upon a scale-adaptive investigation, which focuses on different spatial and temporal scales within the East River watershed, explores how mountainous watersheds retain and release water, nutrients, carbon, and metals downgradient (Hubbard et al., 2018). The current study aims to lay the groundwork for the scale-adaptive, system within systems approach by identifying ecological niches of interest that would later be tested in a bottom-up approach across the watershed. We hypothesize that microbial community composition and metabolic potential is similar among sites along an altitudinal transect down the hillslope and that hillslope communities differ from those of the floodplain riparian zone. Furthermore, we hypothesize that proximity to shale and groundwater will affect the composition and functionality of microbial communities, differentiating hillslope communities from other watershed microbial consortia.

| Site description and sample collection
The PLM intensive study site is located on the northeast facing slope of the East River valley near Crested Butte, Colorado, USA (38°55′12.56″N, 106°56′55.39″W) (Figures 1 and A1). Exact locations were determined at an accuracy of 0.5 m with a Trimble Geo 7X GPS. All samples were collected during three days in September 2016 from meadow sites before any intensive research activities were performed. The ground surface at each site was cleared of vegetation with a hand trawler prior to sampling.
Samples were collected with a manual corer lined with 7.6 cm tall and 15.2 cm diameter bleached sterile plastic liners. Five soil profile sampling sites abbreviated PLM0, PLM1, PLM2, PLM3, and PLM4 were chosen along a 230 m hillslope transect. The profiles terminated at depth in the unsaturated zone, with the exception of PLM4, which extended below the water table. The base of PLM3 and PLM6 profiles is located near or within the weathered Mancos Shale bedrock, while the base of PLM0 was located >1 m above the weathered bedrock. PLM0 is at the top of the hill and PLM4 on the East River floodplain, 2,804 m and 2,757 m above sea level, respectively (Figure 1). One full core was taken at each sampling depth, and the soil in between sampling depths was removed with an auger. An additional site, PLM6, was sampled by drilling and provided access to weathered shale. Samples at PLM6 were taken from a split-spoon, dry drilled core. In total, 20 samples were collected as follows: 30,60 cm;30,60,100 cm;30 cm;30,60,127 cm;170,200 cm;32,65,90 cm. Immediately after extraction, a sample from each site and depth collected within an individual sterile plastic liner was placed in a sterile Whirl-Pak bag and manually homogenized. Aliquots of 5 g of soil from each bag were placed in 10 ml of LifeGuard Soil Preservation Solution for RNA and DNA co-extraction, whereas the rest of the sample was used for DNA extraction. Care was taken to avoid roots and small rocks. Samples in sterile Whirl-Pak bags and preservation solution were placed in a chilled cooler until processing at the Rocky Mountain Biological Laboratory (RMBL) later that day. In the laboratory, roots and small rocks were removed from sampling bags, and three 10 g subsamples were weighted from each sample and placed in a −80°C freezer. Samples were shipped overnight on dry ice to University of California, Berkeley for DNA and RNA extractions.
Particle size analyses of samples were conducted according standard methods (Gee & Or, 2002). Geochemical measurements were made at the Earth and Environmental Sciences department's Aqueous Geochemistry Laboratory. Water soluble cation-anion composition was measured by water extraction (1:1 soil:DIW mass ratio) and ICP-MS. Total inorganic carbon (TIC) and total organic carbon (TOC) in soil samples were determined using a Shimadzu TOC-VCSH total inorganic and organic carbon analyzer combined with a solid sample combustion unit of SSM-5000A. Total nitrogen (TN) was analyzed using a Shimadzu Total Nitrogen Module

| DNA extraction and sequencing
DNA was extracted from 10 g of soil with DNeasy PowerMax Soil Kit in two batches of 5 g each which were combined during the cleaning step. Extraction process followed the manufacturer's protocol with the following modifications: Soil was vortexed at maximum speed for an additional 3 min in the sodium dodecyl sulfate reagent and then incubated for 30 min at 60°C, with intermittent shaking in place of extended bead beating, as established by Hug et al. (2015).
DNA was further cleaned with DNeasy PowerClean Pro Clean Up Kit following the manufacturer's protocol.
F I G U R E 1 East River Watershed hillslope-riparian zone transect sampling sites. (a) The location of East River PLM intensive study site. (b) Five PLM sites are located across a hillslope transect. PLM0 is the highest point of the transect, and PLM4 is located in the floodplain. (c) Schematic representation of the sampling sites. Elevation of the surface, given in meters above sea level, appears below the name of the sampling site. Maximum depth at each sampling site is specified below the depiction of the sampled core in centimeters. Horizontal distances between sites are given at the bottom of the illustration. Maximum and minimum water levels are depicted by dashed blue and red lines, respectively. The PLM6 site was initially drilled for another study, 5 m from PLM3 but at the same elevation. A full view of the East River watershed is given in Figure A1 (c) Kit. The co-extraction and cleaning steps were conducted according to the manufacturer's protocol. While RNA was extracted for the purpose of another study, using co-extraction as a second extraction method was expected to improve the detection of the total diversity of microbes in the sample (İnceoǧlu, Hoogwout, Hill, & Elsas, 2010).

Soil
Overall, two DNA samples were produced from each sampling, one from DNA extraction and the second from the DNA that was co-ex-

| Bioinformatic analyses
Raw reads processing followed protocols described elsewhere (Hernsdorf et al., 2017). Briefly, reads were trimmed based on quality scores with Sickle (Joshi & Fass, 2011) and assembly was accomplished with IDBA-UD v1. were retrieved from the metagenomes. Average coverage was used as a proxy for relative abundance of different sequence types. In this analysis, the scaffolds were trimmed to include 2 kbp flanking the rpS3 gene. If the scaffold spanned less than 2 kbp on both sides, then the entire scaffold was kept, with a minimal length of 1 kbp.
The relative abundance of each trimmed scaffold was determined by mapping the reads from each sample to each trimmed scaffold with bowtie2 (Langmead & Salzberg, 2012). The average coverage and breadth of coverage of each scaffold in each sample was then calculated (Olm et al., 2017). Each scaffold is considered to be present in at least one sample (at minimum, the sample from which it was originally assembled) but could be falsely identified in other samples due to a low breadth cutoff (i.e., false positive). Therefore, we imple-

| Taxonomy and phylogeny
The longest amino acid sequence from each rpS3 protein sequence cluster was selected as a representative and was compared to a database of rpS3 protein sequences (Hug, Baker, et al., 2016;Hug, Thomas, et al., 2016) using the UBLAST function in USEARCH (Edgar, 2010).
Results were filtered to include only the top hits with e-values < 1e−5.
While each cluster roughly correlates with a species, not all clusters could be taxonomically identified to that level. Therefore, further investigation relied on phylogenetic distance, which enables a high-resolution analysis. A phylogenetic tree was created by aligning only the representative amino acid sequences using MAFFT with an automated strategy (Katoh & Standley, 2013)    were partialled out when inspecting variables as recommended by the bioenv user's manual (Oksanen et al., 2018). The results were evaluated with Pearson's correlation. The significance of the results was validated with Mantel test also using Pearson's correlation.

| Statistics
Maps were retrieved from Google maps database using Google Earth v7.3.2.

| RE SULTS
For the hillslope samples analyzed, the soils are loamy to silty loam ( Figure A2 and Table A3). Shallow samples from PLM0 and PLM1 have higher sand content than downslope PLM3 and PLM4 samples, which have higher content of clay and silt, potentially as a result of downslope fining of transported sediments. Soil moisture increases with proximity to the East River, but decreases with depth ( Figure A3 and Table A4). An exception to this is at the floodplain, where moisture increases close to the water table (72 cm below the ground surface at the time of sampling). The hillslope meadow is dotted with smooth brome (Bromus inermis) and lupines (Lupine sp.); however, neither occurred within a 50 cm radius of the sampling sites (qualitative assessment on site). In contrast, the floodplain is dominated by willows and sedges that are not present on the hillslope. Gopher activity increases downslope, but does not occur at the floodplain location (W. Brown, personal communication, February 2018).
Assembling reads from 41 samples, comprising 610 Gbp of sequence data, resulted in 6.5 million scaffolds longer than 1 kbp (Table A2). On average, 27.8% (±11) of the reads could be mapped back to these scaffolds. This is an expected result given huge diversity in soil and the near flat nature of most of the rank abundance curve. The unassembled reads likely derive from the background of rare organisms in soil. Encoded on the assembled scaffolds, 3,536 rpS3 amino acid sequences were identified and clustered into 1,660 F I G U R E 3 Relative abundances of proteobacterial classes (a) and archaeal phyla ( in clades 3 and 4, see Figure A4). Some distinct species (clade 2 in Figure A4) occur only below the water table (Syntrophaceae, Figure A4, clade 2). Thaumarchaeota related to Nitrososphaera sp.
are the dominant archaea at every location other than at the floodplain (Figure 3b). At the floodplain site (PLM4), Pacearchaeota are present in soil samples close to, although above the water table whereas Bathyarchaeota and Euryarchaeota are present in samples below the water table.
Out of the 37 microbial phyla that were identified, 20 are candidate phyla (CP) (i.e., phyla that lack an isolated representative). Of the CP, eight are part of the Candidate Phyla Radiation of Bacteria (CPR) ( Figure 4). Members of CP are present at all sites along the hillslope transect, but their detection is positively correlated with depth of sampling (Pearson's r = 0.851, p-value < 0.0001) ( Figure 4a). Moreover, depth could be used as a predictor for the abundance of CP as a linear regression has an r 2 = 0.66 and slope = 5.07 (p-value < 0.0001).
Interestingly, CPR bacteria are almost exclusively found at the floodplain site and only just above (7 cm above the water table) and within groundwater-saturated sediment ( Figure 4b). Although sampling sites above and below the water Thus, for soils that contain similar types of organisms, sampling depth and proximity to weathered rock shift organism abundance relative levels. Overall, distance from groundwater at the floodplain site and weathered shale at the hillslope sites seem to be dominant factors in determining the microbial community structure across the hillslope.
Forty geochemical factors were assessed in order to elucidate the factors that shape community structure in the soil profile sites.
The combination of soil moisture and concentrations of Na, Se, and Zn were correlated to microbial community structure (r = 0.751) ( Figure 6).  A6). A depth-dependent trend in overall metabolic potential is also observed along the hillslope. In addition, gradient in overall metabolic potential correlates with elevation (i.e., position on the hillslope).
The patterns identified in the NMDS are driven in part by genes encoding enzymes involved in N 2 fixation (nifDHK), denitrification (norBC and nosZ), and the Wood-Ljungdahl carbon fixation pathway (codhC and codhD) ( Figure A7). The dsrA and dsrB genes that encode reversible dissimilatory sulfite reductase are found in groundwatersaturated saprolite samples PLM4 90 cm, in samples taken 10 cm above groundwater (PLM4 65 cm), and also present in samples collected at 5 cm depth. However, dsrD which is present only in samples from below groundwater and in samples taken 10 cm above it F I G U R E 5 Samples cluster based on proximity to weathered shale and groundwater-saturated soil. (a) NMDS based on unweighted UniFrac distance computed using maximum-likelihood phylogenetic tree. (b) NMDS based on weighted UniFrac distances computed using maximum-likelihood phylogenetic tree and abundance of each taxon. Confidence ellipses (95% interval) are shown in Figure A4  F I G U R E 6 NMDS ordination of microbial communities and correlated geochemical factors. Spearman correlation was tested using Bray-Curtis distances and Euclidean distance matrix. Out of 40 geochemical measurements (Table A4)   were not available. These samples were taken from fractured shale which is rich with selenium, and therefore, it is assumed that adding these measurements will result in a stronger positive correlation.

| D ISCUSS I ON
We integrated metagenomics and soil chemical analyses to investigate how microbial community structure and metabolic potential vary within the subsurface across a transect from high on an East River hillslope to its adjoining floodplain. Our analyses indicate that communities are differentiated according to depth and proximity to weathered shale and groundwater, and that microbial communities of the floodplain soils and sediments differ substantially from those collected along the hillslope.
Notably, the abundance of species of Archaea, Proteobacteria The abundance of genes encoding methanol dehydrogenase (mdh1/mxaF/xoxF in Figure A6) and the catalytic subunit of carbon monoxide dehydrogenase (coxL in Figure A7) were consistently lower in the groundwater-saturated floodplain samples than in any  The ability to make predictions at more than one level of resolution requires identification of the processes of interest and the parameters that affect these processes at different scales (Turner, Dale, & Gardner, 1989). For that purpose, the current work focuses on the centimeter to meters scale, serving as a starting point for a "bottom-up" approach for exploring microbial ecology across the watershed.
The microhabitats that were identified in the hillslope and floodplain compartments of the watershed may be considered as "systems within systems" at a local scale. However, the term might also be applicable at a larger scale-one that spans across the en-

CO N FLI C T O F I NTE R E S T S
The authors declare no competing interests in this study. J.F.B supervised the study and mentored the first author.

DATA ACCE SS I B I LIT Y
Raw reads are available through the NCBI Short Reads Archive.
Accession number for each sample is provided in    F I G U R E A 7 Spatial abundance of genes central to metabolic pathways. Samples from the floodplain (blue colored clade) are distinct from samples from across the hillslope (black colored clade), particularly with respect to carbon fixation and selenate reduction. Sample names in red denote DNA samples that were co-extracted with RNA (see Section 2). The sources of HMMs their description and detection cutoffs are given in Table A1 C1 compounds Uncrertainty for anions IC <5%.

R E FE R E N C E S
TA B L E A 4 (Continued)