Positive selection of digestive proteases in Daphnia: A mechanism for local adaptation to cyanobacterial protease inhibitors

Due to the combined effects of global warming and eutrophication, the frequency of deleterious cyanobacterial blooms in freshwater ecosystems has increased. In line with this, local adaptation of the aquatic keystone herbivore Daphnia to cyanobacteria has received major attention. Besides microcystins, the most frequent cyanobacterial secondary metabolites in such blooms are protease inhibitors (PIs). Recently, it has been shown that a protease gene showed copy number variation between four D. magna populations that differed in tolerance to PIs. From that study, we chose two distinct populations of D. magna which had or had not coexisted with cyanobacteria in the past. By calculating FST values, we found that the two populations were genetically more distant in the protease loci than in neutral loci. Population genetic tests applied to the tolerant population revealed that positive selection was most probably acting on the gene loci of the digestive protease CT448 and CT802. We conclude that the selection of digestive proteases and subsequent reduction in copy number is the molecular basis of evolutionary changes leading to local adaptation to PIs.


| INTRODUC TI ON
Adaptations -locally and over time -have been observed in both terrestrial and aquatic ecosystems. In contrast to studies on adaptations over time, for which recent populations are ideally compared with ancestral populations, the study of local adaptations offers the possibility of investigating adaptation by comparison between recent local populations. These populations should have evolved under different conditions. The process leading to local adaptation is the ongoing (or very recent) natural selection of traits responsible for higher tolerance of a population in its local environmental (Kawecki & Ebert, 2004). As local adaptation is a very recent event, it is often possible to identify the selective forces at work (Kawecki & Ebert, 2004).
However, the search for the target genes that had been selected remains difficult. Many studies have aimed at elucidating the molecular basis of local adaptation, e.g., by identifying differences in gene expression patterns or SNPs (Brown et al., 2013) between adapted and nonadapted populations. Although the functional relevance of these patterns often remains unclear, a few studies were | 913 SCHWARZENBERGER Et Al. able to draw a clear link between distinct candidate genes and the adaptive traits of tolerant populations. For example, an indication for the genetic basis of local adaptation has been found by Xia, Camus-Kulandaivelu, Stephan, Tellier, and Zhang (2010) who have reported differences in nucleotide diversity patterns between drought-related candidate genes and reference genes of three populations of a tomato species. Storz et al. (2007) demonstrated that haplotypes of α-globulin with high oxygen-binding affinity predominated in high-altitude samples of deer mice which frequently experience high-altitude hypoxia, whereas α-globulins with low oxygen-binding affinity predominated in low-altitude mice.
In the case of the aquatic herbivore Daphnia, it has been shown that populations can locally adapt to a variety of biotic (e.g., cyanobacteria and predators: Cousyn et al., 2001;Sarnelle & Wilson, 2005) and abiotic stressors (e.g., temperature : Yampolsky, Schaer, & Ebert, 2013). However, only a few target genes in combination with their functional role in local adaptation have been identified.
One study connecting target genes with the origin of Daphnia clones was from Schwarzenberger et al. (2014). Here, four D. magna genotypes from ponds with or without microcystin-producing cyanobacteria differed in the expression of transporter genes which was associated with differences in tolerance to microcystin. However, to our knowledge, the only study that found a clear connection between candidate gene expression and local adaptation of Daphnia populations was a study by Scoville and Pfrender (2010) in which invariantly increased upregulation of Ddc and ebony went along with reduced melanization and thus adaptation to a newly introduced fish predator.
One very important stressor of Daphnia is cyanobacterial protease inhibitors. The production of protease inhibitors as an antiherbivore defence of cyanobacteria against Daphnia has received increasing attention. This is due to the fact that instances of elevated cyanobacterial biomass have increased in lakes over the last few decades. This increased frequency of cyanobacterial blooms is attributed to the combined effects of nutrient input and global warming (Smith & Schindler, 2009). Although cyanobacterial blooms have been shown to decrease the abundance of Daphnia (Ghadouani, Pinel-Alloul, & Prepas, 2003), the increasing frequency, duration and intensity of such blooms should select for more tolerant zooplankton genotypes (Ger, Hansson, & Lürling, 2014). In fact, Blom, Baumann, Codd, and Jüttner (2006) showed local adaptation of Daphnia to Oscillapeptin J, a strong protease inhibitor (Blom et al., 2003), in an in vitro study. Similarly, Schwarzenberger, Keith, Jackson, and Von Elert (2017) have demonstrated that four distinct D. magna populations differed in tolerance to dietary chymotrypsin inhibitors, and that this difference in tolerance corresponded with the cyanobacterial history of the populations' lakes of origin which hints at local adaptation.
In Daphnia, digestive serine proteases have been identified as targets of cyanobacterial protease inhibitors (Agrawal et al., 2005).
These protease inhibitors affect Daphnia by decreasing their somatic growth rate and influencing expression and activity of serine proteases in Daphnia's gut (Schwarzenberger, Zitt, Kroth, Mueller, & Von Elert, 2010; Appendix S1). The most important digestive serine proteases in the gut of Daphnia magna are chymotrypsins and trypsins (Von Elert et al., 2004). Schwerin et al. (2009) have found a surprisingly high number of serine protease genes in the genome of D. pulex (Colbourne et al., 2011), which can also be observed in the genome of D. magna (www.wfleb ase.org). Interestingly, only three trypsin and three chymotrypsin genes (i.e., CT383, CT448 and CT802) were assigned to the proteases active in the gut of D. magna (Schwarzenberger et al., 2010). One of those chymotrypsin genes, i.e., CT448, was demonstrated to show copy number variation with fewer copies in a more tolerant population (Schwarzenberger et al., 2017). This indicates that fewer but more stable isoforms of CT448 were maintained. Therefore, the three chymotrypsin genes and especially CT448 are likely to be the targets of selection, and might therefore constitute the genetic basis underlying local adaptation of Daphnia populations to chymotrypsin inhibitors.
Here, we chose two populations from the study of Schwarzenberger et al. (2017): One population was sensitive and the other tolerant to a cyanobacterial strain that originated from the tolerant population's lake of origin. We hypothesized that (a) the three chymotrypsin genes and especially CT448 showed signs of positive selection, and that (b) the tolerance of Daphnia magna to chymotrypsin inhibitors could be explained by genetic variation in chymotrypsin gene loci. For this we sequenced alleles of three chymotrypsins and, for comparison, six putative neutral loci, i.e., single-copy genes, of several D. magna genotypes of both populations. In a population genetic approach, we calculated the genetic distances between the two populations and investigated whether the chymotrypsin genes of the tolerant population had been selected.

| Animals and cultures
Daphnia magna clones stemmed either from Lake Bysjön (22 clones, Southern Scania, Sweden) or from a pond near Warsaw (12 clones, Kampinoski National Park, Poland). Lake Bysjön is known to fre- Lake Bysjön, Sweden (Schwarzenberger, D'Hondt, et al., 2013), was cultivated in a chemostat (dilution rate 0.23/day) on cyanophyceanmedium at 20°C and constant light (50 µE m −2 s −1 ). This cyanobacterium contains neither trypsin inhibitors nor microcystin but shows a high chymotrypsin inhibition (Schwarzenberger, Sadler, & Von Elert, 2013). Carbon concentrations of the autotrophic food suspensions were estimated from photometric light extinction (470 nm) and from carbon extinction equations previously determined and described in detail by Schwarzenberger et al. (2017).

| Cloning and sequencing of chymotrypsins and reference genes
Genomic DNA from each genotype (clonal lineages were differentiated with microsatellites; data not shown) was extracted with the peqGold DNA Tissue Kit (peqlab). Primers for the single-copy genes (ATP synthase gamma chain, guanyl-nucleotide exchange factor, vitamin k-dependent γ-carboxylase, pyridoxal kinase, YEATS domain containing protein 4 and smad anchor for receptor activation, Appendix S1) were designed based on information from the D. magna genome data base (www.wflea base.org) with the program netprimer (http://www. premi erbio soft.com/netpr imer/). Primers for the chymotrypsin genes were taken from Schwarzenberger et al. (2010). Each primer pair is specific to one gene locus and was not found to bind at other positions, i.e., other paralogs, within the genome. After PCR amplification, the ORFs of three chymotrypsin genes (Schwarzenberger et al., 2010) and six single-copy genes were cloned (TOPO TA cloning Kit, LifeTechnologies, TZ101 alpha chemically competent cells, Genaxxon). A total of 12 haplotypes from each population were sequenced except for CT383 (five haplotypes per each populations), CT802 (five haplotypes of the Polish population), and single-copy gene guanyl-nucleotide exchange factor (24 haplotypes of the Swedish population). Each clone used in the analyses was either homozygous (one haplotype) or heterozygous (two or three haplotypes). The sequences were aligned with the program BioEdit (Hall, 1999). Intron and exon sequences of the single-copy genes were determined with EST-data from the D. magna genome database (www.wflea base.org).
The K a /K s ratio at each site of the proteases and the exon sequences of the single-copy genes was calculated with Selecton (Stern et al., 2007). The K a /K s ratio is calculated as the ratio of the

| Genetic distance between populations
As a measure of genetic distance between the two populations, F ST values were calculated for microsatellites (Schwarzenberger, D'Hondt, et al., 2013), the exon sequences of the single-copy genes and the three chymotrypsin genes. The single-copy genes showed

| Selection of proteases
Positive selection on particular coding sites may lead to a reduction in the level of (linked) polymorphism in the nearby region (Smith & Haigh, 1974). Non-coding sequences on the other hand should show a neutral level of polymorphism, since selective forces usually do not directly work on them. Therefore, in order to test whether the Swedish D. magna population follows the neutral expectation of mutation-drift equilibrium, we conducted population genetic tests using the intron sequences of the single-copy and protease genes.
Nucleotide diversity of all intron sequences was not different from zero, indicating a neutral level of polymorphism of the Swedish population in these regions. To exclude population expansion after a F I G U R E 1 (a) UPMGA microsatellite tree with bootstrap test of phylogeny of the two populations. A geographically distinct population from Belgium served as outgroup. (b, c) Maximum likelihood phylogenetic trees with bootstrap test of phylogeny based on haplotype sequences of the Swedish and the Polish D. magna populations for the protease CT448 (b) and the single-copy gene ATP synthase gamma chain (c). The haplotypes from the Polish population all start with a "P" followed by a number, whereas the Swedish haplotypes are assigned with "S" followed by a number. Brackets indicate the clustering of the clones of a population within a tree bottleneck, which often results in an excess of singletons, Tajima's D was calculated for the intron sequences of all genes. Tajima's D was not different from zero for eight of nine intron sequences (Appendix S4), indicating that the site frequency spectrum did not deviate from that predicted by neutral theory. The same was true for two other neutrality tests (Fu & Li's D*, Fu's F s ). Only the intron sequence set of one single-copy gene (smad anchor for receptor activation) showed significantly negative values of Tajima's D, Fu & Li's D* and Fu's F s (Appendix S4). Consequently, we cannot rule out the possibility that this gene experienced any non-neutral evolution, which led us to exclude it for the subsequent analyses. However, haplotype data for eight intron sequences, including all protease introns, remained for further analyses. Also, the mismatch distribution, i.e., the distribution of the number of differences between sequence pairs, was not different from zero for seven of eight intron sequence sets, except for vitamin k-dependent γ-carboxylase. Six intron sequence data sets, including the proteases, suggested no evidence for changes in population size (i.e., no ongoing population expansion) and no signs of selection affecting the evolution of these non-coding sequences.
In contrast to the six neutral intron sequences, Tajima's D, Fu & Li's D and Fu's F s were significantly negative for CT448 (Table 1) but not for CT802 and CT383 (except for a significant Fu's F s ), which hints at positive selection due to a selective sweep or a recent population expansion in CT448.
In order to investigate if selection was working on the two other chymotrypsin genes, the K a /K s ratio at each site of all genes, i.e., protease and single-copy loci, were calculated with Selecton. The exon sequences of five single-copy genes showed no selected sites (data not shown). Only one single-copy locus, i.e., ATP synthase gamma chain, showed significantly positively selected sites with a signifi-

| D ISCUSS I ON
Digestive proteases in the gut of Daphnia are inhibited by cyanobacterial protease inhibitors. By feeding protease inhibitor encapsulated in liposomes, Von Elert, Zitt, and Schwarzenberger (2012) demonstrated that growth rate reduction was an effect of the inhibitors only and not due to other cyanobacterial products. Nevertheless, it has been shown that Daphnia are able to respond to such inhibitors by an increase in protease gene expression (Schwarzenberger et al., 2010). This increased protease gene expression was demonstrated to be maternally transferred to the offspring of pre-exposed mothers, which causes higher somatic growth rates of their offspring in comparison to offspring from naïve mothers .
Several findings strongly suggest that differences between populations in tolerance to cyanobacterial protease inhibitors result from differences in the molecular structure of digestive proteases of As a first approach to identifying potential differences in the molecular structure of the three chymotrypsins between the tolerant (Swedish) and sensitive (Polish) population (Schwarzenberger et al., 2017), we compared the genetic distance of three protease genes with those of single copy genes, assuming that the latter would represent neutral loci. The F ST values for the neutral loci were well in the range known from Daphnia literature (Hebert & Finston, 1996;Mort & Wolf, 1986).
However, the genetic distance between populations was much higher for the proteases than for the neutral loci. Since F ST values are regarded as a useful first step for the identification of candidate genes that might have been under selection (Beaumont, 2005), we concluded that adaptive selection has likely been driving the evolution of these protease genes.
In Daphnia studies, the relevance of dispersal and strong founder effects have led to the need to compare numerous populations to differentiate founder effects from selection (Bohonak & Jenkins, 2003). Bohonak and Jenkins (2003) discussed molecular genetic population differentiation. In contrast to this, functional molecular approaches offer the opportunity to test for positive selection even within single genomes (Hansen, Olivieri, Waller, & Nielsen, 2012), and do not necessarily require more than a single population. Therefore, we focused on the chymotrypsin genes of the tolerant population to investigate if the digestive chymotrypsins might have been under positive selection. Non-coding sequence regions should not be targets of any direct selection factor. Therefore, in order to calculate selection of chymotrypsin genes, we compared coding regions of the three chymotrypsin genes with their non-coding regions, and with intron sequences of single-copy genes that proved to be neutral markers.
One chymotrypsin gene, i.e., CT383, was not found to be selected. Several causes are imaginable: (a) The difficulties in sequencing due to long, repetitive elements resulted in a low number of sequenced haplotypes; this might have led to nonsignificant population genetic tests for this locus. (b) CT383 has a minor role in the tolerance to dietary chymotrypsin inhibitors and was therefore not a target of selection. (c) Not the protease itself, but rather its promotor region has been under selection. The latter is a possibility because the relative increase in gene expression of CT383 was much higher than that of CT448 and CT802 in a D. magna clone feeding on dietary chymotrypsin inhibitors (Schwarzenberger et al., 2010).
In contrast, we found evidence for adaptive evolution in the two other chymotrypsin genes of the Swedish population. In CT802 we detected positively selected sites which was based on the analysis of K a /K s ratios. However, since this test was developed for species comparisons and not for intraspecific comparison, we cannot completely rule out that the detection of selected sites in CT802 might simply reflect polymorphism rather than true divergence. Nevertheless, one site that was positively selected is part of the active centre of the encoded enzyme. This finding suggests that this site might have undergone selection and putatively codes for a more stable CT802 isoform.
In The putative selective sweep of CT448 is also supported by the finding that a 100% amino acid substitution between the Swedish (Gln) and the Polish D. magna population (Glu) has occurred at position 34, which is close to the active centre. This amino acid substitution might have resulted in a more stable isoform of CT448 in the Swedish population.
Both findings, i.e., the putative positively selected sites in CT802 and the probable selective sweep in CT448 (eventually leading to the 100% amino acid substitution), might influence the tertiary structure of the chymotrypsins. Such a change in tertiary structure might have caused the higher resistance of the tolerant population to inhibition by protease inhibitors in comparison to the Polish population (Schwarzenberger et al., 2017). In future studies, it will be interesting to investigate whether different isoforms differ in stability to protease inhibition, e.g., via recombinant expression and subsequent kinetic analysis of different isoforms of CT448 and CT802.
Here, we have demonstrated in a population genetic approach that protease genes from a tolerant Swedish D. magna population have most probably been targets of selection, and we assume that these selected proteases cause the higher tolerance of the population's digestive proteases to chymotrypsin inhibition (Schwarzenberger et al., 2017). Therefore, we conclude that the selection of digestive proteases by dietary protease inhibitors, in addition to reduction of gene copy numbers, is an important mechanism underlying local adaptation of tolerant Daphnia populations to protease inhibitors.

ACK N OWLED G EM ENTS
The authors would like to thank Sebastian Kallabis, Lea von Ganski, wrote the manuscript. All authors interpreted the results and read and approved the manuscript.

DATA AVA I L A B I L I T Y S TAT E M E N T
Gene sequence data were uploaded to GenBank and can be found under the accession numbers MN556344-MN556548. The data that support the findings of this study are available from the corresponding author upon reasonable request.