Transcriptome‐wide genotype–phenotype associations in Daphnia in a predation risk environment

Phenotypic variation plays an important role in how species cope with environmental challenges. Pinpointing which genes and genomic regions are underlying phenotypic variability thus helps to understand the processes of acclimation and adaptation. We used Daphnia as a system to identify candidates playing a role in phenotypic variation related to a predation risk environment with a genome‐wide association approach. Furthermore, a gene co‐expression network analysis allowed identifying clusters of co‐expressed genes which correlated to life history traits. To enhance the understanding of the functional roles of the transcripts, we identified orthologs and paralogs from related species and used ontologies to annotate the candidates of interest. Our study revealed that only one life history trait and two morphometric traits have a genetic association in the presence of predation risk (fish kairomones), whereas most genotype–phenotype associations were detected in a genotype–environment interaction analysis for reproduction‐related phenotypic traits. The gene co‐expression network analysis identified a total of 44 modules, of which one module correlated to another life history trait namely the ‘total number of broods’. The combined use of gene co‐expression network and transcriptome‐wide association analysis allowed the identification of 131 candidate transcripts associated with life history traits in Daphnia galeata. These results lay the ground for targeted studies to further understand phenotypic variability in this species.

genotypes (both gene expression and alterations of the DNA sequence such as SNPs) and (c) the variation of the trait, given that genetic and environmental factors are identical (Ziv, Shuster, Siegal, & Gresham, 2017). The resulting phenotypic variation promotes population performance and thus ecological/evolutionary success in the long run (Forsman & Wennersten, 2016). Understanding how each of these factors contributes to variation in quantitative traits remains a challenge, especially because phenotypic plasticity is common, thus blurring the relationship between genetic variants and phenotype.
Phenotypic plasticity is the ability of an organism to produce multiple phenotypes from a single genotype depending on the environment (Miner, Sultan, Morgan, Padilla, & Relyea, 2005). It is also an important mechanism that helps coping with environmental perturbations (Alberto et al., 2013;Charmantier et al., 2008). Although phenotypic plasticity is advantageous in heterogeneous and/or fast changing habitats, its maintenance is associated with costs (DeWitt, Sih, & Wilson, 1998;Van Buskirk & Steiner, 2009) and sometimes becomes maladaptive (Ghalambor, McKay, Carroll, & Reznick, 2007;Langerhans & Dewitt, 2002). A wide diversity of organisms exhibits phenotypic plasticity in response to biotic and abiotic factors in their environments (DeWitt et al., 1998;Harvell, 1990;Karban & Myers, 1989;Sultan, 2000) leading to changes in behaviour, morphology, physiology and life history traits. These plastic responses can be expressed either within the lifespan of a single individual (Young, Stanton, & Christian, 2003) or across generations (Miner et al., 2005).
Examining the genetic architecture of phenotypic traits not only identifies causal mutations but also helps in understanding past and predicting future evolutionary processes of adaptation (Ronnegard et al., 2016). Genetic variation can be studied at two levels of organization: patterns of gene expression and the DNA/RNA sequence level. Associating gene expression profiles to phenotypic traits can be accomplished by constructing gene co-expression networks, which identify clusters of co-expressed genes. Often co-expressed genes within one module or gene cluster share conserved biological functions revealing their potential genetic pathways (Subramanian et al., 2005). The benefit of gene network analysis lies in the opportunity to correlate the gene expression information to biological information; by gathering insights of the biological association of genes and traits, candidate genes can be identified. For example, a study on lake whitefish used a weighted gene co-expression network analysis (WGCNA) to identify gene clusters correlating to three phenotypic traits such as trophic behaviour, trophic morphology (gill rakers) and reproduction (Filteau, Pavey, St-Cyr, & Bernatchez, 2013).
The genotype-environment interaction (GxE) is a common phenomenon describing how a genetic variant has different phenotypic effects in different environments (Smith & Kruglyak, 2008). For example, human individuals with sickle cell anaemia have a survival advantage in endemic areas of malaria but are at a disadvantage in areas without malaria (Ferreira et al., 2011). Recently, biologists have applied genomic data and traditional pedigree information to explain phenotypic differences in life history traits such as horn shape in Soay sheep (Johnston et al., 2013), clutch size in collared flycatchers (Husby et al., 2015;Ronnegard et al., 2016) and Glanville fritillary butterfly (Duplouy, Wong, Corander, Lehtonen, & Hanski, 2017) where life history trade-offs may be involved in promoting genetic variation at one or several loci in the species.
Daphnia plays a central role in the trophic cascade of freshwater ecosystems (Carpenter et al., 2001;Sommer et al., 2003). This small, filter-feeding crustacean has become a widely used isogenic model organism in subfields as varied as ecology, toxicology and evolutionary biology, not only because of short reproduction times and easy handling, but also due to its ability to reproduce parthenogenetically.
Daphnia is an ecological link between primary producers (algae) and secondary consumers (planktivorous fish and invertebrates) (Gaedke & Straile, 1998). Predation is thus (together with food quality, parasites and toxins, among others) an important selective force shaping phenotypic variability and plasticity in Daphnia (Reger, Lind, Robinson, & Beckerman, 2018). Fish are visual predators and strongly influence growth and reproductive strategies in their prey (Macháček, 1991).
This selection might appear to be strongly directional towards smaller body size (e.g. Gliwicz & Boavida, 1996;Weider & Pijanowska, 1993) and skewing towards an r strategy. However, phenotypic variability remains substantial, likely because of other contrary selective forces through other predator types and frequency-dependent selection.
To date, however, only a few studies in Daphnia research have associated phenotypic traits with the individual's genetic setup by using transcriptomic information and/or genome-wide information. Routtu et al. (2014) identified putative QTLs underlying variation in life history traits and phototactic response for Daphnia magna by analysing an F2 panel. Herrmann, Henning-Lucass, Cordellier, and Schwenk (2017) used a genotype-phenotype association approach to show the effects of temperature on fitness in D. galeata. Further, Bento et al. (2017) investigated the genetic basis of parasite resistance in D. magna by using a quantitative trait locus (QTL) approach, and Bourgeois, Roulin, Müller, and Ebert (2017) analysed the correlation between parasite susceptibility and genotype in D. magna, making use of RADseq data collected for 97 genotypes; this approach allowed the identification of two genomic regions involved in resistance to the bacterium Pasteuria ramosa. Finally, Asselman, Pfrender, Lopez, Shaw, and De Schamphelaere (2018) and Malcom, Juenger, and Leibold (2018) have explored the correlation between gene co-expression modules and phenotypic traits in Daphnia pulex.
In the present study, we associate genotypic and phenotypic data of 24 clonal lines of European D. galeata by applying two approaches: a genome-wide association (GWA) analysis and a weighted gene co-expression network analysis (WGCNA). We applied a GWA analysis to the existing life history trait data set (Tams, Lüneburg, Seddar, Detampel, & Cordellier, 2018) and the corresponding transcriptome-based SNP data set (Herrmann, Ravindran, Schwenk, & Cordellier, 2018;Ravindran, Herrmann, & Cordellier, 2019). Further, we applied a WGCNA to understand the association between gene expression and life history traits. Finally, we estimated heritability of the measured life history traits to infer their susceptibility to selection. Based on these analyses, we addressed the following questions: (1) Which SNPs (genotype) associate to different phenotypic life history traits in the two different environments? and (2) Which gene co-expression modules are correlated to life history traits in control environment? We answered these questions by looking at two levels of association: (a) genotype (SNPs)-phenotype (life history traits under control and fish environment) and (b) molecular phenotype (gene co-expression modules)-organismal phenotype (life history traits in the control environment). By answering these questions, we contribute to the understanding of the genetic basis of phenotypic variation in the absence and presence of predation risk in D. galeata.

| MATERIAL S AND ME THODS
The SNP data set used in the present study has been described in Herrmann et al. (2018) and is available on Dryad (https://doi. org/10.5061/dryad.56jt85j -SNP data Ravindran). The life history trait data set was described in Tams, Lüneburg, et al. (2018). Variance stabilized normalized read counts for the gene co-expression network analysis are available on Dryad (SI_2 -https://doi.org/10.5061/ dryad.p85m5) together with an overview of sequencing and mapping results as described in Ravindran, Herrmann, et al. (2019).
Functional annotation of the D. galeata transcriptome has been described in Huylmans, López Ezquerra, Parsch, and Cordellier (2016) and Ravindran, Herrmann, et al. (2019). A summary of the data sets used for the present analysis is represented in Figure 1. Although data were collected at different time points (years) in different studies, the clonal lines were maintained in standardized and constant conditions in the laboratory throughout the years.
To avoid confusion with terminology, we use the term 'clonal line' for the 24 genotypes and the term 'genotype' for SNPs throughout this manuscript.

| Phenotype data set and design of life history experiment
In Daphnia, inducible defences are mediated by chemicals released by the predators (such as fish), called 'kairomones'. This chemical messenger benefits the receiver rather than being advantageous F I G U R E 1 Overview of data sets used in this study for a total of 24 clonal lines of D. galeata. The RNA-seq data set provides the gene expression data and the genotype (Single Nucleotide Polymorphism = SNP) information in control environment (Herrmann et al., 2018;Ravindran, Lüneburg, et al., 2019). The phenotype data set provides the data of ten phenotypic traits in control and fish environment (Tams, Lüneburg, et al., 2018) Daphnia galeata 24 (Herrmann et al. 2018 Phenotype dataset  ‚GWA' Genotype-phenotype association ‚WGCNA' Gene co-expression analysis & ME-trait correlation for the predator (sender) itself (Brown, Eisner, & Whittaker, 1970).
Phenotypic data originate from the life history experiment investigating the effect of fish kairomones on D. galeata (described in detail in Tams, Lüneburg, et al., 2018) for a total of 684 experimental individuals (aim: 24 clonal lines × 2 environments × 15 replicates = 720 individuals). Prior to the experiment, each clonal line was bred in kairomone-free water (control) and in kairomone water (fish) for two subsequent generations to minimize inter-individual variances.
Breeding and experimental phases were conducted at a temperature of 20°C and a 16 hr light/8 hr dark cycle in a brood chamber with a light intensity of 30% of its maximum light intensity (Rumed, Typ 3201D). Experimental individuals (F2) were female neonates of the 3rd to 5th brood. The experiment lasted for 14 days. A total of ten phenotypic traits were recorded, eight life history traits: age at first reproduction ('AFR') [day of releasing offspring from brood pouch], total numbers of broods per female ('broods') together with F I G U R E 2 Violin plots visualizing the data distribution of the life history trait age at first reproduction ('AFR') for each clonal line. Average values for each clonal line is indicated by a full circle. (a) control environment (control). (b) predation risk environment (fish) brood size ('brood1', 'brood2', 'brood3', 'brood4'), total numbers of neonates per female ('offspring'), 'survival' [in days] (see Table S1 in Tams, Lüneburg, et al., 2018) Based on the phenotypic data from the above-described experiment (hereafter 'full phenotypic table', see Table S1 in Tams, Lüneburg, et al., 2018), we derived two phenotypic data sets. The raw phenotype data values (Tables S1a and S2a) were normalized using quantile transformation ('normalize.quantiles()' function) in R.
These normalized phenotypic trait values were used to calculate the mean (hereafter, 'mean data set', Table S1b) and statistical variance (hereafter, 'variance data set', Table S2b) using the functions 'mean()' and 'var()', respectively, in R, for each environment separately and were used as input for GWA analysis. For the analysis of heritability, we used the raw phenotypic data obtained from up to 15 individuals per clonal line (hereafter 'full phenotypic table', see Table S1 in Tams, Lüneburg, et al., 2018). An edited version of the 'full phenotypic table' was used as input for the WGCNA described in detail below.

| Genotype data set and SNP calling
The SNP calling procedure was performed on the control environment and has been described in (Herrmann et al., 2018) along with sequencing, mapping and assembly parameters and summary statistics. Briefly, reads from an RNA-seq experiment under control conditions were aligned to the reference transcriptome published by (Huylmans et al., 2016), merged with samtools (Li et al., 2009) and realignment around indels was performed using GATK's IndelRealigner tool (DePristo et al., 2011). The initial variant calls were made using GATK's HaplotypeCaller. Using GenotypeGVCF tool in GATK, samples were jointly genotyped and a single vcf file was obtained.
Variants were further filtered using VariantFiltration tool imple- To use the SNP data for GWA analysis, we further filtered variants with a minor allele frequency (MAF) of 0.1 to exclude rare variants. Furthermore, indels were excluded and only biallelic sites were considered for further analysis. A total of 144,938 SNPs distributed in 11,857 transcripts (median number of SNPs per transcript: 7, ranging between 1 and 187) were used as input for GWA analysis (will be made available on Dryad).

| Genotype-phenotype association analysis
The genome-wide association analysis was performed using the 'GenABEL' package version 1.8 (Aulchenko, Ripke, Isaacs, & Van Duijn, 2007) in R. Each SNP (genotype) was associated with the ten life history traits (phenotype) in the absence (hereafter, control environment) and presence (hereafter, fish environment) of fish kairomones. A quality check (QC) of our input genotype and phenotype data was performed using 'check.marker()' function in 'GenABEL'.
This QC step helps to select which SNPs and individuals should enter the GWA analysis based on call rate, minor allele frequency (MAF), value of the chi-squared test for Hardy-Weinberg equilibrium (HWE), heterozygosity and IBS (identity by state across a random sample of markers). This step is repeated until no individuals or markers are removed further. After QC, 20 clonal lines (all except G1.6, J1, J2 and M12) from the phenotype data and 114,978 SNPs passed all the criteria and were used for all further genotype-phenotype association analysis (univariate, multivariate and GxE).
'GenABEL' was used to perform a univariate analysis, where each SNP was tested for association to the normalized mean and variance phenotypic trait values in each environment (control and fish). To test for univariate association between a phenotypic trait and SNPs, the function 'egscore()' was run with 100,000 permutations. This function implements the method of Price et al. (2006, EIGENSTRAT), which makes use of principal components of the genomic kinship matrix to adjust both phenotypes and genotypes for possible stratification. The -log 10 p-values were calculated automatically by 'GenABEL' and visualized using Manhattan plots in R.
Multivariate analysis was done using 'MultiABEL' package version 1.1-6 in R (Shen et al., 2017), where each SNP was tested for association to the combination of all ten life history traits in each environment. For the multivariate analysis, the function 'MultiLoad()' was used to test for association between a SNP and combination of all phenotypic traits and was permuted for 100,000 times.
PLINK v.1.07 (Purcell et al., 2007) was further used to test for differences in genotype-phenotype associations between the two environments. The '--gxe' command was used to reveal the univariate genotype-environment interactions (GxE). For the multivariate GxE analysis, multivariate PLINK was used with the option '--mqfam'.
Settings were applied to correct for population stratification in the data set by permuting for 100,000 iterations within populations for the univariate GxE and 1,000 iterations within populations for multivariate GxE analysis.
Both univariate and multivariate analyses were performed for the control and fish environment and the GxE interaction separately.
For the multivariate and GxE (univariate and multivariate) analysis, all p-values obtained were corrected for multiple testing using the Bonferroni correction method in R (R Development Core Team, 2008).
The -log10 p-values were calculated on adjusted p-value and visualized using Manhattan plots in R. A SNP was said to be associated with a phenotypic trait if it had a -log 10 p-value of ≥ 1.3 (p-value < .05).

| Estimation of trait heritability
Thanks to the high replication of the life history data, we could precisely estimate heritability for life history traits using the function 'marker_h2()' in the R package 'heritability ' (v1.3, Kruijer et al., 2015).
Originally conceived for immortal lines of plants, which resemble clonal lines in Daphnia, this function uses a genetic relatedness matrix and phenotypic observations at individual level. Heritability estimates (H 2 ) are based on REML estimates of the genetic and residual variance and their standard errors. Most importantly and unlike the GWA analysis, this procedure was conducted on the full data set and not on summary statistics for each genotype, that is 563 data points were used (274 and 289 observations for the control and the fish conditions, respectively).
Heritability was calculated separately for each environment and each phenotypic trait. Additionally, heritability was calculated on the 'full phenotypic table', including 'environment' as a covariate, similarly to the procedure outlined in Jury, Delano, and Toonen (2019).
This analysis was conducted on the 20 genotypes passing all filters mentioned above (all but G1.6, J1, J2 and M12).

| Gene co-expression network analysis-Linking gene co-expression and life history traits
Gene expression data for all 24 clonal lines were available only for the control environment. The gene co-expression network analysis was based on variance stabilized read counts obtained from HTSeq data used in the R package 'DESeq2' (Love, Huber, & Anders, 2014) by Ravindran, Herrmann, et al. (2019) to investigate differential gene expression at population level between the four European D. galeata populations in control environment. We applied a weighted gene co-expression network analysis (WGCNA) by using the R package 'WGCNA' v. 1.6.1 to find putative pathways from the highly correlated genes clustered in modules (Langfelder & Horvath, 2008).
A single, signed, weighted gene co-expression network was con-   (Frisch, Becker, & Wojewodzic, 2020). Finally, the correlation of module eigengenes and the resampled life history trait mean values was calculated. This whole procedure of resampling to calculate randomized means and their correlation to module eigengenes was repeated 10,000 times to verify the robustness of ME-trait correlations. ME-trait correlations were considered as robust if occurring more than 95% of all iterations. ME-trait correlations with a correlation value of>±0.5 absolute value were considered as significant.

| Module eigengene-trait correlation
Significant ME-trait correlations resulted in transcript sets of interest which were extracted to investigate their biological functions. In addition, hubgenes-the most interconnected gene per module-were identified and their biological functions were explored.

| Functional analysis
For every SNP/transcript associated with a life history trait in both GWA and WGCNA, we assigned Gene Ontology (GO) terms using annotations from Huylmans et al. (2016). An enrichment analysis with 'topGO' (Alexa & Rahnenfuhrer, 2016) was performed to investigate their biological relevance. We identified orthologs and paralogs for the transcripts associated with a phenotypic trait using orthoMCL cluster information from the reference transcriptome of D. galeata (Huylmans et al., 2016). These orthoMCL clusters are based on orthologous protein sequences from three Daphnia species (D. galeata, D. magna and D. pulex) and two insect species (Drosophila melanogaster and Nasonia vitripennis). To enhance our understanding of the ecological role of transcripts associated with life history traits, we performed a BLAST analysis on the Daphnia stressor database (Ravindran, Lüneburg, Gottschlich, Tams, & Cordellier, 2019). This enabled us to identify stressors for the candidate transcripts of interest from our GWA and WGCNA.

| Univariate analysis
When considering the mean values of replicates for each clonal line, we observed associations between one SNP, 'Dgal_ o2802d15519t2:1614' and one life history trait 'brood2' in the control environment (Table 1). In the fish environment, the morphometric traits 'SGR' and 'length' were associated with the same SNP, namely 'Dgal_sd13197565:1333'. Further, significant GxE interactions across the two environments were found for five life history traits, namely 'brood2', 'brood3', 'brood4', 'broods' and 'survival', associated with four, one, three, one and one SNPs, respectively.
Among them, the SNP 'Dgal_t1576c0t2:164' was found to be associated with four life history traits, namely 'brood3', 'brood4', 'broods' and 'survival' in the univariate GxE analysis. No associations were found for the other phenotypic traits (Table 1 and Table S3).

| Multivariate analysis
A multivariate testing was performed using the mean and variance trait values, where we assessed the effect of one SNP on all phenotypic traits combined, taking the interdependence of the phenotypic traits into account. Multivariate testing has been shown to be more powerful compared to univariate analysis (Galesloot, van Steen, Kiemeney, Janss, & Vermeulen, 2014). We identified one SNP (Dgal_o14292d44685t1:1732) to be significantly associated with all life history traits in the control environment of the 'mean data set' (Table 2). However, no SNPs were significantly associated with all phenotypic traits in the fish environment of 'mean data set'. For both environments, control and fish, no significant genotype-phenotype association was found in the 'variance data set'. The multivariate GxE analysis resulted in one SNP (Dgal_o2092t4:604) to be associated with all life history traits in the 'variance data set'.

| Trait heritability estimates
We report variance (genetic and residual) and trait heritability in Table 3 for the estimation on the 'full phenotypic table' while setting'environment' as a covariate. Further results per environment are provided in Table S4. For the traits 'survival' and 'brood4', heritability was near 0 due to an overall lack of variance in these traits. For the other traits, heritability ranged between 0.22 (±0.09) and 0.71 (±0.14).
The morphometric traits 'length' and 'SGR' had the highest heritability, and life history trait 'offspring' was also on the upper end.

| Gene co-expression network analysis-Linking gene co-expression and life history traits
The single network construction resulted in 44 modules of co-expressed transcripts in control environment ( Figure S1). Most transcripts were assigned to the module 'turquoise', 'blue', 'brown' and 'yellow'.
The 'grey' module is the largest and includes all transcripts which could not have been assigned to any module (22%; n = 7,297). For each module, the hubgene was identified. Phenotypic trait information was correlated to module eigengenes (ME) to assess the biological meaning of modules. Only one significant module-trait correlation was found  Table S5.

| Gene Ontology analysis
Gene Ontology (GO) terms were assigned to transcripts identified in the univariate and multivariate GWA, in addition to hubgenes and transcripts of the 'darkorange' module which were identified in WGCNA. In total, GO terms were assigned to 39 transcripts (15 transcripts in GWA and 24 transcripts in WGCNA, Table S3) and to 15 out of the 44 hubgenes (Table S5). All these assigned GO terms were enriched for metabolic processes (Table S6). There were 18 GO terms assigned to the hubgenes and included functions for enzyme activities, binding and transport activities which are important for general metabolic processes.

| Comparative genomics
To gain insights into the biological relevance of our candidate transcripts identified in the GWA analysis and WGCNA, we made use of orthoMCL cluster information of the D. galeata reference transcriptome (Huylmans et al., 2016). Nine candidate transcripts identified in the 'mean' GWA analysis were assigned to eight orthoMCL clusters, and 20 transcripts identified in the 'variance' GWA analysis were assigned to 19 orthoMCL clusters (Table S3, Figure S2). Fiftyeight out of 85 transcripts identified in WGCNA were assigned to 53 orthoMCL clusters (Table S3). There was no overlap between the orthoMCL clusters identified in GWA analysis and WGCNA.
One-third of the orthologs are conserved (i.e. comprised in clusters also containing insect proteins), whereas the other two-thirds are Daphnia specific. Of the Daphnia specific orthologs, 23% seem to be D. galeata specific (3).
Identifying which stressor causes differential expression of a transcript is another way of looking into the functional aspects of a transcript (Figure 3). To this aim, we compared the sequences of transcripts identified in the GWA and WGCNA to those available in the Daphnia stressor database. For 18 transcripts (7 from GWA and 11 from WGCNA), the database contained a Daphnia transcript whose expression was influenced by stressors in previous studies (Table S3). These stressors are mostly abiotic factors such as salinity (2), temperature (2) and phosphorus (1) (Table S3). However, no GO annotation exists for that specific transcript or any of the other D. galeata transcripts clustered in the orthogroup 'ORTHO_ALL324'.

| Genotype-phenotype association analysis
We found no direct evidence of genotype-phenotype associations for most of the phenotypic traits except for 'brood2', 'brood4' and 'broods' in the control environment and 'length', 'SGR' and 'brood1' in the fish environment. These genotype-phenotype associations for the above-mentioned traits are promising candidates for true associations. To verify these associations, subsequent investigations should test the candidate SNPs for their biological relevance for the corresponding phenotypic trait experimentally, for example via a RNA knock-out or silencing of the respective transcript/gene. Alternatively, a broader sampling strategy could be applied to establish a diverse clone panel, and their genotypes and phenotypes recorded to explicitly test the association identified here. It is to be noted, however, that trait recording might cause a bias in 'brood4' as the experiment was stopped at 14 days, and the value for this trait was set to zero if the individual had not produced a fourth brood.
Our results for this trait are therefore likely an artefact.
The observed significant GxE interaction effect for the trait 'offspring' in the 'variance data set' could be best explained by the significant associations of SNPs to 'brood1', 'brood2', 'brood3', 'brood4' and 'broods' since these life history traits are not independent. The trait 'offspring' describes the total number of offspring per female and thus includes the total number of offspring for the first, second, third and fourth brood ('brood1' to 'brood4'), whereas the trait 'broods' describes the total number of broods. This correlation of life history traits might have biased the statistical genotype-phenotype association and led to a false-positive association in the GxE analysis.
By leveraging the high replication in phenotypic trait recording and considering the variance of traits instead of mean values, we did include a proxy for phenotypic plasticity. Predator-induced plastic changes in Daphnia are a text book example of phenotypic plasticity (Tollrian, 1995), and the data gathered by Tams, Lüneburg, et al. (2018) and used here demonstrate this. Reger et al. (2018) showed how predation drives local adaptation of phenotypic plasticity in D. pulex. Here, we could identify significant associations between the genotype of D. galeata clonal lines and the variance for reproduction-related phenotypic traits (e.g. 'broods', 'offspring'). Although the same precautions as above apply for 'brood 4' for interpreting these results, plasticity for 'brood1' and 'brood2' sizes might have a genetic basis and this warrants further investigation.
The overall number of 131 identified genotype-phenotype associations for ten phenotypic traits in our analysis is concordant to previous genotype-phenotype associations in different organisms.
Two genomic regions with about 30 out of 7,034 SNPs were associated with parasite resistance in D. magna .
One GWAS found 139 genotype-phenotype associations to one F I G U R E 3 Flow diagram representing the proportion of candidate transcripts as identified in GWA and WGCNA and their associated stressors. Each rectangle bar is called a 'node', and each vertical group of nodes is called a 'step'. The coloured areas linking the nodes are called 'flows'. The step 'Transcript' contains the candidate transcripts as identified in GWA and WGCNA. The step 'Analysis' contains three nodes: 'GWAS_multivariate', 'GWAS_univariate' and 'WGCNA' representing the analysis from which the candidate transcripts were obtained. The step 'Interaction' contains three nodes: 'control', 'fish' and 'gxe' representing the control and fish environments and GxE interactions. The step 'Stressors' represents the identified stressor for each transcript based on sequence similarity from Daphnia Stressor database  (Pitchers et al., 2019). Further, GWAS revealed a total of 410 genotype-phenotype associations to 30 phenotypic traits in a wild tree population (McKown et al., 2014), whereas one out of 50,000 SNPs was significantly associated with the trait clutch size in the collared flycatcher (Husby et al., 2015).
Several explanations for the lack of association between the genotype and phenotypic traits are possible: low statistical power, low genetic variation, polygenic basis (many loci of small effect), missing/ untargeted genomic regions and epigenetic effects, among others.
The statistical power of our study is limited because of the relatively small sample size. GWAS is often conducted on human populations with many hundreds of individuals, and in a breeding context (i.e. for plants) with a few hundred accessions (e.g. Atwell et al., 2010;van Heerwaarden, Hufford, & Ross-Ibarra, 2012;Zhao et al., 2011). It is all the more remarkable that we uncovered a few associations, but the lack of association found here should not lead to the conclusion that the phenotypic traits have indeed no genetic basis.
Low genetic variation can be dismissed in our case because substantial genetic variability at the sequence level among the studied Thus, the phenotypic traits might reflect a pattern of additive, polygenic inheritance. The relative high heritability we estimated for the morphometric traits 'length' and 'SGR' seems at odds with the fact that almost no association was found through the GWA. A polygenic basis for these traits is therefore likely and would warrants further investigation.
We studied genotype-phenotype associations at the tran- Last but not least, future investigations should include as well the epigenetic level (Tak & Farnham, 2015), which we did not address in the present study. Only a small portion of heritability is explained by associated genetic factors (Trerotola, Relli, Simeone, & Alberti, 2015). Underlying factors and mechanisms such as different epigenetic modifications add to variation and plasticity without changing DNA sequences (Trerotola et al., 2015).

| Functional analysis
The functional analysis revealed GO enrichment for enzymatic activities and metabolic processes (17 in WGCNA and eight in GWA;  & Tollrian, 2004;Tollrian, 1995).
For example, a smaller body size in the presence of fish kairomones was observed in D. galeata from Greifensee, in contrast to a larger body size in the presence of Chaoborus kairomones, an invertebrate predator (Wolinska, Löffler, & Spaak, 2007).
Digestion-related GO terms for 'cysteine-type peptidase activity' were found for the hubgenes of the 'royalblue' (405 transcripts) and 'lightcyan1' (70 transcripts) modules. Peptidases are major digestive proteases in the gut of Daphnia (von Elert et al., 2004) and are important for the energy allocation for Daphnia growth (Schwarzenberger, Küster, & Von Elert, 2012). Thus, the identified gene co-expression modules with hubgenes annotated to relevant GO terms are likely important for predator-induced responses in D.
galeata (Tams, Nickel, et al., 2018). However, this needs to be verified, for example by differential gene expression and gene co-expression analysis of all 24 clonal lines exposed to fish kairomones.
In the end, our functional analysis allowed two straightforward conclusions. First, an integrative approach is beneficial to identify putative candidate transcripts/genes. Our integrative approach resulted in a candidate transcript list of overall 131 transcripts (14 from 'mean' GWA, 32 from 'variance' GWA and 85 from WGCNA) being involved in phenotypic variation of life history traits in D. galeata (Table S3). Second, genomes still lack functional annotations thus hindering the interpretation of biological relevant transcripts. As is the case in many other nonmodel species, only about one-third of the candidate transcripts had a GO annotation, limiting our conclusions.
Depending on the genes or transcripts of interest, further research can be conducted on these candidates whose expression was shown to respond to environmental stressors in previous studies ( Figure 3). Furthermore, expansion of the Daphnia stressor database (Ravindran, Lüneburg, et al., 2019) as well as newly available data may help researchers to identify biologically relevant transcripts and to infer stress responses in other Daphnia and related species.

| CON CLUS I ON S AND OUTLOOK
In this study, we explored the association of phenotype, genotype and environment in European D. galeata emphasizing the complexity of their interactions. The present study helped us to identify a total of 131 candidate transcripts associated with phenotypic traits.
Hence, we are now one step closer to understand the genetic basis of phenotypic variation related to predation risk in Daphnia. The present study also brought to light some shortcomings. First, an appropriate GWA approach is missing to account for the clonal nature of Daphnia. We would have gained more information and statistical power by using the complete phenotypic data set of individuals (n = ~700) rather than phenotypic means per clonal line (n = 24).
Second, although we found genotype-phenotype associations at the transcriptome level, we cannot exclude the role of noncoding regions in shaping phenotypic variation. Once genomic information is available, investigating the role of noncoding regions can help to understand the interplay of genotype, phenotype and environment.
Further, whole genome sequence would allow localizing the candidates identified in this study and finding out whether neighbouring regions contain loci of interest that might explain the association.
Third, more complete functional annotation information for Daphnia would help to identify biologically meaningful transcripts. Finally, to better understand the influence of predation risk here simulated by the presence of fish kairomones on Daphnia life history traits, gene expression profiles under predation risk are needed for all 24 clonal lines in the fish environment (the present study only uses gene expression data obtained in the control environment independently). These gene expression profiles would allow the application of a differential gene co-expression network analysis between the two gene co-expression networks (control versus fish kairomones), further revealing biologically significant pathways and hence candidate transcripts. Overall, the identification of biologically significant transcripts being involved in predator-induced responses in Daphnia provides a valuable source for further investigations of the environment-dependent genotype-phenotype relationships in Daphnia.

ACK N OWLED G M ENTS
Open access funding enabled and organized by Projekt DEAL.

PE E R R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/jeb.13699.