Mitigating type 1 error inflation and power loss in GxE PRS: Genotype–environment interaction in polygenic risk score models

The use of polygenic risk score (PRS) models has transformed the field of genetics by enabling the prediction of complex traits and diseases based on an individual's genetic profile. However, the impact of genotype–environment interaction (GxE) on the performance and applicability of PRS models remains a crucial aspect to be explored. Currently, existing genotype–environment interaction polygenic risk score (GxE PRS) models are often inappropriately used, which can result in inflated type 1 error rates and compromised results. In this study, we propose novel GxE PRS models that jointly incorporate additive and interaction genetic effects although also including an additional quadratic term for nongenetic covariates, enhancing their robustness against model misspecification. Through extensive simulations, we demonstrate that our proposed models outperform existing models in terms of controlling type 1 error rates and enhancing statistical power. Furthermore, we apply the proposed models to real data, and report significant GxE effects. Specifically, we highlight the impact of our models on both quantitative and binary traits. For quantitative traits, we uncover the GxE modulation of genetic effects on body mass index by alcohol intake frequency. In the case of binary traits, we identify the GxE modulation of genetic effects on hypertension by waist‐to‐hip ratio. These findings underscore the importance of employing a robust model that effectively controls type 1 error rates, thus preventing the occurrence of spurious GxE signals. To facilitate the implementation of our approach, we have developed an innovative R software package called GxEprs, specifically designed to detect and estimate GxE effects. Overall, our study highlights the importance of accurate GxE modeling and its implications for genetic risk prediction, although providing a practical tool to support further research in this area.


K E Y W O R D S
genotype-environment interaction, GxEprs, GxE PRS model, polygenic risk score

| INTRODUCTION
Human complex traits and diseases arise from a complex interplay between genetic and environmental factors.For example, conditions such as body mass index (BMI), cholesterol levels, diabetes, coronary artery disease (CAD), and hypertensive disorders are all influenced by both genetic and environmental factors, which can interact to modulate genetic effects on these outcomes.This concept is supported by evidence from twin and family-based studies (Brönner et al., 2006;Carmelli et al., 1994;Hottenga et al., 2005;Mangino & Spector, 2013).
In the genomic era, genome-wide association studies (GWAS) have made significant progress in identifying genetic variants associated with complex traits and diseases (Uffelmann et al., 2021).Leveraging the summary statistics from GWAS, polygenic risk scores (PRSs) can be generated as a single measure that reflects an individual's risk for a specific disease.PRSs are widely utilized in genomic prediction, providing valuable insights into disease susceptibility (Lambert et al., 2019).Additionally, genome-wide environment interaction studies (GWEIS) have been employed to identify genetic variants whose effects are influenced by environmental factors (Manning et al., 2012;Siegert et al., 2013;Wu et al., 2012).Incorporating gene-environment interactions (GxE) into genomic prediction, such as GxE PRS models, holds promise for improving the accuracy of predicting complex traits and diseases, particularly when modifiable environmental variables are available.
Existing methods for GxE PRS models face several challenges that need to be addressed.The current literature presents various GxE PRS models (Guloksuz et al., 2019;Li et al., 2015;Peyrot et al., 2018), where the PRS term considered in the interaction component is, primarily constructed based on additive genetic effects only.However, this approach can substantially reduce the power to detect GxE in the target data set, an issue that has received limited investigation (Arnau-Soler et al., 2019;Zhai et al., 2022).Another challenge in GxE PRS models is the potential for model misspecification (Kolenikov & Angeles, 2004;Montgomery et al., 2012;Sun et al., 2018).When higher-order polynomial effects of the environment or unknown nongenetic effects modulated by the environment are present, spurious GxE signals may arise.Unfortunately, this problem has not been extensively studied, highlighting the need for further investigation and understanding.
To tackle these challenges, we conducted extensive simulations to assess the potential improper application of existing GxE PRS models in terms of the type 1 error rate and statistical power.We also evaluated the impact of model misspecification by considering nongenetic residual effects influenced by the environment.In response to the concerns observed, we propose novel GxE PRS models for both quantitative and binary traits, for which we have developed a user-friendly R package called "GxEprs".To demonstrate the effectiveness of the proposed model, we applied it to analyze a diverse set of quantitative and binary traits using data from the UK Biobank.The results of our analyses showcase that the proposed GxE PRS model is robust against spurious GxE signals and addresses key limitations observed in existing models.This advancement contributes to improving the accuracy and reliability of GxE PRS modeling, providing researchers with a valuable tool to explore gene-environment interactions in the context of disease risk assessment.

| METHODS
This section describes the existing and proposed statistical models for GxE PRS.Subsequently, it illustrates details on genotype data used for this study, along with the quality control (QC) procedures undertaken.Finally, it provides details on variables including phenotypes and covariates used in the real data analysis.

| Statistical models
A conventional GWAS model in the discovery data set can be written as where y is the n 1 × 1 vector of phenotypes that have been adjusted for all available fixed effects in the discovery sample, W i 1 is the n 1 × 1 vector of genotypes at the ith single nucleotide polymorphism (SNP), β ˆtrd i is the estimated additive effect of the ith SNP, and ε denotes the n 1 × 1 vector of residual terms, and n 1 is the number of samples in the discovery data set.
When considering GxE interactions, a GEWIS model can be written as where y and W i 1 are as defined above, E is the n 1 × 1 vector of the covariate that can modulate the genetic effects, β ˆadd i , is the additive effect of the ith SNP, β ˆcov is the estimated effect of the covariate, β ˆgxe i is the nonadditive effect of GxE at the ith SNP, and ε denotes the n 1 × 1 vector of residual terms.It is noted that the term, ⊙, represents the Hadamard product between the two vectors.Overall, this equation provides a means of examining the complex interplay between genetic and environmental factors in relation to the outcome variable.
In case there is a nonlinear relationship between y and E because of various reasons such as underlying biological mechanisms or collider bias due to adjustment, we can use an alternative GWEIS model that includes quadratic effects of E. The model can be written as where y, W i 1 , E, β ˆadd , and β ˆgxe are as defined above, β ˆcov1 is the estimated effect of the covariate, β ˆcov2 is the estimated effect of the squared covariate, and ε denotes the n 1 × 1 vector of residual terms.
A conventional PRS which has been widely used to predict disease risk, is calculated as the weighted sum of risk alleles identified in a GWAS (e.g., PLINK functions-linear interaction for estimating SNP effects for quantitative phenotypes--glm for estimating SNP effects for binary phenotypes, and--score for estimating PRSs).On the basis of GWAS summary statistics obtained from the discovery samples (Equation 1), a set of PRSs can be constructed using the following model: where X ˆtrd is the n 2 × 1 vector of estimated PRSs, W i 2 is the n 2 × 1 vector of genotypes at the ith SNP, m is the number of SNPs and n 2 is the number of individuals in the target data set.
Similarly using GWEIS summary statistics from Equation (2) or (3), two sets of PRSs in a target data set can be estimated as follows: where X ˆadd and X ˆgxe are the n 1 × 1 vectors of estimated PRSs for additive and GxE interaction effects.
Using these three sets of PRSs estimated from Equations ( 4) to (6) based on a whole-genome approach, we tested five GxE models (referred to as Models 1-5) in the target data set.
The first model, commonly used in the conventional GxE analysis, is written as where y is the n 2 × 1 vector of phenotypes adjusted for fixed effects in the target sample, E is the n 2 × 1 vector of the covariate, αˆ1 , αˆ2 , and αˆ3 are the estimated regression coefficients of the PRS, covariate and interaction terms, respectively, and ε denotes the n 2 × 1 vector of residual terms of each target model.
The second model, a slightly modified version of the conventional GxE model, is represented as follows: where y, E, and ε are defined as above, αˆ1 , αˆ2 , and αˆ3 are the estimated regression coefficients of the PRS, covariate and interaction terms, respectively, modeled with X ˆadd instead of X ˆtrd .
The third model is similar to the second model, but uses X ˆgxe to model the interaction term, as follows: where variables are defined as above, and αˆ3 is the estimated regression coefficient of the interaction terms based on X ˆgxe not X ˆadd .It is worth noting that this model has been previously used (Arnau-Soler et al., 2019;Zhai et al., 2022).
We propose two new extensions to the existing GxE PRS models, which are denoted as Models 4 and 5. Model 4 can be expressed as where variables are defined as in Model 3, and αˆ4 is the estimated regression coefficient for the PRS of GxE interaction (X ˆgxe ).This model is more intuitive than Model 3 because the main effects of X ˆgxe can be partitioned as spurious GxE interaction if not modeled.Moreover, it is worth emphasizing that, within the context of statistical modeling, Model 3 can be deemed theoretically incomplete.In the context of statistical models involving an interaction component, it is considered standard practice to consistently include the main effects of each term engaged in the interaction (Garcia et al., 2022;Moore, 2009;Neter et al., 1996).
To address the potential issue of spurious GxE signals that may arise from nonlinear relationships between the outcome variable y and covariate E, we propose the inclusion of a quadratic polynomial component for E. This can be particularly relevant in cases of binary outcomes and sources of bias, such as model misspecification.The proposed model, denoted as Model 5, is expressed as where variables are defined as in Model 4, and αˆ5 is the estimated regression coefficient for quadratic effects of E.

| Permutation
To address the potential issue of spurious GxE signals that may arise from unknown relationships between the quantitative outcome variable (y) and covariate (E), we implemented a permutation procedure on the term X ˆgxe in the interaction component in Model 4 and denoted it as Model 4*.The purpose of this permutation was to maintain the correlation structure between the outcome variable and other model components although specifically permuting the interaction component.This allowed us to assess the null hypothesis of no gene-environment interaction effect, by evaluating whether the observed significance of the interaction component was due to the true gene-environment interaction or due to chance.Similarly, in Model 5 for a binary outcome, we performed the same permutation on the X ˆgxe term in the interaction component and denoted it as Model 5*.

| Simulated phenotype and covariate data
We used R (Core Team, 2022) and MTG2 (Lee & Van der Werf, 2016) software packages to simulate phenotypes of a complex trait, on a data set consisting of 10,000 individuals and 10,000 SNPs, which were selected randomly from the 288,792 individuals and 1,118,829 SNPs.We generated three sets of SNP effects that were weighted to the real genotypes of the data, to simulate the genetic effects of the main phenotype (G 0 ), the genetic effects of the environmental variable (G E ) and the genetic effects interacted with the environmental variable (G 1 ), using R library MASS (Venables & Ripley, 2002).It is noted that we initially simulated the data considering the entire sample, and subsequently divided the simulated data into discovery and target data sets.
To simulate phenotypes of a complex trait and evaluate the performance of the proposed GxE PRS models, we employed four different simulation models.The first is the null simulation model, where the phenotype is purely determined by the genetic effects of the main phenotype and random error, expressed as The second is the GxE simulation model, which includes the interaction effect between genetic effects and the environmental variable E, expressed as The third is the residual-environment interaction (RxE) simulation model, which considers the interaction effect between unknown random effects R and the environmental variable E with the absence of GxE, expressed as The fourth is the RxE simulation model, which considers the interaction effect between unknown random effects R and the environmental variable E with the presence of GxE, expressed as 0 1 (10) The fifth is the covariate (E) simulation model, which simulates the environmental variable E as a combination of genetic effects and random error, expressed as where y is the vector of simulated phenotypes, G 0 is the vector of main genetic effects of the phenotype, G 1 is the vector of genetic effects of the phenotype, R is the vector of random variables generated from either N (0, 0.25) or N (0, 0.5), E is the vector of environmental variable, G E is the vector of genetic effects of E, ε is the vector of residuals of the main phenotypes, and ε E is the vector of residuals of E.
Here, the genetic variances of the main phenotypes (Var (G 0 )) and the environmental variance (Var(G E )) were fixed to 0.4 and 0.5, respectively.The variance of interaction component (Var(G 1 )) was set to 0, 0.01, and 0.05 and correlation between G 0 and G E was varied among 0, 0.2, 0.4, 0.6, and 0.8.The residual structure was defined such that, residual variance of phenotype (ε) and residual variance of environmental variable (ε E ) were both fixed to 0.5 and the residual correlation between the phenotype y and the environmental variable E was varied among 0, 0.2, 0.4, 0.6, and 0.8.Finally, separate data files for the main phenotypes and environmental variable (covariate) were stored for the downstream analysis.
To generate binary outcome variables, we followed the same process as described above for quantitative traits except that case-control status was determined based on the liability threshold model, given a predefined population prevalence rate.Specifically, we first simulated quantitative continuous phenotypes using the method outlined earlier.Next, using a population prevalence of 1%, 10%, or 50%, we determined the threshold for liability, using standard normal distribution theory, to classify individuals as cases or controls based on their quantitative continuous phenotypes.Covariates were generated using the same method as for the simulation of quantitative continuous phenotypes, using Equation ( 11).
In addition, we considered the possibility of RxE interaction effects generated by unmeasured or unobserved factors, as shown in Equation ( 9).These interactions can be examined by GxE PRS Models 1-5, but they can also result in spurious GxE due to model misspecification problem.Model misspecification occurs when the statistical model used to describe the data does not accurately represent the underlying data-generating process.This can happen due to a variety of reasons, such as incorrect functional form, omitted variables, or unmeasured or unknown factors that contribute to the variation in the response variable.It is a common issue in statistical modeling that can lead to biased estimates and incorrect inferences.Therefore, we explicitly assessed GxE PRS Models 1-5 for this potential model misspecification issue.

| Real data
In real data analyses, our primary objective was to demonstrate how these models can be applied in realworld scenarios.We aimed to showcase how these GxE PRS models can be effectively used to assess the interaction between the aggregated PRS, which encompasses multiple individual genetic variants, and environmental factors in shaping the phenotype.

| Sample sizes
After performing QC (see Section 2.2 for details), 288,792 individuals remained for the analyses.We randomly split the total number of individuals into 2 data sets in the ratio of 8:2, namely, discovery and target data sets.The discovery data set was used to perform GWAS/GWEIS although the target data set was used to predict risk scores.

| Real phenotypic data
Real phenotypic data were extracted from the UK Biobank.The phenotypes include anthropometrical traits such as BMI, waist-to-hip ratio (WHR), and clinical traits such as low-density lipoprotein (LDL) cholesterol, which are quantitative traits.In addition, we considered binary traits, such as diabetes (DIAB), hypertension (HYP), and CAD.

| Covariates
To account for nongenetic environmental effects, we included fixed effects such as sex, age at recruitment, Townsend deprivation index, education in years (Okbay et al., 2016), and the first 10 genetic principal components (Bycroft et al., 2018) in our analysis.These variables were incorporated to adjust for main phenotypes and control for confounding factors.In the discovery data set, the phenotype (outcome) was adjusted during the GWAS/GWEIS stage.In the target data set, these confounders were included in the respective target models to capture their effects on the outcome.
In the GxE analysis, we considered several environmental variables.For quantitative outcomes (BMI, WHR, or LDL), we examined three covariates: frequency of alcohol consumption (ALC), healthy diet (HD), and physical activity (PA) as the E variable in GxE.For binary outcomes (CAD, DIAB, or HYP), we examined four covariates: BMI, high-density lipoprotein (HDL), hemoglobin (HGB), and WHR as the E variable in GxE.Each outcome was analyzed separately with its corresponding covariates.It is important to note that these covariates The type 1 error rate and statistical power of various GxE PRS models when using quantitative traits.Note: We used simulation models (Equations 7, 8, and 11) to generate phenotypes and covariates.Different genetic and residual correlations (Cor(G 0 , G E ) = 0, 0.4, and 0.8), and Cor(ε, ε E ) = 0, 0.4, and 0.8) were considered in various scenarios.We applied Models 1-4 to estimate the GxE component, with SNP effects estimated from GWAS (Equation 1) or GWEIS (Equation 2).The error bars indicate the 95% confidence intervals for type 1 error rate and statistical power (vertical axis), indicating the 2.5th and 97.5th percentiles of the distribution of the proportion of significant GxE estimates across 200 simulation replicates.The significance level was set at a p value of 0.05 in each simulation replicate, and we used the Wilson score interval as a binomial proportion interval (Wilson, 1927).(a) Var(GxE) = 0, (b) Var(GxE) = 0.01, and (c) Var(GxE) = 0.05.GWAS, genome-wide association studies; GWEIS, genome-wide environment interaction studies; GxE, gene-environment interaction; PRS, polygenic risk score; SNP, single nucleotide polymorphism.
were standardized independently in the discovery and target data sets before being used in the downstream analysis.
For a comprehensive overview of the variables used in our study, including detailed information on the adjusted phenotypes and environmental variables, see Tables S1, S2, and S3.

| Quantitative traits
Figure 1 presents the summarized results of the simulation study for quantitative traits.The study included varying levels of genetic and residual correlations, with Cor(G 0 , G E ) values set at 0, 0.4, and 0.8, and Cor(ε, ε E ) values set at 0, 0.4, and 0.8.In the discovery data set, SNP effects were estimated using Equation (1) (GWAS) and Equation (2) (GWEIS).In the absence of GxE interaction in the simulations (Var(GxE) = 0), Models 1-4 were controlled for a type 1 error rate of 5% (Figure 1a).However, in the presence of GxE interaction in the simulation (Var(GxE) = 0.01 or 0.05), the statistical power of Models 3 and 4 outperformed both Models 1 and 2. Note that Models 1-3 are existing models although Model 4 is newly introduced in this study.When comparing Models 3 and 4, there is no remarkable difference in their performance.Figure S1 provides more results for additional data points in regard to genetic and residual correlations.

| Binary traits
Figure 2 presents the simulation study results for binary traits with a population prevalence of 10%.Similar to the simulation for quantitative traits (Figure 1), we varied levels of genetic and residual correlations, and estimated SNP effects using Equation (1) (GWAS) and Equation (2) (GWEIS).When using binary traits, the existing methods (Models 1-3), showed inflated type 1 error rates, particularly at higher genetic and/or residual correlations (Figure 2a).However, the proposed method (Model 4), maintained a well-controlled type 1 error rate at 5% in all cases.Regarding statistical power, Model 4 outperformed Models 1 and 2, exhibiting higher power in the presence of large GxE effects (Figure 2c-e).It is important to note that although Model 3 showed higher power, this may be due to type 1 error rate inflation (as evidenced in Figure 2a).When GxE was low (i.e., 0.01), there appeared to be no power to detect GxE using Model 4 (Figure 2b).This may be due to an overall power decrease when using binary responses (as shown in Figure 1 vs. 2).For additional data points in regard to genetic and residual correlations, refer to Figure S2.Moreover, when we tested different population prevalence values (1% and 50%), we consistently observed that Model 4 controlled the type 1 error rate and exhibited reasonable statistical power, unless var(GxE) was negligible (Figures S3  and S4).However, Models 1-3 either showed inflated type 1 error rates or lacked power.

| Assessing the impact of model misspecification on quantitative traits in GxE PRS models
To evaluate the potential issue of model misspecification in GxE PRS models, we assessed Models 1-5 using a simulation analysis for quantitative traits.The true simulation models were defined by Equations ( 9)-( 11), and a range of values were selected for genetic and residual correlations (Cor(G 0 , G E ) = 0, 0.4, and 0.8, and Cor(ε, ε E ) = 0, 0.4, and 0.8), and RxE interaction effects generated by unmeasured or unobserved factors (Var (RxE) = 0.25 or 0.5). Figure 3a illustrates that Models 1-4 generated an inflated type I error rate due to model misspecification.However, to remedy this inflation issue, a permutation technique was found to be useful (i.e., Model 4*) as shown in Figure 3a.In the presence of GxE, Models 1 and 2 demonstrated a lack of power even with Var(GxE) set to 0.05 as illustrated in Figures 3b,c.On the other hand, Models 3, 4, and 4* demonstrated reasonable power unless Var(GxE) was negligible, as depicted in Figure 3c.Figures S5 and S6 provide more results for additional data points in regard to genetic and residual correlations when using Var(RxE) = 0.25 and Var(RxE) = 0.5, respectively.

| Assessing the impact of model misspecification on binary traits in GxE PRS models
To assess the impact of model misspecification in GxE PRS models for binary traits, we conducted a simulation analysis using Models 1-5 with a population prevalence of 10% (Figure 4).The true simulation models were defined by Equations ( 9)-( 11), and binary responses were generated using a liability threshold model.However, in the absence of an interaction term (Var(GxE) = 0), none of the models, including the permutation model (Model 4*), controlled the type 1 error rate, regardless of whether Var(RxE) was 0.25 or 0.5 (Figures 4 and S7).
To address the issue of model misspecification, we further adjusted the proposed model by incorporating a second-order covariate term (E 2 ) in both discovery (Equation 3) and target (Model 5) models.When we applied Model 5 with the adjusted GxE PRS method, we observed effective control of the type 1 error rate at 5%, regardless of whether Var(RxE) was 0.25 or 0.5 (Figures 5a,S8a,and S9a).Model 5 also provided reasonable power when increasing the variance of GxE interaction .
Furthermore, we investigated the influence of varying population prevalence, specifically at 1% or 50%, on the evaluation of the models.When the population prevalence was set to 1%, Models 5 and 5* effectively controlled their The type 1 error rate and power comparison of existing and proposed methods for binary traits with 10% population prevalence.Note: To assess the performance of our proposed method (Model 4) and existing methods (Models 1-3) on binary traits, we conducted simulation studies using various scenarios.We generated quantitative phenotypes and covariates using simulation models (Equations 7, 8, and 11) and converted them to binary phenotypes using a liability threshold of 10% population prevalence.We considered different levels of genetic and residual correlations (Cor(G 0 , G E ) = 0, 0.4, and 0.8), and Cor(ε, ε E ) = 0, 0.4, and 0.8) and estimated SNP effects using GWAS (Equation 1) or GWEIS (Equation 2).We applied Models 1-4 to estimate the GxE component.The error bars indicate the 95% confidence intervals for type 1 error rate and statistical power (vertical axis), indicating the 2.5th and 97.5th percentiles of the distribution of the proportion of significant GxE estimates across 200 simulation replicates.The significance level was set at a p value of 0.05 in each simulation replicate, and we used the Wilson score interval as a binomial proportion interval (Wilson, 1927).type 1 error rate (Figures S10a and S11a).As the population prevalence increased to 50%, all the models maintained control over their type 1 error rate (Figures S12a and S13a).However, when considering the presence of GxE interactions, Model 5 did not achieve sufficient statistical power at a population prevalence of 1%.It is worth noting that Model 1 appeared to have relatively higher power at lower levels of genetic and residual correlations, but this observation may be attributed to the inflated type 1 error rate of Model 1 (Figures S10 and S11).In contrast, when the population prevalence was increased to 50, Models 3-5* demonstrated reasonable statistical power, especially when the variance of the GxE interaction was large (Figures S12 and S13).

| REAL DATA ANALYSIS
We analyzed BMI, LDL and WHR as quantitative traits with relevant modifiable covariates such as ALC, HD, and PA across each GxE PRS model including Models 1-4 where Model 4 is the proposed model (see Section 2).We recorded the estimates and significance of the interaction term of interest (i.e., GxE component in the prediction model).We aimed to identify the significant GxE effects from the proposed models and to compare the GxE estimates and the corresponding significance values across all the target models, Models 1-4 and 4* (the permuted model of Model 4).This would illustrate that although Models 1-3 are widely used, their results

F I G U R E 3
The type 1 error rate and statistical power of various GxE PRS models when using quantitative traits with Var(RxE) = 0.25.Note: We used simulation models (Equations 9-11) to generate phenotypes and covariates.Different genetic and residual correlations (Cor (G 0 , G E ) = 0, 0.4, and 0.8), and Cor(ε, ε E ) = 0, 0.4, and 0.8) were considered in various scenarios.Var(RxE) was set to 0.25.In the absence of GxE (Equation 9), we simulated the quantitative trait by adding the RxE component with the residual term.In the presence of GxE (Equation 10), we added the GxE component in addition to RxE and residual terms to simulate the quantitative trait.We applied Models 1-4 to estimate the GxE component, with SNP effects estimated from GWAS (Equation 1) or GWEIS (Equation 2).Model 4* is the permuted version of Model 4, obtained by permuting the X ˆgxe term of ⊙ X E ˆgxe component 1000 times.The error bars indicate the 95% confidence intervals for type 1 error rate and statistical power (vertical axis), indicating the 2.5th and 97.5th percentiles of the distribution of the proportion of significant GxE estimates across 200 simulation replicates.The significance level was set at a p value of 0.05 in each simulation replicate, and we used the Wilson score interval as a binomial proportion interval (Wilson, 1927).(a) Var(GxE) = 0, (b) Var(GxE) = 0.01, and (c) Var(GxE) = 0.05.GWAS, genome-wide association studies; GWEIS, genome-wide environment interaction studies; GxE, geneenvironment interaction; PRS, polygenic risk score; SNP, single nucleotide polymorphism.can differ from the proposed methods that have been validated through simulations.
In Figure 6, it is evident that the genetic effects of BMI are modulated by ALC, as indicated by the significance of Models 4 and 4*.However, although Models 1 and 2 identified a significant GxE effect in the analyses of the outcome/covariate pair BMI/PA, the significance was not confirmed by the proposed models (Models 4 and 4*).Considering the limitations of Models 1-3 in controlling the type 1 error rate, as demonstrated in the simulation study, we cannot solely rely on the GxE effects identified by Models 1 and 2 (Figure 6).Regarding the analysis of WHR/ALC, a significant GxE effect was observed in Model 4*, which is the permutation of Model 4 and designed to address spurious signals.It is important to note that Models 4 and 4* yield similar p values, both of which are very close to the significance threshold.Therefore, based on the findings from Figure 6 and considering the limitations of Models 1-3 in controlling the type 1 error rate, we conclude that, except for the BMI/ALC and WHR/ALC outcome/ covariate pairs, there is insufficient evidence to support the presence of GxE effects for the traits assessed.The estimated regression coefficients, along with their corresponding standard errors and associated p values for the quantitative traits with the relevant modifiable covariates, are available in Tables S4-s12.
We also analyzed CAD, DIAB, and HYP as binary traits with relevant modifiable covariates such as BMI, HDL, HGB, and WHR across each GxE PRS model including Models 1-5 where Model 5 is the proposed model (see Section 2).Similar to quantitative triat analyses, we reported the estimates and significance of the GxE interaction term in the target models (Models 1-5 and 5*.Here Model 5* is the permuted model of Model 5).
Figure 7 illustrates that the genetic effects of HYP are modulated by WHR, as supported by the significance of Models 5 and 5*.Model 3 identified a significant GxE effect in the analyses of CAD/HDL, CAD/WHR, DIAB/ BMI, and DIAB/HDL, but these were not confirmed by Models 5 and 5*.Similarly, Model 1 or 2 identified significant GxE effects in the analyses of CAD/BMI, CAD/HGB, and DIAB/HGB, but these were not confirmed by Models 5 and 5*.Considering that Models 1-4 lack control over the type 1 error rate for binary diseases, as demonstrated in the simulation verification, we cannot solely rely on the GxE effects identified by Model The type 1 error rate of various GxE PRS models when using binary traits with Var(RxE) = 0.25 or 0.5.Note: We used simulation models (Equations 9-11) to generate phenotypes and covariates.We used a liability threshold of 10% population prevalence to simulate binary phenotypic outcomes.Different genetic and residual correlations (Cor(G 0 , G E ) = 0, 0.4, and 0.8), and Cor(ε, ε E ) = 0, 0.4, and 0.8) were considered in various scenarios.In the absence of GxE (Equation 9), we simulated the quantitative trait by adding the RxE component with the residual term, and then converted to a binary scale using a liability threshold of 10% population prevalence.We applied Models 1-4 to estimate the GxE component, with SNP effects estimated from GWAS (Equation 1) or GWEIS (Equation 2).Model 4* is the permuted version of Model 4, obtained by permuting the X ˆgxe term of ⊙ X E ˆgxe component 1000 times.The error bars indicate the 95% confidence intervals for type 1 error rate (vertical axis), indicating the 2.5th and 97.5th percentiles of the distribution of the proportion of significant GxE estimates across 200 simulation replicates.The significance level was set at a p value of 0.05 in each simulation replicate, and we used the Wilson score interval as a binomial proportion interval (Wilson, 1927).(a) Var(GxE) = 0.25 and (b) Var(GxE) = 0.5.GWAS, genome-wide association studies; GWEIS, genome-wide environment interaction studies; GxE, gene-environment interaction; PRS, polygenic risk score; SNP, single nucleotide polymorphism.
1, 2, or 3 (Figure 7).However, it is noteworthy that HYP/ WHR exhibited significant GxE estimates when Models 5 and 5* were employed.Therefore, based on the findings from Figure 7, we conclude that there is no reliable evidence for GxE effects in the binary outcome/covariate pairs, except for HYP/WHR.The estimated regression coefficients, along with their corresponding standard errors and associated p values for the binary traits with the relevant modifiable covariates, are available in Tables S13-S24.
In accordance with the conventions of prior research (Liu et al., 2022;Niu et al., 2019;Schunkert et al., 2020), we treated educational attainment as a fixed effect, considering it as a part of socioeconomic status.To assess the robustness of our findings under this condition, we conducted additional analyses, specifically excluding The type 1 error rate and power of various GxE PRS models when using binary traits with Var(RxE) = 0.25% and 10% population prevalence.Note: To investigate the impact of model misspecification, we generated phenotypes and covariates with RxE effects using simulation models (Equations 9 and 10).The genetic and residual variances were set to fixed values of 0.4 and 0.5, respectively, for the main phenotypes.We used a liability threshold of 10% population prevalence to simulate binary phenotypic outcomes.Different genetic and residual correlations (Cor(G 0 , G E ) = 0, 0.4, and 0.8), and Cor(ε, ε E ) = 0, 0.4, and 0.8) were considered in various scenarios.Var(RxE) was set to 0.25.In the absence of GxE (Equation 9), we simulated the quantitative trait by adding the RxE component with the residual term, and then converted to a binary scale using a liability threshold of 10% population prevalence.In the presence of GxE (Equation 10), we added the GxE component in addition to RxE and residual terms to simulate the quantitative trait and converted to binary scale in a similar manner.We applied Models 1-5, with SNP effects estimated from GWAS (Equation 1) or GWEIS (Equation 3).Model 5* is the permuted version of Model 5, obtained by permuting the X ˆgxe term of ⊙ X E ˆgxe component 1000 times.The error bars indicate the 95% confidence intervals for type 1 error rate and statistical power (vertical axis), indicating the 2.5th and 97.5th percentiles of the distribution of the proportion of significant GxE estimates across 200 simulation replicates.The significance level was set at a p value of 0.05 in each simulation replicate, and we used the Wilson score interval as a binomial proportion interval (Wilson, 1927).educational attainment as a fixed effect, for the evaluation of BMI/ALC, WHR/ALC, and HYP/WHR interactions.These additional analyses indicated the consistency of our results (see Table S25).
We also assessed the robustness of our results by investigating the effects of interactions between genes and the additional 14 confounders (as suggested by Keller, 2014).Our findings indicated no substantial deviations from the original analysis, regardless of whether the GxConfounder interaction effects were considered solely in the discovery phase or in both the discovery and target models (refer to Table S26).
Finally, we reanalyzed the aforementioned outcome/covariate pairs by considering genome-wide significant SNPs under the thresholds 0.1 and 0.01, where our initial analysis has considered all the SNPs (whole genome approach) from the discovery sample.We found no difference in results upon using all-SNPs and genome-wide significant SNPs under the selected thresholds (Table S27).This indicates that the gene-environment interaction (GxE) remains consistent, regardless of genome-wide significance.

| DISCUSSION
The present study aimed to evaluate the performance of various GxE PRS models for both quantitative and binary traits.For this, we developed an improved modeling approach that is more robust and assessed the impact of genetic and residual correlations between the main phenotypes and covariates (i.e., E) as well as model misspecification (Sun et al., 2018).The results obtained from this study provide important insights into the effectiveness and limitations of GxE PRS models.
For quantitative traits, Models 3 and 4 demonstrated superior performance compared to Models 1 and 2 in terms of controlling type 1 error rates and statistical power to detect GxE interactions.Model 4, introduced in this study, performed similarly to Model 3, indicating its efficacy in capturing GxE interactions.However, in the presence of model misspecification, none of the GxE PRS models except the permutation of Model 4 worked effectively.This highlights the importance of carefully selecting GxE PRS models when detecting GxE interactions.
For binary traits, existing models (Models 1-3) exhibited inflated type 1 error rates, particularly at higher levels of genetic and/or residual correlations.However, the proposed Model 4 maintained a wellcontrolled type 1 error rate of 5% in all scenarios.Additionally, Model 4 displayed higher power compared to Models 1 and 2 in the presence of GxE effects.It is worth noting that Model 3 showed higher power, but this may be attributed to its inflated type 1 error rate.To address the issue of model misspecification, the adjusted Model 5, which incorporated a second-order covariate term (E 2 ) in both the discovery and target models, effectively controlled the type 1 error rate regardless of the magnitude of the variance of the source for model misspecification (e.g., RxE interaction).
This study introduces two innovative models: Model 4 for quantitative traits and Model 5 for binary traits.These novel models proficiently control type 1 errors while enhancing statistical power by integrating PRSs, effectively encompassing both additive and interaction genetic effects.For binary traits, we achieved this by including an additional squared term for nongenetic covariates.The existing Models 1 and 2 use the additive PRS term in the interaction component, which does not properly account for GxE effects.In contrast, the proposed models (Models 4 and 5) properly consider the interaction effects between gene and environment in the interaction PRS term.Model 3 correctly forms the interaction PRS term in the interaction component, however, it has failed to follow the principle of marginality by not including the main effects of the interaction PRS term.This limitation is addressed in our proposed models, by including the main effects of interaction PRS terms in their target models.
The analysis of real data further supported the efficacy of the proposed models.Given an outcome/ covariate pair, a variation in significant results across models justifies this point by agreeing with simulation findings.That is, we were capable of correctly identifying, whether the existence of GxE effects was true or not, in the instances where existing models resulted in spurious signals.For quantitative traits, significant GxE effects were identified in the analysis of BMI/ALC (Shin & Lee, 2021;Zhou et al., 2020) and WHR/ALC, confirmed by the permuted version of Model 4 (Model 4*).For binary traits, high significance was found in the GxE component in the HYP/WHR analysis (Shin & Lee, 2021;Wolf et al., 2007).These findings highlight the importance of considering GxE interactions in prediction models and demonstrate the potential of GxE PRS models for uncovering gene-environment interplay in complex traits and diseases.
Not all GxE interactions found in this study align with previous studies (Rask-Andersen et al., 2017;Shin & Lee, 2021;Zhou et al., 2020), including inconsistencies with our own previous studies (Shin & Lee, 2021;Zhou et al., 2020).There are several potential reasons for this discrepancy.First, previous studies relied on traditional models based on the discovery data set only, without incorporating GxE PRS models, like, the current study.By using GxE PRS models and incorporating both the discovery and target data sets, our study may yield more robust outcomes.Second, previous studies, which employed GxE PRS models, mostly used Model 1, 2, or 3. Consequently, the observed signals in those studies may have been influenced by factors such as lack of power, genetic and residual correlation, and potential model misspecification.In contrast, our study adopted a novel approach (Models 4 and 5, as well as permutation) that addresses these factors, enhancing the accuracy and reliability of our results.Lastly, it is worth considering that the study-wise Bonferroni correction employed in this study may have led to over-correction, potentially diminishing the significance of certain signals.Therefore, a cautious interpretation of the results is necessary.
Overall, the results of this study emphasize the significance of GxE interactions in the context of PRS models and provide valuable insights into the performance and limitations of various GxE PRS models.The findings support the potential utility of incorporating GxE effects to improve the accuracy and power of genetic risk prediction models for both quantitative and binary traits.However, it is important to carefully consider model misspecification issues (Sun et al., 2018) and potential confounding factors (Arnau-Soler et al., 2019;Zhai et al., 2022) when applying GxE PRS models in realworld settings.In addition, it is plausible that the unmodelled interaction could be captured by the residual term of the prediction model (Ni et al., 2019), while maintaining an unbiased main PRS effect estimation.However, overlooking the GxE interaction leads to a decrease in prediction accuracy, emphasizing the importance of modeling interaction effects in the prediction model for more accurate risk estimation.
There are several limitations in our study.First, we cannot guarantee the directionality of the GxE model, and we rely on prior knowledge to inform our assumption about the causal direction.We could not disentangle or determine the correct direction of GxE effects.For instance, we could not ascertain whether the genetic effects of the modifiable covariate are modulated by the outcome status, or vice versa.Further research is needed to explore and determine the causal direction in the context of GxE interactions (Arnau-Soler et al., 2019;Shin & Lee, 2021).Additionally, there are other factors that may influence the performance of GxE PRS models, such as variations in population prevalence and the interplay between multiple environmental factors.Moreover, investigating the robustness and generalizability of these models across diverse populations and traits would be valuable in advancing our understanding of GxE interactions in complex phenotypes.
In conclusion, this study provides valuable insights into the performance and applicability of GxE PRS models for both quantitative and binary traits.The findings support the importance of considering GxE interactions in genetic risk prediction and highlight the potential of GxE PRS models for uncovering complex gene-environment interplay.The proposed models are useful to study GxE interactions and their future applications have implications for personalized medicine and risk assessment, as they provide a framework for incorporating environmental factors into genetic risk prediction models.

F
I G U R E 6 Estimates and Significance of GxE Components for Quantitative Phenotypes across Models.Note: The heatmap represents the estimated regression coefficient of the GxE term for each model.The color gradient ranges from dark red to dark blue, indicating the magnitude of the regression coefficient.The size of each square in the heatmap is proportional to the corresponding p value.Significance levels are indicated by asterisks, representing the significance after Bonferroni correction (significance level = 0.05/21≈0.002),considering a total of 21 analyses conducted.The number of permutations performed in Model 4* was determined based on the p value obtained in Model 4 to ensure an adequate number of permutations (with a minimum of 1000).The horizontal axis represents the outcome/covariate pairs, while the vertical axis represents the models used for estimating GxE.ALC, frequency of alcohol consumption; BMI, body mass index; GxE, gene-environment interaction; HD, healthy diet; LDL, low-density lipoprotein; PA, physical activity; WHR, waist-to-hip ratio.

F
I G U R E 7 Estimates and significance of GxE components for binary phenotypes across models.Note: The heatmap represents the estimated regression coefficient of the GxE term for each model.The color gradient ranges from dark red to dark blue, indicating the magnitude of the regression coefficient.The size of each square in the heatmap is proportional to the corresponding p value.Significance levels are indicated by asterisks, representing significance after Bonferroni correction (significance level = 0.05/21≈0.002),considering a total of 21 analyses conducted.The number of permutations performed in Model 5* was determined based on the p value obtained in Model 5 to ensure an adequate number of permutations (with a minimum of 1000).The horizontal axis represents the outcome/covariate pairs, while the vertical axis represents the models used for estimating GxE.BMI, body mass index; CAD, coronary artery disease; DIAB, diabetes; HDL, high-density lipoprotein; HGB, hemoglobin; HYP, hypertension; WHR, waist-to-hip ratio.