Computationally efficient familywise error rate control in genome‐wide association studies using score tests for generalized linear models

In genetic association studies, detecting phenotype–genotype association is a primary goal. We assume that the relationship between the data—phenotype, genetic markers and environmental covariates—can be modeled by a generalized linear model. The number of markers is allowed to be far greater than the number of individuals of the study. A multivariate score statistic is used to test each marker for association with a phenotype. We assume that the test statistics asymptotically follow a multivariate normal distribution under the complete null hypothesis of no phenotype–genotype association. We present the familywise error rate order k approximation method to find a local significance level (alternatively, an adjusted p‐value) for each test such that the familywise error rate is controlled. The special case k=1 gives the Šidák method. As a by‐product, an effective number of independent tests can be defined. Furthermore, if environmental covariates and genetic markers are uncorrelated, or no environmental covariates are present, we show that covariances between score statistics depend on genetic markers alone. This not only leads to more efficient calculations but also to a local significance level that is determined only by the collection of markers used, independent of the phenotypes and environmental covariates of the experiment at hand.


Introduction
In genome-wide association (GWA) studies the aim is to test for association between genetic markers and a phenotype.A large number of markers are tested, and it is important to control the overall Type I error rate.Our focus is on controlling the familywise error rate (FWER).Multiple testing correction methods may achieve this goal by estimating a local significance level for the individual tests.In this work we present a new method, the order k FWER-approximation method, for finding a local significance level in multiple hypothesis testing for correlated common variants, as is often observed in GWA studies.
Assume that we have collected independent individual observations in a case-control, cohort or cross-sectional study.The phenotype of interest can be continuous or discrete.We consider biallelic genetic markers, giving three possible genotypes.For each genetic marker we specify a hypothesis situation, where the null hypothesis is of the type "no association between the phenotype and genetic marker" and we have a two sided alternative.We will model the data using a generalized linear regression model (GLM) with phenotype as response (outcome), genotype as the independent variable of interest (exposure), and possibly non-genetical, referred to as environmental, independent covariates (not of interest) in the model.In epidemiological studies, a confounder is a common factor which is associated with both the exposure and outcome.In GWA studies, population substructure may be associated with both the exposure (genotype) and outcome (phenotype) and therefore may be a confounding factor and need to be adjusted for in the analysis.Population stratification can be adjusted for by including principal components of the genotype covariance matrix of the individuals as covariates in the model (Price et al., 2006).As test statistics for the multiple hypothesis problem we use the score test statistics to evaluate the genotype contribution to the model for each genetic marker separately.It is known that the vector of separate score test statistics asymptotically follows a multivariate normal distribution with a covariance matrix that can be estimated using key features of the fitted GLM model and the genetic markers (Schaid et al., 2002;Seaman and Müller-Myhsok, 2005).This has also been a key ingredient in the work of Conneely and Boehnke (2007).
Further, we show that for the special case when no environmental covariates are present or when environmental and genetic covariates are observed to be independent, the estimated correlation matrix between score test statistics can be approximated by the estimated correlation matrix between the genetic markers.
In a multiple testing situation with m tests the familywise error rate can be controlled at level α by specifying a local p-value cut-off, α loc , to be used for all the m hypothesis tests.Inspired from the work of Moskvina and Schmidt (2008) and Dickhaus and Stange (2013) we will use an approximation to the m-dimensional asymptotic simultaneous multivariate normal distribution of the score test statistics vector to estimate α loc .The α loc estimate can be used to define an effective number of independent tests, and our FWER-approximation can be used to compute FWER-adjusted p-values.
The order k FWER-approximation method is more powerful than the Šidák method (which assumes that the score test statistics are independent across markers) and the Bonferroni method (which is valid for all dependence structures between the score test statistics).Further, it is more efficient and more widely applicable than the method of Conneely and Boehnke (2007).In Section 5 we will see that the method of Conneely and Boehnke (2007) is built on numerical integration in m dimensions and is computationally intensive.
The Westfall-Young permutation procedure is known to have asymptotically optimal power for a broad class of problems, including block-dependent and sparse dependence structure (Meinshausen et al., 2011).However, this method is computer intensive and to have a valid permutation test, the assumption of exchangeability needs to be satisfied (Commenges, 2003).This assumption is in general not satisfied when environmental covariates are present in the model.
We will use two genetic data sets presented by Athanasiu et al. (2010), Djurovic et al. (2010), Aspenes et al. (2011) and Loe et al. (2013) to illustrate our method applied to real data.
The paper is organized as follows.In Section 2 we present statistical background on the score test, and derive expressions for the score test covariance matrix, which is of importance for the subsequent work.Our proposed method is outlined and presented in detail in Section 3, together with characteristics of our method.In Section 4 real data and an artificial correlation structure are used to evaluate our proposed model and compare to other methods.Finally, we discuss and conclude in Sections 5 and 6.

Statistical background
In this section, we present notation and details on the score test in generalized linear models.

Notation and data
We assume that data -phenotype, m genetic covariates and d environmental covariates -from n independent individuals are available in a case-control, cohort or cross-sectional study.Let Y be an n-dimensional vector having the phenotype Y i of individual i as its ith entry, i = 1, . . ., n.Let X e be an n × d matrix having environmental covariates (the first one being 1 to allow for an intercept in the model presented below) for individual i as its ith row, and let X g be an n × m matrix having genetic covariates, or genotypes, for individual i as its ith row, each column corresponding to a genetic marker.
We assume that the genetic data are from common variant biallelic genetic markers with alleles a and A, where A is the minor allele.We will use the additive coding 0, 1, 2 for the genotypes aa, aA, and AA, respectively, in the genetic covariate matrix X g , but other coding schemes are also possible.We denote the total design matrix X = (X e X g ), which has the total covariate vector for individual i as its ith row.

Testing statistical hypotheses with the score test
We assume that the relationship between the phenotype Y and covariates X can be modelled by a generalized linear model (GLM) (McCullagh and Nelder, 1989) with an n-dimensional vector η = X e β e + X g β g = Xβ of linear predictors, where β = (β T e β T g ) T is a d + m-dimensional parameter vector.Let η i be the ith entry of η, and let µ be the n-dimensional vector having µ i = EY i as its ith entry.We assume that the link function g defined by η i = g(µ i ) of the GLM is canonical, which implies that the log likelihood for individual i is , where b and c are functions defining the exponential family of the phenotypes and φ i the dispersion parameter.In our context φ i = φ will be equal for all observations.In general, The full d + m-dimensional score vector n i=1 ∇ β l i can then be calculated to be which is asymptotically normal with mean 0 and covariance matrix where Λ is the diagonal matrix having σ 2 i as its ith entry.Partition U into its environmental and genetic components, U T = (U T e U T g ).Since β e are nuisance parameters and unknown, they are estimated by their maximum likelihood estimates under the null hypothesis of β g = 0.In effect, µ is to be replaced by μe , the fitted values in a model with only environmental covariates X e present, giving the statistic Then U g|e has the conditional distribution of U g given U e = 0, which is asymptotically normal with mean 0 and covariance matrix where V ee , V eg , V ge and V gg are the upper left d × d, upper right d × m, lower left m × d and lower right m × m submatrices of V , respectively (see Smyth, 2003).
The score test statistic U T g|e V −1 g|e U g|e with β g = 0 is asymptotically χ 2 distributed with m degrees of freedom when the complete null hypothesis β g = 0 is true (see Smyth, 2003).However, our interest lies not in the complete null hypothesis, but in the m individual hypotheses β gj = 0 for each component β gj of β g , 1 ≤ j ≤ m, against two-sided alternatives.We consider the standardized components of U g|e , where U g|e j denotes the jth entry of U g|e and V g|e jk the jk entry of V g|e .Under the null hypothesis H j : β gj = 0, T j is asymptotically standard normally distributed, and H j will be rejected for large values of |T j |.Under the complete null hypothesis, β g = 0, the vector T = (T 1 , T 2 , . . ., T m ) is asymptotically multivariate standard normally distributed with covariance matrix R, having as its jk entry, all evaluated at β g = 0. Note that the dispersion parameter φ is cancelled from T and the covariances.However, the σ 2 i of Λ will have to be estimated.

Special cases
We will now look at U g|e and V g|e for some special cases.

No environmental covariates
If no evironmental covariates except the intercept are present in the GLM, then X e = 1, the n-dimensional vector having all entries equal to 1, and Λ = σ 2 I under the null hypothesis, where I is the n × n identity matrix.Then where x j is the jth column of X g , 1 ≤ j ≤ m, 1 ≤ k ≤ m.So T j , the score test statistic for testing β gj = 0, is √ n times the Pearson correlation between x j and Y when σ 2 = Var Y i is replaced by the estimate Y T (I − 1 n 11 T )Y /n, and Cov(T j , T k ) is the sample correlation between x j and x k .Thus, for a GLM without adjustment for environmental covariates, the correlation between the score test statistics can be estimated by estimating the genotype correlation.The genotype correlation estimates twice the composite linkage disequilibrium if the genotypes are coded 0, 1, 2 (Weir, 2008).

Uncorrelated environmental and genetic covariates
Two n-dimensional vectors X 1 and X 2 of observations have zero Pearson correlation if their centered observations are orthogonal, If X 1 and X 2 are two matrices, then near zero Pearson correlation of each combination of a column of X 1 and a column of X 2 can be written compactly as If we consider genetic and environmental covariates to be random variables, and all pairs of an environmental and a genetic covariate to be independent, we would expect (6) to hold for all X 1 having columns that are functions of genetic covariates and X 2 having columns that are functions of environmental covariates.In particular, we consider X 1 = X g and X 2 = ΛX e .Since Λ is a function of environmental covariates only under the null hypothesis, so is X 2 .By ( 6), X T g ΛX e ≈ 1 n X T g 11 T ΛX e , Then, from (2), We now turn to the term X T g ΛX g .Its (j, k) entry is X T 1 Λ1, where X 1 is the vector consisting of the entry-wise products of the jth and the kth column of X g .Letting X 2 = Λ1, by ( 6), independence of environmental and genetic covariates yields and we have which is the same expression as in the case of no environmental covariates with the exception that the common variance σ 2 of the responses is replaced by their average variance tr Λ/n = 1 n n i=1 σ 2 i , where the σ 2 i are defined by the environmental covariates.The conclusion is that, if environmental and genetic covariates are uncorrelated, correlations of the score vector under the null hypothesis can be estimated more easily by estimating only correlations between genetic covariates instead

The normal model
For Y i normally distributed, Λ = σ 2 I, where I is the n × n identity matrix.The score vector can then be written and (2) reduces to e X e ) −1 X T e is the idempotent matrix projecting onto the column space of X e .Then I − H is the idempotent matrix projecting onto the orthogonal complement of the column space of X e , and (I − H)Y are the residuals when fitting the multiple linear model with only the environmental covariates present.Note that σ 2 enters into the test statistics T j (3), and needs to be replaced by an estimate; we have used the residual sum of squares of a fitted model with only environmental covariates present (the null hypothesis), divided by n − d.

The logistic model
For Y i Bernoulli distributed, φ = 1 and the σ 2 i of Λ are estimated by μei (1 − μei ), where μei are the fitted values under the null hypothesis with only environmental covariates.Inference about β g is valid also if data are collected in a case-control study since the canonical (logit) link is used (Agresti, 2002, pp. 170-171).
In the special case of no environmental covariates, that is, X e = 1, each score test statistic, T j (5), is equal to the Cochran-Armitage trend test (Armitage, 1955;Cochran, 1954) , where s i are the possible values of the genetic covariates, n 1 and n 2 the number of 0 and 1 phenotypes Y i , respectively, x i the number of observations having phenotype 1 and genotype s i at marker k, y i the number of observations having phenotype 0 and genotype s i , and m i = x i + y i .The Cochran-Armitage test is used in disease-genotype association testing with scores (s 0 , s 1 , s 2 ) = (0, s, 1) (Sasieni, 1997;Slager and Schaid, 2001), for example with s = 1 2 for an additive genetic model.

Familywise error rate control and approximations
We now turn to the topic of how to control the familywise error rate (FWER) by intersection approximations, and then apply this to our situation.

Multiple hypothesis familywise error rate control
We have a collection of m null hypotheses, H k : β gk = 0 (no association between phenotype and genotype at marker k), 1 ≤ k ≤ m, against two-sided alternatives.We will present a method for multiple testing correction that controls the FWER -the probability of making at least one type I error.We adopt the notation of Moskvina and Schmidt (2008), and denote by O k the event that the null hypothesis H k is not rejected, and by Ōk its complement, 1 ≤ k ≤ m.Then, if all m null hypotheses are true, In our case, O k is an event of the form |T k | < c, where T k is the test statistic of (3).We will consider single-step multiple testing methods, and choose the same cut-off c for each k.We denote by α loc = 2Φ(−c) = P ( Ōk ), the asymptotic probability of false rejection of H k , where Φ is the univariate standard normal cumulative distribution function.When the joint distribution of the test statistics is known under the complete null hypothesis, or can be estimated, FWER control at the α significance level can be achieved by solving the inequality FWER ≤ α for α loc , based on either the union or intersection formulation of (7).When m is large, this involves evaluating high dimensional integrals over the acceptance or rejection regions, which is suggested by Conneely and Boehnke (2007).
To avoid evalulating these costly integrals, we may instead control FWER by considering bounds based on (7).For example, the Bonferroni method is based on the Boole inequality applied to the union formulation of (7), from which it is seen that a local significance level of α loc = α/m guarantees FWER ≤ α.
When the FWER is calculated under the complete null hypothesis, so-called weak FWER control is achieved.However, in our situation, subset pivotality is satisfied, meaning that the distribution of any subvector (T k ) k∈K is identical under k∈K H k and under the complete null hypothesis m k=1 H k , for all subsets K ⊆ {1, 2, . . ., m}.In particular, a subvector of U g|e (1) and a submatrix of V g|e (2) corresponding to K only involves genetic covariates corresponding to K. Then strong FWER control is achieved, meaning that FWER ≤ α regardless of which null hypotheses are true (Westfall and Young, 1993;Westfall and Troendle, 2008).
The focus in this work will be on the intersection formulation of (7).Background theory will be given next and new application in 3.3.

Intersection approximations
Following Glaz and Johnson (1984), we define kth order product-type approximations to where probabilities are evaluated under the complete null hypothesis.This is similar to the usual multiplicative rule for the probability of intersection of events applied to γ m = P (O 1 ∩ • • • ∩ O m ), but with dimension of distributions limited to k.The idea is that the γ k should constitute increasingly better approximations of γ m as k increases, and that calculation of γ k is less costly than calculation of γ m when k < m, since only k-variate distributions are involved in γ k .
Note that the approximations depend on the order of the components of T = (T 1 , . . .T m ).We have used the order in which the m markers are positioned along the genome, assuming that the largest correlations occur between close markers.
In our case, Since T is asymptotically multivariate normally distributed with mean 0 under the complete null hypothesis, γ 1 ≤ γ m asymptotically ( Šidák, 1967) It is well known that the α loc found by this method, the Šidák method, is slightly larger than the α loc found by the Bonferroni method, thus the Šidák method will give slightly higher power.
We have seen that in our case, γ 1 ≤ γ m , meaning that the Šidák method can safely be used.If γ k ≤ γ m , then FWER = 1 − γ m ≤ 1 − γ k = α can be used to control FWER by solving the last equation for α loc (choosing the greatest solution if not unique -we have, however, never observed a γ k that is not monotonically decreasing in α loc ).If γ k ≤ γ l , then continuity of γ k and of γ l as functions of α loc implies that the α loc making 1 − γ l = α is no less than the α loc making 1−γ k = α, so that the power obtained by the lth approximation is no less than the power obtained by the kth approximation.
The ideal property Block et al. (1992).Unfortunately, our |T | is not MSM m−1 .It is possible to construct a trivariate normal distribution with mean 0 such that γ 1 < γ 3 < γ 2 for some α loc .However, the violations of MSM m−1 we have observed have been very small and only for restricted ranges of α loc , and only for carefully constructed covariance matrices.We have not observed violations for covariance matrices estimated from real data, and will therefore proceed to apply γ 2 and γ 3 as better approximations to γ m than γ 1 (the latter giving Šidák cutoffs).A summary of concepts of positive dependence, like MSM, was given by Dickhaus (2014, pp. 58-61).

Controlling FWER using kth order approximation for score tests
As we have seen, the vector T of score test statistics is under the complete null hypothesis asymptotically standard multivariate normal with covariance matrix R (4).We denote by O j the event |T j | < c of non-rejection of H j , which has probability P (O j ) = 1 − α loc under the null hypothesis, with α loc = 2Φ(−c).We will detail how to find α loc given by the second order approximation, γ 2 : Denote by r j the (j − 1, j) entry of R. Then For a desired upper bound α on FWER, the equation 1 − γ 2 = α is solved with respect to α loc , which can be done numerically using for example a bisection algorithm.Note that α loc enters into c = −Φ −1 (α loc /2).We can control FWER by higher-order approximations by solving the equation 1 − γ k = α for α loc in a similar way, which we will henceforth refer to as order k FWER approximation.By (8), γ k can be written as a ratio of products of k-dimensional and products of k − 1-dimensional multivariate normal integrals.Good numerical methods for calculating multivariate normal integrals exist for small dimensions (Genz and Bretz, 2009).We will illustrate using k = 2 and k = 3 for real data in Section 4.
The procedure to find α loc does not depend on the exact form of the test statistic, only that the vector (T 1 , . . ., T m ) of test statistics is asymptotically standard multivariate normal under the complete null hypothesis and |T j | ≥ c leads to rejection.In particular, (9) is identical to what was found by Moskvina and Schmidt (2008) for an allelic test and correlations given by linkage disequilibria.
In practice, instead of calculating α loc , it may be preferable to calculate FWERadjusted p-values: Replace α loc with p, the unadjusted p-value for an individual test, in the calculation of γ k .Then 1 − γ k is an FWER-adjusted p-value for the test, in the sense that if 1 − γ k ≤ α (rejection based on adjusted p-value), then p ≤ α loc (rejection based on local significance level).

FWER control with independent blocks
Genetic markers are distributed along the chromosomes and a common assumption is independence of genetic markers from different chromosomes.
As we have seen in Section 2.3, if the genetic markers are independent and no environmental covariates that are correlated with the genetic markers are included, the score test statistics for these markers would also be independent.Within a chromosome, genetic markers can belong to different haplotype blocks, being highly correlated within a block and independent or nearly independent between the blocks (Griffiths et al., 2002).
Assume that the m markers to be tested, and thus {O 1 , . . ., O m }, can be partitioned k be the kth order approximation given by ( 8) for the intersection of the events belonging to the lth block, 1 ≤ l ≤ b, and let γ k be the overall kth order approximation.Then it is easy to verify that γ k = b l=1 γ (l) k .

The effective number of independent tests
The concept of an effective number of independent tests, M eff , in multiple testing problems has been described and discussed by many authors, including Nyholt (2004), Gao et al. (2008), Moskvina and Schmidt (2008), Li and Ji (2005), Galwey (2009) and Chen and Liu (2011).All except Moskvina and Schmidt (2008) first estimate M eff , and then use M eff in place of m in the Šidák formula to calculate α loc = 1 − (1 − α) 1/M eff .An alternative formulation using the Bonferroni formula also exists.None of these methods use the concept of FWER in the derivation of M eff , and there is no mathematical justification that FWER is controlled.All methods start with the linkage disequilibrium or composite linkage disequilibrium matrix, and there is no mention of the dependence of the M eff estimate on the test statistics used for the hypothesis tests.
The method of Moskvina and Schmidt (2008) is based on an allelic test and controls the FWER using second order intersection approximations.As for our method, the main output of their method is an estimate of α loc .The above Šidák formula can then be used to define M eff = ln(1 − α)/ ln(1 − α loc ).Note that M eff depends on both α loc and the FWER threshold α.We will not consider M eff further in this article.The TOP data set is a case-control GWA data set, in which case is schizophrenia or bipolar disorder.The data set was collected with the aim to detect single-nucleotide polymorphisms (SNPs) associated with the schizophrenia or bipolar disorder (Athanasiu et al., 2010;Djurovic et al., 2010).The preprocessed TOP GWA data contain genetic information on 672972 SNPs (Affymetrix Genome-Wide Human SNP Array 6.0) for 1148 cases and 420 controls.Our dataset included individuals sampled until March 2013, and therefore the sample size is larger than in the cited papers.Preprosessing of the data was done as described in Athanasiu et al. (2010) and Djurovic et al. (2010).
Genotype-phenotype association was assessed by fitting a logistic regression without any environmental covariates, so that score test correlations equal genotype correlations (Section 2.3.1).
The VO 2 -max data set comes from a cross-sectional GWA study (Aspenes et al., 2011;Loe et al., 2013), in which the aim was to find SNPs associated with maximum oxygen uptake.The preprocessed VO 2 -max GWA data consist of 123497 SNPs (Illumina Cardio-MetaboChip, Moore et al., 2012) for 2802 individuals.The VO 2 -max data were analysed using a normal linear regression model, including age, sex and physical activity score as covariates.For both datasets, some genotype data were missing.In the TOP data, mean imputation was done for 0.04% of the genotypes, and in the VO 2 -max data for 0.7% of the genotypes.
For the VO 2 data, the local significance level controlling the FWER at level 0.05 was lowest for the Bonferroni method, slightly higher for the order 1 approximation (the Šidák method), and further increasing through the order 2 and 3 FWER approximations (Table 1).
For the TOP data, since no environmental covariates are included, permutation of the binary response vector is feasible (the exchangeability assumption is satisfied), and the maxT method can be used to estimate the local significance level controlling FWER at level 0.05.Permutation of the responses, followed by calculation of the maximal score test statistics over the whole genome, is a time consuming task, and we will only present results on two of the smallest chromosomes (chromosome 21 and 22): The local significance level controlling the FWER at level 0.05 was, as for the VO 2 data, lowest for the Bonferroni method and increasing through order 1-3 FWER approximations (Table 2).The highest level was obtained for the maxT method, and also the TOP data (left) and VO 2 -max data (right).The plots are logspline density estimates (Stone et al., 1997), implemented by the R logspline package (Kooperberg, 2016).
lower bound of the 95% confidence interval for α loc of maxT was greater than the order 3 FWER approximation.On a 4 × 6-core Xeon 2.67 GHz computer (Intel CPU) running Linux (Ubuntu 14.0) using one thread, the analyses on chromosome 22 took 85 hours for maxT, 20 minutes for order 3 FWER and 10 seconds for order 2 FWER approximation.
Smoothed frequency distributions of the estimated correlations between neighbouring SNPs along chromosomes are very similar across chromosomes (Figure 1), and therefore, we would expect that the trends for chromosome 21 and 22 can be extended to the other chromosomes and to the whole genome.

Correlation structure and local significance level
Consider 100 markers and a multivariate normal test statistic T having an AR1 correlation structure, that is, all entries on the main diagonal of the 100 × 100 correlation matrix are equal to 1, on the sub-and superdiagonal ρ, on the next diagonals ρ 2 , and so on.We investigated the effect of positive ρ on the local significance level α loc found by order 1-4 FWER approximations to control FWER at the 0.05 level.Also, the "true" α loc was calculated without approximation (that is, based on γ 100 ; see Section 3.2), using the pmvnorm function of the R (R Core Team, 2015) package mvtnorm (Genz et al., 2016) using the Genz-Bretz algorithm (Genz, 1992(Genz, , 1993;;Genz and Bretz, 2002).The pmvnorm function can calculate multivariate normal probabilities with some accuracy for dimensions up to 1000.
The inverse of an AR1 correlation matrix contains only negative off-diagonal entries, which ensures a property called MTP 2 (Karlin and Rinott, 1981)  which implies that the product-type approximations γ k of Section 3.2 are non-decreasing in k (Glaz and Johnson, 1984), making the α loc of the order k FWER approximations non-decreasing in k.
The effect of ρ on α loc was small for ρ < 0.4 (Figure 2), so for the 100 markers considered, there would be no gain in using FWER approximation or even the true joint distribution of T instead of Šidák this case.For larger ρ, order 2 FWER approximation provides an improvement compared to Šidák.The increase in α loc from Šidák to order 2 FWER approximation was greater than the difference between higher orders.
To assess the order k FWER approximation method for a more realistic correlation structure, we considered the empirical correlation matrix for the first 1000 markers on chromosome 22 of the TOP data.The order 1-4 approximations to control FWER at the 0.05 level gave an α loc of 5.1 ( Šidák), 5.8, 6.2 and 6.4 • 10 −5 , respectively, whereas the α loc calculated without approximation using the Genz-Bretz algorithm was 7.3 • 10 −5 .

Discussion
We have presented the order k FWER approximation method for estimating the local significance level α loc used to control FWER in a GWA study.Our method takes the estimated correlation structure between the test statistics into account, and is applicable when environmental covariates are present.The relation between the phenotype response and the genetic and environmental covariates can be modelled by any generalized linear model (using the canonical link); in particular, both models with discrete and models with continuous phenotypes are allowed.We have applied the method to common genetic variants, but it can also be used for rare variants.However, since rare variants are less correlated than common variants, we expect the increase in α loc from the Šidák method to be less than when analyzing common variants.The order k FWER approximation is based on conditioning on the previous k−1 neighbouring markers along the chromosome.A sufficient condition to have non-decreasing local significance levels when the order of the FWER approximation increases from 1 to k, and that the order k order approximation gives valid FWER control, is that the test statistic has the MSM k property.Even MSM 2 and MSM 3 are difficult to verify for our test statistic with GWA data, but it is reasonable to assume that they are satisfied (Section 3.2).
Population substructure can be associated with both the genotype and phenotype and is therefore a possible confounding factor in GWA studies.Population substructure can be adjusted for in the analysis using principal components of the covariance matrix of the individuals (Price et al., 2006) as covariates.In both the TOP data and the VO 2max data, related individuals were removed in the preprocessing of the data, and no adjustment for population structure was done in our analysis.
The AR1 correlation structure (Section 4.2) might not be a realistic model for genotype correlations, but the calculations nevertheless show potential for a significant improvement over the Šidák method by applying the fast order 2 FWER approximation.Also, there is potential to get quite close to the local significance level given by the full joint distribution of the test statistic vector by using order 3 or 4 approximation.Calculations using the more realistic empirical correlation matrix of part of chromosome 22 of the TOP study confirm this impression.
The maxT method (Section 3.6) of Westfall and Young (1993) may give higher power than FWER approximation (Table 2).However, there is no general way of including envir-onmental covariates using that method (Section 3.6).Also, computing time is much larger than for lower-order FWER approximation (Section 4.1), and the α loc estimate would likely differ if a new set of permutations were made (see confidence limits of Table 2).
Another alternative is parametric bootstrap methods (Seaman and Müller-Myhsok, 2005), which could be used to estimate the local significance level when the exchangeability assumption is not satisfied.It would be an efficient method, but to our knowledge it has not been proven that parametric bootstrap will control the overall error rate, since nuisance parameters need to be estimated.Conneely and Boehnke (2007) introduced a method for multiple testing correction for GLMs for multiple responses (traits) based on the estimated correlation matrix of the score vector.The focus of the method is to calculate FWER-adjusted p-values based on the multivariate integral arising from (7).Currently, this integral can be computed numerically with some accuracy for dimensions smaller than or equal to 1000 using the pmvnorm function of the R package mvtnorm (see Section 4.2 for details and references).Thus, the method of Conneely and Boehnke is not applicable for larger problems, e.g. more than m = 1000 hypothesis tests.
For our order k FWER approximation method we have used standard R functions to compute the second order approximation given by (9).For orders 3, 4 and 5 we have used the above-mentioned function pmvnorm specifying the Miwa algorithm (Miwa et al., 2003) instead of the default Genz-Bretz algorithm.The Miwa algorithm can be used for small dimensions, and is deterministic, whereas the Genz-Bretz algorithm includes simulations that lead to inaccuracies, which accumulate to an intolerable level when used for the large number of factors in (8).The research into better and faster integration of multivariate normal densities is ongoing, and Botev (2016) provides an interesting new approach, applicable for dimensions smaller than or equal to 100.This will enable our order k FWER approximation method to be applied with larger values of k than what has been presented here.

Conclusions
We have presented a new method for controlling the FWER for GWA data.The order k FWER approximation method can be used for generalized linear models and include adjustment for environmental covariates, possibly confounding, like population substructure or sex.We have applied the FWER approximation method to GWA data, and shown that our method is a powerful alternative to the Bonferroni and Šidák methods, especially in situations were permutation methods cannot be used (exchangeability assumption not satisfied).
The method provides a local significance level, α loc , for the individual tests, meaning that the null hypothesis of no association between phenotype and genetic marker should be rejected if the (unadjusted) p-value of a test is less than α loc .We found a substantial increase in α loc already at the order 2 approximation, compared to the α loc produced by the well-known Bonferroni and the Šidák methods -methods that does not take correlation structure between the test statistics of the markers into account ( Šidák assumes independence, but that could be considered worst-case for GWA data).

Figure 1 :
Figure1: Smoothed frequency distributions of absolute value of the estimated genotype correlations between neighbouring SNPs on each chromosome, one line per chromosome.TOP data (left) and VO 2 -max data (right).The plots are logspline density estimates(Stone et al., 1997), implemented by the R logspline package(Kooperberg, 2016).

Figure 2 :
Figure2: Local significance level α loc for order 1-4 FWER approximations and α loc based on true joint distribution of test statistic as a function of the parameter ρ of an AR1 correlation matrix for 100 markers.The horizontal line corresponds to Šidák correction (order 1 FWER approximation), then α loc is increasing with the order of the approximation (order 2-4; the three curves in the middle).The uppermost curve shows α loc based on the true joint distribution.

Table 1 :
Local significance level α loc calculated by the Bonferroni method and by order 1-3 FWER approximations for the TOP and VO 2 -max data, controlling the FWER at level 0.05, and ratio of α loc to Bonferroni α loc .significancelevel is mainly dependent on score test statistics correlations under study (not the sample size).

Table 2 :
for the density of |T |, Local significance level α loc for the TOP data calculated by the Bonferroni method and by order 1-3 FWER approximations, and estimated by the maxT method, controlling FWER at level 0.05, and ratio of α loc to Bonferroni α loc .Chromosome 21 contained 9802 SNPs and chromosome 22 contained 8970 SNPs.The number of permutations for the maxT method was 500000.The lower and upper values for maxT are bounds of a 95% confidence interval for α loc (see Section 3.6).