A Likelihood Ratio Test for Genome-Wide Association under Genetic Heterogeneity

Corresponding authors: Yongzhao Shao, Ph.D. Departments of Population Health and Environmental Medicine, New York University School of Medicine, 650 First Avenue, Room 538, New York, NY 10016, USA. Tel: 1-212-263-0324, Fax: 1-212-263-8570; E-mail: shaoy01@nyu.edu

Summary

Most existing association tests for genome-wide association studies (GWASs) fail to account for genetic heterogeneity. Zhou and Pan proposed a binomial-mixture-model-based association test to account for possible genetic heterogeneity in case-control studies. The idea is elegant; however, the proposed test requires an expectation-maximization (EM)-type iterative algorithm to identify the penalised maximum likelihood estimates and a permutation method to assess p-values. The intensive computational burden induced by the EM algorithm and the permutation becomes prohibitive for direct applications to GWASs. This paper develops a likelihood ratio test (LRT) for GWASs under genetic heterogeneity based on a more general alternative mixture model. In particular, a closed-form formula for the LRT statistic is derived to avoid the EM-type iterative numerical evaluation. Moreover, an explicit asymptotic null distribution is obtained, which removes the need for permutation to obtain p-values. Thus, the proposed LRT is easy to implement for GWASs. Furthermore, numerical studies demonstrate that the LRT has power advantages over the commonly used Armitage trend test and other existing association tests under genetic heterogeneity. A breast cancer GWAS dataset is used to illustrate the newly proposed LRT.

Introduction

Common and complex diseases (or traits) are often genetically heterogeneous in aetiologies (Lander & Schork, 1994; Zhou & Pan, 2009). Some well-known complex diseases with genetic heterogeneity include asthma, breast cancer (Hall et al., 1990; Wooster et al., 1994; Turnbull et al., 2010), and diabetes (Hattersley, 1998; Sladek et al. 2010). As in Zhou & Pan (2009), this paper considers the situation when a complex disease (or trait) is caused by mutations in multiple unlinked loci, commonly referred to as locus heterogeneity (Ott, 1999; Abreu et al., 2002; Fu et al., 2006). As a consequence of genetic heterogeneity, the population of individuals with disease may be decomposed into various latent sub-populations, each with disease caused by mutations at different loci (or their combinations). Most of the existing association tests for population-based case-control studies, e.g. GWAS, are based on comparing the mean genotype scores (e.g. the Armitage trend test (ATT)) between the case and control groups, which are not efficient in the presence of genetic heterogeneity. Zhou & Pan (2009) showed that it can be beneficial to use methods that account for genetic heterogeneity for testing association in a case-control study.

Similar to admixture mapping in linkage analysis (Smith, 1963; Abreu et al., 2002; Fu et al., 2006), Zhou & Pan (2009) proposed a binomial mixture model to account for genetic heterogeneity and developed a modified likelihood ratio test (MLRT) for a single locus (Fu et al., 2006). They also consider two methods to combine single-locus-based MLRTs across multiple loci in linkage disequilibrium to boost power when causal single nucleotide polymorphisms (SNPs) are not genotyped (Zhou & Pan, 2009). They illustrated, with a wide spectrum of numerical examples, that the proposed MLRT tests are more powerful than some commonly used association tests under genetic heterogeneity. Following Zhou and Pan, we define the genetic score X as the number of the minor alleles at a single locus for a subject. Zhou and Pan (2009) assumed that the genetic score math formula in a healthy control subject follows a binomial distribution, that is

display math(1)

where math formula and math formula represents the minor allele frequency (MAF) on that specific locus of the control subject. On the other hand, under genetic heterogeneity, the genetic score for a diseased subject, math formula, follows a simple two-component mixture binomial distribution,

display math(2)

where θ represents the probability of having the minor allele on one chromosome for a sub-group of cases with disease associated with the minor allele. They adopt a two-step procedure for parameter estimation. First, a maximum likelihood estimate (MLE) of math formula is obtained based only on the control sample. Then, fixing the estimated math formula at its MLE derived from the control-group data, maximum penalised likelihood estimates of other parameters in the mixture model are obtained using an expectation-maximization (EM)-type algorithm (Li et al., 2009). Subsequently, the penalised MLEs from the EM-step are plugged into a likelihood ratio to form a test statistic to detect the association between the marker genotypes and the disease status. Finally, they proposed a permutation procedure to obtain the p-value of the association test.

Zhou and Pan's idea is applicable to an association study for a limited number of candidate markers; however, there are several challenges in applying their proposed method to genome-wide association studies (GWASs). First, the computation of their proposed MLRT for a vast number of SNPs in a typical GWAS would be very intensive. Since the penalised MLEs are obtained by an EM algorithm for maximization of the penalised mixture likelihood, there are known complexities and caveats associated with the EM or other iterative methods for identifying MLEs and penalised MLEs in mixture models, including the challenges in selecting multiple starting points for parameter estimation. Moreover, the p-value of the MLRT is obtained by permutation, which is also difficult to apply directly to detect SNP-disease association in a GWAS with a vast number of SNPs, where the significance level is usually set to be less than $10^{-6}$. In addition, it is widely believed that complex diseases and traits are caused by interplays of a large number of genetic loci and environmental risk factors. The simple binomial mixture model with two components in Equation (2) may be too simple to capture the complex heterogeneity for many complex diseases. A more general form of binomial mixture model can be written as follows:

display math(3)

where math formula, math formula, and math formula if and only if math formula. In particular, for many of the complex diseases with genetic heterogeneity, it is likely that J is quite large. Since it is hard to know the number of the sub-populations J under genetic heterogeneity, it is desirable to have a new test that is applicable without the need to know the exact value of J while allowing math formula.

In this paper, we developed an LRT for GWASs based on the more flexible binomial mixture model in (3). We assume that the genetic score in the case group, math formula, follows the general binomial mixture distribution in (3), which allows the possibility of a large and unknown J. The proposed LRT overcomes the above-mentioned challenges of using Zhou and Pan's method for testing association of a vast number of SNPs in a typical GWAS. In particular, we derived the closed-form formula for the LRT statistic even though the MLEs of parameters in the binomial mixture model are non-regular with loss of identifiability (Liu & Shao, 2003). We further derived the simple closed-form asymptotic null distribution of the LRT, which avoids intensive numerical calculations such as the EM-based iterations for identification of MLEs and the permutations for evaluation of p-values. Additionally, the LRT can be implemented without knowing the number of components J in the mixture model (3). We conducted extensive simulation studies to show that the LRT has power advantages over the Armitage trend test (ATT) and some other association tests under genetic heterogeneity. We applied our test to a real dataset from a breast cancer GWAS to illustrate that it can achieve a much smaller p-value than some commonly used tests when there is evidence of genetic heterogeneity. Thus, the proposed LRT might be used to scan SNPs in GWAS to make novel discoveries by taking account of genetic heterogeneity.

Method

Notation and Setup

We focus on detecting marker-disease association at a single locus with two alleles A and a, such as a SNP in a case-control GWAS. Suppose math formula controls and math formula cases are sampled from the population. For each SNP, the genotype frequencies in a case-control study can be summarised as in the following 2 × 3 table (Table 1).

Table 1. The genotype frequencies for case-control data of a SNP
          X = 0   X = 1   X = 2   Total
Case      n0      n1      n2      n
Control   m0      m1      m2      m

Let the genetic score math formula and math formula denote the number of minor alleles, say a, at a single locus for a healthy control and a diseased case, respectively. It is clear that math formula. Similar to Zhou and Pan's setup, we assume that under the null hypothesis, both math formula and math formula have the same binomial distribution math formula as described in Equation (1). As in Zhou & Pan (2009), math formula is assumed to have a binomial distribution under H1. Under the alternative hypothesis of genetic heterogeneity, we assume that math formula has a mixture distribution as described in Equation (3). This last assumption is worthy of further comment. On the one hand, it is possible to have math formula in Equation (3) under H1 both in practice and in theory; thus, it is conceptually desirable to allow math formula in Equation (3). On the other hand, for likelihood inference, it is not necessary to have math formula in the model in order to achieve the maximum of the likelihood because the model is actually saturated with math formula. In other words, for a given dataset, whether one poses model (3) with math formula or with math formula, the testing results from the LRT are not going to be different. In fact, as will be seen in the next section, our proposed LRT has a “non-parametric” nature because it has a closed-form formula with a simple null distribution shown to be valid; thus, it will be valid for testing any alternative models, including the common models and those under heterogeneity. In this paper, we will establish that the test is actually an LRT under the specified setup, motivated by the elegant work of Zhou & Pan (2009) and by the fact that the LRT has well-known optimalities in terms of statistical power and efficiency.

Mixture Binomial and Maximum Likelihood

Assuming the setup in the previous sub-section, under H0, using the notation in Table 1 and denoting the true value of math formula as P0, the MLE of P0 for the overall combined case-control data in Table 1 is

display math(4)

Thus, the binomial likelihood function for the overall combined case-control data evaluated at math formula, L0, is given by

display math(5)

where math formula and math formula is defined in (4). Following Zhou & Pan (2009), in the control group, the genetic score math formula is assumed to follow a binomial distribution under the alternative hypothesis, say

display math(6)

Using the notation in Table 1, the MLE of math formula within the healthy control group only is given by

display math(7)

The binomial likelihood function of the healthy controls data evaluated at math formula, math formula, is given by

display math(8)

Similarly, in the case group, if the genetic score math formula has the distribution math formula, the MLE math formula of math formula within the diseased case group only would be

display math(9)

However, as in Zhou & Pan (2009), we assume that under genetic heterogeneity, the cases can be divided into multiple latent sub-populations. Thus, under the alternative hypothesis of genetic heterogeneity, we assume that math formula has a mixture distribution as described in Equation (3). It can be shown (see Appendix I) that, using the above notation, the maximum of the mixture likelihood for math formula has an explicit formula:

display math(10)

The derivation of the above equation can be found in Appendix I. It is also clear from the derivation in Appendix I that the mixture likelihood function of the parameter vector math formula in the mixture model (3) can have many local maxima due to the lack of identifiability in parameters (Liu & Shao, 2003). Nevertheless, the supremum of the mixture likelihood math formula for math formula has a single unique value for each dataset and can be obtained from the explicit formula in Equation (10). In the typical case-control study design, math formula is independent of math formula.
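One way to see why the mixture likelihood admits a closed-form maximum is to look at the genotype probabilities that model (3) can produce. The sketch below assumes the notation $\pi_k = P(X = k)$ for a case, with mixing weights $\alpha_j$ and component allele frequencies $\theta_j$; the symbols $\pi_k$ and $\hat{p}_D$ (the case-only binomial MLE) are labels chosen here for illustration, and the case split is inferred from the conditions on $4n_0n_2$ and $n_1^2$ used in the appendices rather than copied from Equation (10).

```latex
\begin{align*}
\pi_0 &= \sum_{j} \alpha_j (1-\theta_j)^2, \qquad
\pi_1 = 2\sum_{j} \alpha_j \theta_j (1-\theta_j), \qquad
\pi_2 = \sum_{j} \alpha_j \theta_j^2, \\[4pt]
\pi_1^2 &= 4\Bigl(\sum_{j} \alpha_j\, \theta_j (1-\theta_j)\Bigr)^{2}
\le 4\Bigl(\sum_{j} \alpha_j \theta_j^2\Bigr)\Bigl(\sum_{j} \alpha_j (1-\theta_j)^2\Bigr)
= 4\,\pi_0 \pi_2
\quad \text{(Cauchy--Schwarz)}, \\[4pt]
\sup_{\eta} L_D(\eta) &=
\begin{cases}
\displaystyle \prod_{k=0}^{2} \Bigl(\frac{n_k}{n}\Bigr)^{n_k}, & 4 n_0 n_2 \ge n_1^2, \\[10pt]
\displaystyle \prod_{k=0}^{2} \pi_k(\hat{p}_D)^{n_k}, \quad \hat{p}_D = \frac{n_1 + 2 n_2}{2n}, & 4 n_0 n_2 < n_1^2 .
\end{cases}
\end{align*}
```

Here $\pi_k(p)$ denotes the single-binomial genotype probabilities $(1-p)^2$, $2p(1-p)$ and $p^2$. Under this reading, a binomial mixture can only produce genotype distributions with a deficit of heterozygotes relative to a single binomial, which is why the unrestricted sample proportions are attainable only when $4 n_0 n_2 \ge n_1^2$.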

The LRT

Using the maximum of the likelihood L0, math formula and math formula in Equations (5), (8) and (10), respectively, we can write down the explicit formula of the log-LRT statistic as follows:

display math(11)
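To make the closed-form evaluation concrete, the following is a minimal Python sketch of the statistic computed directly from the counts in Table 1. It assumes that (11) has the form 2[log L_H + log L_D − log L_0], with L_0 and L_H the binomial likelihoods at the combined and control-only MLEs, and L_D given by a case split on the sign of 4·n0·n2 − n1²; the function names are hypothetical and this is not the authors' code.

```python
import math


def _cell_probs(p):
    """Genotype probabilities (0, 1, 2 minor alleles) under a single Binomial(2, p)."""
    return ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)


def _loglik(counts, probs):
    """Sum of count * log(prob) over genotypes, with the convention 0 * log(0) = 0."""
    return sum(c * math.log(q) for c, q in zip(counts, probs) if c > 0)


def _maf(counts):
    """Minor allele frequency estimate (n1 + 2*n2) / (2*n) from genotype counts."""
    n0, n1, n2 = counts
    return (n1 + 2 * n2) / (2 * (n0 + n1 + n2))


def lrt_statistic(cases, controls):
    """Assumed form of the log-LRT statistic for a 2 x 3 genotype table.

    cases = (n0, n1, n2), controls = (m0, m1, m2).
    """
    n0, n1, n2 = cases
    n = n0 + n1 + n2
    combined = tuple(a + b for a, b in zip(cases, controls))

    # Null: one binomial fitted to the pooled case-control sample.
    log_l0 = _loglik(combined, _cell_probs(_maf(combined)))
    # Alternative, controls: one binomial fitted to the controls only.
    log_lh = _loglik(controls, _cell_probs(_maf(controls)))
    # Alternative, cases: assumed closed-form maximum of the mixture likelihood.
    if 4 * n0 * n2 >= n1 ** 2:
        log_ld = _loglik(cases, tuple(c / n for c in cases))  # sample proportions
    else:
        log_ld = _loglik(cases, _cell_probs(_maf(cases)))  # single binomial
    return 2.0 * (log_lh + log_ld - log_l0)
```

For example, `lrt_statistic((400, 200, 400), (250, 500, 250))` uses made-up counts in which cases and controls share the same allele frequency but the cases show a marked heterozygote deficit; the statistic is large because the case-group term uses the unrestricted sample proportions.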

No iterative numerical maximization of the mixture likelihood function is needed for the evaluation of the LRT statistic in (11). Thus, the LRT statistic is easy to compute even for GWAS. It is known that the LRT statistics for testing homogeneity in mixture models often have complicated asymptotic distributions that typically lack closed-form representations. However, the above statistic math formula can be shown to have an explicit form of asymptotic distribution under the null hypothesis. More specifically, under H0, as math formula → ∞ and math formula → ∞, we have

display math(12)

where math formula denotes a chi-squared distribution with d degrees of freedom, d = 1, 2. Although the above asymptotic null distribution can be derived from general results such as those in Chernoff & Lander (1995), Chiano & Yates (1995) or Liu & Shao (2003), an elementary and detailed derivation from first principles is given in Appendix II for interested readers.

It is worth pointing out that our extensive numerical simulations discussed in the next section indicate that the simple asymptotic null distribution in (12) approximates the exact finite sample null distribution very well. The asymptotic formula is only slightly conservative. Therefore, the p-values of the LRT can be easily read off from the above simple closed-form asymptotic null distribution. For example, given any observed data in Table 1, one can first evaluate the value of math formula in (11) and then can obtain the p-value using the following simple command in the widely used R-platform:

display math(13)
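The same calculation can be sketched in Python using only the standard library. The sketch assumes that the null distribution in (12) is an equal-weight mixture of chi-squared distributions with 1 and 2 degrees of freedom, which is how Appendix II reads (each case occurring with limiting probability 1/2); the weight is therefore exposed as a parameter, and the function names are hypothetical. Both survival functions have exact closed forms, so no statistics package is needed.

```python
import math


def chi2_sf_df1(x):
    """P(chi-squared with 1 df > x), via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_df2(x):
    """P(chi-squared with 2 df > x); this is an exponential tail."""
    return math.exp(-x / 2.0)


def lrt_p_value(stat, weight_df1=0.5):
    """p-value under an assumed mixture weight_df1 * chi2_1 + (1 - weight_df1) * chi2_2."""
    if stat <= 0:
        return 1.0
    return weight_df1 * chi2_sf_df1(stat) + (1.0 - weight_df1) * chi2_sf_df2(stat)
```

In R, each component tail probability could be obtained with `pchisq(stat, df, lower.tail = FALSE)` and combined with the same weights.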

Last but not least, it is well known that the LRT generally has better power than other ad hoc tests. Thus, it should not be a surprise to see that the LRT can be more powerful than other commonly used tests which ignore the genetic heterogeneity that exists for many common complex diseases such as breast cancer. Finally, to implement the LRT, there is no need to identify the exact number of mixture components J in (3), which is desirable because J is hard to determine in practice.

Numerical Results

Type I Errors

The LRT has an explicit asymptotic distribution under H0. Consequently, it is convenient to evaluate the p-value and type I errors. We conducted comprehensive simulations to compare the empirical type I error of the LRT to the nominal significance level ranging from $10^{-2}$ to $10^{-8}$. In the Monte Carlo simulations, the genotype data for both the control group and the case group were generated from the same binomial distribution math formula), where math formula takes some fixed value P0, which represents the MAF. A number of simulation setups, which varied over a range of MAF and sample size, were selected. The control and case sample sizes are set to be equal. The nominal significance levels were taken to be $10^{-2}$, $10^{-3}$, $10^{-4}$, $10^{-5}$, $10^{-6}$, $10^{-7}$ and $10^{-8}$, respectively. For each setup, $10^{11}$ samples were generated. We found that the empirical type I error is slightly smaller than the nominal level, but they are extremely close to each other. Thus, using the asymptotic null distribution for the LRT is valid. For illustration, an example with math formula 0.4 and sample size math formula 1000 is shown in Table 2.

Table 2. Empirical type I error and nominal significance level at MAF = 0.4 and sample size 1000
Nominal level     0.01     0.001     $10^{-4}$      $10^{-5}$      $10^{-6}$      $10^{-7}$      $10^{-8}$
Empirical level   0.0098   0.00099   math formula   math formula   math formula   math formula   math formula

Power Comparison

The significance level of the association test is usually set very small for GWASs. For example, the genome-wide significance level of $5\times10^{-8}$ is being increasingly used for arrays that contain one million SNPs. The most commonly used association tests for GWAS include the ATT and the χ² test, both applicable for testing association in a 2 × 3 table between the case-control status and the three genotypes, as illustrated in Table 1. Accordingly, we designed simulation studies to evaluate and compare the powers of the LRT, the ATT and the χ² test when the significance level is set to be $5\times10^{-8}$. Note that the MLRT of Zhou & Pan (2009) was not included for comparison due to its severe computational challenge when the significance level is very small. In the first set of Monte Carlo simulations, the control sample was generated from a binomial distribution math formula); the case sample was generated from a two-component mixture binomial distribution as described in (3) with math formula:

display math

Twenty thousand replicate datasets of N controls and N cases were simulated for each of the eight simulation setups, and the empirical power of each test is shown in Table 3. The simulation results indicate that the LRT has a power advantage over the ATT and the χ² test under genetic heterogeneity.
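The data-generating step of such a simulation can be sketched as follows. This is an illustrative Python sketch, not the authors' simulation code: the function names are hypothetical, each subject's genetic score is drawn as the sum of two independent allele draws, and for cases a latent sub-population is first selected according to the mixing proportions.

```python
import random


def sample_control_counts(n, maf, rng):
    """Genotype counts (0, 1, 2 minor alleles) for n controls from Binomial(2, maf)."""
    counts = [0, 0, 0]
    for _ in range(n):
        score = (rng.random() < maf) + (rng.random() < maf)
        counts[score] += 1
    return tuple(counts)


def sample_case_counts(n, thetas, alphas, rng):
    """Genotype counts for n cases from a binomial mixture.

    thetas: allele frequencies of the latent sub-populations
    alphas: mixing proportions (same length as thetas, summing to 1)
    """
    counts = [0, 0, 0]
    for _ in range(n):
        theta = rng.choices(thetas, weights=alphas, k=1)[0]  # pick a sub-population
        score = (rng.random() < theta) + (rng.random() < theta)
        counts[score] += 1
    return tuple(counts)


# One replicate with the parameters listed for setup 1 of Table 3.
rng = random.Random(2024)  # arbitrary seed, chosen here only for reproducibility
controls = sample_control_counts(1000, 0.1, rng)
cases = sample_case_counts(1000, (0.12, 0.50), (0.90, 0.10), rng)
```

Empirical power would then be the fraction of replicates in which a test's p-value falls below the chosen significance level.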

Table 3. Empirical power when the case genetic score has a mixture distribution with J = 2
Setup         1       2       3       4       5       6       7       8
Control MAF   0.1     0.1     0.2     0.2     0.25    0.25    0.3     0.3
θ1            0.12    0.08    0.18    0.23    0.20    0.30    0.32    0.28
θ2            0.50    0.50    0.60    0.60    0.70    0.70    0.70    0.70
α1            0.90    0.85    0.80    0.9     0.8     0.9     0.9     0.8
α2            0.10    0.15    0.20    0.1     0.2     0.1     0.1     0.2
N             1000    1000    1500    1500    1500    1000    2000    1500
Power
  LRT         0.717   0.840   0.987   0.833   0.997   0.829   0.673   0.918
  ATT         0.439   0.059   0.576   0.717   0.087   0.756   0.490   0.362
  χ² test     0.399   0.150   0.810   0.687   0.718   0.709   0.484   0.604
Note: the significance level is set at $5\times10^{-8}$.

Similar power advantages of the LRT over other tests are also observed when the alternative mixture model has three components (math formula), as demonstrated in Table 4, where θ3 for the cases is set as equal to math formula for the control group.

Table 4. Empirical power when the case genetic score has a mixture distribution with J = 3
Setup         1       2       3       4       5       6       7       8
Control MAF   0.1     0.1     0.1     0.2     0.2     0.2     0.3     0.3
θ1            0.13    0.15    0.15    0.1     0.25    0.22    0.2     0.33
θ2            0.5     0.5     0.4     0.6     0.6     0.6     0.7     0.7
α1            0.35    0.4     0.3     0.2     0.3     0.35    0.4     0.4
α2            0.15    0.1     0.2     0.2     0.1     0.15    0.2     0.15
α3            0.5     0.5     0.5     0.6     0.6     0.5     0.4     0.45
N             800     1200    800     1000    2000    1500    1500    1500
Power
  LRT         0.883   0.903   0.843   0.829   0.846   0.937   0.864   0.852
  ATT         0.554   0.707   0.713   0.125   0.612   0.696   0.011   0.631
  χ² test     0.528   0.683   0.641   0.32    0.64    0.757   0.274   0.665
Note: the significance level is set at $5\times10^{-8}$.

Note that the ATT, also called Cochran–Armitage trend test (CATT) by many researchers, has good power only when the disease risk of the genotypes math formula and math formula is monotone increasing or decreasing under the alternative hypothesis (Armitage, 1955; Freidlin et al., 2002). Thus, ATT can have very low power when there is a violation of a linear trend in the disease risk across the ordered genotypes math formula and math formula, as in the case of both setups #3 and #5 in Table 3.
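The sensitivity of the ATT to a mean shift only can be illustrated with a short Python sketch of the standard Cochran-Armitage trend statistic with additive scores (0, 1, 2). The function name and the example counts are hypothetical; the variance uses the usual large-sample form, and the two-sided p-value comes from the standard normal distribution.

```python
import math


def armitage_trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x 3 genotype table.

    Returns (z, two-sided p-value). cases and controls are genotype counts
    ordered by number of minor alleles.
    """
    r_total = sum(cases)
    s_total = sum(controls)
    n_total = r_total + s_total
    col = [r + s for r, s in zip(cases, controls)]  # genotype column totals

    # Score-weighted contrast between case and control genotype counts.
    u = sum(w * (s_total * r - r_total * s)
            for w, r, s in zip(scores, cases, controls))
    # Large-sample variance of u under the null hypothesis of no association.
    var_u = (r_total * s_total / n_total) * (
        n_total * sum(w * w * c for w, c in zip(scores, col))
        - sum(w * c for w, c in zip(scores, col)) ** 2
    )
    z = u / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Made-up counts: equal mean genetic score in both groups, but the cases
# have far fewer heterozygotes than the controls.
z, p = armitage_trend_test((400, 200, 400), (250, 500, 250))
```

With these counts the contrast is exactly zero, so the trend test sees no signal even though the genotype distributions differ sharply between the groups.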

It is clear from the power simulations across the multiple simulation setups that the LRT can be much more powerful than the commonly used ATT and the math formula test in GWAS in the presence of genetic heterogeneity.

A Breast Cancer GWAS

Breast cancer is the most common cancer among women. Many genes on different chromosomes that underlie breast cancer have been identified including many well-known studies conducted two decades ago (Hall et al., 1990; Wooster et al., 1994). Many more genetic variants underlying breast cancer are still being discovered nowadays; thus, there is little doubt about the existence of genetic heterogeneity in the case of breast cancer. For illustration, we applied the newly proposed LRT to a breast cancer GWAS dataset. In particular, Turnbull et al. (2010) conducted a GWAS to identify breast cancer susceptibility alleles. They studied 582,886 SNPs in 3659 breast cancer cases and 4897 controls in the first stage, and evaluated promising SNPs that were identified in stage 1 in a second stage with 12,576 cases and 12,223 controls. In the paper, they reported five new susceptibility SNPs with summary genotype data of the five SNPs made publicly available. A literature search indicates that four of the five SNPs (rs1011970, rs10995190, rs704010 and rs614367) have been independently confirmed by other studies since the publication of their GWAS results in 2010 (Peng et al., 2011; Lambrechts et al., 2012). We evaluated the p-values of the LRT, ATT and math formula test for these four SNPs for comparison. The results are summarised in Table 5.

Table 5. Comparison of p-values for the four SNPs reported in Turnbull et al. (2010)
SNP          Stage            LRT p-values   ATT p-values   χ² test p-values
rs10995190   Stage 1          math formula   math formula   math formula
             Stage 2          math formula   $10^{-8}$      math formula
             Fisher p-value   math formula   math formula   math formula
rs614367     Stage 1          math formula   math formula   math formula
             Stage 2          math formula   $10^{-8}$      math formula
             Fisher p-value   math formula   $10^{-16}$     math formula
rs704010     Stage 1          math formula   math formula   math formula
             Stage 2          math formula   math formula   math formula
             Fisher p-value   math formula   math formula   math formula
rs1011970    Stage 1          math formula   math formula   math formula
             Stage 2          math formula   math formula   math formula
             Fisher p-value   math formula   math formula   math formula

Note that for the SNP rs10995190 and SNP rs614367, the p-values are smaller than the genome-wide significance level math formula for the newly proposed LRT and the ATT, and for each of the two stages. The performance of the LRT is as good as or better than the other two tests. In particular, the LRT has an extremely small p-value math formula for stage 2 data of SNP rs614367, showing statistical significance at even lower levels. It is thus not surprising that these SNPs are independently replicated by other GWAS. For the SNP rs704010 and SNP rs1011970, a simple combined p-value (for combining the two stages), e.g. Fisher's meta p-value, indicates that both SNPs are significant even using the genome-wide significance level math formula for all three tests. The newly proposed LRT is also very competitive for these two SNPs. For example, for the stage 1 data of SNP rs704010, only the p-value of the LRT is smaller than the genome-wide significance level math formula. As an indication of overall strength of the test, Fisher's meta p-value of the LRT from the combined stages 1 and 2 is smaller than those of the other two tests, and the LRT is clearly the most competitive test among the three competitors. This example indicates the potential value of the proposed LRT for GWAS data to detect association of complex diseases where the presence of genetic heterogeneity is always a possibility.
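Fisher's method for combining independent p-values, such as those from the two stages, can be sketched in a few lines of Python. The paper does not spell out the calculation, so this is the textbook version: X = −2 Σ log p_i is referred to a chi-squared distribution with 2k degrees of freedom, whose upper tail has a closed form for even degrees of freedom. The function name is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's combined p-value for k independent p-values (each in (0, 1])."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # X / 2, where X = -2 * sum(log p)
    # Upper tail of chi-squared with 2k df: exp(-X/2) * sum_{i<k} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total
```

For two stages this reduces to p1·p2·(1 − log(p1·p2)), so the combined value is always larger than the plain product of the two stage-wise p-values.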

Discussion

In the analysis of GWAS data, potential latent genetic heterogeneity has largely been ignored by researchers. Zhou & Pan (2009) first proposed mixture models to account for genetic heterogeneity. However, for the analysis of a vast number of SNPs in GWAS, the MLRT of Zhou and Pan has major computational challenges. In this paper, using a more general binomial mixture model, we have derived an LRT for case-control association studies that improves on the MLRT of Zhou and Pan in computational efficiency and several other respects. In particular, the LRT statistic has a simple closed-form formula, which avoids intensive computation such as the EM algorithm for penalised MLEs. Additionally, we have derived an explicit asymptotic null distribution for the proposed LRT, which makes it convenient to obtain p-values even at a small significance level. Moreover, to perform the LRT, there is no need to decide the exact number of mixture components, which is convenient in practice. Therefore, the new LRT has computational advantages over the MLRT proposed by Zhou and Pan and is suitable for scanning SNPs in GWAS data.

As demonstrated by our numerical studies, in the presence of genetic heterogeneity, the LRT can be much more powerful than either Armitage's trend test or the χ² test, both of which are among the most widely used tests in GWAS. Given that most complex diseases are widely believed to be polygenic and have environmental components, genetic heterogeneity is a hallmark of complex diseases. As illustrated using the GWAS data for breast cancer, the newly proposed LRT can be easily used for any GWAS data; thus, researchers can use the simple algorithm to scan their SNPs as a cost-effective way to potentially make novel and important discoveries using existing data already collected in the large number of GWAS. Given that there are already about 1000 published GWAS, and many more genome-wide studies are being planned and conducted, the new LRT has the potential to become one of the useful tests to scan the SNPs in these GWAS, perhaps as a secondary analysis to account for genetic heterogeneity. Thus, the new user-friendly LRT can potentially be used to increase the impact of existing and future GWASs.

Acknowledgements

This research is partially supported by the NIH Cancer Center Supporting Grant to NYU (2P30 CA16087), and the NIEHS Center Grant to NYU (5P30 ES00260), as well as a Stony Wold-Herbert Foundation grant to YS. The authors have no conflict of interest. The authors would like to thank the reviewers for insightful suggestions that led to improvement of the paper.

Appendix I

Derivation of the Test Statistic of the LRT

To prove that our proposed association test is indeed an LRT under the given setup, we just need to establish Equation (10), that is, when math formula follows the mixture distribution in (3),

display math(14)

where math formula is defined as in Equation (9). First, we want to show that when math formula, the MLEs math formula of the η in (3) satisfy

display math(15)

A simple application of Jensen's inequality yields that, for any η,

display math(16)
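For readers who want the intermediate step, a bound of this kind can be obtained as follows. The sketch assumes the notation $\pi_k(\eta)$ for the genotype probabilities of a case under model (3) and restricts the sums to genotypes with $n_k > 0$; these symbols are labels chosen here, and the display is a sketch of the argument rather than a reproduction of (16).

```latex
\begin{align*}
\sum_{k:\, n_k > 0} \frac{n_k}{n} \log \frac{\pi_k(\eta)}{n_k / n}
&\le \log \Bigl( \sum_{k:\, n_k > 0} \frac{n_k}{n} \cdot \frac{\pi_k(\eta)}{n_k / n} \Bigr)
&& \text{(Jensen, concavity of } \log \text{)} \\
&= \log \Bigl( \sum_{k:\, n_k > 0} \pi_k(\eta) \Bigr) \le \log 1 = 0, \\[4pt]
\text{hence} \quad \sum_{k} n_k \log \pi_k(\eta) &\le \sum_{k} n_k \log \frac{n_k}{n}
\quad \text{for every } \eta .
\end{align*}
```

Equality requires $\pi_k(\eta) = n_k / n$ for every genotype, so the bound is attained exactly when the sample proportions can be represented by some mixture in (3).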

The right-hand side of the above inequality is an upper bound which may not be achievable in general. However, when math formula, we can show that the equality in (16) is achievable. In fact, when math formula, there are infinitely many values of the MLE math formula that can make (16) an equality. It is straightforward and elementary to verify that one set of solutions for the MLE is given as follows:

display math(17)

As indicated above, math formula can take any values in an interval; thus, there are infinitely many sets of solutions for the MLE. Thus, Equation (15) is proved. Next, we show that when math formula,

display math(18)

First, we show that for any fixed η,

display math

Using the inequality math formula, we get

display math

It is straightforward to verify that

display math(19)

Therefore, when math formula, and for any η

display math

Finally, it is obvious that

display math

This finishes the proof of (14), and thus of (10).

Appendix II

The Asymptotic Null Distribution of the LRT

Under H0, both math formula and math formula have the same binomial distribution math formula. We denote the true null value for math formula as P0. Without loss of generality, we assume math formula to avoid math formula appearing in any denominator. First, we may consider testing math formula against math formula, using only the healthy controls. This is a classic problem, and the LRT statistic is well known to have a math formula distribution. It is well known that under math formula, the LRT statistic can be written as

display math(20)

where math formula is the MLE of math formula using only the healthy controls. Similarly, we may consider testing math formula against math formula, using only the diseased cases. Then, under math formula, the LRT statistic has math formula distribution and can be written as

display math(21)

where math formula is the MLE of math formula using only the diseased cases. Similarly, we may consider testing math formula against math formula, using the overall sample combining both the diseased cases and healthy controls. Then, the MLE for math formula from the combined sample is math formula as defined in (4). The LRT statistic can be written as

display math(22)

From the above three equations, and the Equations (5), (8) and (10), we have, when 4n0n2 <math formula,

display math(23)

Denote math formula. Then, it is straightforward to verify that

display math

and

display math
display math

Denote

display math

and

display math

Then,

display math(24)

Note that math formula and math formula and math formula and math formula are independent. Thus,

display math

Therefore, when math formula, we have math formula.

On the other hand, under H0, when $4n_0n_2 \ge n_1^2$, we can first consider testing goodness-of-fit of math formula using only the diseased cases. The LRT statistic has a math formula asymptotic distribution and can be written as

display math(25)

The first term at the right-hand side of the last equality is equivalent to the Pearson's classic χ2 statistic (via comparing observed to expected cell frequencies) for testing Hardy–Weinberg equilibrium which is well known to have the math formula distribution (Emigh, 1980). Using the above equations, when math formula, we have

display math(26)

By Equations (23) and (24), from the above equation, we have

display math(27)

Note that the two terms in the right-hand side of (18) are well known to be asymptotically independent which, in turn, implies asymptotic independence of the two terms at the right-hand side of (19). Therefore, when math formula, we have

display math

Finally, it suffices to show that math formula as math formula. Note that under H0, math formula follow a multinomial distribution math formula π0, π1, π2), where math formula for math formula. Let math formula be the random vector ( math formula, math formula, math formula). Then, we have (Bickel & Doksum, 2000)

display math

where

display math(28)

Let math formula denote math formula and math formula denote math formula. Under H0, then

display math
display math

By the central limit theorem and the multivariate delta method, math formula has an asymptotic normal distribution with mean 0. That is

display math(29)

Thus, under H0, as math formula ∞,

display math

This finishes the proof of the following convergence in distribution, under H0,

display math(30)
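The final step of the argument, that under H0 the two cases of the statistic each occur with limiting probability 1/2, can be checked numerically. The Python sketch below assumes the relevant event is 4·n0·n2 < n1² in the case group, as in the condition stated before (23); the function name, sample size, allele frequency and seed are arbitrary choices made for illustration.

```python
import random


def fraction_single_binomial_case(n, maf, reps, seed=0):
    """Monte Carlo estimate of P(4*n0*n2 < n1**2) for n cases under Binomial(2, maf)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        counts = [0, 0, 0]
        for _ in range(n):
            score = (rng.random() < maf) + (rng.random() < maf)
            counts[score] += 1
        n0, n1, n2 = counts
        if 4 * n0 * n2 < n1 * n1:
            hits += 1
    return hits / reps


estimate = fraction_single_binomial_case(n=500, maf=0.4, reps=600)
```

With a few hundred replicates the estimate should land near 0.5, up to Monte Carlo error and a small finite-sample bias.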
