Keywords:

  • Association;
  • χ²;
  • genetic model;
  • MAX;
  • power;
  • robustness

Summary

Choosing an appropriate single-marker association test is critical to the success of case-control genetic association studies. An ideal single-marker analysis should have robust performance across a wide range of potential disease risk models. MAX was designed specifically to achieve such robustness. In this work, we derived the power calculation formula for MAX and conducted a comprehensive power comparison between MAX and two other commonly used single-marker tests, the one-degree-of-freedom (1-df) Cochran-Armitage trend test and the 2-df Pearson χ² test. We used a single-marker disease risk model and a two-marker haplotype risk model to explore the performance of these three tests. We found that each test has its own “sweet” spots. Among the three tests considered, MAX appears to have the most robust performance.


Introduction

In case-control genetic association studies (CCGAS), single-marker analysis, which tests the association between the outcome and an individual SNP, is often used. When there are no covariates to adjust for, two tests are usually applied: the 1-degree-of-freedom (1-df) Cochran-Armitage trend test (CATT) (Klein et al., 2005; WTCCC, 2007), which corresponds to the score test derived from an additive disease model on the logit scale (CATTA), and the 2-df Pearson χ² test (Chi-2df), which compares the 3-category genotype frequencies between the case and control groups (Yeager et al., 2007). In addition to these two tests, MAX, which takes the maximum of three CATTs derived under dominant, recessive, and additive models as the test statistic, has also been proposed for the association test (Sladek et al., 2007). A detailed definition of each test is given below. When there are covariates to adjust for, the counterpart of each test can be derived from a standard logistic regression model that includes the genotype, coded according to the assumed disease model, together with the covariates. An important common feature of CATTA, Chi-2df, and MAX is that their 2-sided testing results do not depend on the choice of the risk allele.

The significance levels (p-values) of CATTA and Chi-2df can be obtained easily from their theoretical asymptotic distributions. The calculation of the p-value for MAX is more involved and requires a multiple-integration or permutation procedure (Conneely & Boehnke, 2007; Li et al., 2008a). Although all three tests have been used in recent genome-wide association studies (GWAS) (Klein et al., 2005; Hunter et al., 2007; Sladek et al., 2007; WTCCC, 2007; Yeager et al., 2007), there is no consensus as to which one is generally preferable, and there has been little discussion in the literature of the analytic power of MAX or of how MAX compares with the other tests. In this work, we derive an analytic formula for calculating the power of MAX. This formula, together with the power formulas that already exist for CATTA and Chi-2df, enables us to conduct a comprehensive power comparison among the three tests.

Comparisons of various single-marker analyses have been reported by several groups (Freidlin et al., 2002; Guedj et al., 2006; Kuo & Feingold, 2008). The novelty of this work lies in adding the promising MAX test to the comparison. In particular, we compare the asymptotic powers of these tests under a broad range of single-locus and multi-locus disease models. The comparison results should shed more light on the relative merits of the three tests under various disease risk models and provide guidance for the analysis of future CCGAS.

Methods

Definition of Test Statistics

We focus first on situations where there are no covariates to be adjusted for. Assume that there are r cases and s controls in a CCGAS, and that there are two alleles, G and g, at a given SNP locus with the possible genotypes gg, Gg, and GG. The notations for genotype counts in the case and control groups are given in Table 1. Based upon Table 1, the general form of the CATT can be written as

$$Z_x = \frac{n^{1/2} \sum_{i=0}^{2} x_i (s r_i - r s_i)}{\left\{ r s \left[ n \sum_{i=0}^{2} x_i^2 n_i - \left( \sum_{i=0}^{2} x_i n_i \right)^2 \right] \right\}^{1/2}} \qquad (1)$$

where $x = (x_0, x_1, x_2)$ is a genotype score vector for the coding of genotypes gg, Gg, and GG, and $n_i = r_i + s_i$, i = 0, 1, 2. The genotype score vector x is chosen by the investigator. It should be pointed out that the CATT given by (1) is equivalent to the score statistic for testing the null hypothesis $H_0: \beta_1 = 0$ derived from the following standard logistic regression that models the effect of genotypes represented by x,

$$\log \frac{\Pr(\text{case} \mid x_i)}{1 - \Pr(\text{case} \mid x_i)} = \beta_0 + \beta_1 x_i, \quad i = 0, 1, 2. \qquad (2)$$

Based upon (2), the three commonly assumed genetic models (recessive, additive, and dominant) correspond to the following assignments of the genotype score vector x: $x = (0, 0, 1)$, $x = (0, 0.5, 1)$, and $x = (0, 1, 1)$, respectively; we write $Z_R$, $Z_A$, and $Z_D$ for the corresponding statistics. Among the three CATT tests, $Z_A$, called CATTA, which is derived according to an additive model, is usually preferred over $Z_D$ and $Z_R$, as it does not rely on the assumption of a high-risk allele (assuming a two-sided test is performed), and thus this is the version of CATT that is generally used in CCGAS. The p-value for $Z_x$ can be obtained from the standard normal distribution.
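
As a rough illustration of the trend test described above, the following Python sketch computes the statistic from the genotype counts of Table 1 for a chosen score vector. It assumes the standard Cochran-Armitage form of the statistic; the function names `catt` and `two_sided_p` and the example counts are hypothetical and are not taken from the paper.

```python
import math

def catt(cases, controls, scores):
    """Cochran-Armitage trend statistic for counts ordered (gg, Gg, GG)."""
    r, s = sum(cases), sum(controls)
    n = r + s
    totals = [a + b for a, b in zip(cases, controls)]      # n_i = r_i + s_i
    num = sum(x * (s * a - r * b) for x, a, b in zip(scores, cases, controls))
    var = (n * sum(x * x * t for x, t in zip(scores, totals))
           - sum(x * t for x, t in zip(scores, totals)) ** 2)
    return math.sqrt(n) * num / math.sqrt(r * s * var)

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

SCORES = {"recessive": (0, 0, 1), "additive": (0, 0.5, 1), "dominant": (0, 1, 1)}

# Hypothetical genotype counts (gg, Gg, GG); not data from the paper.
cases, controls = [430, 450, 120], [490, 420, 90]
z = {model: catt(cases, controls, x) for model, x in SCORES.items()}
p_additive = two_sided_p(z["additive"])
```

Relabelling the alleles (reversing the order of the counts) only flips the sign of the additive statistic, which is why its two-sided p-value does not depend on which allele is called the risk allele.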

Table 1.  Notation for genotype frequencies.
           gg    Gg    GG    Total
Case       r0    r1    r2    r
Control    s0    s1    s2    s
Total      n0    n1    n2    n

If the true underlying disease model is known, the CATT test Zx is the most efficient. But in reality, the true disease model is unknown. For a more robust test that enjoys a good performance over a wide range of disease models, the following test statistic, called MAX, has been proposed (Freidlin et al., 2002; Sladek et al., 2007; Li et al., 2008a, 2008b):

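$$\mathrm{MAX} = \max\left( |Z_R|, |Z_A|, |Z_D| \right),$$

where $Z_R$, $Z_A$, and $Z_D$ are the CATT statistics under the recessive, additive, and dominant models.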

There are several ways to evaluate the significance level of MAX. For example, the multiple-integration procedure, which is available in R, can be used (Conneely & Boehnke, 2007), and it is computationally feasible in the context of GWAS. A more computationally challenging approach is through a permutation procedure (Sladek et al., 2007). Li et al. (2008a) derived an analytic upper bound that is reasonably accurate for small p-values.
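
The permutation approach mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in any of the cited studies: the helper names, the example counts, and the `(hits + 1) / (n_perm + 1)` convention for the estimated p-value are assumptions made here.

```python
import math
import random

SCORES = [(0, 0, 1), (0, 0.5, 1), (0, 1, 1)]   # recessive, additive, dominant

def catt(cases, controls, scores):
    """Cochran-Armitage trend statistic for counts ordered (gg, Gg, GG)."""
    r, s = sum(cases), sum(controls)
    n = r + s
    totals = [a + b for a, b in zip(cases, controls)]
    num = sum(x * (s * a - r * b) for x, a, b in zip(scores, cases, controls))
    var = (n * sum(x * x * t for x, t in zip(scores, totals))
           - sum(x * t for x, t in zip(scores, totals)) ** 2)
    return math.sqrt(n) * num / math.sqrt(r * s * var)

def max_stat(cases, controls):
    """MAX: the largest absolute trend statistic over the three models."""
    return max(abs(catt(cases, controls, x)) for x in SCORES)

def permutation_p(cases, controls, n_perm=1000, seed=0):
    """Estimate the p-value of MAX by shuffling case/control labels."""
    rng = random.Random(seed)
    r = sum(cases)
    # One entry per subject, holding that subject's genotype index (0, 1, 2).
    pooled = [g for g in range(3) for _ in range(cases[g] + controls[g])]
    observed = max_stat(cases, controls)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_cases = [0, 0, 0]
        for g in pooled[:r]:                    # first r subjects act as cases
            perm_cases[g] += 1
        perm_controls = [cases[g] + controls[g] - perm_cases[g] for g in range(3)]
        if max_stat(perm_cases, perm_controls) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical counts (gg, Gg, GG); not data from the paper.
p_value = permutation_p([80, 90, 30], [100, 85, 15], n_perm=500)
```

The smallest p-value this estimator can return is 1 / (n_perm + 1), which is why very small p-values require a very large number of permutations.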

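Another robust test is the 2-df χ² test. Using the notation in Table 1, we can define the Chi-2df test as

$$\chi^2 = \sum_{i=0}^{2} \frac{(r_i - r n_i / n)^2}{r n_i / n} + \sum_{i=0}^{2} \frac{(s_i - s n_i / n)^2}{s n_i / n} \qquad (3)$$

The significance level of the Chi-2df test can be evaluated through the χ² distribution with 2 df ($\chi^2_2$).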
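
For concreteness, here is a small Python sketch of the 2-df Pearson test on a 2 × 3 genotype table. The function names and the example counts are hypothetical; the p-value uses the fact that the survival function of a χ² distribution with 2 df is exp(−x/2).

```python
import math

def chi2_2df(cases, controls):
    """Pearson chi-square statistic for a 2 x 3 table of genotype counts."""
    r, s = sum(cases), sum(controls)
    n = r + s
    stat = 0.0
    for a, b in zip(cases, controls):
        total = a + b                               # n_i = r_i + s_i
        exp_case, exp_ctrl = r * total / n, s * total / n
        stat += (a - exp_case) ** 2 / exp_case + (b - exp_ctrl) ** 2 / exp_ctrl
    return stat

def chi2_2df_pvalue(stat):
    """P(chi-square with 2 df > stat), which equals exp(-stat / 2)."""
    return math.exp(-stat / 2)

# Hypothetical counts (gg, Gg, GG); not data from the paper.
stat = chi2_2df([430, 450, 120], [490, 420, 90])
p_value = chi2_2df_pvalue(stat)
```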

The Formula for Power Calculation

Under a given disease model, we denote the expected genotype frequencies of (gg, Gg, GG) for cases and controls as (p0, p1, p2) and (q0, q1, q2), respectively. The analytic power calculation for CATTA ($Z_A$) can be found in Freidlin et al. (2002) and Pfeiffer & Gail (2003).
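
A simplified version of such a power calculation can be sketched as follows. This is not the exact formula of the cited papers: it assumes that the trend statistic is approximately normal with unit variance under the alternative, with its mean obtained by substituting expected genotype counts into the statistic. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def genotype_freqs(f, r1, r2):
    """Case and control frequencies of (gg, Gg, GG) for MAF f and ORs r1, r2."""
    q = [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]     # controls, Hardy-Weinberg
    w = [q[0], q[1] * r1, q[2] * r2]
    return [v / sum(w) for v in w], q

def catt_mean(p, q, r, s, scores):
    """Approximate mean of the trend statistic under the alternative."""
    n = r + s
    pbar = [(r * a + s * b) / n for a, b in zip(p, q)]   # pooled frequencies
    num = sum(x * (a - b) for x, a, b in zip(scores, p, q))
    var = (sum(x * x * m for x, m in zip(scores, pbar))
           - sum(x * m for x, m in zip(scores, pbar)) ** 2)
    return math.sqrt(r * s / n) * num / math.sqrt(var)

def catt_power(p, q, r, s, scores=(0, 0.5, 1), alpha=0.05):
    """Two-sided power, assuming unit variance under the alternative."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    mu = catt_mean(p, q, r, s, scores)
    return nd.cdf(mu - z) + nd.cdf(-mu - z)

# Additive model with R1 = 1.2, R2 = 1.44, MAF 0.3, 1,000 cases and controls.
p, q = genotype_freqs(0.3, 1.2, 1.44)
power = catt_power(p, q, 1000, 1000)
```

Under these assumptions the sketch gives a power close to 0.77 for this setting, in line with the corresponding CATTA entry in Table 3.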

The power calculation for the Chi-2df test is also straightforward. At significance level $\alpha$, the rejection region is $\{\chi^2 > \chi^2_{2,1-\alpha}\}$, where $\chi^2_{2,1-\alpha}$ is the $(1-\alpha)$ quantile of the 2-df χ² distribution. Under a given disease model, the χ² test statistic defined by (3) in general follows a non-central 2-df χ² distribution (Edwards et al., 2005) with non-centrality parameter $\lambda = (rs/n) \sum_{i=0}^{2} (p_i - q_i)^2 / \bar{p}_i$, where $\bar{p}_i = (r p_i + s q_i)/n$, so the power for the Chi-2df test is

$$\Pr\left\{ \chi^2_2(\lambda) > \chi^2_{2,1-\alpha} \right\}.$$
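
The calculation can be sketched in Python using only the standard library. The sketch assumes a plug-in non-centrality parameter obtained by substituting expected genotype counts into the Pearson statistic, and evaluates the non-central χ² tail through its Poisson-mixture representation; the function names are hypothetical.

```python
import math

def genotype_freqs(f, r1, r2):
    """Case and control frequencies of (gg, Gg, GG) for MAF f and ORs r1, r2."""
    q = [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]     # controls, Hardy-Weinberg
    w = [q[0], q[1] * r1, q[2] * r2]
    return [v / sum(w) for v in w], q

def chi2_2df_power(p, q, r, s, alpha=0.05, terms=200):
    """Power of the 2-df Pearson test under case/control frequencies p, q."""
    n = r + s
    pbar = [(r * a + s * b) / n for a, b in zip(p, q)]
    # Assumed plug-in non-centrality parameter.
    lam = (r * s / n) * sum((a - b) ** 2 / m for a, b, m in zip(p, q, pbar))
    crit = -2 * math.log(alpha)         # (1 - alpha) quantile of chi-square, 2 df
    half_c, half_lam = crit / 2, lam / 2
    pois = math.exp(-half_lam)          # Poisson(lam / 2) weight for j = 0
    term = math.exp(-half_c)            # exp(-c/2) (c/2)^m / m! for m = 0
    central_sf, power = term, 0.0       # central_sf = P(chi-square, 2 df > crit)
    for j in range(terms):
        power += pois * central_sf      # central_sf = P(chi-square, 2+2j df > crit)
        term *= half_c / (j + 1)
        central_sf += term
        pois *= half_lam / (j + 1)
    return power                        # adequate for moderate lam (no underflow)

# Additive model with R1 = 1.2, R2 = 1.44, MAF 0.3, 1,000 cases and controls.
p, q = genotype_freqs(0.3, 1.2, 1.44)
power = chi2_2df_power(p, q, 1000, 1000)
```

For this setting the sketch returns a power of about 0.67, which agrees with the corresponding Chi-2df entry in Table 3.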

Finally, we derive the power calculation formula for MAX. We denote the rejection region of MAX at significance level $\alpha$ by $\{\mathrm{MAX} > c_\alpha\}$, where $c_\alpha$ satisfies $\Pr_{H_0}(\mathrm{MAX} > c_\alpha) = \alpha$. Since $(Z_R, Z_A, Z_D)$ follows a multivariate normal distribution under the null hypothesis with mean vector $(0, 0, 0)$ and covariance matrix $\Sigma_0$ given by Freidlin et al. (2002), we can obtain the threshold $c_\alpha$ by solving the following equation:

$$\Pr_{H_0}\left( |Z_R| \le c_\alpha,\ |Z_A| \le c_\alpha,\ |Z_D| \le c_\alpha \right) = 1 - \alpha.$$

This can be done with existing multivariate normal routines in R, such as those in the mvtnorm package.

Under the disease model with the expected genotype frequencies (p0, p1, p2) and (q0, q1, q2) in cases and controls, respectively, $(Z_R, Z_A, Z_D)$ asymptotically follows a multivariate normal distribution with mean vector $\mu_1 = (\mu_R, \mu_A, \mu_D)$ and covariance matrix $\Sigma_1$. Each component of the mean vector is given by

$$\mu_x = \left( \frac{rs}{n} \right)^{1/2} \frac{\sum_{i=0}^{2} x_i (p_i - q_i)}{\left[ \sum_{i=0}^{2} x_i^2 \bar{p}_i - \left( \sum_{i=0}^{2} x_i \bar{p}_i \right)^2 \right]^{1/2}} \qquad (4)$$

with the score vector x chosen as (0, 0, 1), (0, 0.5, 1), and (0, 1, 1) for $\mu_R$, $\mu_A$, and $\mu_D$, respectively, and with $\bar{p}_i = (r p_i + s q_i)/n$. The definition of the covariance matrix $\Sigma_1$ is more complicated and is presented in the Appendix, along with its detailed derivation.

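The power of the MAX test for the alternative hypothesis H1 can be written as

$$\Pr_{H_1}(\mathrm{MAX} > c_\alpha) = 1 - \Pr\left( |Z_R| \le c_\alpha,\ |Z_A| \le c_\alpha,\ |Z_D| \le c_\alpha \right) \qquad (5)$$

where $c_\alpha$ is the rejection threshold of MAX at significance level $\alpha$, $(Z_R, Z_A, Z_D)$ is multivariate normal with mean vector $\mu_1$ given by (4), and $\Sigma_1$ is the covariance matrix.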
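
A Monte Carlo approximation to this power can be sketched in Python as follows. The sketch does not use the covariance formula of the Appendix; instead it assumes a covariance derived from multinomial sampling with the denominators of the trend statistics held at their limiting values, and it estimates both the threshold and the power by sampling rather than by numerical integration. Because the additive score vector is the average of the recessive and dominant ones, the additive statistic is computed as a fixed linear combination of the other two. All names are hypothetical.

```python
import math
import random

XR, XA, XD = (0, 0, 1), (0, 0.5, 1), (0, 1, 1)    # recessive, additive, dominant

def genotype_freqs(f, r1, r2):
    """Case and control frequencies of (gg, Gg, GG) for MAF f and ORs r1, r2."""
    q = [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]
    w = [q[0], q[1] * r1, q[2] * r2]
    return [v / sum(w) for v in w], q

def cov_term(x, y, p):
    """sum x_i y_i p_i - (sum x_i p_i)(sum y_i p_i) for frequencies p."""
    exy = sum(a * b * w for a, b, w in zip(x, y, p))
    return exy - sum(a * w for a, w in zip(x, p)) * sum(b * w for b, w in zip(y, p))

def draw_max(p, q, r, s, pbar, rng, draws):
    """Sample MAX from the assumed normal approximation."""
    n = r + s
    vr, va, vd = (cov_term(x, x, pbar) for x in (XR, XA, XD))
    scale = math.sqrt(r * s / n)
    mu_r = scale * sum(x * (a - b) for x, a, b in zip(XR, p, q)) / math.sqrt(vr)
    mu_d = scale * sum(x * (a - b) for x, a, b in zip(XD, p, q)) / math.sqrt(vd)

    def mix(x, y):                       # assumed covariance of the numerators
        return (s * cov_term(x, y, p) + r * cov_term(x, y, q)) / n

    sd_r = math.sqrt(mix(XR, XR) / vr)
    slope = mix(XR, XD) / math.sqrt(vr * vd) / sd_r
    sd_resid = math.sqrt(mix(XD, XD) / vd - slope ** 2)
    w_r, w_d = 0.5 * math.sqrt(vr / va), 0.5 * math.sqrt(vd / va)
    out = []
    for _ in range(draws):
        e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
        z_r = mu_r + sd_r * e1
        z_d = mu_d + slope * e1 + sd_resid * e2
        out.append(max(abs(z_r), abs(w_r * z_r + w_d * z_d), abs(z_d)))
    return out

def max_power(p, q, r, s, alpha=0.05, draws=20000, seed=1):
    """Monte Carlo estimate of the power of MAX at level alpha."""
    n = r + s
    pbar = [(r * a + s * b) / n for a, b in zip(p, q)]
    rng = random.Random(seed)
    null = sorted(draw_max(pbar, pbar, r, s, pbar, rng, draws))
    c_alpha = null[min(int((1 - alpha) * draws), draws - 1)]   # empirical threshold
    alt = draw_max(p, q, r, s, pbar, rng, draws)
    return sum(m > c_alpha for m in alt) / draws

p, q = genotype_freqs(0.3, 1.2, 1.44)
power = max_power(p, q, 1000, 1000)
```

A numerical integration routine would replace the two sampling steps in a production calculation; sampling is used here only to keep the sketch short and dependency-free.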

Power Comparison

We assume that the case and control sample sizes are r = s = 1,000. We first conduct the comparison under a single-marker disease risk model. We let the minor allele frequency (MAF) f for a particular SNP in the study population be in the range of 5–50%. For MAF = f, we let the genotype frequencies of (gg, Gg, GG) in the control population be (q0, q1, q2) = ((1 − f)², 2f(1 − f), f²). This is reasonable for the study of a rare disease in a source population where Hardy-Weinberg equilibrium holds. Let the odds ratios (ORs) for having 1 copy and 2 copies of the high-risk allele be R1 and R2, respectively. We have R2 = R1² > 1 for an additive model (on the logit scale), R2 = R1 > 1 for a dominant model, and R2 > R1 = 1 for a recessive model. Given (R1, R2), the genotype frequencies of (gg, Gg, GG) in the case population are (p0, p1, p2) = (q0, q1R1, q2R2)/(q0 + q1R1 + q2R2).

In addition to the single-marker disease risk model, we compare the power of the three single-marker tests under the following 2-marker haplotype risk model. Suppose the disease risk is conferred by haplotypes consisting of two linked markers, with marker #1 having allele types B and b, and with marker #2 having allele types C and c. We designate the haplotype BC as the high-risk variant (corresponding to the high-risk allele in the single-marker risk model). As with the single-marker risk model, we can define the haplotype risk model as dominant, recessive, or additive. For example, if R1 and R2 denote the ORs for having one copy and two copies of the high-risk haplotype, respectively, we have R2 = R1² for the additive haplotype risk model. To simplify the power comparison setup, we let p1 be the BC haplotype frequency in the study population and assume the other three 2-marker haplotypes have the same haplotype frequency. We further assume the independence of the two haplotypes within a subject in the study population. In the Appendix, we provide the formula for calculation of (p0, p1, p2) and (q0, q1, q2), the genotype frequencies of (bb, Bb, BB) in the case and the control populations, respectively.

Figure 1 shows the power curves of the above-considered association tests under each of three commonly assumed single-marker risk models (additive, dominant, and recessive) at a significance level of 0.05. From Figure 1, we can see that MAX is always more powerful than Chi-2df, and in some cases it is associated with up to a 5% power increase. Comparing CATTA with MAX, we see that CATTA is slightly more powerful than MAX under the additive model, but in most cases the advantage is negligible. Under the recessive model, MAX (as well as the Chi-2df) is noticeably superior to CATTA. Under the dominant model, it is interesting to see that neither CATTA nor MAX dominates the other. CATTA is more favorable when the risk allele is relatively rare, while MAX becomes more attractive as the risk allele frequency gets larger.

Figure 1. Power of CATTA (red solid line), Chi-2df (blue dotted line), and MAX (black dash-dot line) under the significance level of 0.05. The number of cases and controls is equal to 1,000. R1 and R2 are the odds ratios of 1 copy and 2 copies of the high-risk alleles, respectively.

In addition to the three commonly used disease models, we also compared the power under a single-marker risk model with all possible combinations of two odds ratios R1 and R2, with each ranging from 1 to 1.5. Figure 2 summarizes the power comparison results. Clearly, there is no test that can outperform the others in all of the single-marker risk models considered. When the risk allele is relatively rare (say, MAF less than 0.2), all three tests have comparable power under various single-marker models, although CATTA outperforms the others in most of the (R1, R2) region. As the risk allele gets more common, CATTA becomes less powerful than both MAX and Chi-2df under the single-marker risk model when R1 > R2, although whether this kind of disease risk model is reasonable is debatable. MAX and Chi-2df have similar performance under all the considered choices of risk models and MAFs, with MAX performing more favorably when R1 < R2 and less favorably when R1 > R2.

Figure 2. Power of CATTA (red), Chi-2df (blue), and MAX (black) under the significance level of 0.05. The number of cases and controls is equal to 1,000. R1 and R2 are the odds ratios of 1 copy and 2 copies of the high-risk alleles, respectively. MAF is the minor allele frequency.

Power comparison results under the 2-marker haplotype risk models are given in Figure 3. Similar to what we observed in Figure 1, MAX appears to have the most robust performance. Although MAX is slightly less powerful than CATTA under the additive haplotype risk model, it has a noticeable power advantage over CATTA (more than 10% higher) under the dominant haplotype risk model. Also, from Figure 3, we notice that MAX is mostly better than the Chi-2df, although the percentage increase in power is limited.

Figure 3. Power of CATTA (red solid line), Chi-2df (blue dotted line), and MAX (black dash-dot line) under the significance level of 0.05. The number of cases and controls is equal to 1,000. R1 and R2 are odds ratios for one copy and two copies of the high-risk haplotype.

Discussion

Choosing an appropriate single-marker analysis is a critical step for the success of CCGAS. Because the true underlying disease risk model is uncertain, robust tests that perform well under a wide range of disease models are preferred over those that are sensitive to model assumptions. MAX was designed specifically to achieve such robustness. Compared with the commonly used CATTA and Chi-2df, the power of MAX is less well understood, even though its type I error rate has been thoroughly investigated by Li et al. (2008a). In this work, we derived the power calculation formula for MAX. Based on this formula, as well as those already existing for CATTA and Chi-2df, we conducted a comprehensive power comparison among the three tests. Not surprisingly, we found that each test has its own “sweet” spots. However, MAX appears to have the most robust performance when the underlying genetic model is recessive, additive, or dominant. Under various overdominant models, Chi-2df and MAX have very similar performance, with the power of Chi-2df slightly higher than that of MAX.

To assess the statistical significance of MAX, Sladek et al. (2007) used a permutation procedure to estimate the p-value of MAX for each SNP. To ensure a reliable estimate for any p-value falling below the level of 10⁻⁶, we would need to carry out more than 10⁷ permutation steps. This would be time-consuming and computationally challenging for a large-scale CCGAS. Alternatively, multiple integration (Conneely & Boehnke, 2007) and an efficient approximation method (Li et al., 2008a) have been proposed to evaluate the statistical significance of MAX. For the integration procedure (Conneely & Boehnke, 2007), it is possible to use the R package “mvtnorm”, which can be freely downloaded from http://cran.r-project.org/. The efficient approximation approach (Li et al., 2008a), which is based on a one-dimensional integral, is user-friendly and can be implemented in many programming languages and software packages, such as C, C++, R, Matlab, and SAS.

Since the MAX test did not perform as well as the chi-squared test under the overdominant model (R1 > R2), we also considered its extension, called MAX4, which is the maximum of four trend tests under four genetic models (recessive, additive, dominant, and overdominant) with scores (0,0,1), (0,0.5,1), (0,1,1), and (0,1,0), respectively. We conducted some simulation studies to compare the asymptotic power of MAX4 with that of CATTA, Chi-2df, and MAX. Table 3 shows the results. It can be seen from Table 3 that MAX4 has the best or near-best performance among the four tests under the overdominant model, but it is generally slightly less powerful than MAX under the other three models considered. The choice between MAX and MAX4 depends on how plausible the overdominant model is in real applications.

Table 3.  Power comparison under four genetic models with 1,000 cases and 1,000 controls at the significance level of 0.05.
                                        Minor allele frequency
Model / test          (R1, R2)       0.1     0.2     0.3     0.4     0.5
Recessive model       (1, 1.3)
  CATTA                              0.059   0.114   0.237   0.411   0.584
  Chi-2df                            0.080   0.177   0.334   0.506   0.645
  MAX                                0.080   0.18    0.345   0.526   0.667
  MAX4                               0.078   0.17    0.325   0.501   0.648
Additive model        (1.2, 1.44)
  CATTA                              0.433   0.658   0.766   0.812   0.820
  Chi-2df                            0.341   0.552   0.669   0.722   0.732
  MAX                                0.368   0.595   0.714   0.766   0.776
  MAX4                               0.362   0.574   0.684   0.733   0.742
Dominant model        (1.3, 1.3)
  CATTA                              0.643   0.767   0.750   0.656   0.503
  Chi-2df                            0.562   0.727   0.746   0.693   0.585
  MAX                                0.582   0.749   0.767   0.714   0.604
  MAX4                               0.591   0.752   0.763   0.702   0.584
Overdominant model    (1.3, 1.1)
  CATTA                              0.584   0.627   0.491   0.277   0.106
  Chi-2df                            0.544   0.695   0.710   0.668   0.600
  MAX                                0.542   0.669   0.650   0.560   0.443
  MAX4                               0.566   0.713   0.721   0.670   0.588

When there are covariates to be adjusted for, the corresponding MAX test can be derived from the logistic regression model. Li et al. (2008a) suggested a procedure for evaluating the p-value of the covariate-adjusted MAX test. Although the power comparison was conducted without any adjustment for covariates, we expect similar conclusions will still hold when covariate effects need to be adjusted for.

Acknowledgements

We would like to thank the editor and two anonymous reviewers for their insightful comments, which improved our presentation. We also thank B.J. Stone for her valuable help. K Yu, X Liang, and Q Li are supported by the Intramural Program of the National Institutes of Health. Q Li is supported in part by the Knowledge Innovation Program of the Chinese Academy of Sciences, Nos. 30465W0 and 30475V0.

References

  • Conneely, K. N. & Boehnke, M. (2007) So many correlated tests, so little time! Rapid adjustment of p-values for multiple correlated tests. Am J Hum Genet 81, 1158–1168.
  • Edwards, B. J., Haynes, C., Levenstien, M. A., Finch, S. J. & Gordon, D. (2005) Power and sample size calculations in the presence of phenotype errors for case/control genetic association studies. BMC Genet 8, 118.
  • Freidlin, B., Zheng, G., Li, Z. & Gastwirth, J. L. (2002) Trend tests for case-control studies of genetic markers: power, sample size and robustness. Hum Hered 53, 146–152.
  • Guedj, M., Della-Chiesa, E., Picard, F. & Nuel, G. (2006) Computing power in case-control association studies through the use of quadratic approximations: application to meta-statistics. Ann Hum Genet 71, 262–270.
  • Hunter, D. J., Kraft, P., Jacobs, K. B., Cox, D. G., Yeager, M., Hankinson, S. E., Wacholder, S., Wang, Z., Welch, R., Hutchinson, A., Wang, J., Yu, K., Chatterjee, N., Orr, N., Willett, W. C., Colditz, G. A., Ziegler, R. G., Berg, C. D., Buys, S. S., McCarty, C. A., Feigelson, H. S., Calle, E. E., Thun, M. J., Hayes, R. B., Tucker, M., Gerhard, D. S., Fraumeni, J. F. Jr., Hoover, R. N., Thomas, G. & Chanock, S. J. (2007) A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet 39, 870–874.
  • Li, Q., Zheng, G., Li, Z. & Yu, K. (2008a) Efficient approximation of p-value of the maximum of correlated tests, with applications to genome-wide association studies. Ann Hum Genet 72, 397–406.
  • Li, Q., Yu, K., Li, Z. & Zheng, G. (2008b) MAX-rank: a simple and robust genome-wide scan for case-control association studies. Hum Genet 123, 617–623.
  • Klein, R. J., Zeiss, C., Chew, E. Y., Tsai, J. Y., Sackler, R. S., Haynes, C., Henning, A. K., SanGiovanni, J. P., Mane, S. M., Mayne, S. T., Bracken, M. B., Ferris, F. L., Ott, J., Barnstable, C. & Hoh, J. (2005) Complement factor H polymorphism in age-related macular degeneration. Science 308, 385–389.
  • Kuo, C. L. & Feingold, E. (2008) What's the best statistic for a simple test of genetic association in a case-control study? Joint Statistical Meetings, Biometrics Section. August 2–7.
  • Pfeiffer, R. M. & Gail, M. H. (2003) Sample size calculations for population- and family-based case-control association studies on marker genotypes. Genet Epidemiol 25, 136–148.
  • Sladek, R., Rocheleau, G., Rung, J., Dina, C., Shen, L., Serre, D., Boutin, P., Vincent, D., Belisle, A., Hadjadj, S., Balkau, B., Heude, B., Charpentier, G., Hudson, T. J., Montpetit, A., Pshezhetsky, A. V., Prentki, M., Posner, B. I., Balding, D. J., Meyre, D., Polychronakos, C. & Froguel, P. (2007) A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445, 881–885.
  • The Wellcome Trust Case Control Consortium (2007) Genome-wide association study of 14,000 cases of seven common diseases and 3000 shared controls. Nature 447, 661–678.
  • Yeager, M., Orr, N., Hayes, R. B., Jacobs, K. B., Kraft, P., Wacholder, S., Minichiello, M. J., Fearnhead, P., Yu, K., Chatterjee, N., Wang, Z., Welch, R., Staats, B. J., II, Calle, E. E., Feigelson, H. S., Thun, M. J., Rodriguez, C., Albanes, D., Virtamo, J., Weinstein, S., Schumacher, F. R., Giovannucci, E., Willett, W. C., Cancel-Tassin, G., Cussenot, O., Valeri, A., Andriole, G. L., Gelmann, E. P., Tucker, M., Gerhard, D. S., Fraumeni, J. F., Hoover, R., Hunter, D. J., Chanock, S. J. & Thomas, G. (2007) Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet 39, 645–649.

Appendix

Appendix A: The Covariance between Zx and Zy.

Theorem. Let $x = (x_0, x_1, x_2)$ and $y = (y_0, y_1, y_2)$ be any two score vectors. The asymptotic covariance between $Z_x$ and $Z_y$ can be written as

  • image

where inline image, with inline image for i= 0, 1, 2,

  • image

Proof Let inline image and inline image. Then we have

  • image
Appendix B: Two-marker Joint Genotype Frequencies under the Haplotype Risk Model

Suppose the disease risk is conferred by haplotypes consisting of two linked markers, with marker #1 having allele types B and b and marker #2 having allele types C and c. We designate the haplotype BC as the high-risk variant. Denote the haplotype frequencies for BC, Bc, bC, and bc as p1, p2, p3, and p4, respectively. Let R1 and R2 denote the ORs for having one copy and two copies of the high-risk haplotype, respectively. We assume that HWE holds in the control group. Table 2 gives the joint genotype frequencies. From the table, we can see that the frequencies of BB, Bb, and bb in the control group at marker #1 are (p1 + p2)², 2(p1 + p2)(p3 + p4), and (p3 + p4)², respectively; the frequencies of BB, Bb, and bb in the case group at marker #1 are $(R_2 p_1^2 + 2 R_1 p_1 p_2 + p_2^2)/\Delta$, $2 (p_3 + p_4)(R_1 p_1 + p_2)/\Delta$, and $(p_3 + p_4)^2/\Delta$, respectively, where $\Delta = R_2 p_1^2 + 2 R_1 p_1 (1 - p_1) + (1 - p_1)^2$.

Table 2.  Two-marker joint genotype frequencies.
Genotype pair    Frequency
BBCC             p1²
BBCc             2p1p2
BBcc             p2²
BbCC             2p1p3
BbCc             2(p1p4 + p2p3)
Bbcc             2p2p4
bbCC             p3²
bbCc             2p3p4
bbcc             p4²
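
The marker #1 genotype frequencies under the haplotype risk model can also be obtained by enumerating haplotype pairs, as in the following Python sketch. It assumes, as in the Power Comparison section, that the three haplotypes other than BC are equally frequent and that the two haplotypes of a subject are independent, and it weights each pair by the odds ratio for its number of BC copies, by analogy with the single-marker model. The function name is hypothetical.

```python
def marker1_genotype_freqs(p_bc, r1, r2):
    """Case and control frequencies of (bb, Bb, BB) at marker #1."""
    rest = (1 - p_bc) / 3                     # Bc, bC, bc assumed equally frequent
    hap = {"BC": p_bc, "Bc": rest, "bC": rest, "bc": rest}
    controls = [0.0, 0.0, 0.0]
    cases = [0.0, 0.0, 0.0]
    for h1, f1 in hap.items():
        for h2, f2 in hap.items():
            copies_b = (h1[0] == "B") + (h2[0] == "B")     # B alleles at marker #1
            risk_copies = (h1 == "BC") + (h2 == "BC")      # copies of haplotype BC
            odds = (1.0, r1, r2)[risk_copies]
            controls[copies_b] += f1 * f2
            cases[copies_b] += f1 * f2 * odds
    total = sum(cases)
    return [c / total for c in cases], controls

# Additive haplotype model with R1 = 1.2, R2 = 1.44 and BC frequency 0.4.
p, q = marker1_genotype_freqs(0.4, 1.2, 1.44)
```

Enumerating ordered haplotype pairs handles the double heterozygote BbCc automatically, since only the BC/bc phase carries a copy of the high-risk haplotype.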