## Introduction

To test for association between genotype and phenotype in a case-control study, one can either employ the exact test or fit a logistic regression model and test whether the coefficient for genotype is zero. Three asymptotic tests are commonly employed: the likelihood ratio (LR) test, Wald's test and the score test. The LR test compares the null and alternative hypotheses on an equal basis; Wald's test starts at the alternative and considers movement towards the null; and the score test starts at the null and asks whether movement towards the alternative would be an improvement. The three tests have equivalent asymptotic power against local alternatives (Cox & Hinkley, 1974). Computationally, the LR test is the most demanding because it requires both the restricted and the unrestricted parameter estimates, whereas Wald's test uses only the unrestricted estimates and the score test only the restricted estimates. Besides hypothesis testing, investigators are also interested in estimating a variant's odds ratio $e^{\hat{\beta}}$, where $\hat{\beta}$ is the unrestricted maximum likelihood estimate (MLE) of the coefficient for genotype. Computer programs often report $\hat{\beta}$ and its estimated variance $\widehat{\mathrm{Var}}(\hat{\beta})$, which makes it convenient to compute Wald's test statistic for the null hypothesis $\beta = \beta_{\mathrm{null}}$. Thus, Wald's test is often the default option in a genome-wide scan, particularly when covariates are present; one example is the *--logistic* command in PLINK (Purcell et al., 2007).
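For reference, the three asymptotic statistics can be written in their standard textbook forms for a scalar parameter. The notation here is an assumption introduced for illustration and does not come from the text: $\ell(\beta)$ is the log-likelihood, $U(\beta) = \partial \ell / \partial \beta$ the score function, and $I(\beta)$ the Fisher information, with nuisance parameters such as the intercept and covariate effects suppressed.

$$
W = \frac{(\hat{\beta} - \beta_{\mathrm{null}})^2}{\widehat{\mathrm{Var}}(\hat{\beta})},
\qquad
\mathrm{LR} = 2\left[\ell(\hat{\beta}) - \ell(\beta_{\mathrm{null}})\right],
\qquad
S = \frac{U(\beta_{\mathrm{null}})^2}{I(\beta_{\mathrm{null}})}
$$

Under the null hypothesis each statistic is asymptotically chi-square distributed with one degree of freedom. Written this way, it is visible that $W$ needs only the unrestricted fit, $S$ only the restricted fit, and $\mathrm{LR}$ both.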

However, we notice an anomalous behaviour of Wald's test: if a variant is present mainly in cases or mainly in controls, which implies a large effect size under the alternative hypothesis, Wald's test can yield a non-significant *P*-value, whereas the other two asymptotic tests yield significant *P*-values. This phenomenon may have been observed by many researchers, but its theoretical explanation is less widely known: in a binary logit model, as the distance between the parameter estimate and the null value increases, Wald's test statistic decreases to zero and the power of the test diminishes to the test size (Hauck & Donner, 1977). This aberrant behaviour is particularly pertinent to low-frequency variants. Suppose a causal variant with high penetrance is present at low frequency in the cases and nearly absent from the controls; the power of Wald's test will be minimal even though the estimated effect size of the variant is large. Were Wald's test employed, the causal variant would not show a statistically significant association with the disease, and one could miss the association if one screened only a list of *P*-values. Thus, the alternative tests (the LR test, the score test and the exact test) should be considered in this situation. In this paper, we compare the four tests in terms of both validity when a variant is at low frequency and power when a low-frequency variant is highly penetrant.
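The non-monotone behaviour of Wald's statistic described above can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from the paper: genotype is coded as binary carrier status, there are no covariates, and the counts are hypothetical. Under those assumptions the three statistics have closed forms for a 2x2 table (the log odds ratio and its Woolf variance for Wald, the G-statistic for LR, and Pearson's chi-square without continuity correction for the score test). The function names `three_tests` and `chi2_sf_1df` are illustrative only.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def three_tests(a, b, c, d):
    """Wald, LR and score statistics for a 2x2 carrier-by-status table.

    a, b: carriers and non-carriers among cases
    c, d: carriers and non-carriers among controls
    All four counts must be positive (the MLE is infinite otherwise).
    """
    n = a + b + c + d

    # Wald: squared log odds ratio over its estimated (Woolf) variance.
    beta_hat = math.log((a * d) / (b * c))
    var_hat = 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d
    wald = beta_hat ** 2 / var_hat

    # LR: G-statistic, 2 * sum(observed * log(observed / expected)).
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    lr = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            lr += 2.0 * observed[i][j] * math.log(observed[i][j] / expected)

    # Score: Pearson chi-square without continuity correction.
    score = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])

    return {"beta_hat": beta_hat, "wald": wald, "lr": lr, "score": score}


# Hypothetical data: 1000 cases with 20 carriers, 1000 controls with
# progressively fewer carriers, so the estimated effect keeps growing.
for control_carriers in (10, 5, 2, 1):
    res = three_tests(20, 980, control_carriers, 1000 - control_carriers)
    print(
        f"control carriers={control_carriers:2d}  "
        f"beta_hat={res['beta_hat']:.3f}  "
        f"Wald P={chi2_sf_1df(res['wald']):.2e}  "
        f"LR P={chi2_sf_1df(res['lr']):.2e}  "
        f"score P={chi2_sf_1df(res['score']):.2e}"
    )
```

In this sketch, when the number of control carriers drops from 2 to 1 the estimated log odds ratio continues to increase, yet Wald's statistic decreases, while the LR and score statistics both increase. With zero carriers in the controls the estimate is infinite and these closed forms for Wald's test no longer apply.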