Let y = (y1, y2, . . . , yn)′ denote an n × 1 vector of a dichotomous dependent variable, and let xi = (xi1, . . . , xik)′ denote the k × 1 vector of covariates for patient i. A predictive regression model addresses the problem of estimating the binary variable yi, which indicates whether or not the individual belongs to a study group. In this case, yi = 1 if the ith individual suffers an NI, and yi = 0 otherwise. Assume that yi = 1 with probability pi and yi = 0 with probability 1 − pi. In this dichotomous model, xi contains the risk factors for the ith individual. The regression model is given by
$$ p_i = F(x_i'\beta), \qquad i = 1, \dots, n, $$
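where β = (β1, . . . , βk)′ is a k × 1 vector of regression coefficients, which represent the effect of each factor in the model, and F(·) is the link function. The likelihood function is given by
$$ L(\beta \mid y, x) = \prod_{i=1}^{n} \left[F(x_i'\beta)\right]^{y_i} \left[1 - F(x_i'\beta)\right]^{1-y_i}, \tag{1} $$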
where x = (x1, x2, . . . , xn)′.
Frequentist estimation of conventional logit models. For conventional logistic regression, the link function is F(z) = exp(z)/(1 + exp(z)). Observe that the corresponding density is symmetric about zero, so F(−z) = 1 − F(z) for all z.
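As a brief check of the symmetry property, assuming the standard logistic form F(z) = exp(z)/(1 + exp(z)), multiplying the numerator and denominator of F(−z) by exp(z) gives

$$ F(-z) = \frac{e^{-z}}{1 + e^{-z}} = \frac{1}{1 + e^{z}} = 1 - \frac{e^{z}}{1 + e^{z}} = 1 - F(z). $$

For a logistic error term ε and any real η, this identity allows P(ε ≥ −η) = 1 − F(−η) to be written simply as F(η).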
The regression coefficients, β, are usually estimated by numerically maximizing the likelihood function. The fitted model then provides the probability of infection for any individual. The usual procedure is to apply a cutoff to this probability: individuals whose estimated probability reaches the cutoff are classified as infected.
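The estimation and classification steps described above can be illustrated with a minimal sketch. It assumes plain gradient ascent on the log-likelihood (statistical packages typically use Newton-type methods instead), a toy data set invented purely for illustration, and a cutoff of 0.5; the function names `fit_logit` and `classify` are hypothetical helpers, not part of any package.

```python
import math

def logistic(z):
    """Standard logistic CDF, F(z) = exp(z) / (1 + exp(z)), in an overflow-safe form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_predictor(beta, x):
    """Inner product x' beta."""
    return sum(b * v for b, v in zip(beta, x))

def log_likelihood(beta, xs, ys):
    """Logit log-likelihood: sum_i [ y_i * eta_i - log(1 + exp(eta_i)) ]."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = linear_predictor(beta, x)
        total += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def fit_logit(xs, ys, step=0.1, iters=5000):
    """Maximize the log-likelihood by gradient ascent (illustrative choice of optimizer)."""
    k, n = len(xs[0]), len(xs)
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        for x, y in zip(xs, ys):
            resid = y - logistic(linear_predictor(beta, x))
            for j in range(k):
                grad[j] += resid * x[j]
        beta = [b + step * g / n for b, g in zip(beta, grad)]
    return beta

def classify(beta, x, cutoff=0.5):
    """Flag an individual as infected (1) when the fitted probability reaches the cutoff."""
    return 1 if logistic(linear_predictor(beta, x)) >= cutoff else 0

# Toy data (invented): first covariate is the intercept, second a single risk factor.
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, -1.0), (1.0, 0.0),
      (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.0, 2.0)]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

beta_hat = fit_logit(xs, ys)
print("estimated coefficients:", beta_hat)
print("classified as infected:", [classify(beta_hat, x) for x in xs])
```

In practice the cutoff need not be 0.5; when infections are rare, a lower cutoff may be preferable, and the choice trades sensitivity against specificity.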
Bayesian estimation of symmetric and asymmetric logit models. A Bayesian estimation of the logistic regression model is obtained by treating the β coefficients as random quantities (random nodes of the model). To facilitate comparison with frequentist estimation methods, we assume zero-centered, noninformative normal densities as prior distributions for the coefficients.
We also propose the use of an asymmetric link function, fitting the resulting model from a Bayesian point of view. The model has been used in other contexts ([16,17,20,21], among others), but has had little application in the health field. The asymmetric model is adequate for binary response data when one response is much more frequent than the other, as occurs in the case we examine in this study.
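Following Albert and Chib and Chen et al., we assume that the model uses a vector of latent variables w = (w1, w2, . . . , wn)′ in this form:
$$ y_i = \begin{cases} 0, & w_i < 0, \\ 1, & w_i \ge 0, \end{cases} \qquad w_i = x_i'\beta + \delta z_i + \varepsilon_i, \quad z_i \sim G, \quad \varepsilon_i \sim F. $$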
In this model, G is the cumulative distribution function of the half-standard normal distribution, given by
$$ G(z) = \int_0^{z} \sqrt{\frac{2}{\pi}}\, e^{-t^2/2}\, dt, \qquad z > 0, $$
F is the standard logistic cumulative distribution function, and zi and εi are assumed to be independent. The skewness in this regression model is introduced by the term δzi, where δ ∈ (−∞, ∞) is the skewness parameter. If δ < 0, the probability that yi = 0 increases, whereas if δ > 0, the probability that yi = 1, i.e., the infection probability of the ith individual, increases. If δ = 0, the regression model reduces to the standard logit.
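The effect of the skewness parameter can be illustrated by simulating the latent-variable representation directly. The sketch below assumes wi = ηi + δzi + εi with ηi = xi′β, zi half-standard normal, εi standard logistic, and the threshold convention yi = 1 when wi ≥ 0; the helper names and the chosen values of η and δ are hypothetical.

```python
import math
import random

def draw_y(eta, delta, rng):
    """One draw of y from the latent-variable model; eta is the linear predictor x' beta."""
    z = abs(rng.gauss(0.0, 1.0))        # half-standard normal variate, z >= 0
    u = rng.random()
    while u == 0.0:                     # avoid log(0) in the inverse-CDF step
        u = rng.random()
    eps = math.log(u / (1.0 - u))       # standard logistic variate via inverse CDF
    w = eta + delta * z + eps           # latent variable
    return 1 if w >= 0 else 0

def simulated_rate(eta, delta, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(y = 1) for a given eta and delta."""
    rng = random.Random(seed)
    return sum(draw_y(eta, delta, rng) for _ in range(n_draws)) / n_draws

# Compare negative, zero, and positive skewness at the same linear predictor.
for delta in (-2.0, 0.0, 2.0):
    print("delta =", delta, "estimated P(y = 1):", simulated_rate(-2.0, delta))
```

Because zi is nonnegative, a positive δ can only shift the latent variable upward and a negative δ can only shift it downward, which is why the estimated probability of yi = 1 is ordered in δ.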
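The likelihood function in Eq. 1 can be rewritten as
$$ L(\beta, \delta \mid y, x) = \prod_{i=1}^{n} \int_0^{\infty} \left[F(x_i'\beta + \delta z_i)\right]^{y_i} \left[1 - F(x_i'\beta + \delta z_i)\right]^{1-y_i} g(z_i)\, dz_i, \tag{2} $$
where g is the density of the half-standard normal distribution G.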
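Under the latent-variable representation, the probability that yi = 1 is the integral of F(xi′β + δz) against the half-standard normal density over z > 0, and the likelihood is a product of such terms. A minimal numerical sketch follows, assuming composite Simpson quadrature on a truncated range [0, 8] (the tail beyond 8 is negligible for the half-normal density); the function names, the truncation point, and the grid size are illustrative assumptions.

```python
import math

def logistic(z):
    """Standard logistic CDF in an overflow-safe form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def half_normal_pdf(z):
    """Density of the half-standard normal distribution on z >= 0."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * z * z)

def skew_logit_prob(eta, delta, upper=8.0, n=200):
    """P(y = 1) = integral_0^inf F(eta + delta*z) g(z) dz, by composite Simpson (n must be even)."""
    h = upper / n
    total = 0.0
    for j in range(n + 1):
        z = j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * logistic(eta + delta * z) * half_normal_pdf(z)
    return total * h / 3.0

def skew_log_likelihood(beta, delta, xs, ys):
    """Log-likelihood of the asymmetric logit model, one integral per observation."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta = sum(b * v for b, v in zip(beta, x))
        p = skew_logit_prob(eta, delta)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

# With delta = 0 the integral collapses to the ordinary logit probability.
print(skew_logit_prob(-1.0, 0.0), logistic(-1.0))
```

Setting δ = 0 provides a convenient sanity check, since the integrand no longer depends on z and the result should agree with the standard logit probability.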
We assume that the prior distribution of the coefficients is normal, i.e., βj ∼ N(0, 10^10), ∀j = 1, . . . , k, and δ ∼ N(0, 10^10). These noninformative prior distributions with a very large variance reflect the absence of prior knowledge about the parameters of interest, and they facilitate comparison with classical models.
Combining this prior structure with the likelihood in Eq. 2, we obtain the posterior distribution of the parameters (β, δ):
$$ \pi(\beta, \delta \mid y, x) \propto L(\beta, \delta \mid y, x)\, \pi(\beta, \delta), $$
where π(β, δ) is the prior distribution of (β, δ).
We can sample (β, δ) from this posterior distribution using the WinBUGS package (Windows Bayesian inference Using Gibbs Sampling, developed jointly by the MRC Biostatistics Unit [University of Cambridge, Cambridge, UK] and the Imperial College School of Medicine at St. Mary's, London), which is based on Gibbs sampling, a Markov chain Monte Carlo (MCMC) method (see Carlin and Polson and Gilks et al. for further details).
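To illustrate what sampling from the posterior involves, the following sketch uses a random-walk Metropolis sampler. This is a deliberately substituted technique: it is not the Gibbs sampler implemented in WinBUGS and not the code used in this study. It assumes independent N(0, 10^10) priors on all parameters, evaluates the likelihood integral by Simpson quadrature, and runs on toy data invented for illustration; the step size and chain length are arbitrary.

```python
import math
import random

PRIOR_VAR = 1e10  # variance of the noninformative normal priors

def logistic(z):
    """Standard logistic CDF in an overflow-safe form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def skew_logit_prob(eta, delta, upper=8.0, n=60):
    """P(y = 1) under the asymmetric logit, by composite Simpson quadrature (n even)."""
    h = upper / n
    total = 0.0
    for j in range(n + 1):
        z = j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        g = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * z * z)  # half-normal density
        total += weight * logistic(eta + delta * z) * g
    return total * h / 3.0

def log_posterior(theta, xs, ys):
    """Unnormalized log posterior; theta = [beta_1, ..., beta_k, delta]."""
    beta, delta = theta[:-1], theta[-1]
    lp = -sum(t * t for t in theta) / (2.0 * PRIOR_VAR)    # log prior, up to a constant
    for x, y in zip(xs, ys):
        eta = sum(b * v for b, v in zip(beta, x))
        p = min(max(skew_logit_prob(eta, delta), 1e-12), 1.0 - 1e-12)
        lp += math.log(p) if y == 1 else math.log(1.0 - p)
    return lp

def metropolis(xs, ys, n_iter=300, step=0.3, seed=1):
    """Random-walk Metropolis over (beta, delta); returns the list of visited states."""
    rng = random.Random(seed)
    theta = [0.0] * (len(xs[0]) + 1)
    current = log_posterior(theta, xs, ys)
    draws = []
    for _ in range(n_iter):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        candidate = log_posterior(proposal, xs, ys)
        # Accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        draws.append(list(theta))
    return draws

# Toy data (invented): intercept plus one risk factor.
xs = [(1.0, -2.0), (1.0, -1.0), (1.0, -1.0), (1.0, 0.0),
      (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.0, 2.0)]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
draws = metropolis(xs, ys)
print("number of draws:", len(draws))
```

A real analysis would run far longer chains, discard a burn-in period, and check convergence before summarizing the draws; with so few observations and such diffuse priors, the toy chain above is only meant to show the mechanics.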
One aim of our study is to use logistic regressions to make predictions. In Bayesian theory, predictions of future observables are based on the predictive distribution. The predictive distribution of an unobserved outcome yp, given a new set of covariates xp = (xp1, . . . , xpk), is defined as
$$ f(y_p \mid x_p, y, x) = \int f(y_p \mid x_p, \beta, \delta)\, \pi(\beta, \delta \mid y, x)\, d\beta\, d\delta. $$
The predictive distribution can also be simulated using MCMC techniques with WinBUGS. The WinBUGS code is included in the Supporting Information Appendix for this article.
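Given posterior draws of (β, δ), the predictive probability of infection for a new individual can be approximated by averaging the model probability over the draws. The sketch below is an illustration in Python rather than the WinBUGS code of the Appendix; the draws listed are placeholder values invented for the example, not results from this study, and `predictive_prob` is a hypothetical helper.

```python
import math

def logistic(z):
    """Standard logistic CDF in an overflow-safe form."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def skew_logit_prob(eta, delta, upper=8.0, n=200):
    """P(y = 1 | eta, delta) under the asymmetric logit, by composite Simpson quadrature."""
    h = upper / n
    total = 0.0
    for j in range(n + 1):
        z = j * h
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        g = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * z * z)  # half-normal density
        total += weight * logistic(eta + delta * z) * g
    return total * h / 3.0

def predictive_prob(x_new, draws):
    """Monte Carlo estimate of P(y_p = 1 | x_p, data): average over posterior draws.

    Each draw is a list [beta_1, ..., beta_k, delta]."""
    total = 0.0
    for draw in draws:
        beta, delta = draw[:-1], draw[-1]
        eta = sum(b * v for b, v in zip(beta, x_new))
        total += skew_logit_prob(eta, delta)
    return total / len(draws)

# Placeholder posterior draws (invented values): [intercept, slope, delta].
example_draws = [[-2.0, 0.8, 0.5], [-1.8, 0.7, 0.3], [-2.2, 0.9, 0.6]]
x_p = (1.0, 1.5)  # new individual: intercept term and one risk factor
print("predictive probability of infection:", predictive_prob(x_p, example_draws))
```

Averaging over draws, rather than plugging in a single point estimate, carries the posterior uncertainty about (β, δ) into the prediction.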