Current forecast of HIV/AIDS using Bayesian inference

In this study, we address the problem of fitting a mathematical model to the human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) data. We present a quantitative analysis of the formulated mathematical model by using Bayesian inference. The mathematical model consists of a suitable simple system of ordinary differential equations. We perform a local and global sensitivity analysis of parameters to determine which parameters of the model are the most relevant for the transmission and prevalence of the disease. We formulate the inverse problem associated to the parameter estimation of the model, and solve it using Bayesian statistics. Then, we estimate the basic reproductive number of the disease based on the estimation of the parameters of the model and its comparison with one is tested through hypothesis tests. The data set consist of HIV and AIDS data from Luxembourg, Czech Republic, Japan, Croatia, United Kingdom, and Mexico.


| INTRODUCTION
HIV (human immunodeficiency virus) is a virus that attacks the cells that help the body to fight infections, making a person more vulnerable to other infections and diseases. It is spread by contact with certain body fluids of a person with HIV, most commonly during unprotected sex, or through sharing injection drug equipment. If left untreated, HIV can lead to the disease AIDS (acquired immunodeficiency syndrome). The human body cannot get rid of HIV and no effective HIV cure exists. So, once a person have HIV, it has it for life. However, by taking HIV medicine (called antiretroviral therapy or ART), people with HIV can live long and healthy lives and prevent transmitting HIV to their sexual partners. In addition, there are effective methods to prevent getting HIV through sex or drug use, including pre-exposure prophylaxis (PrEP) and postexposure prophylaxis (PEP) (Barnett & Whiteside, 2002). The main characteristic of the HIV is its large incubation period, several years on average.
First identified in 1981, HIV is the cause of one of humanity's deadliest and most persistent epidemics. About 38 million people worldwide are estimated to be living with HIV and around 8 hundred thousand people have died of AIDS in 2019 or 25 million have died from AIDS since the first cases were identified in 1981. About 2.5 million children under the age of 15 years are living with HIV and more than 12 million have been orphaned by AIDS by 2004. An increase number of individuals infected with HIV are now becoming ill and will die in the absence of intervention strategies in African countries. The progression from HIV infection to AIDS occurs approximately over one or two decades. The regional analysis of the Joint United Nations Programme on HIV/AIDS (UN-AIDS) shows that the main concentration of AIDS cases are in developing countries, 68% of persons living with HIV are in Sub-Saharan Africa in 2010. In second place is Eastern and Southeastern Asia, with an estimated number of 4 millions. Latin America, Eastern Europe and Central Asia are in the third place. The use of antiretroviral drug have decreased the number of deaths related to AIDS.
In this study, we propose a variation of a SI-type compartmental model including antiretroviral treatment applied to Luxembourg, Czech Republic, Japan, Croatia, United Kingdom, and Mexico. We determine the basic reproductive number and it is compared with one thorough hypothesis test. We also do a sensitivity analysis of parameters and their estimation through Bayesian inference.
The remainder of this paper is organized as follows: Section 2 describes the mathematical formulation and the parameter description of the proposed model. In Section 3 we determine the sensitivity indices for some parameters of the model, whereas in Section 6 we do a Bayesian sensitivity analysis. Section 4 describes the Bayesian inference framework to predict the dynamics of the spread of HIV/AIDS. In Section 5 we contrast the null hypothesis for R 0 in each country. Each section presents the mathematical framework and the numerical results. Discussion and conclusions are presented in Section 7.
To describe accurately the spread of the HIV/AIDS disease, a model that includes a combination of an age-chronological and an age-infection structures must be considered, see Figure 1. An agechronological structure is due to there exists evidence that the HIV cases are concentrated in the age of 15-50 years because in general, the individual's sexual activity is higher in this life stage. This fact, has been considered in some mathematical models (see e.g., Inaba, 2003;Luboobi, 1994;Okongo et al., 2013;Rong et al., 2007;Saxena & Hooda, 2015;Wang et al., 2015). An age-infection structure is because the HIV/AIDS disease has at least three main stages (see Figure 2; Hollingsworth et al., 2008;Perelson & Nelson, 1999). These three stages are due to the viral load in the individual changes during the period of time. At the beginning, the viral load starts to increase in the first 6 weeks until achieves its maximum, then decrease until the 10 week where it stays steadily for many years, then it increases again. Statistical data show that the average incubation period of the disease from HIV to AIDS is 5-12 years (Cai et al., 2008;Granich et al., 2009). Therefore, mathematical models have been proposed to incorporate these three stages of infection (Baryarama & Mugisha, 2007;Granich et al., 2009), but also, some delayed models have been studied. On the other hand, the compartmental structures "Susceptible-Infectious-Removed" (SIR) and "Susceptible-Exposed-Infectious-Removed" (SEIR) have been widely used to study the HIV/AIDS transmission dynamics (Mahato et al., 2014). Besides, some mathematical models have incorporated the antiretroviral treatment of HIV/AIDS patients (see e.g., Baggaley et al., 2005;Cai et al., 2009;Granich et al., 2009;Okongo et al., 2013;Rong et al., 2007;Waziri et al., 2012). According to Brauer and others (Brauer et al., 2008;Ejigu, 2010;Yan & Lv, 2016), more robust models of the spread of HIV/AIDS should be proposed using partial differential equations (PDEs).
In this study, it is assumed that the disease is only transmitted horizontally (the horizontal transmission can occur either by sexual contact or by indirect contact through drug syringe exchange). Additionally, inspired in a full data-driven approach, we have tried to use all the reliable data available to forecast the spread of the HIV/AIDS disease, keeping in mind that a simple model may fit better than a complex one (Roda et al., 2020). Thus, we formulate a single-stage SIA-type mathematical model similar to the proposed in Arazoza and Lounes (2002), Biswas and Pal (2017), Cai et al. (2008), Eduafo et al. (2015). In our model, the total population N , is divided into the following three epidemiological classes: susceptible individuals S, infective individuals who are the infected and infectious individuals that have not yet developed AIDS symptoms I , and AIDS patients who are infected and with AIDS symptoms A. Here, β denotes the disease transmission rate; μ and ν denotes the rates of birth and death, respectively; κ the death rate caused by the AIDS; σ denotes the transfer rate between the compartment A to I ; ξ 1 denotes the rate of infection due to drug syringe exchange; and ξ 2 denotes the rate at the AIDS patients (class A) obtain antiretroviral treatment. With the previous hypotheses, our model is represented through the following ordinary differential equations (ODEs) equations: Since the vectorial field defined by the right-hand side of the system (1) is continuously differentiable (polynomial formulation), the existence and uniqueness of solutions is guaranteed. It is easy to verify that the feasible region of (1) is where + 3 denotes the positive octant of 3 . On the other hand, the system (1) has two equilibrium points, P 0 (the disease-free equilibrium point) and P * (the endemic equilibrium point) which are expressed as follows: To understand the spread of a disease, it is important to know certain epidemiological thresholds, the basic reproductive number (denoted as R 0 ) is one of them. R 0 , is the expected number of secondary infectious cases generated by an infected person in an entirely susceptible population. This threshold quantifies how many susceptible persons are on average infected by one infected person. The estimation of R 0 can be done in two ways: either by inferring it from observed cases, or by following infection chains step by step. For our mathematical model, R 0 is the following: The prevalence of HIV, that is, the accumulated cases of the HIV incidence between two times of observation t t , and the prevalence of AIDS is given by where K 1 and K 2 account for the undetermined true number of HIV-AIDS cases since some reported HIV-AIDS cases occur once the infected individuals start presenting symptoms (Keeling & Rohani, 2008, p. 51;. Table 1 shows the parameters of the SIA model (1) and the Bayesian inference framework. We estimate the parameters β σ K , , 1 , and K 2 for the countries: Luxembourg, Czech Republic, Japan, and Croatia, and the parameters β σ ξ ξ K , , , , 1 2 1 , and K 2 for the countries: United Kingdom and Mexico. The natural birth μ and death rate ν, though they are assumed known fixed values, they are different for the six countries that we are analyzing in this manuscript. In determining how to reduce the number of susceptible population due to a disease outbreak, it is necessary to know the relative importance of the different factors responsible for its transmission and its prevalence. The initial transmission of a disease is directly related to R 0 , and its prevalence is directly related to the endemic equilibrium point, that for our case is given by P S I A * = ( * , * , * ) defined on the second equation of (2), more specifically to the magnitude of I * , since it represents the people who may be clinically ill. In this section, we compute the sensitivity indices of R 0 and I * with respect to each parameter involved on the model (1). Those indices allow us to measure the relative change in a variable when a parameter is changing. Sensitivity analysis is commonly used to determine the robustness of a model predictions to the parameter values, since there are usually errors in the data collection and the presumed parameter values (Chitnis et al., 2008;Zi, 2011).
The normalized forward sensitivity index of a variable to a parameter, is defined as the ratio of the relative change in the variable to the relative change in the parameter. When the variable is a differentiable function of the parameter, the sensitivity index may be alternatively defined using partial derivatives (Chitnis et al., 2008;Zi, 2011) as follows.
Definition 1. The normalized forward sensitivity index of a variable, Q, that depends differentiably on a parameter, p, is defined as Given that we have explicit formulas for R 0 given on (3) and I * given on (2), we derive an analytical expression for the sensitivity indices of both, R 0 and I * , to each of the seven different parameters described in Table 1. Those indices are given in Table 2.

| Global sensitivity indices of the total incidence of HIV
A global sensitivity analysis is done to identify relevant and noninfluential parameters when the values of the parameters are not specified but they vary over the entire range of input values. One objective of the global sensitivity analysis is to quantify how uncertainties in the T A B L E 2 Local sensitivity indices of R 0 and I * with respect each parameter involved on the model (1) model outputs can be apportioned to uncertainties in the model inputs that are considered in the parameter space. The global sensitivity analysis techniques can be broadly categorized as regression, variance, or screening-based methods. Variance-based indices are advantageous over regression and correlations-based indices since they do not require linearity or monotonicity. For this reason, they are sometimes referred to as model-free methods. In this section, we used variance-based indices, more specifically Sobol indices. We also briefly describe formulas for Sobol indices for uniform densities (Smith, 2013). Let us consider the scalar-valued and nonlinear model We initially assume that the random variables are independent and uniformly distributed on [0, 1] so that We consider the second-order High-Dimensional Model Representation (HDMR) or the Sobol expansion subject to the condition which ensures that the functions f i are orthogonal. The expansion terms: zeroth-, first-, and second-order terms have variance interpretations as follows: The total variance D of the response Y is given by The total variance can be expressed as so, by the definition, they satisfy The terms S i are often termed the importance measures or first-order sensitivity indices, and large values of S i indicate parameters that strongly influence the response variance. Similarly, S ij account for the influence of interaction terms. Because the number of first-and second-order Sobol indices is p + , their analysis quickly becomes untenable for large parameter dimensions. This motivates the consideration of total sensitivity indices which quantify the total effect of the parameter Q i on the response Y . Using the equation (8), it follows that Similarly, one can show that which yields a variance interpretation for S ij . Finally, the total sensitivity index has the interpretation Next, we describe the Sensitivity algorithm proposed in Smith (2013). The computation of S i given by (9)  for fixed q i and repeats the procedure M times to approximate the variance, a total of M 2 evaluations will be required to evaluate a single sensitivity index. For large parameter dimensions p, this approach is prohibitive. The following algorithm of Saltelli, reduces the number of required function where q i j and q i j are quasi-random numbers drawn from the respective densities.
which are identical to B with the exception that the ith column is taken from A. Compute M × 1 vectors of model outputs where the mean is approximated by The intuition for the algorithm is the following. In the scalar product y y A T C i , the response computed from values in A is multiplied by values for which all parameters except q i have been resampled. If q i is influential, then large (or small) values of y A will be correspondingly multiplied by large (or small) values of y C i , yielding a large value of S i . If q i is not influential, large and small values of y A and y C i will occur more randomly and S i will be small. Table 3 shows the global sensitivity indices of the total incidence of HIV (Φ T ), with respect to the model parameters of (1).

| BAYESIAN INFERENCE
For the parameter estimation of the countries Croatia, Czech Republic and Luxembourg, we used the yearly updated data UNAIDS. For the parameter estimation of Japan, Mexico, and United Kingdom, we used the yearly updated data National Institute of Infectious Diseases, Direccion General de Epidemiologia and HIV in the United Kingdom, respectively. From the mathematical point of view, the parameter estimation of ODEs system is regarded as an inverse problem. Fitting curve or estimation the parameters of a model is considered an inverse problem. Typically, an optimization method, for example, the Landweber in Prieto and Dorn (2016), Prieto and Ibarguen-Mondragon (2019), Smirnova et al. (2016), Alavez-Ramirez (2007), and Capistrán et al. (2009), or faster methods such as the Levenberg-Marquardt or Conjugate Gradient methods, and regularization techniques, such as Tikhonov, Sparsity or Total Variation, are used to solve this inverse problem. In this manuscript, we use Bayesian inference to solve the inverse problem since it is a tool which combines uncertainty propagation of measured data with available prior information of the parameters of the model, also, it is numerically more stable approach than classical methods, since classical methods rely on the starting parameter point must be relatively close to the true one, otherwise the solution obtained corresponds to a local minimum. Moreover, classical methods give only a point estimate solution instead of a band of the solutions using Bayesian inference, that is, in a Bayesian framework, one works with credible intervals. Some references of works using Bayesian inference are available in AcuñaZegarra et al. (2020) Note: S i and S Ti , represents the Sobol index and the total sensitivity index with respect to the ith parameter given by (9), and given by (10), respectively.
Bocharov (2018), Stojanović et al. (2019). Using Bayesian inference, the solutions of the inverse problem are obtained from the posterior distribution of the parameters of interest, and a solution of interest is obtained using the Maximum a Posterior, called MAP. This MAP gives the parameter value for which the posterior density is maximal. Also, one can compute the median and quantiles from this posterior sample. As already mentioned, the Bayesian framework provides a natural and formal way to quantify the uncertainty of the quantities of interest.
, that is, k denotes the number of state variables, here k = 3, and the parameters θ , that is, m denotes the dimension number of parameters to estimate, here m = 4. Thus, we can write the model (1) as the following Cauchy problem: Problem (13), defines a mapping θ , where + denotes the nonnegative real numbers. We assume that Φ has a Fréchet derivative, that is, the mapping , is injective, thus the forward problem (13) has a unique solution x for a given θ. The Fréchet derivative of Φ, denoted by Φ ′ , results to be the usual derivative for the system (1) since the domain and range of Φ ′ are finite dimensional spaces. Usually, not all states of the system can actually be directed measured, that is, the data consists of measurements of some state variables at a discrete set of points t t , …, n 1 , for example, in epidemiology, these data consist of number of cases of confirmed infected people. This defines a linear observation mapping from state variables to data such that x θ = Φ( ) holds, with y obs is the data which has error measurements of size η. Problem (13) may be solved using numerical tools to deal with a nonlinear least-squares problem or the Landweber method or the combination of both. As we mention before, we implement Bayesian inference to solve the inverse problem (14) in this manuscript. From the Bayesian perspective, all state variables x and parameters θ are considered as random variables and the data y obs is fixed. For random variables x θ , , the joint probability distribution density of data x and parameters θ, denoted by π θ x ( , ), is given by π θ x π x θ π θ ( , ) = ( ) ( )  , where π x θ ( )  is the conditional probability distribution, also called the likelihood function, and π θ ( ) is the prior distribution which involves the prior information of parameters θ. Given y I A = (˜,˜) obs , which correspond to the diagnosed infected HIV cases and the notified sick AIDS cases, respectively, the conditional probability distribution π θ y ( ) obs  , called the posterior distribution of θ is given by the Bayes' theorem: If additive noise is assumed: where η is the noise due to discretization, model error and measurement error. If the noise probability distribution π η ( ) H is known, θ and η are independent, then F I G U R E 7 The fit for the diagnosed HIV cases and the notified AIDS cases of Luxembourg using the Stan package (Carpenter et al., 2017). The blue and red semi-continuous lines represent the observed HIV and AIDS data, respectively, the solid orange and violet lines represent the medians and the shaded area represents the 95% probability bands for the expected value for the state variables: Infected and Sick people, respectively. AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus F I G U R E 8 The fit for the diagnosed HIV cases and the notified AIDS cases of Czech Republic using the Stan package (Carpenter et al., 2017). The blue and red semi-continuous lines represent the observed HIV and AIDS data, respectively, the solid orange and violet lines represent the medians and the shaded area represents the 95% probability bands for the expected value for the state variables: infected and sick people, respectively. AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus π y θ π y F θ ( )= ( − ( )).
H obs obs  All the available information regarding the unknown parameter θ is codified into the a prior distribution π θ ( ), it specifies our belief in a parameter before observing the data. All the available information regarding the way of how was obtained the measured data is codified into the likelihood distribution π y θ ( ) obs  . This likelihood can be seen as an objective or cost F I G U R E 9 The fit for the diagnosed HIV cases and the notified AIDS cases of Japan using the Stan package (Carpenter et al., 2017). The blue and red semi-continuous lines represent the observed HIV and AIDS data, respectively, the solid orange and violet lines represent the medians and the shaded area represents the 95% probability bands for the expected value for the state variables: infected and sick people, respectively. AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus F I G U R E 10 The fit for the diagnosed HIV cases and the notified AIDS cases of Croatia using the Stan package (Carpenter et al., 2017). The blue and red semi-continuous lines represent the observed HIV and AIDS data, respectively, the solid orange and violet lines represent the medians and the shaded area represents the 95% probability bands for the expected value for the state variables: Infected and Sick people, respectively. AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus function, as it punishes deviations of the model from the data. To solve the associated inverse problem (15)

 
We used the data set y I A = (˜,˜) obs , which correspond to the diagnosed infected HIV cases and the notified sick AIDS cases, respectively. We mention that we have not used the data column corresponding to the recovery people here because in a big range (from the beginning) of days this data was not been collected. A Poisson distribution, y μ ( )   , with respect to the time is typically used to account for the discrete nature of these counts, where μ is the mean of the random variable y, that is, Y μ [ ] = . In fact, the mean and variance of the Poisson distribution coincide. We assume independent Poisson distributed noise η, that is, all dependency in the data is codified into the model (1). In other words, the positive definite noise covariance matrix η is assumed to be diagonal. Therefore, using the Bayes' formula, the likelihood is As mentioned above, we approximate the likelihood probability distribution corresponding to diagnosed HIV individuals and AIDS patients with a Poisson distribution where Φ , Ξ i i are given by (4) and (5), respectively, and the index i denotes the number time, in our case the number of years. For independent observations, the likelihood distribution π y θ ( )  , is given by the product of the individual probability densities of the observations F I G U R E 11 The fit for the diagnosed HIV cases and the notified AIDS cases of United Kingdom using the Stan package (Carpenter et al., 2017). The blue and red semi-continuous lines represent the observed HIV and AIDS data, respectively, the solid orange and violet lines represent the medians and the shaded area represents the 95% probability bands for the expected value for the state variables: Infected and Sick people, respectively. AIDS, acquired immunodeficiency syndrome; HIV, human immunodeficiency virus , is given by the product of the parameter K 1 and Φ i at time t t = i . Analogously, the mean for the Poisson distribution K θ ( Ξ ( )) i 2  is given by the product of the parameter K 2 and Ξ i at time t t = i . For the prior distribution, we select Lognormal distribution for the β parameter and Uniform distributions for the rest of parameters to estimate: σ K K , , 1 2 using the Stan package (Carpenter et al., 2017) (Gamma distributions using the Twalk package Christen & Fox, 2010). The description of the parameters in (1) and their corresponding hyper-parameters' (range column) are given on Table 1.
The posterior distribution π θ y ( ) obs  given by (15) does not have an analytical closed form since the likelihood function, which depends on the solution of the nonlinear SIA model, does not have an explicit solution. Then, we explore the posterior distribution using the Stan Statistics package (Carpenter et al., 2017), general purpose Markov Chain Monte Carlo Metropolis-Hasting (MCMC-MH) algorithm to sample it, the package Twalk (Christen & Fox, 2010). Both algorithms generate samples form the posterior distribution π θ y ( ) obs  that can be used to estimate marginal posterior densities, mean, credible intervals, percentiles, variances, and others. We refer to House et al. (2016) for a more complex MCMC-MH algorithms. We point out that the parameter estimation of the countries: Luxembourg, Czech Republic, Japan, Croatia were done using the package Stan, and for the countries United Kingdom and Mexico were done using the package Twalk.
We have used the interface in Python (PyStan) (Carpenter et al., 2017), specifically, we have used the No-U-Turn-Sampler (NUTS) method. Figure 3 shows the credible intervals of the parameters of the model (1) within 95% highest-posterior density (HPD). From top to bottom: F I G U R E 13 Joint probability density distributions of the estimated parameters within 95% (highestposterior density) of Czech Republic. The blue lines represent the medians Luxembourg, Czech Republic, Japan, Croatia. Figures 4 and 5 show the credible intervals of parameters estimated within 95% HPD. Figures 6-11 show the fit for the diagnosed HIV cases and the notified AIDS cases of Luxembourg, Czech Republic, Japan, Croatia, United Kingdom, and Mexico. The blue and red semi-continuous lines represent the observed HIV and AIDS data, respectively, the solid orange and violet lines represent the medians and the shaded area represents the 95% probability bands for the expected value for the state variables: Infected class I and Sick class A, respectively, for the countries Luxembourg, Czech Republic, Japan, Croatia, United Kingdom and Mexico, respectively. Figures 12-17 show the joint probability density distributions of the estimated parameters within 95% (HPD) of the countries Luxembourg, Czech Republic, Japan, Croatia, United Kingdom and Mexico, respectively. The blue lines represent the medians. Table 4 shows R 0 given by (3) for the Luxembourg, Czech Republic, Japan, Croatia, United Kingdom, and Mexico. We performed 600,000 iterations with 300,000 of them as burn-in. Using both packages, we have done predictions 5 years in advance. Some future work will correspond to analyze the identifiability of the parameters of model (1), as suggested in Chowell (2017), Magal and Webb (2018), Roosa and Chowell (2019), specifically the β and K 1 parameters since these parameters are multiplied in the HIV prevalence Equation (4), thus, estimating both parameters simultaneously may lead to nonidentifiability difficulty. F I G U R E 14 Joint probability density distributions of the estimated parameters within 95% (highest-posterior density) of Japan. The blue lines represent the medians Similarly, it happens with σ and K 2 parameters since these parameters are multiplied in the AIDS prevalence Equation (5). We have uploaded all the codes used in this paper to the following Github link: https://github.com/kernelprieto/HIV-AIDS, for a detailed review.

| HYPOTHESIS TESTING
In this section we address the hypothesis testing on the basic reproductive number R 0 . To simplify notation, we use θ R = 0 . Suppose we observe n i.
The hypotheses H 0 and H 1 can be replaced by models M 0 and M 1 . A Bayesian version of th standard p-value can be produced once the posterior distribution is obtained. For the one-sided case in (18), the specified prior distribution of θ provides an a priori probability over the two ≤ ≤ ∞ . The distribution π 0 can take on an large number of forms, but the uninformative uniform distribution is particularly useful, and many authors have suggested that lacking specific information π p H = ( is true) = 0 0 1 2 is a useful value (Gelman et al., 2014;Shikano, 2019 Once prior probabilities are assigned, the Bayesian posterior probability is derived from the nonnormalized region defined by the null hypothesis divided by the total nonnormalized region, which can be derived as follows: where the part of the integral in the numerator from 1 to ∞ contributes zero to this calculation since H 0 is on the right-hand-side of the conditionals, similar argument holds for the H 1 part. The terms inside the integrals are modified using the definition of conditional probability: In the last step, the simplification because π π = 0 1 . This posterior value p H y ( ) 0 obs  is a p-value for: the probability that the null hypothesis is true, given the data and the model. Conversely, the standard p-value is the less revealing probability of seeing these or more extreme data, given model and assumed true null hypothesis. The value p H y ( ) 0 obs  corresponds the area numerator of (21) and the entire probability density function is the denominator as illustrated in fig. 7.1 of Gill (2014). Table 5 shows the hypothesis testing of the Basic reproductive number R 0 for the countries analyzed in this manuscript. From Table 5, we obtain that all the null hypothesis are accepted with p-value equal to 1. The results are consistent with Table 4 since the 95% of the credible interval were inside the interval of the null hypothesis.

| BAYESIAN SENSITIVITY ANALYSIS
In this section we present an analysis of how the prior distribution selection impacts on the posterior distribution through the Bayes Theorem for Japan's case. If the posterior does not depend a lot on the prior distribution selection, we say that our model is robust, otherwise, we say that our model is very sensitive to prior selection. We analyze the posterior distribution as compromise between data and prior information (Gelman et al., 2014). We make reasonable modifications to the assumptions in question, recompute the posterior distribution of parameters of interest, and observe how much the results on these posterior distributions have changed. If the posterior distribution change highly, that is, it is very sensitive to prior distribution changes, we should collect more data in general (Carlin & Louis, 2009). We first have done a comparison between diffuse, weak-informative and informative prior distributions. We used uniform distributions for all the parameters to be estimated as diffuse, for the weak-informative, we used the Log-Normal distribution for the β parameter (Capistran et al., 2021), and uniform distributions for the rest of parameters, and finally, for the informative distributions we used Gamma distributions for all the parameters since the Gamma distribution is conjugate of the Poisson distribution. Table 6 shows the results of the posterior distributions of the parameters when different type (diffuse, weak-informative, informative) of prior distributions are used. Table 7 shows the result of the posterior distributions of the parameters using different hyperparameters values for a Gamma (informative) Prior Distribution. Table 8 shows the result of the posterior distributions of the parameters using different hyperparameters values for a Uniform (diffuse) Prior Distribution.

| DISCUSSION AND CONCLUSIONS
In this study we formulated a mathematical model for HIV/AIDS spread keeping in mind a simple formulation, but enough to accurately to fit the HIV/AIDS data of six countries at hand: Luxembourg, Czech Republic, Japan, Croatia, United Kingdom, and Mexico. We presented a short-term (3 years) forecast of transmission of the HIV/AIDS disease using Bayesian inference based on two software: the Stan package (Carpenter et al., 2017) and the Twalk package (Christen & Fox, 2010). We showed credible intervals, bands projections with medians and the joint probability distributions given as a corner. From Figures 6-11, we could observe that the SIA model proposed in Section 2 fit adequately the transmission of the HIV/ADIS disease.
The assumption of independence for data collected of communicable diseases over time have been questioned in Chowell et al. (2009), Koopman and Longini (1994). This phenomenon T A B L E 5 Hypothesis testing of the Basic reproductive number R 0 for the countries analyzed in this manuscript is called dependent happening, that is, observation of a single infected individual is not independent of observing other individuals in a population of interest. This phenomenon is not significant in noncommunicable diseases. Thus, we point out that the independent happening assumption is a limitation of our current approach to identify the risk of this infectious disease. A solution for dependent happening to assess vaccine efficacy is addressed in Chowell et al. (2009), called the Household Secondary Attack Rate Method. Another limitation of our current model is that we have not included the incubation period of the disease, defined as the time from infection with the microorganism to symptom development, which in the HIV/AIDS case is rather long, may be one or a few decades. A study in Brookmeyer and Gail (1994) obtained short-term predictions of the progress disease of HIV/AIDS using the incubation period and the back-calculation method. We will try to incorporate the incubation period in a future work. We point out that for Luxembourg, Mexico and UK fits, we estimated more parameters, ξ ξ , 1 2 , than the cases Czech Republic, Japan, Croatia, to obtain a better fit for them. Inclusive, we performed a forecast of 2 years for the UK case, again, to obtain a more accurate forecast. The UK database contains only 20 data. Thus, we should try to collect more data points to increase the projection period of time.
We also did a sensitivity analysis of the main quantities of a epidemic, being the Basic Reproductive Number and the component of HIV cases of the endemic point, that is, R I , * 0 and the prevalence of HIV (Φ T ). We could observe in Tables 2 and 3 that the most relevant parameters are β and σ for I * and the β and K 1 for Φ T . We also performed a Bayesian Sensitivity Analysis for Japan case in Section 6, observing the Tables 6-8, we concluded that the Bayesian forecast is robust. An interesting result was that our estimation of R 0 for each country given on Table 4 was supported in Section 5, where we found that an HIV/AIDS certain outbreak will be expected for Czech Republic and Japan, as is shown in Table 5.
Finally, in modeling problems of real phenomena with ODEs, most attention has been paid to the problem of simulating the state variables with given parameters, but it is also often required to study the inverse problem, that is, to estimate or predict the parameters using some T A B L E 7 Posterior distribution of all the parameters to be estimated using different hyperparameters values for a Gamma (informative) Prior Distribution within 95% highest-posterior density