2.1 Proposed Model
The main goal of this section is to demonstrate how the exponential assumption can be relaxed, and how discretized distributions can be implemented for the latent and infectious periods. We utilize the more realistic assumption that there is a maximum time that an individual may sustain a latent infection before becoming actively infectious. We assume that all individuals in the exposed category will eventually move to the infectious category, as is done in Lekone and Finkenstädt 2006 and Anderson and May 1991, and do not consider cases of exposure without latent infection at this juncture.
The population-averaged SEIR model of interest is found in Lekone and Finkenstädt 2006, and is itself a generalization of the SIR model found in Mode and Sleeman 2000:
Let i = 1, …, T index discrete time, and let Si, Ei, Ii, and Ri denote the counts of individuals in the Susceptible, Exposed, Infectious, and Removed compartments at time i, respectively. The notation denotes a change of category. Let represent the mixing and possible intervention functions controlling the number of new exposures at time i + 1, and is constrained to be nonnegative, with representing the vector of parameters controlling mixing and interventions. Let h represent the number of days between time points in the data collection partition. The total number of individuals in the population is denoted by N.
For models utilizing the exponential assumption, the exposure data are typically arranged as a T-dimensional vector of counts, E = (E1, …, ET)′. Note that, in these models, the only information needed to evaluate the likelihood is the total count in the exposed category at each time point, Ei. We relax this assumption by recording not only the number of exposed individuals at each time point, but also the length of time each individual has been in the exposed compartment. Consider collecting the exposure counts in a T × M1 matrix E, where M1 is the maximum number of time points the infectious agent can remain latent. Cell (i, j) then contains the number of individuals who are at time point j of the latent infection process at time point i of the epidemic. In other words, i represents objective time since the start of the epidemic, and j denotes the subjective, individual time in the disease process. In practice, i and j will typically be measured in days, although this is not required. The T × M2 infectious matrix I is defined analogously to E, with the rows indexing the time points elapsed since the start of the epidemic and the columns indexing the number of time points an individual has remained in the infectious compartment.
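As a small illustration of this layout, the sketch below stores a toy exposed matrix as a list of rows and recovers the usual vector of totals by summing each row. The data, the function names, and the consistency check are hypothetical and are not taken from the original model specification; indices in the code are 0-based, whereas the text counts from 1.

```python
# Toy T x M1 exposed matrix with T = 4 time points and M1 = 3 latent-time bins.
# E[i][j] is the number of individuals in bin j of latency at epidemic time i.
# These counts are made up for illustration only.
E = [
    [2, 0, 0],
    [1, 2, 0],
    [0, 1, 1],
    [3, 0, 1],
]


def total_exposed(matrix):
    """Row sums recover the vector (E_1, ..., E_T) used under the exponential assumption."""
    return [sum(row) for row in matrix]


def is_diagonally_consistent(matrix):
    """Check that no cell holds more individuals than the cell up and to its left.

    Individuals can only reach bin j + 1 at time i + 1 from bin j at time i,
    so counts along a diagonal can never increase.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    return all(
        matrix[i + 1][j + 1] <= matrix[i][j]
        for i in range(n_rows - 1)
        for j in range(n_cols - 1)
    )


totals = total_exposed(E)  # [2, 3, 2, 4]
consistent = is_diagonally_consistent(E)  # True
```

The matrix carries strictly more information than the vector: the totals can always be computed from the matrix, but the time-in-compartment breakdown cannot be recovered from the totals.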
When an individual is newly exposed and contracts a latent infection at time i, the individual moves from the susceptible class into row i, column 1, of the exposed matrix E. For every time unit in which the individual does not become infectious, the individual moves along a diagonal path from row i, column j to row i+1, column j+1 (one row down and one column to the right). When the individual becomes infectious at time i′, the individual moves to row i′, column 1, of I, and repeats the process until removed. This process allows the length of time each individual spends in the exposed category to be imputed, and allows many latent time and infectious time distributions to be discretized and used. Specifying a maximum length of time that an individual can remain in the Exposed or Infectious class allows the number of columns of each matrix to be defined a priori, and removes the need to choose the size of the matrix adaptively while the analysis is running. While an adaptive scheme may be possible, it is not necessary, since the maximum amount of time an infectious agent may remain in a latent state is often known. Additionally, an adaptive scheme may not be computationally efficient.
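The diagonal movement described above can be sketched as a forward simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `advance` and `simulate` are hypothetical, the new exposures at each time point are supplied as an input rather than generated by a mixing function, and the per-bin leaving probabilities are assumed to end in 1.0 so that nobody outstays the maximum duration. Indices are 0-based in the code.

```python
import random


def advance(matrix, i, leave_prob, rng):
    """Move the occupants of row i one step along the diagonal.

    leave_prob[j] is the assumed conditional probability of leaving the
    compartment from bin j. Individuals who stay move to row i + 1, bin j + 1.
    The last entry of leave_prob should be 1.0; otherwise individuals who stay
    in the final bin would be dropped. Returns the number who leave.
    """
    leavers = 0
    n_bins = len(matrix[i])
    for j, count in enumerate(matrix[i]):
        # One Bernoulli draw per individual in this cell.
        out = sum(rng.random() < leave_prob[j] for _ in range(count))
        leavers += out
        if j + 1 < n_bins:
            matrix[i + 1][j + 1] = count - out
    return leavers


def simulate(T, new_exposures, p_latent, p_infectious, seed=0):
    """Fill the T x M1 exposed and T x M2 infectious matrices forward in time."""
    rng = random.Random(seed)
    E = [[0] * len(p_latent) for _ in range(T)]
    I = [[0] * len(p_infectious) for _ in range(T)]
    removed = [0] * T
    E[0][0] = new_exposures[0]
    for i in range(T - 1):
        became_infectious = advance(E, i, p_latent, rng)
        became_removed = advance(I, i, p_infectious, rng)
        E[i + 1][0] = new_exposures[i + 1]  # newly exposed enter the first bin of E
        I[i + 1][0] = became_infectious     # newly infectious enter the first bin of I
        removed[i + 1] = removed[i] + became_removed
    return E, I, removed


# Deterministic toy run: latency lasts exactly 2 time points, infection exactly 3.
E, I, removed = simulate(
    T=6,
    new_exposures=[5, 0, 0, 0, 0, 0],
    p_latent=[0.0, 1.0],
    p_infectious=[0.0, 0.0, 1.0],
)
```

Because the number of columns is fixed in advance by the maximum durations, both matrices can be allocated once before the loop, which is the practical benefit of specifying the maximum a priori.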
Because the exposure data and infectious data are being collected in matrices, the probability of compartmental change can vary with the amount of time an individual has stayed in the compartment. This allows the exponential assumption to be relaxed, and any distribution can be discretized and used to approximate the true, underlying latent and infectious time distributions. As noted in the Introduction, this allows more realistic distributions to be used for infectious diseases.
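One way to turn a continuous duration distribution into bin-specific probabilities of leaving a compartment is sketched below. It is an illustration under assumptions rather than the discretization used in the source: the Weibull family, the function names, and the rule of forcing the final bin's probability to 1 (to respect the maximum duration) are all choices made for this example. The probability for bin j is the chance of leaving during that bin, conditional on still being in the compartment at its start.

```python
import math


def weibull_cdf(t, shape, scale):
    """CDF of a Weibull duration; shape = 1 gives the exponential distribution."""
    return 1.0 - math.exp(-((t / scale) ** shape))


def conditional_leave_probs(cdf, max_bins, h=1.0):
    """Per-bin probability of leaving, given presence at the start of the bin.

    Bin j (0-based) covers the interval (j * h, (j + 1) * h]. The last bin is
    set to 1.0, an assumed truncation rule that enforces the maximum duration.
    """
    probs = []
    for j in range(max_bins):
        still_present = 1.0 - cdf(j * h)
        mass = cdf((j + 1) * h) - cdf(j * h)
        probs.append(mass / still_present if still_present > 0 else 1.0)
    probs[-1] = 1.0
    return probs


# Exponential durations (shape 1): the conditional probability is the same in
# every bin before the truncation point, which is the memoryless property.
exp_probs = conditional_leave_probs(lambda t: weibull_cdf(t, 1.0, 2.0), max_bins=4)

# Weibull durations with shape 2: the conditional probability grows with the
# time already spent in the compartment.
wei_probs = conditional_leave_probs(lambda t: weibull_cdf(t, 2.0, 3.0), max_bins=5)
```

The contrast between the two outputs shows what the matrix structure buys: a vector of totals can only support the constant conditional probability of the exponential case, while the time-in-compartment columns allow the probability to differ from bin to bin.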
With this structure in place, the investigator is able to use strong prior knowledge of the length of time that individuals spend in the exposed and infectious categories. Typically, this information is available and multiple distributions may be fit and compared. It is unlikely that there will be strong prior information for the mixing and intervention parameters, so relatively weak priors can be used for these parameters.
The proposed PS SEIR model utilizes the following scheme: Let i denote discrete calendar time since the beginning of the epidemic, and j denote discrete time that an individual has spent in the latent or infectious state. Then,
The compartments are the Susceptible, Exposed, Infectious, and Removed classes, respectively. Define a bin as the amount of time between data collection times. In our discretization scheme, a bin will be h time units (often measured in days). Bins are used within the exposed and infectious compartments as the basic time unit for the discretizations. Most data sets will use h = 1, but all that is required is 0 < h < ∞. This style of discretization allows for the analysis of large data sets, while still providing the flexibility to use a time-dependent conditional probability of changing compartments. By defining bins within the Exposed and Infectious compartments, it is possible to vary the conditional probability of a compartment change depending on the length of time an individual has spent in the compartment, which, in turn, allows for distributions other than the exponential distribution to be used for the latent and infectious times.