A stochastic particle extended SEIRS model with repeated vaccination: Application to real data of COVID‐19 in Italy

The prediction of the evolution of epidemics plays an important role in limiting the transmissibility and the burdensome consequences of infectious diseases, which leads to the employment of mathematical modeling. In this paper, we propose a stochastic particle filtering extended SEIRS model with repeated vaccination and time‐dependent parameters, aiming to efficiently describe the demanding dynamics of time‐varying epidemics. The validity of our model is examined using daily records of COVID‐19 in Italy for a period of 525 days, revealing a notable capacity to uncover the hidden dynamics of the pandemic. The main findings include the estimation of asymptomatic cases, which is a well‐known feature of the current pandemic. Unlike other proposed models that employ extra compartments for asymptomatic cases, which force the estimation of this proportion and significantly increase the model's complexity, our approach leads to the evaluation of the hidden dynamics of COVID‐19 without additional computational burden. Other findings that confirm the model's appropriateness and robustness are its parameter evolution and the estimation of more ICU‐admitted cases compared to the official records during the most prevalent infection wave of January 2022, attributed to the intensified increase in admissions that may have led to full occupancy in ICUs. As the vast majority of datasets contain time series of total recovered and vaccinated cases, we propose a statistical algorithm to estimate the currently recovered and protected through vaccination cases. This necessity arises from the attenuation of antibodies after vaccination/infection and is necessary for long‐time interval predictions. Finally, we not only present a novel stochastic epidemiological model and test its efficiency but also investigate its mathematical properties, such as the existence and stability of epidemic equilibria, giving new insights to the literature. The latter provides additional details concerning the system's long‐term behavior, while the conclusions drawn from the R₀ index provide perspectives on the severity and future of the COVID‐19 pandemic.

J E L C L A S S I F I C A T I O N 62M20, 62L12, 60G35, 60E05

| INTRODUCTION
Efficient prediction of the spread of epidemic outbreaks plays an important role in reducing the transmissibility and drawbacks of contagious diseases.Infectious diseases are affecting more and more people every day, resulting in widespread health crises especially in third world countries.In addition, the novel coronavirus SARS-CoV-2 (COVID- 19), which broke out in Wuhan [1], was declared as a pandemic by the World Health Organization in January 2020 and has reached in more than 770 million new infections worldwide [2].
The severity of the situation accompanying the evolution of a disease, alongside with the necessity for early prevention, has led many researchers to the investigation of these phenomena using mathematical modeling.The spread of infectious diseases is often described through epidemiological compartmental models, where the (susceptibleinfected-recovered (SIR model represents the most widely known approach [3].Many articles base their exploration of epidemic dynamics on the SIR model [4], or some of its extensions such as susceptible-infected-recovered-susceptible (SIRS) [5,6], (susceptible-exposed-infectious-recovered) SEIR [7,8], or the susceptible-exposed-infectious-recovereddeceased (SEIRD) methodologies [9,10].In addition, Malkov [11] proposed the utilization of a susceptible-exposedinfectious-resistant-susceptible (SEIRS) model with time-varying infection rates for the evolution of COVID-19.However, these deterministic approaches fail to cope with the fluctuating dynamics of many of the aforementioned epidemics, especially over long periods of time.For instance, the introduction of restrictive measures such as mask mandates in closed and public spaces or the establishment of lockdown in many countries worldwide, led to a reduction in the daily infection rates for COVID-19 [12].On the other hand, the emergence of variants has been associated with increased transmissibility [13], hospitalizations, and mortality [14][15][16].Furthermore, the authors in [17] state that there exists a higher risk of serious illness for the unvaccinated delta-infected individuals than for the alpha-infected cases.
Since asymptomatic infections are difficult to detect, in parallel with errors and delays in records, we acknowledge the existence of uncertainties in the reported data [18,19].Moreover, most compartmental models do not consider the complete disease dynamics.Inevitably, this involves highly complex structures, resulting in severe overfitting, a phenomenon that can occur in any statistical or artificial intelligence model [20,21].Therefore, a transition from a deterministic to a stochastic approach appears to be necessary.
The authors in [22] use the basic Kalman Filter to obtain estimations for the evolution of the pandemic in India.However, as they state, these estimations are only reliable for a short time period.Otunuga [23] provides estimations for the parameters of an SEIRS model, while the authors in [24] investigate the geospatial behaviors of COVID based on a stochastic epidemic model.Zhu et al. [25] employ a SEIRD-extended Kalman filter (EKF) model with dynamic parameter estimation, adding the parameters of the model in the updating process of the EKF.Song et al. [26] combine the same two latter models, while the parameter estimation of the SEIRD takes advantage of an iterative optimization method based on maximum likelihood.In [27], the authors propose a SIRD-EKF model for the estimation of the COVID-19 outbreak in Algeria, while Papageorgiou and Tsaklidis [28] propose an extended SEIRS model mixed with an unscented Kalman Filter for the prediction of the evolution of COVID in France for a period of 265 days.
The authors in [29] utilize a susceptible-exposed-asymptomatic-infectious-recovered (SE(A)IR) methodology with particle filtering to estimate the early stages of the spread of COVID-19 in Ohio and Michigan, considering the transmissibility of the virus as the only time-varying parameter.In [30], they propose a sequential Monte Carlo methodology for the estimation of the effective reproduction number for contagious illnesses.They apply this methodology to daily Norwegian COVID-19 data.In [31], they propose a stochastic SEIR model with constant parameters for predicting the early stages of COVID-19, long before the initiation of vaccinations.Endo et al. in [32] present an introduction to particle Monte Carlo Markov Chain methodologies for modeling epidemics.
Furthermore, in [33], the authors introduce a Bayesian state-space model that encompasses a system of six compartments, applied to COVID-19 data from various US states and Colorado counties.They take into account time-varying infection and detection parameters.Nkwayep et al. [34] employ an Ensemble Kalman Filter with time-varying parameters, including infection rates, screening rates, decay rates, and more, for short-term predictions in Gabon and Cameroon.Long et al. [35] propose a physics-informed LSTM network that incorporates non-constant infection, recovery, and mortality rates based on the SIRD scheme.Rodriguez et al. [36] follow a similar approach by adapting a GRU neural network to the specificities of the susceptible-exposed-infected-remove (SEIRM) scheme.Additionally, in [37], the authors utilize a particle filter adapted to the SEIR model, considering time-dependent transmission and reporting rates.Finally, Chen et al. [38] utilize the SIR scheme with dynamically estimated parameters to provide predictions for COVID-19 in the United States.
In this paper, we present a novel extended SEIRS model with repeated vaccinations, accompanied by a sequential importance sampling particle filter with dynamic parameter estimation.This stochastic approach, in parallel with the estimation of time-varying parameters, overcomes the limitations of deterministic approaches while providing valuable insights into the hidden dynamics of epidemics.The validation of our model is based on the daily records of COVID-19 in Italy, demonstrating its effectiveness in explaining significant variations in the spread of the virus after the start of the vaccination period, even over an extended time interval of 525 days.Although this approach can be used to model other contagious illnesses characterized by repeated vaccinations, such as influenza, Ebola, and yellow fever, we chose COVID-19 as the case study to assess the efficacy of our model due to the abundance of daily observations of infections, hospitalizations, recoveries, deaths, and vaccinations.The data used were retrieved from official datasets of the European Union and the University of Oxford.More information is provided in Section 2. 3.
The proposed model provides a much more representative image of the evolution of the pandemic due to the increased number of suitable states and transitions.The increase in the number of mixed differential equations, where most states are observable in combination with the successive parameter estimation resulting from real-time feedback of records, prevents extreme parameter estimations, resulting in quite reliable predictions, even for the hidden states of the model.The only states not supported by available information are the susceptible and exposed ones, which are included in all extensions of the standard SEIR model.Furthermore, this methodology can be easily applied in cases where both the system and the observations are subject to uncertainties, providing robust estimations for both observable and hidden states by eliminating unnecessary noise.
The inclusion of the hospital and intensive care unit (ICU) admitted states in the model provides another notable advantage to the analysis.COVID-19 is characterized by a high percentage of asymptomatic cases, leading to the consideration that the levels of currently infected and recovered cases could be deemed as indicators of low reliability for the pandemic.On the contrary, hospital and ICU-admitted cases could be considered as the most accurate indicators to assess the fitting-predictive capacity of our approach.It should be noted that adding the extra states does not impose a significant computational burden on the model.
The main findings present an estimation of a higher number of currently infected cases compared to the daily records, which can be attributed to the asymptomatic infections that characterize the spread of COVID-19.The percentage of asymptomatic patients varies between 20.51% and 53.62% [39].Among the 18 antibody testing studies cited in [39], six used random selection of participants to achieve a representative sample of their target population.According to these analyses, which were conducted with representative samples in England, Spain, Bavaria, Louisiana, Brazil, and Connecticut, we conclude that the proportion of individuals who tested positive but did not report having symptoms ranged from 21.7% to 47.3%.These results remarkably enhance the validity of our model.Moreover, in a study carried out in the village of Vò Euganeo, Italy, during the first months of 2020, the authors stated that of the total subjects tested, 43.2% were asymptomatic [40].
An equally notable fact is that, compared to other epidemiological models in the literature that employ additional states for asymptomatic cases-forcing the estimation of this ratio and increasing the model's complexity-our approach reveals the hidden dynamics of the pandemic without any additional states or computational burden.Another important observation is derived from the investigation of the model's estimations of ICU admissions.The prevalent infection wave of January 2022 is notable not only for its intense transmissibility but also for the short time it took to reach its peak.This combination put intense pressure on the national healthcare system, leading to full occupancy in ICUs.As a result, we can consider that our model diagnoses this situation and provides an estimation of the individuals who stayed out of ICUs due to bed unavailability.Based on the above, our model reveals significant hidden dynamics of the epidemic, verifying its fitting capacity.
A mathematical analysis is performed, emphasizing the existence and stability of epidemic equilibriums.Furthermore, we propose an improved formula for R₀, commonly known as the basic reproduction number.This parameter is instrumental in predicting the potential for disease spread within a population.By incorporating stability analysis into our extended SEIRS model, we not only estimate R₀ more accurately but also gain valuable conclusions about the nature and future trajectory of the disease.Understanding the asymptotic behavior of the system, as revealed through stability analysis, is fundamental.It helps us anticipate whether an epidemic will eventually subside or continue to affect a population.The combination of stability analysis and an improved estimation of R₀ enhances our ability to provide accurate predictions and develop proactive strategies for epidemic control, ultimately contributing to better preparedness and response in the face of health crises.This analysis additionally indicates that the proposed epidemiological scheme aligns with existing literature, as it uncovers the presence of two equilibria and their stability, a common feature observed in numerous other models documented in the literature.
Additionally, we propose an algorithm to estimate the number of currently recovered and protected cases through vaccination, using the time series of cumulative recovered and fully vaccinated cases.This algorithm utilizes a vector of weights generated from a normal distribution, providing a correction for the aforementioned time series.This necessity arises from the attenuation of antibodies produced by vaccination or a previous infection.
Consequently, the proposed epidemiological scheme that takes into account repeated vaccinations, and its modification to a stochastic equivalent that deploys the available information through particle filtering, provides innovative evidence in the field of mathematical modeling for epidemics.Furthermore, unlike most articles in the literature, we not only propose a new stochastic model for epidemics and examine its predictive capacity but also investigate the mathematical properties of the novel scheme such as the existence and global stability of epidemic equilibria.The statistical approach has the potential to guide decision-making on the short-term course of the pandemic, generating reliable forecasts in particular for relatively short time periods.On the other hand, the stability analysis provides information about the long-term trend of the phenomenon contributing to enhanced awareness against this public health issue.Consequently, we contend that both analyses offer valuable perspectives on the pandemic, each providing distinct viewpoints.Finally, the presented algorithm for the estimation of the currently recovered and protected cases, from the respective time series of total recoveries and vaccinations, can be a valuable tool for examining the evolution of an epidemic characterized by imperfect vaccination or attenuating immunity over long time periods.
The rest of the article is organized as follows: in Section 2, we present the mathematical scheme of the proposed stochastic epidemiological model along with theoretical results on the existence and global stability of epidemic equilibria.In Section 3, we validate the model's fitting-predictive efficiency based on the daily COVID-19 records in Italy, starting from January 2021 when the vaccination period began.In Section 4, we discuss, and in Section 5, we conclude the main findings of our study, with a particular emphasis on the advantages of the stochastic epidemiological model.

| The extended SEIRS model with repeated vaccination
In this section, we present the structure of the extended SEIRS model that we later incorporate into the stochastic methodology, to investigate the evolution of COVID in Italy.Our extension of the standard SEIRS model includes additionally the states and associate differential equations of the hospitalized, ICU admitted, deceased and vaccinated cases.
This approach also enables the transition from the recovered or vaccinated back to the susceptible state, a feature that is essential for the examination of epidemics with attenuating immunity over long periods of time.Figure 1 shows the transitions between the states of the proposed epidemiological model.The selection of the presented states and transitions was considered with great care, aiming to manage the challenging dynamics of the phenomenon under consideration without significantly increasing the complexity of the model.Table 1 includes all the necessary explanations for the notation used in the extended SEIRS model of System (1).Lemma 1.According to System (1), the total population with respect to time, N t ð Þ, is bounded and finite.
Proof 1.After the summation of the seven differential equation of System (1), we calculate the derivative of the total population with respect to time: By moving the term δΝ t ð Þ to the left side of (2) and by integrating the above equation with respect to time, we get As the total population cannot be negative and while t !∞, we obtain where Λ and δ are finite non-negative constants.Thus, the population is always finite.□ From System 1, we can calculate the disease-free equilibrium (DFE) that resembles the eradication of the disease from the population.For the calculation of the equilibrium Y 0 , we set I 0 ¼ 0, as there should be no infectious individuals that may spread the disease.By setting the flows of System (1) equal to zero, we obtain In the remainder of this section, we present several theorems that are associated with the long-term behavior of the proposed epidemiological scheme.The proofs of these theorems are displayed in the Appendix A.
Theorem 1.The basic reproduction number of the proposed extended SEIRS model (1) is Theorem 2. The DFE Y 0 is globally asymptotically stable if and only if R 0 < 1: Theorem 3. The extended SEIRS model has a unique endemic equilibrium when R 0 > 1: Þ is globally asymptotically stable when R 0 > 1: Theorem 5. When R 0 < 1, the extended SEIRS system of ODEs converges exponentially to the DFE according to the maximum eigenvalue of the negative definite matrix Q: Similarly, when R₀ > 1, the system converges exponentially to the endemic equilibrium based on the maximum eigenvalue of matrix P:

| Stochastic epidemic model for the evolution of infectious diseases and statistical estimation
The evolution of infectious diseases is usually accompanied by fluctuating parameters influenced by external factors such as public health measures or variants.Characteristic examples include the infection, mortality, and recovery rates of patients.Moreover, deterministic compartmental models typically do not account for uncertainties and face challenges in describing the changing behavior of epidemics.Therefore, we aim to incorporate any available information about the evolution of the pandemic, such as daily numbers of recovered, vaccinated, deceased, hospitalized, and ICUadmitted cases, into the proposed extended SEIRS model.
In this section, we introduce a new statistical algorithm designed to estimate the current numbers of recovered and vaccinated cases.Additionally, we propose a stochastic epidemiological model that acts as a stochastic counterpart to the compartmental scheme depicted in (1).Our focus lies in establishing an effective state-space framework that amalgamates all available information pertaining to the phenomenon, enabling dynamic parameter estimation.Moreover, we delineate the statistical estimation method employed, utilizing a particle filter operating on an expanded state-space.This leads to a simultaneous dynamic estimation of the system's states and various epidemiological parameters, with specific details about the functioning of the particle filter provided.

| Description of the model
The majority of existing datasets contain the daily total number of recovered cases.However, during our analysis, we are not interested in the total recoveries but rather in the number of currently recovered cases.This is due to an individual's inability to retain antibodies produced by a previous infection indefinitely, or because the transmitting virus is constantly mutating.This feature, which is encountered in a plethora of infectious illnesses such as COVID-19, influenza, and Ebola [41], is introduced to our model through the transition from the compartment of the recovered to the susceptible individuals.
Aiming to provide an estimation of the daily currently recovered cases using the observations of the total recoveries, someone could employ the following conventional solution.Considering that immunity from infection primarily lasts for k days, one could subtract from the cumulative recovered cases at each time step t, the respective number of cases from time step t À k, leading to as the immunity of individuals who recovered during the time step t À k could be deemed unsatisfactory to protect them against a new infection.Note that this approach does not take into account the variability of the human immune system, leading to an oversimplification of the phenomenon under consideration, as many individuals may remain protected for shorter or longer periods.Next, we would like to provide a statistical alternative to this problem.Initially, we create the time series of new recoveries n rect , t ≥ 0 f g , by taking the first differences of the time series of total recovered cases.Next, in our attempt to estimate the number of currently recovered cases at time t, we refer back to the time series of new recoveries at time t À k ℕ and fit a normal distribution with a mean μ ¼ t À k.By "fitting," we mean generating a vector that represents probabilities for a series of time steps centered around μ ¼ t À k.The associated probabilities are derived from a normal distribution with mean μ and standard deviation σ.The result is a symmetric vector, with lower values on the sides and greater values as we move toward its center.In the R programming language, this vector can be constructed using the dnorm(Á) function.
Considering that immunity lasts primarily for k AE c days, we calculate the associated probabilities for each step Þ-th position of the vector is the highest, as it is associated with the central point of the chosen normal distribution.This is desirable as we subsequently assign weights that, when multiplied with vectors of new recoveries, give more weight to the central elements of these vectors.The user may choose the standard deviation σ of the distribution according to the specifics of the examined phenomenon.
We then normalize the 2c þ 1 probability values by dividing with their sum, and construct a vector of 2c þ 1 weights, w, with entries w i for i ¼ t À k À c,…, t À k þ c: Then, we take the inner product of w with the vector of new recoveries for the time steps t À k À c, …, t À k þ c.Finally, since the number of currently recovered cases should be a positive integer, we subtract the integer part of it from the total recoveries at time t, A smaller σ will give greater weights to the new recoveries around the center of the n rec tÀkÀc ,…, n rec tÀkþc À Á vector.Based on the properties of the normal distribution, using a smaller standard deviation (σÞ in the dnorm(Á) function will result in probabilities, and consequently weights, that decrease more rapidly as we move toward the sides of the w vector.The choice of a normal distribution is not arbitrary.New recoveries at each time step can be considered as arrivals of a non-homogenous Poisson process with rate λ t ð Þ [42].For sufficiently large values of λ, a Poisson distribution can be approximated satisfactorily by a Gaussian.Algorithm 1 summarizes the abovementioned methodology.
We should note in Algorithm 1 that M represents the length of the time series corresponding to the number of recovered cases.Similarly, the number of individuals that are currently vaccine-protected can be estimated, as there are infectious diseases where repeated doses are required to provide complete protection against infection.This fact is introduced into our model through the transition from the compartment of vaccinated to the susceptible individuals with rate v 2 : The only difference in the estimation of currently protected compared to currently recovered cases is the addition of the total number of booster doses, which is quite typical for diseases such as COVID-19, influenza or yellow fever.Thus, where Currently V t represents the number of individuals that are still protected from vaccination, belonging to the "vaccinated" compartment, while n vact , t ≥ 0 f gcorresponds to the daily numbers of new vaccinated cases.We should emphasize that, for the generation of Currently V t , we subtract from the cumulative number of vaccinated cases (Cumulative V t ) the inner product of weights and new vaccinated cases around time step t À k as the immunity of these individuals attenuates significantly after k days.For the recovered and vaccinated cases, we select k, c ð Þ¼ 240, 60 ð Þand k, c ð Þ¼ 180,30 ð Þ, based on the meta-analyses [43,44], as the immunity from infection lasts primarily 240 ± 60 days and the antibodies generated from vaccination drop significantly after 180 ± 30 days.We note that the selected k, c ð Þ pairs are closely related to the ψ and v 2 rates of System (1), which represent the immunity termination rates of a previously infected or vaccinated case, correspondingly.More specifically, they have an inversely proportional relationship as the former describes length of stay and the latter rates of transition out of the recovered and the vaccinated states.Thus, the values of these parameters are selected accordingly, as shown in Section 3.1.Now, based on [45], we should specify the importance density function where the most common selection is the transition density of the hidden states in one step f Since we are working with subpopulations, we have to utilize discrete distributions for the transitions between compartments, namely, multinomial distributions during the interval t, t þ Δt ð Þ.More specifically, we employ The terms denoted as (d IJ Þ t , in equations ( 9)-( 15), represent the number of individuals moving from state I to state J during the time step t: In addition, the terms d I: ð Þ t indicate transitions out of state I at time step t due to mortality from natural causes.
In (9), , and are weighting factors that ensure the probabilities of the multinomial distribution sum to unity.The first component corresponds to the transition from the susceptible to the exposed state, the second one to the transition from the susceptible to the vaccinated state, and the third one to the mortality of a susceptible case due to natural causes.Moreover, the exponential term 1 values in the interval 0, 1 ½ and is associated with the transition tendency out of the susceptible state.The interpretation of expressions ( 10)-( 15) is similar.We understand that equations ( 9)-( 15) are stochastic equivalents of the differential equations in System (1).Finally, the state equation, which dictates the evolution of the system's states for one time step ahead, is given by The likelihood that connects the observations given the hidden states at time t-namely, p y t jx t ð Þ-can be represented as an average of Poisson distributions with means equal to the values of the hidden states at time t ℕ: Specifically, we employ an average of 5 Poisson distributions, where This formula represents the observation equation of our model.We do not include the number of currently infected patients in the above formula, as it cannot be considered an accurate indicator [31].To some extent, the same can be asserted for the reported recovered cases.However, their inclusion provided increased likelihood values for the model and led to more reliable parameter estimations.Incorporating this time series, offers insights into how the number of recoveries changes over time, reflecting recovery rates that fluctuate as we move deeper into the vaccination season.The inclusion of recovery data in the function p y t jx t ð Þhas resulted in parameter trends that align with the pandemic's progression.Typically, during and slightly after the onset of infection waves, we observe a rise in the number of recoveries as more people successfully recuperate from the virus.This pattern is evident, particularly in the context of the third infection wave in January 2022.As detailed in Section 3, the model identifies this tendency through the particle filtering update process and culminates in recovery rates that display an upward trend as we move into the vaccination period.On the other hand, mortality rates exhibit the opposite tendency.This suggests that despite the increased infection rates, the risk of mortality has diminished, which is consistent with the characteristics of new variants like omicron [46].

| Statistical estimation
We emphasize that the states in our model have been carefully selected to include only those with daily available observations.Apart from the states of susceptible and exposed cases, for the remaining six states that participate in the proposed scheme, daily observations are available.These observations can be efficiently utilized to provide complementary information to the model, notably enhancing its fitting and predictive capacity.On the other hand, the daily observations are characterized by noise that is compatible with the nature of epidemics.This fact is influenced by errors or delays in recording, asymptomatic infections, or the inclusion of cumulative numbers of cases.Specifically, datasets contain the total number of recoveries or vaccinations and not the currently recovered or fully vaccinated cases.
In accordance with the aforementioned comments, we aim to adapt the particle filtering methodology to the specificities of epidemic spread.This involves constructing a hybrid particle epidemiological model that can efficiently integrate elements of the proposed compartmental model with available observations, while addressing the weaknesses of both sources.Therefore, merging these two methodologies requires modifying the deterministic extended SEIRS model into an equivalent discretized stochastic version.
Our purpose is to estimate the posterior density of the trajectory of hidden states employing the information provided by the time series of measurements up to time t, p x 0:t jy 0:t ð Þ.This Bayesian solution is approximated by a sum of weighted samples (particles), where p x 0:t jy 0:t ð Þ¼ P N p i¼1 w i t δ x t À x i t À Á and δ : ð Þ denotes the Dirac delta function.Since we cannot sample from the posterior density as it is unknown, we sample from the so-called proposal (importance) density f x 0:t jy 0:t ð Þ.The importance weights can be updated recursively [47] as In practice, our primary interest lies in the current filtered estimate p x t jy 0:t ð Þ: We assume that the density , considering that the states follow a first-order Markov process.Thus, we obtain , with the weights being finally normalized.The symbol / signifies the proportionality of the above quantities.
Considering the fluctuating dynamics of large-scale epidemics, we augment the state space by adding nine timevarying parameters representing the virus's transmissibility, recovery, mortality, hospitalization, and ICU admission rates.Thus, we increase the number of hidden states to efficiently describe the dynamics of an epidemic, leading to the augmented state vector for the i-th particle xi t .At each time step, parameter innovation is based on zero-mean normal distributions.It is important to note that we do not impose any predefined trends on the evolution of the parameters; their trajectories are entirely determined by the model's likelihood based on the available data.The operation of the modified sequential importance sampling particle filter is outlined in the following table.
We denote by N p the number of particles, while the decision of whether the resampling step should be performed is based on the effective sample size measure of degeneracy As this quantity cannot be calculated directly [48], we employ the estimation It is necessary to select a threshold value for the implementation of the resampling step.Usually, this threshold is a percentage of the predefined number of particles N p : In our case, we choose a threshold ¼ 0:75N p , providing a relatively frequent resampling process, aiming to mitigate degeneracy issues [45].
In summary, the main objective of Algorithm 2, which illustrates the operation of the particle filtering methodology, is the computation of the best posterior state estimation considering the system's dynamics and the available measurements.According to Figure 2, we first sample from a prior distribution, which is set as an initialization.During the i-th iteration, we predict the system's state based on the transition density f xi , and then update this estimate using the available measurements through the likelihood function p y t jx i t À Á : Thus, we arrive at the extraction of the best state estimation.Afterward, we proceed to the state estimation for the i þ 1 ð Þ-th step.If resampling is necessary ( b N eff t ð Þ < thresholdÞ, we sample new particles from the existing ones based on the weight distribution, aiming to eliminate particle trajectories which are not informative for the phenomenon.After that, the distribution of the weights is reset and becomes uniform.

| Dataset
To apply the stochastic particle filtering extended SEIRS model with the aim of providing estimations for the hidden states of the pandemic in Italy, we utilize daily observations from two datasets.First, we make use of the available information in the dataset from the website "Our World in Data" [49], which is an open-access database provided by the University of Oxford.From this dataset, we extract the daily observations of fully vaccinated individuals, deceased individuals, hospitalized cases, total cases, as well as observations of the total number of booster doses administered and ICU patients.
Additionally, we obtain information on the daily count of currently positive (infectious) cases from the database on the data.europa.euwebpage [50].This webpage is updated daily and includes official COVID-19 observations for all European Union member states, collected from national resources.By subtracting the daily counts of currently positive and deceased cases from the total daily cases, we arrive at the total number of recoveries.It is important to note that we have verified the compatibility of these two datasets.Both datasets display the same numbers for deceased, hospitalized, and ICU patients.Furthermore, the column representing total cases in the first dataset aligns with the column for total positive cases in the second dataset, further confirming the consistency of the two datasets.

| Application on daily COVID-19 observations in Italy
For this part of our analysis, we examine the fitting-predictive efficiency of the proposed (stochastic) particle filtering extended SEIRS, on the daily records of COVID-19 in Italy.We explore a long interval of 525 days from January 4, 2021, which represents the first day of reported fully vaccinated cases.In this time interval, we encounter four main infection waves, with the third wave being the most prevalent.Furthermore, this third wave represents a challenge for all modeling efforts, as we observe a sharp increase in the transmissibility of the virus in the population.An observation of considerable interest is that this steep increase occurred in a shorter time interval (December 28, 2021-January 23, 2022), compared to the two previous infection waves (February 19-March 21, 2021 and July 16-August 30, 2021).
Other valuable observations for the evolution of the pandemic during the vaccination period can be derived from the time series of hospital and ICU admitted cases.As we move deeper in the vaccination period, we notice a decrease in the rates of hospital and ICU admissions.This observation becomes evident during the first two examined waves.The first wave (February 19, 2021-March 21, 2021) resulted in approximately twice as many infections as the second wave, while hospital and ICU admissions during the first wave exceeded those of the second by approximately five to six times.Furthermore, the same declining trend in the mortality rates of COVID-19 can be observed based on the total number of deaths in Italy, while the respective recovery rates follow the exact opposite behavior.
In Table 2, we display the prior and posterior distributions for the parameters and states of the model.The posterior distributions are derived through the implementation of the stochastic SEIRS model for T ¼ 500 days and N p ¼ 800 particles.For the priors of the hospitalized, ICU-admitted, vaccinated, and deceased states (as of January 3, 2021), we utilize point estimates of the reported number of cases on this day, as the respective time series can be considered the most accurate indicators of the pandemic.For the prior distributions of the remaining states, we employ Poisson distributions, as we refer to discrete subpopulations.The selected distributions and generated statistics of the states' posterior empirical distributions refer to the number of cases, which are included in each state on a daily basis.For instance, at t ¼ 500, we have 7604 currently hospitalized cases with a 99% credible interval of 7039 to 8209 cases.Additionally, the respective quantities for the system's parameters represent daily transition rates.The same applies to the other model's states and parameters.The references in the fourth column provide information about how the selection of the priors has been made.
Aiming to consider an open population, a distribution selection for the demographic parameters Λ and δ is required.Based on official data [51], the mortality rate from natural causes was 10.749 and 10.841 deaths per 1000 people for 2021 and 2022, respectively, leading to annual mortality rates of 0.010749 and 0.010841.Thus, the daily mortality rates were $2:94 Â 10 À5 and 2:97 Â 10 À5 , correspondingly.Consequently, we choose a uniform distribution with mean and standard deviation equal to those of the two rates, which leads to δ $ Unif 2:923 Â 10 À5 ,2:997 Â 10 À5 ð Þ .Now, for the prior distribution of Λ, we should consider both Italy's birth and net migration rates.Annual birth rates of 0.0727 and 0.07154 and net migration rates of 2.053 and 2.155 are observed for the 2021 and 2022.Hence, we result in daily inflow rates of 2:555 Â 10 À5 and 2:55 Â 10 À5 after the addition of the birth and migration rates.Following the same process as with δ, we select Λ $ Unif 2:546 Â 10 À5 ,2:559 Â 10 À5 ð Þ : These distributions are considered to remain constant throughout the entire period, due to the low variability of these rates.Thus, the prior and posterior distributions are the same (Table 2).
According to meta-analysis [52], the mean incubation period for COVID is 5.6 days, with a credible interval of 5.2-6 days.Thus, we choose a truncated normal distribution a $ tN( 1 5:6 , 1 56 , 1 6 , 1 5:2 Þ.The estimations for the remaining parameters are derived in a similar manner.In addition, the rates of re-susceptibility after infection ψ or vaccination v 2 are also held constant during the evolution of the filter's estimations.We utilize truncated normal distributions for their priors due to the uncertainties associated with each individual's immune system.Finally, for the breakthrough infection σ ð Þ probability, we select a uniform distribution of σ $ Unif (0.15, 0:4Þ [53].This parameter also remains constant, as the breakthrough infection probability usually increases 5-6 months after vaccination.
All parameter initializations are based on meta-analyses, aiming to achieve the most representative conditions for the examined phenomenon.Moreover, we set ω ¼ 1  14 , as-based on the instructions of Global Health Organization-the necessary quarantine duration for an infected individual with mild symptoms was 14 days, before the introduction of vaccinations.Finally, the literature for parameters like β,γ, and λ is limited.Thus, we selected interval estimations that maximize the likelihood of the data based on the model's estimates.Interval estimates are necessary because the parameters of the studied phenomenon are accompanied by uncertainty.Furthermore, there are no explicit statistical equations available in literature for these rates.Deriving these equations is challenging due to our system's numerous state transitions, which complicate the generation of these expressions.Additionally, in case explicit formulas are derived, it is important to assess their statistical properties to ensure that their estimates are unbiased, consistent, and reliable.
On the other hand, the likelihood function has been frequently used for parameter or even model determination.In our case, we choose parameter values that maximize the log-likelihood function.The formula for estimating the likelihood values using the particle filter can be found in [55].We choose uniform instead of normal distributions, as we have no prior information about the distributions of these parameters.We can employ a grid-search approach utilizing likelihood values derived from the formula outlined in [55] to identify the optimal prior distributions for the parametric triplet β, γ,λ ð Þ.This involves analyzing various combinations of prior distributions.Through the particle filtering methodology, each combination generates state and parameter trajectories according to Algorithm 2. By comparing these trajectories with existing records, we can pinpoint the combination yielding the maximum likelihood value, determining the best set of prior distributions.
In Figure 3, we present the produced scaled log-likelihoods for several parameter combinations.The scaling has been implemented by dividing with the absolute value of the maximum generated log-likelihood.The diagrams display two different angles of the log-likelihood grid, only for combinations of γ and λ for β $ Unif 0:5, 1:5 ð Þto facilitate visual inspection.This representation shows that the maximum scaled log-likelihood corresponds to the pair of central points 0:0015, 0:025 ð Þ , which is associated with the distributions γ $ Unif 0:001, 0:002 ð Þand λ $ Unif 0:02,0:03 ð Þ .More interval estimates for the β parameter have been tested, although they provided reduced log-likelihood values.For the remaining parameters, we employ truncated normal distributions, selecting standard deviations that produce coefficients of variation of 10%.
At this point, we should mention that the ψ and v₂ rates are closely related to the selected k, c ð Þ pairs, which describe the average period after which acquired immunity significantly diminishes.As we mention in Section 2.2, for the correction of the time series of currently protected cases, we have chosen k, c ð Þ¼ 180, 30 ð Þ, signifying that immunity from vaccination primarily lasts 180 ± 30 days.To describe this mean tendency along with the parameter uncertainty arising from variations in human immune systems, we employ a truncated normal distribution with a mean daily transition rate of 1  180 to the susceptible state, and bounds 1 210 and 1 150 for the prior distribution of v₂.As a result, we have v₂ $ tN 1 180 , 1 1800 , 1 210 , 1 150 À Á .A similar approach is followed for determining the prior of ψ.From Table 2, it becomes clear that we have three cases of parameters.The time-independent (constant) with initial uncertainty Λ, δ,a, σ, ψ, and v 2 , the time-varying (initially) point estimations ω and v 1 , and the time-varying estimations with initial uncertainty β, γ, λ, κ, μ, τ, and ρ.For ω and v 1 , we provide the same initial value for all N p particles.Α detailed distinction of the abovementioned categories is presented in Table 3.We notice that parameters Λ, δ, a,σ, ψ, and v 2 are held constant due to their low variability during the pandemic period.A dynamic estimation process for these terms increases the model's computational complexity without enhancing the quality of our results.Consequently, these rates do not participate in the estimation process.Hence, we obtain the same prior and posterior F I G U R E 3 Scaled likelihood values for 144 combinations of the κ and λ parameters.The x and y axis show the central points of the selected uniform distributions for the γ and λ, respectively, while the z axis displays the scaled log-likelihood values.The selected central points are 0:0005,0:001, …,0:006 f g for γ and 0:015,0:02,…,0:07 f g for λ: The light red color corresponds to the maximum log-likelihood value.
[Colour figure can be viewed at wileyonlinelibrary.com] distributions (Table 2).Finally, the 99% credible intervals shown in Table 2 are derived from the empirical distributions provided by the N p particles.

| Examination of the model's efficiency
In Figure 4, we display the real-time observations of infectious, hospitalized, ICU-admitted, deceased, and vaccinated cases accompanied by the respective state estimates produced by the stochastic extended SEIRS model.The model's predictions appear to adequately capture the dynamics of the pandemic even over a long period of 500 days.More specifically, the estimations provided for hospitalized, ICU-admitted, and deceased cases are quite close to the reported data, which is particularly important as these three time series represent the most accurate available records.Furthermore, the estimations for the vaccinated cases fit quite precisely with the corresponding corrected observations derived from the implementation of Algorithm 1.
A highly important finding of our analysis is the estimation of more currently infected cases compared to the corresponding daily records.This phenomenon can be attributed to asymptomatic infections that characterize COVID-19.The reported data include mainly the number of symptomatic infections.Considering that the particle filter provides the true state estimations (symptomatic and asymptomatic), we can estimate the percentage of asymptomatic cases at each time step by subtracting from unity, the ratio of reported to the estimated from the particle filter infective cases.It is important to underline that hospitalized cases are a proportion of the estimated numbers of infective individuals.As stated above, the model estimates more infective cases than the reported ones.In contrast, the number of hospitalizations shows a satisfactory fit with the official data.Official hospitalization numbers can be considered one of the system's most accurate indicators, as typically only patients with severe and distinguishable symptoms enter the hospital.Interestingly, the model does not show the same overestimation for hospitalizations as it does for the number of infected cases, even though these quantities are closely related.This phenomenon enhances the robustness of the produced state estimations, while justifies the inclusion of asymptomatic cases in the system's estimates.
As observed in Figure 5, the 95% credible interval for the percentage of asymptomatic infections is 20.88% to 50.43%.According to the meta-analysis [39], among 18 antibody testing studies, six used random selection of participants to achieve a representative sample of their target population.In these six studies, the percentage of people who tested positive but did not report symptoms ranged from 21.7% to 47.3%.These results lend further significant validity to our model.A noteworthy fact is that, in contrast to other epidemiological models in the literature that employ extra compartments for asymptomatic cases-forcing the estimation of this percentage and significantly increasing the model's complexity-our approach reveals the hidden dynamics of the pandemic without any additional states or computational burden.
Another notable result of our analysis is derived from the exploration of the model's estimates for the number of individuals admitted to ICUs.During the most prevalent infection wave in January 2022, we observe an estimation of 70-148 more individuals in ICU, emerging during the peak of the wave.As mentioned above, this infection wave represents the most intense dispersion of the virus so far, while its duration is shorter compared to the two previous waves of the examined period.Undoubtedly, this combination exerted intense pressure on the national health system, leading to overcrowding in ICUs.As a result, we consider our model to be diagnostic of this phenomenon, providing an estimation of the number of individuals who remained out of the ICU due to overcrowding.
At the same time, we notice a slight overestimation of deceased cases.The first and third quantiles of this overestimation are 0.102% and 0.592%, respectively.This slight deviation between records and observations could be attributed to underreporting of deaths, especially in small villages or isolated areas.In summary, according to the aforementioned comments, our model reveals crucial hidden dynamics of the pandemic by providing new evidence to the literature while validating its fitting capacity.In Figure 6, we present the time-varying parameters of the model during 500 days of the pandemic in Italy.This figure shows the weighted means of the parameters accompanied by the respective 50% credible intervals derived from the empirical distribution of the particles at each time step.First, the transmissibility rate β ð Þ appears to adequately track the emergence of infection waves.Its value initially drops and then increases after the first months.After that, it shows a small decreasing tendency until the onset of the most prevalent infection wave of January 2022, where a new increase is observed.In addition, we encounter a notable overall declining trend in hospitalization γ ð Þ and ICU admission rates λ ð Þ.We only observe a small increase in these rates around December 2021 and January 2022, which can be justified by the onset of the respective infection wave.This phenomenon leads to the conclusion that despite the noteworthy increase in transmissibility, the high vaccination levels kept the number of serious infections at relatively manageable levels.
In parallel, the recovery rates of both hospital κ ð Þ and ICU-admitted cases τ ð Þ are characterized by an increasing trend.The only period where we encounter a slight decrease in these rates coincides with the period of December 2021-January 2022, during the most prevalent infection wave.On the other hand, the respective mortality rates μ ð Þ and ρ ð Þ show the exact opposite behavior, with a decreasing trend during the 500 days except for the period around January 2022.Additionally, the mortality trend of hospitalized cases is decreasing much more intensively than the mortality of ICU-admitted cases.This observation seems quite reasonable, as even after vaccination, the immune system of ICUadmitted patients is significantly weakened, leading to less reduced levels of mortality.The abovementioned tendencies seem to be quite encouraging for our approach as they align with the nature of the examined phenomenon.
In Appendix B, we display the empirical prior and posterior distributions derived after t ¼ 500 days for the seven abovementioned parameters.We use red and blue color to represent the prior and posterior distributions (Table 2), respectively.The posterior histograms show concentration of the particles around the weighted means.Moreover, the filter's particles are gathered around lower values compared to the initial ones, regarding parameters γ, λ, μ,ρ, which is compatible with the aforementioned comments concerning the reduced severity of the disease.The opposite behavior is encountered in the case of κ and τ: Once again, these rates show an increasing tendency as we move to future time steps, which is associated with rising recovery trends.
The shape of the displayed posterior distributions shows either a symmetrical or slightly skewed form.As a result, we consider that the filter successfully identifies the true system's parameters since its particles converged to unimodal distributions around specific regions of the parametric space.The prior and posterior shapes of the β, γ and λ parameters differ, as we have selected uniform distributions for their priors.This choice is quite common in cases where we have parameters for which information about their initial levels is limited.On the other hand, after the application of the filter, the particles are concentrated in specific regions of the parametric space providing these symmetric or slightly skewed forms.Moreover, the normality of the posterior distributions has been checked according to the Shapiro-Wilk statistical test, where the distributions of β, γ, and μ satisfy this assumption (p-value >0.05).Although, it should be emphasized that the existence of only Gaussian or symmetric distributions is not necessary, while the ability to describe distributions with alternative shapes signifies one of the main advantages of the particle filter methodology compared to other filtering attempts, like the Kalman.
In Table 4, we present the NRMSE values [57] for the fitting ability of five models, with the last row containing the normalized errors of the proposed model.We compare these five models based on the recorded hospitalized, ICUadmitted, deceased, and vaccinated cases.We believe that these four observable states can provide valid comparisons since their daily records are characterized by high accuracy due to their nature.On the other hand, infectious and recovered cases should not be considered as reliable indicators due to the existence of notable proportions of asymptomatic cases.We also believe that the records of daily ICU admissions, except for the period of the third infection wave that caused overcrowding in ICUs, can be considered trustworthy for including this state in the comparisons.
Our model provides the best fitting results, displaying the least NRMSE values compared to the four remaining models, while the deterministic approach shows by far the worst estimations.We also compare our model with other models that employ the usage of the EKF, as this is the most popular stochastic approach in the literature for modeling epidemic outbreaks [25][26][27].The SEIRD model is one of the most frequently used models to describe an epidemic, containing the most common observable states.Also, as we mention in the introduction, the authors in [25] proposed an EKF-SEIRD model with parameter estimation for the evolution of COVID.As a result, we compared our approach with a state-of-art model that approximates the proposed methodology as closely as possible.Moreover, in the Appendix C, we display the predictions of the model for 25 days ahead.These predictions are compatible with the above comments and confirm that the presented method can be used to accurately estimate the evolution of a pandemic, especially for the foreseeable future.More specifically, the filter displays an estimation of more infected cases compared to the reported ones, considering the impact of asymptomatic infections.The prediction for the hospitalized, ICU admitted, deceased, and vaccinated cases remain consistent with the estimates of the fitting process.Similarly, there is a small deviation between the estimated and reported ICU admitted patients, the interpretation of which is previously discussed during the inspection of the fitting results.Furthermore, the minor deviation, which was encountered for the deceased cases, reduced even more during the prediction phase.Thus, it should be considered that the presented statistical methodology shows reliable predictions for the future of the pandemic.However, long-term predictions should be conducted with great care due to the variability of exogenous factors, which influence significantly the elements of the disease's progression.

| DISCUSSION
Over the years, various epidemics have spread around the world, causing severe pressure on national health systems.Timely and accurate understanding of the behavior of an infectious disease, along with effective predictions, can provide remarkable benefits in addressing the severe drawbacks that affect public health and the overall socioeconomic environment.Typically, the mathematical modeling of infectious diseases relies on deterministic approaches, which are often based on small extensions of the conventional SIR epidemiological model.
However, deterministic approaches fail to handle the complex dynamics of epidemics like COVID-19 due to the significant fluctuations that accompany them.For instance, the introduction of restrictive measures such as lockdowns or the systematic emergence of variants can decrease or increase the transmissibility of the virus in the population.At the same time, uncertainties in reported data related to asymptomatic infections, errors, and delays in reporting, along with the time-varying evolution of epidemics, lead to the employment of more representative stochastic approaches.Furthermore, existing compartmental models are not perfect; an attempt to encapsulate all possible hidden states and transitions of an epidemic would inevitably lead to a complex model, making the process computationally expensive and the fitting/predictive performance untrustworthy.Finally, stochastic modeling is valuable for phenomena where the epidemic outcome depends highly on the variability in demography, environment, or disease transmission [58].
In this paper, we present a novel extended SEIRS model with repeated vaccination, accompanied by a sequential importance sampling particle filter with dynamic parameter estimation.Using this model, we aim to address all the mentioned limitations of deterministic compartmental models while significantly enhancing their predictive capacity.We extend the standard SEIRS model by increasing the number of differential equations in the system to eight, taking into consideration the populations of hospitalized and ICU-admitted cases, deceased individuals, and vaccinated cases.Additionally, the model accounts for repeated vaccination, which is necessary for investigating COVID-19 over long time periods or other epidemics with similar behavior.Through the utilization of particle filtering, we incorporate all available information on COVID-19 in Italy, significantly improving the accuracy of our estimations even for 525 days of the pandemic.With dynamic parameter estimation, we avoid the need to include different transition rates for vaccinated and unvaccinated individuals (which would require adding seven to nine extra parameters).Hence, we reduce the possibility of overfitting and provide additional confirmation of the reliability of our model through the inference of parameter evolution.
The additional states we have included to extend the standard SEIRS model have been carefully selected to encompass only states associated with available records.In addition to the susceptible and exposed states, which are always included in epidemic models, the remaining six states are accompanied by daily records that we efficiently utilize, providing supplementary information to our stochastic epidemiological model that significantly reinforces its estimation capability.The inclusion of the hospital and ICU-admitted states in the model has an additional advantage.In the case of COVID-19, which is known for its notable rates of asymptomatic individuals, the currently infected and recovered reported cases could be considered as low-accuracy indicators of the evolution of the pandemic.Instead, daily hospital and ICU admission records can be considered the most reliable information to validate the predictive ability of our approach.
We also present a novel statistical algorithm to estimate the number of currently recovered and vaccinated cases, using the time series of total recoveries and vaccinations.The algorithm utilizes weights generated from normal distributions to provide a correction for the aforementioned quantities.This requirement has its roots in the attenuation of antibodies produced by vaccination or infection after long time periods.This preprocessing step appears necessary for diseases like influenza, measles, yellow fever, and varicella, where antibodies produced from either infection or vaccination gradually weaken.Special attention should be given to the k,c ð Þ pair as it plays a central role in estimating the current numbers of recovered and vaccinated cases.We have opted for k, c ð Þ values of (240, 60) and (180, 30) for recovered and vaccinated cases, respectively, based on analyses [43,44].These specific k, c ð Þ choices showcase an inverse connection with the ψ and v 2 rates, representing immunity termination rates for previously infected or vaccinated individuals.These parameter values are specific to the disease at hand, and we suggest determining them through meta-analysis to ensure a more robust selection process.
The main findings of this study include the estimation of more currently infected cases, which can be attributed to the asymptomatic infections that characterize the spread of COVID-19.The 95% credible interval for the percentage of asymptomatic patients ranges between 20.88% and 50.43%.As we mentioned in the introduction, this is also supported by the literature, which notably strengthens the validity of our model.We should emphasize that the cases requiring hospitalization represent a fraction of the estimated infective individuals.As we have mentioned in the Section 3, the model tends to estimate a higher number of infective cases compared to the reported figures.Official hospitalization statistics are particularly reliable because they typically involve patients with severe and easily distinguishable symptoms.Consequently, when it comes to hospitalizations, the model exhibits a good alignment with the daily records.Thus, the model does not display estimations of more hospitalizations as it does for the number of infected cases, even though these two quantities are closely interconnected, which enhances the reliability of the resulting state estimations.
In contrast to other models in the literature that employ additional states for asymptomatic cases-forcing the estimation of this ratio and increasing the complexity of the model-our approach reveals the hidden dynamics of the pandemic without considering any additional states that would invoke additional computational costs.Furthermore, the prevailing wave of January 2022 in Italy is notable not only for the intense transmissibility of COVID-19 in the population but also for the short time it took to reach its peak.This combination put increased pressure on the national healthcare system, possibly leading to overcrowding in ICUs.Thus, our model diagnoses this situation and provides an estimation of individuals who were unable to access ICUs due to bed unavailability.Based on the above and in combination with the investigation of the evolution of the hidden parameters that follow the recorded data quite effectively, our method verifies its fitting capacity and appropriateness in describing the challenging dynamics of epidemics.
Other studies like [59] suggest using epidemic models with additive stochastic noise.In our case, stochasticity is incorporated into the transitions of hidden states for each time step, where terms (d IJ Þ t are derived using multinomial formulas (shown in ( 9)-( 15)) and generated using the rmultinorm(.)functions in R programming language.This process allows us to generate a sample of potential trajectories for each time step, from which the most suitable ones are determined based on the observation density function.Ultimately, the stochastic nature of particle filtering emerges from the inherent uncertainty in system dynamics and measurements, along with the probabilistic elements involved in sampling, updating, and resampling steps used to estimate the system's state over time.This particle filtering methodology presents an advanced stochastic approach, not limited to numerical examples, but focusing on an efficient framework for accurately predicting the future state of the disease.
Moreover, after the first transient phase ($30 days), which is often considered as a burn-in period, our model efficiently tracks the evolution of accurate indicators, such as reported hospital and ICU admissions, deaths, and vaccinations, while it reveals hidden dynamics and behaviors related to the nature of the epidemic event.The exploration of the hidden parameters' evolution not only provides valuable insights into the severity and future of the pandemic but also verifies the adequacy and robustness of our stochastic methodology despite the challenging, fluctuating features of the phenomenon under consideration.
The parameter initialization was based on trustworthy meta-analyses.For parameters where we had no prior information about their distribution, we determined their initial values using maximum likelihood estimations.The infection rate follows the onset of the infection waves during the 500 days, while parameters of interest like γ, λ, μ, and ρ display decreasing tendencies as we progress through the vaccination period.The only noteworthy elevations in their values are justified due to the wave of December 2021-January 2022, where after its attenuation, they continue their downward trend.By observing Figure 6, the parameter for hospital admissions (γ) shows the sharpest decrease, a phenomenon that is also consistent.Recovery rates after hospital or ICU admission (κ and τ) exhibit increasing trends, while some minor decreases in their magnitude are related to the aforementioned infection wave and last for a transient period.Finally, based on Figure 4, we observe a slight overestimation in deceased cases of 0.25% to 0.89%.This deviation may signify the existence of regional underreporting of deaths, although we cannot be absolutely certain of this claim due to the absence of strong evidence in the literature.
According to the results in Table 4, the model with constant parameters in the epidemic particle filter exhibits significantly higher NRMSE values for hospitalized (464.5%),ICU admitted (181.2%), and deceased (254.1%)cases compared to our proposed model with time-varying parameters.Similarly, the EKF-SEIHCRDV model shows greater NRMSEs for these same states, with increases of 92.5%, 9.1%, and 222.2%, respectively.Lastly, the state-of-the-art EKF-SEIRD model, which incorporates dynamic parameter estimation, demonstrates a 566.3% greater NRMSE for deceased cases when compared to our stochastic extended SEIRS model with repeated vaccination.
Our approach supports the effectiveness of complete vaccination against COVID, as the dynamically estimated recovery rates showed an increasing trend as we progressed deeper into the vaccination period.On the other hand, the hospital and ICU admission rates, along with the corresponding mortality tendencies, present a decreasing behavior, especially before and after the third-most prevalent-infection wave.This conclusion is in agreement with many publications that emphasize the importance of the vaccination campaign, resulting in lower mortality and hospitalization rates.For example, the authors in [60] mention that the vaccination campaign resulted in a 63% reduction in expected total deaths, while [61] state that non-ICU hospitalizations and ICU hospitalizations were reduced by 63.5% and 65.6%, respectively, over a period of 300 days in the United States.The authors in [62] declare that the vaccination efficacy in protecting against deaths was 72%, with a lower reduction in the number of deaths for B.1.1.7 versus non-B.1.1.7 variants (70% and 78%, respectively).Finally, according to [63], the mortality rates in counties with low (10%-39%), medium (40%-69%), and high (≥70%) percentages of adults (≥18 years) who had received at least one dose of vaccine in the first half of 2021 decreased by 60%, 75%, and 81%, respectively.
Theoretical analysis based on the proposed extended SEIRS model with repeated vaccination reveals valuable insights into the pandemic's spread.Unlike other studies, this article emphasizes both the mathematical properties and the predictive efficacy of the proposed stochastic epidemiological methodology.More specifically, the statistical method has the potential to guide decision-making on the short-term course of the pandemic, generating reliable forecasts, while the stability analysis sheds light on the pandemic's long-term trends, raising awareness about this public health issue.When combined, both analyses offer valuable and distinct perspectives on the pandemic.The derived formula for R 0 offers a more representative index for pandemic evolution and assists in investigating the existence and global stability of epidemic equilibria.In light of the current situation, where limiting virus transmissibility remains challenging for the foreseeable future, our focus should be on reducing the incidence of serious infections and mortality.Undoubtedly, systematic and timely vaccination of the population plays a crucial role in achieving this goal, a conclusion reinforced by our analysis.
Inevitably, every analysis comes with limitations.For instance, epidemiological models usually consider a homogeneous population.In contrast to deterministic schemes, stochastic methodologies such as the Kalman and particle filters address this issue to some extent by introducing noise/uncertainty.This uncertainty accounts for the demographic and environmental variability of the population and can be quantified based on empirical distributions.However, extended versions considering categories derived from demographic characteristics like age or gender could be interesting.In these situations, the epidemic model's complexity notably rises, posing an issue in conjunction with the heightened computational demands of particle filters.Thus, a more simplistic stochastic methodology like Kalman filters can be useful.The issue of determining explicit statistical expressions for the prior parameter determination serves as an additional limitation and objective for future work.The construction of explicit formulas for this issue, accompanied by statistical inference to validate the quality of these predictors, would provide interesting insights into the proposed epidemic model.Finally, long-term predictions should be conducted with great care due to the variability of exogenous factors, which may lead to a modification of the parameters that determine the pandemic's evolution.

| CONCLUSIONS
In this paper, we introduce a stochastic particle filtering extended SEIRS model with repeated vaccination and timedependent parameters designed to effectively capture the dynamic nature of epidemics with time-varying characteristics.We evaluate the model's performance using daily COVID-19 data from Italy spanning a 525-day period, and our results reveal interesting insights into the pandemic's hidden dynamics.It is worth noting that our particle filter methodology generates the full posterior empirical distribution for the considered states, rather than just estimating their first moments.For our analysis, we utilize official COVID-19 data from the datasets provided by the European Union and Oxford University (as described in Section 2.3), which are widely employed in other research studies.
The main findings include the revelation of the proportion of asymptomatic cases, a characteristic feature of COVID-19 according to the literature.Additionally, the model's reliability is underscored by the examination of the time-dependent parameter evolution and the slight overestimation of ICU cases during the most prevalent wave in January 2022.The incorporation of time-varying parameters significantly enhances the model's fitting and predictive capabilities, increasing its flexibility for describing the pandemic.However, it is important to note that even the best models should be used with caution when making long-term predictions.
Furthermore, we introduce a statistical algorithm to estimate the number of currently recovered and vaccineprotected cases.This is necessary because antibody levels generated by vaccination or infection diminish over time, resulting in discrepancies between currently recovered and vaccinated cases and the total numbers of recoveries and vaccinations.Additionally, we place special emphasis not only on investigating the model's predictions but also on exploring mathematical properties, including the existence and global stability of epidemic equilibria.These properties are analyzed based on the derived formula for the R 0 index obtained from the extended SEIRS system.
As part of future work, we believe it would be intriguing to explore the application of a Kalman Filter with nonnegativity constraints [64], Tobit-Kalman filtering [65], or neural networks [66], as they may offer advantages specific to epidemic modeling.Additionally, investigating disease dynamics using alternative stochastic approaches, such as discrete or continuous-time Markov chains, holds great promise [67][68][69].Employing Markov processes based on the extended SEIRS model we have proposed, could allow for the examination of stochastic properties, including the expected time until hospital admission for a susceptible individual, the average duration of an infection wave, and the anticipated number of deceased cases during a wave, among others.These avenues represent potential directions for future research.
The model we have presented can readily be applied to other existing or emerging epidemic outbreaks involving repeated vaccinations, such as Ebola, influenza, and others.This is because the states and transitions we have proposed are generally applicable to many epidemic scenarios.Additionally, this methodology is well-suited for situations where the dynamics of the system and the available observations are insufficient to accurately capture the progression of an epidemic due to inherent uncertainties.In such cases, it serves as a valuable tool for unveiling the hidden dynamics of epidemics.

APPENDIX A
Proof of Theorem 1.We base our analysis on the utilization of the next-generation matrix proposed in [70].Recall that the coefficient R₀ is defined as the average number of new infections produced by another already infectious individual [71,72].Let Χ ¼ E, I,H,C ð Þ T be the vector that contains the four states displaying the infected cases of the system, and Υ 0 be the DFE, where all individuals are gathered in the susceptible and vaccinated states.We set Χ 0 ¼ 0 T , as there should be no cases in any of the states representing infection during the disease-free period.Let F X ð Þ and Ѵ X ð Þ be the 4 Â 1 vectors, whose i-th entry exhibits the rate of new infections entering state i and the transfer rate of individuals out of state i or the transfer into state i by non-infection means, respectively.We do not consider the transition of an individual between the four states as new infection but rather the progression of an infected individual through the various states of the system.Thus, we have where The next-generation matrix is defined as the matrix product FV À1 where matrices F, V are the Jacobians of F X ð Þ, Ѵ X ð Þ evaluated at the DFE Y 0 (5).Hence, both F and V are 4 Â 4 dimensional matrices of the form and while V is an invertible lower triangular matrix.We note that since F is evaluated on Y 0 , the total population N is also constant at the equilibrium.Then, we obtain where ρ Α ð Þ stands for the spectral radius of matrix A: By substituting the DFE values in equation (A4), we obtain □ Proof of Theorem 2. For the purposes of this proposition, we utilize the LaSalle's invariance principle, where we choose a Lyapunov function L that is positive semidefinite in the feasible region ð Þ > 0, while for the sake of simplicity we drop the t ð Þ notation from the system's states.As for the derivate, we have where and where Q 2 I ð Þ is a polynomial matrix.We note that on the right-hand side of the second equation of (A7), it holds that leading to five negative eigenvalues, namely, For the quadratic polynomial in (A10), we apply the 2nd-order Routh-Hurwitz criterion, where the roots of the quadratic polynomial, lay on the left-hand side of the complex plane when coefficients and consequently, using (A4) we obtain The eigenvalues of Q 2 I ð Þ are all non-positive due to I t ð Þ being non-negative, 8t ≥ 0: Also, the eigenvalues of Q 1 are all negative if and only if R 0 < 1: According to the above observations we get that X T Q 1 X < 0 and X T Q 2 X ≤ 0; though Q 1 and Q 2 are not necessarily symmetric, the quadratic forms X respectively, defined by symmetric matrices (since QþQ T 2 is always symmetric).Finally, dL dt ¼ X T QX < 0 if and only if R 0 < 1, and Theorem 2 is proven.
Þ T be the endemic equilibrium of the model, which can be determined by setting all differential equations of the system equal to zero.We may limit the analysis to seven equations, as D t ð Þ does not contribute to any other equation in System (1).We explore the existence and uniqueness of a non-trivial endemic equilibrium, as Y 0 obviously constitutes the DFE of (1).
The equilibrium Υ Ã satisfies the system We aim to describe the components of Υ Ã as functions of Ι Ã , where We emphasize that c 1 0, γþω ψþδ ; this statement can be proved easily, by rewriting the c 1 in (A16) as We state that the displayed parameters are positive real values.
Substitution of relations (A13), (A17), and (A18) into the second equation of system (A12), leads to the cubic equation-without constant coefficient- which has one zero and two non-zero solutions.The respective quadratic equation derived from (A20) to determine the aforementioned two solutions is Using Vieta's formulas, we determine the sign of the solutions of (A21).Let I Ã 1 and I Ã 2 be these solutions, with The denominator of (A22) is strictly negative as On the other hand, the numerator of (A7) is positive when thus, according to (A5), R 0 > 1: Consequently, if R 0 > 1, there are solutions with opposite sign, leading to a unique endemic equilibrium, as the negative solution cannot be accepted for the number of infectious cases.Based on relations (A13)-(A18) we can describe the remaining quantities of Y Ã .□ Proof of Theorem 4. We aim to prove the global asymptotic stability of the endemic equilibrium Y Ã through the LaSalle's invariance principle.In Theorem 3, we proved the existence of a unique endemic equilibrium if and only if R 0 > 1: Namely, we select the quadratic form Using the second equation of the extended SEIRS model evaluated on the endemic equilibrium we culminate in through equation (A12).We note that Ι Ã cannot be zero as we refer to the case of the endemic equilibrium, corresponding to a phase where the prevalence of the disease is still evident.Subsequently, by taking the derivative of L (A23) with respect to time, we have that where PAPAGEORGIOU and TSAKLIDIS and 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 We note that on the right-hand side of the third equation of (A26) it is true that The eigenvalues of P 2 are all non-positive since I t ð Þ is non-negative 8t ≥ 0: Furthermore, all the eigenvalues of matrix P 1 are negative.According to the above observations, we can conclude that X T P 1 X < 0 and X T P 2 X ≤ 0, leading to dL dt ¼ X T PX < 0, if and only if R 0 > 1. □ Proof of Theorem 5. To find the convergence rate of the proposed epidemiological model to the DFE, we need to determine the value of the parameter γ > 0 that satisfies the inequality where L represents the aforementioned non-negative Lyapunov function.This determination will yield the system's convergence rate, which is equal to γ 2 : Namely, we are searching for the ideal value of γ that satisfies equation (A29).By substituting the formulas for L and its derivate with respect to time, we get leading to the fact that the matrix Q þ γ 2 Ι must be negative semidefinite.For R 0 < 1, the eigenvalues λ i of Q are all negative.Our goal is to determine the value of γ, for which the eigenvalues λ i þ γ 2 of the Q þ γ 2 Ι matrix are also negative.Hence, we conclude that λ i þ γ 2 ≤ 0 , γ ≤ À 2λ i for i ¼ 1, …, 7, which leads to the selection of γ ¼ À2λ max ¼ 2 min j λ i j > 0, i ¼ 1, …,7.This validates that the convergence rate to DFE is equal to Àλ max > 0. In a similar manner, when R 0 > 1 the convergence rate of the epidemiological model to the endemic equilibrium is based on the positive equivalent of the maximum eigenvalue of matrix P: □

APPENDIX B
In Figure B1, shown below, we present the prior and posterior distributions of the seven time-varying parameters estimated using the sequential importance sampling particle filter methodology.The prior distributions are depicted in red, while the posterior distributions are represented in blue.Moreover, we compute the skewness of the generated posteriors and check their normality using the Shapiro-Wilk statistical test (Table B1).

F
I G U R E 2 Diagrammatic representation of the particle filter's operation (Algorithm 2).[Colour figure can be viewed at wileyonlinelibrary.com]

F
I G U R E 4 Daily COVID-19 records and the respective state estimations for the (a) infectious, (b) hospitalized, (c) ICU admitted, (d) currently recovered, (e) deceased, and (f) vaccinated cases in Italy.The black dots represent the real-time data, the red line the model's estimations, and the blue areas correspond to the 99% credible intervals.[Colour figure can be viewed at wileyonlinelibrary.com]F I G U R E 5 Percentage of asymptomatic cases in Italy for 500 days.[Colour figure can be viewed at wileyonlinelibrary.com]

F I G U R E 6
Evolution of the stochastic model's time-dependent parameters for 500 days of pandemic in Italy, accompanied by the corresponding 50% credible intervals.[Colour figure can be viewed at wileyonlinelibrary.com] Figure C1 displays the predictions of the stochastic extended SEIRS model for the next 25 days.The red regions in the figure represent the prediction period for infectious, hospitalized, ICU admitted, recovered, deceased, and vaccinated

F I G U R E C 1
Prediction of (A) infectious, (B) hospitalized, (C) ICU admitted, (D) vaccinated, (E) recovered, and (F) deceased cases in Italy for 25 days ahead.The light red areas show the prediction period and the black diamonds the recorded data.[Colour figure can be viewed at wileyonlinelibrary.com] Parameter and state definition of the proposed SEIHCRDV model.
T A B L E 1 T A B L E 2 Prior and posterior empirical distributions for the parameters and states of the model.Unif stands for uniform distribution, Pois for Poisson, and tN for truncated normal distribution with tN mean,sd,min, max ð Þ .Concerning the posterior distribution of the states, we present the median value, while for the parameters, we display the weighted means.State distributions refer to number of cases that exist at each state on a daily basis, while the distributions of the parameters display daily transition rates.
NRMSEs of the deterministic SEIHCRDV, EKF-SEIRD and EKF-SEIHCRDV with dynamic parameter estimation and particle filtering with and without daily parameter estimation based on COVID-19 data in Italy.