A mathematical model for COVID‐19 pandemic—SIIR model: Effects of asymptomatic individuals

Abstract A new mathematical model called SIIR model is constructed to describe the spread of infection by taking account of the characteristics of COVID‐19 and is verified by the data from Japan. The following features of COVID‐19: (a) there exist presymptomatic individuals who have infectivity even during the incubation period, (b) there exist asymptomatic individuals who can freely move around and play crucial roles in the spread of infection, and (c) the duration of immunity may be finite, are incorporated into the SIIR model. The SIIR model has the advantage of being able to explicitly handle asymptomatic individuals who are delayed in discovery or are extremely difficult to be discovered in the real world. It is shown that the conditions for herd immunity in the SIIR model become more severe than those in the SIR model; that is, the presence of asymptomatic individuals increases herd immunity threshold (HIT).


| INTRODUCTION
The history of mankind is a history of fighting against infectious diseases, such as smallpox, polio, plague, flu, AIDS, SARS, and MERS.
With the current spread of COVID-19, human beings continue to suffer from pandemics, which bring the danger of collapse of medical, social, and economic systems.
Mathematical models have long been used to assess the spread of infectious diseases. As infectious diseases change and become more complex, mathematical models have also evolved.
Among these models, the SIR (susceptible-infectious-recovered [removed]) model, which has three transitioning stages, susceptible, infectious, and recovered (and removed from the infectious network), has been widely used for discussing the spreading process of infectious diseases. In addition, the SEIR model, which divides the infection stage of the SIR model into two stages, noninfectious (exposed) and infectious, and investigates the four transitioning stages, has also been widely used [1-9]. Since the mechanism of the spread of COVID-19 consists of direct and indirect contacts between infected and susceptible individuals, the conventional models would seem to explain the phenomena sufficiently if COVID-19 were regarded as the same as conventional infectious diseases. However, unique features of COVID-19 should be taken into account. These features are (a) there exist presymptomatic individuals who are infectious even during the incubation period, (b) there exist asymptomatic individuals who can move around freely and spread infection, making it difficult to control the situation, and (c) the duration of immunity may be finite, so that the susceptible population is replenished; that is, it may take a long time to achieve herd immunity and several peaks of infection may occur.
In the next section, the SIIR model is introduced to explain the mechanism of COVID-19 infection spread, which cannot be fully explained by conventional mathematical models such as the SIR and SEIR models. In Section 3, the SIIR model is verified against data from Japan to show that the model is suitable. Discussions are given in the last section.

| Model equations
One of the features of COVID-19 is that there is an incubation period after infection, and presymptomatic carriers can spread the infection during this period. Further, even though symptomatic carriers and those who are confirmed to be infected are quarantined, there are also asymptomatic individuals who do not develop symptoms after the end of the incubation period. This makes the spread of infection difficult to control because asymptomatic individuals are left unchecked without isolation. COVID-19 is thus characterized by two different infectious states (presymptomatic and asymptomatic), and it has been pointed out that the antibody duration is not so long [10]. These features are not considered in the SIR or SEIR models and, therefore, a model that takes such characteristics of COVID-19 into account is needed.
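Here, the populations considered in the SIIR model at time $t$ are $S(t)$, the susceptible population; $I_1(t)$, the presymptomatic population (infectious); $I_2(t)$, the asymptomatic population (infectious); $R_1(t)$, the symptomatic population (not infectious because of quarantine); $R_2(t)$, the recovered population (with antibody and not infectious); and $R_3(t)$, the fatalities because of COVID-19 (not infectious). The interrelationship among these variables is shown in Figure 1 and is described by the following coupled differential equations:

\begin{align}
\frac{dS}{dt} &= -\beta S (I_1 + I_2) + d_1 R_2, \tag{1}\\
\frac{dI_1}{dt} &= \beta S (I_1 + I_2) - (b_1 + b_2) I_1, \tag{2}\\
\frac{dI_2}{dt} &= b_2 I_1 - c_1 I_2, \tag{3}\\
\frac{dR_1}{dt} &= b_1 I_1 - (c_2 + c_3) R_1, \tag{4}\\
\frac{dR_2}{dt} &= c_1 I_2 + c_2 R_1 - d_1 R_2, \tag{5}\\
\frac{dR_3}{dt} &= c_3 R_1, \tag{6}
\end{align}

which is called the SIIR model since there are two states in the infection stage, that is, $I_1$ and $I_2$.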
In the SIIR model, presymptomatic and asymptomatic individuals are at different stages, but neither group is quarantined and both are free to move around. Therefore, in order to show clearly that they are infectious, they are denoted by $I_1$ and $I_2$, respectively. On the other hand, symptomatic (quarantined) individuals, recovered individuals with antibody, and those who passed away are not infectious, so they are represented by $R$, that is, $R_1$, $R_2$, and $R_3$, respectively. The name SIIR is used to distinguish the model from the SIR model. The feature of the model is that the infectious and noninfectious states have internal structures.
The coefficient $\beta$ in the SIIR model corresponds to the probability with which susceptible individuals ($S$) are infected through contact with those who are infectious ($I_1 + I_2$) at either stage of infectivity, and it plays a role similar to the one in the SIR model. Presymptomatic individuals ($I_1$) bifurcate into symptomatic individuals ($R_1$), who are quarantined, and asymptomatic individuals ($I_2$), who are not quarantined, with the coefficients $b_1$ and $b_2$, respectively. The meanings of these coefficients are understood in the following way. Assuming that the presymptomatic population is evenly distributed over the incubation period $t_1$, the part of the presymptomatic population $I_1(t)$ that is at the end of the incubation period, denoted by $n(t_1)$, may be expressed as $n(t_1) = I_1(t)/t_1$. At the end of the incubation period, the $i$-th presymptomatic individual bifurcates into the symptomatic state with a probability $b_{1i}$ and into the asymptomatic state otherwise. On the other hand, assuming that symptomatic and asymptomatic individuals are equally distributed over the onset period $t_2$, they are given by $R_1(t)/t_2$ and $I_2(t)/t_2$, respectively, at the time of transition from incubation to onset. Since $I_1(t)$ bifurcates into $\Delta R_1(t)$ with $b_1$ and $\Delta I_2(t)$ with $b_2$, we have

$$b_1 + b_2 = \frac{1}{t_1}.$$

Factors that determine whether those who are infected become symptomatic or asymptomatic have not yet been identified. However, if health factors are not considered, $1/t_1$ is apportioned between $b_1$ and $b_2$.

FIGURE 1 Structure of the SIIR model: $S(t)$ as susceptible population, $I_1(t)$ as presymptomatic population (infectious), $I_2(t)$ as asymptomatic population (infectious), $R_1(t)$ as symptomatic population (quarantined and not infectious), $R_2(t)$ as recovered population (recovered with antibody and not infectious), and $R_3(t)$ as fatalities because of COVID-19 (not infectious)

The coefficients $c_1$, $c_2$, and $c_3$ are also related to the onset period. At the end of the onset period $t_2$, the relation $(c_2 + c_3) R_1(t) = R_1(t)/t_2$ holds at the transition from symptomatic to recovery with antibody or to death. For the transition from asymptomatic to recovery with antibody, the relation $c_1 I_2(t) = I_2(t)/t_2$ holds as long as there is no transition from asymptomatic to death.
Therefore, we have

$$c_1 = \frac{1}{t_2}, \qquad c_2 + c_3 = \frac{1}{t_2}.$$

That is, $c_1$ is the inverse of the onset period, and $1/t_2$ is apportioned between $c_2$ and $c_3$.
If recovered individuals $R_2$ lose their antibody at time $t$, they become susceptible ($S$) again, and the coefficient $d_1$ represents the inverse of the antibody duration.
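The relation between the durations and the coefficients can be made concrete with a small helper. This is only a sketch: the splitting fractions `p_symptomatic` and `p_fatal` are hypothetical names introduced here, since the text only states that $1/t_1$ is apportioned between $b_1$ and $b_2$ and $1/t_2$ between $c_2$ and $c_3$, and the numbers in the usage line are illustrative rather than fitted values.

```python
def siir_rates(t1, t2, p_symptomatic, p_fatal, antibody_duration=None):
    """Turn durations into SIIR coefficients.

    t1: incubation period, t2: onset period (same time unit, e.g. days).
    p_symptomatic, p_fatal: hypothetical splitting fractions in [0, 1].
    antibody_duration: None means lifelong immunity (d1 = 0).
    """
    if t1 <= 0 or t2 <= 0:
        raise ValueError("durations must be positive")
    if not (0.0 <= p_symptomatic <= 1.0 and 0.0 <= p_fatal <= 1.0):
        raise ValueError("fractions must lie in [0, 1]")
    return {
        "b1": p_symptomatic / t1,          # presymptomatic -> symptomatic
        "b2": (1.0 - p_symptomatic) / t1,  # presymptomatic -> asymptomatic
        "c1": 1.0 / t2,                    # asymptomatic -> recovered
        "c2": (1.0 - p_fatal) / t2,        # symptomatic -> recovered
        "c3": p_fatal / t2,                # symptomatic -> fatality
        "d1": 0.0 if antibody_duration is None else 1.0 / antibody_duration,
    }


# Illustrative values only, not taken from the paper.
rates = siir_rates(t1=5.0, t2=10.0, p_symptomatic=0.4, p_fatal=0.02)
```

By construction the returned values satisfy $b_1 + b_2 = 1/t_1$ and $c_2 + c_3 = c_1 = 1/t_2$.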
In this way, the SIIR model, in spite of being a continuous model, explicitly captures the duration of each stage. In addition, the SIIR model is a five-variable system because of the following conserved quantity:

$$S(t) + I_1(t) + I_2(t) + R_1(t) + R_2(t) + R_3(t) = N.$$

If the variables are normalized by the conserved quantity $N$, it is convenient to see the characteristic behaviors common to systems with different $N$. When the model is verified with data, the parameters and the scale conversion with respect to $N$ are determined so that the theoretical results best fit the data. For example, for the spread of infection in Japan, it does not make sense to ask how the entire population of Japan, 126 million susceptible individuals, is infected, because the infection cannot spread evenly throughout Japan. There are many high-risk areas, such as densely populated areas and workplaces, scattered throughout Japan. Since the spread of infection in each place is similar, combining the results of the normalized model over local infections such as cluster infections, community-acquired infections, and domestic infections may clarify the substance of the spread of the infection as a whole. This is the power of mathematical models. So, we normalize the variables in the SIIR model by $N$ and make the following replacements:

$$\frac{S}{N} \to S, \quad \frac{I_1}{N} \to I_1, \quad \frac{I_2}{N} \to I_2, \quad \frac{R_1}{N} \to R_1, \quad \frac{R_2}{N} \to R_2, \quad \frac{R_3}{N} \to R_3, \quad \beta N \to \beta.$$

By these replacements, Equations (1) to (6) representing the SIIR model keep the same form after normalization.

FIGURE 2 Single peak (upper row) and multiple peak (lower row) solutions. Upper and lower left: the susceptible population $S(t)$ (blue) and recovered population with antibody $R_2(t)$ (cyan). Upper and lower right: the presymptomatic population $I_1(t)$ (red), asymptomatic population $I_2(t)$ (green), symptomatic and quarantined population $R_1(t)$ (yellow), and fatalities $R_3(t)$ (black)

| Numerical solutions of SIIR model
Now, we look for numerical solutions of the SIIR model. As expected, the susceptible population $S(t)$ (blue) decreases and the presymptomatic population $I_1(t)$ (red) increases, which is followed by the growth of both the asymptomatic population $I_2(t)$ (green) and the symptomatic and quarantined population $R_1(t)$ (yellow). Then, the recovered population with antibody $R_2(t)$ (cyan) and the fatalities $R_3(t)$ (black) increase. In these solutions, the asymptomatic population (green) becomes larger than the symptomatic and quarantined population (yellow). In this way, the SIIR model has the definite advantage of handling the asymptomatic population, which is extremely difficult to detect in society and plays a crucial role in the spread of infection. Moreover, because $d_1 \neq 0$, recovered individuals return to the susceptible population and the infection can peak more than once. The onset period $t_2$ determines how the symptomatic and asymptomatic populations decline from the peak. The reason is clear from the terms $c_1 = 1/t_2$ and $c_2 + c_3 = 1/t_2$ of Equations (3) and (4). If $t_2$ becomes larger, the attenuation becomes gentler and the peak width becomes wider.
In the SIIR model, all recovered individuals have antibodies, but since the antibody duration is set as $1/d_1$, they become susceptible again over time; that is, small peaks will be repeated until complete termination of the infection.
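A numerical solution of this kind can be sketched with a classical fourth-order Runge-Kutta integrator using only the standard library. The compartment flows follow the description of the coefficients given earlier (infection by $I_1 + I_2$, bifurcation with $b_1$ and $b_2$, exits with $c_1$, $c_2$, $c_3$, and loss of antibody with $d_1$). The parameter values and initial condition are illustrative assumptions, not the values behind Figure 2, and the function names are introduced here for the example.

```python
def siir_rhs(y, p):
    """Right-hand side of the normalized SIIR system for y = [S, I1, I2, R1, R2, R3]."""
    S, I1, I2, R1, R2, R3 = y
    infection = p["beta"] * S * (I1 + I2)
    return [
        -infection + p["d1"] * R2,
        infection - (p["b1"] + p["b2"]) * I1,
        p["b2"] * I1 - p["c1"] * I2,
        p["b1"] * I1 - (p["c2"] + p["c3"]) * R1,
        p["c1"] * I2 + p["c2"] * R1 - p["d1"] * R2,
        p["c3"] * R1,
    ]


def rk4_step(y, dt, p):
    """One classical Runge-Kutta step of size dt."""
    k1 = siir_rhs(y, p)
    k2 = siir_rhs([a + 0.5 * dt * k for a, k in zip(y, k1)], p)
    k3 = siir_rhs([a + 0.5 * dt * k for a, k in zip(y, k2)], p)
    k4 = siir_rhs([a + dt * k for a, k in zip(y, k3)], p)
    return [a + dt / 6.0 * (q1 + 2.0 * q2 + 2.0 * q3 + q4)
            for a, q1, q2, q3, q4 in zip(y, k1, k2, k3, k4)]


def simulate(y0, p, days, steps_per_day=10):
    """Return the state at the end of each day, starting with day 0."""
    dt = 1.0 / steps_per_day
    y = list(y0)
    trajectory = [y]
    for _ in range(days):
        for _ in range(steps_per_day):
            y = rk4_step(y, dt, p)
        trajectory.append(y)
    return trajectory


# Illustrative parameters: t1 = 5 days, t2 = 10 days, d1 = 0 (single peak).
params = {"beta": 0.4, "b1": 0.08, "b2": 0.12,
          "c1": 0.1, "c2": 0.098, "c3": 0.002, "d1": 0.0}
y0 = [1.0 - 1e-4, 1e-4, 0.0, 0.0, 0.0, 0.0]
traj = simulate(y0, params, days=200)
```

Two checks are useful on such a run: the six components should keep summing to 1 (the conserved quantity after normalization), and with $I_2(0) = R_1(0) = 0$ the ratio $I_2/R_1$ should stay at $b_2/b_1$. Setting `d1` to a positive value in `params` is the way to explore repeated peaks.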

| Basic reproduction number and effective reproduction number for SIIR model
The basic reproduction number of the SIIR model is very different from that of the SIR model. In the SIR model, $\mathcal{R}_0 > 1$ is the condition for infection to spread. However, in Figure 2 for the SIIR model, infection still spreads even though $\mathcal{R}_0 < 1$. This is because the condition on $\mathcal{R}_0$ in the SIIR model is relaxed by the presence of asymptomatic individuals, which is one of the characteristics of the SIIR model. This is understood from the initial behavior of the SIIR model.
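Since $S(t) \simeq S_0$ can be used near $t \simeq 0$, Equations (2) and (3) are expressed as

$$\frac{d}{dt}\begin{pmatrix} I_1 \\ I_2 \end{pmatrix} = A \begin{pmatrix} I_1 \\ I_2 \end{pmatrix}, \qquad A = \begin{pmatrix} \beta S_0 - (b_1 + b_2) & \beta S_0 \\ b_2 & -c_1 \end{pmatrix}.$$

The eigenvalue $\lambda$ of $A$ is calculated as

$$\lambda = \frac{1}{2}\left[\beta S_0 - (b_1 + b_2) - c_1 \pm \sqrt{\left(\beta S_0 - (b_1 + b_2) + c_1\right)^2 + 4 \beta S_0 b_2}\,\right].$$

The larger eigenvalue is positive when $\beta S_0 t_1 (1 + b_2 t_2) > 1$, which is the condition of Equation (8). Equation (8) is also obtained based on Equation (10). Using a resolvent operator, Equation (3) has the following formal solution:

$$I_2(t) = I_2(0) e^{-c_1 t} + b_2 \int_0^t e^{-c_1 (t - s)} I_1(s)\, ds.$$

Near $t = 0$, this solution leads to an effective reproduction number $\mathcal{R}_e$, and the condition $\mathcal{R}_e > 1$ gives Equation (8). Thus, the effective reproduction number depends on both the incubation period $t_1 = 1/(b_1 + b_2)$ and the onset period $t_2 = 1/c_1$.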
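The relaxed spread condition can be checked numerically from the linearization near $t = 0$. The sketch below builds the 2x2 matrix for $(I_1, I_2)$ from Equations (2) and (3) with $S \simeq S_0$, and compares the sign of its larger eigenvalue with the quantity $\beta S_0 t_1 (1 + b_2 t_2)$. That this quantity is the threshold is derived here from the determinant of the matrix and is an assumption about the form of Equation (8); the parameter values are illustrative, chosen so that $\beta t_1 = 0.8 < 1$.

```python
import math


def growth_rate(beta, s0, b1, b2, c1):
    """Larger eigenvalue of the linearized (I1, I2) system near t = 0."""
    a11 = beta * s0 - (b1 + b2)
    a12 = beta * s0
    a21 = b2
    a22 = -c1
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # The discriminant equals (a11 - a22)^2 + 4*a12*a21, which is
    # nonnegative because a12 and a21 are nonnegative rates.
    disc = trace * trace - 4.0 * det
    return 0.5 * (trace + math.sqrt(disc))


def threshold_number(beta, s0, b1, b2, c1):
    """beta*S0*t1*(1 + b2*t2), with t1 = 1/(b1 + b2) and t2 = 1/c1."""
    t1 = 1.0 / (b1 + b2)
    t2 = 1.0 / c1
    return beta * s0 * t1 * (1.0 + b2 * t2)


# With an asymptomatic route (b2 > 0): beta*t1 = 0.8, yet infection grows.
with_asym = (growth_rate(0.16, 1.0, 0.08, 0.12, 0.1),
             threshold_number(0.16, 1.0, 0.08, 0.12, 0.1))

# Same beta and t1 but no asymptomatic route (b2 = 0): infection decays.
without_asym = (growth_rate(0.16, 1.0, 0.20, 0.0, 0.1),
                threshold_number(0.16, 1.0, 0.20, 0.0, 0.1))
```

In both cases the growth rate is positive exactly when the threshold quantity exceeds 1, which illustrates how the presence of $I_2$ lets the infection spread at a value of $\beta t_1$ below 1.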

| Condition for herd immunity
In order for the spread of infection to reach an end, we need $\mathcal{R}_e(t) < 1$, which is equivalent to the condition for the onset of herd immunity:

$$S(t) < \frac{1}{\beta t_1 (1 + b_2 t_2)}.$$

If $S(t)$ is a monotonically decreasing function, which is the case for $d_1 = 0$, the condition for herd immunity in the SIIR model is shown to be more severe than that in the SIR model, and the presence of asymptomatic individuals increases the herd immunity threshold (HIT). In addition, since those who lose antibody after recovery become susceptible again, $S(t)$ may not be a monotonically decreasing function, in which case $\mathcal{R}_e(t)$ oscillates around 1 and the infection peaks appear multiple times with a period of $1/d_1$, the antibody duration.
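The effect of asymptomatic individuals on the threshold can be illustrated with a short worked comparison. It assumes normalized variables and the spread condition $\beta S t_1 (1 + b_2 t_2) > 1$ that follows from linearizing Equations (2) and (3); the numbers are hypothetical and serve only to show the direction of the effect.

```latex
% Assumption: R_e(t) = \beta S(t) t_1 (1 + b_2 t_2) in normalized variables.
\begin{align*}
\text{with asymptomatic route:}\quad
  & S(t) < \frac{1}{\beta t_1 (1 + b_2 t_2)}
  && \Rightarrow\quad
  \mathrm{HIT} = 1 - \frac{1}{\beta t_1 (1 + b_2 t_2)},\\
\text{without it } (b_2 = 0):\quad
  & S(t) < \frac{1}{\beta t_1}
  && \Rightarrow\quad
  \mathrm{HIT}_{b_2 = 0} = 1 - \frac{1}{\beta t_1}.
\end{align*}
% Hypothetical numbers: \beta t_1 = 2 and b_2 t_2 = 1.2 give
%   HIT_{b_2 = 0} = 1 - 1/2   = 0.5,
%   HIT           = 1 - 1/4.4 \approx 0.773.
```

Because $1 + b_2 t_2 \ge 1$, the threshold with an asymptomatic route is never lower than the one without it.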

| Cumulative fatalities of SIIR model
It is important to examine the fatalities caused by infectious diseases. Here, we evaluate the cumulative fatalities in the SIIR model.
As we saw earlier, we have $c_1 = c_2 + c_3 = 1/t_2$ and, from Equations (3) and (4), we have

$$\frac{d}{dt}\left(b_1 I_2 - b_2 R_1\right) = -c_1 \left(b_1 I_2 - b_2 R_1\right).$$

Therefore, we obtain

$$b_1 I_2(t) - b_2 R_1(t) = \left[b_1 I_2(0) - b_2 R_1(0)\right] e^{-c_1 t},$$

and, since we set $I_2(0) = R_1(0) = 0$ for the initial values, we have

$$I_2(t) = \frac{b_2}{b_1} R_1(t). \tag{12}$$

With Equation (12), the SIIR model becomes a four-variable system: the basic equations are rewritten with $I_2$ eliminated, and the conserved quantity changes accordingly. For simplicity, we set $d_1 = 0$ for now. Then, with Equation (12), Equations (6) and (16) show that the cumulative recovered population density can be expressed through the cumulative fatalities density. The cumulative fatalities will have a maximum value as a function of $b_1$. The coefficient $b_1$ is the contribution from $I_2(t)$ and $R_1(t)$ through $R_2(t)$, and the contribution from $R_1(t)$ through $R_3(t)$. The coefficient $b_1$ in the cumulative density of the infected population is a contribution from $I_2(t)$.
The value $b_1^*$ that maximizes $R_3^*$ is found as follows. Differentiating $R_3^*$ with respect to $b_1$ and setting the result to 0 gives an equation for $b_1$, which is solved subject to the condition $b_1 < 1/t_1$ to obtain $b_1^*$. It is confirmed by Equation (20) that the cumulative fatality density $R_3^*$ in the final state is an increasing function of the fatality rate $c_3$.
Also, if the onset period ($1/c_1$) becomes shorter, the recovery rate increases, and $R_3^*$ is a decreasing function of $c_1$.
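The dependence of the final fatalities on $c_3$ and $c_1$ can be followed in a short derivation. It is a sketch under stated assumptions, namely $d_1 = 0$, normalized variables, $I_2(0) = R_1(0) = R_2(0) = R_3(0) = 0$, and a final state in which $I_1$, $I_2$, and $R_1$ have vanished. The result may differ in form from the paper's Equation (20), and $S^*$ itself still depends on the parameters.

```latex
% Step 1: eliminate I_2 with I_2 = (b_2 / b_1) R_1 in the equations for R_2 and R_3.
\begin{align*}
\frac{dR_2}{dt} &= \left(\frac{b_2 c_1}{b_1} + c_2\right) R_1,
&
\frac{dR_3}{dt} &= c_3 R_1 .
\end{align*}
% Step 2: both derivatives are proportional to R_1, so with zero initial values
\begin{equation*}
R_2(t) = \frac{b_2 c_1 + b_1 c_2}{b_1 c_3}\, R_3(t).
\end{equation*}
% Step 3: in the final state S^* + R_2^* + R_3^* = 1, hence
\begin{equation*}
R_3^* = \frac{b_1 c_3}{b_1 (c_2 + c_3) + b_2 c_1}\,(1 - S^*)
      = b_1 t_1 \, c_3 t_2 \,(1 - S^*),
\end{equation*}
% using c_2 + c_3 = c_1 = 1/t_2 and b_1 + b_2 = 1/t_1.
```

The product $b_1 t_1 \cdot c_3 t_2$ reads as the fraction of infected individuals who become symptomatic times the fraction of symptomatic individuals who die, which is consistent with $R_3^*$ increasing in $c_3$ and decreasing in $c_1 = 1/t_2$ at fixed $S^*$.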

| Comparison between SIIR model and SIR model
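A major feature of the SIIR model is that it considers the existence of asymptomatic individuals, namely $I_2(t)$, which is not the case for conventional models such as the SIR model [1], defined as follows:

$$\frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \gamma I, \qquad \frac{dR}{dt} = \gamma I.$$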
The SIR model analyzes three transitioning stages: susceptible (population $S(t)$), infectious (population $I(t)$), and recovered and removed (population $R(t)$), where the infection and recovery rates are denoted by $\beta$ and $\gamma$, respectively. A comparison between the SIIR model and the SIR model is shown in Table 1.
If the route which connects presymptomatic individuals with asymptomatic individuals is closed by setting $b_2 = 0$, then $I_2(t) = I_2(0) e^{-c_1 t}$ and the asymptomatic population in the SIIR model disappears. If, at the same time, the antibody duration of the recovered individuals is set to infinity ($d_1 = 0$) and $R_1 + R_2 + R_3 = R$ is set, the SIIR model reduces to the SIR model. Thus, it can be said that the SIR model treats those who are removed from the infection network equally and ignores internal structures. Given a finite antibody duration, those who recovered and gained antibody return to the group of susceptible individuals over time, but this is excluded from the SIR treatment.
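The reduction to the SIR model can be checked numerically. The sketch below integrates the SIIR flows with $b_2 = 0$, $d_1 = 0$, and $I_2(0) = 0$ alongside an SIR model with $\gamma = b_1 = 1/t_1$, and compares $S$, the infectious population, and $R_1 + R_2 + R_3$ against $R$. The parameter values are illustrative assumptions, and the helper names are introduced here for the example.

```python
def rk4(rhs, y, dt, steps):
    """Integrate dy/dt = rhs(y) with classical fourth-order Runge-Kutta."""
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs([a + 0.5 * dt * k for a, k in zip(y, k1)])
        k3 = rhs([a + 0.5 * dt * k for a, k in zip(y, k2)])
        k4 = rhs([a + dt * k for a, k in zip(y, k3)])
        y = [a + dt / 6.0 * (q1 + 2.0 * q2 + 2.0 * q3 + q4)
             for a, q1, q2, q3, q4 in zip(y, k1, k2, k3, k4)]
    return y


# Illustrative values: t1 = 5 days, t2 = 10 days, asymptomatic route closed.
BETA = 0.5
B1, B2 = 0.2, 0.0            # b1 + b2 = 1/t1 with b2 = 0
C1, C2, C3 = 0.1, 0.098, 0.002
D1 = 0.0                     # infinite antibody duration


def siir(y):
    S, I1, I2, R1, R2, R3 = y
    infection = BETA * S * (I1 + I2)
    return [-infection + D1 * R2,
            infection - (B1 + B2) * I1,
            B2 * I1 - C1 * I2,
            B1 * I1 - (C2 + C3) * R1,
            C1 * I2 + C2 * R1 - D1 * R2,
            C3 * R1]


def sir(y):
    S, I, R = y
    infection = BETA * S * I
    return [-infection, infection - B1 * I, B1 * I]  # gamma = b1


siir_end = rk4(siir, [0.999, 0.001, 0.0, 0.0, 0.0, 0.0], dt=0.1, steps=1000)
sir_end = rk4(sir, [0.999, 0.001, 0.0], dt=0.1, steps=1000)
removed = siir_end[3] + siir_end[4] + siir_end[5]  # R1 + R2 + R3
```

After 100 days the two runs agree in $S$ and in the infectious population, and `removed` matches the SIR value of $R$ up to rounding, which is the sense in which the SIR model lumps the internal structure of the removed states together.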
If we take the ratio of the basic reproduction numbers of the SIR and SIIR models, we have

$$\frac{\mathcal{R}_0(\mathrm{SIIR})}{\mathcal{R}_0(\mathrm{SIR})} = \gamma t_1.$$

Since the two models may give different basic reproduction numbers for the same infectious disease, policymakers should pay attention to the model they adopt.

| VERIFICATION ON SIIR MODEL BY DATA
The data used here are the daily numbers of confirmed cases of COVID-19, determined by PCR or antigen tests and announced daily by the prefectural health centers [11]. The variable to be verified by the data is the daily symptomatic and quarantined population in the SIIR model, which is denoted by $\Delta R_1(t)$. This is because, among the infected population (presymptomatic, symptomatic, and asymptomatic), the presymptomatic and asymptomatic populations have rarely been tested in Japan. Since the observed daily data are subject to uncontrolled daily fluctuations, $\Delta R_1(t) = b_1 I_1(t)$ calculated from the SIIR model deviates somewhat from the observed daily data. Therefore, we also compare the cumulative confirmed cases of COVID-19 with the cumulative symptomatic and quarantined population $\sum \Delta R_1(t)$, since the cumulative data are almost free from daily fluctuations, which are smoothed out by coarse graining.
Below, the results of the SIIR model are compared with the observed data from Japan. Since the population size of the SIIR model is normalized, the population size of the targeted system must be determined from the data. Denoting the number of daily confirmed cases at date $k$ by $g(k)$ and assuming $N$ to be the population size of the targeted system, we look for the $N$ that minimizes Equation (25), where $K_d$ is the total number of data. Although $I_1(k)$ depends on $\beta$, $t_1$, $t_2$, $b_1$, and $I_1(0)$, the periods $t_1$ and $t_2$ can be fixed as epidemiological parameters [12,13]. From the basic reproduction number $\mathcal{R}_0$, which is obtained from the initial growth curve of the data, $\beta$ can be determined as $\beta = \mathcal{R}_0/t_1$. Also, $b_1$ can be estimated from the position of the peak of the data. In addition, the initial value $I_1(0)$ determines the date when the amplitude rises. Thus, the $N$ minimizing Equation (25) is calculated by Equation (26). Here, note that, even if $b_1 I_1(k)$ and $g(k)$ are similar in profile, Equation (25) [...]

Extending the SIIR model to describe this chain of spreads is not difficult. We can build a scenario where the remaining presymptomatic and asymptomatic carriers connect several SIIR systems. In fact, it seems that those carriers move and seed new areas of spread. It is possible to create $n$ chained SIIR systems and consider that, after the first system has ignited and terminated, the remaining presymptomatic and asymptomatic carriers drive the spread as the initial value for the second system. Here, two normalized SIIR systems, 1 and 2, are connected at day $K = 140$ to describe the spread and termination of infection observed in Japan (Figure 3, upper row).
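The choice of the population size $N$ described above can be sketched as a one-parameter least-squares fit. This assumes that the objective of Equation (25) is a sum of squared differences between the data $g(k)$ and the scaled model output $N b_1 I_1(k)$, in which case the minimizing $N$ has a closed form; the actual form of Equations (25) and (26) is not recoverable from the text. The function names and the synthetic numbers are hypothetical.

```python
def best_population_size(g, model_daily):
    """Least-squares N for g(k) ~ N * model_daily(k).

    g: observed daily confirmed cases, one value per day.
    model_daily: normalized model values b1 * I1(k) for the same days.
    """
    if len(g) != len(model_daily):
        raise ValueError("data and model must cover the same days")
    denom = sum(m * m for m in model_daily)
    if denom == 0.0:
        raise ValueError("model output is identically zero")
    return sum(a * m for a, m in zip(g, model_daily)) / denom


def squared_error(g, model_daily, n):
    """Assumed objective: sum over k of (g(k) - n * model_daily(k))^2."""
    return sum((a - n * m) ** 2 for a, m in zip(g, model_daily))


# Synthetic check, not real data: daily counts generated with N = 50000
# are recovered by the fit.
model_daily = [1e-5, 2e-5, 4e-5, 3e-5, 1e-5]
g = [50000.0 * m for m in model_daily]
n_fit = best_population_size(g, model_daily)
```

Because the assumed objective is quadratic in $N$, any other value of $N$ gives a larger `squared_error`. A fit of this kind only fixes the scale, so the shape parameters ($\beta$, $b_1$, $I_1(0)$) still have to be chosen as described in the text.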

TABLE 1 Comparison
The parameters and the initial values of system 1 are set first. For system 2, the parameter values are the same as Equations (27) and (28) in system 1, and the initial values at $t = K$ are then set, where the population size of system 2 is calculated from Equations (25) and (26). Note that the basic reproduction number here is $\mathcal{R}_0 = 0.79968$ (<1), so the infection does not spread in the SIR model; however, it spreads in the SIIR model as shown in Equation (8).
The asymptomatic population can be obtained from the SIIR model and data. From Equation (12), it can be seen that the asymptomatic population $I_2(t)$ is proportional to the symptomatic and quarantined population $R_1(t)$, with the ratio $I_2(t)/R_1(t) = b_2/b_1$ fixed by the given parameters. In other words, the asymptomatic population in Japan can be approximated from the symptomatic and quarantined population. The model is also compared with the fatalities in Japan, which can be reproduced quite well with the SIIR model (Figure 3, lower row). Since $R_1$ and $R_3$ are at different stages, their peaks occur at different times. Therefore, in Figure 3 (lower row), the time when system 1 connects to system 2 is set as $K = 170$, which is different from $K = 140$ used in Figure 3 (upper row), and the fatality rates are set separately for the two systems. In system 2, unlike system 1, more of the infected are young, fewer elderly people with preexisting medical conditions are infected, and, in addition, healthcare professionals gained experience during system 1 so that they can respond appropriately in treatment. However, in the future, if the infection spreads from the young to the elderly, and if the "GO TO TRAVEL" campaign in Japan boosts the spread of infection to more people, including those with preexisting medical conditions and the elderly, then fatalities may also increase. Finding and isolating asymptomatic individuals can help control the spread of the infection.

| DISCUSSIONS
In this paper, the SIIR model, dealing with the features of COVID-19 such as presymptomatic individuals who have infectivity even during the incubation period, asymptomatic individuals who can freely move around and play crucial roles in the spread of infection, and limited duration of immunity, has been introduced to show that it can replicate data such as the number of confirmed cases for COVID-19 and that of those who passed away because of COVID-19 in Japan.
Since infectious diseases spread depending on individuals' health conditions and behavioral characteristics, the SIIR model cannot provide detailed information for policymaking as long as the discussion is limited to median or mean behavior. This is the limit of the SIIR model. In the future, we will address the following topics in order to discuss policy issues for the early termination of COVID-19.
• Verification of the SIIR model based on the spread of COVID-19 infection in various countries.
• Formulation of agent-based modeling (ABM) that incorporates diversity such as individual health conditions and behavioral characteristics and verification of ABM by actual data.
• Examination of detailed policies aimed at bringing infectious diseases to an end.

CONFLICT OF INTEREST
The authors have stated explicitly that there are no conflicts of interest in connection with this article.

AUTHOR CONTRIBUTIONS
All authors had access to the data and a role in writing the manuscript.

INFORMED CONSENT
There is no need to obtain any consent for this article.
0.014 t 2 for t > K.