Applying a system approach to forecast the total hepatitis C virus-infected population size: model validation using US data


Homie A. Razavi, Center for Disease Analysis, Kromite, 901 Front Street, Suite 291, Louisville, CO 80027, USA
Tel: +1 720 890 4848
Fax: +1 303 552 9119


Background: Hepatitis C virus (HCV) infection is associated with chronic progressive liver disease. Its global epidemiology is still not well characterized, and its disease burden is expected to rise over the next decade.

Aim: The goal of this study was to develop a tool that can be used to predict the future prevalence of the disease in different countries and, more importantly, to understand the cause and effect relationship between the key assumptions and future trends.

Methods: A system approach was used to build a simulation model where each population was modeled with the appropriate inflows and outflows. Sensitivity analysis was used to identify the key drivers of future prevalence.

Results: The total HCV-infected population in the US was estimated to decline 24% from 3.15 million in 2005 to 2.47 million in 2021, while disease burden will increase as the remaining infected population ages. During the same period, the mortality rate was forecasted to increase from 2.1 to 3.1%. The diagnosed population was 50% of the total infections, while less than 2% of the total infections were treated.

Conclusion: We have created a framework to evaluate the HCV-infected populations in countries around the world. This model may help assess the impact of policies to meet the challenges predicted by the evolution of HCV infection and disease. This prediction tool may help to target new public health strategies.


Abbreviations: CDC, Center for Disease Control; HCC, hepatocellular carcinoma; HCV, hepatitis C virus; International Conquer C Coalition; IDU, injection drug use; MODA, multi-objective decision analysis; NHANES, The National Health and Nutrition Examination Survey; SVR, sustained viral response.

Hepatitis C virus (HCV) infection is associated with chronic, progressive liver disease. Chronic hepatitis C is a leading cause of cirrhosis and hepatocellular carcinoma (HCC) (1, 2), and the latter two are major indications for liver transplantation (3). A better understanding of HCV infection prevalence and its modelling can help medical communities and government agencies manage the disease burden and develop treatment strategies in light of the emergence of several potent anti-HCV therapies.

Considerable work has been undertaken to forecast the epidemiology of hepatitis C and numerous authors have developed Markov models to estimate the disease burden (4–16). However, these models have limitations resulting from the assumption that the studied cohorts remain homogeneous and traverse through different states at a fixed rate over time when, in reality, populations are very heterogeneous (17). Other investigators have developed multicohort natural history models (17), but all forecasts are complicated by availability and uncertainty of key inputs, especially outside a few well-studied countries.

We addressed this problem by using a system approach to build a simulation model. In this method, each population group was assessed as a dynamic system with inflows and outflows. When the inflows of population outweighed the outflows, the size of the group increased. Conversely, the size of the population decreased when outflows were higher than inflows. The benefit of this approach was that predicting trends required accurate information only on the relative size of the populations moving in and out. In addition, it enabled us to calculate the size of infected populations when data were missing.

The merits of this approach were verified using the HCV-infected population in the USA as an example, because considerable research has been carried out there to understand the historical and future trends. These results allowed us to calibrate our model in order to subsequently apply it to other geographical regions throughout the world.

The objective of this paper was to present our methodology and demonstrate the validity of the simulation model using US data. Subsequent publications will describe the application of this methodology and tools to select countries. This work reports the results of one particular scenario: current trends in prevalence, incidence, mortality, diagnosis, treatment and response rates will continue at the same rate. It does not take into account future events like the introduction of new therapies or a significant increase in treatment rates. Our model is designed to analyse numerous scenarios, but this is beyond the scope of this particular article.

Methods


A number of techniques were used to develop a simulation model and estimate the future size of the HCV-infected population. A system dynamics modelling framework was used to construct the model in Microsoft Excel®. Individual populations (incident, diagnosed, treated, etc.) were handled as stocks, while transitions from one population to another were treated as flows with an associated rate/probability (Fig. 1). Each HCV-infected pool was assessed individually and then linked. To build the model, two tools were used: influence diagrams and flow diagrams (18). Influence diagrams were used to identify all the factors that influenced the HCV-infected pool under consideration and identify the potential key drivers that should be considered in the model. The flow diagram was used to describe the inflow and outflow of infected individuals and to build the final model. Not all factors that were identified in the influence diagram were explicitly incorporated into the model. However, efforts were made to incorporate all factors when gathering data for inputs to the model.

Figure 1.

 Simplified schematic of the simulation model.

All transition rates/probabilities were modelled as functions to allow change over time. The parameters used to define these functions were start value, end value, start date, years to end value and the curve type, where the curve type was selected from a range of concave to convex sigmoidal curves. The Excel® optimization add-in, Solver, was used to calculate the parameters used in the rate functions to fit historical data and predict future trends by minimizing the root mean square difference between the actual and the calculated values.
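
The rate functions described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' Excel® implementation: the logistic form, the name `rate_at` and the `steepness` argument (a stand-in for the curve-type parameter) are hypothetical, and `rmse` only shows the quantity that a solver would minimize when fitting the parameters to reported data.

```python
import math


def rate_at(year, start_value, end_value, start_year, years_to_end, steepness=6.0):
    """Hypothetical transition rate that moves from start_value to end_value.

    The rate is flat before start_year and after start_year + years_to_end,
    and follows a rescaled logistic S-curve in between.
    """
    if year <= start_year:
        return start_value
    if year >= start_year + years_to_end:
        return end_value
    t = (year - start_year) / years_to_end  # fraction of the transition elapsed
    raw = 1.0 / (1.0 + math.exp(-steepness * (t - 0.5)))
    low = 1.0 / (1.0 + math.exp(steepness * 0.5))    # logistic value at t = 0
    high = 1.0 / (1.0 + math.exp(-steepness * 0.5))  # logistic value at t = 1
    progress = (raw - low) / (high - low)  # rescaled so it runs exactly from 0 to 1
    return start_value + (end_value - start_value) * progress


def rmse(actual, predicted):
    """Root mean square difference between reported and modelled values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

A fitting routine would evaluate `rate_at` for each calibration year, run the model, and adjust the parameters until `rmse` against the reported values stops improving.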

Hepatitis C virus-infected pools were calculated every year given last year's value along with the inflow and outflow rates. The model estimated the infected pools from 2004 to 2021. Reported data from 2004 to 2009 were used to calibrate the rate functions and estimate future trends. Sensitivity analysis was used to identify the transition rates that had the largest impact on future trends, as measured by the size of the 2021 infected population. The associated infected populations were segmented to enhance the fit with historical data and the projections. The most significant improvement in the model was observed after the incident and prevalent populations were segmented by age and gender. Segmenting the populations by genotype also improved the fit with historical data, given the variations in response rate and duration of treatment in each segment. Further segmentation of the populations by the lines of treatment (naïve – never been treated before; second line – been through one round of treatment; third line – been through two rounds of treatment) had a marginal impact on the fit with historical data. However, it did enable us to forecast the impact of new treatments, as they reach the market, on the size of infected pools.

The model was set up for Bayesian analysis of assumptions with a large uncertainty using Crystal Ball®, an Excel® add-in by Oracle®. Crystal Ball® was also used to run sensitivity analyses. A β-PERT distribution was used for all the rate function parameters using the likeliest value along with 5th and 95th percentile inputs. A β-PERT allowed for non-symmetrical distributions based on the collected data. We assumed that ranges from the literature typically provided a 90% confidence interval. The low end of the range was used as the 5th percentile inputs and the high end of the range was used for the 95th percentile.
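
To illustrate how an uncertain input can be drawn from a β-PERT distribution, the following Python sketch uses the standard PERT shape parameters. It is a simplification and an assumption on our part: the low and high ends of the range are treated as hard bounds, whereas the model described above used them as 5th and 95th percentiles in Crystal Ball®. The function name and the example range (75–85% of new infections becoming chronic, with 82% as the likeliest value) are chosen only for illustration.

```python
import random


def sample_beta_pert(low, mode, high, rng):
    """Draw one value from a beta-PERT distribution on [low, high].

    Uses the standard PERT shape parameters (lambda = 4). Treating the
    literature range as hard bounds is a simplification of the
    5th/95th percentile inputs described in the text.
    """
    span = high - low
    alpha = 1.0 + 4.0 * (mode - low) / span
    beta = 1.0 + 4.0 * (high - mode) / span
    return low + span * rng.betavariate(alpha, beta)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Illustrative input: share of new infections that become chronic.
draws = [sample_beta_pert(0.75, 0.82, 0.85, rng) for _ in range(20_000)]
mean_draw = sum(draws) / len(draws)
```

Because the likeliest value sits closer to the upper bound, the resulting distribution is skewed, which is the non-symmetrical behaviour the β-PERT was chosen for.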

A comprehensive global review of the literature was used to gather the size and flow of HCV-infected pools around the world. Data references were identified through two sources: indexed journals and non-indexed sources. Indexed articles were found by searching PubMed and regional databases (e.g. Medigraphic, and Imbiomed in Latin America) using the following terms: ‘hepatitis C AND country name AND (incidence OR prevalence OR mortality OR viremia OR genotype OR diagnosis OR treatment OR SVR)’. Furthermore, references cited within the articles were used. More than 27 000 abstracts/full-text articles were reviewed and 2600 references were selected based on relevance. In addition, non-indexed sources were identified through searches of individual countries' ministry of health websites and international health agency reports. Finally, authors from each country provided government reports and proceedings of local conferences that were not published in the scientific literature. The search was not limited to English publications, although they accounted for over 90% of the data sources.

The key assumptions used in our model are summarized in Appendix A. Outside of USA, when data were missing, analogues were used. Data from other countries with similar risk factors and/or population composition were used as a proxy to help predict the future trends in size of infected populations. However, gaps in data in each country were also highlighted to help guide future research. Given the extent of the publications describing the epidemiology of this disease, we developed a ranking system to identify the best data sources. The scoring system did not assess the merits or the quality of the data sources. It simply measured their value for this specific analysis.

A systematic process using multi-objective decision analysis (MODA) (19–22) was used to select the most appropriate data sources, identifying a group of data sources that provided a range for the assumption under consideration rather than relying on a single source. MODA has been used for years to prioritize activities and spending within the pharmaceutical industry and government agencies (23, 24). However, this is the first example of its application to ranking of published work. The key objectives were defined along with a scoring system for each measure. A 0–10 scale along with definitions was developed in which each score indicated the relative contribution to meeting the specific objective. Thus, an article with a score of 10 was twice as valuable as another article with a score of 5 for the same measure, all other measures being equal. The scoring was conducted by the authors, who were familiar with the country-specific disease dynamics and treatment, or by independent epidemiologists.

The weighting for each objective provided the relative importance of that objective as compared with others. A swing weight method was used to estimate the weights (25). Two factors were considered to determine the weight: the importance of the individual measure to the overall objective and the range of scores. Thus, a particular measure that may have been important was weighted less if all data sources examined had very similar scores and the measure could not be used to differentiate one source from another. The overall MODA framework is summarized in Table 1. The data sources with the highest overall scores were selected; the overall score was calculated as shown in Equation (1):

Table 1.   Multi-objective framework for selecting data sources

| Weight (%) | Objective | Rating | Definition |
|---|---|---|---|
| 20 | Sample size | 0–10 | Log (sample size) scaled from 0 to 10, maximum at 10k |
| 35 | Extrapolative | 0 | Analysis based on select subpopulations (e.g. IDU) |
| | | 4 | Analysis based on random but self-selected populations (e.g. blood banks) |
| | | 10 | Analysis based on a random general population |
| 25 | Analysis type | 0 | Analysis completed before blood screening was implemented |
| | | 5 | Analysis completed after blood screening was implemented using first- or second-generation screening technology |
| | | 10 | Analysis completed after blood screening was implemented using the latest screening technology |
| 20 | Analysis consistency/quality | 0 | Analysis completed at a small local research facility |
| | | 5 | Analysis completed at a hospital lab/research lab |
| | | 10 | Analysis completed by a national/international reference lab |
| NA | Relevance | 0 | Analysis does not provide enough details to be relevant for this exercise (e.g. article breaks out genotypes as G1 and all others) |
| | | 1 | Analysis provides sufficient details to be relevant to this exercise (e.g. article breaks out specific genotypes: G1, G2, G3, G4, G5, G6 …) |

IDU, injection drug use.

The relevance score was used to eliminate articles that did not provide sufficient details for this exercise. Not all articles required scoring. For example, articles describing the risk factors that lead to HCV infection were typically descriptive and did not lend themselves to a scoring system.
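
The scoring framework in Table 1 can be illustrated with a short Python sketch. The weights and rating levels come from Table 1; the additive weighted-sum form, the use of relevance as a 0/1 multiplier, the lower anchor of the sample-size scale and all function names are assumptions made for illustration rather than the authors' stated formula.

```python
import math

# Weights from Table 1, expressed as fractions of 1.
WEIGHTS = {
    "sample_size": 0.20,
    "extrapolative": 0.35,
    "analysis_type": 0.25,
    "consistency_quality": 0.20,
}


def sample_size_score(n):
    """Log-scaled 0-10 score that reaches its maximum at n = 10 000.

    Anchoring the scale at log10(1) = 0 is an assumption; Table 1 only
    states the upper anchor.
    """
    if n <= 1:
        return 0.0
    return min(10.0, 10.0 * math.log10(n) / 4.0)


def overall_score(scores, relevance):
    """Weighted sum of the 0-10 ratings, zeroed when relevance is 0."""
    weighted = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return relevance * weighted


# Hypothetical source: general-population survey of 1000 people,
# latest-generation assay, run at a hospital laboratory.
example = {
    "sample_size": sample_size_score(1000),
    "extrapolative": 10,
    "analysis_type": 10,
    "consistency_quality": 5,
}
example_score = overall_score(example, relevance=1)
```

Under these assumptions, a source that fails the relevance check scores zero regardless of its other ratings, which mirrors how the relevance score was used to eliminate articles.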

The MODA approach was validated by having David Kershenobich score 50 references from Mexico and Stefan Zeuzem score 70 references from Germany. Their ranking was compared with the rating of an independent epidemiologist, Carolyn Wallace. Although there were variations in the absolute scores, the relative rankings had a correlation of 0.98. Further comparisons of ranking of data sources in Canada, Switzerland, Japan, South Korea and Israel provided the same result.

Incident population

We defined incident population as individuals with new chronic and viraemic HCV infections. It was calculated using the flow diagram in Figure 2 and Equation (2):

$$\text{Incident Population}_{\text{Age \& Gender}} = IR_{\text{Age \& Gender}} \times Pop_{\text{Age \& Gender}} \times R_{UU} \times (1 - R_{SC}) \times (1 - R_{T \times C}) \qquad (2)$$

where $IR_{\text{Age \& Gender}}$ is the incidence rate by age and gender, $Pop_{\text{Age \& Gender}}$ is the country's population by age and gender, $R_{UU}$ is the under- and unreported ratio, $R_{SC}$ is the spontaneous clearance rate and $R_{T \times C}$ is the treated and cured rate, taking into account the number of treated patients and the sustained viral response rate.

Figure 2.

 Hepatitis C virus incident population flow diagram.

The acute phase is defined as the first 6 months after infection and is characterized by a sharp increase in the ALT levels and seroconversion (26–28). The US Centers for Disease Control and Prevention (CDC) requires centres to couple an acute increase in ALT levels and/or seroconversion with evidence of prior HCV non-infection. New acute cases are reported to the CDC through sentinel counties and are adjusted for under-reporting. A multiplier is then applied to account for all asymptomatic cases that are not diagnosed or reported. The unreported multiplier was estimated for USA in a study by Armstrong et al. (29).

To estimate true incidence (defined as chronic and viraemic), the number of acute infections that cleared spontaneously was subtracted. Additionally, those successfully treated during the acute phase of infection had to be accounted for. In reality, treatment during the acute phase of infection was estimated to represent a very small proportion of all acute cases, even though there is ample evidence of a high response rate to therapy at this stage of the disease (30–33).

The literature reported that approximately 75–85% of the newly infected persons develop chronic infections (28, 34, 35). The factors that seem to impact the persistence of the infection are an older age at infection, male gender, African-American race, immunosuppressed state, specific HLA subtypes/polymorphisms and blunted innate immune response (28).

For the US model, we used the 2004 CDC reported incidence rates by age and gender and assumed that they remained constant (36). The under- and unreported ratio was calculated from data from the same agency (Table 2) by dividing the annual number of new infections by the number of reported cases (37). This ratio was held constant throughout the duration of the model. The UN estimates were used for the future US population by age and gender (38). A spontaneous clearance rate of 18% was used, corresponding to 82% of new infections becoming chronic (28), and it was assumed that the treated and cured rate in the acute phase of infection was zero.
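
As a worked illustration of the incident population calculation, the Python sketch below uses the most recent column of Table 2 (849 reported acute cases and 17 000 estimated new infections). It assumes that the terms of Equation (2) combine multiplicatively and that 18% of new infections clear spontaneously (equivalently, 82% become chronic), as in the 2007 example given in the Results; the function name is hypothetical.

```python
def chronic_incident_cases(reported_cases, r_uu, r_sc, r_txc=0.0):
    """Incident (chronic, viraemic) cases from reported acute cases.

    reported_cases stands for incidence rate x population; the
    multiplicative form is an assumption based on the term definitions.
    """
    return reported_cases * r_uu * (1.0 - r_sc) * (1.0 - r_txc)


reported_cases = 849      # acute clinical cases reported (Table 2, last column)
new_infections = 17_000   # estimated new infections (Table 2, last column)

r_uu = new_infections / reported_cases  # under- and unreported ratio, about 20
incident = chronic_incident_cases(reported_cases, r_uu, r_sc=0.18)
```

With these inputs, roughly 13 940 of the 17 000 new infections are counted as incident chronic cases, matching the figure quoted in the Results.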

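Table 2.   Centers for Disease Control and Prevention's estimate of the number of new hepatitis C virus infections in the USA (37)

| | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 |
|---|---|---|---|---|---|---|
| Number of acute clinical cases reported | 1223 | 891 | 758 | 694 | 802 | 849 |
| Estimated number of acute clinical cases | 4800 | 4500 | 4200 | 3400 | 3200 | 2800 |
| Estimated number of new infections | 29 000 | 28 000 | 26 000 | 21 000 | 19 000 | 17 000 |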


Mortality

We defined mortality as all-cause mortality in order to account for the appropriate portion of all infected individuals who were removed from the infected pool each year. Outside of a handful of countries, few data were available on mortality among HCV-infected individuals. A model was developed to estimate the mortality rate in different countries based on age, liver-related deaths due to HCV infection and the percentage of the prevalent population infected through injection drug use (IDU) and transfusion. Although other risk factors also contribute to HCV infections (nosocomial, unsafe injections, etc.), it was assumed that the associated incremental increase in mortality was negligible.

Studies by Duberg et al. (39), McDonald et al. (40) and Amin et al. (41) all show an increase in liver-related deaths in HCV-infected populations. Through personal communications with the authors, the liver-related mortality rates in these studies were compared with the general population mortality in the same country and age cohorts (42, 43). A marked increase in liver-related deaths was observed starting at ages 40–45. The standard mortality ratio (SMR) was >1 as compared with the general population until age 70, when mortality in the general population became higher than liver-related deaths. Studies by Seeff et al. (44) and Guiltinan et al. (45) among individuals who had been screened for high-risk behaviour (e.g. blood donors and military recruits) showed an SMR of 1.5–2.1. For the purpose of our mortality model, we assumed that all HCV-infected individuals had a mortality ratio of 2.0 between the ages 45 and 70 due to liver-related deaths.

The HCV-infected populations in Australia, Scotland and Sweden were heavily weighted towards IDU. For example, in Scotland, 90% of the diagnosed HCV-infected population had injected drugs (14). Analysis of mortality among IDU without HCV infections who received treatment for drug misuse shows an excess mortality ratio of 22.0 between the ages 15 and 44 (46). All three retrospective studies showed a marked increase in drug-related mortality in these same age groups as well. The SMR for drug-related deaths among the HCV-infected was 19.3 in Australia, 25.1 in Scotland and 20.7 in Sweden, the latter being specific to deaths from mental causes related to alcohol and drugs. Even though HCV infection was not the cause of death in these young age groups, from an epidemiological perspective these individuals were still removed from the total infected population. In our mortality model, we assumed that the portion of the infected population that was made up of IDU had a mortality ratio of 22.0 between the ages 15 and 44.

The final risk group that was identified with a higher mortality rate was blood transfusion recipients. A study from Denmark and Sweden examined 1 118 261 transfusion recipients and found the SMR to be 17.6 in the first 3 months, 2.1 at 1–4 years and 1.3 at 17 years after their first transfusion (47). This study did not separate HCV-infected individuals, but showed a higher mortality rate in blood recipients. In our model, we assumed that the portion of the infected population who contracted HCV from blood transfusion had an SMR of 2.1.

The overall mortality rate among HCV-infected populations was calculated using Equation (3):

$$\text{Mortality Rate}_{\text{Age \& Gender}} = GPMR_{\text{Age \& Gender}} \times MM_{L} \times AM_{IDU} \times AM_{Trans} \qquad (3)$$

where $GPMR_{\text{Age \& Gender}}$ is the General Population Mortality Rate by age cohort for all causes provided by United Nations' Demographic Yearbook's Table 19 (42) and $MM_{L}$ is the adjusted mortality multiplier for liver-related morbidity, defined as

$$MM_{L} = \begin{cases} 2.0 & \text{ages 45–70} \\ 1 & \text{otherwise} \end{cases}$$

$AM_{IDU}$ is the adjusted mortality multiplier for injection drug-related morbidity, defined as $AM_{IDU} = (MM_{IDU} \times Pop_{IDU}) + (1 - Pop_{IDU})$, where $MM_{IDU}$ is the injection drug user mortality multiplier, defined as

$$MM_{IDU} = \begin{cases} 22.0 & \text{ages 15–44} \\ 1 & \text{otherwise} \end{cases}$$

and $Pop_{IDU}$ is the proportion of the population who were injection drug users. $AM_{Trans}$ is the adjusted mortality multiplier for transfusion-related morbidity, defined as $AM_{Trans} = (MM_{Trans} \times Pop_{Trans}) + (1 - Pop_{Trans})$, where $MM_{Trans}$ is the transfusion mortality multiplier (2.1) and $Pop_{Trans}$ is the proportion of the population who contracted the disease from transfusions.

In USA, we assumed that 47% of the infected population had experimented with IDU and 7% of the cases were due to transfusion (27, 48).
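
The adjusted multipliers can be worked through numerically. The Python sketch below uses the US proportions (47% IDU, 7% transfusion) and the multipliers stated above (2.0 for ages 45–70, 22.0 for ages 15–44, 2.1 for transfusion). Two points are assumptions on our part: that the multipliers equal 1 outside the stated age ranges, and that the overall rate is the product of the general population rate and the three multipliers. The function names and the example general population rate of 1% are hypothetical.

```python
def adjusted_multiplier(multiplier, proportion):
    """Blend a risk-group multiplier with the unaffected remainder (multiplier 1)."""
    return multiplier * proportion + (1.0 - proportion)


def liver_multiplier(age):
    """Liver-related multiplier: 2.0 for ages 45-70, assumed 1 elsewhere."""
    return 2.0 if 45 <= age <= 70 else 1.0


def idu_multiplier(age):
    """Injection drug use multiplier: 22.0 for ages 15-44, assumed 1 elsewhere."""
    return 22.0 if 15 <= age <= 44 else 1.0


def hcv_mortality_rate(gpmr, age, pop_idu=0.47, pop_trans=0.07, mm_trans=2.1):
    """General population rate scaled by all three multipliers (assumed product form)."""
    am_idu = adjusted_multiplier(idu_multiplier(age), pop_idu)
    am_trans = adjusted_multiplier(mm_trans, pop_trans)
    return gpmr * liver_multiplier(age) * am_idu * am_trans


am_idu_young = adjusted_multiplier(22.0, 0.47)  # about 10.87 for ages 15-44
am_trans = adjusted_multiplier(2.1, 0.07)       # about 1.077 at all ages
rate_at_50 = hcv_mortality_rate(0.01, age=50)   # hypothetical 1% general rate
```

The sketch shows why the younger infected population carries a much larger relative excess mortality than the 45–70 age group, even though absolute mortality is driven by the older cohorts.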

Prevalent population

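In our analysis, we defined the prevalent population as the total viraemic uncured HCV-infected population. It was calculated according to the flow diagram shown in Figure 3 and Equations (4) and (5):

$$\text{2004 } PP_{\text{Age \& Gender}} = PR_{\text{Age \& Gender}} \times \text{2004 } Pop_{\text{Age \& Gender}} \times R_{\text{Viraemic}} \qquad (4)$$

where $PR_{\text{Age \& Gender}}$ is the prevalence rate by age and gender, 2004 $Pop_{\text{Age \& Gender}}$ is the US population by age and gender in 2004, and $R_{\text{Viraemic}}$ is the viraemic rate.

$$PP_{\text{Age \& Gender}}(yr) = PP_{\text{Age \& Gender}}(yr-1) + IP_{\text{Age \& Gender}}(yr-1) - MP_{\text{Age \& Gender}}(yr-1) - \text{TxC}(yr-1) \qquad (5)$$

where $PP_{\text{Age \& Gender}}(yr-1)$ is the previous year's viraemic prevalent population by age and gender, $IP_{\text{Age \& Gender}}(yr-1)$ is the previous year's viraemic incident population by age and gender, $MP_{\text{Age \& Gender}}(yr-1)$ is the size of the mortality population among the previous year's prevalent population (PP), defined as $MP_{\text{Age \& Gender}}(yr-1) = PP_{\text{Age \& Gender}}(yr-1) \times \text{Mortality Rate}_{\text{Age \& Gender}}(yr-1)$, and $\text{TxC}(yr-1)$ is the number of individuals who were treated and cured in the previous year, as defined in the treated population section.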

Figure 3.

 Prevalent population flow diagram.

Our model started in 2004. As shown in Equation (4), the 2004 prevalent population was estimated by applying the 1999–2002 National Health and Nutrition Examination Survey (NHANES) prevalence rates by age and gender (49) to the 2004 US population (38). A recent study suggested that the total number of infections could be 27% higher if subgroups not sampled by NHANES (e.g. homeless, incarcerated, veterans, active military personnel, etc.) were also considered (50). A viraemic rate of 79.7% was applied to estimate the starting population size with HCV RNA (49). After 2004, the prevalent population was calculated each year using the formula shown in Equation (5). The previous year's new cases were added to the previous year's total infected pool and the mortality among the same pool was subtracted. Finally, the number of individuals treated and cured was also subtracted.
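
The yearly update described above can be sketched in Python for a single age and gender group. The 79.7% viraemic rate is taken from the text; the function names, the example group of 100 000 people with 2% prevalence, and the decision to hold the inflow and outflow inputs constant across years are illustrative assumptions only (the model itself used time-varying rate functions).

```python
def initial_prevalent(prevalence_rate, population, r_viraemic=0.797):
    """Starting viraemic prevalent population for one age/gender group."""
    return prevalence_rate * population * r_viraemic


def next_prevalent(pp_prev, ip_prev, mortality_rate_prev, txc_prev):
    """Roll one group forward by a year: add new cases, remove deaths and cures."""
    mp_prev = pp_prev * mortality_rate_prev  # deaths among last year's pool
    return pp_prev + ip_prev - mp_prev - txc_prev


def project(pp_start, years, ip, mortality_rate, txc):
    """Apply the yearly update repeatedly with inputs held constant (a simplification)."""
    trajectory = [pp_start]
    for _ in range(years):
        trajectory.append(next_prevalent(trajectory[-1], ip, mortality_rate, txc))
    return trajectory


# Hypothetical group: 100 000 people with 2% anti-HCV prevalence.
start = initial_prevalent(0.02, 100_000)  # about 1594 viraemic individuals
path = project(start, years=3, ip=10, mortality_rate=0.02, txc=15)
```

In this hypothetical group the pool shrinks each year because deaths and cures together exceed new chronic infections, the same mechanism that drives the US projection.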

Outside of USA, HCV prevalence data were available for most countries. However, not all data were representative of the country's population, and the years for which estimates were provided differed. Furthermore, HCV prevalence was determined by local practices, risk factors and access to care in particular communities. Thus, extreme care was required in extrapolating the disease burden from a few well-studied data points.

Studies based on first- and second-generation immuno-assay tests provided the upper bound of the potential prevalence. Because of the number of false-positive outcomes in early tests, older HCV prevalence studies overestimated the total population infected with the disease (51).

In many countries, blood bank data were available and used as a basis for the country's HCV prevalence. However, due to selection bias, sero-epidemiological studies using blood donors may not reflect the epidemiological reality (52). In addition, studies in the USA (53) have suggested that direct questioning related to behavioural habits before blood donation results in decreased blood donor HCV prevalence (0.63%→0.4% between 1991 and 1996). Blood bank data were useful, but likely provided an estimate that was rather low compared with the HCV prevalence in the general population (54).

Diagnosed population

We defined the diagnosed population as the total viraemic individuals who have been diagnosed with HCV infection. Our definition excludes all who were diagnosed and cured or those who were removed due to mortality. It was calculated as shown in Figure 4 and Equation (6), where this year's diagnosed population started with last year's pool, added newly diagnosed viraemic patients, and subtracted diagnosed patients who died that year and any who were treated and cured:

$$Dx_{yr} = Dx_{yr-1} + (UnDx_{yr-1} \times NDxR_{yr-1} \times R_{\text{Viraemic}}) - MP_{yr-1} - \text{TxC}(yr-1) \qquad (6)$$

where $Dx_{yr}$ is the size of the current year's diagnosed population, $Dx_{yr-1}$ is the size of the previous year's diagnosed population, $UnDx_{yr-1}$ is the size of the previous year's undiagnosed population, $NDxR_{yr-1}$ is the previous year's newly diagnosed rate, $R_{\text{Viraemic}}$ is the viraemic rate, $\text{TxC}(yr-1)$ is the number of individuals who were treated and cured in the previous year, as described in the treated population section, and $MP_{yr-1}$ is the size of last year's mortality population in the diagnosed pool, defined as $MP_{yr-1} = Dx_{yr-1} \times \text{Mortality Rate}$, where the mortality rate was defined in the mortality section.

Figure 4.

 Diagnosed population flow diagram.

In USA, a NHANES survey found that 50% of randomly selected individuals who tested positive for HCV were already aware of their infection (5). Similar studies in subpopulations by NYC HANES, Veterans Affairs' Mental Illness Treatment and Research Center, and Philadelphia Department of Public Health found that between 50 and 72% of those who tested positive to HCV already knew of their infection (55–57), suggesting that a high percentage of the total infected population is already diagnosed in USA.

The newly diagnosed rate was taken from a recent CDC study, which reported 69.2 newly diagnosed HCV infections per 100 000 (or 211 600 in 2007) between 2006 and 2007 (58). In our model, we assumed that before 2004, 795 000 viraemic individuals were diagnosed in the USA. A viraemic rate of 79.7% was applied to estimate the starting population size with HCV RNA (49).

Treated population

The numbers reported by Volk et al. (5) were used for the number of treated individuals in USA. This comprehensive analysis looked at longitudinal data from 2002 to 2007 to estimate the number of treated HCV patients in USA.

Disease burden

We used the number of infected individuals over 60 years old as a proxy for disease burden. A number of studies have evaluated the HCV disease burden as a function of time (7, 14, 17, 59–61), age, gender and risk factors. They forecasted the progression of HCV infection to cirrhosis, and the specific values for complications such as encephalopathy, ascites or variceal haemorrhage (8, 15, 17). Previously unpublished data by Poynard (Fig. 5) showed an increase in cirrhosis and fibrosis progression with age.

Figure 5.

 Progression of fibrosis assessed according to age in 2313 hepatitis C virus patients with a biopsy who were not treated (Thierry Poynard 2010, Université Pierre et Marie Curie Liver Center, Paris, France, pers. comm.).

The methodology described here was designed to be applied to numerous countries around the world. Outside of USA and Western Europe, data were insufficient to estimate the disease burden with any accuracy using more sophisticated modelling. Thus, the age of the infected population was used as a proxy for disease burden. Previous work (62, 63) already demonstrated a marked increase in the probability of disease progression around age 60. For our purposes, we assessed the percentage of infected population over the age of 60 as a surrogate for the disease burden.

Results


A key advantage of the system approach was that it allowed us to quickly assess the impact on key populations without significant programming. To illustrate, consider the change in size of the viraemic infected population between 2007 and 2008. In 2007, we estimated that there were 3 million individuals infected with HCV. In that same year, the CDC estimated that 17 000 new US-based HCV infections occurred (Table 2) (37). With a spontaneous clearance rate of 18% (28, 64), 13 940 went on to have a chronic infection. Volk et al. (5) estimated that 83 000 patients were treated in that same year. With a 50% sustained viral response (SVR) (all genotypes combined), this would translate to 41 500 cured persons. Our mortality model estimated that 61 500 people infected with HCV died in 2007.

The system approach predicted that viraemic prevalence would decline in the USA between 2007 and 2008, as shown in the balance beam in Figure 6. The true incidence, mortality and number of cured individuals were all uncertain. However, the conclusion held even if mortality, spontaneous clearance or cured rates were off by more than a factor of two. The likelihood that the true incidence was higher than the sum of mortality and cured individuals was negligible. Thus, it was safe to conclude that the overall viraemic prevalence will decline in the USA. The trend will only reverse if there is a significant increase in incidence through a very large outbreak. This analysis showed that in order to estimate the future total infected population size, it was more important to develop a robust mortality model than to fine-tune incidence estimates. Outside of the USA, the reported incidence was most often the number of newly diagnosed cases, which included both new and existing infections. However, using the above approach, we were able to estimate the number of new cases when the trends in prevalence, mortality and the number of treated and cured populations were available.
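
The 2007 balance can be checked with a few lines of Python. The inputs are the values quoted above (17 000 new infections, 18% spontaneous clearance, 83 000 treated, 50% SVR and 61 500 deaths); the function name and the particular stress scenario (no spontaneous clearance, with both the cured and death counts halved) are our own illustrative choices for the "factor of two" argument.

```python
def net_change(new_infections, clearance_rate, treated, svr, deaths):
    """Inflow of new chronic infections minus outflows (cured and deceased)."""
    chronic = new_infections * (1.0 - clearance_rate)
    cured = treated * svr
    return chronic - cured - deaths


# 2007 values quoted in the text.
base = net_change(17_000, 0.18, 83_000, 0.50, 61_500)

# Stress scenario: no spontaneous clearance, and both outflows halved.
stressed = net_change(17_000, 0.0, 83_000, 0.25, 61_500 / 2)
```

The base case gives a net change of about −89 060 individuals, and the stressed case still gives a decline (−34 500), which is consistent with the conclusion that the direction of the trend is robust to large errors in the inputs.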

Figure 6.

 The 2007 total infected population balance beam for hepatitis C virus in USA.

Prevalent population

The simulation model predicted a decline in the total number of infected individuals in USA in the absence of new events including the utilization of more effective HCV antiviral medications in the future (Fig. 7). The viraemic prevalent population was projected to decline by 24% from 3.15 million in 2005 to 2.47 million in 2021. This trend was consistent with forecasts by Davis et al. (17), who modelled the total number of individuals ever infected using a multicohort natural history model. In their model, the total chronic HCV population declined 22% from 3.53 million in 2005 to 2.75 million in 2021. Overall, the true infected population is likely to be higher than the results shown here, because published reports exclude data on incarcerated and homeless persons or active IDU.

Figure 7.

 Projected number of hepatitis C virus-infected and diagnosed individuals in USA.

The expected decline in prevalence does not mean that the burden of the disease will decline as well. The paradox of HCV infection is that as prevalence declines in many countries around the world, the burden of the disease is expected to increase due to an ageing HCV population.

Diagnosed population

We estimated that 50% of the HCV-infected population was diagnosed by 2009 and close to 80% will know of their infection by 2021 (Fig. 7). The literature suggested that approximately half of the infected population already knew of their diagnoses before 2009. However, we started with a more conservative diagnosed population to account for sampling in the studies. Relaxing this assumption would simply accelerate the increase in the size of the diagnosed population in USA.

The model predicted a diminishing growth in the size of this population due to fewer undiagnosed individuals available as more are diagnosed and an increase in mortality as the infected population ages.

Mortality


A comparison of select mortality studies is shown in Figure 8 (39–41, 44, 45, 65–67). We used our mortality model to explain the differences in the studies by the risk profile of the infected population in the studies. The output of our model for the US-infected populations is shown in Figure 9. We forecasted that the SMR will decline from 2.5 in 2005 to 1.6 in 2021. SMR was calculated by dividing the forecasted mortality in the HCV-infected population by age and gender by the expected deaths in the general population with the same age and gender profile. During the same period, the mortality rate (total deaths in HCV-infected population/total infected population) was forecasted to increase from 2.1 to 3.1%. Both trends were consistent with an ageing infected population that would cause the mortality rate to increase. At the same time, the SMR declined as the mortality rate in the general population (the denominator in SMR) increased as a result of an ageing population. Our SMRs were similar to previous US studies (range of 1.5–2.1), although the two sets cannot be compared directly because our estimates were for the total infected population (44, 45, 65). In a further validation of the mortality model, when the risk factors and general population mortality rates in Sweden were used, the forecasted SMR was within 12% of the actual numbers reported by Duberg et al. (39). The model underestimated the number of deaths among the HCV-infected population between the ages of 45 and 75. This could be due to the fact that we did not consider the impact of alcohol or body mass. However, for most countries without a robust mortality study, this would provide a reasonable approximation.

Figure 8.

 Select hepatitis C virus-infected mortality studies.

Figure 9.

 Projected US standard mortality ratio and mortality rate.

Our analysis was not consistent with studies by Harris et al. (68, 69), where it was shown that the mortality rate among HCV-infected transfusion recipients was the same as the control group after 16 years of follow-up. However, a more recent study from Denmark showed a clear increase in mortality in the HCV-infected population as compared with the cured population in line with our model (70).

It is worth noting that our analysis predicts that the mortality rate will change over time as the infected population ages. In addition, the mortality ratio will be different among incident and prevalent populations. The prevalent population in most countries includes individuals who contracted HCV from contaminated blood supplies or unintentional contaminated injections. Today, in most western countries, the new cases (incidence) are due to IDU, which means that mortality among newly infected individuals will be significantly higher.

Our analysis showed that it is possible to estimate the mortality rate in a country using the risk factors present. In addition, it predicted a lower SMR in USA as compared with the European studies, consistent with the literature. Finally, it demonstrated that rather diverse reports in the literature are actually consistent when the risk factors are considered.

Disease burden

Consistent with other studies (17, 59), we predict an increase in disease burden over the next decade despite a decreasing prevalence.

A recent study of veterans showed that ageing of the HCV-infected patients accounted for a significant proportion of the observed increase in the prevalence of cirrhosis and HCC, although not all (71). By 2011, 20% of the infected viraemic population will be over 60 (Fig. 10). This population was forecasted to more than double within 10 years. The objective of our analysis was not to calculate the estimated number of individual morbidities associated with hepatitis C, but rather to provide directional insight. According to Davis et al. (17), the number of HCV-related cirrhosis cases is expected to increase by 24% in the same period; decompensated cirrhosis cases and HCC will increase by 50%. A retrospective study among veterans reported a significantly higher increase in the burden of HCC and a lower increase in the prevalence of cirrhosis than predicted by mathematical models (71). Directionally, our model was consistent with previously reported estimates.

Figure 10.

 Per cent of US viraemic hepatitis C virus-infected population over 60 years old.

Treated population

The Volk analysis (5) reported that the number of treated patients had been declining in USA. This trend was observed at a time when the number of diagnosed individuals was increasing. The factors that could explain this observed historical decrease in USA were clinical trials, eligibility, warehousing (physicians holding back patients until new therapies become available), physician perception and treatment capacity, which may vary for each country.

There were more patients enrolled and treated in clinical trials as new drugs were developed to treat this disease. According to the National Institutes of Health, approximately 10 000 patients were enrolled in US hepatitis C clinical trials in 2007. The Volk study did not include patients treated in trials. Alternatively, as more patients were treated, the remaining pool had a higher percentage of ineligible patients. However, this could not fully explain the downward trend, as the number of newly diagnosed individuals far outweighed the treated patients. The impact of warehousing could explain the downward trend after 2007 as more clinical trial data on new treatments became available. Another likely explanation for the decrease in the number of treated patients is treating physicians' perception of the need to treat given the slow progression of the disease, corroborated by our improved knowledge of the natural history of chronic hepatitis C and other cofactors of fibrogenesis.

Finally, treatment capacity is limited in some countries where there are not enough physicians and other health care providers available to treat the infected individuals. This has been observed in parts of Brazil and Eastern Europe. In USA, on the other hand, the medical system can handle at least 144 000 treated patients, the number treated in 2004 (5). Treatment capacity is therefore an unlikely cause of the historical decline in treated patients in USA.

The treatment rate of HCV remains very low. According to our analysis, <5% of the diagnosed population and close to 2% of the viraemic prevalent population are treated in USA today. Without an increase in the treatment (and cured) rate, the disease burden is expected to increase.

Sensitivity analysis

As evident from this analysis, forecasting the epidemiology of HCV at a regional or a national level requires numerous assumptions. This is further complicated by the uncertainty in individual inputs. To determine which assumptions have the largest impact on the future prevalence of the disease, sensitivity analysis was used. As described in Figure 11, there are typically two potential explanations for an assumption to be highlighted as important in a sensitivity analysis: (i) high level of uncertainty related to the assumption/input and (ii) output is highly sensitive to small changes in the assumption. In a typical sensitivity analysis, the two are confounded.

Figure 11.

 Reasons for assumptions showing up as important in a sensitivity analysis.

Our analysis focused on identifying the assumptions that have the largest impact on future prevalence by varying each one over a fixed range of ±10% around its base value. The assumptions that were varied were all the transition rates/probabilities and starting populations shown in Figure 12. In the base scenario, the 2021 prevalence was calculated (2.47 million) while keeping all assumptions at their base value. One assumption was then increased by 10% while keeping all others at their base value, and the new 2021 prevalence was recorded. The same assumption was reduced by 10% and the new 2021 prevalence was recorded again. This was repeated for all assumptions under consideration. In the case of mortality, the mortality rate in the general population was changed by ±10%, which had the same effect as changing the mortality rate in the HCV-infected pool by the same percentage. The resulting prevalence numbers were divided by the base value to determine the per cent change in the 2021 prevalence resulting from a ±10% change in each assumption. The assumptions were ranked from the highest to the lowest impact and the results are plotted in Figure 12.
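The one-at-a-time procedure described above can be sketched as follows. The `toy_prevalence` function is a hypothetical stand-in for the simulation model, included only to make the example runnable; the population size and the `annual_net_loss` parameter are assumptions, while the base prevalence rate (1.37%) and per cent viraemic (79.7%) are taken from the text.

```python
def one_at_a_time(model, base, delta=0.10):
    """Vary each assumption by -delta and +delta with all others at base.

    Returns {name: (change_low, change_high)} as fractional changes in the
    model output relative to the base scenario, ranked by largest impact.
    """
    base_output = model(**base)
    changes = {}
    for name, value in base.items():
        low = model(**{**base, name: value * (1.0 - delta)})
        high = model(**{**base, name: value * (1.0 + delta)})
        changes[name] = (low / base_output - 1.0, high / base_output - 1.0)
    ranked = sorted(
        changes.items(),
        key=lambda item: max(abs(item[1][0]), abs(item[1][1])),
        reverse=True,
    )
    return dict(ranked)


def toy_prevalence(prevalence_rate, viraemic_share, annual_net_loss):
    """Hypothetical stand-in, NOT the study's model: a starting pool that
    shrinks by a constant net fraction each year from 2005 to 2021."""
    population = 1_000_000  # hypothetical population size
    years = 16
    start = population * prevalence_rate * viraemic_share
    return start * (1.0 - annual_net_loss) ** years


base = {"prevalence_rate": 0.0137, "viraemic_share": 0.797, "annual_net_loss": 0.015}
ranking = one_at_a_time(toy_prevalence, base)
```

In this toy model the starting prevalence rate and per cent viraemic enter multiplicatively, so a ±10% change in either one passes through to the output unchanged, whereas the net loss rate has a smaller, inverse effect.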

Figure 12.

 Key drivers of uncertainty for 2021 prevalence.

The tornado diagram in Figure 12 ranks the assumptions by the magnitude of their impact on 2021 prevalence. When all assumptions were set to their base value (centre line), the forecasted 2021 prevalence was the same as the original value (100%). When the starting prevalence rate of 1.37% was increased by 10% (1.51%) or decreased by 10% (1.23%), the future prevalence increased or decreased by 10% as well. The same behaviour was observed when per cent viraemic (base value of 79.70%) was changed by 10%, implying that future prevalence is very sensitive to both of these assumptions and that any change in them will produce an equal percentage change in the forecasted prevalence.

An interesting observation was the importance of mortality rate on the future prevalence of the disease. When the mortality rate was increased by 10%, the 2021 prevalence decreased by 5%. When it was decreased by 10%, the 2021 prevalence increased by 5%. This implies that in our model, mortality is the third key driver of future prevalence after prevalence rate and per cent viraemic. After mortality, incidence rate and per cent viraemic among the incident population had the largest impact on the 2021 prevalence. Varying all other assumptions (SVR, % genotype 1, diagnosis rate, treatment rate and relapse rate) by ±10% had a <2% impact on future prevalence.


Although hepatitis C is considered an important cause of chronic liver disease, accurate and representative epidemiological data are difficult to obtain. Several models have been developed, but all predictive models are limited by the uncertainty of key assumptions. In this paper, we utilized a system-based approach for compiling epidemiological data and validating the model. These data are the result of the first phase of an ongoing disease-specific project. In this manuscript, we have described a system dynamic modelling framework that can be used to predict future prevalence of the disease in different regions; our methodology provides a unique means to understand the cause and effect relationships between the key assumptions that need to be made and future trends in the prevalence and incidence of the disease. Using a system approach, it is possible to estimate the future trend in the absence of perfect information.

A comprehensive global review of the literature was used to gather size and flow data of infected pools in 37 countries. A weighting was considered to rank the importance of the reported data. Sensitivity analyses were used to adjust for key drivers of the future prevalence. Where data were not available, a model was used to estimate mortality rates. This method provides an approach to modelling the epidemiology of HCV infection that can be used in countries where detailed data are not available, where each population group is first assessed as a dynamic system with inflows and outflows to and from the pool. The unique benefit of such an approach is that it does not require accurate information, except for the relative size of populations moving in and out.
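The idea of treating a population group as a pool with inflows and outflows can be written as a minimal annual update. This is a sketch under simplifying assumptions (a constant inflow and constant outflow rates), not the published model, and every number below is hypothetical.

```python
def simulate_pool(start, inflow_per_year, outflow_rates, years):
    """Annual stock-and-flow update for a single pool.

    Each year: pool = pool + inflow - pool * (sum of outflow rates).
    Returns the pool size for each year, starting with year 0.
    """
    pool = start
    history = [pool]
    for _ in range(years):
        outflow = pool * sum(outflow_rates.values())
        pool = pool + inflow_per_year - outflow
        history.append(pool)
    return history


# Hypothetical values: new cases flow in; deaths and cures flow out.
history = simulate_pool(
    start=1000.0,
    inflow_per_year=10.0,
    outflow_rates={"mortality": 0.03, "cured": 0.01},
    years=16,
)
```

The trajectory depends only on the size of the flows relative to the pool, which is why the relative magnitude of the inflows and outflows matters more than precise absolute counts.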

We take into account trends showing that, for example, the prevalence is expected to decline in several regions including USA, as shown by Davis et al. (17). The benefits of this approach and the simulation model were validated using data from USA, where considerable research has been carried out to understand the historical and future trends of HCV infection.

There are many difficulties in determining the prevalence of HCV. Many countries have reported cross-sectional data in unrepresentative populations – for example, blood donors. Others have relied on expert opinion. Similarly, incidence is one of the most problematic factors in estimating the future epidemiology of HCV infection for any given country, and reported prevalence rates may not be representative of a country's population. Even when robust studies are available, it is possible that the overall prevalence is underestimated. Studies in USA have shown that the true prevalence is higher than NHANES' estimates due to undersampling in high prevalence subgroups (50, 72). When underrepresented subgroups were taken into account (e.g. homeless, incarcerated, veterans, active military personnel, etc.), HCV prevalence was shown to be significantly higher. For the purpose of this work, NHANES data were used, given that the trends and conclusions will remain the same.

Our results reveal that the diagnosis of HCV has been increasing while treatment rates have been declining. The diagnosis rate in USA is higher than originally estimated. Although the exact reasons for this are not clear, it is possible that an increasing number of patients are being assessed and treated within clinical trials or that treatment is being temporarily withheld until the advent of protease inhibitors in combination with pegylated interferon and ribavirin. Another key finding of this study is that the mortality rate among HCV-infected individuals was determined to be higher than that of the general population and is a key driver of future prevalence; also, the disease burden resulting from HCV infection will increase even though the total size of the infected population will decrease.

Our model shows that mortality rates among HCV-infected individuals are higher than those of the general population, and predicts a noticeable increase in liver-related mortality rates after the age of 45 as the disease progresses, a finding that is in keeping with previous studies. For example, recently, McDonald and colleagues observed a significant excess for any diagnosis as well as liver-related diagnosis in a large cohort of HCV-infected individuals (73). However, it is important to note that extraneous mortality due to IDU overshadows all other causes of mortality in individuals between the ages of 15 and 44.

It should be pointed out that the pool of existing cases of liver disease is due to past infection. HCV was not identified until 1989, but by then had infected tens of millions of people worldwide. Thus, the prevalence of HCV infection may decrease despite the prevalence of liver disease increasing because of a significant lag between the onset of infection and advanced disease. The impact of the disease is predicted to be substantial, however. A many-fold increase in health care utilization is likely as a result of the current and impending disease burden. Hepatitis C is now a common reason for a hepatology consultation and is the single leading indication for liver transplantation in USA and several other countries.

The launch of new drugs, higher treatment levels and increased efficacy rates will accelerate the reduction in prevalence with time. Sensitivity analysis was used to further understand the cause and effect relationship between the different assumptions. Most factors that were identified as part of this analysis – prevalence, viraemic and incidence rates – were predictable. However, a critical factor highlighted as a result of this work, which has not been identified in other studies, was the large impact of the mortality rate among infected individuals. Small changes in the mortality rate seem to have a substantial impact on the future prevalence. A review of the prevalence flow diagram highlighted the importance of mortality, which outweighs both new cases (incidence) and the number of cured patients. There are two logical reasons why mortality is now showing up as a key driver of future prevalence. First, the overall prevalent population is ageing and its overall mortality rate is increasing even in the absence of liver-related deaths. Second, most new cases in USA (and many other countries) are among IDUs, who, although younger, have a much higher mortality rate between the ages of 15 and 44 due to their high-risk lifestyle. These data are supported by recent Australian, Scottish and Swedish publications. The addition of the liver-related deaths at older ages (45+) in both populations only exacerbates the overall impact.

It is possible to suggest that our analysis predicts HCV infections will simply disappear with time as the result of low incidence and high mortality rates. This is not the case. Our analysis suggests that approximately 2 million infected individuals will remain in USA by 2021. In this same timeframe, the infected population over the age of 60 will more than double, resulting in a proportional increase in liver-related diseases. Treatment rates remain extremely low for this disease and they have been declining in USA since 2004. We predict an increase in the disease burden unless there is a substantial increase in treatment and cured rates.

From a system approach, the ageing of the infected population could result in some unexpected trends. An increase in nosocomial infections could occur among elderly populations as the prevalence in this population increases and a higher percentage of the older population requires medical treatment. In fact, a review of data among US blood donors (74) already shows an increase in incidence among 50+-year-old first-time blood donors. However, this assumption is based on a continued risk of transmission in hospital settings at a time when transmission by blood products is no longer occurring in regions where sensitive testing is utilized.

The model suggests that it is feasible to eradicate HCV infections. Accelerating the decline in the prevalent population will ultimately lead to a reduction in incidence, which will further reduce prevalence. This can be accomplished through segmentation of the high-risk and infected populations and developing unified strategies for each segment. While strategies to eradicate HCV are beyond the scope of this work, there is ample evidence that such strategies could work. In France, a multidisciplinary approach to treating IDUs has significantly reduced the number of new cases in this population. However, a close collaboration and communication between the medical and public health communities responsible for diagnosis, tracking and treatment (and the development of more effective and palatable treatments by the pharmaceutical industry) will be required. Our model suggests a means for control of the infection, in the absence of a vaccine.


This study was completed through the International Conquer C Coalition (I-C3) organization. Funding for this programme was provided through an educational grant provided by Merck and support from the Center for Disease Analysis. We are indebted to all I-C3 and Regional Conquer C Coalitions members for their contributions and comments. We would also like to thank David Goldberg and Scott McDonald of Health Protection Scotland, Ann-Sofi Duberg of Department of Infectious Diseases, Örebro University Hospital, Örebro, Sweden, and Janaki Amin, of Epidemiology and Clinical Research, University of New South Wales, Australia, for their contributions and insights to the mortality section of this article. We are also grateful to Dr. Thierry Poynard of Groupe Hospitalier Pitié-Salpêtrière in Paris, France, for providing us with previously unpublished hazard ratios for untreated HCV-infected patients. Finally, we would like to acknowledge Regina Klein and Angie Largen of Center for Disease Analysis for their assistance with data gathering and analysis in the preparation of this document.

Disclosures: DK Advisory Board: MSD. HAR Grant: Merck. CLC Research: Merck, Roche, Boehringer Ingelheim, Vertex, Tibotec and Gilead. Speaker: Merck, Roche. Advisor: Merck, Roche and Vertex. AA Consultant: Roche, Gilead, Novartis, BMS, J&J, Merck, Schering Plough, Research Grants: Gilead, Merck, BMS. GMD Financial relationship with a commercial interest: GlaxoSmithKline. Grant/Research Support: Vertex. SP Board Member: BMS, Boehringer Ingelheim, Tibotec/Janssen-Cilag, Gilead, Roche, Merck/Schering-Plough and Abbott. Speaker: GSK, BMS, Boehringer Ingelheim, Tibotec/Janssen Cilag, Gilead, Roche and Schering-Plough. Grants: BMS, Gilead, Roche and Merck/Schering-Plough. EZ Nothing to disclose. KK Nothing to disclose. KHH Nothing to disclose. SZ Consultant: Abbott, Achillion, Anadys, BMS, Gilead, Itherix, Merck, Novartis, Pfizer, Roche, Santaris, Tibotec and Vertex. FN Advisor: Schering-Plough, Roche, Novartis, Abbott and Gilead.


Appendix A

Table A1.   Summary of key assumptions
Model input | Assumption | Data source
--- | --- | ---
Incidence rate | 2004 incidence rate by age and gender | Center for Disease Control, Hepatitis Surveillance Report (36)
Acute to chronic rate | 82% | Thomas and Seeff (28)
Prevalence rate | 2002 prevalence by age and gender applied to 2004 population | Armstrong et al. (49)
Viraemic rate | 79.7% | Armstrong et al. (49)
Mortality model input – % of infected population with risk factors | IDU (PopIDU) – 47% | Daniels et al. (27)
 | Transfusion (PopTrans) – 7% | Alter et al. (48)
Genotype distribution | G1 – 73.7% | Alter et al. (48)
 | G2 – 14.9% |
 | G3 – 7.4% |
 | Other – 4% |
Annual newly diagnosed rate | 69.2 per 100 000 | Klevens et al. (58)
Treated population | 144k in 2004 to 83k in 2007 | Volk et al. (5)
Relapse rate | G1 – 23% | WINR Study, Jacobson et al. (75)
 | G2 – 4.2% |
 | G3 – 10.6% |
Sustained viral response rate | G1 – 40% | IDEAL Trials, McHutchison et al. (76)
 | G2 & 3 – 72% | WINR Study, Jacobson et al. (75)