Carcinoma of the breast has recently become the leading cause of newly diagnosed cancer among Hong Kong Chinese women,1 and a major cause of mortality. The rising incidence in Hong Kong2 tops virtually all other East Asian populations except Singapore's.3 Hong Kong reported age-standardized rates of 36.2 per 100,000 women compared to between 10.0 and 27.2 for China (Qidong County, Shanghai, and Tianjin), between 23.6 and 36.6 in Japan and 43.5 per 100,000 in Singapore during 1993–1997.3 Nevertheless, the age-standardized incidence rate remains only about half of those reported in North America, western Europe and Australasia.3, 4
Our previous work2, 5 found that the average annual increase in age-standardized incidence of 1.2% during 1973–1999 was mainly driven by a strong cohort effect, likely related to migration patterns, increasing affluence and westernization as a result of socioeconomic development. Thus, it would be useful to anticipate the impact of these factors on future disease rates. The accurate prediction of future breast cancer trends is vital for public health planning, resource allocation and policy making, not least to decide whether mammographic screening is likely to become less ineffective in the short- to medium-term for low-prevalence populations, such as Hong Kong's.6 In addition, since our previous work, life course theory has rapidly advanced our thinking on the relevance of social history in the evolution of population disease risk.7 Here, we specifically attempted to interpret past trends not only from the perspective of contemporaneous lifestyle factors, but also to discern the epidemiologic imprint of Hong Kong's socioeconomic developmental history. Thus, in the prediction phase of the analysis, we could extend this line of reasoning to better inform specification of the model.
While breast cancer incidence in Hong Kong may continue to rise to levels currently experienced by western women as newer cohorts continue to adopt an increasingly western-influenced lifestyle, it is plausible that the cohort-driven incidence rise over the last 30 years might have been a transient phenomenon due to the initial wave of socioeconomic development and concomitant westernization with its unique colonial, historical and developmental circumstances rather than being part of a sustained accelerated epidemiologic pattern. More particularly, the development-induced cohort effect from improved nutrition and living conditions could exert influences through several pathways with differential effects on pre- and postmenopausal breast cancer, and effects over many generations, although inevitably much of the evidence to date has come from Caucasian studies and may not necessarily extend to Asian populations. Higher birth weight is primarily associated with premenopausal breast cancer8 potentially through higher in utero estrogen exposure.9 On the other hand, migration studies have indicated that early life in a more affluent environment also increases breast cancer risk.10, 11 This is consistent with cohort studies suggesting an independent contribution to carcinogenesis of faster and greater linear growth, especially at puberty, impacting both pre- and postmenopausal disease. This contribution is possibly mediated by higher levels of growth factors or IGF-1 or faster growth allowing less DNA repair,12, 13 although the mechanisms are not at all clear, nor is it clear whether age of menarche is a risk factor as a marker of these growth processes or an additional factor.14 Because of epigenetic constraints on growth increments between generations, increases in height may take place over many generations15 and similarly breast cancer incidence, and particularly postmenopausal breast cancer, increases with the number of generations in an economically developed environment.16 Further, the epidemiologic evolution of breast cancer incidence in the developed population of Hong Kong may act as a reliable epidemiological sentinel for the rest of developing China, which comprises one-fifth of the world's population.
In this article, we used an age-period-cohort model to project breast cancer incidence in Hong Kong to 2018 and to understand the relative contributions of different determinants of future disease trends.
Sources of data
We obtained data on female breast cancer incidence and mid-year population figures for the years 1974–2003 from the Hong Kong Cancer Registry and the Census and Statistics Department,17 respectively. The Hong Kong Cancer Registry is a population-based registry covering the entire local resident population (6.8 million people in 2003). Information on newly diagnosed cases was collected from both the private and public service sectors (mainly through departments of clinical and radiation oncology and histopathology), and from the Government's Births, Deaths and Marriages Registry, as well as voluntary notification from all medical practitioners. The completeness and quality of the data have been reported to be good, with over 95% coverage for most cancers. The Hong Kong Cancer Registry is an accredited member of the International Association of Cancer Registries.
We included all 34,052 breast cancer cases registered during the period of observation in the analysis. We grouped the incidence and population data into fourteen 5-year age groups from 20 to 24 years to 85 or above, and six 5-year calendar time periods from 1974–1978 to 1999–2003, respectively. This classification resulted in 19 birth cohorts centered at 5-year intervals beginning in the calendar year 1894.
Age-adjusted incidence rates were calculated by direct standardization according to the World Standard Population18 and expressed per 100,000 women, since this is the reference population currently adopted by the Hong Kong cancer registry.1 We further calculated the age-standardized rates using the Segi population,19 to allow comparison with previous studies.
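Direct standardization amounts to a weighted average of age-specific rates, with weights taken from the reference population. The following is a minimal sketch of that calculation; the function name, the three age groups and all numbers are hypothetical and are not taken from the registry data or from either standard population.

```python
def age_standardized_rate(cases, population, std_weights, per=100_000):
    """Directly standardized rate: weighted mean of age-specific rates.

    cases, population and std_weights each hold one entry per age group.
    std_weights are the reference-population counts (or proportions).
    """
    if not (len(cases) == len(population) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    weighted_sum = sum(
        w * (c / n) for c, n, w in zip(cases, population, std_weights)
    )
    return per * weighted_sum / total_weight


# Hypothetical three-age-group illustration (not study data).
cases = [10, 40, 50]
population = [100_000, 100_000, 50_000]
std_weights = [50, 30, 20]

asr = age_standardized_rate(cases, population, std_weights)
print(f"Age-standardized rate: {asr:.1f} per 100,000")
```

Swapping in a different set of reference weights changes only `std_weights`, which is why the same age-specific rates can yield different standardized rates under two reference populations.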
We modeled breast cancer incidence using an age-period-cohort model20, 21, 22, 23 which decomposes disease rates over time by chronological age, calendar period and birth cohort. We chose the second and the penultimate periods and the central birth cohort as the reference categories. We used Bayesian inference to estimate the model parameters,24, 25, 26, 27 and the fitted model was used to project future incidence in 3 further 5-year periods up to 2014–2018.
For the age, period and cohort effects we specified second-order Gaussian autoregressive priors in the forward direction.28 These priors specified that the initial expected value of each effect was based on an extrapolation from its 2 immediate predecessors. We extrapolated 3 additional period and cohort effects so as to allow projections of future incidence.
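The extrapolation step can be illustrated with the prior mean alone: under a second-order autoregressive prior, the expected value of the next effect lies on the straight line through its 2 predecessors. The sketch below shows only this prior expectation, not the Gaussian variation around it or the posterior updating performed in the actual analysis; the function name and the effect values are hypothetical.

```python
def extrapolate_effects(effects, n_future):
    """Extend a sequence of effects by linear extrapolation.

    Each new value is 2 * (last value) - (second-to-last value), which is
    the prior expected value under a second-order autoregressive prior.
    """
    if len(effects) < 2:
        raise ValueError("need at least two effects to extrapolate")
    extended = list(effects)
    for _ in range(n_future):
        extended.append(2 * extended[-1] - extended[-2])
    return extended


# Hypothetical log-scale period effects for 6 observed periods.
period_effects = [-0.10, 0.00, 0.05, 0.12, 0.18, 0.25]

# Three additional periods, as needed for projection to 2014-2018.
projected = extrapolate_effects(period_effects, 3)
print([round(x, 2) for x in projected[-3:]])
```

Because each extrapolated value depends only on the 2 most recent effects, the projections are sensitive to the slope between the last 2 estimated effects.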
We estimated the model parameters using Markov chain Monte Carlo (MCMC) simulations with 5 concurrent chains started at different initial values, since comparison of multiple chains allows convergence to be assessed. We used the criterion R-hat to monitor convergence.29 On the basis of the values of R-hat, we discarded the first 10,000 samples as a burn-in period, and then took a further 40,000 samples from the posterior distributions. The parameter estimates and derived rates were summarized in terms of posterior means and 95% credible intervals. Further technical details are given in the Appendix.
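As an illustration of how multiple chains are compared, the sketch below computes a potential scale reduction factor from between-chain and within-chain variances. It assumes that R-hat here refers to the basic Gelman-Rubin form of this statistic; the function name and the simulated chains are hypothetical and stand in for real sampler output.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains: list of equal-length lists of sampled values, one per chain.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need >= 2 chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # W: mean within-chain variance
    between = n * variance(chain_means)            # B: between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return (pooled / within) ** 0.5


rng = random.Random(0)

# Five hypothetical chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0, 1) for _ in range(2000)] for _ in range(5)]
# Five hypothetical chains stuck around different values (not converged).
stuck = [[rng.gauss(3 * k, 1) for _ in range(2000)] for k in range(5)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"well mixed: {rhat_mixed:.3f}, stuck: {rhat_stuck:.3f}")
```

Values close to 1 indicate that the chains agree with one another; chains that started far apart and have not yet met give values well above 1.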
The model goodness-of-fit was measured by the posterior mean deviance D.30 To compare models, the deviance information criterion (DIC) was calculated, which adjusts the posterior mean deviance for the number of parameters in the model.30 A smaller DIC implies a better fit. We plotted the Pearson residuals to verify that the fit of the final model was appropriate.
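The relationship between the posterior mean deviance and the DIC can be sketched as follows, using the standard definition in which the effective number of parameters is the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters. The function name and the numbers are hypothetical; in practice these quantities are reported directly by the sampling software.

```python
from statistics import mean


def dic_summary(deviance_samples, deviance_at_posterior_mean):
    """Return posterior mean deviance, effective parameters and DIC.

    deviance_samples: deviance evaluated at each retained MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior
    means of the parameters.
    """
    d_bar = mean(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean   # effective number of parameters
    return {"D_bar": d_bar, "pD": p_d, "DIC": d_bar + p_d}


# Hypothetical deviance draws for two candidate models.
simple_model = dic_summary([140.0, 142.0, 144.0], 139.0)
full_model = dic_summary([100.0, 102.0, 104.0], 98.0)

better = "full" if full_model["DIC"] < simple_model["DIC"] else "simple"
print(simple_model, full_model, better)
```

A richer model is preferred only if its gain in fit outweighs the penalty for its additional effective parameters.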
To examine whether the results differed for pre- vs. postmenopausal cancer, we looked for any potential difference in cohort effects in women diagnosed at age 50 or below (i.e. premenopausal) vs. those above 50 years (postmenopausal) by including age-cohort interaction terms. Alternatively, we also stratified our original set of analyses by age of diagnosis (≤50 vs. >50 years). We also checked different cut-points of menopausal ages (45 or 55 years) in the sensitivity analyses.
All analyses were implemented using R version 2.3.031 and WinBUGS version 1.4.32
We fitted age, period and cohort effects sequentially and compared different models in terms of D and DIC. The full age-period-cohort model provided the best fit with substantially smaller values of D and DIC compared with the other partial models (Table I).
Table I. Summary Goodness-of-Fit Statistics of Age, Age-Period, Age-Cohort and Age-Period-Cohort Models for Female Breast Cancer Incidence in Hong Kong from 1974 to 2003
Columns: Posterior mean deviance, D; Deviance information criterion, DIC
Rows: Age; Age-period; Age-cohort; Age-period-cohort; Age-period-cohort with age-cohort interaction term
Figure 1 plots the age-standardized annual incidence rates for women from 1974 to 2003 and the projected incidence to 2018. Historically the age-standardized annual incidence rates rose on average 1.2% annually in the 3 decades between 1974 and 2003. Our model projected that incidence would continue to rise by a total of 18.3% in the next 15 years, or 1.1% per annum, from 1999–2003 (45.9 per 100,000 women) to 2014–2018 (54.3 per 100,000 women). Alternatively, standardizing to the Segi population,19 the age-standardized incidence rose on average 1.1% annually between 1974 and 2003, and is projected to rise 1.2% per annum from 1999–2003 (41.8 per 100,000 women) to 2014–2018 (50.2 per 100,000 women).
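The total and per annum figures above are linked by compound growth. The short check below reproduces them from the reported rates, assuming a 15-year interval between the two 5-year periods and a constant compound annual rate of change; the function name is illustrative.

```python
def percent_change(start_rate, end_rate, years):
    """Return (total % change, compound annual % change)."""
    ratio = end_rate / start_rate
    total = (ratio - 1) * 100
    annual = (ratio ** (1 / years) - 1) * 100
    return total, annual


# Rates per 100,000 women for 1999-2003 and 2014-2018, as reported above.
world_total, world_annual = percent_change(45.9, 54.3, 15)
segi_total, segi_annual = percent_change(41.8, 50.2, 15)

print(f"World standard: {world_total:.1f}% total, {world_annual:.1f}% per annum")
print(f"Segi standard:  {segi_total:.1f}% total, {segi_annual:.1f}% per annum")
```

Under these assumptions the World Standard figures come out at 18.3% in total and 1.1% per annum, and the Segi figures at 1.2% per annum, matching the values in the text.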
The observed age-specific incidence rates per 100,000 women in each of the six 5-year time periods from 1974–1978 to 1999–2003 are shown as points in the panels of Figure 2, where each panel corresponds to a different age group. Superimposed are the fitted incidence rates from the age-period-cohort model, and the empirical projections for the next 3 periods (2004–2008, 2009–2013, 2014–2018) with 95% credible intervals. Upward trends are apparent for most age groups. Rising rates were projected to continue into the future for women between 50 and 80 years whereas recent incidence increases in women in their 4th and 5th decades would likely plateau. However, breast cancer incidence for those aged 85 or above was projected to decline.
The estimated parameter values of the age, period and cohort components are shown in Figure 3. Because of the well-known identifiability problem of age-period-cohort models, where there is inherent linear dependence between the 3 component effects (i.e., cohort = period − age), only second-order changes (i.e., changes in slopes or inflection points) are interpretable.22, 33 Figure 3a shows the fitted age-specific incidence rates per 100,000 women, and reveals no clear nonlinear age dependency of incidence except perhaps for a slight change in slope (i.e., a deceleration) centered around age 50. Relative risks were calculated for the 6 time periods from 1974 to 2003 and 19 birth cohorts from 1894 to 1979 (Fig. 3b). Cohort effects dominated and 3 inflection points (circa 1910, 1940 and 1960) can be clearly identified, while there were negligible second-order changes in period effects.
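The identifiability problem can be made concrete with a small numerical sketch. Using the cohort indexing k = I + j − i given in the Appendix, a linear trend can be added to the age and cohort effects and removed from the period effects without changing any fitted value, while the second differences of each set of effects stay the same. All effect values and the trend size `c` below are hypothetical.

```python
I, J = 14, 6          # age groups and periods, as in the analysis
K = I + J - 1         # number of birth cohorts


def cohort_index(i, j):
    """Cohort index for age group i and period j (1-based)."""
    return I + j - i


def linear_predictor(alpha, beta, gamma):
    """Sum of age, period and cohort effects for every age-period cell."""
    return [
        [alpha[i - 1] + beta[j - 1] + gamma[cohort_index(i, j) - 1]
         for j in range(1, J + 1)]
        for i in range(1, I + 1)
    ]


def second_differences(x):
    """Changes in slope: x[t] - 2*x[t-1] + x[t-2]."""
    return [x[t] - 2 * x[t - 1] + x[t - 2] for t in range(2, len(x))]


# Hypothetical effects (not estimates from the study).
alpha = [0.30 * i - 0.010 * i * i for i in range(1, I + 1)]
beta = [0.02 * j for j in range(1, J + 1)]
gamma = [0.10 * k - 0.002 * k * k for k in range(1, K + 1)]

# Shift a linear trend of arbitrary size c between the three components.
c = 0.5
alpha2 = [a + c * (i - I) for i, a in enumerate(alpha, start=1)]
beta2 = [b - c * j for j, b in enumerate(beta, start=1)]
gamma2 = [g + c * k for k, g in enumerate(gamma, start=1)]

eta1 = linear_predictor(alpha, beta, gamma)
eta2 = linear_predictor(alpha2, beta2, gamma2)
max_gap = max(abs(a - b) for r1, r2 in zip(eta1, eta2) for a, b in zip(r1, r2))
print(f"largest difference in fitted values: {max_gap:.2e}")
```

The two parameter sets give the same fit but different slopes, so only features that survive this transformation, such as inflection points, can be interpreted.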
When we estimated the cohort effects separately for pre- and postmenopausal women, we found no evidence of a significant age-cohort interaction (Fig. 4) or different cohort effects under models stratified by age at 45, 50 or 55 years (Fig. 5). Furthermore, the inclusion of age-cohort interaction terms did not improve on the full age-period-cohort base case model as shown by the slightly higher value of DIC for the former in Table I.
Our findings predict that breast cancer incidence would maintain a 1.1% per annum rate of increase over the next 15 years, similar to the 1.2% annual percent change in the previous 30 years. Cumulatively this represents a total increase of 18.3% between 2004 and 2018. These trends would continue to be mainly driven by ageing and cohort effects, which began with the postwar baby boomer generation; whereas there is no evidence of important changes in risk by calendar time period. We were limited by the historical data series such that projecting disease rates further into the future beyond 2018 (i.e., extrapolating >50% beyond available data) would be inappropriate.
Women born in the earliest birth cohorts had the lowest breast cancer incidence, likely due to a combination of slow growth and late menarche (due to poor nutrition), early first pregnancy, high fertility and universal breastfeeding; and a shorter life expectancy which would have resulted in competing mortality risks where breast cancer did not develop in those susceptible to the disease. Starting with the postwar migrants who were the first generation of girls (and particularly of adolescent girls) to experience in large numbers the more westernized lifestyle associated with socioeconomic development, the cohort effect increased as indicated by the positive slope after the late 1930s and early 1940s (Fig. 3b). Although Hong Kong had a lower gross domestic product per head than western Europe in the 1950s, there were undoubtedly better childhood living conditions relative to previous generations of Hong Kong residents who mainly grew up in China. For example, Hong Kong men born between 1950 and 1959 were about 2.5 cm taller than educated men of earlier generations attending Canton Christian College in 1924.34, 35, 36
The cohort effects plateaued for women born after 1960, even while socioeconomic development continued as shown by a linear rise in gross domestic product (GDP) per capita from 1960 to 2006 (data not shown). This may suggest there is an upper threshold to the population effects of transitioning to a developed society: for instance, Hong Kong's total fertility rate has stayed the lowest in the world at 0.9 (where replacement is 2.1) and is unlikely to decline further, ever breastfeeding prevalence has recovered from its nadir at around 25% in the mid-1980s37 to about 50% currently,38 and age at menarche has shifted to its physiologic minimum at 11.7 years39 which is comparable to western figures.40 On the other hand, it may be that the effects of historical events related to that particular cohort had largely plateaued. By about 1960, the proportion of Hong Kong girls growing up in Hong Kong plateaued at about 80%, while the amount of intergenerational change in early life breast cancer risk factors, such as birth weight, linear growth and earlier menarche, is constrained. It will take more than 1 generation for these breast cancer risk factors to reach western levels,15 so that the full effects of an economically developed lifestyle could take many generations to become evident. An upward inflection in breast cancer risk might be expected in the next generation of women to grow up in Hong Kong, i.e., the daughters of this first generation who largely grew up in Hong Kong. This second generation of women largely growing up in Hong Kong (and born in Hong Kong) is most likely to be birth cohorts starting in 1975. Once this second generation has reached an age of vulnerability to breast cancer, another upward inflection might be expected. We conducted a sensitivity analysis assuming that there was such a third wave of cohort effects starting after 1960 in which the rate of change was similar to that of the 2 earlier waves of cohort effects (1890–1910, 1930–1960). We found that the increasing cohort effects did not change the projected age-standardized incidence rates substantially; the resulting rates were close to the upper limits of the 95% credible intervals based on the base age-period-cohort model (Fig. 6).
Cohort-driven effects in breast cancer incidence have been noted in a large number of Asian and western studies: Japan, Singapore, Taiwan, Canada, Connecticut and Sweden.41, 42, 43, 44, 45, 46, 47, 48 All these studies showed a strong birth cohort effect, confirming the central contribution of reproductive, lifestyle and other possible environmental factors in the aetiology of breast cancer. As in Hong Kong, the cohort effects were flat for birth cohorts in Singapore from 1895 to 1930 and increased steadily from the 1930 birth cohort until the 1965 birth cohort,48 although Singapore has a different migrant history from Hong Kong. Similarly, there was a rapid increase in breast cancer incidence in Japan after the 1930s birth cohorts until the 1960 birth cohort.42
Related to this, we also project that the rising age-specific incidence rates for women between age 50 and 80 years will continue over the next 15 years. This phenomenon is not observed in most other countries that have transitioned through socioeconomic development in the more distant past. In Hong Kong, it is likely due to ageing in the cohort starting with the postwar migrants, who were the first generation of girls to have grown up in an increasingly westernized environment as Hong Kong underwent rapid economic transition.
It has been suggested that the diagnosis of breast cancer encompasses several distinct pathologic entities which could be broadly divided into pre- and postmenopausal disease.49 However, our analysis of age-cohort interactions (Fig. 4) and stratification at age of menopause (Fig. 5) suggests that the risk due to birth cohort did not vary for pre- vs. postmenopausal cancers and the strong effects of birth cohort were similar for both groups, as has been previously observed by others.48, 50 These sub-analyses suggest that the process of westernization increased the risk of both pre- and postmenopausal disease to a similar extent.
Finally, from the perspective of public health planning and policy making, the projected sustained rise in incidence during the medium term to 2018 calls for a concomitant increase in resource allocation for the early diagnosis and treatment of breast cancer. Moreover, the present policy of no mass mammographic screening may need to be reviewed. As disease prevalence at screen increases, the false positivity rate and associated harm fall, the positive predictive value goes up and the cost-effectiveness of the preventive intervention may become less unfavorable.
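The dependence of positive predictive value on prevalence follows from Bayes' theorem and can be sketched numerically. The sensitivity, specificity and prevalence values below are hypothetical round numbers chosen for illustration; they are not estimates for mammography in Hong Kong.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive screen (Bayes' theorem)."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics and screen prevalences.
sensitivity, specificity = 0.85, 0.90
prevalences = [0.002, 0.004, 0.006]

ppvs = [positive_predictive_value(p, sensitivity, specificity) for p in prevalences]
for p, ppv in zip(prevalences, ppvs):
    # 1 - PPV is the share of positive screens that are false positives.
    print(f"prevalence {p:.3f}: PPV {ppv:.3f}, false share {1 - ppv:.3f}")
```

With test characteristics held fixed, a higher prevalence raises the positive predictive value and lowers the proportion of positive screens that are false alarms.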
In summary, our analysis shows that the previously observed increase2, 5 in breast cancer incidence has continued and confirms that this is mainly attributable to ageing and strong cohort effects. Of particular novelty, our findings can be consistently explained by Hong Kong's developmental history as reflected through disease rates in successive birth cohorts. On the basis of this insight, we presented an additional plausible “high burden” scenario to project future cancer incidence, in addition to the base case statistical approach.
The age-period-cohort model
We assumed that the number of observed breast cancer cases, cij, out of a population size of nij women in age group i in time period j, followed a Poisson distribution with mean μij. Under the full age-period-cohort model the mean is specified as:
$$\log(\mu_{ij}) = \log(n_{ij}) + \alpha_i + \beta_j + \gamma_k$$
where αi (i = 1, …, I, I = 14) is the age effect, βj is the period effect (j = 1, …, J, J = 6), γk is the cohort effect (k = 1, …, K, K = 19 and k = I + j − i) and log(nij) is the offset term. Including projections for N further periods, the fitted and projected incidence rates were obtained through a recombination of the smoothed age αi, period βj and cohort γk effects according to the relationship
$$\mu_{ij} / n_{ij} = \exp(\alpha_i + \beta_j + \gamma_k), \quad j = 1, \ldots, J + N$$
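To make the indexing concrete, the sketch below evaluates the expected number of cases for one age-period cell and enumerates the cohort indices, including those reached by N = 3 projected periods. It assumes the usual log-linear form of the Poisson age-period-cohort model with the population as offset; the function names and the effect values are hypothetical.

```python
import math

I, J, K = 14, 6, 19   # age groups, observed periods, observed cohorts
N = 3                 # projected periods


def cohort_index(i, j):
    """Cohort index k = I + j - i for age group i and period j (1-based)."""
    return I + j - i


def expected_cases(n_ij, alpha_i, beta_j, gamma_k):
    """Poisson mean under a log link with log(n_ij) as the offset."""
    return math.exp(math.log(n_ij) + alpha_i + beta_j + gamma_k)


def rate_per_100k(alpha_i, beta_j, gamma_k):
    """Incidence rate implied by the three effects, per 100,000 women."""
    return 100_000 * math.exp(alpha_i + beta_j + gamma_k)


# Cohorts covered by the observed data and by the projected periods.
observed = {cohort_index(i, j) for i in range(1, I + 1) for j in range(1, J + 1)}
with_projection = {
    cohort_index(i, j) for i in range(1, I + 1) for j in range(1, J + N + 1)
}

# Hypothetical effects giving a rate of 50 per 100,000 in the reference cell.
alpha_i, beta_j, gamma_k = math.log(50 / 100_000), 0.0, 0.0
mu = expected_cases(200_000, alpha_i, beta_j, gamma_k)
print(len(observed), len(with_projection), round(mu, 1))
```

The youngest age group in each projected period belongs to a cohort not seen in the observed data, which is why cohort effects as well as period effects have to be extrapolated.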
Autoregressive priors for age, period and cohort effects
Under the Bayesian framework, we specified the Gaussian autoregressive priors for the I age effects αi,
where the hyperparameter σα is a precision parameter controlling the degree of smoothing on each time scale and was given a noninformative uniform(0, 1000) prior. We used similar prior distributions for the period effects βj and the cohort effects γk with hyperparameters σβ and σγ, respectively. The autoregressive priors provided nonparametric smoothing of the estimated age, period and cohort effects, and allowed extrapolation of future period and cohort effects based on the most recent 2 period and cohort effects, respectively.
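For reference, a second-order Gaussian autoregressive prior in the forward direction is commonly written in the following generic form, in which each effect is centered on the linear extrapolation of its 2 predecessors. This is a sketch of the standard formulation rather than a transcription of the specification used here: the symbol τα for the precision of the increments is introduced for illustration only, and the first 2 effects are usually given vague priors.

```latex
\alpha_i \mid \alpha_{i-1}, \alpha_{i-2} \sim
  \mathcal{N}\!\left( 2\alpha_{i-1} - \alpha_{i-2},\; \tau_\alpha^{-1} \right),
  \qquad i = 3, \ldots, I
```

A larger precision pulls successive effects closer to a straight line and so produces a smoother curve; a smaller precision lets the effects follow the data more closely.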
We obtained all parameter estimates by Markov chain Monte Carlo (MCMC) simulation methods. In the simulations, 5 chains each with 50,000 iterations were run with the first 10,000 iterations of each chain used as a ‘burn-in’ period to minimize the effect of initial values. Initial values for each chain were randomly chosen from normal(0,100) distributions for α, β and γ, and uniform(0,1000) distributions for σα, σβ and σγ. We monitored convergence by comparing the posterior distributions across the multiple chains (which had started at different values) and calculating the R-hat statistic, with convergence defined as the point when this statistic fell below the threshold of 1.2.29 We sampled every 5th value (‘thinning’) after the burn-in period rather than every single value to reduce the autocorrelation in the sampled values and also to reduce memory storage requirements, and a total of 40,000 sample values were retained from the posterior distributions and used for parameter inference. All parameters and functions of parameter estimates (i.e., fitted and projected rates) were summarized by the posterior means, and 95% credible intervals were calculated from the 2.5th percentile and 97.5th percentile of the sampled values.
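The sampling scheme and the summaries can be sketched as follows. The chain counts, burn-in and thinning interval are those described above; the simulated draws, the function names and the linear-interpolation percentile rule are assumptions made for illustration, since the sampling software applies its own conventions.

```python
import random
from statistics import mean


def thin_and_pool(chains, burn_in, thin):
    """Drop the burn-in from each chain, keep every `thin`-th draw, pool."""
    pooled = []
    for chain in chains:
        pooled.extend(chain[burn_in::thin])
    return pooled


def percentile(sorted_values, q):
    """Percentile by linear interpolation between order statistics."""
    pos = (len(sorted_values) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarize(samples):
    """Posterior mean and 95% credible interval from sampled values."""
    ordered = sorted(samples)
    return mean(ordered), percentile(ordered, 2.5), percentile(ordered, 97.5)


# Hypothetical draws standing in for one parameter's MCMC output.
rng = random.Random(1)
chains = [[rng.gauss(0.2, 0.05) for _ in range(50_000)] for _ in range(5)]

retained = thin_and_pool(chains, burn_in=10_000, thin=5)
post_mean, lower, upper = summarize(retained)
print(len(retained), round(post_mean, 3), round(lower, 3), round(upper, 3))
```

With 5 chains of 50,000 iterations, a burn-in of 10,000 and every 5th value kept, each chain contributes 8,000 draws, giving the 40,000 retained values.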