Analysis of retinoblastoma age incidence data using a fully stochastic cancer model

Authors

  • Mark P. Little,

    Corresponding author
    1. Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, DHHS, NIH, Bethesda, MD
    • Radiation Epidemiology Branch, National Cancer Institute, DHHS, NIH, Division of Cancer Epidemiology and Genetics, Bethesda, MD 20852-7238, USA
  • Ruth A. Kleinerman,

    1. Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, DHHS, NIH, Bethesda, MD
  • Charles A. Stiller,

    1. Childhood Cancer Research Group, Department of Paediatrics, University of Oxford, Headington, Oxford, United Kingdom
  • Guangquan Li,

    1. Department of Epidemiology and Biostatistics, School of Public Health, Faculty of Medicine, Imperial College, London, United Kingdom
  • Mary E. Kroll,

    1. Childhood Cancer Research Group, Department of Paediatrics, University of Oxford, Headington, Oxford, United Kingdom
    2. Cancer Epidemiology Unit, Nuffield Department of Clinical Medicine, University of Oxford, Headington, Oxford, United Kingdom
  • Michael F.G. Murphy

    1. Childhood Cancer Research Group, Department of Paediatrics, University of Oxford, Headington, Oxford, United Kingdom

Abstract

Retinoblastoma (RB) is an important ocular malignancy of childhood. It has been commonly accepted for some time that knockout of the two alleles of the RB1 gene is the principal molecular target associated with the occurrence of RB. In this article, we examine the validity of the two-hit theory for RB by comparing the fit of a stochastic model with two or more mutational stages. Unlike many such models, our model assumes a fully stochastic stem cell compartment, which is crucial to its behavior. Models are fitted to a population-based dataset comprising 1,553 cases of RB for the period 1962–2000 in Great Britain (England, Scotland and Wales). The population incidence of RB is best described by a fully stochastic model with two stages, although models with a deterministic stem cell compartment yield equivalent fit; models with three or more stages fit much less well. The results strongly suggest that knockout of the two alleles of the RB1 gene is necessary and may be largely sufficient for the development of RB, in support of Knudson's two-hit hypothesis.

Introduction

Retinoblastoma (RB) is an important ocular malignancy of childhood.1 It has become commonly accepted that knockout of the two alleles of the RB1 gene on chromosome 13q14.2 is the principal molecular target associated with the occurrence of RB.2–6 The normal RB1 gene encodes proteins that regulate the cell cycle, cell survival and differentiation.7 Inactivation of RB1 potentially disrupts the growth pattern of retinal cells and results in uncontrolled cell growth, a hallmark of cancer. The majority, about 60%, of RB patients (the nonheritable cases) are thought to develop the disease through two somatic mutations to, or deletion of, the RB1 gene.5 These patients have unilateral disease, in which tumors appear in one eye only. A smaller number, about 40%, of patients possess a germline RB1 mutation; the majority of these have bilateral disease, i.e., develop RB in both eyes. About 2% of unilateral sporadic cases, however, are also germline mutation carriers.8 Individuals possessing a germline RB1 mutation are thought to have on average approximately a 90% risk of developing RB.

There have been extensive modeling studies of RB. Based on the characteristics of the distribution of age at diagnosis, Knudson5 proposed a so-called two-hit model, which hypothesizes that two mutational events are sufficient to induce RB. Moolgavkar and Venzon9 formulated a model in which malignancy is assumed to arise from two successive mutations. They described both a fully stochastic model, with a birth–death process with time-constant rates for both the normal and intermediate cell populations, and a semi-stochastic approximation, in which the normal stem cell population was modeled as a deterministic function. The semi-stochastic version of the model has been extensively used over the last 20 years, but little use has been made of the fully stochastic model.

Although some modeling studies have supported the two-hit hypothesis,10, 11 others have not.12 In this article, we examine the validity of the two-hit theory for RB by comparing fits of models with different numbers of mutational stages in a large UK-based dataset. This approach may be contrasted with previous studies, which for example either used non-population based case-only data5, 13 or used non-mechanistic models.10, 12 By allowing for stochastic growth of retinal cells, the models we fit are an extension of the models of Moolgavkar and Venzon9 and Little.14 We shall obtain estimates of the RB1 mutation rates and other parameters, which we shall discuss in the light of known data.

Data

A total of 1,553 cases of RB were ascertained from the population-based registers kept by the Childhood Cancer Research Group for the period 1962–2000 in Great Britain (England, Scotland and Wales). Patients in this study were grouped into four categories by laterality and known family history, namely, bilateral cases with known family history (BH), bilateral cases without known family history (BnH), unilateral cases with known family history (UH) and unilateral cases without known family history (UnH). Two additional small groups of RB (23 cases in total) with unknown laterality were excluded from further analyses. Further breakdown of case numbers by age is given in Table 1.

Table 1. Number of retinoblastoma cases by age and type in the period 1962–2000 in the CCRG registry
[Table 1 is available only as an image in the source and is not reproduced here.]

Patients with known family history are defined by the diagnosis of RB in any relative of the index patient, either before or after the diagnosis of the index patient (this is slightly different from the definition used elsewhere for these data, of a previous generation, a collateral family line or siblings8). Other cases are categorized as being with unknown family history, effectively meaning that there is no known first-degree relative affected by RB. (It should be noted that age at diagnosis is a poor marker of disease severity in familial RB due to changing ophthalmologic follow-up in more recent years.15) Throughout this article, these two categories of patients are referred to as familial and nonfamilial cases, respectively. Adoption is a potential source of incompleteness in family history data. The minimum proportions of adopted cases (for whom family history information would not always be available) born and diagnosed between 1962 and 1999 were 0.7%, with an (unlikely) upper limit of 2.4%, for the putatively heritable cases and 1.0%, with an (unlikely) upper limit of 5.2%, for the putatively nonheritable cases.16

Statistical Methods

Both the fully stochastic and semi-stochastic versions of the two-stage MVK model assume that neoplasia results from an accumulation of two mutational events.9 For homozygous wildtype RB1 individuals, the two-stage fully stochastic model starts with one embryonic (retinal) stem cell at or near conception with a pair of intact RB1 alleles. This cell undergoes division and differentiation to generate the required number of cells in the retina. Cell divisions take place and produce a copy of the ancestral cell and an intermediate cell with a hit in one of its two RB1 alleles. In addition to cell division and differentiation, the resultant intermediate cell may experience a further RB1 hit and become malignant, developing rapidly into a detectable tumor. Similarly, the one-stage model assumes all cells initially carry a defective copy of RB1, inherited or from a germinal mutation, so that one more hit to the wildtype RB1 allele is sufficient to cause RB. Details of fitted models are given in Tables 2 and 3 and Supporting Information Tables A1 and A2. We shall also consider the three-stage model, in which full malignant transformation requires three mutational events (details in Table 2, Supporting Information Tables A5 and A6). In contrast to previous approaches, the embryonic cell population is modeled here by a stochastic birth and death process in which the growth rate and the rate of differentiation are assumed to be piecewise constant. The growth rate is constrained (at 19.87 year⁻¹) so that the expected number of retinal stem cells by 6 months (182 days) gestation agrees with previously derived estimates, 20,000 cells,17 and the death/differentiation rate is set to 0. Further details on derivation of the hazard function are given in Supporting Information Appendix B. We therefore further assume that the growth patterns of cells in the first compartment of the one-, two- and three-stage models are identical.
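The constrained growth rate can be checked with a short calculation. The sketch below assumes a single founder cell, a zero death/differentiation rate up to day 182, and the standard result that the expected size of a linear birth process with per-cell rate G is exp(G·t); the function and constant names are illustrative and are not taken from the authors' FORTRAN programs.

```python
import math

DAYS_PER_YEAR = 365.2425   # year length used in the text
TARGET_CELLS = 20_000      # expected retinal stem cells at 6 months gestation
TARGET_DAY = 182           # days from conception


def growth_rate_for_target(target_cells, days, n0=1.0):
    """Per-cell growth rate G (per year) such that n0 * exp(G * t) equals
    target_cells, assuming a pure birth process (death rate 0)."""
    t_years = days / DAYS_PER_YEAR
    return math.log(target_cells / n0) / t_years


G = growth_rate_for_target(TARGET_CELLS, TARGET_DAY)
print(f"t = {TARGET_DAY / DAYS_PER_YEAR:.3f} years, G = {G:.2f} per year")
# prints: t = 0.498 years, G = 19.87 per year
```

Under these assumptions the calculation reproduces the value of 19.87 per year quoted above.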
Two classes (1 + 2 stage, 2 + 3 stage) of semi-stochastic model, with a deterministic retinal stem cell population, are considered in addition to the fully stochastic ones; the number of susceptible retinal cells at age t from conception is assumed to have the following functional form:

equation image
Table 2. Deviance [and degrees of freedom (df)] and Akaike information criterion (AIC) for optimal bilateral 1-stage + unilateral 2-stage fully stochastic and semi-stochastic models, and bilateral 2-stage + unilateral 3-stage fully stochastic and semi-stochastic models, with or without constraint on population prevalence, as given in Supporting Information Tables A1–A8
[Table 2 is available only as an image in the source and is not reproduced here.]
Table 3. Estimated parameters (and 95% CI) for optimal model (Supporting Information Table A2), a bilateral 1-stage + unilateral 2-stage fully stochastic model with cell growth fixed to give 20,000 cells in initial cell compartment by day 182 of gestation, with prevalence of RB1 heterozygotes constrained to be 2.0 × 10⁻⁵ (unit: per cell per year, unless otherwise stated)
[Table 3 is available only as an image in the source and is not reproduced here.]

Here X1− is fixed (at 19.87 year⁻¹, as above) to generate 20,000 cells in the first 182 days (= 182/365.2425 = 0.498 years) of gestation. Further details are given in Table 2, Supporting Information Tables A3, A4, A7, and A8.

Depending on the underlying disease mechanism, each of the RB groups is associated with some combinations of the one- and two-stage models. Bilateral RB cases, both familial and nonfamilial, are assumed to be heterozygous at birth, with one normal and one mutated RB1 allele,18 which are usually present at conception. As we argue in the “Discussion” Section, incorporation of the gestational mutation does not appear to alter the hazard, so we are justified in using a one-stage model to describe the disease evolution of the two bilateral RB groups; the mutation is assumed to be present from conception. On the other hand, the majority of unilateral RB patients develop the disease through two somatic alterations,19 although a small number will be heritable.20, 21 To accommodate this uncertainty, a scaled combination of the one- and two-stage models (or alternatively the two- and three-stage models) is used for the two unilateral groups.

The data are stratified by age at diagnosis, from age 0 to 14, and quinquennial year of observation (apart from 1962 to 1965). Therefore, for each of the four RB groups, there are in total 15 age groups and 8 follow-up periods. The number of cases in each stratum is assumed to come from a Poisson distribution with the corresponding mean. More specifically, we assume that the number of familial bilateral cases (BH) in the stratum with age interval [ti,ti + 1] (measured from conception) and calendar year k (k = 1962,…,2000) is a Poisson random variable with mean:

equation image(1)

where PYi,k is the number of persons at risk in that stratum and equation image is the probability of a single malignant cell developing in that year group using the one-stage (and two-stage, respectively) model, given by equation image, where h1(t) is the hazard function of the one-stage (and two-stage, respectively) model. ηA(i),B(k) is the adjustment for the B(k)th year-of-observation group (B(k) = 2,…,8, with the first observation group, 1962–1965, anchored to unity) in age group A(i). (It should be noted that we constrain the arguments of the integral in m1[s,t] to be positive, so that large latent periods result in this term becoming trivial.) There are highly significant differences between the adjustment for calendar year for age 0 and for larger ages, but no marked differences in adjustment above age 0, so we adjust using the two-level categorization A(0) = 1, A(i) = 2 for i > 0. All time variables are measured from conception by adding the average length of gestation, assumed to be 0.728 years (= 266/365.2425). For example, for the age interval [0, 1], the above integral becomes (in the absence of lag) equation image. σ is the proportion of births that are heterozygous for RB1, i.e., carry a single RB1 mutant allele at birth, and tlag is the assumed period of latency. pmath image determines the distribution of the cancers over time and is given by:

equation image(2)

Here ρ1 is the probability of a bilateral RB case that starts to develop in the first year of life being detected in that year of age, whereas ρ2 describes the probability, conditional on arriving at the beginning of a subsequent year of age without being detected, of being ascertained before the end of that year of age. When ρ1 and ρ2 are equal this simplifies to the standard geometric, or waiting time, distribution:

equation image(2a)
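The detection-delay distribution described in words above can be sketched as follows. Because Eqs. (2) and (2a) are not reproduced in this version, the functional form used here is an assumption read off the verbal definitions of ρ1 and ρ2 for a case whose onset falls in the first year of life; how later onsets are treated in the fitted model is not recoverable from the text. The function name and the illustrative value 0.788 (the estimate quoted later for ρ3) are chosen for demonstration only.

```python
def detection_delay_pmf(rho_first, rho_later, max_delay):
    """Return [P(delay = 0), ..., P(delay = max_delay)], in whole years of age.

    Assumed form: detection with probability rho_first in the year of onset,
    then a constant conditional detection probability rho_later in each
    subsequent year of age, given no detection so far.
    """
    pmf = [rho_first]
    still_undetected = 1.0 - rho_first
    for _ in range(max_delay):
        pmf.append(still_undetected * rho_later)
        still_undetected *= 1.0 - rho_later
    return pmf


# Equal probabilities give the geometric (waiting time) distribution.
geometric = detection_delay_pmf(0.788, 0.788, 3)
# Probabilities of 1 mean every case is detected in its year of onset.
immediate = detection_delay_pmf(1.0, 1.0, 3)
print(geometric)
print(immediate)
```

With both probabilities equal to ρ, the jth term is ρ(1 − ρ)^j, which is the geometric special case mentioned in the text.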

In a similar way, the expected number of nonfamilial bilateral (BnH) RB cases is given by:

equation image(3)

SBnH multiplicatively scales the one-stage (and two-stage, respectively) probability, i.e., subject to the adjustments for latency (pmath image, pmath image), it is the ratio of nonfamilial to familial bilateral cases among the RB1 heterozygous cohort. Analogously to Eq. (2), pmath image determines the distribution of the bilateral nonhereditary RB cases over time and is given by:

equation image(4)

Similarly, the expected number of familial unilateral (UH) cases is given by:

equation image(5)

where equation image is the probability of a single malignant cell developing in that year group using the two-stage (and three-stage, respectively) model, given by equation image, and where h2(t) is the hazard function of the two-stage (and three-stage, respectively) model (with the integral arguments limited as above). Analogously to Eqs. (2) and (4), pmath image and pmath image determine the distribution of the unilateral hereditary cancers over time and are given by:

equation image(6)

and similarly for pmath image. The number of nonfamilial unilateral (UnH) cases is given by:

equation image(7)

where analogously to Eqs. (2), (4), and (6), pmath image and pmath image determine the distribution of the unilateral nonhereditary cancers over time and are given by:

equation image(8)

and similarly for pmath image. SUH and SUnH multiplicatively scale the one-stage (and two-stage, respectively) probabilities, i.e., subject to the adjustments for latency (pmath image, pmath image and pmath image), they are the ratios of unilateral familial and unilateral nonfamilial cases to bilateral familial cases among the RB1 heterozygous cohort. T multiplicatively scales the two-stage (and three-stage, respectively) probability, i.e., subject to the adjustments for latency (pmath image and pmath image), it is the ratio of nonfamilial to familial unilateral cases among the RB1 wildtype homozygous cohort. The parameters equation image in Eqs. (1)–(8) account for possible differences in time of diagnosis between the BH and BnH groups and the UH and the UnH groups, respectively.

In each age-calendar year stratum, the total at-risk population, PYi,k is divided into PYi,k · σ and PYi,k · (1 − σ); the former sub-population enumerates the population of RB1 heterozygotes, whereas the latter one describes the RB1 (wildtype) homozygotes. In some rare circumstances, the RB1 mutant allele carried at birth in the heterozygote population may have resulted from a mutation early in gestation, before the split of the retinal cell population between the two eyes; we address this possibility with calculations based on the optimal model in the “Results” Section.

Fitzgerald et al.10 estimated the RB mutation rate in New Zealand to be about 9.3 × 10⁻⁶ to 10.9 × 10⁻⁶ per gene. A slightly lower rate of 6 × 10⁻⁶ was estimated for the Hungarian population by Czeizel and Gárdonyi.22 Doubling these rates to estimate the rate per individual and adjusting for the 90% prevalence of RB that has been estimated among the RB1 heterozygous cohort8, 20 implies the RB1 mutation prevalence rate per individual should be about 2.1 × 10⁻⁵ to 2.4 × 10⁻⁵ per individual based on the data of Fitzgerald et al.10 or about 1.5 × 10⁻⁵ per individual based on the data of Czeizel and Gárdonyi.22 For certain fits (Table 2, Supporting Information Tables A2, A4, A6, and A8), we therefore fix the prevalence rate, σ, at the approximate average of these estimates, σ = 2.0 × 10⁻⁵.
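The conversion from a per-gene mutation rate to a per-individual prevalence can be illustrated with a few lines of arithmetic. The sketch assumes that "adjusting for the 90% prevalence" means dividing by 0.9, which is the reading that reproduces the New Zealand figures; the function name is illustrative.

```python
PENETRANCE = 0.90  # approximate risk of RB among RB1 heterozygotes, from the text


def prevalence_per_individual(rate_per_gene, penetrance=PENETRANCE):
    """Double the per-gene rate (two alleles per individual), then divide by
    penetrance to move from affected cases to mutation carriers.
    The division by penetrance is an assumed reading of the adjustment."""
    return 2.0 * rate_per_gene / penetrance


estimates = {
    "Fitzgerald et al., lower": 9.3e-6,
    "Fitzgerald et al., upper": 10.9e-6,
    "Czeizel and Gardonyi": 6e-6,
}
for label, rate in estimates.items():
    print(f"{label}: {prevalence_per_individual(rate):.2e} per individual")
```

On this reading the New Zealand rates give about 2.1 × 10⁻⁵ to 2.4 × 10⁻⁵, as quoted. The rounded Hungarian rate gives about 1.3 × 10⁻⁵, a little below the quoted 1.5 × 10⁻⁵, so that figure may rest on an unrounded rate or a slightly different adjustment.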

For the purpose of comparing with the mechanistic models, an empirical model was also fitted, in which, for example, the expected number of bilateral familial cases is given by:

equation image(9)

and similarly for the other RB types. The age adjustments from the empirical model were used as the observed values [with 95% profile-likelihood confidence interval (CI)] plotted in Figure 1. Fitting of all models was via Poisson maximum-likelihood23; for the empirical model, this was done in R,24 whereas for the mechanistic models likelihood maximization was performed using the Numerical Algorithms Group quasi-Newton algorithm E04JYF25 in FORTRAN (programs available from the principal author on request). To ensure the stability of the optimal values for each fit, 50 successive optimizations were performed, with the starting point of each optimization set to the optimal values from the previous one. The 95% CIs for model parameters were calculated from 199 parametric bootstrap samples.26 The Akaike information criterion (AIC)27, 28 is used for model comparison. A smaller AIC value indicates a better model for explaining the observed data.
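The quantities used for model comparison can be sketched as follows. This is a minimal illustration of the Poisson deviance and of an AIC computed from it, not the authors' R or FORTRAN code; it assumes the AIC is taken as deviance plus twice the number of fitted parameters, which equals the usual −2 log-likelihood form up to a constant shared by all models fitted to the same data.

```python
import math


def poisson_deviance(observed, expected):
    """Poisson deviance: 2 * sum[o * ln(o / mu) - (o - mu)] over strata,
    with the o * ln(o / mu) term taken as 0 when o == 0."""
    total = 0.0
    for o, mu in zip(observed, expected):
        term = mu - o
        if o > 0:
            term += o * math.log(o / mu)
        total += term
    return 2.0 * total


def aic_from_deviance(deviance, n_params):
    """AIC up to an additive constant common to models fitted to the same data
    (an assumed convention; the source does not spell out its definition)."""
    return deviance + 2 * n_params


# Hypothetical counts and fitted means for three strata, for illustration only.
observed_counts = [4, 0, 2]
fitted_means = [3.5, 0.4, 2.6]
dev = poisson_deviance(observed_counts, fitted_means)
print(f"deviance = {dev:.3f}, AIC = {aic_from_deviance(dev, n_params=2):.3f}")
```

Because the constant cancels, differences in AIC between models fitted to the same strata are unaffected by this convention.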

Figure 1.

The observed and predicted age-specific incidences (with 95% CI) of bilateral and unilateral RB cases in the general population from 1962 to 2000, from the optimal model with bilateral 1-stage + unilateral 2-stage models, no latency, reduced stochastic delay model with separate adjustment for RB1 heterozygous and RB1 wildtype homozygous unilateral cases, G20+ = G10+, D20+ = D10+, with population prevalence of RB1 heterozygotes constrained to be 2.0 × 10⁻⁵ (Table 3, Supporting Information Table A2).

Results

Comparison of the AIC values in Table 2 shows that a fully stochastic stem cell model, with one- and two-stage models, fits as well as most other models, comparable with the model with the deterministic stem cell population and much better than the model with a combination of two- and three-stage models. Models with a greater number of stages (three- and four-stage) yielded even worse fits (results not shown). A model with a deterministic stem cell population and constrained RB1 population prevalence (Table 2, Supporting Information Table A4) fits slightly better than any other, but there is practically no difference between the various optimal one- and two-stage models given in the upper part of Table 2. Figure 2 demonstrates that the susceptible stem cell population for the optimal one- and two-stage model peaks at 20,000 cells at 6 months of gestation, after which there is a steep decline, to near 0 by age 5.

Figure 2.

The number of susceptible retinal cells predicted by the optimal model with bilateral 1-stage + unilateral 2-stage models, no latency, reduced stochastic delay model with separate adjustment for RB1 heterozygous and RB1 wildtype homozygous unilateral cases, G20+ = G10+, D20+ = D10+, with population prevalence of RB1 heterozygotes constrained to be 2.0 × 10⁻⁵ (Table 3, Supporting Information Table A2).

As indicated by the AIC values in Supporting Information Tables A1–A8, models that use the simpler stochastic delay model (in which ρ1 = ρ2, ρ4,het = ρ5,het and ρ4,hom = ρ5,hom) with no latency (tlag = 0) fit as well as more complex models that relax these constraints. Imposing the constraint M10 = M20 = M21 (Mij = jth mutation rate in i-stage model) did not make the fit significantly worse for the one- and two-stage models with unconstrained RB1 population prevalence σ (Supporting Information Tables A1 and A3); however, when the constraint σ = 2.0 × 10⁻⁵ was used, imposing the further constraint M10 = M20 = M21 significantly impaired the fit (Supporting Information Tables A2 and A4). Models allowing ρ4,het = ρ5,het and ρ4,hom = ρ5,hom to vary separately (rather than constrained to be equal: ρ4,het = ρ5,het = ρ4,hom = ρ5,hom) did not significantly improve the fit in any case (Supporting Information Tables A1–A8); nevertheless, for the one-stage and two-stage models, the AIC was lowest in this case. The analogous optimal stochastic model with RB1 population prevalence unconstrained (next to rightmost column of Supporting Information Table A1) has an estimate of this parameter that is at least an order of magnitude too large, 2.48 (95% CI 2.07, 2.88) × 10⁻⁴. We therefore regarded as the optimal model that given by the rightmost column of Supporting Information Table A2, although it did not have quite the lowest AIC.

The scaling parameter SBnH was estimated to be 1.84 (95% CI 1.59, 2.11) (Table 3, Supporting Information Table A2), suggesting that the risk of RB is slightly higher in the “nonfamilial” RB1 heterozygote population than in the “familial” heterozygous population, although this is somewhat offset by the lower probability per year of detection of a case in the nonfamilial group (ρ3 = 0.788 vs. ρ1 = ρ2 = 1.0) (Table 3, Supporting Information Table A2). Table 4 indicates that 13.31% (95% CI 12.00, 14.72) of the RB cases are “familial” bilateral RB1 heterozygote vs. 24.14% (95% CI 21.77, 26.51) of the RB cases that are “nonfamilial” bilateral RB1 heterozygote and that 56.46% of the RB cases (95% CI 54.41, 59.92) are in the unilateral “nonfamilial” group vs. 0.00% of cases (95% CI 0.00, 0.00) in the “familial” group. Table 4 demonstrates that within the RB1 heterozygous birth cohort the risk is predominantly expressed among persons with bilateral disease (13.31 + 24.14 = 37.45% of RB cases) compared with unilateral cases (4.31 + 1.78 = 6.09%). Table 4 also shows that the proportion of unilateral cases that are RB1 heterozygous is 9.74% (95% CI 5.75, 12.63).
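The percentages quoted from Table 4 can be cross-checked against one another. The sketch below uses only the point estimates given in the text; it assumes that the 56.46% and 0.00% unilateral groups are the RB1 wildtype homozygous cases, which the text implies but does not state, and the variable names are illustrative.

```python
# Percentages of all RB cases, as quoted in the text from Table 4.
bilateral_het = 13.31 + 24.14        # familial + nonfamilial bilateral heterozygotes
unilateral_het = 4.31 + 1.78         # unilateral RB1 heterozygotes
unilateral_wildtype = 56.46 + 0.00   # assumed: nonfamilial + familial wildtype homozygotes

total = bilateral_het + unilateral_het + unilateral_wildtype
het_share_of_unilateral = 100.0 * unilateral_het / (unilateral_het + unilateral_wildtype)

print(f"bilateral heterozygous: {bilateral_het:.2f}%")
print(f"unilateral heterozygous: {unilateral_het:.2f}%")
print(f"total: {total:.2f}%")
print(f"heterozygous share of unilateral cases: {het_share_of_unilateral:.2f}%")
```

Under that assumption the groups sum to 100% of cases, and the heterozygous share of unilateral cases comes out at the quoted 9.74%.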

Table 4. Description of population characteristics (penetrance and percentage of total RB cases)
[Table 4 is available only as an image in the source and is not reproduced here.]

To investigate the possibility that the RB1 mutant allele carried at birth in the heterozygote population may have resulted from a mutation early in gestation, before the retinal cell population divides between the two eyes, we calculated the chance of a first mutation occurring in the (wildtype) homozygote population within the first 3 weeks after conception (after which point the retinal cell population divides between the two eyes).29 Calculations using the optimal one- and two-stage stochastic model (Table 2) show that the probability of at least one mutation in the (wildtype) homozygous population up to 25 days of gestation is 2.86 × 10⁻⁷, small compared with the chance of a single mutation at conception, assumed to be 2.00 × 10⁻⁵, implying that the overwhelming majority of bilateral cases will result from persons carrying an RB1 mutation in the germline. We therefore judge that we are justified in modeling the bilateral cases using a simple one-stage model.

Discussion

In this analysis, we examined models based upon the two-hit paradigm as well as those with additional mutational stages. On the basis of model comparisons, it is evident that models derived from the two-hit theory, a one-stage model for heritable RB and a two-stage model for sporadic RB, performed noticeably better than those with more mutational stages. Although genetic alterations to both copies of the RB1 gene have been shown to be necessary to induce RB,3, 4, 6, 30 it is still unclear how many mutational events are required for tumorigenesis. In a series of papers, Knudson et al.5, 31, 32 satisfactorily described RB incidence by fitting a model with one postconception hit for bilateral cases and two such hits for unilateral cases. Likewise, Fitzgerald et al.10 analyzed the distribution of age at diagnosis of RB patients in New Zealand and concluded that linear and quadratic functions of age gave good fits to the bilateral and unilateral data, respectively, supporting the two-hit paradigm. In contrast, Bonaïti-Pellie et al.12 fitted polynomial models in age to French cohort data and found evidence that three stages were required for the progression of RB. However, the data collection in the study of Bonaïti-Pellie et al.,12 based on a combination of hospital records and data from private medical practitioners, may not be as complete and is not comparable in quality with the population-based registry data presented here. Mastrangelo et al.13 assessed the age distribution of a series of unilateral and bilateral cases, and used the discrepancy between the predictions of a two-hit model and their observations to argue against this model. However, their data consisted only of a hospital-based series of cases without estimates of the underlying population. 
It is also not clear how they derived their predicted distribution, which they assert should be normal.13 It is clear from our data and modeling that the distribution of age at RB diagnosis is far from the shape of a normal distribution (Fig. 1), but this does not argue against a two-hit mechanism.

Based upon the analysis here, we argue that the two hits to the RB1 gene are the principal events in RB. Although some molecular evidence suggests that genetic alterations in addition to the loss of RB1 occur in RB,2 it is questionable whether these alterations are necessary for RB development. It is not clear whether these additional alterations consistently occur in all RB tumors, in the same way that RB1 inactivation has been observed.2, 33 It has been shown that absence of RB1 function results in defects in chromosome segregation and deregulation of the spindle checkpoint, which together may contribute to various genetic alterations harbored in human RB tumor cells,34 suggesting the additional alterations may have been the consequence of the loss of RB1. The fact that the two-stage model provides a good fit to the data strongly suggests that the two RB1 “hits” are sufficient, a conclusion consistent with Knudson's hypothesis.5 Nevertheless, it is difficult to rule out involvement of non-rate-limiting mutations, so we cannot with certainty infer the sufficiency of two hits. The worse fits from models with more than two rate-limiting stages do indicate that any other alterations are not rate-limiting and could be a result of the two hits.

In some cases, instead of RB, loss of RB1 leads to the benign lesion retinoma.2, 35 However, retinoma is rare, being present in only ∼0.7% of germline RB1 mutation carriers,36 which may be due to the highly unstable genome resulting from the absence of RB1. Although an apparently large proportion of enucleated RB tumors exhibit retinomas in some proximity,35 the relevance of these clinical case reports to the population-based data being modeled here is slight. In any case, the register used here does not record retinomas, and so these events cannot be incorporated into the model that we use.

Although the first RB1 mutation is in the germline for most bilateral cases, it may have resulted from an early gestational mutation in some rare situations. As implied by our calculations in the “Results” Section, this is unlikely: to distribute this RB1 mutant allele to subsequent cells in both eyes, the mutation has to take place before the split of two distinct cell populations for each of the two eyes, which happens well before the 3rd week of gestation.29 The probability of such a mutation is low, a bit more than 1% of that of the prevalence of germline RB1 mutations. Set against this, there is evidence from a clinical series that about 5.5% of bilateral RB cases are mosaic.37 There are of course uncertainties in our estimate of gestational mutation rate, as also in the estimate of Rushlow et al.37 Moreover, the data underlying this latter paper are clinically based, unlike the population-based series used here, so one would not expect these proportions to be equal.

The most frequent events associated with the first mutation in hereditary and nonhereditary RB are single base substitutions, together with some short and large deletions,38, 39 whereas the second mutation is found to result from loss of heterozygosity, promoter hypermethylation, mitotic recombination or even a second independent base substitution.40–43 Given the variability in the types of mutations capable of inactivating the first and second RB1 allele, it would not be surprising if the mutation rates differed. We find evidence that the mutation rate in the RB1 heterozygous cohort ( equation image) is larger by an order of magnitude than that in the RB1 (wildtype) homozygous cohort ( equation image).

The population prevalence rate of RB1 heterozygotes that we estimate in the optimal one- and two-stage model with this parameter unconstrained, 2.48 (95% CI 2.07, 2.88) × 10⁻⁴ (Supporting Information Table A1), is very much higher than those estimated by others. As noted in the “Statistical Methods” Section, penetrance-adjusted prevalence rates in the range 1.5–2.5 × 10⁻⁵ per person have been estimated by various researchers.10, 22 For this reason, we prefer as the optimal model that in which the prevalence rate is constrained to σ = 2.0 × 10⁻⁵. The prevalence of RB among the RB1 heterozygote population that we estimate, 81.03% (95% CI 69.11, 94.77) (Table 4), agrees well with previous estimates, of about 90%.8, 20

The finding that 13.31% (95% CI 12.00, 14.72) of the RB cases are “familial” bilateral RB1 heterozygote vs. 24.14% (95% CI 21.77, 26.51) of the RB cases that are “nonfamilial” bilateral RB1 heterozygote (Table 4) implies that the majority of bilateral cases arise from new mutations. However, this really has to be applied to family history defined in terms of RB occurring in any relative of the index patient (either before or after diagnosis of the index patient) and depends hugely on the completeness of ascertainment of the family histories. It is expected that in some cases in the earlier years of the study period the only other members of the family having RB will be diagnosed more recently than the RB probands/index cases, so that the latter did not have family history of RB at the time of their own diagnosis. This suggests that the earlier ascertainment that would be expected with a known family history might not apply to “familial” cases in this early period.

Table 4 suggests that 3.06% (95% CI 0.00, 6.04) of sporadic unilateral cases are RB1 heterozygous. This is entirely consistent with the estimate of this proportion of 1.7–2.3% derived (via model fitting) by Draper et al.8 and with the estimate of 5.5% derived (by DNA sequencing of relevant cases) by Houdayer et al.21 Likewise, Table 4 indicates that the proportion of unilateral cases that are RB1 heterozygous is 9.74% (95% CI 5.75, 12.63), reasonably similar to the estimate of 13.5% cited by Rushlow et al.37 or the figure of 13.0% that can be derived from Richter et al.44 Incidentally, the data underlying both of these papers are not population based, unlike those used here, so one would not expect these proportions to be equal; it is perhaps remarkable that the agreement is as good as it is.

Some attempts have previously been made to model RB mechanistically, but none has explicitly taken account of the growth pattern of the retinal cells, which is suggested to be important to the development of this cancer.9, 20 Through an estimate of the cell division fraction (the ratio of the mean number of cell divisions that have occurred before some age to the total number of divisions), Hethcote and Knudson32 implied that the susceptible retinal cells are those actively dividing, whose number increases initially and then decreases. Tan and Singh45 constructed a two-stage model of RB, fairly similar to the model described here, which they fitted to US Surveillance, Epidemiology and End Results data, although they used a logistic function for the stem cell population, which is not fully stochastic. Morris46 used a branching process with an allowance for cell loss to describe the development of the retina and predicted that there were 1.09 × 10⁷ retinal cells in a single, mature eye. In our study, it is assumed that there are two cell types during fetal development of the retina, actively dividing cells and fully differentiated cells; the embryonic retinal development is modeled by a stochastic birth and death process with an age-dependent growth rate and a constant differentiation rate.
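To make the last sentence concrete, the following Python sketch simulates a generic two-compartment birth and death process of the kind described: dividing cells that reproduce at an age-dependent rate and differentiate at a constant rate. It is an illustration only, not the fitted model of this paper. The exponentially decaying form of `growth_rate`, all parameter values, and the names `simulate_retina`, `g0`, `decay` and `diff_rate` are assumptions made for the example. Because the growth rate varies with age, candidate events are drawn at a bounding rate and accepted by thinning.

```python
import math
import random


def growth_rate(t, g0, decay):
    """Hypothetical per-cell division rate that declines with age t."""
    return g0 * math.exp(-decay * t)


def simulate_retina(t_end, n0=1, diff_rate=0.2, g0=3.0, decay=1.0, seed=0):
    """Simulate dividing and differentiated cell counts up to age t_end.

    Assumes diff_rate + g0 > 0. Returns (dividing, differentiated).
    """
    rng = random.Random(seed)
    t = 0.0
    dividing, differentiated = n0, 0
    g_max = g0  # growth_rate is non-increasing, so g0 bounds it from above
    while dividing > 0:
        # Candidate events occur at the bounding rate for the current count.
        t += rng.expovariate(dividing * (g_max + diff_rate))
        if t >= t_end:
            break
        u = rng.random() * (g_max + diff_rate)
        g = growth_rate(t, g0, decay)
        if u < g:
            dividing += 1          # division: one more dividing cell
        elif u < g + diff_rate:
            dividing -= 1          # differentiation removes a dividing cell
            differentiated += 1
        # otherwise the candidate is rejected (thinning) and nothing changes
    return dividing, differentiated


print(simulate_retina(5.0, seed=1))
```

Repeating the call with different seeds gives a spread of final cell counts, which is the variability that a deterministic (for example logistic) description of the stem cell compartment leaves out.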

It is crucial to identify the first cell or cell type able to trigger neoplastic growth when it is genetically altered.47 In the search for the cell of origin of RB, two biological models have recently been proposed.47 The “progenitor cell” model designates the retinal progenitor cells as the cell of origin, whereas in the “transition-cell” model the cell of origin is the transition cells, those that have committed to one of the seven fully differentiated retinal cell types but still have the capability of dividing47; in particular, there is suggestive evidence that RB is derived from cone cells.48 Our optimal model does not imply a large growth advantage for cells with an RB1 deletion: in the intermediate cell compartment, the death/differentiation rates exceed the growth rates (Table 3), and the development of RB is a consequence of the huge increase in numbers of stem cells in the first 6 months in utero. This is also supported by the similarity in birthweights among those with heritable and nonheritable disease.49 As such, our results tend to favor the transition-cell model.

In summary, we have given evidence that the two hits to the RB1 gene are the principal events in RB. Although some molecular evidence suggests that genetic alterations in addition to the loss of RB1 occur in RB,2 it is questionable whether these alterations are necessary for RB development. The fact that the two-stage model provides a good fit to the data strongly suggests that the two RB1 “hits” are sufficient, and that they are rate-limiting (events which happen sufficiently slowly to appreciably affect incidence), a conclusion consistent with Knudson's hypothesis.5 The worse fits from models with more than two rate-limiting stages indicate that any other alterations are not rate-limiting and could be a result of the two hits.

Acknowledgements

The authors are grateful for the detailed and helpful comments of the two referees. The views expressed here are those of the authors and not necessarily those of Children with Leukaemia, the Department of Health and the Scottish Ministers. G.Q.L. was also funded by the Overseas Research Students Awards Schemes through the Higher Education Funding Council for England. The authors are very grateful to Dr. Gerald Draper for many constructive comments. We are grateful to the regional and national cancer registries of England, Scotland and Wales, regional children's tumor registries, the UK Children's Cancer Study Group (now the Children's Cancer and Leukaemia Group), the Office for National Statistics and the General Register Office (Scotland), who all supplied the National Registry of Childhood Tumours with notifications of retinoblastoma cases for the period covered by this study.
