Genomic clustering of fitness‐affecting mutations favors the evolution of chromosomal instability

Abstract Most solid cancers are characterized by chromosomal instability (CIN), an elevated rate of large‐scale chromosomal aberrations and ploidy changes. Chromosomal instability may arise through mutations in a range of genomic integrity loci and is commonly associated with fast disease progression, poor prognosis, and multidrug resistance. However, the evolutionary forces promoting CIN‐inducing alleles (hereafter, CIN mutators) during carcinogenesis remain poorly understood. Here, we develop a stochastic, individual‐based model of indirect selection experienced by CIN mutators via genomic associations with fitness‐affecting mutations. Because mutations associated with CIN affect large swaths of the genome and have the potential to simultaneously encompass many individual loci, we show that indirect selection on CIN mutators is critically influenced by genome organization. In particular, we find strong support for a key role played by the spatial clustering of loci with either beneficial or deleterious mutational effects. Genomic clustering of selected loci allows CIN mutators to generate favorable chromosomal changes that facilitate their rapid expansion within a neoplasm and, in turn, accelerate carcinogenesis. We then examine the distribution of oncogenic and tumor‐suppressing loci in the human genome and find both to be potentially more clustered along the chromosome than expected, leading us to speculate that the human genome may be susceptible to CIN hitchhiking. More quantitative data on fitness effects of individual mutations will be necessary, though, to assess the true levels of clustering in the human genome and the effectiveness of indirect selection for CIN. Finally, we use our model to examine how therapeutic strategies that increase the deleterious burden of genetically unstable cells by raising either the rate of CIN or the cost of deleterious mutations affect CIN evolution. We find that both can inhibit CIN hitchhiking and delay carcinogenesis in some circumstances, yet, in line with earlier work, we find the latter to be considerably more effective.

Current theories for the evolution of CIN, and, in fact, the evolution of genomic instability in general, usually invoke two distinct, yet not mutually exclusive, mechanisms of selection on instability-inducing mutator alleles. Chromosomal instability mutators may intrinsically raise a neoplastic cell's chances of survival and reproduction, that is, its Darwinian fitness. As a result, CIN mutators may be directly favored by natural selection. For example, it has been suggested that CIN may originate from directly beneficial oncogenic mutations that simultaneously induce genomic instability [the oncogene-induced DNA replication stress hypothesis] (Halazonetis, Gorgoulis, & Bartek, 2008; Negrini et al., 2010).
Alternatively, mutators may be favored not for their own intrinsic effects on a cell's fitness but through genetic association with intrinsically beneficial mutations elsewhere in the genome. In other words, CIN mutators may evolve via so-called indirect selection by hitchhiking (Smith & Haigh, 1974) with the selectively favored oncogenic and tumor-suppressing mutations they generate (Loeb, 2001; Sprouffske, Merlo, Gerrish, Maley, & Sniegowski, 2012). Indirect selection on alleles that increase the genomic mutation rate (e.g., CIN-inducing mutators) is expected to be particularly effective in asexual populations, such as cancers, in which the genetic association between mutators and beneficial mutations can never be disrupted by recombination. In fact, numerous theoretical and experimental studies in microbes have already shown that mutators may spread through non-recombining populations by hitchhiking if beneficial mutations are readily available (reviewed in: Raynes & Sniegowski, 2014; Sniegowski, Gerrish, Johnson, & Shaver, 2000).
Importantly, while genomic instability may increase the rate of beneficial mutations, it necessarily also increases the rate of deleterious mutations, which are generally more common (Cahill et al., 1999). However, alleles that elevate the point mutation rate (i.e., NIN- or MSI-inducing mutators) generate DNA changes confined to only a few nucleotides.
As a result, beneficial and deleterious mutations are likely introduced independently from each other by separate mutational events. Thus, while such mutators may be frequently lost to selection against the increased load of deleterious mutations, they may also occasionally expand within a neoplasm by hitchhiking with a rare beneficial mutation. In contrast, CIN mutators generate large-scale SCNAs that are not confined to single loci. Instead, SCNAs may disrupt many neighboring loci, thereby simultaneously introducing both beneficial and deleterious changes. For example, a single SCNA may delete a tumor suppressor and a neighboring housekeeping gene. Genetic linkage between the relatively rare beneficial loci and the more common deleterious ones may drastically limit the availability of SCNAs with net beneficial effects (needed to facilitate CIN hitchhiking) and, thus, inhibit CIN evolution.
We have hypothesized that indirect selection could, nevertheless, favor CIN given a spatial organization of the genome that minimizes co-occurrence of beneficial and deleterious loci in SCNAs.
Specifically, we wanted to test the hypothesis that CIN would be favored by indirect selection in genomes in which either beneficial or deleterious loci were spatially clustered along the chromosome.
To do so, we developed an individual-based stochastic population model of clonal evolution in spatially organized genomes. In simulation, we investigated the effect of the spatial distribution of beneficial and deleterious loci on CIN evolution and cancer progression.
We also tested the effectiveness of therapeutic strategies that aim to raise the deleterious costs of CIN in order to inhibit CIN evolution.
Finally, we examined the spatial distributions of candidate human oncogene and tumor suppressor loci (identified in Davoli et al., 2013) for evidence of spatial clustering that could facilitate indirect selection for CIN in real cancers.

| Stochastic simulations
To model the evolutionary progression of a neoplastic cell population to cancer, we developed and simulated an individual-based computational model of clonal evolution based on the earlier work of Beerenwinkel et al. (2007) and Datta et al. (2013). Like these earlier studies, we employed a Wright-Fisher model (Ewens, 2004), in which a population of neoplastic cells evolves in discrete, nonoverlapping generations with the probability of each cell's reproduction being proportional to its relative fitness. As in earlier studies, cells in the model could acquire beneficial and deleterious fitness-affecting mutations upon reproduction. Unlike these earlier studies, however, we also allowed for the evolution of CIN by adding SCNA-type mutations (described below) and introducing CIN mutator alleles.
As in the work of Beerenwinkel et al. (2007) and Datta et al. (2013), simulations here start with a tumor population of initial size $N_0 = 10^6$ cells and end when the tumor develops into a cancer.
Following Datta et al. (2013), we defined cancer as a tumor in which 10% of all cells have acquired at least 20 beneficial mutations.
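The stopping rule above can be written as a small check. The following is only an illustrative sketch, not the published simulation code: the dictionary layout (keys are (x, y) counts of beneficial and deleterious mutations, values are cell counts) and the function name are assumptions, as is the choice to treat "10% of all cells" as "at least 10%".

```python
def is_cancer(lineages, min_beneficial=20, threshold=0.10):
    """Return True if the tumor meets the cancer criterion.

    lineages: dict mapping (x, y) -> number of cells, where x and y are the
    counts of beneficial and deleterious mutations (hypothetical layout).
    """
    total = sum(lineages.values())
    if total == 0:
        return False
    # Cells carrying at least `min_beneficial` beneficial mutations.
    advanced = sum(n for (x, _y), n in lineages.items() if x >= min_beneficial)
    # Assumption: the criterion is met once the fraction reaches the threshold.
    return advanced / total >= threshold
```

A simulation loop would call such a check once per generation and record the generation at which it first returns True as the waiting time to cancer.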
The total size of the neoplastic population is constrained to grow exponentially at a rate proportional to the average fitness of the population. The size of the population at generation $t + 1$ is defined as $N_{t+1} = N_t \cdot (1 + \beta\bar{w})$, where $\bar{w}$ is the average fitness of the tumor population (see below) and $\beta$ is a constant that governs the rate of population growth; as in Datta et al. (2013), we set $\beta = 0.0016$.
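To make the growth rule concrete, here is a minimal sketch that assumes the update has the form N_{t+1} = N_t (1 + β w̄), which is how we read the definitions of β and the average fitness above. The function names are hypothetical, and the doubling-time helper further assumes that the average fitness stays constant, which it does not in a real run.

```python
import math


def next_population_size(n_t, mean_fitness, beta=0.0016):
    """One generation of growth: N_{t+1} = N_t * (1 + beta * mean_fitness)."""
    return n_t * (1.0 + beta * mean_fitness)


def generations_to_double(mean_fitness, beta=0.0016):
    """Generations needed to double in size if mean fitness were held fixed."""
    return math.log(2.0) / math.log(1.0 + beta * mean_fitness)
```

Under these assumptions, a population with average fitness 1 needs several hundred generations to double, and the doubling time shortens as beneficial mutations raise the average fitness.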
Neoplastic cells can acquire single-locus mutations that either change fitness or induce CIN. Chromosomal instability-inducing mutations result in a mutator phenotype which allows cells to generate SCNA mutations (described below). In order to explicitly explore the role of indirect selection and genetic hitchhiking in the evolution of CIN, CIN-inducing mutations are assumed to have no intrinsic direct effect on a cell's fitness. Furthermore, a single CIN mutator mutation is assumed to be sufficient for the mutator phenotype; additional CIN mutations have no effect on the mutation rate and so cannot affect the dynamics of the mutator lineage.
Among the fitness-affecting mutations, beneficial mutations increase a cell's fitness by $s_{\mathrm{ben}}$ while deleterious mutations decrease a cell's fitness by $s_{\mathrm{del}}$. Since $s_{\mathrm{ben}}$ and $s_{\mathrm{del}}$ are held constant, the fitness of a cell with $x$ beneficial mutations and $y$ deleterious mutations can be computed as $w_{xy} = 1 + x \cdot s_{\mathrm{ben}} - y \cdot s_{\mathrm{del}}$. Correspondingly, if $f_{xy}$ is the fraction of cells in a tumor with $x$ beneficial and $y$ deleterious mutations, then the average fitness of the tumor population is $\bar{w} = \sum_x \sum_y f_{xy} w_{xy}$. Unlike in the earlier work (Beerenwinkel et al., 2007; Datta et al., 2013), the effect of multiple mutations in our model is additive rather than multiplicative. Note that there is little difference between additive and multiplicative fitness for genotypes containing a small number of mutations. (Algebraically, $(1 + s_{\mathrm{ben}})^x (1 - s_{\mathrm{del}})^y \approx 1 + x \cdot s_{\mathrm{ben}} - y \cdot s_{\mathrm{del}}$ for small $x$ and $y$.) However, for genotypes containing many mutations (like some CIN genotypes in our model), multiplicative fitness, which increases exponentially with the number of mutations, results in unrealistically high values.
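The difference between the two fitness schemes can be checked numerically. The sketch below uses the parameter values quoted in this section (s_ben = 0.1, s_del = 0.01) as defaults; the function names and the dictionary of lineage frequencies are assumptions made for illustration.

```python
def additive_fitness(x, y, s_ben=0.1, s_del=0.01):
    """w_xy = 1 + x * s_ben - y * s_del (the scheme used in this model)."""
    return 1.0 + x * s_ben - y * s_del


def multiplicative_fitness(x, y, s_ben=0.1, s_del=0.01):
    """(1 + s_ben)^x * (1 - s_del)^y (the scheme used in the earlier work)."""
    return (1.0 + s_ben) ** x * (1.0 - s_del) ** y


def mean_fitness(freqs, s_ben=0.1, s_del=0.01):
    """Average fitness: sum over lineages of f_xy * w_xy.

    freqs: dict mapping (x, y) -> frequency f_xy (hypothetical layout).
    """
    return sum(f * additive_fitness(x, y, s_ben, s_del)
               for (x, y), f in freqs.items())
```

With one beneficial and one deleterious mutation the two schemes give 1.09 and 1.089, respectively. With 50 beneficial mutations and no deleterious ones, additive fitness is 6, whereas multiplicative fitness exceeds 100, which illustrates why the additive form is preferred for genotypes carrying many mutations.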
The tumor population is composed of genetic lineages of neoplastic cells defined by the counts of fitness-affecting and CIN mutator mutations they carry. As per the Wright-Fisher model, the size of a lineage with $x$ beneficial mutations and $y$ deleterious mutations at generation $t + 1$ is drawn from a multinomial distribution with expectation given by $N_{t+1} f_{xy} w_{xy} / \bar{w}$, where $f_{xy}$ is the frequency of the lineage in generation $t$, and $w_{xy} / \bar{w}$ is its relative fitness.
[...] (2013), we set $U_{\mathrm{ben}} = 10^{-5}$. Because estimates of the deleterious mutation rate and effects are limited, we set $s_{\mathrm{del}} = 0.01$, assumed that deleterious mutations outnumber beneficial ones one hundred-fold, and set $U_{\mathrm{del}} = 10^{-3}$. Note that higher $U_{\mathrm{del}}$ or $s_{\mathrm{del}}$ would raise the deleterious load of SCNAs, reinforcing the importance of genomic clustering in promoting CIN; on the other hand, lower $U_{\mathrm{del}}$ or $s_{\mathrm{del}}$ would lessen the role of genomic clustering in facilitating CIN evolution. $U_{\mathrm{CIN}}$ has no effect on the role of genomic clustering and was set at $U_{\mathrm{CIN}} = 10^{-5}$. In lineages carrying CIN mutators, we set $U_{\mathrm{SCNA}} = 0.01$ SCNA mutations per cell per generation after Lengauer, Kinzler, and Vogelstein (1997). Finally, we set $s_{\mathrm{ben}} = 0.1$, based on the estimates of beneficial effects of ~0.004 to ~0.6 obtained from earlier theoretical studies (Beerenwinkel et al., 2007; Bozic et al., 2010; McFarland, Mirny, & Korolev, 2014).
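The multinomial resampling step of the Wright-Fisher model described above can be sketched with the Python standard library alone. This is a simplified illustration rather than the published Julia implementation: mutation is omitted, the lineage dictionary and function name are hypothetical, and clamping negative additive fitness to zero is an assumption made so that the sampling weights stay valid. Drawing every cell individually is also only practical for populations far smaller than the 10^6 cells simulated here.

```python
import random
from collections import Counter


def wright_fisher_step(lineages, n_next, s_ben=0.1, s_del=0.01, rng=random):
    """Resample the next generation of size n_next.

    lineages: dict mapping (x, y) -> cell count in generation t.
    Each offspring picks a parent lineage with probability proportional to
    f_xy * w_xy, which is equivalent to a multinomial draw with expected
    lineage size n_next * f_xy * w_xy / mean fitness.
    """
    keys = list(lineages)
    weights = []
    for (x, y) in keys:
        w_xy = 1.0 + x * s_ben - y * s_del
        # Assumption: lineages with non-positive fitness leave no offspring.
        weights.append(lineages[(x, y)] * max(0.0, w_xy))
    draws = rng.choices(keys, weights=weights, k=n_next)
    return dict(Counter(draws))
```

Passing a seeded random.Random instance as rng makes a run reproducible. The division by the average fitness never appears explicitly because random.choices normalizes the weights itself.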

| Genomic analysis
In the original study, TUSON predictions were used to rank every gene in the genome based on its potential as a tumor suppressor or an oncogene (Davoli et al., 2013). For our analysis, we used the top [...] To assess whether human oncogenes and tumor suppressors are more clustered than expected at random, we used a permutation approach. [...]
[...] across SCNAs. In these genomes, CIN mutators are strongly disfavored by indirect selection ($P_{\mathrm{CIN},10\%} < P_{\mathrm{neutral},10\%}$) and the waiting time to cancer is not significantly different from that in genomes without CIN (Figure 3). However, as beneficial and deleterious loci become increasingly clustered, the probability of mutator establishment rises dramatically above that of a neutral mutation ($P_{\mathrm{CIN},10\%} > P_{\mathrm{neutral},10\%}$). In other words, CIN mutators switch from being disfavored to being strongly favored by indirect selection. Correspondingly, as selected loci become increasingly clustered, cancer progression is accelerated, with the waiting time to cancer minimized in genomes with the most clustered loci.

| Spatial clustering of fitness-affecting loci across SCNAs promotes the evolution of CIN
Note that in genomes with clustered beneficial loci, the waiting time to CIN establishment is not significantly different from the waiting time to cancer. Here, CIN mutators become established by hitchhiking with SCNAs containing multiple beneficial mutations. As a result, an expanding mutator lineage may already carry enough beneficial mutations for the tumor to become cancerous as soon as it is

| Oncogenic and tumor-suppressive loci in the human genome are distributed with relatively high variance
Simulations show that increased variance in either beneficial or deleterious fitness effects of SCNAs, resulting from increased spatial clustering of fitness-affecting mutations, can promote the evolution of CIN. To assess the variance in beneficial fitness effects of potential SCNAs in the human genome, we examined the spatial distribution of tumor suppressor and oncogene loci identified by the TUSON algorithm, developed by Davoli et al. (2013). Intriguingly, the work of [...] SCNAs and decreased for longer SCNAs. We then evaluated the observed variance using a permutation approach (Methods). We found that the true variance in the distributions of both oncogenes and tumor suppressors was consistently higher than the mean variance of the permuted distributions across all SCNA lengths examined and significantly higher (above the 95th percentile) for focal SCNAs shorter than ~$10^6$ nucleotides. Thus, it appears that beneficial oncogenic and tumor-suppressive effects are, in fact, more clustered in potential SCNAs than expected by chance, suggesting that the human genome could be organized in a way that could promote CIN hitchhiking.
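The general logic of such a permutation test can be sketched as follows. The details of the actual procedure are given in the Methods and are not reproduced here, so everything specific in this sketch is an assumption: potential SCNAs are approximated by non-overlapping windows of a fixed number of genes (rather than a fixed nucleotide length), each gene carries a single numeric score, and scores are shuffled across gene positions to build the null distribution. All function names are hypothetical.

```python
import random
import statistics


def window_sum_variance(scores, window):
    """Variance of summed scores over non-overlapping windows of `window` genes."""
    sums = [sum(scores[i:i + window])
            for i in range(0, len(scores) - window + 1, window)]
    return statistics.pvariance(sums)


def permutation_percentile(scores, window, n_perm=1000, seed=0):
    """Compare the observed variance with variances from shuffled gene orders.

    Returns (observed variance, fraction of permutations with strictly lower
    variance). A fraction above 0.95 would correspond to the
    'above the 95th percentile' criterion.
    """
    rng = random.Random(seed)
    observed = window_sum_variance(scores, window)
    shuffled = list(scores)
    below = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if window_sum_variance(shuffled, window) < observed:
            below += 1
    return observed, below / n_perm
```

For a toy chromosome of 100 genes in which all ten high-scoring genes sit next to each other, the observed variance across ten-gene windows lies above essentially every permuted value, whereas evenly spaced high-scoring genes give zero variance and a fraction of zero.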

| Spatial organization of the genome affects the success of CIN-inhibiting therapies
In simulations, we showed that high variance in the spatial distribution of beneficial loci across SCNAs can promote the evolution of CIN, which in turn can significantly accelerate carcinogenesis.
Furthermore, our analysis of the distribution of known oncogenes and tumor suppressors suggested that mutations beneficial to neoplastic cells may be more clustered than random in the human genome. In light of these observations, we investigated whether the evolution of CIN could be inhibited by either increasing the mutation rate of CIN mutators or by exacerbating the effects of individual deleterious loci. Both therapeutic strategies have been previously shown to successfully reduce tumor size by exploiting its deleterious mutational load (McFarland et al., 2013). Correspondingly, we wanted to test whether these strategies could also inhibit CIN evolution by increasing the deleterious load associated with CIN mutators.
Using our model, we assessed the effect of increasing both CIN rate ($U_{\mathrm{SCNA}}$) and deleterious mutation effects ($s_{\mathrm{del}}$) in genomes [...] increasing $s_{\mathrm{del}}$ only five-fold resulted in CIN mutators becoming strongly disfavored by selection in both genomes, while the average waiting time to cancer increased to non-CIN levels seen in Figure 3. Thus, while increasing CIN rate may successfully inhibit CIN evolution in some spatially clustered genomes, magnifying the effects of deleterious mutations appears to be a considerably more effective strategy. We speculate on the reasons for this difference below.
FIGURE 3 Clustering of fitness-affecting loci promotes CIN evolution and accelerates cancer development. Waiting time to CIN establishment (blue) and cancer (red) as a function of genomic organization. For genomes with clustered beneficial mutations (µ = 1), deleterious loci were distributed using the Dirac delta function (µ = 100). For genomes with clustered deleterious mutations (µ = 100), beneficial loci were distributed using the Dirac delta function (µ = 1). The rest of the model parameters are as in Figure 2. Circles are mean values calculated over 100,000 runs of simulation (error bars represent ±95% CI, all times are represented with violin plots).

| DISCUSSION
Here, we have developed a stochastic, individual-based simulation model of clonal populations to examine the evolution of chromosomal instability (CIN) via indirect selection on associated beneficial variation. The propensity of genomic mutators to spread in clonal populations via indirect selection has been extensively studied in evolutionary theory (Gerrish, Colato, Perelson, & Sniegowski, 2007; Kimura, 1967; Taddei et al., 1997) and demonstrated in experimental microbial populations (Chao & Cox, 1983; Raynes & Sniegowski, 2014). Indirect selection on mutators and their potential role in carcinogenesis have also been investigated in computational and analytic models of cancer progression (Beckman & Loeb, 2006; Datta et al., 2013). However, whether indirect selection could favor CIN mutators during carcinogenesis has remained unclear. The reason is that CIN mutators generate SCNAs large enough to simultaneously affect multiple loci. As deleterious mutations generally outnumber beneficial ones, SCNAs with an overall beneficial effect may be too scarce to allow for CIN hitchhiking. Thus, we hypothesized that indirect selection could only favor CIN in genomes in which fitness-affecting loci were distributed in a way that minimized the co-occurrence of beneficial and deleterious loci in potential SCNAs.
In agreement with our hypothesis, simulations showed that the genomic distribution of fitness-affecting loci can strongly influence indirect selection on CIN-inducing mutators. CIN mutators failed to establish or affect carcinogenesis when both beneficial and deleterious loci were evenly distributed among potential SCNAs. However, spatial clustering of either the beneficial or the deleterious loci promoted rapid hitchhiking of CIN mutators and accelerated carcinogenesis. In genomes characterized by higher clustering of the beneficial loci, CIN mutators succeeded by acquiring SCNAs containing multiple beneficial mutations, rather than acquiring such mutations individually (as previously seen in models of MSI mutators; Datta et al., 2013). On the other hand, in genomes characterized by higher clustering of the deleterious loci, CIN mutators succeeded by acquiring SCNAs with few beneficial mutations but a reduced load of deleterious ones. Once established, CIN mutators in such genomes were able to accelerate carcinogenesis by rapidly producing additional beneficial mutations via further SCNAs.
Importantly, in our model, we assume that all single-locus mutations have a constant effect on a cell's fitness. Therefore, the availability of beneficial SCNAs that could facilitate hitchhiking in simulation depended solely on the variance in the physical distribution of beneficial and deleterious loci. In a real tumor, however, different mutations will likely have different effects on a cell's fitness. As a result, the distribution of beneficial effects of real SCNAs will be determined by both the physical distribution of individual loci and the fitness distribution of their mutational effects. For example, the overall beneficial effect of an SCNA could be set by a cluster of smaller effect beneficial mutations or, instead, a single mutation of large effect. Hitchhiking of CIN mutators should then depend on the variance in the distribution of beneficial effects of potential SCNAs being sufficiently high to allow for SCNAs whose beneficial effects could compensate for their deleterious load.
Unfortunately, while many potential oncogenic and tumor-suppressing loci have been discovered, little is known about their fitness effects as these have been difficult to measure empirically. Thus, as a first approximation of the variance in fitness of potential human SCNAs, we used the TUSON algorithm by Davoli et al. (2013), which ranks human loci based on the likelihood of their mutations acting as either oncogenes or tumor suppressors. Like Davoli and colleagues, we assigned each gene a fitness effect corresponding to its rank, resulting in a discrete uniform distribution of fitness effects. Given this simple scheme, both potential oncogenes and tumor suppressors appeared to be more spatially clustered than expected, with shorter SCNAs up to ~$10^6$ nucleotides significantly so. Thus, the human genome may be organized in such a way that some of the available SCNAs have sufficiently large beneficial effects that overcome their deleterious load. If such SCNAs are available, our model suggests that alleles that induce CIN may be favored by indirect selection even in the absence of any direct benefit to a neoplastic cell's fitness. Furthermore, the relatively high clustering of selected [...]
FIGURE 5 Exacerbating deleterious mutations is more effective at inhibiting CIN than increasing the rate of CIN. (a) In genomes with high clustering of beneficial mutations (beta-binomially distributed, µ = 1). (b) In genomes with intermediate clustering of beneficial mutations (geometrically distributed, µ = 1). In both panels: deleterious mutations distributed with no clustering (Dirac delta, µ = 100). Circles are mean values calculated over 100,000 runs of simulation (error bars represent ±95% CI, all times are shown with violin plots). Model parameters as in Figures 2 and 3.
[...] Intriguingly, a comprehensive survey of focal SCNA length across multiple cancer types by Beroukhim et al. (2010) showed an inverse relationship between SCNA length and frequency, with a median length of $1.8 \times 10^6$ nucleotides. Note that this observation is only consistent with the prediction that natural selection should favor shorter SCNAs and is not evidence for the role of indirect selection in CIN evolution.
It is surprising that the human genome may be organized in a way that promotes CIN evolution. After all, natural selection might have been expected to eliminate or at least reduce clustering of oncogenes and tumor suppressors to lower cancer susceptibility.
However, it is important to note that such selection would have been only one of the determinants of genome organization. It is becoming well understood that eukaryotic genes are not randomly distributed across the genome. Related genes and gene families that have arisen through gene duplication may be expected to co-localize (Demuth & Hahn, 2009). Genes with similar or coordinated expression are also frequently clustered (Hurst, Pál, & Lercher, 2004). Importantly, cancer genes identified by Davoli et al. (2013) [...]
[...] can rapidly produce beneficial variation and accelerate carcinogenesis. There is also some evidence that CIN may evolve early in cancer progression (Olaharski et al., 2006; Rajagopalan et al., 2004; Shih et al., 2001; Tonini, 2017). Thus, a potential therapy to prevent CIN evolution in the first place may be able to severely inhibit cancer development. Building on the earlier work of McFarland et al. (2013), we also used our model to examine whether increasing the rate of CIN or the cost of deleterious mutations could prevent evolution of CIN. Both therapeutic strategies aim to increase the deleterious mutational load and may theoretically be expected to inhibit mutator evolution. Intriguingly, we discovered that in genomes characterized by high variance in the distribution of beneficial mutations, increasing the deleterious effects of individual mutations was considerably more effective than increasing the rate of CIN. The disparity in the effectiveness of the two strategies is likely due to the mechanics of CIN hitchhiking in these genomes. Stronger CIN mutators can still produce the rare but very beneficial SCNAs available in these genomes, which allows them to rapidly spread despite the associated deleterious load. Correspondingly, in genomes characterized by lower variance, stronger CIN mutators become less successful.
On the other hand, exacerbating the cost of deleterious mutations dramatically decreases the fitness effect of all available SCNAs, amplifying the deleterious load associated with any increase in the rate of CIN and effectively inhibiting CIN mutators.
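A small worked example may help to see why the two strategies differ. Under the additive fitness scheme of the model, an SCNA that hits a given number of beneficial and deleterious loci has a net fitness effect equal to the beneficial gain minus the deleterious cost. The sketch below is illustrative only: the function name is hypothetical and the locus counts are made up rather than taken from the simulations.

```python
def scna_net_effect(n_ben, n_del, s_ben=0.1, s_del=0.01):
    """Net additive fitness effect of one SCNA hitting n_ben beneficial and
    n_del deleterious loci: n_ben * s_ben - n_del * s_del."""
    return n_ben * s_ben - n_del * s_del


# Hypothetical SCNA covering 2 beneficial and 10 deleterious loci.
baseline = scna_net_effect(2, 10)                 # default s_del = 0.01
five_fold = scna_net_effect(2, 10, s_del=0.05)    # s_del raised five-fold
```

With the default parameters this hypothetical SCNA is net beneficial (0.2 − 0.1 = +0.1), but after a five-fold increase in s_del it becomes net deleterious (0.2 − 0.5 = −0.3). Raising U_SCNA, by contrast, leaves the net effect of each individual SCNA unchanged and only alters how often SCNAs of every kind arise, so the rare, strongly beneficial SCNAs remain available for hitchhiking.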
In the study of McFarland et al. (2013), both increasing the overall mutation rate of a tumor and magnifying the effects of deleterious mutations successfully led to cancer regression. In their model, both strategies work by strengthening selection against deleterious mutations accumulated by neoplastic populations during carcinogenesis. However, magnifying the deleterious effects of these mutations proved to be a more effective therapy in simulation than increasing the mutation rate. Our results agree that exacerbating deleterious effects could also be a more effective strategy to pre- [...] (Karras et al., 2017; Rutherford & Lindquist, 1998) and proteasomes that degrade such proteins (Crawford, Walker, & Irvine, 2011).
In summary, extending earlier computational models of cancer

ACKNOWLEDGEMENTS
We thank members of the Weinreich laboratory for helpful discussion. We also thank two anonymous reviewers for their constructive comments. This work was supported by the postdoctoral fellowship from the Center for Computational Molecular Biology at Brown University.

CONFLICT OF INTEREST
None declared.

DATA ARCHIVING
Julia code for the simulation and the simulated data sets are available at https://github.com/yraynes/Chromosomal-Instability.