Virus dynamics and phyloanatomy: Merging population dynamic and phylogenetic approaches

In evolutionary biology and epidemiology, phylodynamic methods are widely used to infer population biological characteristics, such as the rates of replication, death, migration, or, in the epidemiological context, pathogen spread. More recently, these methods have been used to elucidate the dynamics of viruses within their hosts. Especially the application of phylogeographic approaches has the potential to shed light on anatomical colonization pathways and the exchange of viruses between distinct anatomical compartments. We and others have termed this phyloanatomy. Here, we review the promise and challenges of phyloanatomy, and compare them to more classical virus dynamics and population genetic approaches. We argue that the extremely strong selection pressures that exist within the host may represent the main obstacle to reliable phyloanatomic analysis.


| INTRODUCTION
Virus dynamics is the study of the population biology of viruses and the host cells within an infected individual. 1,2 In many cases, virus dynamics studies assume the host to be well-

Eva Bons | Roland R Regoes
Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland

extended periods of time. Since most antiviral drugs target virus replication, these latent compartments are not cleared by therapy.
The paradigmatic latency is established by herpesviruses in nerve cells, from which they can reactivate under specific conditions, such as exposure to ultra-violet light or immune suppression.
HIV also establishes latency in a small minority of its target cells, which is revealed after long and effective antiretroviral treatment.
Similar to latent compartments are sanctuary sites where the virus is also protected from the immune response or antiviral drugs due to strong barriers between the sanctuary site and the other compartments. However, unlike in a latent compartment, the virus can typically replicate in a sanctuary site. For example, some viruses can pass the blood-brain barrier, entering the central nervous system. This is generally considered a sanctuary site, where there is only limited exchange with the bloodstream, allowing the virus to proliferate with little influence from the host's immune system or antiviral drugs.
A better understanding of the spread of viruses across the various anatomical compartments in the host has the potential to improve treatment and vaccination approaches. Only if we know where the infection is initiated and the path it takes to become established in the host, can we target interventions to prevent infection efficiently. Information about the early processes will aid new vaccination approaches that aim to elicit immune responses targeted to specific anatomical sites. 8 Furthermore, identifying bottlenecks during the colonization of the host might open up promising dynamical targets for intervention, and understanding the dynamics in latent compartments will help to prevent viral rebound.
In this review, we assemble what we know about the spatial dynamics of various virus infections. We approach this topic from two angles. First, we adopt the perspective of classical virus dynamics 1,2 that has been extended to account for multiple compartments.
Second, we will discuss the potential of applying population genetic and phylodynamic methods to viral sequence data obtained from various compartments within the infected hosts. This application has recently been called phyloanatomy. [9][10][11] Phyloanatomy has its roots in phylogeography and phylodynamics, which combine phylogenetics (the building of lineage trees based on sequencing data) with dynamical models of disease spread and migration. 12,13 Phylodynamics and phylogeography can thus elucidate the deeper evolutionary and population history from recently sampled sequences. They are widely used in epidemiology to quantify the epidemiological spread of viruses and to understand new epidemics in real time. This allows one to locate the source of an epidemic, 14,15 and to estimate parameters such as population size and basic reproductive number (R0), 16 and migration rates between different locations. 14,17 In phyloanatomy, these well-established methods are applied to within-host viral sequencing data to elucidate the importance of certain organs or cell types in the progress of an infection.
With this review, we aim to bridge the two methodologically separate fields of virus dynamics and virus evolution. Because of the large scope of our undertaking, we did not aim to review the literature in every relevant field comprehensively. Rather, we chose to discuss studies that shaped our thinking and that we found particularly enlightening. We restricted our efforts to the dynamics and evolution of clinically relevant viruses of humans and their respective animal models.

| HIV dynamics
The most paradigmatic virus dynamics studies have been performed in the context of HIV infection under treatment. 2,18-21 These studies, relying on data collected in the blood, modeled neither the anatomical structure within the hosts nor multiple cell compartments.
The decline of the viral load observed under treatment, however, was multiphasic, which was inconsistent with the simple single-compartment models and pointed toward dynamical complexities.
While this multiphasic decline is commonly interpreted as evidence for the existence of cell compartments with lower viral turnover, explanations have been put forward involving redistribution between anatomical tissue compartments. Hlavacek et al. 22 proposed that the deceleration of viral decline under treatment could be the result of the release of virus that had been attached to follicular dendritic cells in secondary lymphatic tissues.
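A multiphasic decline of the kind described above can be illustrated with a sum of two exponentials, one per compartment. The following is a minimal sketch, not the model of any cited study: the function names and all parameter values (`v0`, `frac_fast`, `d_fast`, `d_slow`) are hypothetical placeholders chosen only to make the two phases visible.

```python
import math

def viral_load(t, v0=1e5, frac_fast=0.99, d_fast=0.5, d_slow=0.03):
    """Biexponential decline under treatment (t in days).

    A fraction frac_fast of the initial load decays at rate d_fast,
    the remainder at rate d_slow. All values are hypothetical.
    """
    fast = frac_fast * math.exp(-d_fast * t)
    slow = (1.0 - frac_fast) * math.exp(-d_slow * t)
    return v0 * (fast + slow)

def apparent_decline_rate(t, dt=1e-3):
    """Negative slope of ln V(t), approximated by a forward difference."""
    return -(math.log(viral_load(t + dt)) - math.log(viral_load(t))) / dt

# Early on, the slope is dominated by the fast compartment;
# late in treatment it approaches the slow compartment's decay rate.
early = apparent_decline_rate(1.0)
late = apparent_decline_rate(60.0)
```

On a logarithmic scale the curve bends from a steep first phase to a shallow second phase, which is the pattern a single-compartment (single-exponential) model cannot reproduce.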
More recently, De Boer et al. 23 tried to resolve the discrepancy between estimates of the lifespan of HIV in plasma, which differ by about 100-fold. The analysis of the viral kinetics under treatment yielded a plasma virion lifespan of six hours. [18][19][20] Based on data from plasma apheresis in humans, the lifespan was estimated as 20-75 minutes. 24 Lastly, infusion of virions into the blood of rhesus macaques resulted in an estimate of three minutes. 25 De Boer et al. 23 proposed that the faster estimates are measures of the migration of virus from the blood into other anatomical compartments, rather than viral clearance. They estimate that HIV is cleared in lymphoid tissue at a rate of 10-100 per day, corresponding to a lifespan of 15-120 minutes.
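The lifespans and rates quoted above are related by a simple reciprocal if one assumes exponential clearance at a constant rate. The sketch below makes that conversion explicit; the helper names are ours, and the exponential assumption is a simplification that the cited analyses may refine.

```python
MINUTES_PER_DAY = 24 * 60

def lifespan_minutes(rate):
    """Mean lifespan in minutes for clearance at `rate` per day,
    assuming exponentially distributed lifespans."""
    return MINUTES_PER_DAY / rate

def clearance_rate_per_day(lifespan_min):
    """Clearance rate per day implied by a mean lifespan in minutes."""
    return MINUTES_PER_DAY / lifespan_min

# Rates implied by the three plasma estimates quoted in the text.
rate_six_hours = clearance_rate_per_day(6 * 60)
rate_three_minutes = clearance_rate_per_day(3)
fold_difference = rate_three_minutes / rate_six_hours

# Lifespans implied by clearance rates of 10 and 100 per day.
slowest = lifespan_minutes(10)
fastest = lifespan_minutes(100)
```

Under this assumption, six hours versus three minutes is a 120-fold difference, in line with the "about 100-fold" discrepancy, and rates of 10 to 100 per day map to lifespans of roughly a quarter of an hour to a little over two hours, the same order of magnitude as the range quoted above.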
In the course of antiretroviral treatment, the rate of viral decline in blood decelerates to almost zero before the virus is eradicated from the host. The virus persists in a latent reservoir that constitutes a serious obstacle for curing HIV. To this day, it is therefore a focus of intense research.
There are two main questions about this compartment. First, it is not clear if the latent compartment is anatomically separate from most of the infected cells during untreated infection. Second is the question of whether there is viral turnover and evolution in the latent compartment, or if the virus in this compartment is static. These questions are very difficult to study because the viral dynamics in the latent reservoir is observable only after many months of antiretroviral treatment that prevents the faster dynamics in productively infected cell compartments. At this stage, however, the viral load is usually not detectable.
The existing literature on classical HIV dynamics models with latency has been comprehensively reviewed by Rong and Perelson. 26 In brief, on the basis of the existing virus load data, these models cannot definitively resolve the question of whether there is ongoing replication and evolution in the latent reservoir. Assuming that latent cell activation is the only source of residual plasma viremia, however, quantitative analysis suggests that turnover and hence evolution in the latent compartment is unlikely. 27

| Spatial dynamics of other viruses
For example, for HCV infection there are spatial models for the spread of the virus between hepatocytes. 29 These models were based on single-cell laser capture microdissection data from liver biopsy samples of patients chronically infected with HCV, in which the HCV RNA content within infected cells has been quantified. 30 Analysis of their spatial distribution indicated that infected cells occurred in clusters, with cluster sizes ranging from 4 to 50 cells. When the virus load determined in the few hundred cells in these microdissections was scaled up to the approximately 10^11 cells of the entire liver, there was very good agreement with plasma levels of HCV in each of the four patients analyzed. The data and the analysis support the hypothesis that HCV from the blood infects a random hepatocyte, and then spreads to adjacent cells. To what extent local cell-to-cell vs long-distance spread of free virions contributed to the spread of infection still remains to be determined and could not be inferred from these data.

Regarding the dynamics of HCV between blood and liver, which is more central to the topic of this review, there are no modeling studies to date. In principle, such anatomically resolved modeling would be possible as viral loads in the liver can be determined in ongoing HCV infections. In Talal et al., 31 for example, the viral load under treatment was determined in blood and liver. This study also contains a mathematical model of the viral dynamics in blood and liver, but does not consider the exchange between these two compartments. Such studies would be interesting to understand the recolonization of the liver after a transplant. (See below for a description of a study 32 that analyzed sequence data before and after transplant and hypothesized nonhepatic sites of replication.)

HSV-2 is another infection for which spatial aspects are central. While this infection affects a single anatomical site, the genital mucosa, it causes spatially well-defined lesions in this compartment. Hence, it is characterized by a strong spatial structure.
Mathematical models have been developed to capture the spatial spread of HSV-2 in the genital mucosa. 3

All these studies allude to anatomical aspects as one of many ways to explain discrepancies between the predictions of single-compartment models and data obtained from a single compartment.
Rarely are multiple anatomical compartments sampled. Even if they are, sampling is not sufficiently frequent to yield comprehensive time series. This problem could be alleviated by phylogenetic methods that have the potential to infer processes that occurred before the time of sampling.

| APPLICATION OF POPULATION GENETIC AND PHYLODYNAMIC METHODS TO THE WITHIN-HOST DYNAMICS OF VIRUSES
As we discussed in the previous section, standard virus dynamics models can be used to reject simple single-compartment dynamics and to propose more complex dynamics that are consistent with the data. But, because the time courses of virus and cell population sizes are sampled too sparsely in the relevant compartments, these models are often insufficient to determine the full multicompartment dynamics.
There are population genetic and phylodynamic methods that have been used successfully to identify compartmentalization and the dynamics in multiple compartments. These methods require not just measures of the population size, but information on the genetic composition of the viral population. This extra information allows these methods to extrapolate contemporaneously measured genetic diversity into the past. They accomplish this by making assumptions about the underlying evolutionary dynamics.
We will first discuss studies that tested for the compartmentalization of the virus population. After that we will review studies that used genetic information to estimate the parameters of the dynamics in multiple compartments.

| Identifying compartmentalization
There are several classical population genetic methods to test for compartmentalization. In principle, these methods compare the genetic diversity within and between compartments. These have been applied to various viruses.
Tests for compartmentalization can be divided into distance-based and tree-based methods. 35 Distance-based methods require some distance measure between the viral sequences, such as the Hamming distance. A well-known example of a distance-based test is the fixation index F_ST, 36 which has been adapted for use with sequence data by using the Hamming distance instead of allele frequencies. 37 Intuitively, these distance-based tests compare the genetic distances of viral sequences within versus between compartments.
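The within-versus-between comparison can be sketched in a few lines. The example below uses one common sequence-based form, F_ST = 1 - (mean within-compartment distance) / (mean between-compartment distance), together with a permutation test on the compartment labels. This is an illustrative assumption, not necessarily the exact estimator of the cited studies, and the toy sequences and function names are hypothetical.

```python
import random
from itertools import combinations, product

def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def fst(comp_a, comp_b):
    """1 - mean within-compartment / mean between-compartment distance.

    Each compartment needs at least two sequences.
    """
    within = [hamming(s, t)
              for comp in (comp_a, comp_b)
              for s, t in combinations(comp, 2)]
    between = [hamming(s, t) for s, t in product(comp_a, comp_b)]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    if mean_between == 0:
        return 0.0  # all sequences identical: no signal of structure
    return 1.0 - mean_within / mean_between

def permutation_p_value(comp_a, comp_b, n_perm=999, seed=0):
    """Share of random relabelings with F_ST at least as large as observed."""
    rng = random.Random(seed)
    observed = fst(comp_a, comp_b)
    pooled = list(comp_a) + list(comp_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if fst(pooled[:len(comp_a)], pooled[len(comp_a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy alignment: two compartments, three sequences each.
blood = ["AAAA", "AAAT", "AAAA"]
brain = ["GGAA", "GGAT", "GGAA"]
observed_fst = fst(blood, brain)
p_value = permutation_p_value(blood, brain)
```

Values of F_ST near 1 indicate that sequences differ far more between compartments than within them; with only three sequences per compartment, however, the permutation test has very little power, which illustrates why sample size matters for these tests.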
Tree-based methods, on the other hand, require the reconstruction of a phylogenetic tree before the test can be applied. The tests rely on the particular topology of the reconstructed tree. A classical tree-based method is the Slatkin-Maddison test. 38

In HIV infection, a compartmentalization of systemic vs central nervous system infection is well established. 10,35 The compartmentalization between blood, secondary lymphatic system, genital tract mucosa, and gut-associated lymphatic tissue has also been investigated, but has not been unequivocally established (see table 1 in Feder et al. 39 ).
In their methodological study on a wide variety of compartmentalization tests, Zárate et al. 35 reanalyzed clinical sequence data, and simulated sequence data. They found that the tree-based methods (Slatkin-Maddison test, 38 Simmonds association index, 40 correlation coefficients, 41 and nearest-neighbor statistic 42 ) are more sensitive than the distance-based methods, F_ST 36,37 and analysis of molecular variance (AMOVA). 43 The sequence diversity of the clinical samples analyzed by Zárate et al. 35 most likely comprises escape mutations from cytotoxic T lymphocytes (CTL) that accumulate during the course of infection. 44,45 Because CTLs, the selective agent of these escapes, are distributed systemically, these escape mutations can arise independently in different compartments. The simulated sequence data, on the other hand, were generated from a neutral model, according to which such parallel evolution is very unlikely.
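To make the tree-based idea concrete, the following sketch implements the core of a Slatkin-Maddison-style test as it is commonly described: the minimum number of migration events on a tree is counted by Fitch parsimony over the tip compartment labels, and compared with the counts obtained after randomly permuting those labels. The nested-tuple tree encoding, the function names, and the toy trees are our own illustrative assumptions.

```python
import random

def fitch(tree):
    """Return (possible states, minimum number of state changes).

    A tree is a compartment label (leaf) or a pair of subtrees.
    """
    if isinstance(tree, str):
        return {tree}, 0
    left_states, left_changes = fitch(tree[0])
    right_states, right_changes = fitch(tree[1])
    common = left_states & right_states
    if common:
        return common, left_changes + right_changes
    return left_states | right_states, left_changes + right_changes + 1

def leaves(tree):
    """Tip labels in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def relabel(tree, labels):
    """Copy of the tree with tip labels drawn in order from an iterator."""
    if isinstance(tree, str):
        return next(labels)
    return (relabel(tree[0], labels), relabel(tree[1], labels))

def slatkin_maddison(tree, n_perm=999, seed=0):
    """Observed migration count and permutation p-value.

    Few migrations relative to permuted labels suggest compartmentalization.
    """
    rng = random.Random(seed)
    observed = fitch(tree)[1]
    tips = leaves(tree)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(tips)
        if fitch(relabel(tree, iter(tips)))[1] <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical trees: one fully compartmentalized, one fully mixed.
separated = ((("blood", "blood"), ("blood", "blood")),
             (("brain", "brain"), ("brain", "brain")))
mixed = ((("blood", "brain"), ("blood", "brain")),
         (("blood", "brain"), ("blood", "brain")))
migrations_separated, p_separated = slatkin_maddison(separated)
migrations_mixed, p_mixed = slatkin_maddison(mixed)
```

Because the statistic depends only on the tree topology and the tip labels, parallel mutations that distort the reconstructed topology feed directly into the test, which is one way selection can bias it.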
In Box 1 and Figure 1 we show how parallel evolution can confound compartmentalization tests. The effect applies predominantly to the distance-based methods, F_ST and AMOVA. In comparison to

We extended our recent simulation model, 95

would then go to fixation in the compartment it appeared in first.
If the resistance mutation in the second compartment was still at low frequency, a compartmentalization test would be positive.
As soon as the resistance mutation goes to fixation in the second compartment, the statistical signal for compartmentalization might disappear.
For viruses other than HIV, our insights into the compartmentalization of the dynamics are more rudimentary. While the differences in the genetic composition of HCV between liver and periphery have been investigated for many decades, 50 formal establishment of compartmentalization is rare. For HCV, there seems to exist a separation between blood and central nervous system similar to that in HIV, but only in cognitively impaired patients. 51 There is also evidence, based on the Mantel test, for compartmentalization between healthy and tumorous tissue in the liver. 52

Interestingly, the methods used to test for compartmentalization required the compartments to be known. This means that each viral sequence needed to be associated with a specific compartment from which it was sampled. There are population genetic methods that allow the identification of compartments that are a priori unknown. 53 These methods have been extensively used to characterize the host or the vector diversity of virus infections but not the viruses themselves, and certainly not within hosts. One reason why these methods are not applied to viruses may be that they assume independent loci and evolutionary equilibrium, assumptions that are rarely fulfilled during any infection within the host.
Phylodynamic methods that allow insights into the number of relevant compartments from transmission trees have been conceived. 54 The further development of such methods would be a fruitful direction for future research. This would particularly benefit the study of the within-host dynamics of viruses, in which the nature and number of relevant compartments are still unknown.

| Phyloanatomy
Phylodynamic methods are increasingly popular in macroevolutionary studies and epidemiology. In brief, these methods reconstruct one or multiple phylogenies from viral sequences, and use them to estimate population dynamical parameters. This is accomplished by considering the phylogeny to be a realization of the assumed underlying population biological and sampling processes. Different population dynamical scenarios have been conceived, such as temporally changing population sizes ("skyline") [54][55][56][57] and migration between multiple geographical locations ("phylogeography"). 14,17 The migration models have been used in epidemiology to pinpoint the origin of infectious diseases, and to reconstruct the spread of an infectious disease across different geographical locations. 14,15 These methods have been implemented in dedicated software packages such as BEAST 58,59 or MrBayes. 60,61

FIGURE 1 Statistical power of several tests of compartmentalization as a function of migration rate. Statistical power is determined as the proportion of simulations correctly identified as compartmentalized. Simulations assumed two compartments with parallel selection. This figure corresponds to Figure 1 in Zárate et al. 35 Fifty simulations of sequence evolution (see Box 1) were performed for each migration rate. We simulated two compartments under parallel evolution. We assumed identical dynamical parameters and MFEDs in each compartment, and equal migration rates between the compartments. All simulations were run for 100 generations, with 10^4 sequences of 2600 bases per compartment and a mutation rate of 10^-5 per base per generation. Parallel evolution leads to compartmentalization which is harder to detect than with a neutral model (compare with Figure 1 in Zárate et al. 35 ). Especially the distance-based methods (AMOVA, Wright's measure of population subdivision F_ST) often fail to detect compartmentalization even in the absence of migration.

In principle, these phylogeographic methods can also be used to infer the viral dynamics between different tissue compartments within the infected host. This application has been termed phyloanatomy.

on SIV-infected, CD8-depleted macaques, they could estimate the viral population sizes in the periphery and the brain, and the timings of migration between these two compartments. 62 Their study was based on peripheral viral sequences sampled at several time points during infection, and viral sequences from the brain sampled after death. Viral sequence data obtained after death allows only the identification of rare migration paths, such as the one between blood and brain. To infer migration between systemic compartments with potentially higher rates, earlier and more frequent sampling will be required.
Another study that Salemi and Rife 10 cite in their review is Cybis et al., 63 in which viral flow between plasma, CD4+ and CD8+ T cells is investigated. This study identifies potential migration paths by Bayes factors, but does not estimate migration rates between compartments.
A recent study by Lorenzo-Redondo et al. 11 investigated replication and migration of HIV-1 in blood and lymph nodes to address the question of whether there is evolution in the latent reservoir. To this end, they used high-throughput sequencing data of HIV-1 DNA in cells from blood and lymph nodes, collected from three subjects before treatment initiation, and at three and six months after treatment initiation. The main point of this paper was that the viral phylogenies are temporally structured during treatment, which the authors interpret as evidence for ongoing viral replication and evolution in the latent reservoir.
This conclusion has been heavily debated. Rosenbloom et al. 64 pointed out that the original study may have been measuring the evolution that was ongoing in the shrinking nonlatent reservoir during the first six months of treatment. Other studies could not reproduce the temporal structure of the phylogenies. 65,66

More relevant to our review are the more direct anatomical insights the study by Lorenzo-Redondo et al. 11 provides. Using traditional measures of compartmentalization (F_ST), they find that blood and lymph are dynamically separated. By using phylodynamic methods, they estimate migration rates between these compartments, and find higher migration rates from lymph to blood than in the reverse direction. They observe that haplotypes in the blood are always derived from haplotypes observed earlier in the lymph nodes, from which they conclude that there is no source of virus to the blood other than the secondary lymphatic system. It may also indicate that the virus in the various lymph nodes of the secondary lymphatic system is not compartmentalized.
In the context of HCV infection, the case for phylodynamic analysis was made a few years ago. 67

| Barcoding
Recently, "barcoding," or "genetic tagging," has become popular in many biological systems, such as viruses, bacteria, and cells. One of the main advantages of barcodes compared with natural diversity is that their effect on fitness is less pronounced.
In some systems, there is even evidence for neutrality of these tags, as, for example, in Salmonella typhimurium, 68

Genetic tagging is also being adopted to elucidate the dynamics of blood and immune cells. T cells, for example, have been tagged by various methods and the resulting data have been paired with population dynamical analyses to estimate dynamical parameters characterizing replication and differentiation. 83,84 While the anatomical aspects in these systems do not take a central position in this research, the process of differentiation is formally similar to migration: instead of investigating at which rate a subpopulation of cells migrates from one compartment to the next, these studies estimate the rate at which T cells differentiate to become a member of another T-cell subpopulation. The analogy to viral latency or bacterial persistence in T cells is that memory T cells have a lower turnover rate, and seed the secondary response upon reinfection.
The main result of the studies by Buchholz et al. 84 is that the T-cell response in these systems unfolds according to a linear differentiation pathway, going from naive to central memory to effector memory to effector cells. This pathway has challenged the consensus that existed in this field for a long time that memory T cells are produced from effector T cells. 85,86 The newly determined pathway is consistent with the concept of "stemness": after the first infection is cleared, a small population of memory T cells remains that can seed a response against secondary challenge, which then unfolds faster because the first differentiation step is already taken. 87

By much finer barcoding, Gerlach et al. 83 could investigate the individual differentiation fates of a large number of naive T cells. They found large heterogeneity in the fates of individual cells, adding support to the view that stochasticity is an essential feature of the differentiation dynamics.
Comparing the analysis of experimental data involving barcoding between the viral and nonviral systems, there are a few differences.
First, the analysis of bacterial and cellular systems was firmly rooted in stochastic process theory. This allowed not just the identification of a single process rate, as for SIV infection, but opened the door to estimating multiple rate constants. Especially when several processes influence the dynamics similarly, such as migration into a compartment and replication therein, estimating the corresponding rate constants reliably is challenging. With barcoding, the resulting stochasticity of the dynamics can often be exploited to disentangle the contribution of competing processes.
Second, in the case of T-cell differentiation, the tagging paired with stochastic models can be used to identify the model structure.
Thus, rather than just estimating the rate constants of potentially competing processes, the rich data from studies using barcodes can be used to compare models with very different structures. Buchholz et al., 84 for example, considered hundreds of conceivable differentiation models.
In summary, because barcoding is more controllable and potentially neutral, it can be a very powerful tool to elucidate the within-host dynamics of pathogens and host cells. In the case of viruses, a fully stochastic modeling and inference framework might allow one to gain insights that go deeper than those gained to date. Especially in the context of anatomical spread, barcoding will allow one to identify migration pathways between multiple compartments.

| SELECTION AS A POTENTIAL CONFOUNDER IN PHYLOANATOMICAL ANALYSIS
Phylogenetic and phylodynamic analyses work best if evolution is neutral. 88 Interestingly, one of the main motivations for phylodynamics came from immune-mediated selection that could affect the shape of viral phylogenies. 12 In one of the earliest phylodynamics studies, for example, the shape of influenza virus phylogenies was found to be consistent with an epidemiological model that includes short-lived, strain-transcending immunity. 89

Despite the central role of selection, both as a driver and a potential confounder of phylodynamic analysis, it is not usually captured quantitatively in phylodynamic inference schemes. Furthermore, selection is not considered explicitly when reconstructing genealogies or transmission trees, which provide the empirical input into phylodynamic analysis. Commonly, variation in evolutionary rates is used to capture the effect of selection in phylogenetic reconstruction.
However, from the perspective of population genetics, the relationship between the evolutionary rate and selection is too indirect. In our view, selection can best be captured in terms of fitness values of individual viral variants.
Evolution under selection can have the advantage of generating diversity quickly, thus providing viral variants diverse enough for phylodynamic analysis. This is because under neutral evolution, mutations can only rise in frequency due to drift, which is a slow process in large populations. The faster diversification will come at the cost of having to put up with the biases in the estimates of the population dynamic parameters discussed above. In particular parameter regimes, these biases might be buffered; in very large populations with high mutation rates, for example, beneficial mutations will arise independently in different compartments, but likely on different genetic backgrounds.
This will allow one to distinguish the independent mutations, thus alleviating the bias introduced by parallel evolution.
These effects are illustrated in Figure 2, where sequence evo-

102 Although there are still uncertainties about the population genetic or dynamic driver behind the switch from using CCR5 to CXCR4, 103,104 it is known that the switch expands the range of cells that can be infected, and genetic signatures for the switch have been identified. 105 In the context of bacterial infections, Haemophilus influenzae has been found to evolve higher rates of invasion from the nasal tissues into the blood. 106 In poliovirus infection in mice, it has been observed that colonization of the central nervous system is associated with the diversity of the systemic strain. 107,108 In this system, it appears to be the composition and the cooperation between viral strains, rather than specific mutations, that facilitate invasion.
Selection pressures within an infected host are likely to be larger than those on the epidemiological scale. For HIV and HCV infections, it is well established that the evolutionary rates within hosts are faster than those between hosts. [109][110][111] One of the major within-host selection pressures is exerted by CTL. Selection exerted by CTL responses has been estimated to have selection coefficients close to 100% in animal models, [112][113][114][115][116] but can also be large in humans. 48,117,118 The strength of this selection pressure is often determined by the rate at which CTL escape mutants outcompete the wildtype virus. Due to potential clonal interference between escape mutants, the selection pressure exerted by CTL might be underestimated. 119,120 Also nonlytic mechanisms of CTL action that have been considered to lead to unspecific selection can, in theory, lead to escape if spatial aspects of the interaction are taken into consideration. 121 Selection pressures of a strength similar to those exerted by CTL have been estimated for antibody responses. 122 Immune-mediated selection strength within hosts therefore dwarfs the typical selection coefficients in other systems, in which beneficial mutations confer a 10% increase in fitness. 123 For example, one of the most advantageous alleles in recent human history confers lactase persistence and has a selection coefficient of 10%. 124 Moreover, immune-mediated selection pressures within the host are likely to be more homogeneous than on the epidemiological scale. On the one hand, this is due to the mobility of immune effectors between different anatomical compartments within the host. 125,126 On the other hand, host heterogeneity (mediated by, for example, diversity in human leukocyte antigens) may lead to diversifying selection on the epidemiological scale, rather than to parallel evolution. We hypothesize that parallel evolution should therefore be more prevalent at the within-host scale. This does not apply to immune-privileged sites, which may explain the accumulating evidence for compartmentalization between the central nervous system and the periphery in the context of various viral infections.

FIGURE 2 Neighbor-joining trees of simulated sequences in two compartments (10 sequences per compartment, red and blue) under different selection scenarios and significance levels of different compartmentalization tests (-: not significant, *: compartmentalization significant at 0.05 level, **: compartmentalization significant at 0.01 level, ***: compartmentalization significant at 0.005 level). In all cases, simulations of sequence evolution (see Box 1) were run for 500 generations in two compartments of maximum 1000 sequences each, starting with the same single sequence in each compartment. We used a mutation rate of 10^-5 per base per generation and a migration rate of 0.05 per generation. In the neutral scenario (left), no mutational fitness effects were taken into account, leading to generally little variation between sequences (half of the sequences are still wildtype after 500 generations). In the parallel scenario (middle), a lognormal MFED was used with μ = −0.2, σ = 0.2, f_l = 0.6. The same fitness table was used for both compartments. Beneficial mutations spread fast in both compartments, obscuring the compartmentalization of the system. In the diversifying scenario, the same MFED was used as in the parallel scenario, but 40% of the effects in the fitness table were compartment specific. This causes some mutations to be beneficial in one compartment and deleterious in the other. This leads to little survival of mutated migrants, thereby increasing the inferred compartmentalization.
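The rate at which an escape mutant outcompetes the wildtype, mentioned above as a measure of CTL selection strength, is commonly estimated from the change in the log-odds of the mutant frequency over time. The sketch below assumes a simple two-variant model in which both variants grow exponentially and the mutant's net growth advantage k is constant; the function names and the numerical values are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a frequency p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def escape_rate(t1, p1, t2, p2):
    """Growth advantage k of the escape mutant per unit time.

    Under the assumed model the log-odds of the mutant frequency
    increase linearly in time with slope k.
    """
    return (logit(p2) - logit(p1)) / (t2 - t1)

def escape_frequency(t, p0, k):
    """Mutant frequency at time t, starting from p0 at time 0."""
    odds = p0 / (1.0 - p0) * math.exp(k * t)
    return odds / (1.0 + odds)

# Hypothetical example: mutant at 1% at day 0, advantage 0.3 per day.
p_day10 = escape_frequency(10, 0.01, 0.3)
p_day20 = escape_frequency(20, 0.01, 0.3)
k_estimate = escape_rate(10, p_day10, 20, p_day20)
```

Because this estimate treats a single mutant competing against a single wildtype, several escape mutants rising at the same time slow each other down, which is how clonal interference can lead to an underestimate of the selection pressure.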

| CONCLUSION
In this review, we have assembled approaches from different fields to elucidate the spread of viruses between different compartments within their hosts. We started by briefly reviewing mathematical models describing the dynamics of viruses between various compartments. Across many viruses, these multicompartment models have been proposed to explain discrepancies between observed viral load measurements and prediction from single-compartment models. We argued that, while the multicompartment models have the potential to explain viral load patterns, their validity can rarely be confirmed.
We then discussed the various population genetic methods to infer compartmentalization that have been applied to virus sequences obtained from multiple compartments within an infected host. These methods test against a null model, in which virus populations are well mixed, and do not provide quantitative estimates for migration or replication rates. Moreover, we showed that their sensitivity and specificity strongly depend on the nature and extent of selection (Figures 1 and 2).
Last, we considered approaches that yield dynamical parameter estimates: phyloanatomy and barcoding. In our view, phyloanatomy has great potential but can be confounded by selection. The combination of barcoding and neutral population genetics, on the other hand, is not necessarily confounded by selection, but is applicable only in experimental systems.
The effect of selection on phylodynamics is not yet understood, and little research has been done on the subject. While selection pressures exist on the epidemiological and ecological scales where phylodynamics is commonly used, we consider its effect much stronger on the within-host scale. Evolutionary rates are known to be higher on the within-host scale than on the between-host level, and there is strong evidence for parallel evolution in viruses on the within-host scale, both in experimental settings and in clinical data.
We have illustrated that selection can be advantageous for phylodynamic inferences, because it can create sufficient diversity on shorter time scales. We also showed that the resulting migration rates between compartments might be over- or underestimated due to the presence of parallel or divergent evolution (see Box 1 and Figure 2). These stronger selection pressures indicate that a clarification of the role of selection is more important for phyloanatomy than for phylodynamic applications in the epidemiological setting.
We believe that the most promising path toward the reliable application of phyloanatomic methods to clinical data will require insights from both theoretical and experimental work. On the theoretical side, simulation of sequence evolution under specific selection regimes can help us understand how these specific selection pressures bias inferences. Experiments can help us assess exactly how much selection biases the estimates obtained from current methods, by working in a system where the dynamical parameters are known or can be independently determined. In vitro, a multicompartment model system could be set up, in which separate cell cultures represent compartments, and virus infection and migration can be controlled. In animal models, infection experiments with barcoded populations can be used to independently determine dynamical parameters, which can then be compared with phylodynamic inferences from untagged sequences in the same system. The combination of theoretical and experimental work could lead to either new phylodynamic inference schemes specific to systems with strong selection, or to formulae that correct for biases arising from the application of classical phylodynamic methods in these systems.

ACKNOWLEDGEMENTS
We thank Sebastian Bonhoeffer, Veronika Boskova, Judith Bouman, and Frederik Graw for valuable feedback on our manuscript. We gratefully acknowledge the financial support from the Swiss National Science Foundation (grant number 31003A_149769 to RRR).

CONFLICT OF INTEREST
The authors declare that no conflict of interest exists.