Some deterministic and stochastic mathematical models of naïve T‐cell homeostasis

Humans live for decades, whereas mice live for months. Over these long timescales, naïve T cells die or divide infrequently enough that it makes sense to approximate death and division as instantaneous events. The population of T cells in the body is naturally divided into clonotypes; a clonotype is the set of cells that have identical T‐cell receptors. While total numbers of cells, such as naïve CD4+ T cells, are large enough that ordinary differential equations are an appropriate starting point for mathematical models, the numbers of cells per clonotype are not. Here, we review a number of basic mathematical models of the maintenance of clonal diversity. As well as deterministic models, we discuss stochastic models that explicitly track the integer number of naïve T cells in many competing clonotypes over the lifetime of a mouse or human, including the effect of waning thymic production. Experimental evaluation of clonal diversity by bulk high‐throughput sequencing has many difficulties, but the use of single‐cell sequencing is restricted to numbers of cells many orders of magnitude smaller than the total number of T cells in the body. Mathematical questions associated with extrapolating from small samples are therefore key to advances in understanding the diversity of the repertoire of T cells. We conclude with some mathematical models on how to advance in this area.


| INTRODUC TI ON
A T cell is characterized by the T-cell receptors (TCRs) on its surface. The TCR is formed by random gene recombination processes in the thymus and determines the set of peptide-major histocompatibility (pMHC) complexes that the T cell is capable of reacting to. Every T cell found in the rest of the body either originated in the thymus, or is descended from a cell that did. Daughter cells inherit their TCR from their mother. More than 90% of putative T cells die in the thymus, either because they are insufficiently reactive, or because they react too strongly to pMHC ordinarily found in the body. 1-8 This thymic "selection" produces a repertoire of T cells, with many different specificities. 9,10 An adaptive immune response begins with a small number of binding events between TCR and pMHC on the surfaces of antigen-presenting cells (APCs) that have processed peptides not ordinarily found in the body. 11 T cells with the correct specificity, initially rare, then enter a program of multiple rounds of cell division, 12 activation 13 and production of chemical signals. The initial binding events typically occur in one of the body's lymph nodes 14 ; T cells usually circulate from node to node, 15 scanning APCs for pMHC, but rarely encountering strong stimulation. 16 Here, we review some models of naïve T-cell repertoire homeostasis: the maintenance by DOI: 10.1111/imr.12696

Some deterministic and stochastic mathematical models of naïve T-cell homeostasis
the body of a diverse repertoire of T cells capable of responding to new threats.
The field of immunology owes its name to the observation that the immune system responds more rapidly and more strongly to an infection that it has previously fought. Thus vaccination, deliberate introduction of well-chosen foreign peptides into a body, prevents the spread of diseases. This phenomenon of immunity, or property of memory, is due to a rebalancing of the repertoire of immune cells. The temporary army of T cells that springs into existence during an episode of infection does not demobilize back to its small number of precursors, but retains an enhanced presence that includes types of "memory" cells, with different patterns of circulation through organs and tissues. [17][18][19][20][21][22][23][24][25][26] We focus on naïve T cells whose TCR is made up of an α and β chain. The set of such cells is unambiguously divided into CD4 + and CD8 + T cells. Every T cell makes an irreversible commitment to one of the two lineages in the thymus that is, along with TCR specificity, inherited by daughters of cells that divide. CD4 is a surface molecule that acts as a coreceptor when a CD4 + or "helper" T cell, binds to a pMHC complex containing type II MHC; CD8 is a surface molecule that acts as a coreceptor when a CD8 + or "cytotoxic" T cell, binds to a pMHC complex containing type I MHC. In order to understand the homeostasis and function of the immune system, further classifications of T cells into subsets including, for example, central and effector memory based on CD62L, CCR7, and other surface markers, regulatory cells based on CD25 and FOXP3, and others, are necessary. 21,27 Full understanding remains elusive, in part because the distinctions between subsets are seldom clearcut, their origins and possible reversibility are the subject of active research. [28][29][30][31][32] To maintain their populations over the lifetime of the host, naïve CD4 + and CD8 + T cells depend on cytokine resources, especially IL-7, to survive and divide. [33][34][35][36][37][38][39][40] They may obtain IL-7 as they circulate through lymph nodes and the spleen, where it is produced by cells such as fibroblastic reticular cells. 41,42 It is possible to devise mathematical models of the binding of IL-7 molecules to receptors, and of the processes of internalization, degradation, and recycling of complexes on the surface and in the cytoplasm of individual cells. 43 Then, the variables of the model may be numbers of molecules of free ligand, numbers of ligand-bound complex, numbers of internalized complexes instead of numbers of cells. However, in this review, we restrict ourselves to modeling at the level of cells and populations of cells.

| Simple deterministic models
We can model a large T-cell population, for example, all naïve CD4 + T cells in a individual, about 10 7 in an adult mouse or about 10 11 in an adult human. 44 In a single-equation model, the number of cells is represented not as an integer but by a real function of age, N(t), that obeys an ordinary differential equation (ODE) of the form The simplest model, which may be obtained by assuming that cells circulate independently throughout the body and every cell is equally likely to undergo one round of division, or suffer death, in any small time interval, is where θ is the rate of export of cells from the thymus and β is the death rate per cell minus the division rate per cell. If θ does not depend on age and N(0) = 0, then the solution of Eq. (1) is with steady-state value N ss = . The transformation of some naïve cells into memory cells can also be included as another net loss rate contributing to the value of β.
To assess the usefulness of the simple model Eq. (1), let us consider reasonable values of the parameters θ and β. If a mouse's thymus produces 10 6 CD4 + cells per day, one peripheral naïve cell in 30 dies per day and one in 300 divides per day, then β = 0.03 per day and, using Eq. (1), we conclude that there are 33 million naïve CD4 + T cells at steady state. Over the timescale 1/β of approximately 1 month in mice (longer in humans), it is reasonable to treat cell division and death as events that may occur in a "short" time interval.
On the same timescale, however, there is an important decrease in the average thymic production of a mouse. We therefore replace the constant θ by a function of age, θ(t). For mice older than 6 weeks, we may use the function 45 where ν = 0.005 per day and the value of the constant A varies from mouse to mouse. If, at 6 weeks old, a mouse has 30 million naïve CD4 + T cells and its thymus is producing one million cells per day, we find In this description, because e −βt decreases more rapidly as a function of age than e −νt , the long-term decrease of the average number of naïve T cells in a mouse has the same timescale as the waning of thymic export to the periphery. It is also possible to model the increase in thymic output from infancy to adulthood. 46 For comparison, the rate of export of naïve CD4 + T cells from the thymus of a young adult human is about 2 × 10 8 cells per day; the rate declines proportional to e −νt where ν = 0.0001 per day. 47 This type of mathematical abstraction is valuable because its parameters have clear interpretation and their values can be measured. Division rates per cell can be estimated by Ki67 staining. 45 Suppose that, after cell division, the two daughter cells remain Ki67 + for 4 days. Then, ignoring the time spent cycling before division, the division rate λ, which is the probability per day that a cell in the population divides, is equal to one eighth of the fraction of cells in that population that are Ki67 + . Death and division rates per cell are estimated using, for example, deuterium which is incorporated into the DNA of dividing cells. [48][49][50][51] Simple formulae such as Eq.
(2) that result from simple models such as Eq. (1) can be fit to experimental data, using statistical techniques to understand which parameters vary from individual d dt N = rate of thymic export − rate of cell death + rate of cell division.
(1) (1) can be considered for the different thymocyte populations, for CD4 + and CD8 + naïve T cells and for CD4 + and CD8 + memory T cells. 48,52,53 Memory cells have higher rates of division than naïve cells; their input terms are not due to direct export from the thymus, but to differentiation of naïve cells. The memory compartments are typically classified into "central memory," "effector memory," and perhaps also "tissue-resident memory" or "stem-cell memory" compartments. If recent thymic emigrants behave differently from mature naïve T cells, 54,55 then they warrant a separate equation.
Models based on Eq.
(2) cannot describe extreme scenarios such as recovery from lymphopenia, 56,57 but they are useful in describing T-cell homeostasis. It could be said that they are most useful when, as in two recent cases, they have been shown to fail. In the first case, closer examination of deuterium-labeling data reveal that the assumption that all T cells in a large population share common values of death and division rates to be inadequate. Instead, models of "kinetic heterogeneity" are constructed in which the total cell population consists of subpopulations with different rates. 58

| MODEL S OF CLONAL SURVIVAL
The most important property of the T-cell repertoire is its diversity.
Every T cell in a mammal's body can be said to belong to the family of cells with which it shares a TCR, even if they never again find themselves close to each other after they, or their ancestors, leave the thymus. The number of cells in a family is an integer that increases whenever one of its cell divides and decreases whenever one of its cells dies.
In homeostatic conditions, death and division are infrequent enough that it makes sense to treat them as instantaneous and independent events, neglecting the time taken in the processes of death or division. 60 However, the numbers of cells per clonotype are no longer large enough to justify the use of a ODE approximation. In particular, ODE models do not capture the possible extinction of clonotypes in a natural way. Thus, we will consider models, in which n i (t), the number of naïe T cells of a clonotype i at time t, is an integer, with cell death and division treated as the basic events in a multi-dimensional "birth-and-death" stochastic process.
F I G U R E 1 Survival of clonotypes in the periphery, according to Eq. (3), which assumes no peripheral division. The values of μ are similar to estimated death rates of naïve cells in mice and the initial number of cells per clonotype is n = 8

| Death process
In an adult mouse, where division of naïve T cells in the periphery is either absent or rare, the simplest model for a single clonotype can be summarized as follows: n cells with the same TCR leave the thymus and begin to circulate independently through the body. They die, one by one, until the clonotype is extinct. At this end of the spectrum of scenarios, the birth-and-death process is simply a death process with rate μ, because the birth rate is zero. Assuming that each cell in each clonotype, independently, has a constant death rate μ, the probability that a clonotype that emerges from the thymus as a family of n cells, survives in the periphery longer than t is given by and is plotted in Fig. 1, with n = 8.

| Neutral models
In an adult human, in contrast, the frequency of division of naïve T cells in the periphery (at least 10 8 new cells per day) outweighs thymic export (about 10 7 new cells per day, declining with age). In this case, the opposite end of our spectrum of scenarios, the simplest approximation is that each clonotype, independently, follows a "population birth-and-death process" with equal birth and death rates. That is, in a small time interval Δt the probability that any cell dies, μΔt, is equal to the probability that it divides. Extinction of clonotypes occurs in this type of "neutral" dynamics 61,62 even if the total number of cells is stable.
To find the survival function of a clonotype under the assumption of neutral dynamics, we define and Because cells behave independently, the descendants of each of the n initial cells can be thought of as n independent families and the "probability generating function" satisfies, 63 Let us consider G 1 (z,t). Using the definition (4) we know that G 1 (z,0) = z. Next, consider a time Δt, short enough that the possibility of two events occurring in the time interval (0,Δt) can be neglected. One cell, present at the beginning of the interval, survives with probability 1 − 2μΔt, dies with probability μΔt, or divides with probability μΔt. Thus, and We conclude that G 1 (z,t) satisfies The solution of (6) is The survival function, S(t) = 1 − p 0 (t), shown in Fig. 2, is the probability that n i (t) > 0; in other words, the fraction of clonotypes that survive until t. Notice that extinction is certain because p 0 (t)→1 as μt→∞. 64 The values of μ shown in Fig. 2 are in the range of estimates for naïve and memory T cell subsets in an adult human.
While the approximation of the survival function (7) is useful, the neutral model has an important drawback: the mean lifetime of a clonotype is infinite. Thus, the assumption cannot be used to calculate a steady-state mean number of clonotypes in the periphery, which would be equal to the product of the rate of export of new clonotypes from the thymus and their mean lifetime.

| Global competition and thymic pressure
In order to consider the situation in which both thymic export and peripheral division contribute to maintaining homeostasis, we consider a population of cells, each with a label corresponding to its TCR clonotype, competing on an equal footing for a public or "global" resource. The dynamics is governed by a combination of thymic production, division in the periphery and cell death. In this section, let us suppose that cell division depends on access to a nonspecific resource, such as IL-7, that is produced in the whole animal at a constant average rate. Thus, there is a well-defined mean number of T cell divisions per unit time, denoted γ. The T cells circulate through the body, ready for their dose of resources. We F I G U R E 2 Survival of clonotypes in the periphery under the approximation (7) of neutral dynamics. Clonotypes leave the thymus as a family of eight cells suppose that, in any small time interval, each living cell is equally likely to receive a signal causing one round of division.
We continue with the simplest hypothesis that each T cell, independently, has a death rate μ.
The total number of T cells has a stationary mean, maintained by a balance between death, on one hand, and thymic production and division, on the other. The total number of new cells produced in a short time interval of duration Δt is ( + n )Δt. At steady state, this is balanced by the mean number of deaths in Δt, μN(t)Δt. Thus, the steady-state mean total number of cells is To give a mathematical description of the fate of a clonotype in the periphery, we need to consider n i (t), the number of cells of clonotype i at time t. The probability that one cell of the clonotype dies between t and t + Δt is μn i (t)Δt. The probability that one cell of type i divides between t and t + Δt is λ i (t)Δt where In other words, the fraction of the global resource received by clonotype i at time t is equal to its weight in the repertoire. As the total number of cells, N(t), is large and fluctuates about the mean value N ss , we may use the approximation N(t) ≃ N ss . In this approximation, the integer n i (t) follows a birth-and-death process 65 with death rate μn i and birth rate where is the nondimensional parameter that measures the ratio of thymic production to peripheral division. The effect of the thymus on the periphery is that of a competitive "pressure" that results in the mean per-cell division rate being less than the per-cell death rate. We do not call this model "neutral" because, although all cells in the periphery follow the same rules, the division rate is a function of the status of the whole repertoire, not simply equal to the death rate.
Under these assumptions, of global competition and thymic pressure, we can calculate the distribution of clonal (family) lifetimes. We say that T i is the clonal lifetime if n cells of type i exit the thymus at age t 0 and the last cell of type i dies at age t 0 + T i . The probability that the clonal lifetime is greater than t, is plotted in Fig. 3.
With thymic output and peripheral division constant then, as well as a mean steady-state number of cells, there is a well-defined mean steady-state number of clonotypes, N * , equal to the product of θ and the mean lifetime of a clonotype in the periphery. In the limit where γ E = 0.577… is the Euler-Mascheroni constant. 66

| Diffusion approximation and clonal size distribution
Under a diffusion process approximation of the birth-and-death process, the integer n i (t) is approximated by the real-valued stochastic process X t that satisfies the following stochastic differential equation 66 When α = 0, the survival probability of this process is 66 which is similar to (7). With the diffusion approximation, (13), it is possible to calculate the distribution of clonal sizes which, assuming a homeostatic steady state, is equivalent to the distribution averaged over the lifetime of one clone. The corresponding quantity for X t is the occupation density of the process X t . 67 The function L(y,n 0 ) = occupation density at y given that the process starts at n 0 and is absorbed at 0, satisfies (The mean lifetime of a clonotype is the integral of L(y,n 0 ) with respect to y.) The solution of (14) is That is, the distribution of family sizes in the repertoire is predicted to be a function that decreases in a faster-than-exponential way. F I G U R E 3 Survival of clonotypes in the periphery with competition for a nonspecific resource, using (11). The parameter α measures the strength of the thymus relative to peripheral cell division. We have chosen μ = 1 per year and n = 8

| Specific competition and thymic pressure
In our final example, we consider competition for resources that are TCR specific. An enormous variety of self-pMHC epitopes is available to circulating T cells. As a result of the random recombination and selection processes in the thymus, any one T cell will only recognize a small fraction of those available. Let us assume that the number of different pMHC in the body is a very large, but fixed, number M 66 and that each pMHC subset acts as a resource that is equally likely to be consumed by each of the cells capable of recognizing it. Let us also assume that each pMHC has probability p of being recognized by a randomly chosen TCR. The value of p is estimated to be 10 −6 . [68][69][70] The model can be thought of as a massive ecosystem of species, 71-74 competing for niches, subject to pressure from new species continuously exported from the thymus. If thymic export is constant, then a homeostatic steady state is attained in which extinction of existing species in the periphery balances the production of new species in the thymus. Remarkably, the single-clonotype dynamics in such a system of large-scale competition is similar to that of global competition, in that the division rate is scaled by the factor (1 + α) −1 and the steady-state number of clonotypes satisfies (12). This simplicity is due to the cross-reactivity being "unfocussed" in the sense of Mason. 75 In the model just described, the total number of cells in the system has a well-defined mean. De Greef et al. devised a model where the total number of cells is fixed. 62,76 To implement this computationally, each time the thymus produces n new cells, the same number of cells, chosen at random, are removed from the repertoire. In both types of model, the following illuminating equality holds: 62,76 The quantity on the left-hand side is the rate of export of new clonotypes from the thymus to the periphery. The first factor in the product on the right-hand side is μ, the death rate per cell (equal to the inverse of the mean lifetime of a cell). Increasingly accurate estimates of μ have become available in the past few years. [48][49][50][51] The second factor is the number of unicellular clonotypes in the repertoire. That is, the number of TCRs that are present on only one cell in the periphery. The maximum possible value of F 1 is simply S, the number of cells in the repertoire. The equality F 1 = S holds in the extreme scenario that all peripheral clonotypes consist of one cell only, otherwise F 1 < S.
To estimate θ, we need to divide the rate of thymic production, in cells per unit time, for which good estimates exist, 44,46 by n , the number of cells per clonotype at the moment of release from the thymus, which is not well established. 77 Putting aside the age-dependence of the thymus, we may use the estimates θ = 2.5 × 10 6 per month and μ = 1 per month for naïve CD4 + T cells in an adult mouse, in which case F 1 = 2.5 × 10 6 . This suggests that at least one quarter of naïve CD4 + T-cell clonotypes in the periphery of a mouse consist of only one cell. For comparison, we may estimate that θ = 10 9 per year and μ = 0.5 per year for naïve CD4 + T cells in an adult human, and deduce that F 1 = 2 × 10 9 . On the one hand, this value is comfortably below the total number of naïve CD4 + T cells in an adult human, so that it is not true that most clonotypes consist of only one cell. On the other hand, it suggests that the number of distinct TCR in the repertoire is greater than 10 10 . 66 In Fig. 4, we illustrate typical theoretical life histories of a clonotype in a mouse (left) and human (right). In the former case, the simplest assumption of no peripheral division is used. In the latter case, the slight imbalance between birth and death rates ensures that the mean clonal lifetime is finite, but the clonal size may grow from its initial value before, eventually, extinction occurs.

| Computational models
How should a computational model of the T-cell repertoire over the lifetime of a human or mouse be constructed? Computational models provide a testing ground for provisional classification and lineage schemes for T cells. If the number of subsets is not too large, models with one ODE per subset are easily solved. Then, Bayesian statistical frameworks can be used to compare models and estimate parameter values. Should, for example, recent thymic emigrants be treated as a separate subset from naïve cells 78 ? The fact that the CD4 + :CD8 + ratio in the periphery of an adult mouse is close to 1, although the SP4:SP8 ratio in the thymus is typically 4:1 38,79 is consistent with the hypothesis that CD4 + differ from CD8 + T cells in this respect. 55 Are death and division rates constant over the lifetime of a mouse or human, or do they change as part of the aging process 53,55,59,80 ? In mathematical extrapolation from small samples. [98][99][100][101][102] Assuming that single-cell techniques can overcome the practical problems associated with high-throughput sequencing, the remaining problem is the mathematical one of extrapolation from small samples.
Data available to date suggest that apart, perhaps, from a few large clonotypes, the repertoire of naïve T cells in a mouse or human consists of many clonotypes with small clonal sizes. Thus, given current sample sizes, the majority of TCRs found in typical samples will be found on only one cell per sample. 95 We must conclude that most TCR clonotypes are unobserved. These observations, no doubt a source of frustration to the experimentalist, are the basis of convenient approximations for mathematicians. The mathematical problem is illustrated in Fig. 6. We can imagine that we can take a sample of balls from a large box filled with balls of many colors. The total number of balls is known, but the number of different colors is not. What can be deduced from the colors of the balls in the sample?
Suppose that an unbiased random sample of m cells is taken, and sequenced, from a total repertoire of S cells. Let us start with the probability that one particular cell in the repertoire is part of the sample: 77,103 In naïve repertoire single-cell sequencing q ≪ 1, because m, approximately 10 2 -10 3 , is much smaller than S, approximately 10 7 (mice) or 10 11 (humans). We seek a relationship between n i , the num-  (17) The exponent r is a product of the factor m 2 /2S, that does not depend on the distribution of clonal family sizes, and the factor (n 2 )∕n − 1, that does. In Fig. 7, we consider a power-law repertoire. We say that the distribution is "power-law" if the number

| D ISCUSS I ON
Our understanding of the dynamics of an immune system repertoire of more than 10 11 interacting cells is, not surprisingly, incomplete. Individual human T cells may live for years; following their entire histories in vivo as they undergo migration, activation and differentiation processes is not possible. Most knowledge comes instead from in vitro experiments where samples of cells are confined and stimulated, and from adoptive transfer and imaging experiments involving genetically modified mice. [104][105][106][107] Mathematics provides a systematic language to express dogma, theory, and conjecture, so that assumptions can be tested and improved, parameters estimated and predictions compared with data. Progress is also being made in constructing in silico models which, by emulating in vitro models and in vivo processes, could reduce and replace animal testing.

ACK N OWLED G EM ENTS
We will always be grateful for the privilege of Robin Callard's friendship and scientific insight.

CO N FLI C T O F I NTE R E S T
The authors declare no conflicts of interest.