Detecting the population dynamics of an autosomal sex ratio distorter transgene in malaria vector mosquitoes

Abstract
The development of genetically modified (GM) mosquitoes and their subsequent field release offers innovative and cost‐effective approaches to reduce mosquito‐borne diseases, such as malaria. A sex‐distorting autosomal transgene has been developed recently in G3 mosquitoes, a laboratory strain of the malaria vector Anopheles gambiae s.l. The transgene expresses an endonuclease called I‐PpoI during spermatogenesis, which selectively cleaves the X chromosome to result in ~95% male progeny. Following the World Health Organization guidance framework for the testing of GM mosquitoes, we assessed the dynamics of this transgene in large cages using a joint experimental and modelling approach. We performed a 4‐month experiment in large, indoor cages to study the population genetics of the transgene. The cages were set up to mimic a simple tropical environment with a diurnal light‐cycle, constant temperature and constant humidity. We allowed the generations to overlap to engender a stable age structure in the populations. We constructed a model to mimic the experiments, and used the experimental data to infer the key model parameters. We identified two fitness costs associated with the transgene. First, transgenic adult males have reduced fertility and, second, their female progeny have reduced pupal survival rates. Our results demonstrate that the transgene is likely to disappear in <3 years under our confined conditions. Model predictions suggest this will be true over a wide range of background population sizes and transgene introduction rates.
Synthesis and applications. Our study is in line with the World Health Organization guidance recommendations with regard to the development and testing of GM mosquitoes. Since the transgenic sex ratio distorter strain (Ag(PMB)1) has been considered for genetic vector control of malaria, we recorded the dynamics of this transgene in large indoor cage populations and modelled its post‐release persistence under different scenarios. We provide a demonstration of the self‐limiting nature of the transgene, and identify new fitness costs that will further reduce the longevity of the transgene after its release. Finally, our study showcases an alternative and effective statistical method for characterizing the phenotypic expression of a transgene in an insect pest population.

The five distance measures comprise two from the transgene frequency time-series, two from the sex-ratio among transgenic pupae, and one from the egg number time-series.

Transgene frequency data
We denote by $\{f_{c,t}\}_{t=1}^{T}$ the time series from the $c$th cage ($c \in \{1,2,3\}$) of the frequency of Ag(PMB)1 among pupae, over all the days on which this was observed, $t = 1, \ldots, T$. We first smooth these series by transforming them into two-week moving average time-series, yielding three transformed series that we denote by $\{s(f_{c,t})\}_{t=3}^{T-2}$. Note that the smoothed time series are four data points shorter than the original series because there are two data points per week in the original series. We next compute the sum of square differences between $\{f_{c,t}\}$ and $\{s(f_{c,t})\}$ for each of the three series pairs, which we denote by $r(f_c)$. Specifically, $r(f_c) = \sum_{t=3}^{T-2} \left( f_{c,t} - s(f_{c,t}) \right)^2$. We apply the same smoothing and residual functions to the simulated data $\{f'_{j,t}\}_{t=1}^{T}$ (where $j$ refers to the $j$th simulation with the given parameters) to get corresponding variables $\{s(f'_{j,t})\}$ and $r(f'_j)$. We can now calculate two distance measures between the observed and simulated data:
$d_1$ is the sum of square differences between all corresponding pairs of smoothed empirical and smoothed simulated transgene frequency time-series.
$d_2$ is the sum of square differences between all corresponding pairs of real and simulated transgene frequency residuals.
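The smoothing, residual and distance steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the moving average is taken as a centred five-point window (chosen because it shortens each series by four points, as the text notes), every cage is paired with every simulation, all series have equal length, and the function names `smooth`, `residual_ss` and `frequency_distances` are hypothetical.

```python
import numpy as np

def smooth(series, half_window=2):
    """Centred moving average; drops `half_window` points at each end.

    Assumption: a five-point window (half_window=2) stands in for the
    two-week moving average, since it removes four points in total.
    """
    x = np.asarray(series, dtype=float)
    width = 2 * half_window + 1
    return np.convolve(x, np.ones(width) / width, mode="valid")

def residual_ss(series, half_window=2):
    """Sum of square differences between a series and its smoothed version."""
    x = np.asarray(series, dtype=float)
    trimmed = x[half_window:len(x) - half_window]  # align with smoothed points
    return float(np.sum((trimmed - smooth(x, half_window)) ** 2))

def frequency_distances(observed, simulated):
    """Two distances between cage series and simulated series.

    Assumption: every observed cage is compared with every simulation.
    """
    d_smooth = 0.0  # distance between smoothed series
    d_resid = 0.0   # distance between residual sums of squares
    for obs in observed:
        for sim in simulated:
            d_smooth += float(np.sum((smooth(obs) - smooth(sim)) ** 2))
            d_resid += (residual_ss(obs) - residual_ss(sim)) ** 2
    return d_smooth, d_resid
```

Separating the smoothed trend from the residual lets one distance capture the trajectory of the transgene frequency and the other capture its short-term variability.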

Transgene sex-ratio data
In each cohort of pupae that were observed in the type 2 cages, the sex-ratio among transgenic pupae was recorded. Since we did not expect any temporal change in the transgenic sex-ratio, we computed the mean and variance of these data for each of the three cages, $\mu_c$ and $\sigma^2_c$ for cage $c \in \{1,2,3\}$. Similarly, we computed the mean and variance of this variable from the simulated data, $\mu'_j$ and $\sigma'^2_j$ for simulation $j$. This resulted in the next two distance measures:
$d_3$ is the sum of square differences between all simulated and empirical sex-ratio means.
$d_4$ is the sum of square differences between all simulated and empirical within-cage variances in sex ratio.
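As a minimal sketch of the two sex-ratio distances, the function below summarises each cage and each simulation by its mean and variance and then sums the squared differences over all cage and simulation pairs. The function name `sex_ratio_distances`, the all-pairs comparison, and the use of the sample variance (`ddof=1`) are assumptions made for illustration; the source does not state which variance estimator was used.

```python
import numpy as np

def sex_ratio_distances(observed, simulated):
    """Distances between empirical and simulated sex-ratio summaries.

    observed:  list of per-cage sequences of transgenic sex-ratios
    simulated: list of per-simulation sequences of the same variable
    Assumption: sample variance (ddof=1) and all cage/simulation pairs.
    """
    obs_mean = [np.mean(c) for c in observed]
    obs_var = [np.var(c, ddof=1) for c in observed]
    sim_mean = [np.mean(s) for s in simulated]
    sim_var = [np.var(s, ddof=1) for s in simulated]

    # Sum of square differences between means, then between variances
    d_mean = sum((m - ms) ** 2 for m in obs_mean for ms in sim_mean)
    d_var = sum((v - vs) ** 2 for v in obs_var for vs in sim_var)
    return float(d_mean), float(d_var)
```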

Egg count data
Finally, the experiment recorded the number of eggs that were produced after each bloodfeeding opportunity (twice per week), both in the type 1 cages (before Ag(PMB)1 mosquitoes were introduced), and in the type 2 cages. We denote by $e$ the smoothed egg number time-series from the cages. We similarly obtained corresponding smoothed egg number time-series from the simulations, which we denote by $e'$, in place of $e$. The final distance measure, $d_5$, is the sum of square differences between the smoothed simulated and smoothed observed egg number time series, across cages and replications. Note that we do not construct a distance measure from the variance in egg number. This is because the model was unable to replicate the high variability that we observed, as discussed in the main text.

Monte-Carlo inference algorithm
We inferred the posterior distribution (shown in Fig. 3) using the following algorithm.
First, we selected a parameter vector (a 'particle') at random from the prior distribution. We simulated the experiment 20 times using these parameters and calculated the distances $d_1, \ldots, d_5$ from the empirical data. We repeated this process 200,000 times to obtain a set of particles and associated distance vectors. We retained all particles for which each of the five distances fell within the lowest 0.3014 quantile for that measure. This yielded the 200 posterior points shown in Fig. 3.
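The procedure above is a form of rejection sampling on summary distances, and its accept step can be sketched as follows. This is an illustrative sketch, not the study's implementation: `rejection_abc`, `toy_prior` and `toy_distances` are hypothetical names, and the one-parameter toy model with two distances merely stands in for the mosquito population model and its five distances.

```python
import numpy as np

def rejection_abc(sample_prior, distances_for, n_particles, quantile, rng):
    """Keep particles whose every distance lies in the lowest `quantile`.

    sample_prior(rng)          -> one parameter vector (a 'particle')
    distances_for(theta, rng)  -> sequence of distances for that particle
    """
    particles = [sample_prior(rng) for _ in range(n_particles)]
    dist = np.array([distances_for(p, rng) for p in particles], dtype=float)

    # One threshold per distance measure, taken across all particles
    thresholds = np.quantile(dist, quantile, axis=0)

    # A particle is retained only if it passes on every measure at once
    keep = np.all(dist <= thresholds, axis=1)
    return [p for p, k in zip(particles, keep) if k]

# Toy stand-ins (assumptions): uniform prior, two deterministic distances
def toy_prior(rng):
    return rng.uniform(0.0, 1.0)

def toy_distances(theta, rng):
    return [abs(theta - 0.5), (theta - 0.5) ** 2]

rng = np.random.default_rng(0)
posterior = rejection_abc(toy_prior, toy_distances, 1000, 0.1, rng)
```

Because a particle must pass all thresholds simultaneously, the retained fraction is generally smaller than the per-measure quantile, and how much smaller depends on how strongly the distances are correlated.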