Selection of sites for field trials of genetically engineered mosquitoes with gene drive

Abstract Novel malaria control strategies using genetically engineered mosquitoes (GEMs) are on the horizon. Population modification is one approach wherein mosquitoes are engineered with genes rendering them refractory to the malaria parasite, Plasmodium falciparum, coupled with a low‐threshold, Cas9‐based gene drive. When released into a wild vector population, GEMs preferentially transmit these parasite‐blocking genes to their offspring, ultimately modifying a vector population into a nonvector one. Deploying this technology awaits ecologically contained field trial evaluations. Here, we consider a process for site selection, the first critical step in designing a trial. Our goal is to identify a site that maximizes prospects for success, minimizes risk, and serves as a fair, valid, and convincing test of efficacy and impacts of a GEM product intended for large‐scale deployment in Africa. We base site selection on geographic, geological, and biological, rather than social or legal, criteria. We recognize the latter as critically important but not as a first step in selecting a site. We propose physical islands as being the best candidates for a GEM field trial and present an evaluation of 22 African islands. We consider geographic and genetic isolation, biological complexity, island size, and topography and identify two island groups that satisfy key criteria for ideal GEM field trial sites.


| INTRODUCTION
We present a framework employed by the University of California Irvine Malaria Initiative (UCIMI) for the selection of sites to conduct field trials of a genetically engineered mosquito (GEM) with gene drive. These GEMs are designed to offer safe, cost-effective, and sustainable malaria control in sub-Saharan Africa. This will be achieved using a population modification strategy (Carballar-Lejarazú & James, 2017) wherein parasite-blocking effector genes are engineered into vector mosquitoes, rendering them incapable of transmitting the parasite (Isaacs et al., 2012). An essential GEM component is a highly efficient gene drive (Carballar-Lejarazú et al., 2020). These threshold-independent drives will spread through a population even when introduced at very low frequencies, thereby serving two critical purposes: to establish the effector genes at high frequency in the mosquito population at the immediate release site and to facilitate their spread into neighboring populations via normal mosquito dispersal and gene flow. This GEM is designed to eliminate the malaria parasite, P. falciparum, without eliminating the mosquito.
Achieving malaria control on a large spatial scale requires a threshold-independent gene drive, meaning one with a maximum capability for spreading across the environment (invasiveness).
Henceforth, when we refer to a GEM, we mean a mosquito engineered with anti-Plasmodium effector genes and a threshold-independent, highly invasive gene drive. This is the GEM that UCIMI aims to evaluate in a field trial.
Our primary goal for a site selection process is identification of a site that maximizes the prospects for success, minimizes risk, and serves as a fair, valid, and convincing test of the efficacy and impacts of a GEM product. The UCIMI program targets the principal malaria vectors, Anopheles coluzzii and A. gambiae, and is ultimately intended for malaria elimination throughout sub-Saharan Africa. Early field trials will provide critically important data to inform subsequent trials and influence decisions regarding large-scale deployment. These data include factors that cannot easily be assessed in caged populations because cages do not adequately replicate natural conditions and cannot assess phenomena on a meaningful spatial scale. We consider a trial successful when the quality and utility of the data it generates justify the time, effort, and cost that go into conducting it. Phase 2 field trials are intended to measure entomological endpoints including GEM survival, mating competitiveness, gene drive spread, and construct stability (WHO/TDR & FNIH, 2014).
Biological complexity can complicate data interpretation. Therefore, selection of a site with minimal complexity will maximize the prospects for success. Identifying field sites with hard boundaries that prevent gene flow into and out of the site results in a high level of containment. Containment minimizes the risk of GEM invasion outside the target site and maximizes the prospects for success by minimizing immigration of wild-type individuals into the study population. For a trial to be considered fair, it is our opinion that it be conducted at a site that is well justified on scientific grounds. To be valid and convincing, a trial should generate data that define key parameters accurately and results should contribute to assessing GEM performance over a range of environmental conditions. In the following narrative, we set forth a framework describing how these goals may be achieved.
A multi-phase pathway for the development and evaluation of GEMs has been proposed by the World Health Organization (WHO; WHO/TDR & FNIH, 2014). This protocol has been widely endorsed (African Union & NEPAD, 2018; National Academies of Sciences Engineering & Medicine, 2016) and serves as the foundation for the framework described here. PHASE 1 of the WHO pathway includes design and construction of the GEM product and initial evaluation of its efficacy. This evaluation assesses the phenotype generated by the transgenes, transgene inheritance (especially as it relates to the efficiency of the gene drive component), the stability of the construct over time, and a rudimentary evaluation of overall fitness (Carballar-Lejarazú et al., 2020; Hammond et al., 2016). GEM products that show promise then move into PHASE 2 field trials with a strong emphasis on containment.
Early guidelines recommended that initial tests be conducted in large, artificially contained greenhouse-like cages designed to simulate natural conditions (Alphey et al., 2002; Benedict et al., 2008; Facchinelli et al., 2013; Scott et al., 2002). Data generated in such caged environments are limited in several important ways: They do not allow analysis of community and ecosystem-level interactions in any meaningful sense, they cannot replicate food web structure, and they do not permit examination of ecological phenomena (e.g., dispersal) across spatial scales (Carpenter, 1996; Wynn & Paradise, 2001). Critically, experiments conducted in artificial environments often yield highly replicable, but spurious results (Schindler, 1998).
These limitations were recognized in later guidelines and the use of artificially contained environments is now suggested as optional, unless required by regulatory authorities.
A different strategy that has been proposed for dealing with containment is to conduct field trials in a stepwise fashion with early trials using threshold-dependent drives. A threshold-dependent drive will only spread within a population when introduced above some threshold frequency. Examples include split-drive systems which have limited invasiveness and are therefore self-contained (Cisnetto & Barlow, 2020; Nash et al., 2019). Threshold-dependent drives have their place in controlling vectors on a small spatial scale, such as in urban settings (Li et al., 2020); however, deploying a high-threshold drive to achieve malaria control at the scale of continental Africa is not feasible.
From our perspective, conducting trials in large cages or with high-threshold drives does not satisfy our goal that field tests be valid and convincing. Therefore, we propose to use ecologically confined PHASE 2 field trials in their place. The issue of containment can be mitigated by selecting the appropriate site.
The first consideration in the selection of a GEM field site should be based on defining biological and physical characteristics that would make a site ideal, or as near to ideal as possible (Alphey et al., 2002; Knols & Bossin, 2006; Scott et al., 2002).
Ethical, social, and legal issues are critically important, and no field test can be undertaken before these are addressed (Kolopack & Lavery, 2017;Neuhaus, 2018;Resnik, 2014). The UCIMI recognizes this and has adopted the relationship-based model for community and regulatory engagement which we have described in detail in a recent publication (Kormos et al., 2020). However, valuable resources, relationships, and infrastructure are best developed at a site that has first been determined to be scientifically suitable. Here, we describe a set of criteria that may be applied to a thoughtful consideration and assessment of potential field trial sites. When completed, this framework should provide a cogent justification for why a particular site was selected for GEM testing.
Ecologically confined field sites should offer geographic, environmental, and/or biological confinement (WHO/TDR & FNIH, 2014). Ecological islands not bounded by water have been suggested as a possibility, but these are not well known for African anopheline species. Physical islands have been suggested as ideal for conducting GEM field trials (National Academies of Sciences Engineering & Medicine, 2016; Scott et al., 2002). Islands have served as model ecosystems and have played a key role in the development of evolutionary theory (Warren et al., 2015). In addition to containment, islands have numerous characteristics that favor their use as GEM field trial sites, including relatively small size, distinct boundaries, simplified biotas, and relative geological youth. These features led to the development of contemporary "Island Biogeography Theory," IBT (Frankham, 1997;Losos & Ricklefs, 2009;MacArthur & Wilson, 1967;Santos et al., 2016), which we rely on to inform our assessment of the advantages of island over mainland sites for the evaluation of GEM. Under IBT, island size (area) and geographic isolation are considered the most important factors driving island biodiversity (Helmus & Behm, 2020), and since this information is readily obtained from published sources, it formed the basis for initial selection of potential sites.

| Selection of candidate island sites
Site selection was initiated with the identification of all potential island sites, which we define broadly as any island associated with the continent of Africa (Figure 1). Data for each site were obtained from published sources except for some genetic data which was generated de novo by us. These data were used to inform the suitability of potential sites by determining if they meet the set of criteria listed in Box 1. This information includes descriptions of entomological, genetic, geographic, and geophysical features of the sites and mosquito populations therein.

| Measuring island geographic isolation
Geographic isolation for each island was defined using three methods, all reported in Table 1.
The first metric is simply the geographic distance to the nearest mainland. Distances for individual islands were calculated as the shortest great circular distance between an island's mass centroid and the mainland coast. For archipelagos, distances from the nearest island to the mainland were used (Weigelt et al., 2013). Distance to the mainland for each Lake Victoria island and for Annobón was determined using Google Earth's distance and area measuring tool.
The two closest points on the mainland and island shores were used as measuring points. The significance of distance to mainland is that the nearest mainland is assumed to be the richest gene pool and the source of populations on the islands (Itescu et al., 2020; Weigelt et al., 2013).
FIGURE 1 African islands and island groups considered potential field sites for genetically engineered mosquitoes for malaria eradication
BOX 1 Criteria for the selection of field sites (columns: Primary criteria, Rationale)
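The distance-to-mainland metric can be illustrated with a short sketch. It assumes the haversine formula on a spherical Earth and a mainland coastline represented as a list of sampled points; the function names, the radius constant, and the coordinates in the usage note are hypothetical and are not taken from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical approximation)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance (km) between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_mainland_km(centroid, coast_points):
    """Shortest great-circle distance from an island centroid to sampled coast points.

    centroid: (lat, lon); coast_points: iterable of (lat, lon). Both are
    hypothetical inputs standing in for the GIS layers used in practice.
    """
    lat0, lon0 = centroid
    return min(great_circle_km(lat0, lon0, lat, lon) for lat, lon in coast_points)
```

For example, `distance_to_mainland_km((0.0, 0.0), [(0.0, 2.0), (0.0, 1.0)])` returns the distance to the nearer of the two made-up coast points, roughly one degree of arc along the equator.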
A second metric is the United Nations Environment Programme (UNEP) Isolation Index, which is calculated as "the sum of the square roots of the distances to the nearest equivalent or larger island, the nearest group or archipelago, and the nearest continent (Dahl, 1991)." The higher the value, the more geographically isolated the island is.
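As a minimal sketch of the definition quoted above, the UNEP Isolation Index reduces to a sum of three square roots. The function and parameter names are illustrative assumptions, and distances are taken to be in kilometers.

```python
import math


def unep_isolation_index(d_island_km, d_group_km, d_continent_km):
    """Sum of square roots of the three distances named in the UNEP definition.

    d_island_km: distance to the nearest equivalent or larger island
    d_group_km: distance to the nearest island group or archipelago
    d_continent_km: distance to the nearest continent
    """
    distances = (d_island_km, d_group_km, d_continent_km)
    if any(d < 0 for d in distances):
        raise ValueError("distances must be non-negative")
    return sum(math.sqrt(d) for d in distances)
```

With made-up distances of 4, 9, and 16 km the index is 2 + 3 + 4 = 9; larger values indicate greater geographic isolation.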
The third isolation index is surrounding landmass proportion (SLMP) where the isolation of the focal island is proportional to the area of the surrounding landmass (Weigelt et al., 2013). SLMP is calculated as the sum of the proportions of landmass within buffer distances of 100, 1000, and 10,000 km around the island perimeter.
SLMP accounts for the coastline shape of large landmasses by considering only regions that extend into the measured buffers. SLMP values for the Canary Islands, Cape Verde Islands, and Bijagós Islands were represented as the average of all islands in their respective archipelagos (Weigelt et al., 2013). SLMP is a preferred index for analysis of species variation on a focal island. The equilibrium theory of island biogeography supports this index as individual islands may act as stepping stones for species dispersal and establishment, which this index accounts for by shortening the distance between an island and potential source populations (MacArthur & Wilson, 1967). A larger SLMP value indicates that an island is surrounded by more landmass.
For this study, we are focusing on islands with a lower SLMP value since these islands will have less surrounding landmass which could facilitate mosquito dispersal into or out of the target island.
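The SLMP calculation can be sketched as follows, assuming that the "proportion of landmass" in each buffer is the land area inside the buffer divided by the total buffer area. The GIS step that measures those areas is not shown, and the input dictionaries and their values are hypothetical.

```python
BUFFER_DISTANCES_KM = (100, 1000, 10000)  # buffer distances stated in the text


def slmp(land_area_km2, buffer_area_km2):
    """Sum over buffers of (land area in buffer) / (total buffer area).

    Both arguments are dicts keyed by buffer distance in km. How the areas are
    measured (projection, coastline data) is outside the scope of this sketch.
    """
    total = 0.0
    for distance in BUFFER_DISTANCES_KM:
        total += land_area_km2[distance] / buffer_area_km2[distance]
    return total
```

An island with no land within 100 km and half of each larger buffer covered by land would score 0 + 0.5 + 0.5 = 1.0; lower scores correspond to less surrounding landmass.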

| Island size and topography
Island size information, presented in Table 1 as area, is taken from the publication by Weigelt et al. (2013). They describe island size by using the Database of Global Administrative Areas (GADM) to obtain high-resolution island polygons. Area was calculated for each GADM polygon in a cylindrical equal area projection. Areas for archipelagos (Canary Islands, Bijagós, Cape Verde) were reported here as the sum of all islands in each archipelago (Weigelt et al., 2013). The area for Annobón was obtained from the United Nations Environment Programme (Dahl, 1991). The areas for the Lake Victoria islands (excluding Koome) were taken from the literature (Idris et al., 2016; Lounio, 2014; Zeemeijer, 2012), and the area for Koome Island was approximated using Google Earth's distance and area measuring tool.
TABLE 1 Bioclimatic and isolation index values used for the evaluation of potential island field sites
Note: DD, decimal degrees; UNEP, United Nations Environment Programme; SLMP, surrounding landmass proportion; GMMC, glacial maximum mainland connection, a proxy variable for island geological history which indicates whether an island was connected to the mainland during the last glacial maximum (LGM; 1 = true and 0 = false); - refers to missing or incomplete data. Additional data sources: Bugala (Kayondo et al., 2005; Nambuya et al., 2013; Ssegawa & Nkuutu, 2006; Zeemeijer, 2012); Mfango (Idris et al., 2016); Ukara (Lounio, 2014; Mugono et al., 2014; Smith, 1955); and Koome (BakamaNume, 2010; Google Earth; Jackson & Gartlan, 1965; Nampijja et al., 2015; Tuhebwe et al., 2015).
Elevation maximum and minimum of each island were obtained from the AW3D30 Global Digital Surface Model of the Japan Aerospace Exploration Agency (Japan Aerospace Exploration Agency, 2020). GeoTIFF files were downloaded, and the highest elevation of each island/archipelago was identified. Island topography was further described using the United States National Aeronautics and Space Administration (NASA) 90m resolution elevation data from the Shuttle Radar Topography Mission (SRTM) 90m Digital Elevation Model database (Jarvis et al., 2008). In this case, altitude and magnitude of steepest gradient measurements were used to generate heat maps as graphic descriptors of topography.
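One way to derive a "magnitude of steepest gradient" layer from gridded elevation data is sketched below. It assumes central differences on a regular grid with 90 m cells and treats the gradient magnitude as the steepest slope; the text does not state which algorithm or software was actually used, and the function name is hypothetical.

```python
import math


def gradient_magnitude(dem, cell_size_m=90.0):
    """Slope magnitude (rise over run) at interior cells of an elevation grid.

    dem: list of equal-length rows of elevations in meters.
    Border cells are returned as None because central differences need
    neighbors on both sides.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size_m)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size_m)
            out[r][c] = math.hypot(dz_dx, dz_dy)
    return out
```

A grid that rises 90 m per cell in one direction yields a slope of 1.0 (45 degrees) at its interior cell, which would appear as a uniformly steep patch on a heat map.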

| Population genomics analyses
We conducted a comparative genomics analysis of mainland and island populations of the two target species. The locations and sample sizes per site are provided in Figure 2. In total, 420 individual Anopheles gambiae and A. coluzzii genome sequences were analyzed in this study. The UC Davis Vector Genetics Laboratory (VGL) generated 167 genomes (Table S1). In addition, 196 genomes were obtained from the Anopheles gambiae 1000 Genomes Project phase 2 (Anopheles gambiae 1000 Genomes Consortium, 2020) and 57 were taken from a published Lake Victoria islands study (Bergey et al., 2020).
Individual mosquito DNAs from the VGL samples were extracted using a Qiagen Biosprint (Qiagen) following our established protocol (Nieman et al., 2015).  (Li, 2013) with default settings.
Single nucleotide polymorphisms (SNPs) were excluded if they failed the Ag1000G accessibility mask, had missingness >10%, had a depth below the minimum of 8, or had a minor allele frequency (MAF) <1%. In addition, population structure analysis was based on chromosome 3 SNPs only. This was done to avoid confounding signals from polymorphic inversions on chromosomes 2 and X (Sharakhova et al., 2007). Heterochromatic regions on chromosome 3R (3R:38,988,757-41,860,198; 3R:52,161,877-53,200,684) and 3L (3L:1-1,815,119; 3L:4,264,713-5,031,692) were also filtered out (Sharakhova et al., 2007). The results were grouped by population, and significance tests between the island and mainland populations were performed using a Wilcoxon rank-sum test in R.
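The filtering rules above can be expressed as a single predicate. The sketch below is an illustration only: the `Snp` record, its field names, and the reading of "minimum depth of 8" as a per-SNP depth threshold are assumptions, and the actual pipeline and file formats used in the study are not described here. The thresholds and heterochromatin coordinates are those given in the text.

```python
from dataclasses import dataclass

# Heterochromatic intervals on chromosome 3, as listed in the text (1-based, inclusive).
HETEROCHROMATIN = {
    "3R": [(38_988_757, 41_860_198), (52_161_877, 53_200_684)],
    "3L": [(1, 1_815_119), (4_264_713, 5_031_692)],
}


@dataclass(frozen=True)
class Snp:
    """Hypothetical per-SNP summary record."""
    chrom: str
    pos: int
    accessible: bool    # passes the Ag1000G accessibility mask
    missingness: float  # fraction of samples with a missing genotype
    depth: int          # read depth (assumed interpretation of "minimum depth")
    maf: float          # minor allele frequency


def in_heterochromatin(chrom, pos):
    """True if the position falls inside a listed heterochromatic interval."""
    return any(start <= pos <= end for start, end in HETEROCHROMATIN.get(chrom, ()))


def keep_snp(snp, chr3_only=True):
    """Apply the stated filters; chr3_only mirrors the population structure analysis."""
    if chr3_only and snp.chrom not in ("3R", "3L"):
        return False
    if in_heterochromatin(snp.chrom, snp.pos):
        return False
    return (snp.accessible and snp.missingness <= 0.10
            and snp.depth >= 8 and snp.maf >= 0.01)
```

A call such as `keep_snp(Snp("3R", 1_000, True, 0.05, 10, 0.05))` keeps the site, whereas the same record at position 39,000,000 on 3R is dropped because it lies in a heterochromatic interval.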

| Anthropogenic sources of dispersal
The prospects for a GEM emigrating out of a field trial site into a

| Anopheline species richness
Published compilations of Afrotropical Anopheles species distributions (Irish et al., 2020;Kyalo et al., 2017) were used to assemble the information for mainland and island countries. The first criterion for field site selection is the presence of the target species, which is in our case Anopheles gambiae sensu stricto and/or its sister species

| Identification of potential field sites
We evaluated 22 potential island field sites, including 5 individual islands, multiple islands within 7 archipelagos, and 4 islands within Lake Victoria (Figure 1). The sites identified include three island types: continental, oceanic, and lacustrine. Each type possesses features that impact its utility as a GEM trial site. Continental or landbridge islands are unsubmerged portions of the continental shelf and were, at one time, connected to the mainland. Oceanic islands arise from the ocean floor and were never connected to the mainland.
Lacustrine islands are islands within lakes and are typically formed by deposits of sedimentary rock, as are the Lake Victoria islands.
For comparison, our analyses include mainland sites closest to the islands and those in which GEM field trials are currently under consideration (e.g., Burkina Faso, Mali, Uganda). We then proceed by defining and justifying a prioritized set of criteria (Box 1) on which to base evaluations.

| Geographic isolation
Geographic isolation is among the most significant features favoring islands as GEM field trial sites. Although some mosquito species are known to disperse on prevailing winds over long distances (Huestis et al., 2019;Services, 1997), there are, to our knowledge, no reliable reports of open-ocean wind dispersal of malaria vector species over the distances (hundreds of kilometers) separating some of the oceanic islands under consideration here. Emigration of GEMs out of the field trial site into neighboring, nontarget sites, either on nearby islands or on the mainland, poses a problem, especially as it relates to risk and regulatory concerns. Equally important is immigration of wild-type individuals from neighboring sites into the trial site. Immigration, in this case, will confound efforts to measure GEM invasiveness and could potentially render the gene drive inefficient or even ineffective. Island biogeography theory predicts that choosing a remote island as an initial field trial site greatly reduces the potential for gene flow between vector populations both into and out of the island site. This is further supported by the results of our population genomics assessment, as discussed below.
We evaluated geographic isolation for all candidate islands using distance to mainland, UNEP Isolation Index, and SLMP (

| Island size and topography
There are no well-defined criteria to guide decisions with respect to an appropriately sized area for a GEM field trial. One important consideration is mosquito flight range. To evaluate the dispersal capacity of a GEM, the site should exceed the flight range of the target species. For our considerations, we assumed a maximal daily flight range of 10 km for A. gambiae (Kaufmann & Briegel, 2004). Generally, we aimed to identify sites small enough to be manageable, but large enough to be convincing, keeping the following considerations as a guide.
Area (km²) is an important parameter influencing the biology of populations residing on an island. Large island areas typically include more habitat types and can support larger populations. This characteristic can increase the rate of speciation and lower extinction rates over time (Santos et al., 2016). Using island size as a criterion, we exclude the islands of Annobón and Île Europa for being too small and Madagascar for being too large.
Evaluating the dispersal capabilities of a GEM is a critical outcome from a field trial. This capacity is best evaluated at a site that possesses topographical features that may pose a challenge to dispersal, as would be encountered in continental Africa. Elevation was used as a measure of topographic complexity and as a proxy for environmental heterogeneity. The difference between the elevation maximum and minimum of each island measured from sea level is reported in the "Elevation" column in Table 1. Elevation relates to the number of available habitats because of differences between windward and leeward sites, temperature decrease with altitude, and high precipitation regimes at certain altitudes (Weigelt et al., 2013).
Altitude and magnitude of steepest gradient were used to generate a graphic representation of topography for each island. A representative sample of these analyses for the islands of Grande Comore and São Tomé is presented in Figure S1a,b to illustrate sites having the desired level of topographic complexity and for the islands of Zanzibar and Mafia in Figure S1c,d to illustrate a lack of suitable topographic features. Sites lacking topographic complexity were excluded from consideration; these included the Bijagós Islands, the islands in Lake Victoria, Zanzibar, Pemba, Mafia, and Île Europa.
Taken together, the data summarized in Figure 4 and Figure S2 reveal a high degree of genetic isolation among oceanic islands compared with either continental or lacustrine islands. These results indicate limited dispersal (gene flow) between islands and nearest landmasses, are consistent with expectations based on IBT as described above, and reinforce the benefits of selecting a contained island site for conducting GEM field trials. Genetic data are not currently available for several potential island sites, including the Canary Islands, Cape Verde, Île Europa, Zanzibar, Pemba, and Mafia.

| Anthropogenic dispersal
Anthropogenic dispersal of mosquitoes from inside the release site into nontarget populations and vice versa may occur and should be considered in selecting a field trial site. The level of genetic divergence between island and mainland populations of A. coluzzii and A. gambiae is generally high, suggesting that dispersal off the islands is low. Nonetheless, dispersal that may occur is most likely to rely on anthropogenic conveyance (Belkin, 1962; Services, 1997).
The most significant source of passive anthropogenic dispersal of mosquitoes is by rail and road. This poses significant risk for mainland field sites, where extensive in-country and trans-boundary connections exist (Campos et al., 1961; Eritja et al., 2017; Frean et al., 2014). Risk by this mode of mosquito dispersal is reduced to zero for oceanic island test sites.
Frequency of air and sea departure to interim and final destinations for a sample of mainland and island populations is presented in Figure 5 and Tables S3 and S4. Islands, due to their smaller human populations and geographic areas, generally originate less transboundary air and sea traffic compared with the continent (Figure 5a).
As a result, remote islands that are least connected by shipping have inherently lower risk for these modes of anthropogenic dispersal (Helmus et al., 2014). A notable exception is the Cape Verde archipelago, which has relatively high ship traffic due to its location as a major refueling site (Figure 5b). Traffic has increased with the completion of two new ports and upgrades to existing ports in 1997. Airline and shipping traffic data were obtained only for the locations shown in Figure 5; therefore, the potential for anthropogenic dispersal was not assessed for the majority of island sites. Results for São Tomé and Príncipe and for the Comoros suggest that the likelihood of mosquitoes migrating via anthropogenic means into or out of these islands is minimal.

| Anopheline species richness
The number of primary, secondary, and other (malaria vector status unclear) species present on island sites and select locations on the mainland are illustrated in Figure 4 (and Table S2). It is generally agreed that potential field sites with the fewest number of nontarget Anopheles species are desirable (Brown et al., 2014;James et al., 2020). If multiple sister species or unrecognized mating demes are present, there exists the possibility that the transgene will move between species via natural hybridization (Lee et al., 2013;White, 1971) which could add an additional level of complexity to postrelease assessments.
Although the movement of transgene elements between malaria vector species may be considered desirable, it raises the specter of horizontal transfer, which is generally identified as a risk to this technology (Courtier-Orgogozo et al., 2020).

FIGURE 4 Anopheles species complexity in Africa including island and select mainland sites. Map locations and summary of data presented in Table S2. Cyan circle = total number of Anopheles spp.; blue = proportion of primary vector species; green = proportion of secondary vectors; and yellow = proportion of species identified as nonvector or for which vector status is unknown
among island types yielded results that were consistent with island biogeography theory. Nucleotide diversity in continental island populations did not differ from mainland populations, and lacustrine islands had only slightly lower, but statistically significant, values for π. These observations are expected given the geological history and proximity of continental and lacustrine islands to the coast. Anjouan island populations presented the lowest (0.73%) nucleotide diversity (π) for A. gambiae and Príncipe island for A. coluzzii (0.66%), likely due to their small size and high degree of isolation.
In general, the lower biocomplexity on isolated islands includes reduced genetic variation (Frankham, 1997). Our results are concordant with this observation (Figure 6). Selecting field sites with populations containing the lowest levels of variation should decrease the potential for transgene/genome interactions that might negatively impact GEM performance. These include São Tomé and Príncipe and the Comoros.
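For readers unfamiliar with the statistic, nucleotide diversity (π) in fixed windows can be approximated from biallelic SNP allele counts as sketched below. This is a simplified illustration: it divides by the full window length rather than by the number of accessible sites, and it is not the software or exact estimator used for the results reported here. Function names and inputs are hypothetical.

```python
def site_pi(alt_count, n_chrom):
    """Expected pairwise difference at one biallelic site.

    alt_count: copies of the alternate allele; n_chrom: sampled chromosomes.
    """
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def windowed_pi(snps, window_bp=10_000):
    """Per-base pi in non-overlapping windows.

    snps: iterable of (pos, alt_count, n_chrom) with 1-based positions.
    Returns {window_index: pi}, where window 0 covers positions 1..window_bp.
    Windows without SNPs are omitted from the result.
    """
    sums = {}
    for pos, alt_count, n_chrom in snps:
        window = (pos - 1) // window_bp
        sums[window] = sums.get(window, 0.0) + site_pi(alt_count, n_chrom)
    return {window: total / window_bp for window, total in sums.items()}
```

Distributions of such window values per population are what a boxplot comparison between island and mainland samples would summarize.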

| Selection of candidate field sites
Each potential site was evaluated based on the criteria listed in Box 1. Evaluations were based on information available from the literature or calculated by us as summarized in the narrative above.
Sites that failed to meet one or more primary criteria were eliminated from further consideration. Those sites that met all primary criteria were raised from potential status to candidate status. Some criteria require further analysis or site visits before a final evaluation can be completed. Site visits are recommended for candidate sites only.
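The elimination step described here amounts to a simple screen: a site becomes a candidate only if every primary criterion is met. The sketch below assumes each site's evaluation is recorded as pass/fail per criterion; the criterion labels and site names are placeholders, not the contents of Box 1 or actual evaluation results.

```python
def classify_sites(evaluations):
    """Split sites into candidates and eliminated sites.

    evaluations: {site_name: {criterion_name: bool}}
    Returns (candidates, eliminated), where eliminated maps each rejected
    site to the sorted list of criteria it failed.
    """
    candidates = []
    eliminated = {}
    for site, results in evaluations.items():
        failed = sorted(name for name, passed in results.items() if not passed)
        if failed:
            eliminated[site] = failed
        else:
            candidates.append(site)
    return candidates, eliminated


# Placeholder data for illustration only.
example = {
    "site_A": {"target_species_present": True, "isolated": True, "adequate_size": True},
    "site_B": {"target_species_present": False, "isolated": True, "adequate_size": True},
}
candidates, eliminated = classify_sites(example)
```

Recording which criteria a rejected site failed keeps the justification for each decision explicit, which matches the stated aim of producing a defensible rationale for site choice.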
Evaluation of insecticide resistance should be conducted during
FIGURE 5 Annual departures by air (a) and sea (b) from representative island and mainland locations in Africa. Colors indicate destinations, grouped by geographic region. Air traffic data provided by Cirium*. *This information has been extracted from a Cirium product. Cirium has not seen or reviewed any conclusions, recommendations, or other views that may appear in this document. Cirium makes no warrantees, express or implied, as to the accuracy, adequacy, timeliness, or completeness of its data or its fitness for any particular purpose. Cirium disclaims any and all liability relating to or arising out of use of its data and other content or to the fullest extent permissible by law. Sea traffic data provided by the MarineTraffic Global Ship Tracking Intelligence database
FIGURE 6 Population diversity. Metric is grouped by sampling locations of (a) Anopheles gambiae and (b) Anopheles coluzzii populations from island and mainland (gray boxplots). Boxplot of nucleotide diversity (π) performed in 10 kb windows of euchromatic regions of chromosome 3. The midline in all boxplots represents the median, with upper (75th percentile) and lower (25th percentile) limits, whiskers show maximum and minimum values, and outliers are not shown. Mean nucleotide diversity for set of populations is shown above the boxplots; A. gambiae populations were divided into four groups: mainland continental (gray), land-bridge (pink), lacustrine (yellow), and oceanic (green/blue) islands and A. coluzzii into three: mainland continental (gray), land-bridge (pink), and oceanic (blue) islands. p-value for testing of means between islands and mainland is shown below. Geographic location for each site and numbers of genomes analyzed per site are provided in Figure 2
BOX 2 Overall summary of evaluation of potential island sites.
Overall evaluations are presented in Box 2.
Evaluation of all 22 potential field sites indicates that Bioko, São Tomé & Príncipe, and the Comoros Islands (Anjouan, Grande Comore, Mayotte, and Moheli) can be elevated from "potential" to "candidate" GEM field trial sites. The Mascarene (Mauritius and Réunion) and Cape Verde Islands fit many criteria, but neither Anopheles gambiae nor A. coluzzii occurs on these islands. Annobón scores well on several of our criteria, but travel there was determined to be infeasible, and the island was deemed too small to support a trial that would provide compelling outcomes.
Therefore, we propose the following as the lead candidate sites for a PHASE 2 GEM field trial: the Comoros Islands, São Tomé and Príncipe, and Bioko.

| CONCLUSIONS
Our early decision to consider physical islands as the ideal sites for a GEM field trial was guided by contemporary IBT. This theory provides the basis for certain expectations concerning species richness (in our case, anopheline species richness) and also features such as genetic isolation and diversity. Consistent with IBT, anopheline mosquito species richness was lowest on small, isolated oceanic islands, higher on continental islands, and highest at mainland continental sites (Figure 5). Our results likewise confirm IBT predictions regarding relationships between geographic isolation and both genetic divergence and genetic diversity (Johnson et al., 2000), which are significantly correlated (Figure S3).
The framework described here has been applied by the University of California Irvine Malaria Initiative (UCIMI) as they enter PHASE 2 of GEM research. It is our belief that this comprehensive framework provides identification of site(s) that will maximize the prospect for success, minimize risk, and will serve as a fair, valid, and convincing test of the efficacy and impacts of the UCIMI GEM product, meeting the goal of a PHASE 2 field trial. Furthermore, this process provides a well-reasoned, science-based justification for selecting these sites for GEM field trials and a solid foundation on which to approach ethical, social, and legal considerations with field site stakeholders.

KEYWORDS
Anopheles gambiae, genetic control, islands, malaria, population modification

ACKNOWLEDGEMENTS
The work was supported by grants from the University of California Irvine Malaria Initiative and Open Philanthropy. We thank Dr. William Sharpee for his assistance in the acquisition of the shipping data presented in this paper.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available in NCBI GenBank at https://www.ncbi.nlm.nih.gov/genbank/.
New whole-genome sequence data included in this study are under BioProject ID PRJNA729913. Sequencing read data from previous studies are available under different BioProject IDs (PRJNA607000,