Insights from the rescue and breeding management of Cuvier’s gazelle (Gazella cuvieri) through whole‐genome sequencing

Abstract Captive breeding programmes represent the most intensive type of ex situ population management for threatened species. One example is the Cuvier’s gazelle programme that started in 1975 with only four founding individuals, and after more than four decades of management in captivity, a reintroduction effort was undertaken in Tunisia in 2016, to establish a population in an area historically included within its range. Here, we aim to determine the genetic consequences of this reintroduction event by assessing the genetic diversity of the founder stock as well as of their descendants. We present the first whole‐genome sequencing dataset of 30 Cuvier’s gazelles including captive‐bred animals, animals born in Tunisia after a reintroduction and individuals from a genetically unrelated Moroccan population. Our analyses revealed no difference between the founder and the offspring cohorts in genome‐wide heterozygosity and inbreeding levels, and in the amount and length of runs of homozygosity. The captive but unmanaged Moroccan gazelles have the lowest genetic diversity of all genomes analysed. Our findings demonstrate that the Cuvier’s gazelle captive breeding programme can serve as source populations for future reintroductions of this species. We believe that this study can serve as a starting point for global applications of genomics to the conservation plan of this species.


| INTRODUC TI ON
Human activities leading to habitat loss and fragmentation are one of the main threats to biodiversity (Lindenmayer & Fischer, 2013;Ryser et al., 2019;Tee et al., 2018) and, together with other activities such as poaching, can result in isolation of populations, sharp declines in population size and eventually a population bottleneck. This situation could then trigger an extinction vortex. When a small number of animals are left for breeding, genetic drift becomes stronger and inbreeding increases which can reduce fitness, a process known as inbreeding depression (Blomqvist et al., 2010;Keller & Waller, 2002).
In order to preserve biodiversity and prevent the extinction of endangered species, conservation programmes may intend to establish stable captive breeding populations to provide a demographic and genetic reservoir for wild populations. One of the major longterm goals of captive breeding is the retention of genetic variation to enable future reintroduction efforts (Robert, 2009;Seddon et al., 2007). These events are usually high-risk endeavours, and although some authors have reported successes (Landa et al., 2017;Linklater et al., 2012) particularly for small populations (Van Houtan et al., 2009), others have not (Fischer & Lindenmayer, 2000;Germano & Bishop, 2009). Several factors have been identified to explain past failures including infections (Northover et al., 2018;Robinson et al., 1998), predation (Grey-Ross et al., 2009Moseby et al., 2011), habitat unsuitability (Cook et al., 2010;Sheean et al., 2012) and the use of captive-bred animals as founders (Bremner-Harrison et al., 2004;Robert, 2009). In this latter scenario, loss of genetic diversity and inbreeding can occur during captive management and therefore jeopardize the ability of animals to survive in the wild.
Genetic diversity is central to conservation efforts, and it is thus important to consider to what extent captive breeding programmes can maintain it (Coates et al., 2018;Frandsen et al., 2020;Van Dyke, 2008). Traditionally, one of the most relevant tools for the genetic management in captivity of rare and endangered species has been pedigree analysis. Pedigrees enable the retrieval of relevant genetic information such as the level of inbreeding, whose minimization is a primary short-term goal in such programmes. The inbreeding coefficient (F) indicates the probability that two alleles at a randomly chosen locus are identical by descent (IBD) (Wright, 1922). While there are methods to estimate F from pedigree data (F PED ) to perform demographic and genetic analysis in captive populations (Ballou et al., 2010;ISIS, 2005;Lacy et al., 2012), variations in Mendelian sampling (Hill & Weir, 2011) and pedigree accuracy might result in inaccurate estimates. Moreover, applying molecular genetic techniques to conservation genetics suggests that they yield more accurate estimates of kinship and the degree of inbreeding (Alemu et al., 2021;Galla et al., 2020). A recent study by Kardos et al. (2018) demonstrated that inbreeding is better measured with whole-genome data than by F PED , for instance by calculating realized inbreeding coefficient (F RoH ) based on Runs of Homozygosity (RoH). RoH represent tracts of the genome that are homozygous due to the inheritance of equal haplotypes from a common ancestor (Ceballos et al., 2018) and can give insights into population history and consanguinity (McQuillan et al., 2008;Prado-Martinez et al., 2013;Saremi et al., 2019;Xue et al., 2015). Therefore, F ROH can be used as an alternative to F PED when affordable. However, it has not been until the recent decrease in sequencing costs (Goodwin et al., 2016) and increased availability of genome assemblies for nonmodel species (Farré et al., 2019;Humble et al., 2020) that conservation studies have started to use whole-genome data to study endangered species, a subfield called conservation genomics (Ellegren, 2014;Feng et al., 2019;Gooley et al., 2020;Saremi et al., 2019;Xue et al., 2015).
Cuvier's gazelle (Gazella cuvieri, Ogilby 1841) is a medium-sized antelope species endemic to the mountain and hill ranges of the Maghreb (northwest Africa), with neighbouring ranges in Morocco, Algeria and Tunisia, where it inhabited the Tunisian Dorsal. Its historical range once spanned from the Mediterranean and Atlantic coast to the northern border of the Sahara (El Alami, 2018;Lavauden, 1924). However, the species has declined dramatically since the 1950s (Beudels-Jamar et al., 2006) due to excessive hunting, anthropogenic barriers, feeding competition with domestic livestock and habitat degradation (Attum & Mahmoud, 2012;Beudels-Jamar et al., 2006;Grettenberger & Newby, 1990;Herrera-Sánchez et al., 2020).
Today, only a few small and isolated populations remain (Aulagnier et al., 2015;Beudels-Jamar et al., 2006;Gil-Sánchez et al., 2017;IUCN, 2018 (Moreno & Espeso, 2008). The species is managed under a European Endangered Species Programme (EEP), an intensive type of population management for threatened species in European zoological institutions involved in wildlife conservation. The total EEP captive population (for the last 5-10 years) consists of about 160-170 individuals distributed in seven zoological institutions (Espeso & Moreno, 2019). In the population at La Hoya Experimental Field Station, no inbreeding depression has been reported in previous studies (Ibáñez et al., 2011(Ibáñez et al., , 2013(Ibáñez et al., , 2014Moreno et al., 2015). Breeding management and husbandry practices in La Hoya Field Station are described elsewhere (Moreno & Espeso, 2008). In October 2016, individuals from the Cuvier's gazelle EEP were used as founders for a reintroduction project at Jebel Serj National Park in Tunisia (Moreno et al., 2020). Jebel Serj National Park is part of the Tunisian Dorsal, a mountainous chain running from southwest (close to the Algerian border) to northeast (close to the eastern coastal line 16 miles from Tunis) Tunisia. Cuvier's gazelle inhabited this area up to the 1960s (IUCN, 2018;Moreno et al., 2018). The project used soft-release techniques where the animals were released into acclimatization pens upon arrival for 3 years. During this 3-year period (2017-2019), the reintroduced population was monitored. After the arrival in 2016, all females were distributed in breeding herds and only one adult male was included in each breeding herd (N = 5). The remaining males were housed in individual enclosures for mating in the following breeding seasons (further details in Moreno et al., 2020). Despite the small number of captive Cuvier's gazelles used as founder stock (N = 43, 31 females and 12 males), the population increased more than expected as indicated by the number of females giving birth, the number of offspring born and those surviving to 30 days from the first (2017) to the third (2019) breeding season (Moreno et al., 2020). The same tendency was detected for the offspring recruitment rate at the population level and for population size. Given these promising results following reintroduction, Moreno et al. (2020) suggested that genetic diversity in captive Cuvier's gazelles might be higher than what F PED revealed.
In the present study, our objectives are to (i) evaluate the impact of a Cuvier's gazelle reintroduction event on the genomic diversity of their descendants, (ii) estimate and compare inbreeding levels using both genomic and pedigree data and (iii) determine the genetic differences between gazelles from two populations: one managed and one unmanaged. To accomplish these objectives, we have sequenced the genomes of the first and second cohort individuals born after the reintroduction in Tunisia and compared these to the genomes of the reintroduction founders as well as to an unrelated captive but unmanaged population.

| Study dataset
We included samples from two different Cuvier's gazelle populations. One population is located in Tunisia (Jebel Serj National Park) as part of a reintroduction project initiated in October 2016 from captive-bred founders (Moreno et al., 2020).  Moreno et al., 2020). Female Cuvier's gazelles are fertile from nine months onwards (Moreno & Espeso, 2008); therefore, both founder female gazelles and F1 female gazelles can be progenitors of gazelles born in the second breeding season (F2).
All reintroduced gazelles were ear-tagged for individual identification. In spring 2017 and spring 2018, the first and the second cohort (F1 and F2) of Cuvier's gazelle were born in Tunisia (N = 27 and N = 24 calves, respectively, hereafter named Offspring). To facilitate rewilding and to minimize risk of death (stress, infections, abandonment by mother, etc.), calves were not tagged (see (Moreno et al., 2020) for more details). Therefore, from 2017 onward all descendants remained unidentified.
In contrast, the MOR population had no human intervention in mating decisions and although it is kept in a fenced area, we consider it to be an unmanaged but captive population. These Moroccan individuals were unrelated to those reintroduced and consequently to those animals born in Tunisia. This population was founded in 1997 with seven founders (5 females and 2 males, plus another male some years later) from the Jardin Zoologique de Rabat (Morocco).
We obtained whole blood samples and sequenced a total of 30 individuals from these two populations (Table S1)

| DNA extraction, library preparation and sequencing
We extracted genomic DNA from whole blood samples using either the Qiagen DNeasy Blood and Tissue Kit (Qiagen), the MagAttract HMW Kit (Qiagen) or the GEN-IAL All Tissue DNA kit (GEN-IAL) following manufacturer's protocols (Table S1). Then, we prepared  (Table S1).

| Filtering and variant calling
First, we trimmed the Illumina sequencing adapters and bases with an average quality <20 using Trimmomatic (version 0.36) (Bolger et al., 2014). Raw data quality was assessed with FastQC (version 0.11.7) (Andrews, 2010) before and after trimming. We aligned the sequencing reads to the chromosome-length dama gazelle (Nanger dama) reference assembly (NCBI GenBank accession GCA_019969365.1) using BWA-MEM (version 0.7.12) (Li & Durbin, 2009). This assembly has a total contig/scaffold length of 2.977/3.008 Gb, with contig N50 = 0.271 Mb, scaffold N50 = 156.273 Mb and a contig/scaffold L90 of 13,723 contigs/21 scaffolds. The two species are estimated to have diverged ~6 million years ago (Bibi, 2013) and differ in their respective karyotypes: Cuvier's gazelle has 2n = 32, 33 (females and males, respectively), while dama gazelle has 2n = 38-40 (O'Brien et al., 2006). The dama gazelle used for the reference assembly has 2n = 38. While some chromosomal rearrangements might have occurred since the two species split, reference assemblies of different yet evolutionary close species can be used for genotyping (Galla et al., 2019).

Offspring Founders
Moroccan quality of 30, we used VCFtools (version 0.1.12b) (Danecek et al., 2011). We also removed sites that could not be genotyped in more than 20% of the samples and those that had a minor allele frequency lower than 0.02. To further filter the data, we kept only those scaffolds larger than 45Mb (87.23% of the assembly). Also, to remove sex chromosomes we filtered out regions by (i) their depth of coverage and (ii) their alignment to the cattle X-chromosome (alignment data from Dobrynin et al. (2021)). Specifically, we used MOSDEPTH (version 0.2.6) (Pedersen & Quinlan, 2018) to analyse the coverage in windows. We removed regions in HiC_scaffold_19 (it aligns to the X-chromosome) that showed more than one standard deviation from the average genome coverage and two standard deviations in the remaining scaffolds. After applying these filters, 73.54% of the assembly remained accessible for the analysis, the callable genome.
A final VCF file with a total of 10,197,401 variants was used for all analyses unless specified otherwise.

| PCA and admixture
To study population substructure, we used the smartpca utility from the EIGENSOFT (version 7.2.1) software package (Patterson et al., 2006) and a custom R script (version 4.0.1) to plot the results. We We further filtered the VCF to eliminate positions that had missing data in any of the individuals and converted it to PLINK format using VCFtools (version 0.1.12b) (Danecek et al., 2011). This file was used as an input for the ADMIXTURE program (version 1.23) (Alexander & Lange, 2011) to infer population structure, and we used R and RStudio (RStudio Team, 2020) to plot the results. We used VCFTools (version 0.1.12b) (Danecek et al., 2011) to calculate Fst statistics between the different groups of gazelles although the number of gazelles per group is different.

| Heterozygosity and RoH
We calculated the number of heterozygous positions in the callable region for each scaffold to retrieve the genome-wide heterozygosity of each gazelle. Because the samples have an average coverage of ~7×, we used ANGSD (Korneliussen et al., 2014) to estimate genome-wide heterozygosity from genotype likelihoods and compared it to the estimates from the genotype calling using GATK McKenna et al., 2010). We estimated genomewide heterozygosity following filtering for callability (-sites), using allele frequencies (-doSaf 1), taking genotype likelihoods into account with the SAMtools algorithm (-GL 1) and specifying several filters: discard bad reads (-remove_bads1), use reads where the mate can be mapped (-only_proper_pairs 1), adjust quality for reads with multiple mismatches to the reference (-C 50), adjust quality scores around insertion/deletions (-baq 1), minimum mapping and base qualities of 30 (-minMapQ 30, -minQ 30) include sites with a maximum read length of 30 (-setMaxDepth 30). Genome-wide heterozygosity was calculated from the output using realSFS from ANGSD (Korneliussen et al., 2014).
To assess the distribution and density of heterozygous variants across the genomes, we estimated the number of heterozygous positions in genomic windows of 150 kbp with a 100 kbp overlap for every sample. Note that the heterozygosity and callability measures of two adjacent windows will be highly interdependent on account of their shared number of bps. Window heterozygosity was then defined to be the ratio between the number of heterozygous positions and the number of callable bps for every window.
We used window heterozygosity to define RoH via a Hidden The resulting sample-specific transition probability matrix and emission probabilities were used to segment and classify the window heterozygosity measures. On account of our window size definition, the minimum RoH size produced by this method was 150,001 bps and consisted of at least three adjacent, low-heterozygosity windows.
Because we do not know where the chromosomal rearrangements between dama and Cuvier's gazelle genomes may have occurred and how this would affect RoH discovery, in this study we will be referring to them as putative-RoH. To examine the genomic colocalization of putative-RoH among the studied gazelles, we calculated Jaccard coefficients using BEDTools (version 2.27.1) (Quinlan & Hall, 2010) and plotted them using R (Heatmap from the ComplexHeatmap R package version 3.12) (Gu et al., 2016). Jaccard coefficients measure the ratio of shared base pairs in RoH between pairs of gazelles, to the total number of base pairs in the two individuals minus the shared base pairs.
The site frequency spectrum (SFS) was estimated using ANGSD (Korneliussen et al., 2014) and realSFS using the same filters as for the heterozygosity estimates. Because the number of gazelles varies between groups, we downsampled the Founder and Offspring groups to the same N as the Moroccan gazelles (N = 5). For this, we generated three random subsets of gazelles for each of these two groups excluding the SPA-CAN gazelle from any of the Founder subsets.

| Inbreeding
Within the EEP population, the inbreeding coefficient (F PED ) was calculated from the studbook pedigree including all the gazelles in the breeding programme (Espeso & Moreno, 2019) using PMx software (Lacy et al., 2012). The studbook data assume that the initial founders of the captive population of the breeding programme were unrelated and noninbred, which may not be the case. However, considering information published by Valverde (2003) with details of their several capture locations, assumption of unrelated founders seems to be reasonably safe.
Although we tried to include samples from those founders in Almería in our study using skin specimens from the scientific collection at the Estación Experimental de Zonas Áridas, skins were chemically treated for preservation such a way that made DNA retrieval impossible. To compare whether the inbreeding coefficient obtained with PMx correlated with our genomic data, we estimated F HOM and F RoH two genomic inbreeding coefficients. F HOM was calculated using VCFTools --het (version 0.1.12b) (Danecek et al., 2011). It can take negative values and is based on the observed and expected autosomal homozygous genotype counts (Wright, 1948

| Kinship
We used NgsRelateV2 (Hanghøj et al., 2019), which considers the presence of inbreeding, to estimate relatedness among the 30 gazelles from the study. We also studied relatedness using 25 gazelles excluding the Moroccan individuals, so as to remove any potential population structure between the Moroccan population and the other gazelles. The kinship coefficient of ancestry estimated using NgsRelateV2 is the probability that two alleles are IBD. Next, we compared the kinship coefficient of ancestry from genomic data with the expected relationships from the pedigree. We obtained the pedigree kinship coefficient of all pairs of gazelles from the breeding programme using the studbook data (Espeso & Moreno, 2019) with PMx software (Lacy et al., 2012). The correlation between the genomic and the pedigree kinship coefficients was plotted and estimated with R (ggscatter (cor.coef = TRUE, cor.method = 'pearson') from the ggpubr R package). Also, for the visualization of the complex pedigree, we built a simplified genealogical tree representing the closest familial relationships of the Founder gazelles (N = 13) and two gazelles from the Offspring (the only two for which familial relationships were known) ( Figure S1).

| Population differentiation
We evaluated the genomic diversity and population differentiation through different cohorts of a reintroduced Cuvier's gazelle population originated from an initial ex situ captive breeding programme.
We generated whole-genome sequences for a total of 30 individuals to an average depth of coverage of ~7× (4.77×-14.60×) (Table S1), which has been determined to be sufficient for the purposes of our study (Benjelloun et al., 2019). ( Figure S4).

| Heterozygosity
Genome-wide heterozygosity, measured as heterozygous posi-  Figure S5). The genome-wide heterozygosity estimates computed with ANGSD are slightly lower yet consistent with the GATK results.
Both estimates show a statistically significant correlation ( Figure S6).
When analysing the SFS, we find the Founder gazelles to have more fixed unique alleles compared to the other groups ( Figure S7).  Figure S8). The two gazelles with the highest genomic proportion of RoH are G29 and G26 (47%), from the Moroccan population, and the gazelle with the lowest is G05 (21%), a founder from SPA-ALM (Table S1).
We also analysed RoH with the publicly available software Offspring and Moroccan) ( Figure S10A). After a pairwise comparison between all gazelles, we detect that gazelles G15 and G19 have the highest proportion of shared RoH (Jaccard Coefficient = 0.829, Figure S10B). Most of the RoH are shared by a small number of gazelles, and we do not find a population that has a higher amount of RoH shared by most gazelles compared to private RoH ( Figure S10C).

| Inbreeding and relatedness analyses
We observe no significant correlation between the inbreeding coefficient based on pedigree data (F PED ) and any other genomic inbreeding estimates (F ROH1 , F ROH2 and F HOM ) (Figure 3a). In contrast, all genomic inbreeding coefficients other than F PED have a significant correlation between them (Table S2). All genomic inbreeding estimates indicate that the Moroccan group has the highest inbreeding levels compared to the other groups ( Figure 3b and Table S1).
Kinship analysis confirms that Moroccan gazelles are not related to any of the other gazelles but they show high levels of relatedness amongst themselves ( Figure S11). Overall, the kinship coefficient obtained with NgsRelateV2 and the expected kinship coefficient from studbook data are positively correlated ( Figure S12) although in some cases the observed kinship coefficient is lower than the realized kinship estimated from molecular data (Table S3). The two gazelles in Tunisia with the highest kinship coefficient are G15 and G19 (Offspring F1) (Figure 3c), which also share most of their RoH ( Figure   S10B). If we focus only on the gazelles from which there is pedigree information (all Founder gazelles and Offspring gazelles: G14, F22; Figure S1, Table S3), we observe that the highest kinship coefficient is for the pair between G01 and G11, which do not seem to be close in the genealogy. Nonetheless, the next higher values belong to parent-offspring relationships that can be observed in the genealogy from Figure S1 (G14-G22, G03-G05, G06-G22, G07-G14).

| DISCUSS ION
Genomic approaches are becoming increasingly used by captive breeding programmes to analyse the levels of genetic diversity and inbreeding of captive-bred animals. They can be applied to study the progress of reintroduction programmes on the recovery of an endangered species. To assess the genetic diversity of the cohort born after the reintroduction of 43 Cuvier's gazelles back into the wild, we have sequenced 30 complete genomes, representing the first genomic dataset available from this species. Our study includes a subset of captive-bred gazelles that were reintroduced into Tunisia as founders, as well as some of their offspring. These gazelles are compared to one another and to an unmanaged Cuvier's gazelle population in Morocco.
Despite the increase of genomic studies on nonmodel species (Genereux et al., 2020), there are still many organisms that lack a reference genome or have never been sequenced, such as Cuvier's gazelle. Consequently, for our analyses, we used a chromosome-length assembly of a closely related species, the dama gazelle (Nanger dama). Recent studies have shown the usefulness of employing assemblies from other species for conservation management (Galla et al., 2019) and population genomic analyses (Prasad et al., 2021).
A Cuvier's gazelle population was reintroduced back into Tunisia in 2016 from captive founders with successful results in terms of fitness traits after their local extinction (Moreno et al., 2020). This suggests that, while in captivity, the animals did not accumulate significant amounts of deleterious alleles that would hinder their recovery when released in nature (Moreno et al., 2020). Our findings are in line with previous publications stating that genetic diversity in several endangered species can be increased, by including individuals from different populations (Biebach & Keller, 2012;Frankham, 2008;Rick et al., 2019).
We demonstrate that the captive management at La Hoya Field Station as described in Moreno & Espeso (2008) (see also Moreno et al., 2015) is successful in (i) managing genetic diversity at a population level and (ii) producing individuals retaining significant potential for adapting to a new indigenous environment if used as founders in reintroductions.
We observe that the Moroccan gazelles, which belong to an un- to other genomic inbreeding estimates, such as F ROH . It is known that F PED is based on the expected proportion of the genome that is IBD between two parents, but it does not capture the variation originating from Mendelian sampling and linkage during gamete formation (Hill & Weir, 2011). Also, the studbook of the Cuvier's gazelle conservation programme assumes unrelated and noninbred founders, which might not be the case and can result in underestimated inbreeding estimates (Hogg et al., 2019). We know the exact familial relationships of 15 gazelles, 13 of which belong to the Founder group and 2 of which are F I G U R E 3 (a) Correlation matrix of inbreeding coefficients. F PED has N = 15, and the other estimates have N = 30. Colour intensity and circle size are proportional to the correlation coefficients. Values inside circles indicate correlation coefficients (see Table S2), ***p<0.001.  G16  G17  G18  G19  G20  G21  G10  G06  G13  G22  G14  G12  G03  G15  G23  G24  G25  G07  G04  G02  G01  G05  G08  G11  G09   G16  G17  G18  G19  G20  G21  G10  G06  G13  G22  G14  G12  G03  G15  G23  G24  G25  G07  G04  G02  G01  G05  G08  Offspring born in Tunisia. The other gazelles from the Offspring group were not identified by ear tags or other marks and therefore could not be recorded in the studbook. Yoshida et al. (2020) compared F PED to other inbreeding estimates and suggested that genomic inbreeding estimates may be more accurate, which is in accordance with our results. The fact that, as expected, we do not find a correlation between the inbreeding coefficient based on F PED and the genomic inbreeding coefficients highlights the value of applying genomics to conservation and captive breeding. F PED can be blind to background inbreeding and inaccurate kinship. Although pedigrees are still a good source of information (Galla et al., 2020;Wang, 2016), genomic approaches can help gain a better understanding of the genomic makeup of the species under study and their levels of inbreeding (Saremi et al., 2019).
Sequencing costs have been decreasing since their appearance and have been increasingly applied to the study of endangered species (Genereux et al., 2020 (Valverde, 2003).
In this study, we showed how genomic tools are useful to assess the genomic composition of individuals that are part of an ex situ conservation project and their offspring. In this particular case, we have assessed the consequences of a reintroduction event after it was carried out. The characterization of the genomic status of the founder gazelles and the first offspring born after the reintroduction sets the grounds for future projects of genetic monitoring of the descendants of this initial release. When possible, future conservation plans could go one step further and use genomics not only to evaluate the success after the reintroduction has already been carried out but also to provide further information for the selection of the best-  (Moreno et al., 2020) and by observing no major genetic differences between Founders and Offspring. With this study, we have shown that genomics can aid conservation projects to guide and assess the success of a reintroduction event, particularly by using metrics for inbreeding estimates. We have determined that a captive-bred gazelle population can be a suitable source for reintroduction into the wild.

ACK N OWLED G EM ENTS
We want to thank the conservators, workers and staff of Jebel Serj National Park for their help while in the field. Gerardo Espeso, Abdelkader Jebali and Jesús Benzal also provided field assistance.

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are openly available in the European Nucleotide Archive (ENA) at EMBL-EBI at https:// www.ebi.ac.uk/ena/brows er/view/PRJEB 41985, reference number PRJEB41985.