The burden of anthropogenic changes and mutation load in a critically endangered harrier from the Reunion biodiversity hotspot, Circus maillardi

Anthropogenic impact is causing the decline of a large proportion of species worldwide and reduces their genetic diversity. Island species typically have smaller ranges than continental species. As a consequence, island species are particularly liable to undergo population bottlenecks, giving rise to conservation challenges such as inbreeding and unmasking of deleterious genetic load. Such challenges call for more detailed assessments of the genetic make‐up of threatened island populations. The Mascarene islands (Indian Ocean) present many prime examples, being unusual in having been pristine until first human arrival ~400 years ago, following which anthropogenic pressure was unusually intense. A threatened harrier (Circus maillardi) endemic to the westernmost island of the archipelago is a good example of the challenges faced by species that have declined to small population size following intense anthropogenic pressure. In this study, we use an extensive set of population genomic tools to quantify variation at near‐neutral and coding loci, in order to test the historical impact of human activity on this species, and evaluate the species' (mal)adaptive potential. We observed low but significant genetic differentiation between populations on the West and North‐East sides of the island, echoing observations in other endemic species. Inbreeding was significant, with a substantial fraction of samples being first or second‐degree relatives. Historical effective population sizes have declined from ~3000 to 300 individuals in the past 1000 years, with a more recent drop ~100 years ago consistent with human activity. Based on our simulations and comparisons with a close relative (Circus melanoleucos), this demographic history may have allowed purging of the most deleterious variants but is unlikely to have allowed the purging of mildly deleterious variants. Our study shows how using relatively affordable methods can reveal the massive impact that human activity may have on the genetic diversity and adaptive potential of island populations, and calls for urgent action to closely monitor the reproductive success of such endemic populations, in association with genetic studies.


| INTRODUC TI ON
Human activity impacts natural environments by promoting habitat loss and fragmentation, introduction of invasive species and hunting.Accordingly, human activity is driving a massive global extinction event, sometimes described as the sixth mass extinction (Ripple et al., 2017).Anthropogenic changes also impact biodiversity through a loss of genetic diversity within species and can cause the accumulation of deleterious variants (i.e. mutation load) in small, inbred populations.Declining, isolated populations may suffer from an increase in inbreeding, leading to inbreeding depression and to the exposure of recessive deleterious mutations in the homozygous state (Keller & Waller, 2002;Laenen et al., 2018;Lynch et al., 1995).Such exposure can lead to a reduction of the population's average fitness, further reducing population size, even when external anthropogenic pressures are reduced through conservation measures (the "extinction vortex"; Blomqvist et al., 2010).Making generalizations about mutation load during the extinction vortex remains difficult since the number of conservation genomic studies available is small, albeit growing rapidly.It appears that while an important role for mutation load applies to a significant number of cases (e.g.Feng et al., 2019;Jackson et al., 2022;van der Valk et al., 2019), there also exist exceptions.In particular, the accumulation of deleterious variants (mutation load) is sometimes avoided through purging of homozygous deleterious mutations, which may occur if the deleterious effects are sufficiently strong, and the population survives long enough for selection to efficiently eliminate exposed deleterious variants (Glémin, 2003;Robinson et al., 2018).Based on available studies, it can also be postulated that purging is particularly likely when the population under study has only diverged relatively recently from its closest relative (within the last 10,000-140,000 years for all cases including hunter-gatherers, Montezuma quails, and the alpine ibex; Grossen et al., 2020;Lopez et al., 2018;Mathur & DeWoody, 2021;Ureña et al., 2018) and furthermore when it has potentially been through multiple bottlenecks as a result of establishing in and being confined to an insular environment (e.g.gorillas in high altitude montane forest, Xue et al., 2015; kakapos on an offshore island, Dussex et al., 2021).
Island taxa are particularly sensitive to anthropogenic change.
Such species have less available habitat and often have to cope with high urbanization rates.In birds, about 90% of recently extinct clades were endemic to islands (Myers, 1979), and there have been a higher number of species extinctions on islands than on continents (Smith et al., 1993).Island species also usually display lower historical effective population sizes, which may reduce the efficiency of purifying selection against mildly deleterious variants over evolutionary time (Leroy et al., 2021).Addressing these potential issues requires a detailed understanding of the genetic make-up of endangered island populations, as well as the evolutionary processes that may underlie their current diversity.Informing conservation programmes by such studies is critical to protect this major component of biodiversity (Hoban et al., 2020) but has rarely been performed on birds outside the United States (Pierson et al., 2016).
The Reunion harrier (Circus maillardi, Maillard, 1863) is the only endemic raptor living on the island of Reunion (Indian Ocean).Locally named papangue, it is currently considered a separate species from its close relative, the Madagascar harrier (Circus macrosceles; Safford & Hawkins, 2013; Oatley et al., 2015), even though the Reunion lineage only diverged recently, an estimated 100,000 years ago (Oatley et al., 2015).This divergence time is similar to the previously cited cases of 10,000-140,000 years, including those of the mountain gorilla and kakapo (Dussex et al., 2021;Xue et al., 2015).Reunion is typical of oceanic islands in being relatively small in area (2500 km 2 ), even if it is unusually topographically diverse.Despite its current name, the Reunion harrier originally occurred naturally on the two main islands of the Mascarene archipelago (Mauritius and Reunion) but is now extinct on Mauritius (Mourer-Chauviré et al., 2004).
With about 200 breeding pairs (Augiron, 2022), it is classified as Endangered (BirdLife International, 2021).The Reunion harrier has experienced multiple anthropogenic pressures that are typical of oceanic islands worldwide, even if first human arrival was unusually recent (400 years).Since then, there has been unusually rapid habitat destruction, fragmentation, and modification (Strasberg et al., 2005;Lagabrielle et al., 2009), as well as the introduction of numerous invasive species (Baret et al., 2006).
Population genomic approaches may be helpful for informing conservation strategies and addressing resilience to human impact.New opportunities for cost-effectively sequencing thousands of singlenucleotide polymorphisms [SNPs; e.g. through RAD-sequencing (Hohenlohe et al., 2010) or Genotyping-by-Sequencing (GBS; Elshire et al., 2011)] now provide unprecedented opportunities to estimate the genetic diversity of populations, their past demographic history, and relatedness of individuals.Such information is critical for conservation programs which need to estimate inbreeding or the impact of historical human activities on populations.In addition, sequencing of coding regions through RNA or whole-genome sequencing can be used to quantify the abundance and frequency of variants with putatively deleterious effects.Such approaches have been successfully applied to the study of critically endangered populations, such as alpine ibex or mountain gorillas (Grossen et al., 2020;Xue et al., 2015), showing purging of deleterious mutations but also an accumulation of mildly deleterious mutations in bottlenecked populations.These findings suggest that recent bottlenecks induced by human activity may have a negative impact on populations by failing to purge a large portion of exposed deleterious variants.
In this study, we used a combination of GBS and RNAsequencing to address the effects of human impact on neutral and functional genetic diversity, using data from 61 Reunion harriers.Expert estimates range between 500 and 800 individuals, counting 175 to 275 reproductive pairs (Augiron, personal communication).Our sample therefore represents about 8%-12% of the global population.We first tested for the existence of any population structure within Réunion that may be caused by geography or environmental barriers.We also obtained relatedness between individuals to estimate inbreeding.Second, we tested how recently the Reunion harrier's population has declined, and compared demographic history to the known and unusually recent history of first human colonization.Third, we used models of demographic history and SNP variants called from RNA-sequencing data to test whether our data conform with the expectation of purging of deleterious variants from the population, given the similarity of the Reunion harrier lineage with other such cases in being insular and only having diverged relatively recently from a much larger population elsewhere.

| Sampling and sequencing
A total of 61 individuals (31 males, 30 females) were selected for sequencing.Individuals were captured between June 2016 and January 2020 using claptraps, pole traps, or whoosh nets, and marked with rings and wing tags.DNA was extracted from claws (collected from animals found dead or critically injured) and blood samples by LGC Sequencing failed for one individual (#59), which was removed from subsequent analyses.Adapters were removed and reads trimmed to a uniform length of 140 bp using cutadapt v3.4 (Martin, 2011).Reads were trimmed to ensure compatibility with the Stacks pipeline (see below), which does not accept variable-length reads.
In brief, this pipeline aligns the forward reads covering any given GBS locus and generates a consensus for each locus in each individual (ustacks algorithm).These loci are then linked across all individuals to create a catalogue of shared loci (cstacks).Reads are then mapped back on assembled loci for each individual (sstacks), and the reverse paired reads are used to elongate the consensus sequence for each locus (tsv2bam and gstacks).Finally, the populations module can be used to call genotypes for each individual in Variant Call Format (VCF; Danecek et al., 2011).Assembly is controlled by two main parameters, M and N. M corresponds to the number of mismatches allowed between reads belonging to the same GBS locus in a given individual, due to either sequencing errors or heterozygosity, while N controls the maximum distance between consensus sequences obtained for each individual when building the catalogue.Small values of M and N may lead to splitting a single polymorphic locus into two, while large values may incorrectly assemble two paralogue loci into one, generating artefactual heterozygous genotypes.We used the 80% rule to determine the best M and N parameters for assembly and variant calling (Paris et al., 2017)-that is, we examined the number of polymorphic GBS loci and SNPs called in at least 80% of individuals, using M = 1, 3, 5, and 7. We also changed the N parameter to take the values M − 2, M, and M + 2. The final VCF produced by populations was further filtered to only include genotypes with an individual sequencing depth of at least 6×, a minimum genotype quality of 20, and a p-value >.0001 for Hardy-Weinberg equilibrium, testing for excess heterozygosity to remove putative paralogues.We chose for our population genetics analysis the set of parameters maximizing the total number of genotypes produced with less than 20% missing data (see Section 3).

| Population structure
To characterize population structure, we performed a clustering analysis using discriminant analysis of principal components (DAPC) in the R package adegenet (v2.1.2;Jombart et al., 2010).DAPC first decomposes the variance in the dataset into principal components (PC) and then performs a discriminant analysis on these PC to identify the most likely genetic clusters.We selected the clustering model with the highest support using Bayesian Information Criterion (BIC) and retained the first principal component that explained about 8.5% of the total variance and two of the linear discriminants.These numbers were chosen through a cross-validation procedure that suggested perfect assignment to clusters with a 0% mean-square error (Jombart & Collins, 2015).We also used ADMIXTURE v1.3.0 (Alexander & Novembre, 2009) to assign individuals to genetic clusters.A 20-fold cross-validation procedure was used to identify the number of clusters (K) best fitting the data.The model showing the lowest cross-validation error estimate was chosen as the best one.Spatial projection of admixture proportions was obtained using the tess3r package in R (Caye et al., 2016).
To identify possible barriers to gene flow in a spatial context, we used a stepping-stone population genetics model-Estimating Effective Migration Surfaces (EEMS)-that identifies local deviations from the expected patterns of isolation-by-distance in a twodimensional grid (Petkova et al., 2015).The program implementing the model outputs a map showing local genetic diversity, as well as corridors and barriers to gene flow.To run the program, we first used default parameters and ran three independent MCMC chains for five million generations, sampled every 10,000 generations.We adjusted parameters so acceptance proportion would be in the 20%-40% range at the end of the run.We did not observe substantial differences in the output from different runs.The first million iterations were discarded as burn-in.
To test whether dispersal was correlated with environmental variation, we fitted an isolation-by-resistance model to our data and compared it with a pure isolation-by-distance model.We used the R package radish (Peterman & Pope, 2021) to fit mixed effect (so-called MLPE) and generalized Wishart models and used monthly data obtained from the French Meteorological Office (Météo-France, Toulouse) for temperatures and precipitations as a proxy for environmental variation.The data summarize measurements taken between 1990 and 2010 and projected onto a 132 m × 132 m grid.We reduced the complexity of these data by performing a principal component analysis using the R package Rstoolbox (https:// cran.r-proje ct.org/ web/ packa ges/ RStoo lbox/ index.html).Genetic divergence between individuals was measured as the Hamming's distance using the bitwise.dist()function in popr (Kamvar et al., 2014).

| F statistics and relatedness
We estimated inbreeding index (F IS ) and fixation index (Weir and Cockerham F ST ) within and between clusters using the R package hierfstat (Goudet, 2005).We tested whether these statistics were significantly different from 0 by examining the distribution of 1000 bootstrapped values.
We used ngsRelate v2 (Hanghøj et al., 2019) on called genotypes to examine two statistics of relatedness: the coefficient of coancestry Θ and the KING robust relatedness statistic (Waples et al., 2019).
The latter should be more robust to the effects of population structure, while the first one should be more robust to inbreeding.We also estimated the fraternity coefficient to identify putative siblings (Ackerman et al., 2017).

| Demographic analyses
Population genetics inference of demographic parameters based on the frequency spectrum can be sensitive to the inclusion of closely related individuals.To alleviate this potential issue, we excluded all putative first-and second-degree relatives [Θ < 0.0884, the lower bound of the confidence interval for the KING estimate of relatedness expected between second-degree relatives (Manichaikul et al., 2010)].Individuals with the lowest proportion of missing data were kept.After this procedure, there remained 20 individuals for our final analysis.We used the same filtering parameters for this subset of individuals as above.We reran the population program in stacks to estimate the number of invariable sites, which is needed to obtain demographic estimates based on mutation rates.Furthermore, we used the easySFS Python script (https:// github.com/ isaac overc ast/ easySFS) to project down populations to seven and five diploid samples for Northern and Southern clusters, which maximized the number of segregating sites that could be used for demographic inference after projection.
We used the program stairway v2 (Liu & Fu, 2020) to explore variation in effective population sizes over time.This likelihood algorithm uses the allele frequency spectrum of polymorphisms in each population to infer past changes in effective population sizes.We used the default recommended parameters and excluded singletons from the analysis to reduce potential biases due to errors in SNP calling.To obtain demographic estimates of population sizes and times, we assumed a generation time of 5.6 years (±0.76 years, Fay et al., in prep, Steve Augiron, personal communication) and a germinal mutation rate of 1.3 × 10 −8 mutation/generation/site (Smeds et al., 2016).
We also examined recent changes in population size (over the last 30 generations) using GONE (Santiago et al., 2020), a method based on Linkage Disequilibrium (LD) between SNPs along the genome.It has been successfully applied on domesticated breeds with high relatedness, and we included all individuals in this analysis to maximize power.We used the chromosome-level assembly of Accipiter gentilis (August et al., 2022) to align GBS reads using BWA MEM v0.7.17 (Li & Durbin, 2009) and called reads with freebayes v 1.3.5 (Garrison & Marth, 2016).The raw VCF file was filtered using the same filtering criteria as for de novo GBS assembly.We required an individual sequencing depth of at least 6×, a minimum genotype quality of 20, and a p-value >.0001 for Hardy-Weinberg equilibrium, testing for excess heterozygosity to remove putative paralogues.We also removed variants called sex chromosomes.We used default parameters and assumed a recombination rate of 2 cM/Mb (a rate representative of Neoaves [Malinovskaya et al., 2018]) and a generation time of 5.6 years to obtain demographic estimates.

| RNA-sequencing, variant calling, and annotation
In order to identify coding mutations with confidence, assess mutation load, and obtain information about the effects of purifying selection, we obtained transcriptomes for five individuals.Blood collected in the field and stored in RNAlater was sent to Genoscreen (https:// www.genos creen.fr/ en/ ) for library preparation and sequencing.RNA was extracted from blood samples, and ribosomal RNA was removed using a FastSelect-rRNA HMR kit (QIAGEN) following manufacturer's instructions.Libraries were prepared using a TruSeq Stranded mRNA sample prep kit and sequenced paired-end (2 × 150 bp) on an Illumina platform.
To facilitate downstream analyses and annotation of polymorphisms, we aligned the resulting reads on the annotated genome of a close relative species, the European Sparrowhawk Accipiter gentilis (August et al., 2022).Read quality was examined using FASTQC version 0.11.9 (https:// qubes hub.org/ resou rces/ fastqc), and adapters were trimmed using TRIMMOMATIC version 0.39 (Bolger et al., 2014), with the following options: LEADING:3 TRAILING:3 MINLEN:30 SLIDINGWINDOW:4:20.Cleaned reads were aligned on the reference genome using STAR version 2.7.9 (Dobin et al., 2013), accepting up to 15 mismatches between a read and the reference to take into account evolutionary divergence (option -outFilterMismatchNmax 15).To obtain insights about the possible role of small population size on mutation load, we aligned reads obtained from the transcriptome of another Circus species with a continental distribution and not currently endangered, Circus melanoleucos (Short-Read Archive identifier: SRR3203217).We chose this species as being the closest relative for which genome-wide data were available at the time of analysis (Oatley et al., 2015).Mapping quality was good for all samples, with ~80%-85% of reads with a unique match on the reference genome.
We note that recent benchmarking confirms that GATK used with joint genotyping is appropriate for RNA-seq (Brouard et al., 2019).
We used BCFTOOLS to apply hard filters on raw polymorphisms, excluding sites with a quality Q < 30, a Fisher strand bias statistic FS > 60.0, a symmetric odds ratio statistic SOR >3, a mapping quality MQ < 40, a mapping quality rank sum test MQRankSum < −5.0 and MQRankSum > 5.0, a quality by depth statistic QD < 2.0, and ReadPosRankSum < −4.0.We removed individual genotypes with a quality (GQ) lower than 20 and an individual depth of 8×.Based on the distribution of total depth at all loci, we removed sites with a total sequencing depth higher than 500× and with any missing Circus maillardi individual.We also removed variants called on sex chromosomes and short scaffolds not mapped to any chromosome.

| Annotation and estimates of mutation load
To compare the effects of selection on diversity at neutral and constrained sites, we annotated polymorphisms using SNPEff (Cingolani et al., 2012) with default parameters.This program classifies polymorphisms as coding or non-coding and predicts their effect on the amino-acid sequence.We first built a database using the GTF annotation file for the Accipiter gentilis genome (version bAccGen1.1).
We classified polymorphisms as having low effects (synonymous polymorphisms and mutations in untranslated regions), medium effects (non-synonymous polymorphisms resulting in an aminoacid change), and strong effects (premature stop codons).Because Accipiter gentilis is an outgroup to both Circus species included in the analysis, we considered the reference allele in Accipiter gentilis as the ancestral allele when drawing derived allele frequency spectra.
To compare our results with other studies on mutation load (e.g.Leroy et al., 2021), we estimated π N /π S , which is the ratio between nucleotide diversity at synonymous (π S ) and at non-synonymous sites (π S ), adapting scripts (https:// github.com/ Quent inRou gemont/ PiNPiS) from previous studies (Leroy et al., 2021;Rougemont et al., 2020).When a given gene had several alternate transcripts/ coding sequences, we only kept the longest, although estimates did not drastically change when computing the statistics over all transcripts.
Mutation load (i.e. the total reduction in fitness due to deleterious variants) can be assessed through three summary statistics: the number of derived alleles found in each individual across all loci, the count of homozygous-derived mutations for each individual, and the R xy statistics, which compares load between populations.The latter is the ratio between the number of derived mutations in a genome from population X that are not seen in another genome from population Y, and the number of derived mutations from population Y not found in population X, averaged through all possible pairwise comparisons.Note that here, we consider individuals as populations with two haploid genomes.This statistic is standardized (R' xy ) by the statistic observed at a set of putatively neutral markers to correct for past demography.Based on allele frequency spectra and values for the other statistics, we normalized using R xy values obtained for variants in untranslated regions.

| Simulations of mutation load
To determine the effect of historical changes in effective population sizes on mutation load, we ran simulations using the forwardin-time simulator SLiM v3.6 (Haller & Messer, 2019).We simulated a 9 Mb bird exome, with 150 bp non-recombining exons with a mutation rate of 1.3 × 10 −8 mutation/generation/site (Smeds et al., 2016).
We assumed a recombination rate of 3 × 10 −8 events/generation between exons (Kawakami et al., 2014) that we multiplied by the average length of introns (ca 20 kb) in the total genome (estimated at 1.2 Gb), reflecting the actual recombination rate between exons.Selection coefficients (s) were drawn from a gamma distribution with a mean of −0.05 and shape parameter of 0.2, which produced variants with a broad range of selective coefficients that could be used for comparative analyses.We set deleterious mutations as fully recessive.We modelled a population under a Wright-Fisher model and following the bottleneck history inferred by the stairway analysis for the North-Eastern cluster, from where most (four out of 5) of our RNA samples originated.We ran 500 simulations and calculated the same summary statistics for mutation load as above, but compared the frequencies of deleterious variants in the extant simulated population with the frequencies observed just before the bottleneck.To mimic our sampling scheme, we sampled five diploid individuals before and after the bottleneck.

| GBS data quality control
The largest number of polymorphic SNPs with less than 20% missing data was obtained for M = 5 and N = 7, corresponding to a maximum divergence of 5% between alleles at each locus.This dataset included 7955 polymorphic SNPs.Higher values of N led to a decrease in the total number of called genotypes, due to a larger proportion of loci displaying high heterozygosity before filtering (Figure S1).This probably reflects a larger proportion of paralogues incorrectly assembled as a single locus for higher values of N.

| Population structure
We used a DAPC (Jombart et al., 2010)  This division into two clusters was already visible in the principal component analysis (PCA, Figure 2).Differentiation between the two groups was moderate but significant (F ST = 0.08, p < .001).There were no fixed differences between the two groups.The distribution of individuals belonging to the two clusters was not random and reflected a geographical separation between a North-Eastern cluster and a Western cluster, although clear long-distance dispersal events could be observed, with individuals assigned to the North-Eastern cluster found in the Western range and vice versa.These clusters roughly correspond to the windward and leeward coasts, respectively, with the central volcanic range possibly acting as a barrier between the two.This spatial differentiation was confirmed by the EEMS analysis (Figure 3a), revealing a barrier to gene flow between the East and the West of the island, and lower genetic diversity in North-Eastern individuals (local reduction reaching up to 10 0.10 ~ 20% compared to the average rate for the whole island).
Higher F IS was observed in the North-Eastern cluster (F IS = 0.12, p < .001)compared to the South-Western cluster (F IS = 0.05, p < .001),consistent with inbreeding.
To explicitly test whether environmental differences between the leeward and windward coasts were associated with differences in connectivity and genetic structure, we compared models of pure isolation-by-distance with models of isolation-by-environment.
We obtained data from the French Meteorological Office (Météo-France, Toulouse) summarizing monthly average temperatures and precipitations over the 1980-2010 period.We chose these data since they clearly summarize both current and historical environments across Réunion at a high resolution (~132 m).We performed a Principal Component Analysis on these data, with the first axis (PC1, 58% of variance explained) clearly correlated with temperatures and elevation, while PC2 (39.7% of variance explained) was correlated with precipitations (Figure 3b).The models incorporating climatic PC as resistance variables outperformed a pure isolation-by-distance models (Table 1), with a negative correlation between PC values and conductance.The fitted conductance surfaces suggest less resistance to dispersal in the leeward zone compared to the windward zone, consistent with other analyses (Figure 3b).

| Geographical structure of recent dispersal and pairing
We investigated fine-grained genetic structure by examining pairwise relatedness between samples, aiming at identifying close relatives.Relatedness statistics were correlated, but Θ was generally F I G U R E 1 Spatial interpolation of DAPC and ADMIXTURE assignation scores for a model with two clusters.The colour of each dot corresponds to the assignment probability to the North-Eastern cluster.
higher than the KING robust statistics (Figure S2).This may be due to the robustness of the KING statistic to population structure.On the other hand, the KING estimator may be sensitive to inbreeding and underestimate relatedness between inbred relatives (Ackerman et al., 2017).To robustly identify close relatives, we define firstdegree relatives as having Θ ≥ 0.25 and KING robust ≥0.125, and second-degree relatives as having Θ ≥ 0.125 and KING robust ≥0.0625.
We identified ten pairs of putative first-degree relatives in our dataset (Figure 4).Among these ten pairs, two may be siblings, as shown by their fraternity coefficients equal or larger than 0.25 (individuals #22 and #23, #12 and #13).These ten pairs were usually found in close geographical proximity, apart from a few events of long-distance dispersal which remained within the two main geographical areas (windward and leeward coasts) identified by population structure analyses.Including putative second-degree relatives, we could observe more pairs of related individuals spread over long distances (Figure S3).Most related individuals could be placed on a NW-SE axis, consistent with the possible barrier effect of the central mountain range.
To remove putative relatives we only kept pairs of individuals with Θ < 0.088 (N = 20, Figure S4), which is around the lower level of kinship-coefficient values that would be expected for most second-degree relatives, using the kinship statistics giving the highest kinship estimates.This operation did not result in significant changes in terms of population structure inference, as shown by results from PCA, DAPC, and EEMS analyses performed on this subset (Figures S5-S7).As expected, removing individuals with high relatedness led to a decrease in F statistics, slightly reducing both F IS (all individuals F IS = 0.06, North-Eastern cluster F IS = 0.10, Western cluster F IS = 0.02, all p < .001)and F ST statistics (F ST = 0.05, p < .001).
Nevertheless, we still observe a higher F IS coefficient for the North-Eastern cluster, suggesting population fragmentation and mating with close relatives.

| Evidence for recent reduction in effective population sizes
Demographic analyses based on the allele frequency spectrum (stairway v2) all show a recent decline in effective population sizes, from an ancestral size of ~3000 diploid individuals to contemporary sizes of around 300 individuals (Figure 5), consistent with current census size estimates of reproductive pairs on Reunion.This decline is more pronounced in the North-Eastern population, which also displayed a higher F IS , consistent with accelerated drift due to population fragmentation and mating with close relatives.The decline in population size is estimated to start about 1000-2000 years ago, followed by another steep decline approximately 100 years ago.
We emphasize that these estimates strongly depend on the choice of generation times and mutation rates.However, these trends are consistent with an important role of anthropogenic activity since first human arrival 400 years ago in accentuating a decline that may have begun in pre-human times.Our analysis based on LD (GONe) suggests an even more recent decline over the past 30 generations (~170 years).We note that the relatively small size of our dataset (7727 autosomal SNPs called using the A. gentilis reference) probably limits accuracy of the LD method at older times, and therefore this result merits further confirmation with whole-genome data.
Nonetheless, a purely post-human decline cannot be ruled out based on these results alone.

| Coding variation and mutation load
We examined the frequency spectra of polymorphisms with no missing data in Circus maillardi coding regions.For this one analysis, we excluded individual DA252517 which did not cluster with other F I G U R E 3 (a) Map of migration rate (left panel) and diversity (right) surfaces inferred by EEMS.Red colours correspond to locally low migration rates/diversity, while blue colours correspond to regions of higher migration rates/diversity.Migration was calculated over 500 putative demes in a stepping-stone lattice.(b) Maps of the first two principal components for 24 climatic variables (monthly average temperatures and precipitations) projected on Réunion.The second row shows the conductance models fitted using the two PC values as explanatory variables.Higher conductance suggests lower resistance to gene flow and dispersal.
Circus maillardi individuals on a PCA based on SNPs called from exonic regions (Figure S8).Given their geographic location, we suspect the four other individuals to belong to the North-Western cluster (Figure S8), while DA252517 belongs to the Western cluster.We examined four main classes of mRNA variants: those found in untranslated regions (UTR), variants of low effect (synonymous mutations), medium effect (non-synonymous mutations), and high effect (premature stop codon).Variants of high effect showed a clear excess of rare variants compared to all other categories (Figure 6a).On the other hand, this excess was much less pronounced for variants of medium and low effects.This suggests that strong purifying selection has acted on variants of large effects, lowering their average frequency in the population.We also computed π N /π S , a proxy for mutation load, over concatenated coding loci, obtaining a similar value when including all individuals (π N /π S = 0.245, π S = 3.38 × 10 −4 ) or when excluding DA252517 (π N /π S = 0.243, π S = 3.34 × 10 −4 ).
To test whether the recent population decline had a significant impact on mutation load, we simulated deleterious variants under TA B L E 1 Summary of models of isolation-by-resistance and isolation-by-distance.a bottleneck scenario similar to the one inferred in the Reunion harrier.We first compared the number of derived alleles found in five individuals sampled before and after the bottleneck (Figure 7).
There was a slight increase in the number of alleles with low and medium effects, while the number of derived alleles was generally lower for variants of large effect, though variance was quite high across simulations for the latter category.We assigned variants to low, medium, and high effect categories, corresponding to 2 × Ne × s values in the modern population being lower than −0.1, comprised between −0.1 and −1, and lower than −1, respectively.We also observed for all categories of variants an increase in homozygosity, with a ~2.5× average increase for variants of large effects.This is in line with the bottleneck increasing average allele frequencies and homozygosity.Lastly, for most simulations, the R' xy statistic decreased as the deleterious impact of variants increased.
Patterns observed in the empirical data when comparing allele and genotype frequencies between C. maillardi and its relative, C.

| DISCUSS ION
In this study, we tested whether the genetic make-up of a critically endangered island raptor has been altered by recent human colonization through population decline and subsequent accumulation of deleterious variants due to accelerated drift.Our approach and findings are also relevant in evaluating whether generalizations can be made concerning decline in many other threatened species including those on other islands that are likely encountering similar challenges.

| Landscape shapes population structure even at a small spatial scale
We identified two genetic clusters corresponding to two distinct ecological regions on Reunion that are separated by a volcanic range with a large and high plateau separated by two peaks, the highest of which exceeds 3000 m elevation.Such a pattern is remarkable given the comparatively limited structure in other birds of prey such as ospreys that have a much broader repartition (Monti et al., 2018).We could observe some events of long-distance dispersal, but pairs of closely related individuals usually clustered together in the two main geographical regions considered.Reunion harriers are highly philopatric.Based on GPS tracking data and wingtags controls, maximal distances crossed by adult animals are 5.7 km on average for males (min-max = 3.4-9 km), and 6.7 km for females (min-max = 2.5-10.4km; Augiron, 2022).Juvenile dispersal data are not only scarcer but also point towards high philopatry.Recorded distances between fledging and first reproduction sites were 250 m and 1.85 km for two males, and 5.2 km for one female (Augiron, 2022).We also note that the Western cluster has a lower F IS compared to the North-Eastern cluster.This could be explained by a more open habitat promoting long-distance dispersal and mating.These results are consistent with observations in other endemic Reunion taxa.For example, in the Grey White-Eye (Zosterops borbonicus), distinct parapatric forms are separated by the central volcanic range, and display clear phenotypic differences between North-Eastern and Western coasts (Bourgeois et al., 2020;Milá et al., 2010).This species also shows evidence of high philopatry that may be responsible for the maintenance of parapatric, phenotypically distinct forms (Bertrand et al., 2014;Bertrand et al., 2016;Delahaie et al., 2017).High philopatry is also associated with significant genetic structure at extremely small geographical scales in the endangered Barau's petrel, Pterodroma baraui (Danckwerts et al., 2021).Comparable patterns are observed in an invasive bird species, the Red-whiskered bulbul, to test with the present data.Nevertheless, conservation efforts need to be refined to take into account differences in population dynamics between these clusters.

| Human impact on population decline
Our models suggest a population decline starting more than 1000 years ago, with a second steeper decline starting approximately 100 years ago.This is consistent with a pre-human initiation to decline, which was later accentuated by anthropogenic activity, following human arrival and first settlement that only began around 400 years ago.We emphasize that our estimates strongly depend on the mutation rate chosen for conversion into demographic units.
For example, assuming a mutation rate per generation about three times higher would place most of the observed decline in the last 500 years.Moreover, LD-based demographic analyses support a decline from ~1000 to ~100 breeding individuals in the past 200 years.
Nevertheless, we note that there is evidence for decline predating human arrival in other endangered species on Reunion.For example, population decline may have started as early as 5000 years ago in the nearly extinct Reunion cuckoo-shrike Coracina newtoni (Salmona et al., 2012).There is evidence for climate change and explosive volcanic activity in Reunion and the Madagascar region during this period, which may have resulted in rapid environmental changes and a reduction of suitable habitat for harriers (Quéméré et al., 2012;Staudacher & Allègre, 1993).
Assuming a pre-human initiation to the observed decline, the overall decline to current population size is nonetheless extremely occupation of the island (Cheke, 1987;Mourer-Chauviré et al., 1999;Probst, 1997).Introduction of cats and rats by humans in the seventeenth century (Cheke, 1987) has caused the extinction of several bird species on the island (Cheke & Hume, 2008;Drake & Hunt, 2009;Probst & Bial, 2002).Moreover, widespread diffusion of insecticides and rodenticides during the second half of the twentieth century (Julvez et al., 1990) has been found to be associated with dangerously high concentration of toxic compounds in Reunion harriers (Coeurdassier et al., 2019).Our results support effective population sizes being ten times larger 5000 years ago than today, and an at least fourfold decline in the last 350 years corresponding to human arrival and major anthropogenic environmental change.This is well in the range of recent meta-analyses that suggest a 27.6% average decline in genetic diversity for island species since the industrial revolution (Leigh et al., 2019).

| How likely is purging?
In addition to human impact, declining populations can accumulate mutation load, which may impair their ability to recover from anthropogenic disturbances.Deleterious variants are often recessive, causing loss-of-function which can be compensated by the ancestral allele.While this may shelter them in large populations, they are more likely to be purged in small inbred populations.Such purging may contribute to the survival of populations and avoid the so-called "extinction vortex" (López-Cortegano et al., 2021;Pérez-Pereira et al., 2022;Ralls et al., 2020).Purging has rarely been shown in natural populations.Concerning variants of largest effect, our results concord with expectations based on findings in other cases in which the bottlenecked population has diverged only recently from a larger population elsewhere, and in which it is likely or possible that the population has experienced multiple bottlenecks since then (Dussex et al., 2021;Grossen et al., 2020;Mathur & DeWoody, 2021).
However, also consistent with some of such cases (Alpine ibex, Montezuma quail; Grossen et al., 2020;Mathur & DeWoody, 2021), even if there is evidence of purging of variants of largest effect such as stop codons, the inbreeding and demographic bottlenecks of the Reunion harrier supported by our data have not prevented the accumulation of alleles of moderate effect.We also found that the Reunion harrier displays a clear excess of homozygosity when compared to a non-endangered continental species of harrier, which is consistent with their demographic history.Even before decline, a small population size may have prevented the effective removal of moderately deleterious variants in C. maillardi compared to continental relatives.Mutation load measured as π N /π S is among the highest we could find in birds, matching values found in passerines restricted to small islands (Leroy et al., 2021)  Our simulations also support our expectation of higher homozygosity for neutral and deleterious variants after the bottleneck.
However, it appears that variants of large effect may have been purged from C. maillardi populations.This is encouraging, as such variants can lead to fast decline given their impact on mutation load.On the other hand, such variants are rare before a population decline, and the additive impact of the more abundant variants of moderate effect has to be considered, especially in a context of inbreeding.Hatching rates are extremely low in the Reunion harrier monitored in a study area located on the eastern coast (42%, n = 76, Augiron, 2022) and are consistent with observations of low hatching rates following bottlenecks (Briskie & Mackintosh, 2004) and several studies of other species have shown a negative correlation between inbreeding and survival of embryos (de Boer et al., 2018;Hoeck et al., 2015), or survival of juvenile and older birds (Grueber et al., 2010;Szulkin et al., 2007).

| Possible conservation strategies
Conservation strategies need to consider high levels of inbreeding and relatedness, particularly in the North-Eastern cluster.A strategy aimed at increasing the heterozygosity of offspring within the two geographic clusters may help reduce the impact of these variants on fitness (Woolaver et al., 2013).Deciding whether gene flow should be facilitated between the two populations depends on whether outbreeding depression is expected to cause more damage than inbreeding depression.Following the decision chart proposed by Frankham et al. (2011) (see Figure 1), we see that there is clear evidence for dispersal and gene flow between the two clusters (relatively low F ST ), no fixed differences, and no evidence for a complete interruption of gene flow in the last 20 generations.Although there are substantial historical environmental differences between the windward and the leeward sides of the island, our observations point towards a recommendation of re-establishing gene flow between populations.Given the higher levels of inbreeding in the North-Eastern cluster, approaches consisting of translocating individuals and/or eggs from the Western cluster to the North-Eastern cluster may improve the latter's genetic health.Future work may benefit from whole-genome data, for example, to test for any signatures of local adaptation and fixed differences that would have to be included in conservation strategies (Frankham et al., 2011), infer very recent demographic trends, and predict the future of populations.
Genetic data, in combination with the field surveys that are currently ongoing (Coeurdassier et al., 2019;Fay et al., 2023), may help build a large pedigree for the species, inform controlled mating protocols, and provide a way to test for correlations between reproductive success and inbreeding.Accurate predictive models of populations can be powerful tools to guide conservation strategies, as recently shown by a study of mutation load in vaquita (Robinson et al., 2022), but require detailed information about human impact and genetic diversity of the species.While the current study paves the way towards this goal, future work will have to investigate in more detail how inbreeding, mutation load, and human impact (especially rodenticides, Coeurdassier et al., 2019) shape reproductive success through a combination of field surveys and genetic assessments.

| Synthesis and conclusion
Our study not only has significant implications for the conservation of an endangered species on a local scale but also makes an important contribution in adding to a small but rapidly growing assembly of endangered species whose population bottlenecks have been relatively well studied with population genomic approaches.
The field appears to be poised at a turning point in which population genomic data (especially of the type we employ here) will soon be affordable and practical to obtain for all species undergoing conservation genetic study.However, in the meantime, it is vital for the community to critically evaluate whether generalizations are or are not possible, before it is too late for the majority of the world's endangered species for which such data will likely arrive too late.
Particularly relevant for management at a local scale, our landscape genetics analysis revealed somewhat unexpected barriers to gene flow, structuring the species into two genetic clusters geographically separated across the island of Reunion.Consistent with recent independent findings from ecological data (Fay et al., 2023), our results from demographic analysis suggest an ongoing decline in the North-Eastern cluster.Nonetheless, the strength of the link between long-term effective population size and conservation status still remains elusive (e.g.Díez-del-Molino et al., 2018), thereby bringing into question the relevance of estimating effective population size for conservation.By applying models that can reconstruct past changes in population sizes, we could confirm a strong decline in diversity consistent with a pre-human initiation to decline, which was later accentuated by anthropogenic activity.These findings are useful as part of an integrated conservation approach that has been running for the past 6 years, incorporating ecological data and fieldbased demographic surveys.
Whether mutation load may cause extinction due to genomic meltdown (Rogers & Slatkin, 2017) remains unclear for a majority of species.Our empirical data and simulations drawn from reconstructed demographic trajectories both support the idea that deleterious variants of largest effect are purged after population contraction.On the one hand, considering other similar cases worldwide, these findings point towards a generalization in which populations that have diverged only recently from a larger population elsewhere, and in which it is likely or possible that multiple bottlenecks have taken place since then, are particularly liable to experience purging.On the other hand, also consistent with a subset of such existing cases, our findings suggest that any generalizations about purging ought to carefully consider the type of variants involved.In our case, our empirical data and simulations are consistent with a failure to efficiently purge moderately deleterious variants.
Such a pattern may be of particular concern in the conservation of many island species in which population sizes are historically lower and mutation load higher than in their continental counterparts (Leroy et al., 2021).
Through the continued addition to the dataset and analyses de- Metadata including the age, sex, and geolocation of samples are also stored in the SRA (BioProject PRJNA868120).

B EN EFIT-S H A R I N G S TATEM ENT
This study emerged as a collaboration between local conservationists (represented by SA) and academic researchers (BW, YB).All contributors are included as co-authors, the results of research have been shared with local decision makers (DEAL Réunion, SEOR), and the broader scientific community, and the research addresses a priority concern, in this case the conservation of papangue.
Genomics (Ostendstrasse 25, TGS Haus 8, 12459 Berlin, Germany) using sbeadex Tissue DNA Purification Kit and sbeadex Blood DNA Purification Kit.Sampling was conducted under a licence delivered by the CRBPO (Museum National d'Histoire Naturelle-France) within no.577 program coordinated by SA.Normalized Genotypingby-Sequencing libraries (Arvidsson et al., 2016) were prepared by LGC Genomics with the MsII restriction enzyme.Libraries were sequenced on an Illumina NextSeq 550 machine (2 × 150 bases paired ends), resulting in 91.5 million read pairs before filtering.Raw reads were demultiplexed using individual barcodes to obtain FASTQ files for each individual.No mismatch was allowed between sequence and barcodes.Read quality was checked for each individual using FASTQC v0.11.9 (https:// qubes hub.org/ resou rces/ fastqc).
based on genotypes for 60 individuals, after excluding one individual with too much missing data (Figure 1).The model with two clusters had the lowest BIC, suggesting the existence of two distinct populations.All individuals had at least 70% of their ancestry attributed to one or the other cluster, apart from one (#42) which we assigned as a potential recent hybrid.This division into two clusters was further confirmed by another clustering algorithm, ADMIXTURE, with K = 2 suggested as a sensible model based on cross-validation.

F
Principal component analysis plot based on GBS genotypes for 60 individuals.The purple point corresponds to a putative recent hybrid (F1) according to the DAPC analysis.
Significance levels are as follows: **p-value <.01, ***p-value <.001.F I G U R E 4 Position of putative first-degree relatives.Triangles indicate pairs that are likely to be siblings, while circles highlight putative parent-offspring relationships.Individuals from the same pair are given the same colour.
melanoleucos, are similar to those obtained by simulation.We observed a clear deficit of alleles of large effects in Reunion harrier, as well as a ~2× increase in homozygosity in C. maillardi compared to C. melanoleucos for variants of low and moderate effects.This is again consistent with recent decline in effective population sizes in C. maillardi (Figure 6c).However, unlike what was observed in the simulations, we observed a clear drop in the number of homozygous variants of large effect.This may reflect distinct demographic trends in C. melanoleucos that could not be considered in our models, or stronger selection against large effect homozygous variants than what we simulated.Further studies on the demographic history of C. melanoleucos and other close relatives of the Reunion harrier would be needed to address this discrepancy.Lastly, R' xy statistics also suggested purging of the most deleterious variants in C. maillardi compared to C. melanoleucos.Overall, these findings suggest that our results are consistent with expectations from other recently diverged lineages of partial purging of mutations of large effects.Nonetheless variants of moderate effect seem to contribute to most of the mutation load and have very likely subsisted in the modern-day Reunion harrier population compared to the pre-bottleneck population.

F
Inferred effective population sizes over time for the whole island and the two genetic clusters identified in clustering analyses.Dashed lines indicate 95% confidence intervals.Red dotted lines show the LD-based estimate from GONE.
Pycnonotus jocosus, showing rapid morphological and genetic differences between the windward and leeward coasts of Reunion in just a few decades (LeGros et al., 2016).These findings suggest that environmental and geographic barriers, combined with reduced dispersal, may have shaped the genetic make-up of several endemic species from Reunion.Whether the two genetic clusters correspond to two locally adapted populations remains difficult F I G U R E 6 Proxies of mutation load in the Reunion harrier and the Pied harrier (C.melanoleucos).Top left: Allele frequency spectra for different categories of coding variants obtained from four North-Western Reunion harriers (eight alleles in total).Top right: Relative count of derived alleles for each of the five Reunion harriers included in the RNA-seq experiment, standardized by the corresponding counts in the Pied harrier.The dashed horizontal line is at 1 (i.e. the expectation if both species have the same number of derived alleles).Bottom left: Relative count of homozygous-derived genotypes in the Reunion harrier, normalized by the corresponding counts in the Pied harrier.Bottom right: R' xy analysis contrasting each Reunion harrier with Pied harrier for different impact categories.R' xy < 1 indicates a relative frequency deficit of the corresponding category in Reunion harrier compared to Pied harrier.The rightmost comparison contrasts all Reunion harriers with pied harriers.R' xy is standardized for each comparison by the value obtained for UTR SNPs, given their apparent low average effect.
unlikely to be entirely due to natural causes.The past century has witnessed an extremely fast growth of the human population on Reunion, going from 180,000 people in 1926 to more than 890,000 inhabitants today, and exceeding a million in 2050 (United Nations, Department of Economic and Social Affairs, Population Division, 2019).It is acknowledged that human activity has led to the replacement of 73% of the original habitat by agricultural land and urban areas(Lagabrielle et al., 2009).For example, 30 out of 45 native terrestrial vertebrates have gone extinct since European F I G U R E 7 Proxies of mutation load in 500 simulated populations of Reunion harriers, comparing allele frequency before and after bottleneck.(a) Derived allele frequency spectra for variants of low, medium, and high effects (see main text for associated selection coefficients).(b) Relative count of derived alleles for five individuals sampled after the bottleneck.(c) Same as (b), but with the count of homozygous-derived genotypes.(d) R' xy analysis for five present individuals.R' xy < 1 indicates a relative frequency deficit of the corresponding category in modern populations compared to pre-bottleneck.For each category of variant, we normalize by the statistics obtained from one individual sampled just before the bottleneck.R' xy is standardized for each comparison by the value obtained for neutral markers.
veloped in this study, the Reunion harrier is placed to be among a handful of cases of bottlenecked species globally for which the genetic effects of the bottleneck are relatively well studied.As such, it should contribute to a broad perspective on the management of such species, informing us as to what extent genetic trends are case specific requiring detailed genetic study of each bottlenecked species, and to what extent they are globally applicable.Data accessibility: Raw sequence reads for GBS and RNA-seq data are deposited in the SRA (BioProject PRJNA868120).SliM3 scripts, as well as individual genotype data for GBS and RNA-seq are available on Figshare (https:// doi.org/ 10. 6084/ m9.figsh are.25117 061.v2) and Github (https:// github.com/ YannB ourge ois/ Scrip ts_ harrier).