Introgression across evolutionary scales suggests reticulation contributes to Amazonian tree diversity

Hybridization has the potential to generate or homogenize biodiversity and is a particularly common phenomenon in plants, with an estimated 25% of species undergoing inter-specific gene flow. However, hybridization in Amazonia’s megadiverse tree flora was assumed to be extremely rare despite extensive sympatry between closely related species, and its role in diversification remains enigmatic because it has not yet been examined empirically. Using members of a dominant Amazonian tree family (Brownea, Fabaceae) as a model to address this knowledge gap, our study recovered extensive evidence of hybridization among multiple lineages across phylogenetic scales. More specifically, our results uncovered several historical introgression events between Brownea lineages and indicated that gene tree incongruence in Brownea is best explained by introgression, rather than solely by incomplete lineage sorting. Furthermore, investigation of recent hybridization using ∼19,000 ddRAD loci recovered a high degree of shared variation between two Brownea species which co-occur in the Ecuadorian Amazon. Our analyses also showed that these sympatric lineages exhibit homogeneous rates of introgression among loci relative to the genome-wide average, implying a lack of selection against hybrid genotypes and a persistence of hybridization over time. Our results demonstrate that gene flow between multiple Amazonian tree species has occurred across temporal scales, and contrasts with the prevailing view of hybridization’s rarity in Amazonia. Overall, our results provide novel evidence that reticulate evolution influenced diversification in part of the Amazonian tree flora, which is the most diverse on Earth.


Introduction 51
Reproductive isolation is often seen as a prerequisite for speciation and as a defining feature of 52 species (Barraclough 2019;Mayr 1942). Despite this, hybridization between species is known to 53 occur and has several different outcomes, from the erosion of evolutionary divergence (Kearns et al.,54 2018; Vonlanthen et al., 2012) to the formation of entirely new 'hybrid species' (Mallet 2007). 55 which form part of the 'Brownea clade' (Fabaceae, subfamily Detarioideae (LPWG, 2017)). The list 136 of accessions and their associated information can be found in Supporting Information Table S1. 137 DNA sequence data were generated using leaf material collected from herbarium specimens and 138 silica-gel dried accessions. Genomic DNA was extracted using the CTAB method (Doyle & Doyle 139 1987), and sequencing libraries were prepared using the NEBNext® Ultra™ II DNA Kit (New 140 England Biolabs, Massachusetts, USA) with a modified protocol to account for fragmented DNA, and 141 including a ~600bp size-selection step. Hybrid bait capture was performed using the MyBaits protocol 142 (Arbor Biosciences, Michigan, USA) to target 289 phylogenetically-informative nuclear genes, using 143 a bait kit designed for the legume subfamily Detarioideae within which Brownea is nested (Ojeda et  to remove adapter sequences and to quality-filter reads. Trimmomatic settings permitted <4 150 mismatches, a palindrome clip threshold of 30 and a simple clip threshold of 6. Bases with a quality 151 score <28 and reads shorter than 36 bases long were removed from the dataset. Following quality-152 filtering, loci were assembled using SPAdes v3.11.1 (Bankevich et al., 2012)  Information Methods S1. 163 Gene trees were generated for each of the 226 gene alignments using RAxML v.8.0.26 (Stamatakis 164 2014) with 1,000 rapid bootstrap replicates and the GTRCAT model of nucleotide substitution, which 165 is a parameter-rich model and hence is most suitable for large datasets. Macrolobium colombianum 166 was used to root both the full and single-accession dataset analyses since it was the Macrolobium 167 accession that was present in the most gene alignments. The gene trees from both datasets were used 168 to generate species trees under the multi-species coalescent model in the heuristic version of ASTRAL 169 v.5.6.1 (Zhang et al., 2018)

Population-level introgression 197
Having examined the degree of historical reticulation among Brownea lineages, population genomic 198 data were generated with ddRAD sequencing and used to investigate recent introgression at a finer 199 taxonomic scale. More specifically, the degree of shared genetic variation was visualised for 200 individuals of the ecologically divergent species B. grandiceps and B. jaramilloi, which co-occur in 201 the Ecuadorian Amazon and putatively hybridize in the wild. Following this, in order to make 202 inferences about the potential evolutionary significance of recent introgression, the rates at which 203 different loci introgress relative to the rest of the genome were estimated for these taxa using genomic 204 clines. In order to do this, 171 specimens in total were genotyped using ddRADseq (Peterson et al.,  representing the relative abundance of each species in the forest plot from which they were sampled. 208 Leaf material was collected from Brownea trees in the Yasuní National Park 50ha forest plot in Napo, 209 Ecuador, and dried in a herbarium press. The sample list is shown in Supporting Information Table  210 S2. 211 Genomic DNA was extracted from dried leaf material with the CTAB method and purified using a 212 QIAGEN plant minikit (QIAGEN, Hilden, Germany) column cleaning stage. Sequencing libraries 213 were prepared by digesting the DNA template using the restriction enzymes EcoRI and mspI (New 214 England BioLabs, Massachusetts, USA), in accordance with the ddRADseq protocol. Samples were 215 then ligated to universal Illumina P2 adapters and barcoded using 48 unique Illumina P1 adapters. 216 Samples were pooled and size-selected to between 375-550 bp using a Pippinprep electrophoresis 217 machine (Sage Science, Massachusetts, USA). Following amplification and multiplexing, sequencing 218 was performed using a single-lane, paired-end 150bp run on the HiSeq 3/4000 platform. 219 DNA sequencing reads from the ddRADseq genotyping were processed de novo (i.e., without the use 220 of a reference genome) using the Stacks pipeline v2.1 (Catchen et al., 2011). Reads were quality 221 filtered using Stacks by removing reads which were of poor quality (i.e., had a phred score <10), 222 following which 'stacks' of reads were created, and SNPs were identified among all de novo 'loci' posterior probability (PP)), with lower support for inter-specific relationships. ASTRAL inferred a high 276 degree of discordance among gene tree topologies at many nodes (quartet score = 0.52), with multiple 277 alternative bipartitions reconstructed at most nodes. This is evident from the presence of many more 278 conflicting gene trees (numbers below branches) than congruent gene trees (numbers above branches) 279 in Supporting Information Fig. S1. 280 281

Ancient introgression in Brownea species 282
Phylogenetic networks estimated to ascertain whether diversification in Brownea was tree-like or 283 reticulate pinpointed two hybridization events within the genus (Fig. 3). This is indicated by the -log 284 pseudolikelihood values in Supporting Information Fig. S2, since the number of hybridization events 285 (h) which best describe the data give the largest improvement in -log pseudolikelihood. In Fig. S2, -286 log pseudolikelihoods increased steadily between h = 0 and h = 2, after which the increasing number 287 of hybridization events only made minimal improvements to -log pseudolikelihood. 288 The inferred phylogenetic network (Fig. 3) showed a broadly similar topology to the species tree in 289 PCA of SNP variation inferred using the R package ggplot2 (Fig. 4a)  jaramilloi. The same pattern was recovered from a fastSTRUCTURE run incorporating 40 random 343 individuals from each species, performed to account for any bias which may have been incurred by 344 differences in sample size (Supporting Information Fig. S4C). 345 The NEWHYBRIDS analysis revealed that most hybrid individuals were the result of continued 346 hybridization, with pure B. grandiceps making up 73.9% of the genotyped population, and pure B. 347 jaramilloi making up 21.9% for the 500 subsampled loci used in this analysis. There were no F1 (first 348 generation) hybrids identified by the subset of loci analysed, and all hybrid individuals were either F2 349 hybrids (0.592%) or had a broad distribution of probabilities across hybrid classes (3.55%), 350 suggesting backcrossing. In the latter, the probability of any one hybrid class did not exceed 90%, 351 which was the threshold used to categorize individuals as belonging to a certain class. 352 The Bayesian estimation of genomic clines (bgc) analysis, which was used to determine whether any 353 loci showed outlying rates of admixture relative to the genome-wide average, recovered a signal of 354 asymmetric introgression, but did not detect any differential rates of introgression between loci. Of Our study represents the first clear case of reticulated evolution among Amazonian trees documented 364 at multiple phylogenetic levels. This contrasts with previous work, which suggested that hybridization 365 is an extremely rare phenomenon in Amazonian trees. We demonstrate that within Brownea, a 366 characteristic Amazonian tree genus, reticulated evolution has occurred over the course of its 367 evolutionary history, with evidence of introgression deep in the history of the genus and more 368 recently, between the closely related B. grandiceps and B. jaramilloi. 369

Introgression has occurred at deep phylogenetic levels in Brownea 371
Phylogenetic network analysis suggested that introgression has taken place between Brownea lineages 372 in the past, with two separate hybridization events inferred (Fig. 3) also recovered a low degree of support for bifurcating inter-specific relationships (Fig. 2) and a high 384 degree of incongruence among gene trees (Supporting Information Fig. S1). These results can be 385 partially explained by introgression, rather than being caused purely by the stochasticity of lineage 386 sorting in large populations, which is suggested to be a common phenomenon in many rainforest trees 387  Since the inferred events of ancient introgression both involved the common ancestors of two sets of 395 present-day species, they are likely to have occurred several million years in the past, before the divergence of the descendent species. The inferred introgression events are likely to have occurred 397 since the Miocene period (~23Ma), as date estimates from previous work suggest that Brownea 398 originated around this time (Schley et al., 2018). Accordingly, it is unlikely that the inferred 399 hybridization events are due to 'accidental' hybridization between co-occurring species, merely 400 producing maladapted F1 hybrids which will eventually be outcompeted, fortifying the boundaries 401 between species (i.e. 'reinforcement' (Hopkins 2013)). Rather, our results suggest a persistence of 402 introgression over evolutionary time within Brownea. 403 404

Recent hybridization also occurs between Brownea species, resulting in persistent hybrids and 405 introgression across the genome 406
This investigation also uncovered a substantial signal of recent introgression between B. grandiceps 407 and B. jaramilloi in the Yasuní National Park 50ha plot located in the Ecuadorian Amazon. The low 408 Fst estimate for the B. grandiceps/B. "rosada" and B. jaramilloi populations (0.11), in addition to the 409 principal component analysis (Fig. 4a), the fastSTRUCTURE analysis (Fig. 4b)  The bgc analysis indicated that most introgression was asymmetric, with more loci containing mostly 423 B. grandiceps alleles (5.96% of all loci) than loci containing mostly B. jaramilloi alleles (1.3% of all 424 loci). Importantly, this bgc analysis also indicated that gene flow occurs largely at the same rate 425 across loci, since there were no outlying estimates of the β parameter. The excess of B. grandiceps 426 ancestry mirrors the asymmetry of introgression suggested by Fig. 4b and is likely driven by the 427 uneven population sizes of the two species. Brownea jaramilloi has only ever been observed in a 428 small part of the western Amazon (Pérez et al., 2013) and is much less populous than B. grandiceps, 429 which occurs across northern South America. More specifically, the disproportionate donation of 430 alleles from B. grandiceps may be due to 'pollen swamping', whereby pollen transfer from another, 431 more populous species is more likely than pollen transfer from less numerous conspecifics (Buggs & 432 Pannell 2006). Pollen swamping can serve as a mechanism of invasion (e.g., in Quercus (Petit et al.,433 2004)), and the associated asymmetric introgression may lead to the extinction of the rarer species, 434 especially when hybrids are inviable or sterile (Balao et al., 2015). However, due to the ongoing Iris (Arnold et al., 2010). Accordingly, it is possible that selectively advantageous alleles are passing 450 between species (e.g., between B. jaramilloi and B. grandiceps) through backcrossing events over 451 many generations as observed by the fastSTRUCTURE and NEWHYBRIDS analyses in this study 452 (Mallet 2005). This may facilitate adaptation to new niches due to the widening of the pool of 453 variation upon which selection can act (e.g. Whitney et al., 2010). However, it is difficult to ascertain 454 whether adaptive introgression has occurred without measuring the impact of introgression on 455 variation in the phenotype and its fitness effects (Suarez-Gonzalez et al., 2018). 456 457

The contribution of hybridization to neotropical rainforest tree diversity 458
Studies such as ours that document persistent hybridization across evolutionary time between tree 459 species within Amazonia, and within tropical rainforests in general, are rare. Indeed, it was previously 460 suggested that inter-specific hybrids between rainforest tree species are poor competitors, and that 461 fertile hybrid populations are nearly non-existent (Ashton 1969  While it is difficult to determine whether hybridization between tree species occurs across many 483 syngameons of closely related Amazonian lineages, or whether it is a tendency unique to certain 484 genera such as Brownea, it may be prudent to review how relationships among tropical tree lineages 485 are inferred. Accordingly, the use of phylogenetic networks in resolving the relationships between 486 such groups may provide additional insight into whether reticulate evolution has contributed to 487 diversification within Amazonian rainforest, which is among the most species-rich environments on 488