The phylogeographic history of Megistostegium (Malvaceae) in the dry, spiny thickets of southwestern Madagascar using RAD‐seq data and ecological niche modeling

Abstract The spiny thicket of southwestern Madagascar represents an extreme and ancient landscape with extraordinary levels of biodiversity and endemism. Few hypotheses exist for explaining speciation in the region and few plant studies have explored hypotheses for species diversification. Here, we investigate three species in the endemic genus Megistostegium (Malvaceae) to evaluate phylogeographic structure and explore the roles of climate, soil, and paleoclimate oscillations on population divergence and speciation throughout the region. We combine phylogenetic and phylogeographic inference of RADseq data with ecological niche modeling across space and time. Population structure is concurrent with major rivers in the region and we identify a new, potentially important biogeographic break coincident with several landscape features. Our data further suggests that niches occupied by species and populations differ substantially across their distribution. Paleodistribution modeling provide evidence that past climatic change could be responsible for the current distribution, population structure, and maintenance of species in Megistostegium.

Though morphological adaptations in response to climate have been explored in southwestern Madagascar (Evans et al., 2014), few other hypotheses exist for diversification in dry adapted plant taxa in this region (Ganzhorn et al., 2001;Nicoll & Langrand, 1989).
The dry spiny thickets of southwestern Madagascar (as defined by Aronson et al., 2018) occupy a sandy strip along the southwestern coast of the island and cover an area of approximately 16,000 km 2 .
This semiarid zone is located in one of the oldest biomes on the island (Buerki et al., 2013) and boasts some of the highest plant endemism on Madagascar (Davis et al., 1994;Jolly et al., 1984;Phillipson, 1996): 48% of plant genera and 95% of plant species occurring in the ecoregion are endemic to the island. The tree flora recorded from this zone is similarly exceptional with 89% of the tree species endemic to Madagascar and more than 20% of tree species endemic to the dry spiny thickets (Aronson et al., 2018). Despite these remarkable figures and the fact that dry, deciduous forests are among the most threatened biomes of the world (Janzen, 1988), the spiny thickets remain understudied (Moat & Smith, 2007;Waeber et al., 2015).
Understanding how past climate has altered biotic distributions may be particularly important in understanding divergence patterns throughout southwestern Madagascar. Aridity increased substantially throughout the Late Pleistocene and Holocene and subsequently expanded the xeric thickets of southwestern Madagascar (Clarke et al., 2006). Such extreme changes in climate are likely to strongly impact coastal species. These recent changes are especially compatible in the exploration of population divergence within species as frequent range shifts might allow for the reinforcement of reproductive barriers and/or provide opportunities for secondary contact and potential hybridization (Dobzhansky, 1940;Kay & Schemske, 2008;Liu et al., 2014;Rieseberg et al., 2003). The dry spiny thicket has been identified as a center of endemism and is bisected by a watershed proposed to act as a refugium for mesic species (Wilmé et al., 2006). Such low elevation watersheds are expected to have more extreme ecological shifts and a greater impact on habitat isolation than watersheds at higher elevations (Wilmé et al., 2006). Contrary to traditional refugia models, one might expect watersheds to act as an important barrier to dry-adapted taxa.
Increasing aridity would remove the barrier and allow for population expansion or increased population connectivity.
The Malvaceae family is the second most species-rich family in the spiny thickets of southwestern Madagascar (Aronson et al., 2018). Megistostegium Hochr. (Figure 1) is one of nine Malvaceous genera present in the region and the only one wholly endemic to the spiny thickets. As such, the genus represents an extraordinary group to explore diversification in Malagasy plants in this biodiversity hotspot. The three species of Megistostegium (M. microphyllum Hochr., M. nodulosum (Drake) Hochr., and M. perrieri Hochr.) are estimated to have diverged sometime in the Pliocene (~5MYA; Koopman & Baum, 2008). Each species is morphologically distinct in habit (tall shrub, tree, and prostrate shrub), floral, and leaf morphologies (Hochreutiner, 1955;Koopman, 2005Koopman, , 2011. Species of Megistostegium grow together in populations throughout southern Madagascar and evidence for interspecific hybridization is present at these sites in the form of limited prezygotic isolating barriers as assessed by hand pollination (Koopman, 2008) and the presence of infrequent morphological intermediates (Koopman, 2011). All three species grow within 100-200 m of one another at the Special Reserve of Cap Sainte Marie (Koopman, 2008(Koopman, , 2011 where they have overlapping flowering periods, only one potential pollinator (the Souimanga sunbird, Cinnyris sovimanga Gmelin), and putative hybrids have been found between two species pairs (M. microphyllum and M. nodulosum, M. microphyllum and M. perrieri). Despite these factors and demonstrated gene flow at the molecular level at Cap Sainte Marie (Koopman & Baum, 2010) the three species of Megistostegium maintain their morphological identity for the most part in sympatry as measured by morphology at two spatial levels (across its range in southwestern Madagascar and in sympatry at Cap Sainte Marie; Koopman, 2008Koopman, , 2011Koopman & Baum, 2010).
Given the recent origin of the genus and noted lack of complete reproductive isolation inference of relationship between the species in Megistostegium has proven difficult (Koopman & Baum, 2010).
Population-level data previously collected in Megistostegium from across its distribution indicate that the species show little genetic variation at the loci investigated: gene genealogies constructed from F I G U R E 1 Megistostegium microphyllum (Malvaceae). Photo credits: Margaret Hanes four nuclear and one chloroplast genes found variable levels of species monophyly and no exclusive species monophyly (Koopman & Baum, 2010). structure analyses and a series of AMOVA for six microsatellites suggested that genetic variation comes mostly from within populations and species rather than between them (Callewaert, 2014). More data are needed to understand relationships among these young species as well as population structure within species.
RAD sequencing has proven to be an effective and robust method to resolve species level questions, even those with interspecific gene flow, and phylogeographic patterns (Eaton & Ree, 2013;Hipp et al., 2014;Park & Donoghue, 2019).
Here, we use RADseq to investigate relationships between species and populations of Megistostegium in an effort to begin to understand diversification patterns across southwestern Madagascar. We then explore patterns of niche evolution in species, and in populations, to identify geographic barriers and to understand how climate and soil might contribute to patterns of population divergence in Megistostegium. Specifically, we build population trees, explore the similarity of climatic niches between species and regions, and assess how landscape features and climatic change may have driven the current niche diversity and distribution in Megistostegium. and 12 populations (Table 1). Between 12 and 40 individuals are represented from each species with between 4 and 11 individuals from each population (Table 1). Genomic DNA from silica-dried fieldcollected leaf material was extracted using a modified CTAB method (Alverson et al., 1999), further cleaned with a DNeasy kit (Quiagen) and quantified using a Qubit 2.0 Fluorometer (Thermofisher).

| Sampling and Genomic library preparation
Library preparation and sequencing of RAD markers from ge- F I G U R E 2 (a) Study region in southwestern Madagascar, (b) Distribution of subarid, dry, spiny thicket (bioclimate map after Cornet, 1974, obtained from Missouri Botanical Garden Madagascar gazetteer (http://www.mobot.org/mobot/ gazet teer/)) and location of the three major collection regions in this study (circles in primary colors), (c) Point locations representing species distributions used for niche modeling and rivers that bisect the southern slopes of southwestern Madagascar (after Aldegheri, 1972) TA B L E 1 Locality data and one representative voucher specimen for species and populations sampled
Demultiplexed and filtered reads were used to generate homologous de novo RAD loci. Reads were clustered within samples at 85% sequence similarity, and a minimum read depth of 6, into de novo loci. Loci were then clustered across samples at 85% sequence similarity and processed into five assemblies. We produced two broad categories of sequence assemblies for analyses: (1) one "phylogenomic" assembly was constructed for phylogenetic analyses among the three species, where all loci were shared among at least 12 ingroup and outgroup taxa (89 Megistostegium samples, two outgroup samples) and (2)

| Phylogenetic tree estimation
Maximum-likelihood analyses were conducted on the three assemblies with outgroups (phylogenomics; population genomics: MMOG and MNOG) using RAxML version 8.2.1 (Stamatakis, 2014) and implemented on the CIPRES Science Gateway Portal (https://www. phylo.org/porta l2/home.action) under a GTR+CAT substitution model and with 100 rapid bootstrap replications to assess support.

| Population genomic analyses
We examined population structure in M. microphyllum and M. nodulosum using the 30% "population genomics" assemblies (MM and MN) containing only ingroup taxa and one randomly selected SNP per RAD locus, implemented in structure v.2.3.4 (Pritchard et al., 2000) in the analysis kit of ipyrad. We executed 10 structure runs per k value (1-10), with each run having a burn-in of 100,000 fol-

| Bioclimate
We used ecological niche models to understand the ecological dis-  (Hijmans et al., 2005). Datasets were clipped to the regional extent of Madagascar and projected to the GCS WGS 1984/UTM Zone 38S coordinate system using ESRI arcGIs (v9.2). Pearson's correlation tests were performed in the ARCMAP plugin SDMtoolbox v2.4 (Brown et al., 2014(Brown et al., , 2017 to remove highly correlated bioclimate variables (Pearson r ≥ .70). Ecological niche models were calculated using both MaxEnt (Phillips et al., 2004(Phillips et al., , 2006 and the BioClim algorithm in the R package eNMtools (Warren et al., 2019(Warren et al., , 2021. Models were estimated from the average of 10 replicates and model performance was estimated using 25% of the points to test the model; the remaining points were used for training. We used the Area Under the Curve (AUC), calculated for testing data, to further assess model performance (Phillips et al., 2006). Once we verified that models were able to predict occurrence data better than random for species models with contemporary data (observed average by AUC =0.988 Because ecological niche models might conceal important spatial variation within the distribution of individual widespread species, we built ENMs using BioClim at the regional level to examine niche divergence at a finer scale. These ENMs were constructed in eNMtools and model performance was evaluated with AUC. We 10-25 records/region) were used to construct regional distribution models.

| Soil
Ecological tolerances of each species with respect to soil were separately explored by producing ecological niche models, calculated in MaxeNt v3.3.3e (Phillips et al., 2004(Phillips et al., , 2006

| Niche overlap and equivalency
Niche overlap values were calculated with BioClim models in the R package eNMtools (Warren et al., 2019(Warren et al., , 2021 using contemporary, noncorrelated bioclimate data for two sets of comparisons, (1) between species and (2) between each region encompassed by a species (East, West, South). Niche overlap values were evaluated using two standard metrics, Schoener's D (1968) and Hellinger's I (Warren et al., 2008), which produce values between 0 (no overlap between niche models) and 1 (identical niche models). Two randomization tests were carried out in eNMtools (Warren et al., 2019) to further evaluate niche conservatism between species and regions using contemporary, noncorrelated bioclimate data. Niche identity tests aim to determine whether a pair of species (or regional lineages) has equivalent niches (Warren et al., 2010

| RADseq bioinformatics
In total, more than 6.39 × 10 8 single-end reads were generated by Illumina sequencing of the RADseq library. After filtering low quality reads, defined as a read with more than 5 low quality base calls and a default phred Qscore offset of 33, more than 99% of reads were retained. On average, 7.02 × 10 6 reads were sequenced per sample (range = 3.52 × 10 5 -1.87 × 10 7 reads). The "phylogenomic" assembly recovered a total 24,481 RAD-loci, 48,865 parsimonyinformative sites and 69.46% missing data (

| Phylogenetic tree estimation
The ML approach of the "phylogenomic assembly" with outgroups recovered a well-resolved tree, strongly structured by geography,  the East. The M. nodulosum tree reflects regional signal but with less support and no population monophyly (Figure 4b). All individuals from the East form a well-supported clade that is sister to some individuals in the South. The remaining individuals from the South form a grade with individuals from the West. Simulations in structure assigned individuals in each species to two clusters that correspond to an East versus South/West cluster (Figure 4; Table 3).

| Bioclimate
After highly correlated bioclimate variables were removed, five WorldClim variables were included (annual mean temperature

| Soil
ENMs generated in MaxEnt with soil layers had AUC values ≥0.953.
All species have Arenosols and Ferralsols heavily represented in

| Regional comparisons
Within a region all niche overlap values between species have nonzero values (  (Table 5).
Remarkably, within a species metrics of niche overlap indicate no overlap between regions across a species' range (i.e., niche overlap is F I G U R E 4 Maximum-likelihood phylogeny, deltaK plots, structure plots, and population localities sampled for the two species groups found in more than one region. (a) M. microphyllum, and (b) M. nodulosum. Top-Population localities and deltaK plots summarized results across ten structure runs to calculate the most probable number of clusters (K) using the method of Evanno et al. (2005). Middlestructure plot for each species using a single SNP per locus using 'Population genomics' assemblies. Bottom-ML phylogenetic estimate of concatenated RAD-loci using "Population genomics" with outgroup assemblies. * = bootstrap values higher than 80. Representative images provided for each species. (Photo credit: M. Hanes) zero for every intraspecies interregional comparison) (  presented here suggest that each species in the South has discrete ecological tolerances across a small distance. Individuals from the three species collected from Cap Sainte Marie are generally reciprocally monophyletic and well supported, further suggesting that these taxa are discrete genetic entities in the South, despite evidence for incomplete reproductive isolation (Koopman, 2011;Koopman & Baum, 2010). Stratigraphy and geochronology have been well characterized at a nearby site (Faux Cap) in the South (Battistini, 1964;Mahé & Sourdat, 1972) and are broadly divisible into three major  decreases from south to north (Tovondrafale et al., 2014). A small study conducted at Cap Saint Marie suggests that the three species of Megistostegium grow in different soil depths (Koopman, 2008).

Three rivers lie between the West and South regions and might add
to the distinct population structure we observe in M. microphyllum in the western region. The Ohilahy River in the extreme northwestern distribution of Megistostegium was identified as a potential retreat corridor for mesic taxa during more arid times (Wilmé et al., 2006) and has also been hypothesized to act as a biogeographic barrier to two sister species pairs of plated lizards (Raselimanana et al., 2009) and to intraspecific population structure in one species of iguanid lizards (Chan et al., 2012). The influence of the remaining rivers that bisect the southern slopes and reside between the west and southern regions of Megistostegium has yet to be explored in other taxa. distribution close to where the Linta and Menarandra rivers spill into the ocean. This break is also visible in soil models and is concurrent with genetic structure. Furthermore, we have identified significant changes in stratigraphy, carbon sequestration levels, and groundwater potential simultaneous with this discontinuity in the distribution of M. microphyllum (Besairie, 1967;Collignon, 1971;Grinand et al., 2009;Serele et al., 2019;Waeber et al., 2015). Such a landscape Rivers running in more humid times likely acted to isolate populations of arid-adapted Megistostegium driving the regional structure we observe today.
In the future, temperatures are predicted to increase another 1.1 to 2.6°C and the region is expected to undergo further aridification (Hannah et al., 2008) with the greatest warming and drying expected in the southwest, along the coast. Under these scenarios, models consistently predict range expansion for plant species throughout the southwest (Hannah et al., 2008;Schatz et al., 2008 (Brinkmann et al., 2014;Harper et al., 2007) and studies estimate that between the years 2000 and 2005 these forests lost between 0.42 and 1.1% of forest/year (Brinkmann et al., 2014;MEFT et al., 2009). Unfortunately, the remoteness of the dry, spiny thickets combined with its harsh climate continues to elude conservation priority in these extraordinary plant communities (Moat & Smith, 2007).

| CON CLUS IONS
Phylogeographic data from xeric habitats in southern Madagascar remain scarce and few hypotheses exist as to how such extraordinary species diversity in the spiny thickets was created. This work represents the first study to combine ecological niche models with phylogeographic analyses to understand plant diversification in the Malagasy southwest. We find evidence for the roles of bioclimate, soils, and recent climate change in shaping phylogeographic structure in Megistostegium. Our work further highlights the heterogeneity of precipitation and temperature throughout this semiarid region and we identify strong genetic structure across the distribution of Megistostegium coincident with bioclimate and low elevation rivers. Distribution modeling into the past supports the influence of paleoclimate oscillations that allowed two species of Megistostegium to expand their distributions in the arid Pleistocene and Holocene and become isolated across inhospitable mesic areas. Such range shifts could be a major driver of differentiation across southern Madagascar and should be explored further in other taxa. We also identified a new and potentially important biogeographic break in southwestern Madagascar. In the future it will be interesting to look for genetic structure across the rivers that bisect southwestern Madagascar, the distinct bioclimatic regions we uncovered, and also the biogeographic break that we identified in a diversity of taxa across this region. We also hope to employ a variety of population genomic analyses in the future to incorporate gene flow and historical demography to better understand the evolution of these species. We thank three anonymous reviewers whose comments greatly improved this manuscript. We are grateful to Jacqueline Razanatsoa and Josh Lyon for extraordinary help in the field and Nicole Rowley for help with niche modeling.

CO N FLI C T O F I NTE R E S T
The authors have no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The demultiplexed sequence data that support the findings of this study are openly available on the NCBI Short Read Archive: