Assessing ex situ genetic and ecogeographic conservation in a threatened but widespread oak after range‐wide collecting effort

Abstract Although the genetic diversity and structure of in situ populations has been investigated in thousands of studies, the genetic composition of ex situ plant populations has rarely been studied. A better understanding of how much genetic diversity is conserved ex situ, how it is distributed among locations (e.g., botanic gardens), and what minimum sample sizes are needed is necessary to improve conservation outcomes. Here we address these issues in a threatened desert oak species, Quercus havardii Rydb. We assess the genetic, geographic, and ecological representation of 290 plants from eight ex situ locations, relative to 667 wild individuals from 35 in situ locations. We also leverage a recent dataset of >3000 samples from 11 other threatened plants to directly compare the degree of genetic conservation for species that differ in geographic range size. We found that a majority of Q. havardii genetic diversity is conserved; one of its geographic regions is significantly better conserved than the other; genetic diversity conservation of this widespread species is lower than documented for the 11 rarer taxa; genetic diversity within each garden is strongly correlated to the number of plants and number of source populations; and measures of geographic and ecological conservation (i.e., percent area and percent of ecoregions represented) were typically lower than the direct assessment of genetic diversity (i.e., percent alleles). This information will inform future seed sampling expeditions to ensure that the intraspecific diversity of threatened plants can be effectively conserved.


| INTRODUCTION
Decades of population genetic studies have revealed the intrinsic (e.g., traits, geographic range size) and extrinsic (e.g., habitat quality, anthropogenic fragmentation) drivers of genetic diversity and structure in wild populations of plants and animals (Aguilar et al., 2008; Allendorf, 2017; Loveless & Hamrick, 1984). The genetic diversity and structure of highly managed and captive-bred animal populations, as well as seed banks of important crops and their wild relatives, have also been studied (e.g., Ogden et al., 2020; Singh et al., 2019). However, there is remarkably little knowledge of genetic diversity and structure in ex situ populations (such as botanic gardens, arboreta, and seed banks) for most plant species, even though thousands of botanic gardens globally hold over 100,000 plant species (Mounce et al., 2017). Relatively few studies have quantified genetic diversity patterns ex situ or compared ex situ and in situ populations (Christe et al., 2014; McGlaughlin et al., 2015; Namoff et al., 2010), while even fewer have sought to understand the drivers of these patterns (e.g., species traits, comprehensiveness of collection efforts, botanic garden management practices; see Griffith et al., 2017; Hoban, 2019; Hoban, Callicrate, et al., 2020). This is a major gap in the field of molecular ecology and conservation genetics.
Safeguarding species and populations ex situ is an essential component of conservation programs, especially when in situ threats are high (Oldfield, 2009; Westwood et al., 2021). Ex situ plant collections can be composed of seed banks, tissue culture, and frozen tissue or embryos, which require relatively little space, as well as living mature plants (i.e., 'living collections'), which take up orders of magnitude more space and have higher costs. Living collections offer some advantages over seed collections. They can help produce seed or cloned individuals for restoration or reintroduction if in situ populations are lost, increase public understanding and appreciation of biodiversity, allow scientific study of rare species, and provide genetic and functional trait diversity for breeding programs (Cavender et al., 2015; Heywood, 2017). Living collections are especially important for species that produce recalcitrant seeds (i.e., those that do not tolerate desiccation for storage in conventional seed banks). Approximately 8% of all plants (and 27% of threatened plants) are recalcitrant (Wyse & Dickie, 2017; Wyse et al., 2018). The space requirements and monetary cost of living collections mean that it is particularly important to evaluate and optimize ex situ genetic diversity.
Ensuring high genetic and trait diversity in ex situ collections is important for long-term persistence of a species under environmental change (e.g., climate change, new pests, and diseases). Botanic gardens can provide seed and plant material for ecological restoration, and restoration success can be influenced by genetic diversity (Breed et al., 2019). It is also increasingly apparent that genetic diversity, especially in trees and other keystone species, contributes to community structure and ecosystem resilience (Raffard et al., 2018; Reusch et al., 2005; Stange et al., 2020), as well as nature's contributions to people (Des Roches et al., 2021). However, living ex situ collections often have few individuals and/or were collected from only a few wild sources (e.g., many species have fewer than 50 plants in collections globally; Hoban & Way, 2016; Maunder et al., 2001). Ex situ populations thus may have insufficient genetic diversity for species' long-term survival.
Ideally, most of the alleles that exist in situ should be protected ex situ, preferably in multiple locations for safekeeping, which later could be used for applications such as plant reintroductions or breeding programs (Brown & Marshall, 1995; Lawrence et al., 1995; Lockwood et al., 2007). Genetic markers, applied to tissue samples from ex situ collections and from wild populations, are an increasingly accessible and affordable way to assess genetic diversity and structure ex situ, as shown by several recent efforts. For example, Griffith et al. (2015) showed that 205 ex situ plants captured 78% of the alleles present in two wild populations of Zamia decumbens (Zamiaceae), while Hoban, Callicrate, et al. (2020) quantified how ex situ sampling strategies can be improved for 11 plant taxa across five genera. Such case studies in species with differing life-history characteristics help to establish "rules" for how genetic diversity ex situ is impacted by collection size, species' biological traits, and other factors such as geographic range size in situ (Griffith et al., 2017; Hoban, 2019; Hoban, Bruford, et al., 2020; Hoban & Strand, 2015). While the aforementioned studies are building such knowledge for rare, range-restricted species, we are aware of no similar studies for species that are geographically widespread but still threatened. Predictions from models suggest that species with larger population sizes, more populations, and geographically disconnected populations will need more ex situ individuals in conservation collections to sufficiently preserve in situ genetic diversity (Brown & Hardner, 2000; Hoban, 2019). Ex situ collections should also represent geographic and ecological variation across a species range, which may help capture adaptive variation (Brown & Hardner, 2000; Guerrant et al., 2004).
Ecological and geographic coverage is much easier to measure and may be an effective proxy for genetic diversity because genetic diversity typically increases with geographic (Alsos et al., 2012;Hanson et al., 2017) and environmental distance (Di Santo & Hamilton, 2020;Wang & Bradburd, 2014). Genetic diversity assessments still require large numbers of samples and specialized equipment and laboratory work; it is infeasible to collect population-level genetic data to optimize collection strategies for the approximately 350,000 plant taxa that exist. Khoury et al. (2019) suggest that the percentage of a species' geographic range represented by plants in ex situ collections can be a "pragmatic estimate of the comprehensiveness of conservation of the genetic diversity." Measuring geographic coverage can help identify which species are sufficiently conserved, and prioritize among those that most need additional conservation effort. Such an approach has rarely been applied outside crop wild relatives (Khoury et al., 2020;Vinceti et al., 2013; though see Beckman et al., 2019), nor has this approach been directly compared to genetic assessments.
To address these major knowledge gaps, we assess patterns of genetic diversity within and among eight botanic gardens (containing 290 individuals), compared to in situ populations (667 individuals) of shinnery oak (Quercus havardii Rydb.), an uncommon but wide-ranging shrub species with recalcitrant seeds. We also assess geographic and ecological proxies of genetic diversity conservation. This is, to our knowledge, the first extensive genetic analysis of ex situ collections of a widespread but uncommon species; previous work has mostly focused on highly rare species (Griffith et al., 2015; Hoban, Callicrate, et al., 2020; Namoff et al., 2010). One may predict that ex situ populations (we will use the term "ex situ population" herein to refer to sets of plants at different gardens, as others have previously; see Schaal & Leverich, 2004) of the widespread Q. havardii will have lower genetic diversity than collections of rarer, small-ranged species, based on simulations demonstrating that range size and gene flow can impact genetic diversity in seed sampled for ex situ collections (Hoban, 2019). However, ex situ collections of Q. havardii are large (290 seedlings collected and used for this study, while Beckman et al., 2019 found that "the majority of U.S. oak species are represented by fewer than 150 plants in ex situ collections") and were sampled using best practice recommendations (i.e., many maternal plants spread out in populations across much of the geographic range; see details of seed collection in Methods). Therefore, levels of genetic diversity of Q. havardii may exceed levels conserved for previously studied rarer taxa. To make a comparison between Q. havardii and rarer species, we use a recently published dataset of >3000 individuals of 11 threatened species (all less common than Q. havardii), and we apply the same molecular analysis techniques (Hoban, Callicrate, et al., 2020). We have four aims in this study:
1. Quantify the percent of the extant in situ genetic diversity of Q. havardii that is conserved ex situ, and calculate the minimum number of sampled individuals needed for ex situ conservation of 95% of the known species' alleles.
2. Compare the percent of genetic diversity conserved and the minimum sampling needed (from aim 1) for this widespread species to values recently documented for 11 rarer species.
3. Quantify genetic diversity and structure within each of eight garden populations of Q. havardii and determine if genetic diversity within a garden is a function of the number of plants.
4. Compare the percent of genetic diversity conserved to two nongenetic measures of ex situ conservation: percent of geographic range and percent of distinct ecological regions from which seed was sampled.

| Study species
Quercus havardii is currently listed as Endangered on the IUCN Red List due to ongoing decline in population size and increasing fragmentation and habitat loss resulting from human activities (e.g., changes in land use for grazing or oil and gas development, and deliberate eradication by landowners due to the poisonous effects on livestock and/or competition with crops for water; Kenny et al., 2020). Quercus havardii is typically restricted to deep sand dunes and sandy grasslands, an unusual habitat for oaks (Peterson & Boyd, 2000). Projected climate change resulting in a hotter, drier Southwest United States could challenge the persistence of this species.
Although uncommon and restricted to a very specific habitat, Q. havardii is an ecologically important species where it does occur despite its diminutive height (0.2-1 m). Its large seeds (i.e., acorns) are an important food resource for wildlife. Notably, this species also provides habitat for the lesser prairie chicken (Tympanuchus pallidicinctus) and the dunes sagebrush lizard (Sceloporus arenicolus), both listed as Vulnerable on the IUCN Red List and continuing to decline (Boyd & Bidwell, 2001). Its extensive root system can be up to 10 m deep, which can help stabilize sand dunes (Nellessen, 2004; Peterson & Boyd, 2000). One individual may consist of many short stems in a dense clump from one to a dozen or more meters across (Figure 1). Quercus havardii is wind pollinated, with seeds that are presumably dispersed by rodents, gravity, and water. Many oaks show masting behavior (periodic years of high seed production, e.g., every 3 or 5 years), though detailed observations have not been made for this species.
We note that Q. havardii has a disjunct range (Tucker, 1970)

| Ex situ tissue collection
Prior to 2016, to our knowledge, only one botanic garden maintained Quercus havardii in its living collections. To help conserve this species, a large seed collection effort took place in 2016 as part of the US Forest Service-American Public Gardens Association Tree Gene Conservation Partnership (Hoban & Duckett, 2016). The focus of this program is to establish genetically diverse living gene banks of US threatened tree species by collecting seeds from across each species' native range and then distributing the seeds to public gardens for safeguarding (https://www.publicgardens.org/programs/plant-collections-network/tree-gene-conservation-partnership). Following best practices to maximize genetic diversity (see Maschinski et al., 2019), few seeds per maternal plant were sampled (Figure S1; Table S1), while visiting as many populations as possible across the range. This collecting effort resulted in 1751 seeds from 30 populations from 67 maternal lines (i.e., mother plants) or accessions across the geographic range of Q. havardii (Hoban & Duckett, 2016), though numerous seeds were desiccated, immature, infested with weevils, or did not germinate. Note that seeds from a given maternal line will be at least half-sibling relatives. These seeds collected for ex situ conservation were the offspring of the individuals used in the in situ population genetic study described in the next section.
Seeds were distributed to other botanic gardens and sown in 2017 and 2018. Once seedlings had produced several leaves, leaves from 290 seedlings from 66 maternal trees representing 26 wild populations were collected from the botanical gardens (Figure S1; Table S1).

FIGURE 1 Quercus havardii ex situ at The Morton Arboretum (a); two example habitats from western in situ populations from the species' disjunct distribution (b and c); and an example from an in situ eastern population (d)

FIGURE 2 Map representation of the geographic range of Quercus havardii and populations that have been sampled for seed ex situ conservation

| In situ tissue collection
During the seed collection described above, a total of 667 mature Holmgren & Holmgren, 1991). In most locations, Q. havardii is the only oak species present due to its highly specific habitat; thus, the potential for hybridization is likely low in these populations.

| Molecular methods
DNA extraction was performed from approximately 0.035-0.060 g of leaf material using E.Z.N.A. Plant DNA DS kits (Omega Bio-tek, Inc.) with small modifications (Methods S1). DNA was quantified using a NanoDrop One spectrophotometer (Thermo Scientific) and diluted to approximately 10 ng/µL. Eleven microsatellite loci were chosen from several other oak species (Table S2). These loci are not known to be associated with functional genes and are extremely unlikely to be linked to each other, considering there are 12 oak chromosomes (Plomion et al., 2018) and linkage disequilibrium in wind-pollinated forest trees is generally low (Neale & Kremer, 2011). (2001) (Table S3). We did not check for null alleles in the ex situ dataset because we expect high departures from Hardy-Weinberg assumptions (e.g., there are numerous close relatives in the ex situ dataset). Locus statistics for each in situ population were calculated using the R package diveRsity v.1.9.90 (Keenan et al., 2013) and can be found in Table S4. Clones were identified with the R package poppr v.2.8.3 (Kamvar et al., 2014), and only one of each clone group was included in the further analysis;

Regardless, we calculated the Agapow and Burt
we do not investigate the influence of clones in this manuscript. https://github.com/smhoban/IMLS_Safeguarding), we calculated the percentage of in situ alleles that were present in the ex situ gardens by pooling all individuals held at gardens. We calculated this separately for alleles in categories based on their frequencies as follows: 'very common' (>10%), 'common' (5%-10%), 'low frequency' (1%-5%), and 'rare alleles' (<1%), as well as 'all alleles'. We focused on alleles as the measure of genetic conservation (as opposed to heterozygosity, for example) because they are the aspect of genetic variation on which natural selection can act (Brown & Hardner, 2000).
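The allele-capture calculation described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the genotype representation (a dict of locus name to a pair of alleles), the function names, and the handling of frequencies that fall exactly on a category boundary (5% counted as 'common', 1% as 'low frequency') are assumptions made for illustration, and the toy genotypes are invented.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {(locus, allele): frequency} for diploid genotypes.

    genotypes: list of dicts mapping locus name -> (allele1, allele2);
    a missing genotype is represented by None (hypothetical format).
    """
    counts, totals = Counter(), Counter()
    for individual in genotypes:
        for locus, pair in individual.items():
            if pair is None:
                continue
            for allele in pair:
                counts[(locus, allele)] += 1
                totals[locus] += 1
    return {key: n / totals[key[0]] for key, n in counts.items()}

def frequency_category(freq):
    """Assign an in situ allele frequency to one of the four categories."""
    if freq > 0.10:
        return "very common"
    if freq >= 0.05:   # boundary handling is an assumption
        return "common"
    if freq >= 0.01:   # boundary handling is an assumption
        return "low frequency"
    return "rare"

def percent_captured(wild, garden):
    """Percent of wild alleles also present in the pooled garden samples."""
    wild_freq = allele_frequencies(wild)
    garden_alleles = set(allele_frequencies(garden))
    found, total = Counter(), Counter()
    for key, freq in wild_freq.items():
        for cat in (frequency_category(freq), "all"):
            total[cat] += 1
            found[cat] += key in garden_alleles
    return {cat: 100.0 * found[cat] / total[cat] for cat in total}

# Invented toy data: one locus, 50 wild plants, 5 garden plants.
toy_wild = [{"locus1": (100, 100)}] * 49 + [{"locus1": (102, 102)}]
toy_garden = [{"locus1": (100, 100)}] * 5
print(percent_captured(toy_wild, toy_garden))
```

Categories are defined from in situ frequencies only, so an allele that is common in gardens but rare in the wild still counts as rare.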

We also assessed alleles conserved for the East and West regions separately (see also Zumwalde et al., 2021) to determine whether genetic diversity is better conserved from one region or the other. We calculated how many "West alleles" (alleles present in West populations, i.e., all alleles minus alleles private to the East) were captured in ex situ seedlings taken from the West, and how many "East alleles" were captured in ex situ seedlings taken from the East. We used a chi-square test in R to determine whether allele capture was significantly higher in the East than in the West.
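As an illustration of the regional comparison, the sketch below computes a Pearson chi-square statistic for a 2 x 2 table of captured versus missed alleles in each region. The counts are invented placeholders, not values from this study, and the function name is hypothetical. The sketch omits the Yates continuity correction, which R's chisq.test applies to 2 x 2 tables by default, so it will not necessarily reproduce the reported test.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows are regions (East, West); columns are alleles captured ex situ
    and alleles missed. Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Placeholder counts for illustration only.
east_captured, east_missed = 90, 10
west_captured, west_missed = 70, 30
stat, p = chi_square_2x2(east_captured, east_missed, west_captured, west_missed)
print(f"chi-square = {stat:.2f}, p = {p:.4g}")
```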
Second, we used simulated subsampling of wild populations. We used a previously described optimization approach to determine the minimum number of sampled individuals needed to achieve a given threshold of genetic diversity. This approach involved simulated subsampling of the entire in situ dataset for all possible sample sizes ranging from 1 to 667. This simulates collecting seeds from the wild in which a seed sampler selects plants randomly and takes one seed or cutting per plant; in contrast to the real ex situ dataset, this simulated ex situ collection will not have half-sibling families. In other words, the minimum sample size is truly the minimum and is based on ideal sampling. For each subsample, the percentage of alleles captured in the subsample was calculated. The first sample size at which the mean percentage of in situ alleles captured (averaged over 75,000 replicates) exceeded 95% was recorded as the minimum necessary sample size. Similar to previous works (Hoban, 2019; Hoban et al., 2018), we made this calculation using two different assumptions: that 'all alleles' in the in situ dataset are considered (i.e., full dataset) and that alleles present in two or fewer copies are dropped (i.e., reduced dataset). The reduced dataset essentially filters ultra-rare alleles (i.e., those that occur only once or twice in the dataset), which may be deleterious or a potential result of genotyping errors, while the full dataset assumes all alleles have potential value (see Discussion in Hoban, Callicrate, et al., 2020).
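The resampling procedure can be sketched as follows. This is an illustrative Python version under stated assumptions, not the published R code: genotypes are dicts of locus name to allele pair, min_copies=3 stands in for the reduced dataset (alleles with two or fewer copies dropped), and the default number of replicates is far smaller than the 75,000 used in the study so that the example runs quickly.

```python
import random

def minimum_sample_size(wild, threshold=0.95, replicates=200,
                        min_copies=1, seed=0):
    """Smallest sample size whose mean allele capture exceeds `threshold`.

    wild: list of dicts mapping locus -> (allele1, allele2) or None.
    min_copies: alleles with fewer copies are ignored (3 mimics the
    'reduced' dataset; 1 keeps all alleles).
    Returns None if no sample size exceeds the threshold.
    """
    rng = random.Random(seed)
    copies = {}
    for individual in wild:
        for locus, pair in individual.items():
            if pair is None:
                continue
            for allele in pair:
                copies[(locus, allele)] = copies.get((locus, allele), 0) + 1
    target = {key for key, n in copies.items() if n >= min_copies}
    if not target:
        raise ValueError("no alleles left after filtering")
    # Pre-compute the target alleles carried by each individual.
    carried = [
        {(locus, a) for locus, pair in ind.items() if pair for a in pair} & target
        for ind in wild
    ]
    for n in range(1, len(wild) + 1):
        mean_capture = 0.0
        for _ in range(replicates):
            sample = rng.sample(carried, n)  # one seed per randomly chosen plant
            mean_capture += len(set().union(*sample)) / len(target)
        if mean_capture / replicates > threshold:
            return n
    return None

# Invented toy population: four plants, each homozygous for a private allele.
toy = [{"locus1": (i, i)} for i in (100, 102, 104, 106)]
print(minimum_sample_size(toy))
```

Because each simulated collector takes one seed per randomly chosen plant, the result is a lower bound on what a real collection with sibling families would need.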
| Aim 2: Comparison of genetic diversity conserved, and minimum sampling recommended between Q. havardii and 11 rare species
We used an allele accumulation curve (i.e., the percentage of alleles captured for each sample size) to compare Q. havardii to 11 other taxa from a recent study (Hoban, Callicrate, et al., 2020): two palms, two cycads, three oaks, two magnolias, and two hibiscuses. These taxa have a smaller range size and numeric census size than Q. havardii, with most having at maximum a few thousand known plants in situ. We overlaid on this a logarithmic regression (using the "lm" and "predict" functions in R) between the number of individuals currently held ex situ and the percentage of alleles they conserve, established in this prior study of rare species. If Q. havardii falls below the fitted relationship for these 11 rare species, it indicates that less genetic diversity is captured than expected from the number of individuals.
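A logarithmic regression of this kind can be sketched in plain Python. The model form (percent alleles = a + b ln(number of plants)) is an assumption about how the "lm" fit was specified, the function names are hypothetical, and the rare-species data points below are invented placeholders. Only the Q. havardii values (290 plants, 79% of alleles in the reduced dataset) come from this study.

```python
import math

def fit_log_curve(n_plants, pct_alleles):
    """Ordinary least-squares fit of pct = a + b * ln(n). Returns (a, b)."""
    xs = [math.log(n) for n in n_plants]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(pct_alleles) / len(pct_alleles)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, pct_alleles))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predict_pct(a, b, n):
    """Percentage of alleles expected for a collection of n plants."""
    return a + b * math.log(n)

# Placeholder points standing in for the rare-species collections.
rare_n = [20, 50, 100, 200, 400]
rare_pct = [70.0, 80.0, 88.0, 94.0, 99.0]
a, b = fit_log_curve(rare_n, rare_pct)

# Q. havardii values reported in this study (reduced dataset).
observed_n, observed_pct = 290, 79.0
expected_pct = predict_pct(a, b, observed_n)
print("below the rare-species curve:", observed_pct < expected_pct)
```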
The minimum sample size for Q. havardii was also compared to the minimum sample size for the 11 rare species, for which the same resampling procedure was applied.

We also calculated FST using the R package hierfstat v.0.4.22 (Goudet, 2005) between each garden and the East and the West regions. Then, to visualize garden populations in relation to in situ regions, we performed a Discriminant Analysis of Principal Components (DAPC) using the "dapc" function in the R package adegenet (Jombart et al., 2010). DAPC is a multivariate method for identifying and visualizing genetic clusters and the relationships between them (Miller et al., 2020).

| Aim 4: Calculation of geographic and ecological diversity conserved
We build on geographic methods introduced in Beckman et al.

| RESULTS
Our microsatellite dataset had 2.6% missing data (Figure S5), and a total of 244 alleles were observed in situ and 186 alleles ex situ.
Detailed genetic summary statistics for each in situ population can be found in Zumwalde et al. (2021), who showed that patterns of differentiation from genetic, morphological, and environmental datasets primarily corresponded to the disjunction of populations from the eastern and western regions of the species' geographic range.
Additionally, Zumwalde et al. (2021) noted that western populations generally had higher levels of genetic diversity and lower relatedness when compared to eastern populations.

| Aim 1: How much genetic diversity is conserved ex situ, and what is the minimum sampling recommended?
For the reduced dataset (in which singletons and doubletons are dropped), we found that 79% of the overall species' alleles are conserved in the 290 ex situ seedlings, with 100% of 'very common' and 'common alleles,' 94% of 'low-frequency alleles', and 55% of 'rare alleles' captured (

| Aim 2: Comparison of genetic diversity conserved, and minimum sampling recommended between Q. havardii and 11 rare species
A lower percent of the genetic diversity of Q. havardii was shown to be conserved compared to an expected allele accumulation curve for 11 rare, long-lived species, including three rare Quercus species (Figure 3). Notably, the percent conserved for Q. havardii is below that for Quercus oglethorpensis, despite Q. havardii having twice the number of plants ex situ. Using simulated sampling and the 'reduced' dataset, we found that the number of samples needed to reach a minimum of 95% of the alleles in the 'all alleles' category was much greater for Q. havardii (245) than for the 11 rare species (mean of 56). The number of samples needed for Q. havardii for 'low-frequency' alleles was also greater: 73 compared to a mean of 56 for the other 11 species (Figure 4).

| Aim 3: Genetic diversity and structure among botanic garden populations and relationship to the number of individuals
The percent of alleles captured in each of the eight botanic gardens is shown in Figure 5. There was a clear increase in genetic

TABLE 1 Percentage of genetic diversity conserved ex situ, for each of five categories of alleles, for the East and West regions and overall values, using the reduced dataset (percent using the full dataset shown in parentheses). Columns: Number of samples; All alleles; Very common (>10%)

as might be expected for a widespread species. We, therefore, do not conclude there is a "best" value of k for this dataset, but we set k = 2 as it is biologically plausible (see also Zumwalde et al., 2021) and will allow comparison of the garden and regional in situ samples. This differentiation was also visible in a DAPC plot (Figure S7), suggesting that the genetic composition of garden populations is more similar to eastern populations.

| Aim 4: Calculation of geographic and ecological diversity conserved
Percentages reflecting conserved geographic area and ecoregion coverage are shown in Table 2. Generally, these percentages were lower than the estimates of genetic diversity conserved. The buffer size does affect the percentage considered conserved, with smaller buffer sizes resulting in a lower percentage. Depending on the region and buffer size, the geographic area percentage ranged from 11.32% to 42.29%, while ecological coverage percentages ranged from 50% to 90.91% for Ecological Level III and from 29.03% to 54.76% for Ecological Level IV (Table 2). In almost all cases, the percentages conserved were higher in the West than in the East (the one exception being Ecological Level III, 10 km buffer; Table 2). This trend is the opposite of the pattern found in the genetic diversity percentages. A PCA of environmental variables for the sampled sites visualizes this ecological coverage in two dimensions (Figure S10).

| DISCUSSION
Our results provide one of the first studies to quantify genetic con-  (Brown & Hardner, 2000;Guerrant et al., 2014;Maunder et al., 2004). We found that (1) a majority of Q. havardii genetic diversity is conserved, though the genetic diversity of the eastern region is better conserved in ex situ botanic garden collections than of the western region; (2) genetic diversity conservation of this widespread species is lower than for 11 previously studied rarer taxa; (3) genetic diversity within each garden is strongly related to the number of plants in the garden; and (4) the measures of geographic and ecological conservation (i.e., percent area and percent of ecoregions represented in seed collections) were lower than the direct assessment of genetic diversity conservation (i.e., percent alleles).
Our first main conclusion is that the majority of Q. havardii genetic diversity, at least using microsatellite alleles, is conserved (Table 1). Also, our results suggest that successful genetic conservation will require more ex situ individuals from species with large geographic ranges than from rare species (Figures 2 and 3). previously studied rare species (including 3 rare oaks). As one example, Q. havardii has a lower percentage of alleles conserved (79%) than the rare oak Q. oglethorpensis (94%), which has an estimated extent of occurrence of 130,000 km² compared to 300,000 km² estimated for Q. havardii (Kenny et al., 2020), though Q. oglethorpensis has half as many trees (145) ex situ. However, both species possess similar values for the conservation of 'low-frequency alleles' (94% and 97%, respectively). The low percentage conserved for Q. havardii, therefore, relates to the fact that the number of rare alleles will be higher when the range size and census size are larger, as noted in Hoban (2019) and by others (Brown & Hardner, 2000; Brown & Marshall, 1995).
One reason for the relatively low amount of genetic diversity conserved is that Q. havardii is a widespread species with numer- Interestingly, garden E has more alleles than garden F ( Figure 5, see also Tables S5 and S6), though only about half the number of individuals. Garden E has seeds from 10 populations (6 East, 4 West), while garden F has seeds from only 8 populations (7 East, 1 West), emphasizing that sampling from as many populations and regions as possible is important. We also observe that even small collections (e.g., 20 trees) have value for conserving 'common alleles,' and the total collection of all gardens together (i.e., the metacollection) contains more genetic diversity than any individual population (as suggested by Griffith et al., 2019).
Previous studies have recommended a range of minimum number of samples for conserving genetic diversity. While investigating Zamia lucayana and Z. decumbens (Zamiaceae), Griffith et al. (2017) found that a single accession (a group of seeds from one maternal plant) contained 24%-51% of the alleles, while the total collection captured 90%. For Leucothrinax morrisii (Arecaceae), Namoff et al.  Our study is one of the first comparisons of genetic and ecogeographic diversity conserved ex situ. We find that geographic and ecological measures of conservation success are typically much lower than genetic measures (e.g., the percentage of alleles conserved), with some exceptions for EPA Level III ecoregions.
This is not entirely unexpected as other work has shown that populations can strongly decline in geographic extent or abundance (Alsos et al., 2012) without substantial losses of genetic diversity. This is partly because genetic diversity is shared among populations through gene flow and shared ancestry, with especially high within-population diversity in trees (Petit & Hampe, 2006

| Caveats and other remarks
An important caveat of this study is that eleven microsatellites are limited in their resolution; the oak genome likely contains 30,000 to 80,000 protein coding genes (Plomion et al., 2018). Microsatellites are also typically non-coding DNA (though see Lind-Riehl et al., 2014) and are therefore unlikely to reflect adaptive genetic diversity. In addition, our estimates of genetic conservation are a snapshot in time, and they will change as seedlings in botanic gardens die and/or as new seed collections occur. Finally, our resampling technique chooses one sample (one seed or one cutting) per individual, while most seed samplers will realistically take multiple seeds per individual. As noted previously (Hoban, Callicrate, et al., 2020; Hoban et al., 2018), the minimum sample size to reach 95% of alleles is an absolute minimum, and seed samplers who sample multiple seeds per maternal plant should often aim to collect twice as much under realistic conditions (Hoban & Strand, 2015).
We note that the seedlings of Q. havardii in this study, like most ex situ collections, contain numerous half-siblings and possibly full-siblings. Previous work has suggested that relatedness will reduce the genetic diversity conserved, but the degree to which different sets of siblings (e.g., number of maternal families) will impact genetic conservation success requires further study, ideally using simulated data with many arrangements of family size.
Similar to previous work, we separate alleles into categories based on their frequencies. As expected, 'all' alleles, 'low frequency' alleles, and 'rare' alleles are harder to conserve compared to 'very common' alleles and 'common' alleles. It is not known if rare alleles are potentially advantageous, deleterious, or neutral.
The precautionary principle in conservation would argue for the capture of rare alleles to maintain genetic diversity as a potential resource for nature and for people, especially as environmental pressures change, including threats of new pests and diseases.
However, an opposing view is that ultra-rare alleles may be deleterious and should not be preserved (Brown & Kelly, 2020;Kardos & Shafer, 2018), though neutral nuclear microsatellites are unlikely to be linked to deleterious alleles. Consensus on the importance of rare alleles is needed to determine practical guidelines for sampling. A valuable area of future work will be to apply the methods we used to a dataset containing alleles that are known or putatively under selection.
We also note that genetic diversity is not the only concern of a collection. Another important need is having enough plants to start a new population, based on expected germination and survival rates (Cochrane et al., 2007;Hoban & Way, 2016

| Future directions
As next-generation sequencing costs continue to decrease and oak genomic resources increase (Plomion & Martin, 2020), future studies have an opportunity to focus on coding regions of DNA that ultimately determine phenotypes of importance. Because nuclear microsatellites do not assess adaptive variation, a future study could include genes with possible physiological or morphological importance (e.g., water use efficiency, stomatal density, leaf shape and size), or loci identified in gene-environment scans (Gugger et al., 2021), to analyze populations found in different ecoregions or areas of environmental space. This would allow comparison of genetic diversity captured in botanic gardens according to neutral genetic diversity, adaptive genetic diversity, and ecogeographic diversity. Another future direction is to establish common gardens of Q. havardii to test for local adaptation and predict response to climate change. It may be possible to identify which populations are more vulnerable or unlikely to tolerate climate change and to prioritize them for more seed collections (Borrell et al., 2020; Razgour et al., 2019).
While the botanic garden community is working together to conserve Q. havardii, there is still a need for further collections from the wild, particularly from western populations. In addition to increased sampling, Fant et al. (2016) and Wood et al. (2020) argue that arboreta and botanic gardens should take cues from the zoo community by improving shared databases of inventories for rare species, involving local communities in situ (or near in situ) to help conserve and sample the species, and sharing genetic material among gardens and herbaria for future studies.
Overall, our results show that more sampling is needed in widespread species and that multiple types of data can reveal gaps in the ex situ collection (e.g., low genetic diversity from the West, low ecogeographic diversity from the East). It is worthwhile and important to continue collecting additional seeds from additional locations for a larger ex situ metacollection of this species (which could occur across multiple years due to relatively infrequent mast years of large seed production in oaks). The data collected on such future efforts, including sampling across multiple years, will inform future seed-sampling expeditions to ensure that intraspecific diversity of threatened plants can be conserved.

ACKNOWLEDGEMENTS
We acknowledge funding from the USFS APGA Tree Gene

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
Data and code for the genetic analysis can be found here: https://github.com/smhoban/Qhavardii_ex_situ. Data and code for the environmental analysis can be found here: https://github.com/BZumwalde/Quercus_havardii_Safeguarding_genetic_diversity. Data and code for the geographic and ecological conservation analysis can be found here: https://github.com/esbeckman/Quercus_havardii_GeoEco_exsitu_conservation. All samples were collected and analyzed within the United States and the research described in the publication complies with relevant national laws implementing the Convention on Biological Diversity and Nagoya Protocol agreements.
Benefits generated: During sample collection in situ we met with numerous local stakeholders and discussed the project. A report on the collection was distributed to all who participated.
For sample collection ex situ we explained the project to the botanic gardens personnel involved. This report will be distributed to them upon completion. Their contributions are acknowledged.
The research addresses a priority concern of many botanic gardens, especially those safeguarding this species, and a concern of the IUCN Red List, and clear conservation recommendations are made. Finally, as described above, all data and code have been shared with the broader public via appropriate biological databases for reproducibility.