Macrogenetics reveals multifaceted influences of environmental variation on vertebrate population genetic diversity across the Americas

The broad scale distribution of population‐specific genetic diversity (GDP) across taxa remains understudied relative to species diversity gradients, despite its relevance for systematic conservation planning. We used nuclear DNA data collected from 3678 vertebrate populations across the Americas to assess the role of environmental and spatial variables in structuring the distribution of GDP, a key component of adaptive potential in the face of environmental change. We specifically assessed non‐linear trends for a metric of GDP, expected heterozygosity (HE), and found more evidence for spatial hotspots and cold spots in HE rather than a strict pattern with latitude. We also detected inconsistent relationships between HE and environmental variables, where only 11 of 30 environmental comparisons among taxa groups were statistically significant at the .05 level, and the shape of significant trends differed substantially across vertebrate groups. Only one of six taxonomic groups, freshwater fishes, consistently showed significant relationships between HE and most (four of five) environmental variables. The remaining groups had statistically significant relationships for either two (amphibians, reptiles), one (birds, mammals), or no variables (anadromous fishes). Our study highlights gaps in the theoretical foundation upon which macrogenetic predictions have been made thus far in the literature, as well as the nuances for assessing broad patterns in GDP among vertebrate groups. Overall, our results suggest a disconnect between patterns of species and genetic diversity, and underscores that large‐scale factors affecting genetic diversity may not be the same factors as those shaping taxonomic diversity. Thus, careful spatial and taxonomic‐specific considerations are needed for applying macrogenetics to conservation planning.


| INTRODUC TI ON
Ecologists have long sought to map and understand broad patterns in biodiversity distribution, with the latitudinal gradient in species diversity being one of the most studied phenomena in ecology and biogeography. Now with increased technology and data accumulation, intraspecific diversity can be studied in the same way as species diversity. Understanding the distribution of genetic diversity is important for providing insights on the spatial distribution of biodiversity, with implications for conservation planning (De Kort et al., 2021;Lawrence & Fraser, 2020;Leigh et al., 2021;Nielsen et al., 2022;Theodoridis et al., 2020). As a growing field, 'macrogenetics' -the study of genetic diversity at broad scales (Blanchet et al., 2017) -thus far has revealed that genetic diversity spatial patterns are weaker, loci specific (e.g., neutral microsatellites vs. adaptive genes), taxa-specific or in contrast to species diversity (Adams & Hadly, 2013;Habrich et al., 2021;Manel et al., 2020;Millette et al., 2020;Miraldo et al., 2016;Schmidt et al., 2022;Theodoridis et al., 2020;Yiming et al., 2021). The environmental factors that structure spatial patterns of macrogenetics remain of interest and are likely complex and interactive (De Kort et al., 2021).
Herein we focus on the potentially non-linear relationships that environmental factors may have with contemporary, broad scale patterns of nuclear genetic diversity within natural populations of vertebrate species, that is, population genetic diversity (GD P ). We chose to focus on non-linear effects to allow for greater flexibility in identifying relationships, whether linear, quadratic or otherwise polynomial. GD P is an important component of adaptive potential in the face of changing environments. Greater genetic diversity within populations increases the likelihood of potentially useful alleles for natural selection to act upon as the environment changes, thus increasing adaptive potential (Allendorf, 2017;Barrett & Schluter, 2008). Additionally, GD P can reliably represent genome-wide variation through the measurement of nuclear DNA variation (Ghalambor et al., 2007;Hurst & Jiggins, 2005;Jarne & Lagoda, 1996;Sgrò et al., 2011;Väli et al., 2008). Fortunately, such variation is often quantified at the population level using different metrics such as heterozygosity.
Several theories make predictions about the effect of climatic or environmental variables on GD P and the resulting spatial structuring that would occur from these effects. Some theory predicts higher GD P at low latitudes around the Equator due to the warmer temperatures and temporal climatic stability (De Kort et al., 2021;Hewitt, 2000;Schluter & Pennell, 2017). For example, higher mean temperatures are thought to increase mutation and evolution rates (Adams & Hadly, 2013;Allen et al., 2006;Gillooly et al., 2005;Rohde, 1992), which may favour the maintenance of higher standing GD P . Nevertheless, relationships between GD P and temperature variance over time may be more nuanced. For instance, regions that have experienced extreme temporal temperature variation can show reduced GD P due to bottlenecks (Comps et al., 2001;De Kort et al., 2021;Hewitt, 2000;Schluter & Pennell, 2017). Conversely, a great range in annual temperature may also favour higher standing GD P to deal with such unstable environments over relatively short time periods (e.g., years, decades) (Barrett & Schluter, 2008;Botero et al., 2014;Brennan et al., 2019). Given this, we might expect a nonmonotonic relationship between GD P and temperature variability, with GD P being higher in regions with either low or high temperature variability. Thus, a confounding effect between historical and contemporary ranges in temperature may exist, wherein regions that have experienced extreme temperature instability in the past may have reduced GD P overall, but regions with intermediate contemporary temperature ranges may show elevated GD P .
Other environmental factors potentially influencing broad scale GD P patterns include precipitation, productivity, and elevation. All else being equal, environments with higher precipitation and net primary productivity (e.g., at lower latitudes, Figure 1), may support larger population sizes in many taxa. Hence such environments could maintain GD P across species because larger populations are disproportionately less likely to lose genetic diversity through drift (Botero et al., 2014;Currie et al., 2004) (but see Brun et al., 2019;Thuiller et al., 2020 for discussion on phylogenetic diversity). This may lead to a positive relationship between GD P and precipitation or productivity, although the degree of linearity may depend on multiple factors, including habitat area and taxa-related dependencies that affect population size (Lawrence & Fraser, 2020). Elevation, conversely, can be an indicator for population isolation (De Kort et al., 2021;Ehinger et al., 2002;Polato et al., 2017) and may interact with other environmental variables to affect GD P . For example, high elevations are typically more variable, colder environments, which can act as physiological dispersal barriers in tropical regions (Ghalambor, 2006;Janzen, 1967), and likely support smaller population sizes that experience greater genetic drift, reducing GD P . Thus, while precipitation and productivity are expected to positively influence GD P (Brun et al., 2019;De Kort et al., 2021;Santini et al., 2018), elevation is expected to have a negative effect (Ghalambor, 2006;Janzen, 1967). Together, these effects could result in higher GD P at more productive latitudes and lower elevations, such as those within 20° of the Equator ( Figure 1). GD P has been infrequently studied in past macrogenetic works relative to studies based on repurposed individual-based sequence data from mitochondrial DNA (Millette et al., 2020;Miraldo et al., 2016;Theodoridis et al., 2020;Theodoridis et al., 2021;vs. De Kort et al., 2021;Schmidt et al., 2020;Yiming et al., 2021). Individual-based mitochondrial DNA studies found some support for a decrease in genetic diversity away from the Equator across species and taxonomic differences (e.g., stronger trends in mammals) (Adams & Hadly, 2013;Gratton et al., 2017;Martin & Mckay, 2004;Millette et al., 2020;Miraldo et al., 2016;Yiming et al., 2021). However, inconsistencies between archiving practices in mitochondrial DNA repositories have resulted in biased measures of genetic diversity when repurposed for macrogenetic research Paz-Vinas et al., 2021).
Furthermore, mitochondrial DNA variation itself has shortcomings for use in macrogenetic research; for example, it may not correspond with nuclear or genome-wide genetic diversity, and it can lack resolution for demarcating populations (Allendorf, 2017;Bazin et al., 2006;Paz-Vinas et al., 2021). Therefore, compared to nuclear DNA, mitochondrial DNA may not adequately portray broad scale genetic diversity patterns of import to conservation, such as hotspots or cold spots (Schmidt et al., 2022), that is, regions harbouring populations from different species with, respectively, high or low levels of genetic diversity for adapting to environmental change.
To determine how environmental variables influence broad scale patterns in nuclear GD P , we utilised a large database containing population-level, nuclear genetic data in vertebrate species across the American continents (Lawrence et al., 2019).
This database is particularly suitable for macrogenetic research because it is generated from a systematic literature review and reports measures of putatively neutral genetic diversity for each geo-referenced, genetically distinct population (Lawrence et al., 2019). The database also includes key data from the original empirical studies that can be incorporated into macrogenetic analyses, such as sample sizes and GD P metrics, to reduce bias and better account for nuances not considered in past macrogenetic works .
Our macrogenetic study specifically used microsatellite DNA data from 3678 vertebrate populations and generalised additive mixed models to investigate environmental drivers of GD P based on expected heterozygosity (H E ) as the metric. Our study is the first to account for possible nuances in GD P by assessing non-linear relationships between H E , latitude, and environmental factors, as raised in a recent conceptual synthesis (Lawrence & Fraser, 2020).
We addressed two specific objectives: (1) Investigate environmental and spatial relationships by modelling H E as a function of the following variables, while also accounting for sample size: latitude, longitude, mean annual temperature, annual precipitation, temperature annual range, net primary productivity and elevation. (2) Assess taxa-specific patterns to account for differences among taxonomic groups, as different clade history and life histories may lead to differences in genetic and/or latitudinal patterns among groups (Adams & Hadly, 2013;De Kort et al., 2021;DeWoody & Avise, 2000;Garnier & Lafontaine, 2022;Martinez et al., 2018;Miraldo et al., 2016). F I G U R E 1 Estimated groupwise spatial patterns in mean H E from HGAMs, plotted on a Mercator-projected map of North and South America at 0.33° by 0.33° scale. All estimates further than ~1° away from any observed population of a given group were excluded.

| Data acquisition
We obtained georeferenced vertebrate population genetic data from MacroPopGen that contains microsatellite-based data for H E (Lawrence et al., 2019). To define H E more specifically, H E represents the expected proportion of a population that is heterozygous based on allele frequencies (Allendorf, 1986). Note that H E values are based on the study from which they came because raw data were unavailable for many studies in the literature. We acknowledge the risk of ascertainment bias that could occur from aggregating metrics across studies in this way, however the database authors found that neither the number of microsatellite loci nor loci origin (i.e., the species a locus was developed from) affected genetic diversity metrics (Lawrence et al., 2019). As this suggests low risk of ascertainment bias, we selected data from genera that had a minimum of 10 populations and minimum population sample sizes of 20 to ensure each group had sufficient data to accurately represent populations.
Populations in MacroPopGen are considered genetically distinct by considering the statistical significance (at the α = .05 level) of pairwise F ST values above or below a threshold of 0.02, based on Waples and Gaggiotti (2006). Pairwise statistical significance was previously assessed to avoid aggregating significantly different populations or, conversely, arbitrarily defining different spatial samples as separate populations when, ultimately, they were determined to originate from the same population. Overall, then, pairwise comparisons >0.02 were considered genetically distinct, but statistical significance was checked when F ST was <0.02 and significance implied two separate populations despite a lower F ST (Lawrence et al., 2019).
The 0.02 threshold as a guide is a good balance for the taxa present in the database as it accounts for species with considerable gene flow (e.g., some fishes and bird species). Waples and Gaggiotti (2006) identified significant population differentiation as low as F ST ≈ .01, so .02 is reasonable for a database that does not include marine species which tend to have higher migration rates than terrestrial or freshwater species. Finally, we note that, on average, only 7% of the populations used in the data subset for our study were from aggregated populations, i.e., where multiple sampling sites were determined to be genetically indistinguishable, and hence pooled (Tables S1 and S2 for total populations per species). Additionally, excluding these populations from the dataset did not change the majority of the results, so we opted to retain them in the data subset for modelling.
For our H E dataset, our filtering criteria yielded 3678 genetically distinct populations across the American continents from 245 species and 547 studies published between 1994 and 2017. Each population has information on H E , sample size, latitude, longitude, and taxonomic grouping (i.e., Class, Family, Genus, Species). We obtained this information for six vertebrate groups: anadromous and freshwater fishes, amphibians, birds, mammals, and reptiles (we excluded brackish and catadromous fish species in MacroPopGen due to their low population numbers, 12 and 29 respectively). We split fish groups into anadromous and freshwater because previous work has shown that anadromous fishes tend to show distinct genetic patterns from freshwater fishes, and the groups were already indicated in the database (DeWoody & Avise, 2000;Martinez et al., 2018).

| Environmental variables
To test the influence of environmental variables on H E , we firstly ex- To obtain productivity data, we extracted raster data of net primary productivity (units of elemental carbon ×10e −11 ) for each population point (Imhoff et al., 2004;Imhoff & Bounoua, 2006). Correlations between environmental, spatial and genetic diversity variables are found in Figure S1.

| Spatial patterns in genetic diversity and environmental variables
To inspect broad spatial patterns and identify group-specific hotspots and cold spots in genetic diversity, where average H E was higher or lower than predicted based solely on environmental predictors, we estimated spatial variation in mean H E by fitting with a quadratic penalty on basis function coefficients to avoid over-fitting relationships, by penalising overly wiggly functions (Wood, 2017). HGAMs extend the GAM concept by allowing estimated non-linear functions relating predictors to outcomes to vary among groups (Pedersen et al., 2019). We used a spherical spline smoother (Wood, 2017) with 50 basis functions per group to model group-specific spatial variation in mean H E , and modelled population-specific variation in H E with a beta distribution with a logit link function as H E is a continuous variable bound between zero and one (Wood, 2017). Model penalties were estimated using Fast Restricted Maximum Likelihood, using the 'bam' function in the mgcv package (v1.8-31; Wood et al., 2015) in the R programming language (v4.2.2; R Core Team, 2022). Estimated groupwise spatial patterns in mean H E were visualised by plotting them on a Mercator-project map of North and South America at 0.33° by 0.33° scale, excluding all estimates further than ~1° away from any observed population of a given group.
To explore the temporal range within the data, we summarised the average, minimum and maximum year a study was published for each taxonomic group in Table S3. We used year of publication because year of sampling was not available in the database due to inconsistent reporting in the literature.

| Model structure and selection
We used HGAMs (Pedersen et al., 2019) to determine the effect of environmental, taxonomic, and spatial factors on H E using the bam function in R package mgcv v1.8-31 (Wood, 2017). We used the bam function as it has improved computational performance on large datasets (Wood et al., 2015). HGAMs allowed us to be flexible in identifying relationships between H E and the environmental predictor variables, since there is no a priori reason to assume that average H E varied linearly with any given predictor and the penalty on the wiggliness of each function controls overfitting (Pedersen et al., 2019). The complexity of the estimated curve for each predictor can be measured by the effective degrees of freedom, with a higher effective degree of freedom corresponding to a wigglier estimated function. This ranges from 0 (for no relationship) to a maximum value of the number of basis functions used. Finally, HGAMs allow us to account for residual spatial autocorrelation in average H E directly in the model via spatial smoothers.
We fit an HGAM with population-specific H E as the response variable, modelled with a Beta distribution with a logit link function using the betar family in mgcv (Wood, 2017). We modelled the relationships between all environmental predictors and H E with univariate P-splines with penalties on the squared values of the first and second derivatives of each curve (bs = 'ps' in mgcv), with five basis functions per smoother. As genetic diversity patterns can vary among taxonomic levels (Adams & Hadly, 2013;Hirao et al., 2017), we fitted separate taxonomic level smoothers with taxa-specific penalties for each environmental variable (i.e., a model I HGAM; Pedersen et al., 2019). We also included vertebrate group as a random intercept effect, to account for taxa-specific differences in average H E across the study area. We included group-specific smooths for sample size to account for sample size effects on H E for different taxonomic groups. Finally, we included study as a random effect (RefID) to account for variation between studies.
To assess residual spatial patterning in H E , we included a spatial smooth for longitude and latitude using a spherical spline smoother (Wood, 2017) that accounts for true (rather than projected) distances between points on a sphere. This spatial smoother allowed us to estimate conditional hotspots and cold spots for genetic diversity.
This smoother did not vary with taxonomic grouping here as there was not enough geographic coverage to make accurate predictions for each vertebrate group. By pooling taxa, we were able to borrow information across groups, and make broader predictions for American continental genetic diversity. This limitation in the data only further emphasises the need for broad sampling of genetic diversity data to make accurate predictions and analyses in future macrogenetic studies.
We applied selection penalties to the model by setting select equal to TRUE. Selection penalties allow for the bam function to shrink a smoother (i.e., variable) to zero effective degrees of freedom, effectively removing it from the model. This incorporates model selection (i.e., choosing relevant variables for prediction) into the model estimation process (Wood, 2017;Zuur et al., 2009).
The final model form was:

| Spatial patterns in genetic diversity and environmental variables
We mapped the distribution of genetically distinct vertebrate populations and their mean H E for each taxonomic group, finding inconsistent patterns in spatial clustering for most groups. Amphibians and freshwater fishes were the only groups that showed some indication of mean H E hotspots or cold spots, both having lower H E on the western coast of North America. After data filtering, there were very few bird populations, so interpretations cannot be made with high confidence for this group. Conversely, spatial patterns for the environmental variables follow well known patterns and can be found in Figure 2.

| Model selection & effect of variables
Model results from the HGAM assessing the combined effects of environmental variables on H E are presented in Table 1  where H E for all groups is higher or lower than expected given just the environmental predictors in our model. The map in Figure 3a, shows some evidence for regions of higher than expected and lower than expected H E . These spots are largely congruent with the plots in Figure 1, where regions of lower genetic diversity than expected occurred on the south-western coast and eastern coasts of North America. Regions of higher-than-expected genetic diversity occurred in internal South America. Figure 3b shows the taxon-specific random effect, indicating that freshwater fishes, amphibians, and birds had lower than average H E , whereas mammals, reptiles, and anadromous fishes had higher than average H E . Interestingly, the groups with lower than average H E tended to have more significant relationships with environmental variables (see below; Figure 4).
All five environmental predictors had significant statistical relationships with mean H E for at least one vertebrate group (Figure 4).
However, no variable was a significant predictor for all groups. The strongest H E -environment relationships occurred for freshwater fishes, amphibians, and reptiles (5, 2 and 2 variables, respectively), were less common in mammals and birds (both only one variable), and were not detected in anadromous fishes. The only significant predictor of H E for mammals was temperature annual range, net primary productivity for birds (Figure 4b,d), and no environmental variables were significant predictors of diversity in anadromous fishes ( Figure S2).
The shape of the environment-diversity relationship varied strongly among groups even when a given variable did predict diversity for multiple groups. In freshwater fishes and reptiles, average H E increased with temperature, whereas average H E for amphibians peaked around 10°C (Figure 4a). Freshwater fishes showed opposing patterns with temperature annual range relative to mammals and reptiles, where H E increased above 20°C for mammals and reptiles but was highest below this temperature in freshwater fishes

| DISCUSS ION
Across the Americas, our study on vertebrates revealed continentscale spatial patterns in population-specific H E , with no detectable shared patterns in continental-scale diversity among vertebrate groups. We did not find any evidence for strong latitudinal gradients for population-specific H E for any of the vertebrate groups in our dataset. Amphibians and freshwater fishes showed the strongest spatial patterning in H E , as well as the strongest relationships between environmental variables and H E (Figures 1 and 4; Table 1).
The lack of significant patterns for many vertebrate groups may suggest that neutral genetic diversity is not as tightly correlated with environmental variables as species diversity tends to be, pointing to a disconnect between the two diversity metrics. This disconnect suggests a gap in the theoretical foundation upon which macrogenetic predictions have been made thus far in the literature, and that large-scale factors affecting genetic diversity may not be the same factors as those shaping taxonomic diversity.
As with previous literature (De Kort et al., 2021;Schmidt et al., 2022), our results show that the latitudinal distribution of nuclear GD P is less clear than species diversity or previously used mitochondrial DNA data (Miraldo et al., 2016;see also Leigh et al., 2021). No group showed a clear latitudinal gradient in GD P .
Instead, we found spatially varying hotspots and cold spots in diversity, even after accounting for predictive effects of environmental predictors on group-specific diversity patterns. The main hotspot was internal South America, whereas North American coasts, particularly the eastern and south-western coasts, were cold spots for H E , congruent with results on North American mammals from Schmidt et al. (2022). The location of the main hotspot also appears to be consistent with spatial patterns of where environmental variables often have highest values (Figures 1 and 2).
For example, annual precipitation and net primary productivity both have some of their highest values in north-internal South America, whereas temperature annual range is lower in this region ( Figure 2). It is important to remember that the hotspots and cold spots identified in Figure 3a represents residual hotspots and cold spots. This means that for a given set of longitude-latitude coordinates, there is more (or less for cold spots) genetic diversity F I G U R E 3 Residual spatial and taxa-specific differences in average H E . (a) Mercator projection of estimated spatial smoother of residual variation in average H E remaining after conditioning on environmental effects. Colours indicate hotspots and cold spots across vertebrate groups where a given area is more/less genetically diverse than otherwise expected for that area. (b) Estimated group-specific deviations in logit-transformed average H E . than otherwise expected. Consideration of the other environmental relationships can provide further insight on why we detected this pattern.
The location of the residual H E hotspot may suggest some indirect latitudinal patterns due to correlations between latitude and the environment. However, it is critical to consider differences among taxa as well as the combined effects of each environmental variable regardless of their latitudinal distribution. This is where our usage of HGAMs instead of linear regressions shows its advantage -our models were able to identify how each variable contributes nonlinearly to the spatial structuring of H E for each taxonomic group.
Surprisingly, only one group showed consistent significant relationships with environmental variables, freshwater fishes (n = 4 relationships). Perhaps due to their close affiliation with and dependency on freshwater environments, which are often fragmented, H E in these groups may be more sensitive to changes in environmental variables.
In fact, freshwater fishes also had some of the lowest mean H E values ( Figure 3b; Table S4), followed by amphibians and birds -both of which showed significant environmental relationships (n = 2 and 1 relationships respectively). Conversely, anadromous fishes had the highest mean H E and did not show any significant relationships with environmental variables. This may suggest that groups with lower genetic diversity may be more likely to show environmental relationships. However, we note that there were very few bird populations (n = 47 populations; Figure 3; Table S1) relative to amphibian (n = 405 populations) and freshwater fishes (n = 1338 populations), so this potential trend should be interpreted with caution. Overall, multiple drivers, including taxonomic-specific influences of environmental variation and differing mean levels of GD P , may affect how GD P is affected by environmental variation across broad scales.
We predicted that H E would increase with mean annual temperature but have a U-shaped relationship with temperature annual range. The relationships we estimated were not generally consistent with these predictions, nor was there any clear single relationship between environmental variables and population genetic diversity that held across groups. Prior studies have argued that higher temperatures might increase metabolic and mutation rates in ectotherms (Allen et al., 2006;Brown et al., 2004;De Kort et al., 2021;Rohde, 1992;Schluter & Pennell, 2017), but not in endotherms as they can maintain a constant body temperature and thus metabolic rate. This is consistent with the estimated increase in population genetic diversity for freshwater fish and reptiles, and the lack of any consistent relationship between H E and temperature for mammals and birds, but it is not consistent with the unimodal relationship between temperature and H E observed in amphibians (Figure 4a).
We also detected a general increase in H E with increasing temperature annual range in three groups (freshwater fish, mammals and reptiles). The trend was in line with our predictions for freshwater fishes, but not for mammals and reptiles because there was higher H E at higher temperature annual ranges but not at the lower F I G U R E 4 Partial regression effects of environmental variables on expected heterozygosity (H E ) for vertebrate species (TaxaClass) across the Americas. Partial effects (y axis) are measured on the logit scale. Variables were fitted by p-spline smoothers and include (a) mean annual temperature (MAT, °C), (b) total annual range in temperature (TAR, °C), (c) annual precipitation (mm/year), (d) net primary productivity (NPP, units of elemental carbon ×10e −11 ), (e) elevation (m). Bands represent 95% confidence intervals for estimated functions.
end of temperature annual ranges as well (i.e., not the U-shape we expected). This suggests that more climatically stable environments may not always harbour more GD P across different taxonomic groups as previous works suggested (De Kort et al., 2021) and that, in general, the relationship between GD P and temperature variance remains elusive. Factors affecting local population dynamics (e.g. isolation, gene flow, population size) may be more relevant overall for shaping neutral GD P, like H E , but future macrogenetic investigations would benefit from considering how temperature might interact with other environmental variables to affect neutral GD P.
Across vertebrate groups, we detected only a few significant relationships between H E and the other three environmental variables examined. This was despite the expectation from results of previous works for largely positive relationships with two variables (annual precipitation, net primary productivity) and a negative relationship with elevation (De Kort et al., 2021;Ehinger et al., 2002;Polato et al., 2017). Where significant relationships with H E were observed in our study, these were generally inconsistent with expectations.
For example, for both annual precipitation and net primary productivity, we detected generally negative relationships with H E for all relevant groups (freshwater fishes, amphibians, birds). This result was counter-intuitive from our expectations and previous works' results (De Kort et al., 2021), which may suggest that higher productivity in tropical areas does not necessarily lead to larger population sizes and hence higher H E (Lawrence & Fraser, 2020). Instead, smaller-scale variation in other abiotic and/or other biotic factors such as variation in habitat connectivity may play a more important role in shaping GD P than large-scale environmental variation.
Although there was a significant trend for elevation in freshwater fishes, the wide confidence interval at the upper end of the curve suggests there is no evidence for an increase in H E with elevation.
As we can only make inferences from the lower part of the curve, our results may suggest that H E of freshwater fishes declines with elevation, at least up to 1000 m. This general pattern may support our prediction, but we emphasise that this result is speculative due to the very low-information curve.
While the size of samples used to estimate population H E varied spatially and with vertebrate group (Figure 1), we did not find any evidence that the lack of a latitudinal gradient was due to confounding sampling effects. Average H E was not significantly associated with sample size in five of six vertebrate groups assessed, where H E decreased for freshwater fishes ( Figure S2f). We expected positive relationships given that the more individuals are sampled from a population, the greater the likelihood of sampling more alleles, thus increasing the chance of higher levels of heterozygosity being detected. One reason that H E decreased with sample size for freshwater fishes may be due by chance to having a few species with very large sample sizes in the dataset but with relatively low H E . For example, one species had a sample size of 1300 individuals, but relatively low genetic diversity (0.345). Nevertheless, models tested after removing sample sizes >500 individuals (n = 74 populations total, n = 4 for populations of freshwater fishes), a statistically negative relationship between H E and sample sizes in freshwater fishes still remained ( Figure S3).
Although microsatellite loci can estimate genome-wide variation well (Mittell et al., 2015), we acknowledge that caveats exist with comparing genetic diversity across taxa using microsatellite data. For example, microsatellite loci are developed independently within a given species; the selection of loci in population studies can potentially generate biases in the estimation of GD P . Nevertheless, previous work (Lawrence et al., 2019) analysed the dataset used in this study and did not find that the numbers of microsatellite loci in original studies were associated with any bias in the estimates of genetic diversity. This is consistent with other work that suggests large-scale assessments of microsatellite data are not significantly affected by such biases (Willoughby et al., 2015). Our results on the distribution of H E still represents an important aspect of neutral GD P with data that spans multiple decades. This population-specific genetic diversity, rather than aggregates of potentially unrelated DNA sequences, (1) hotspots of neutral GD P may not exist in the same way that they do for species diversity, pointing to an uncoupling between the two diversity metrics (also noted in Lawrence & Fraser, 2020;Leigh et al., 2021;Schmidt et al., 2022).
(2) There is a great deal of variation between taxonomic groups, where groups can show virtually opposite relationships between neutral GD P and environmental variables.
(3) The narrow range of variation in neutral GD P compared to species diversity (Lawrence & Fraser, 2020;Leffler et al., 2012) might blur or impede clear relationships between neutral GD P and environmental variables in some instances. (4) There is a theoretical gap where results do not match expectations, suggesting more theorems need to be developed to explain the distribution of genetic diversity at broad scales.
These points have important implications for biodiversity conservation and for how macrogenetic assessments might be applied to conservation. For example, the uncoupling between species and genetic diversity implies that conserving regions of high species diversity may not simultaneously conserve regions of high GD P (Paz- Vinas et al., 2018;Schmidt et al., 2022). Furthermore, large universal reservoirs of genetic diversity (e.g. in the tropics) to 'rescue' populations may not exist. The extent of taxonomic variation observed here in GD P -environment relationships at large (continental) scales suggests that the utility of incorporating a macrogenetic approach into conservation planning or protected area design might be ultimately more applicable in some taxonomic groups than others. Similarly, such taxonomic variation in GD P -environment relationships may necessitate that conservation planning apply taxa-specific solutions over generalised ones, if the conservation goal is to maximise genetic diversity within specific groups of conservation concern at a specific scale (e.g., local population, community, and region).
Future macrogenetic works should strive to further the theoretical gap as well as incorporate other aspects of intraspecific diversity, such as population richness (Lawrence & Fraser, 2020), adaptive genetic diversity (Lawrence & Fraser, 2020;Leigh et al., 2021;Stanley et al., 2018), and genetic load (Mathur & DeWoody, 2021) to address our points above. Especially adaptive genetic diversity may be more tightly correlated to environmental variables than neutral genetic diversity. Considering the effects of biotic interactions, habitat connectivity, and population dynamics, which can have genetic consequences (Lawrence & Fraser, 2020;Mittelbach et al., 2007), may also be informative when assessing drivers of GD P patterns. Regardless, there is intrinsic value in managing intraspecific diversity to ensure species can adapt to and survive future environmental change (Barrett & Schluter, 2008;Bernatchez, 2016;Rey et al., 2016). In this regard, the heterogenous and taxonomic-dependent relationships observed between genetic diversity and environment variables in our study reemphasise the need for a prudent approach to macrogenetic assessments for conservation.

ACK N O WLE D G E M ENTS
We would like to thank J.-M. Matte and H. Ganz for input on the writing. We also thank the associate editor and anonymous reviewers for their insightful comments which have improved the writing and statistical approach. Finally, we thank our funding sources which supported this work: NSERC Discovery Grant, QCBS Seed Grant and a Concordia University Research Chair.

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare no conflict of interest.