Phylogenetic signal of sub‐arctic beetle communities

Abstract Postglacial dispersal and colonization processes have shaped community patterns in sub‐Arctic regions such as Churchill, Manitoba, and Canada. This study investigates evolutionary community structure within the beetle (Coleoptera) families of Churchill and tests whether biological traits have played a role in governing colonization patterns from refugial and southerly geographic regions. This study quantifies sub‐Arctic beetle phylogenetic community structure for each family using the net relatedness index (NRI) and nearest taxon index (NTI), calculated using publicly available data from the Barcode of Life Data Systems (BOLD); compares patterns across families with different traits (habitat, diet) using standard statistical analysis (ANOVA) as well as phylogenetic generalized least squares (PGLS) using a family‐level beetle phylogeny obtained from the literature; and compares community structure in Churchill with a region in southern Canada (Guelph, Ontario). These analyses were also repeated at a genus level. The dominant pattern detected in our study was that aquatic families were much better represented in Churchill compared to terrestrial families, when compared against richness sampled from across Canada and Alaska. Individually, most families showed significant phylogenetic clustering in Churchill, likely due to the strong environmental filtering present in Arctic environments. There was no significant difference in phylogenetic structure between Churchill and Guelph but with a trend toward stronger clustering in the North. Fungivores were significantly more overdispersed than other feeding modes, predators were significantly more clustered, and aquatic families showed significantly stronger clustering compared to terrestrial. This study contributes to our understanding of the traits and processes structuring insect biodiversity and macroecological trends in the sub‐Arctic.


| INTRODUC TI ON
The Arctic is a land of change (Pielou, 1995). Glaciation changed, or largely eliminated, the communities inhabiting sub-Arctic areas such as Churchill, Manitoba, and Canada (Pielou, 1995). This region was very recently deglaciated (approximately 8K years ago), and the species composition of this area was formed by postglaciation colonization, with species primarily coming from the south and from the Beringian glacial refuge (Brandson, 2011;Pielou, 1995;Woodcock et al., 2013). While biodiversity in general tends to decrease with latitude, Arctic environments provide a diverse range of habitats and niches for life (Danks, 1992;Woodcock et al., 2013). As the climate shifts, these communities and habitats are experiencing rapid changes; this may be due to increasing temperature, melting sea ice, increased greenery, changing nutrient levels, or invading species (Walseng et al., 2018). Important questions remain about Arctic biodiversity, such as what species and traits make up Arctic communities, where did they colonize from, what patterns exist in their community structure, and how will these patterns shift in the future? With ongoing climate change, it is important to understand the traits of Arctic and sub-Arctic species, as well as to predict how their geographic ranges and community structure may shift in the future.
Investigating evolutionary community structure can help us understand the relationships among species in Arctic communities and their distribution patterns. Phylogenetic community structure metrics are used to quantify the relatedness among cohabiting species against patterns in a broader source community (Boyle & Adamowicz, 2015;Emerson et al., 2011;Kraft et al., 2007;Mayfield & Levine, 2010;Smith et al., 2014;Webb, 2000;Webb et al., 2002).
Are the species found in a local community more closely related than those in a broader community? What does this tell us about the mechanisms underlying their relationships and distributions?
In order to reconstruct and understand the phylogenetic relationships among species, it is beneficial to analyze DNA sequence data, which is a rich source of data for inferring relationships (Hillis et al., 1996). DNA barcodes are standardized DNA sequences that are used for specimen identification and species discovery (Hebert et al., 2003;Hebert & Gregory, 2005;Hubert & Hanner, 2015), and which also harbor phylogenetic signal to resolve relationships among closely related species (Boyle & Adamowicz, 2015;Smith et al., 2014;Wilson, 2010Wilson, , 2011. The barcode most commonly used for animals is an approximately 658 base pair region of cytochrome c oxidase subunit I (COI), a mitochondrial gene (Adamowicz, 2015;Hebert et al., 2003). DNA barcoding allows for data to be readily available to other scientists through data banks like the Barcode of Life Data Systems (BOLD), which contains a large collection of geo-referenced specimens from locations around the world (Ratnasingham & Hebert, 2007). This study leverages publicly available, geo-referenced sequence data for beetles from BOLD, combined with a published multi-gene backbone phylogeny (Zhang et al., 2018), to combine the merits of both approaches for community phylogenetics (Boyle & Adamowicz, 2015;Smith et al., 2014).
Various patterns can occur in phylogenetic community structure, including patterns of clustering, overdispersal, or random (Webb, 2000;Webb et al., 2002). A clustered pattern occurs when closely related species are found together more often than expected by chance, often caused by environmental filtering (Figure 1a) (Emerson et al., 2011;Kraft et al., 2007;Smith et al., 2014;Weiher et al., 2011). In this case, cohabiting species typically share the traits needed to survive in a given environment and are therefore found in the same region, while more distantly related species that lack these traits are excluded. Overdispersion occurs when closely related species cohabit in the same local community less than is expected (Figure 1b) (Emerson et al., 2011;Kraft et al., 2007;Mayfield & Levine, 2010;Weiher et al., 2011). This is often interpreted as evidence for competitive exclusion, whereby closely related species compete for the same resource, and this results in one species being forced out of the environment or into a different niche (Emerson et al., 2011;Kraft et al., 2007;Weiher et al., 2011). However, it is difficult to draw conclusions about mechanisms and the causes of these patterns based on the phylogenetic patterns alone. Mayfield and Levine (2010) suggest that competitive exclusion can also cause clustering. If competitive ability is phylogenetically clustered and is more important for surviving in the environment than niche differences, we can expect competitive exclusion to cause clustering rather than overdispersion (Mayfield & Levine, 2010). In order to draw conclusions about mechanisms, it may be beneficial to examine traits rather than community phylogenetic patterns alone.
There are various environmental and biotic factors that may influence the phylogenetic structure of communities, and these may F I G U R E 1 Phylogenetic trees demonstrating phylogenetic community structure patterns. Each habitat or geographic region is shown by a different colour and shape. (a) Pattern a shows a clustering pattern, where closely related species share the same region. (b) Pattern b shows an overdispersed pattern, where closely related species inhabit different regions or environments change with latitude. Factors such as the strength of competition and environmental filtering change across latitude. In northern environments, the climate and environmental factors can be important for determining species assemblages (Ernst & Buddle, 2015).
Dispersal ability of various taxonomic groups, geographic barriers, and stochasticity are also expected to play a role in the recolonization of formerly glaciated northern regions.
The biological traits of the species within a community, such as diet or lifestyle, can also affect the phylogenetic structure (Mayfield & Levine, 2010). For example, Poulin et al. (2011) found that closely related parasitic species are found together in local communities more than expected, likely due to closely related species having similar hosts. If these hosts are clustered geographically, we can expect the same of the parasites (Eagalle & Smith, 2017;Poulin et al., 2011).
Similarly, Vamosi and Vamosi (2007) discussed the effects of an aquatic lifestyle on community structure, with dytiscid beetle communities in the lakes of Alberta showing phylogenetic clustering. This may have been caused by a decrease in the importance of competition and an increase in environmental filtering in aquatic systems relative to terrestrial (Vamosi & Vamosi, 2007). In order to survive in aquatic environments, species need to have a certain set of physiological tolerances, and environmental factors such as salinity and pH influence the diversity (Heino et al., 2016) and composition of species found in the environment (Vamosi & Vamosi, 2007). However, different processes interact to determine species survival and co-existence, and it may be difficult to pinpoint one cause or mechanism (Peres-Neto et al., 2012). Across these varied examples, the lifestyles and characteristics of the species influence the community structure.
While prior studies have investigated clustering patterns and community structure within specific taxa and locations, few have compared these patterns across taxa or investigated how community structure is related to traits (Kraft et al., 2007;Poulin et al., 2011;Vamosi & Vamosi, 2007;Weiher et al., 2011). In this study, we investigate the community composition of a sub-Arctic region and seek to determine if phylogenetically related species are more likely to have colonized this area or if the community composition is random or overdispersed in relation to phylogeny. We investigate the patterns that occur in phylogenetic community structure at a species level across taxa and traits and investigate the phylogenetic relatedness of species inhabiting the sub-Arctic site of Churchill, Manitoba using northern North America as the regional species pool. We also seek to determine if biological traits (habitat, diet) influence the phylogenetic community patterns. This study allows us to investigate what traits are relatively more prevalent in Arctic communities and whether families with specific traits tend to exhibit phylogenetic clustering.
By understanding the current traits and community structure, and how these relate to environmental factors, we can better prepare for the changes likely to occur in the future. We hypothesize that environmental filtering will impact community structure of sub-Arctic communities due to the harsh environmental conditions present at higher latitudes. Specifically, we predict that the species in Churchill will present a significantly clustered pattern when compared against the broader North America species phylogeny. When comparing other regions within North America, we expect the regions found at higher latitudes to show a more significant clustered pattern.
Second, we hypothesize that the traits and characteristics of the species will influence the community structure. We predict that taxonomic groups with traits that expose them to more environmental filtering, such as being aquatic, or relying on a host species, such as being a parasite or parasitoid, will have a more clustered pattern than their terrestrial and free-feeding counterparts.

| Data and taxa
The focal organisms for this study are sub-Arctic Coleoptera. Beetles are understudied in previous community structure research yet are hyper-diverse, with species occupying a variety of niches and habitats and exhibiting substantial variability in traits (Marshall, 2006;Woodcock et al., 2013). There are also 466,260 public records available on the BOLD database as of July 7th, 2021. Particularly, we will be focusing on the Churchill region as there has been a concentrated effort to barcode fauna in northern communities, particularly Churchill (Woodcock et al., 2013;Zhou et al., 2009Zhou et al., , 2010. In the BOLD database (Ratnasingham & Hebert, 2007), there are 315 recorded species of Coleoptera in Churchill as of July 7th, 2021.
Using BOLD's application programming interface (API), all data for this study were pulled from the BOLD database [June 19th, 2019] directly into the R environment. The code for this study is available at github.com/S-Majoros/Phylogenetic_Community_Structure_Code.r.
All coding was done in R version 3.5.0 (R Core Team, 2019). Data for both Canada and Alaska were used as the regional species pool and compared to the data from Churchill, which will be defined as the local community for this study. BINs (Barcode Index Number ;Ratnasingham & Hebert, 2013) were used to represent species. BINs are OTUs (operational taxonomic units) that are clusters of barcode sequences similar to species (Ratnasingham & Hebert, 2013). We chose to use BINs to represent species in this study, because, for beetles, BINs frequently correspond to morphologically defined species boundaries (Pentinsaari et al., 2014(Pentinsaari et al., , 2017. While BINs do not always perfectly match with recognized species boundaries, Pentinsaari et al. (2014), Pentinsaari et al. (2017) found in Coleoptera that BINs matched with species 90%-92% of the time. We propose that using BINs is a valuable approach for insect biogeographic studies due to the widespread presence of cryptic (or nearly cryptic) evolutionary lineages in insects (Smith et al., 2006). Using BINs is, at this time, likely to result in a more complete account of the biodiversity present and readily enables comparison of biodiversity between geographic regions.

| Filtering data and defining Churchill
Once the sequences and metadata had been pulled from BOLD, the data were filtered. Families and genera were included in the analysis if they had three or more BINs present in Churchill. DNA sequences without a BIN assignment or GPS coordinates were removed. Sequences were also removed if they were not from the COI-5P marker, if they had internal missing data ("N" nucleotides) or gap content greater than 1% of the sequence length, or were less than 500 base pairs. COI is commonly used for DNA barcoding animals and provides useful phylogenetic signal at low taxonomic levels but has some limitations when used to construct deep phylogenies (Boyle & Adamowicz, 2015;Smith et al., 2014;Wilson, 2010Wilson, , 2011. This limited phylogenetic signal can be helped by using a constraint tree when constructing phylogenies (Boyle & Adamowicz, 2015;Smith et al., 2014;Wilson, 2011). Despite some limitations, COI can be readily sequenced from a large number of taxa and provides high sequence quality compared to other gene regions (Wilson, 2010). Barcode-based trees have also shown similar results when used for community phylogenetics compared to other trees (Boyle & Adamowicz, 2015;Erpenbeck et al., 2007;Smith et al., 2014).
Because of these findings, COI was suitable to use for this study.
Additionally, this marker had the advantage of large-scale taxonomic and geographic coverage for North American beetles ( Figure 2).
The sequence datasets were reduced to one sequence per BIN for phylogenetic analysis. The sequences were first aligned within each BIN in order to choose a centroid, which is a representative sequence for each BIN, defined as the sequence with the minimum average distance to all others in its BIN (as in Orton et al., 2019).
Alignments were performed using the muscle algorithm implemented in the muscle package version 3.30 (Edgar, 2004) with the following parameters: maxiters equaled 3, diags equaled true, and gapopen equaled −3000. These parameters were chosen in order to limit the number of iterations for optimization to allow for an alignment to be quickly generated, as sequences within BINs are similar.
Then, the selected centroids (one per BIN) were aligned within each family. A preliminary alignment was performed for each family with the above parameters in order to trim the sequences to 658 base pairs and to screen for outliers. The sequences for each family were then aligned using a reference sequence. A reference BIN that met the following criteria was selected from the public data on BOLD: it was from the order Coleoptera, it contained at least 10 CO1-5P sequences, it had at least one specimen photograph that matched the higher taxonomy, and it did not have taxonomic conflicts at family level or above. The reference sequence was chosen from this BIN and had to be 658 base pairs long, have 2 trace file chromatograms, and no missing information or stop codons. The reference sequence used for this study had the record id AEDNA549-12 and was from the species Colymbetes dolabratus. The final alignment was performed using the same settings as the previous alignments, but with the default maxiters parameter (maxiter = 8 in R implementation using muscle package) (Edgar, 2004). After the data were filtered, a Churchill subset was defined using coordinates: a latitude between 58.6 and 58.7 degrees and a longitude between −94.2 and −93.8 degrees. These coordinates were found using Google Earth (Google, 2018) and based on a map provided in Boyle (2012) that showed the accessible areas in the vicinity of Churchill, MB, included in prior DNA barcoding research. This map is compatible with maps in other Churchill-related DNA barcoding literature (Woodcock et al., 2013;Zhou et al., 2009Zhou et al., , 2010).

| Community phylogenetic metrics
In order to test for phylogenetic clustering and overdispersion, we calculated net relatedness index (NRI) and nearest taxon index (NTI); the calculation of these metrics requires a phylogeny as one of the inputs. First, we generated a maximum likelihood tree for each Coleoptera family using COI one sequence per BIN for all BINs

F I G U R E 2
Map showing the location of Churchill and Guelph and the sampling sites within Canada and Alaska for beetle data available from BOLD present in Canada and Alaska. The family level was chosen for analysis because members of beetle families often share important traits, such as feeding mode (Hunt et al., 2007). Before reconstructing the phylogenies, we first estimated the best-fit model of nucleotide evolution for each family using the R package phangorn version 2.4.0 (Schliep, 2011). The model with the lowest Bayesian Information Criterion (BIC) score was chosen, and the proportion of invariant sites was determined based on the fitted model. BIC evaluates models based on posterior probability and maximum likelihood (Konishi & Kitagawa, 2008). The number of intervals of discrete gamma distribution (the k value) was set to 4. A neighbor-joining tree (Saitou & Nei, 1987), generated using the function NJ from phangorn version 2.4.0 (Schliep, 2011), was used as the guide tree. Maximum likelihood trees based on COI were generated using the function optim.pml from phangorn version 2.4.0 (Schliep, 2011), and optNni, optGamma, and optInv were set to true. A bootstrapping analysis performed with 1000 replicates was then used to find the nodal support for each tree. For each family, an outgroup was chosen from another Coleoptera suborder and used to root each tree. Trees showing the nodal support values are included in Appendix S1. The most likely tree based on these replicates for each family was used in the NRI and NTI analysis. NRI and NTI calculate the pairwise distance between two species and use this to estimate the community relatedness (Webb, 2000). NRI averages the evolutionary distances between all pairs of tips in the community, while NTI takes only the distances between nearest neighbors ( Figure 3) (Webb, 2000).
When the NRI/NTI value (standardized measure) is above 0, this indicates phylogenetic clustering of the species within the community, while negative values indicate overdispersion (Webb, 2000). The two tests detect patterns at different levels within the phylogeny; therefore, in order to test for general patterns, both tests should be performed (Kraft et al., 2007). The NRI and NTI may differ in their estimates of significance, or, in some cases, even their predicted trend. If NRI suggests clustering, this is due to clustering occurring deeper within the phylogeny (Webb, 2000). For NTI, the clustering is occurring within the clades and at the tips of the phylogeny (Webb, 2000). For this study, it is beneficial to use both in order to detect clustering patterns at all levels. These calculations were performed using the R package picante version 1.7 (Kembel et al., 2010) and the null model "taxa.labels," which indicates that random draws of the same species richness as the Churchill community were made from each family phylogeny; and NRI and NTI are re-calculated with each randomization. The analysis was repeated 1000 times. The observed NRI and NTI values were then compared against the null distribution to obtain a p-value. These tests determined whether species inhabiting the Churchill region are more significantly phylogenetically clustered or overdispersed than expected by chance,

| Trait analysis
For the trait analyses, we investigated whether families with different traits have different phylogenetic community structure, by comparing the NRI/NTI values across families exhibiting different trait categories using an ANOVA. First, we created a character matrix for each family. Characters/traits were found for each family based on the literature (references available in Appendix S3). The traits that describe the majority of members of a given family were used; this included habitat (terrestrial or aquatic) and feeding mode (predaceous, phytophagous, or fungivorous). Where adult and larval diet differed, both were included as separate traits. Habitat remained relatively consistent across larvae and adult stages. We defined terrestrial as taxa that live primarily in land habitats and aquatic as taxa that live primarily in water bodies and habitats. Predaceous taxa were defined as those who prey on other insects or animals, phytophagous taxa as those who feed primarily on plant material, and fungivores as those who feed primarily on fungi. We then used a one-way ANOVA to compare the average phylogenetic structure (NRI or NTI metric) of families across trait categories, treating each family as an independent unit (as supported by the results of Pyle, 2018). We conducted a second analysis considering phylogenetic relationships among families. We created a family-level phylogenetic tree, that is, treating each family as one tip, using the phylogenetic hypothesis provided in Zhang et al. (2018) based upon 95 protein-coding genes.
Five species from the order Neuroptera were chosen as outgroups.
Based upon their topology, the tree was constructed manually using Mesquite (Maddison & Maddison, 2019) and loaded into R. We assigned branch lengths of 1, before fitting a phylogenetic generalized F I G U R E 3 Example phylogenetic tree with a chart showing nodal distances among members of the community. NRI uses all the distances to find the mean pairwise distance ((1 + 2 + 3 + 2 + 3 + 2)/6 = 2.16). NTI uses only the distances between nearest neighbors (nearest neighbor pairs: A&B, B&A, B&C, C&D; (1 + 1 + 2 + 2)/4 = 1.5) least squares (PGLS) model using picante version 1.7 (Kembel et al., 2010). This allowed us to determine whether families with particular traits have different clustering patterns while taking into account the relationships among families. The PGLS analysis used Brownian motion as the model of trait evolution, and the log-likelihood was maximized for the method. A chi-square test was also performed to determine whether the proportional representation of BINS in Churchill varied with traits.

| Community phylogenetic metrics for a temperate region
In order to compare the phylogenetic community structure patterns in Churchill to a temperate location, the analysis above was repeated for the Guelph region. Guelph was selected due to its temperate climate and the abundance of data available on the BOLD database (Ratnasingham & Hebert, 2007). A Guelph subset was defined using coordinates: a latitude between 43.4 and 43.6 degrees and a longitude between −80.3 and −80.1 degrees. These coordinates were found using Google Earth (Google, 2018). In order to determine if the community structure of the Churchill and Guelph subsets were significantly different, a paired t-test was performed to compare mean NTI and NRI values for beetle families between these sites.
The trait analysis was also repeated for the Guelph region. Due to the increased number of families found in this region, several more categories were added to feeding mode. Saprophagous taxa were defined as those that feed primarily on decaying organic matter, and omnivores as those that feed relatively equally on both plant and animal matter.

| Analysis at a genus level
The analyses described above were repeated at the genus level for both the Churchill and Guelph regions. Genera needed to have three or more BINs present in Churchill to be included. The data were filtered using the same criteria as the data at the family level, NRI/NTI was calculated for each genus, and traits were assigned at the genus level, instead of the family level as described above (references in Appendix S3). Traits are able to be assigned more accurately at the genus level, that is, with less variability among species within genera than among species within families. Some additional trees were needed in order to find the relationships among beetle genera. In

| Sensitivity analysis: constraint tree
An important part of phylogenetic analysis is the phylogenetic tree, and NRI/NTI and PGLS are all affected by the phylogenetic tree used. As discussed earlier, there are some issues with generating phylogenetic trees using COI data alone, although reasonable phylogenetic signal is expected among close relatives (Wilson, 2010(Wilson, , 2011. Park et al. (2018) suggested that inferred phylogenies often underestimate phylogenetic diversity, and errors in the phylogenetic reconstruction are common when environmental filtering is present, which we expect here. It is important to choose good constraint trees and outgroups. In order to account for the issues associated with the use of COI in the generation of phylogenetic trees, a sensitivity analysis was performed that used a constraint tree in addition to COI to generate Maximum Likelihood trees for individual families. The use of a constraint tree plus COI data is gaining support for constructing species-level phylogenies in diverse insect groups, including in caddisflies (Boyle & Adamowicz, 2015) and ants (Smith et al., 2014). However, due to limitations in the literature, a full constraint tree cannot be constructed for each family and genus. One family and one genus were chosen for the constraint analysis: Dytiscidae and Agonum (from the family Carabidae). Species-level constraint trees were built using trees from the literature: Michat et al. (2017) and Zimmerman (1981) for Dytiscidae and Liebherr and Schmidt (2004) for Agonum. For the Dytiscidae tree, 5 sequences from the genus Cicindela from the closely related family Carabidae were chosen as outgroups.
For Agonum, 5 species from the closely related genus Amara were chosen as outgroups. Constraint trees were constructed manually using Mesquite (Maddison & Maddison, 2019), and the alignments were generated in R as described above. The maximum likelihood trees were generated using RAxML (Stamatakis, 2014) using the COI sequence data with a binary (i.e., bifurcating) constraint tree.
The most common species-level identification for each BIN was used to assign species-level taxonomy to each BIN. These trees were then imported into R in order to complete the NRI/NTI analysis.
2.8 | Sensitivity analysis: size of regional species pool and taxon richness of source pool Kraft et al. (2007) state that the power for the NRI and NTI analysis is highest when local species richness is 30%-60% of regional species richness. For the Coleoptera of Churchill, all families are below this range except for Dytiscidae, Gyrinidae, and Haliplidae. To determine the effects of this, a sensitivity analysis was performed. The regional BIN pool was restricted to the Canadian provinces and territories of Manitoba, Nunavut, Northwest Territories, Saskatchewan, and Ontario. This restriction also helps combat some patterns that may be based on biogeography. For example, the Rocky Mountain Range may act as a barrier to dispersal, and this could create a clustering pattern on its own. By restricting the regional pool, we can largely control this effect. The same families and phylogenetic tree were used in this analysis. A paired t-test was then performed to compare the results of the NRI/NTI using the restricted BIN pool to the original analysis.

| Phylogenetic clustering metrics
Sixteen families of Coleoptera were analyzed for the study, following the data filtering steps described above, of which seven showed significant phylogenetic clustering (full results in  Table 1b).
The same analysis was completed for the Guelph subset.
Thirty-two families were analyzed, four of which showed significant phylogenetic clustering (full results in Table 1c; Figure 4b).
Overall, Guelph appears to be more overdispersed than Churchill; however, the taxonomically paired NRI values (t-statistic = 1.19, p = .26) and NTI values (t-statistic = 1.64, p = .13) of the two subsets were not significantly different. For the Guelph genus-level analysis, 32 genera were analyzed, four of which showed significant phylogenetic clustering (full results in Table 1d). A paired ttest could not be done to compare Churchill and Guelph genera, as there were no shared genera between the two regions that met our inclusion criteria.

| Trait analysis
Within the families studied in Churchill, only 5 were aquatic, while 11 were terrestrial. However, aquatic families have a larger percent of their total BINs found in Churchill ( Figure 5; Χ-squared 1 = 76.33, p = 2.2 × 10 −16 ). A similar result was shown for feeding mode, with the count of BINs present in Churchill, in relation to total northern North American BIN richness, differing among families with different feeding modes (Χ-squared 2 = 68.837, p = 1.13 × 10 −15 ). At the adult life stage, seven families were phytophagous, six were predaceous, and three were fungivores; at the larval stage, six families were phytophagous, seven were predaceous, and three were fungivores. The ANOVA showed no significant relationship between the community structure metrics and the traits of the families (Table 2a), including for habitat (F 1.14 = 1.79, p = .203), adult feeding mode (F 2.13 = 1.071, p = .37), and larval feeding mode (F 2.13 = 0.89, p = .43). These results were consistent with both the NRI and NTI values. The results of the PGLS differed from that of the ANOVA.
Community structure was significantly related to both feeding mode and habitat (Table 2b, Figure 6). Aquatic families were significantly more clustered than terrestrial in NRI (t = 2.32, p = .04) but not NTI (t = 1.8, p = .09). Fungivore families were significantly more overdispersed than other feeding modes in NRI (t = 2.12, p = .05) but not NTI (t = 1.35, p = .2). Predators were significantly more clustered than other feeding modes in NTI (t = 2.43, p = .03) but not NRI (t = 0.15, p = .88). This result was significant only for larval feeding mode.
In the genus-level analysis, eight genera were aquatic, and only three were terrestrial. For feeding mode, two genera were phytophagous, and nine were predaceous. The ANOVA showed no significant relationship between the community structure metrics and the traits of the genera, including for habitat (F 1.9 = 0.482, p = .505) and feeding mode (F 1.9 = 0.001, p = .971). These results were consistent with both the NRI and NTI values. The PGLS analysis yielded similar results. However, there was a nonsignificant trend toward terrestrial and predaceous genera being more clustered.
The trait analyses were also performed for the Guelph re- These results were consistent across NRI and NTI and the PGLS analysis. There was a slight nonsignificant trend toward increased clustering in aquatic families.
At the genus level, 31 genera were terrestrial, and only one genus was aquatic, so no statistical test was performed for habitat. For feeding mode, 13 genera were phytophagous, twelve were predaceous, and seven were fungivores, with no difference in phylogenetic community structure found among these groups using an ANOVA (F 2.29 = 1.602, p = .219). These results were consistent across NRI and NTI and the PGLS analysis.

| Tree nodal support
The median nodal support value was calculated for each set of boot- .001

| Sensitivity analysis: constraint tree
The NRI and NTI analyses were performed again using a constraint tree for the family Dytiscidae and the genus Agonum. Dytiscidae with the constraint was significantly clustered, while Dytscidae without the constraint was not. Agonum was overdispersed in the NTI as well as NRI with the constraint tree, while without constraint Agonum was only overdispersed in NRI.
3.5 | Sensitivity analysis: size of regional bin pool and taxon richness of source pool After the regional pool was reduced, Dytiscidae, Haliplidae, and Gyrinidae were still the only families where local species richness was close to 30%-60% of regional species richness (

| DISCUSS ION
Overall, this study discovered several interesting patterns in the phylogenetic community structure of beetles. Most Churchill families and genera included in this analysis showed phylogenetic clustering. Habitat and diet were shown to impact community structure, with aquatic and predaceous families more strongly clustered than terrestrial families and those with other feeding modes. In comparison to Churchill (sub-Arctic), Guelph (temperate) appeared more to be overdispersed, and traits had less of an effect on the phylogenetic community structure.
The main region of focus for this study was Churchill, MB, and based on our results several conclusions can be drawn. Overall, the strongest trend that we detected was higher representation of BINs from aquatic families in Churchill compared to terrestrial, indicating that beetles with aquatic habitat were more able to colonize this sub-Arctic region. We also found significant clustering in seven of the Coleoptera families studied, and a trend toward clustering in five. This provides support for the hypothesis that, due to the harsh conditions present at high latitudes, environmental filtering would be strong in sub-Arctic communities. The species present in the Churchill region possessed the traits needed to survive in this environment, while more distantly related species likely did not. This was not true for all families studied, as three families showed a trend toward overdispersal, though this was insignificant. All families were widely sampled across Canada, though less sampling was done in Northern Canada than Southern. The overdispersed families could potentially still have been experiencing clustering, just at a larger spatial scale, with closely related species being clustered in Canada. A trend toward clustering was also observed in the genus-level phylogenies.
In order to compare phylogenetic community structure across latitude, the more temperate region of Guelph, ON, was included in the study. When comparing the Guelph and Churchill subsets, there was a nonsignificant trend toward Churchill being more phylogenetically clustered. It is possible that Guelph, at 43.5 degrees north, and Canada in general, is still far enough north to exhibit clustering.
Guelph was less clustered than Churchill (58.7 degrees north) and if compared to more regions at even lower latitudes, it is possible there will be a more pronounced latitudinal difference. Temperate areas are under less extreme environmental pressures, which may result in stronger competition and more phylogenetically dispersed communities compared to polar regions. While our findings are consistent with the hypothesis that sub-Arctic communities are experiencing greater environmental filtering, further investigation across a broader latitudinal gradient is warranted.
Our results were similar to those found in other studies. Ernst and Buddle (2015) found that assemblage structure was correlated with latitude and that climate was important for determining community structure in northern communities of beetles when species were placed in functional groupings. Similarly, Shibuya et al. (2011) found that in the beetle family Carabidae in Japan, the environmen- conditions. There were some families that exhibited a trend toward overdispersion. Ulrich and Fattorini (2013) found a similar pattern in Tenebrionidae and suggested that this could be due to colonization patterns. Differences in the past colonization patterns of the families could also influence the community structure.
Our study not only showed patterns in phylogenetic structure across regions at different latitudes, but it also showed patterns across families with different biological traits. Supporting our predictions, there was a significant relationship between phylogenetic community structure and habitat in the PGLS analyses in Churchill at the family level, though not in the ANOVA or at the genus level. Aquatic families were significantly more clustered than terrestrial. Competition is often less important in aquatic communities due to the strong influence of environmental factors (Heino et al., 2016;Vamosi & Vamosi, 2007). This pattern could also be influenced by the low BIN richness of some of the terrestrial families, as well as low plant species richness in the sub-Arctic. Getting a true representation of terrestrial versus aquatic families was difficult due to the lack of variability in habitat across families and the limited families that inhabit the Churchill region. Only five of the 16 families studied were aquatic. However, these aquatic families had a significantly larger percent of their total northern North American species found in Churchill than terrestrial families. This suggests that it may be easier for aquatic species to colonize the Arctic than terrestrial. In order to better understand this pattern, other locations and taxonomic levels should be investigated. Based on the results of this study, habitat is likely one of the traits determining community structure in sub-arctic environments.
Feeding mode is also likely a determining factor. There was a significant relationship between phylogenetic community structure and F I G U R E 5 Phylogenetic tree showing the terrestrial and aquatic families present in Churchill. This tree is based on the one shown in Zhang et al. (2018). The pie graphs show the percent of the total BINs from Canada and Alaska that have been found in Churchill. Aquatic families have a larger percent of their total BINs found in Churchill than terrestrial families Significant results were also observed in predatory families, with predators more significantly clustered than other feeding modes.
This pattern could possibly be due to the predator's reliance on their Note: There is no significant relationship between the phylogenetic community structure within families and the habitat or feeding mode of the families. These results differed from the (b) PGLS analysis comparing community structure within families to feeding mode and comparing community structure to habitat for both (i) NRI values and (ii) NTI values, taking into account the family-level phylogeny of beetles. There was a significant relationship between community structure and feeding mode, as well as with habitat, though in NRI only. Aquatic families were significantly more clustered, as were predaceous families, while fungivore families were significantly more overdispersed. prey species. If the prey is clustered, so is the predator. However, out of the six predatory families studied, five were generalist predators (Marshall, 2006). The sixth family, Coccinellidae, consumes mostly aphids (Marshall, 2006). This is also the only predator family that showed a trend toward overdispersion. Therefore, the general diet of most of the families suggests that the clustering pattern observed is not dependent on their prey and that having a more general diet is beneficial for surviving in the Arctic. Another possible explanation is that these predators are able to survive in these northern habitats due to their cold tolerance and overwintering abilities. Predaceous families such as Coccinellidae and Carabidae have overwintering strategies that allow for survival in cold temperatures (Hamedi et al., 2013;Knapp & Saska, 2011). This includes occupying microhabitats that buffer the effects of the climate, lowering temperature thresholds for activity or increasing cryoprotectant concentrations (Hamedi et al., 2013;Knapp & Saska, 2011). If these traits are phylogenetically conserved, this would result in clustering. Diet has been shown to be important in other studies, such as Poulin et al. (2011), who found clustering in parasitic families due to their reliance on specific host species.
Despite the significant results observed at the family level, the relationship between phylogenetic community structure and traits was not as apparent at lower taxonomic levels. Using phylogenies of BINs within individual genera of Churchill, there was no relationship between the community structure and traits. Other traits not examined, such as overwintering strategies, may vary relatively little among species within genera, and therefore more random phylogenetic patterns may be expected within genera compared to families.
There was still a trend toward increased clustering in predaceous genera, but there was also a trend toward increased clustering in terrestrial genera, the opposite of the pattern observed at the family level. There were eleven genera included in the genus-level analysis; eight were aquatic, and only three were terrestrial. These eleven genera only included representatives of six of the Churchill families.
The small sample size and lack of BIN diversity within genera of Churchill could have an effect on the results; thus, larger geographic regions and other taxa should also be investigated in future research.
The effects of traits were also compared between the two regions included in this study. It appears that habitat and diet have a greater effect in Churchill. While aquatic families still had a trend toward increased clustering in Guelph, this was not significant. There was not a strong trend observed in diet or with either of the traits at the genus level. As habitat and diet might not be the traits determining community structure in this region, a greater number of traits Overall, there was support for our hypothesis that traits would impact phylogenetic community structure. The idea that traits are important for determining community structure is also supported by the literature; Mayfield and Levine (2010) suggest that phylogeny alone cannot determine community structure, and studies such as Vamosi and Vamosi (2007) have found traits such as body size to be related to community structure.
Our study encountered several limiting factors, but these present opportunities for future research. One was limited richness of BINs within some habitats and traits among the families present in Churchill, reflecting primarily the biological patterns as sampling has been relatively extensive for most families present (Pyle, 2018;Woodcock et al., 2013). There were more terrestrial species than aquatic and more phytophagous and predaceous families than fungivores; this makes it hard to compare accurately the observed phylogenetic clustering patterns in relation to the trait states. There were only 16 families studied here that met our inclusion criteria, and data were even more limited at the genus level. Future studies may expand on these results by conducting this analysis for other taxa, other geographic regions, and other traits. While this study did look at one temperate region, including more regions along a latitudinal gradient would allow us to understand better the effects of latitude and environmental conditions on communities. Moreover, by including more traits, we can discover what other traits are being filtered for in Arctic communities and how these traits are affecting phylogenetic community structure. Also, some families are understudied and under sampled (Brunke et al., 2019). While sampling in general is extensive for Churchill beetles (Pyle, 2018;Woodcock et al., 2013), families such as Scirtidae and Latridiidae are poorly understood and appear to be more diverse in Canada than the number of recorded species suggests (Brunke et al., 2019). By continuing to study Canada's insect communities, including intensive sampling at focal sites, we can further explore diversity, traits, and community structure based upon more complete sampling.
To summarize the observations of this study, most Churchill families and genera showed phylogenetic clustering, and most families were significantly clustered. Aquatic families were more clustered then terrestrial, and there was a larger percent of their total BINs found in Churchill compared to terrestrial families. Fungivore families were more overdispersed than other feeding modes, and predaceous families were more clustered. When compared to Churchill, Guelph families trended toward less phylogenetic clustering. Diet and habitat also appeared to have less effect on community structure in Guelph than in Churchill.
During postglacial colonization, species came from the south and from the Beringian glacial refugium (Pielou, 1995;Woodcock et al., 2013). Was this colonization random? The results of this study suggest that it was not. Closely related species, sharing similar traits, were found in sub-Arctic communities, likely due to the environmental filtering occurring in this area. Arctic communities are particularly vulnerable to climate change and increasing temperatures (Danks, 1992;Walseng et al., 2018). If Arctic conditions change, it is possible that some of their extreme environmental pressures will decrease or shift, and the environmental filtering occurring in these environments will likely also change. By understanding the current community structure and the factors and traits influencing this, we can better predict how these communities are likely to change in the future. If temperate locations show less clustering than those in northern regions, as shown by the comparison of Guelph and Churchill in this study, we can expect communities to become less phylogenetically clustered as species move northward.

CO N FLI C T O F I NTE R E S T
The authors declare that there are no conflicts of interest.