Combining data from field surveys and archaeological records to predict the distribution of culturally important trees

Indigenous communities involved in conservation planning require spatial datasets depicting the distribution of culturally important species. However, accessing datasets on the location of these species can be challenging, particularly when the current distribution no longer reflects areas with the full range of suitable growing conditions because of past logging. We test whether using occurrence data from community‐based field surveys and archaeological records in species distribution models can help predict the distribution of monumental western redcedar trees (Thuja plicata)—large, high‐quality trees suitable for cultural purposes such as carving dug‐out canoes, totem poles and traditional houses. This species is critically important to indigenous people of the Pacific Northwest of North America, but trees suitable for traditional carving and building are diminishing in abundance due to logging.


| INTRODUC TI ON
Indigenous people and communities are gaining enhanced rights and authority over their traditional lands, including the forest resources that are often deeply intertwined with their culture (Larson, Dahal, & Colfer, 2010). In such places, integrating local knowledge and perspectives about culturally important plants and animals are often key factor in successful conservation and resource management initiatives (Berkes, Folke, & Gadgil, 1994;Charnley, Fischer, & Jones, 2007). What kind of data is applicable and meaningful to indigenous communities in these contexts? Increasingly, data describing the locations of species occurrences are an integral part of spatial conservation planning because these data support prediction of spatially explicit species distributions Franklin, 2009). But questions about how to effectively apply these methods arise when the target taxon is an important traditional resource used by indigenous groups and is associated with a rapidly shifting distribution. In these situations, using occurrence data based on observations, knowledge or physical evidence of local inhabitants is an underutilized approach that can provide insights into ecological communities, populations and resource landscapes (Franklin, Potts, Fisher, Cowling, & Marean, 2015;Lopez-Arevalo, Gallina, Landgrave, Martinez-Meyer, & Munoz-Villers, 2011;Pesek et al., 2009;Ziembicki, Woinarski, & Mackey, 2013). Finding novel ways to bring together and compare alternative occurrence datasets, such as those based on field surveys and archaeological records, holds promise for spatially predicting past and current species distributions and more effectively meeting community objectives for conservation areas.
Redcedar is considered the "tree of life" to indigenous people because of its prominent role across diverse aspects of traditional and contemporary life (Garibaldi & Turner, 2004;Zahn, Palmer, & Turner, 2018). For example, the emergence of redcedar in these coastal forests during the Holocene is associated with rapid technological innovation stemming from its myriad uses in transportation, structural housing material, art, clothes and spirituality (Hebda & Mathewes, 1984;Stewart, 1995). Evidence of these uses over past centuries is imprinted in coastal forests by way of culturally modified trees (Turner et al., 2009). Furthermore, due to its great longevity (>1,000 years) and potential sizes (>3 m diameter and >60 m tall), redcedar is associated with many important ecological functions such as supporting wildlife habitat (Stevenson, Jull, & Rogers, 2006), diverse epiphytic communities (Price, Lilles, & Banner, 2017), and soil stability and carbon storage (Klinka & Brisco, 2009).
Although redcedar is broadly distributed and relatively abundant in the temperate rainforest of British Columbia (BC), those of "monumental" quality ( Figure 1)-large, high-quality trees suitable for cultural purposes such as carving dug-out canoes, totem poles and traditional houses-are rare (Sutherland, Gergel, & Bennett, 2016).
Monumental redcedar is an inherently scarce resource in managed forests because of the unique developmental pathways required to reach monumental status and the long time periods, typically more than 250 years, required for their development (Sutherland et al., 2016). Accessible, large, high-quality redcedar trees are also among the most profitable timber in coastal forests, making them a staple target of industrial logging (Nelson, 2004). On the Central and North Coast of BC, for instance, analyses suggest that logging is occurring disproportionately within highly productive redcedar stands (Green, 2007). The overharvesting of high-value redcedar can erode the natural capital of coastal temperate rainforests similar to high grading of rare tree species in tropical forests (Schulze, Grogan, Landis, & Vidal, 2008) or large fish in the marine environment (Poos, Bogaards, Quirijns, Gillis, & Rijnsdorp, 2010).
In addition to disproportionate harvesting, the current silvicultural regime, focused on principles of maximum sustained yield, does not provide sufficient time or possibly the growing conditions, necessary for redcedar to reach monumental status within the operational land base (LePage & Banner, 2014;Sutherland et al., 2016).
For example, conventional timber supply models calculate harvest rotation ages within managed forests (typically less than 100 years) that can be an order of magnitude shorter than the age of large monumental redcedars (often more than 1,000 years; Antos et al., 2016;MacKinnon, 2003;Waring & Franklin, 1979). Such divergence between managed and unmanaged forests is especially marked in the wetter portions of the coastal temperate rainforest, where large-scale disturbances, such as stand-replacing fires, are exceedingly rare, leading to a natural disturbance regime characterized by small-scale gap dynamics and generating forests dominated by old growth (Daniels, 2003;Lertzman et al., 2002;Lertzman, Sutherland, Inselberg, & Saunders, 1996). Thus, the seral shifts produced by industrial silviculture dramatically decrease the number of large old trees present on the landscape (Lindenmayer, Blanchard, Blair, McBurney, & Banks, 2016). Such a change is salient in any forest around the world because large old trees disproportionately affect the structure, dynamics and function in forests (Lindenmayer & Laurance, 2017;Lutz, Larson, Swanson, & Freund, 2012;Stephenson et al., 2014), but when these stand elements are also a cultural keystone like monumental redcedar (Garibaldi & Turner, 2004), archaeology, cultural data, ecosystem-based management, Great Bear Rainforest, indigenous people, Maxent, monumental redcedar, spatial conservation planning, species distribution model, western redcedar perturbations to their distribution influence the broader social-ecological system as well. Therefore, focusing conservation efforts on large old trees with connections to local cultural values and practices can create important synergistic benefits that cross the social-ecological divide (Blicharska & Mikusiński, 2014).
In the Great Bear Rainforest of coastal British Columbia indigenous groups (referred to as First Nations in Canada) can access redcedar for cultural purposes. According to the current EBM framework, planners must also incorporate First Nations' traditional F I G U R E 1 Pictures of monumental redcedar and culturally modified trees (photograph credits: Ken Lertzman). Picture (a) is characteristic of an occurrence that met the field survey criteria for monumental cedar status: diameter at breast height greater than 1 m and bole length greater than 5 m with few knots or defects. Picture (b) shows a very large redcedar tree that does not meet the field definition for monumental cedar due to excessive rot and other tree defects. Picture (c) shows a partially carved canoe that was likely abandoned over a century ago, which is characteristic of one type of aboriginally logged tree occurrence in the archaeological records (accompanying stump outside picture frame). Picture (d) shows a test hole in a redcedar tree-a technique traditionally used by indigenous people to assess the proportion of heart rot in potential monumental cedar forest resources and tree use into landscape reserve planning (Great Bear Rainforest Order, 2016), thus creating a strong mandate for the conservation of monumental redcedar. However, the large scale of spatial planning and the inherent rarity of monumental redcedar, coupled with the lack of a comprehensive inventory in the region, makes it challenging to implement CSAs without more knowledge of these trees' distribution across the landscape.
Species distribution models (SDMs) are widely discussed in the conservation literature and are increasingly being used to inform site selection for spatial planning around the world (Araujo & Williams, 2000;Ferrier, Watson, Pearce, & Drielsma, 2002;Franklin, 2009).
Typically, SDMs are used to produce predictive maps that show the probability of species presence, or habitat suitability, based on the statistical relationship between observed species occurrences and environmental factors. One of the major challenges with SDMs in an applied context is that many empirical models are based on potentially biased field surveys (Phillips et al., 2009). Various methods have been developed to account for these issues, including creating target background data to reflect sampling effort (Phillips et al., 2009) and altering occurrence datasets to remove biases (Dudik, Schapire, & Phillips, 2005). Although methods to address bias and uncertainty may improve predictions, they are difficult to apply in an objective manner because detailed information about sampling effort is often lacking Phillips et al., 2009).
Integrating multiple independent datasets is another approach to revealing biases and cross-referencing knowledge of species distributions (e.g., Lopez-Arevalo et al., 2011). For instance, combining data from modern field surveys of monumental redcedars with archaeological records of their past locations may fill the data gap arising because their modern distribution has been shaped by a century of industrial forest harvest. Bringing together such ecological and cultural data can also lead to a spatial conservation design that reflects the patterns and values of past traditional resource use.
In this research, we evaluate different data sources to predict the spatial distribution of monumental redcedar in a portion of the Great Bear Rainforest of BC, Canada. We examine SDMs derived independently from two types of monumental redcedar occurrence data: community-based field surveys carried out by the Heiltsuk First Nation and archaeological records of traditional harvesting locations.
We also create a third SDM that combines these two datasets. We first hypothesize that the distributions inferred from the field survey and archaeological datasets will have substantial overlap and exhibit similar relationships across environmental variables and that both SDMs will be influenced by variables related to access, such as elevation and proximity to the ocean. Next, we compare the predictions based on these individual SDMs with a more recent, independent and systematic, dataset of monumental redcedar from field transects, collected for this project. Our hypothesis is that the survey biases (i.e., targeted and opportunistic sampling) in the Heiltsuk field surveys and the traditional patterns of use in the archaeological records will limit congruence with this independent dataset. We also expect that the SDM based on pooled occurrences from the first two datasets will create a more accurate model of the spatial distribution of monumental redcedar (as represented by our third dataset), potentially via the most extreme biases in either dataset cancelling each other.

| Study area
Our study area encompasses the traditional territory of the Heiltsuk First Nation on the Central Coast of BC, Canada-an area that forms part of the Great Bear Rainforest (GBR; Figure 2). This region's EBM regime has received substantial scholarly attention, partly because of its global conservation status as one of the largest undeveloped regions of coastal temperate rainforest (Allen, 2005). The GBR has also gained notoriety around the world as a focal point in forestry conflicts (i.e., "the war in the woods") and, in relation, the almost unprecedented shift to 85% of its forests being off-limits to logging (Great Bear Rainforest Order, 2016; Price et al., 2009). But this region also has a long history of forest management and stewardship that precedes the modern forest industry. Indigenous people have been occupying this territory for millennia, with archaeological records showing human settlement and use of the land dating back to over 10,000 years ago (Cannon, 2000;McLaren et al., 2014). Although the Heiltsuk territory covers a large area of land (~15,000 km 2 ), population densities are currently very low (~1 person/4 km 2 ) and comprised of mostly First Nations. Currently, unemployment rates among First Nations are extremely high, with the majority of jobs provided by the local government and the natural resource sectors (Allen, 2005). Accessing and distributing traditional food and other resources from the surrounding territory form the basis of a significant subsistence economy, and thus, local people often make a clear and direct connection between ecological integrity and community well-being-two central pillars of EBM.
Ecologically, the GBR region lies within the coastal temperate rainforest and is within the Coastal Western Hemlock (CWH) zone of BC's Biogeoclimatic Ecosystem Classification system (BEC; Meidinger & Pojar, 1991). The BEC system classifies ecosystems across nested scales: zones represent the broadest scale based on climate, and site series represents the finest scale based on the local soil moisture and nutrient regimes (Meidinger & Pojar, 1991). The GBR region is characterized by high annual rainfall (2,000+ mm), moderate average monthly temperatures ranging from 4 to 16°C and extensive coniferous forests. Heterogeneous physiography and a landscape that includes mainland fjords and offshore island archipelagos create large regional variation in site productivity. Forests in the floodplains of large river systems can accumulate immense above-and belowground biomass, whereas other areas that are severely limited by nutrients and water tables are characterized by bog ecosystems with markedly shorter forest canopies. Over a dozen tree species occupy these forests, the most common of which, depending on site conditions and disturbance histories, are western redcedar, western hemlock (Tsuga heterophyla), amabilis fir (Abies amabilis), Sitka spruce (Picea sitchensis), yellow-cedar (Callitropsis nootkatensis), shore pine (Pinus contorta var. contorta) and red alder (Alnus rubra; Alnus sitchensis on the outer coast).
The spatial extent of our study area (350,000 ha) represents roughly 25% of the terrestrial area of the Heiltsuk territory. We selected this area based on availability of GIS data and in an attempt to exclude forests that are unlikely to yield monumental redcedar because of short tree canopies, logging history or unsuitable species composition. To identify a study area using these criteria, we queried the Vegetation Resource Inventory (VRI) spatial layer (Data BC; www.data.gov.bc) based on whether redcedar appeared in the species label (constituting at least 10% of forest canopy) of the stand, whether average stand height was equal to or greater than 20 m and whether average stand age was equal to or greater than 140 years.
VRI data are derived from interpreted orthophotos and represent stand values averaged across broad areas (typically over 1 ha in size), thus masking fine-resolution variability within stands. We created the final study area boundary by clipping to the Heiltsuk territory and by clipping to the Central Very Wet Hypermaritime BEC subzone and variant of the CWH zone (CWHvh2; Meidinger & Pojar, 1991) to distinguish among unique ecosystem types. We used the entire study area for model calibration and the portion of the study area that overlaps Chatfield Island for model validation (Figure 2). There is no differentiation in this occurrence dataset between redcedar and yellow-cedar-two species that look very similar. However,

| Heiltsuk field surveys
given that the survey locations are mostly associated with redcedar stands and large yellow-cedar trees are much less common in the region, it is likely that almost all the occurrences are redcedar.

| Archaeological records
The BC Archaeology Branch administers a database that contains archaeological features recorded by archaeologists within BC. In 2011, we accessed a GIS shapefile of archaeological features within Heiltsuk territory. Many of these features represent 10 m buffers around points or clusters of points; so, we used the centroid of each polygon as our occurrence point. We used occurrence data based on the records describing culturally modified trees (CMTs). Monumental redcedar is not listed specifically, but the CMT site type does contain records of 106 aboriginally logged trees within the spatial study area, which identify where trees or portions of trees have been harvested in the past. Species information is not listed for every CMT, but western redcedar represents 84% of the populated fields and the generic term "cedar" represents the rest. We used these records of archaeological aboriginal logging as a proxy of historic monumental redcedar occurrences, although further research is needed to quantify how frequently they would meet the monumental redcedar criteria described for the modern Heiltsuk field surveys.

| Chatfield Island transects
To create an independent validation dataset for testing models built from the above occurrence data, in July 2015 a team of four

| Species distribution models
To build SDMs, we used the machine learning program, Maxent (Phillips, Anderson, & Schapire, 2006), supported through the "dismo" package in r (Hijmans, Phillips, Leathwick, & Elith, 2013;R Core Team, 2015 (Fielding & Bell, 1997) to assess model fit through area under the receiver operating characteristic curve (AUC) statistics (Hanley & McNeil, 1982). With Maxent, this metric represents sensitivity (correct positive predictions) plotted as a function of the proportional predicted area, where values of 0.5 represent random predictions and values above 0.5 indicate performance better than random (Phillips et al., 2006). Unlike models that integrate presence and absence records, the AUC scores in Maxent represent the probability that random presence sites will score higher than random background sites-in our case, 10,000 randomly generated points distributed across the spatial study area. Finally, we accessed Maxent outputs that describe variable importance as well as species response functions that show how the probability of presence varies with changing variable values, while keeping all other variables at their average value.

| Predictive maps
To compare the spatial distributions associated with the two models, we generated predictive maps based on 25 m raster cells. We used the predict function in r to produce these maps (Hijmans et al., 2013). This function uses the statistical relationships within and across variables in the SDMs to interpolate in geographic space the probability of monumental redcedar presence. The associated values, represented by coloured cells in the predictive maps, are relative measures of probability of presence, where typical presence localities have a value of around 0.5-a value that is likely higher than monumental redcedar prevalence across the landscape. Maxent allows the default prevalence value to be changed, but estimating this parameter was beyond the scope of our study. We then extracted raster cells with values within the 90th percentile for the Archaeo and Heiltsuk models and calculated areas where these highly suitable areas overlap. We also assessed concordance between the two maps by using the Istat function within the "SDM tools" package of r (VanDerWal, Falconi, Januchowski, Shoo, & Storlie, 2014). This function calculates the I similarity statistic following Warren, Glor, and Turelli (2008) where 0 represents no overlap and 1 represents complete overlap.

| Model validation
We quantified model performance by testing the associated predictions against our independent validation dataset derived from the Chatfield Island field transects. We used AUC to measure the extent to which the SDMs that were trained on the Heiltsuk field surveys and archaeological records correctly predict the 62 monumental redcedar occurrences from the validation dataset. We also examined whether the Combined model increased the predictive ability of models relative to using the individual Archaeo and Heiltsuk models.

| Comparison of species distribution models
The Heiltsuk and Archaeo models show similar variables influencing predictions about the distribution of monumental redcedar ( When examining the three most important predictors in the EAV scenario, the similarities and dissimilarities between the Heiltsuk and Archaeo models are evident (Figure 3). For example, in both models the Canopy Height variable is associated with an increase in the probability of monumental redcedar occurrence (i.e., predicted values) at taller canopy heights (canopy heights had to be higher than 20 m to form part of the study area), though the sparse data in the upper height range make inferences about the response shape challenging.
The modelled response to ecosystem types, as measured by Site Series, varies across the two SDMs. Site series classified as 13-associated with a "very rich" soil nutrient regime and a "very wet" soil moisture regime (Green & Klinka, 1994)-has the highest predicted values in the Heiltsuk model. In the Archaeo model, site series classified as 01 has the highest predicted values. The 01 "zonal" site series represents the average climatic conditions in the area and is associated with a very poor to medium soil nutrient regime and a  moist to very moist soil moisture regime (Green & Klinka, 1994). The Slope variable also responds differently in the two models. In the Archaeo model, where Slope is the most important factor, the response curve shows a slight increase in predicted values up to 20% slopes followed by a general decline in the probability of presence with increasing steepness. Finally, predicted values for solar radiation show a fairly consistent relationship between the models and, though hard to interpret in an applied sense, generally show areas with moderately high insolation to be most suitable.

| Comparison of predictive maps
Calculating overlap, using the I similarity statistic, between the predictive maps derived from the Heiltsuk and Archaeo models shows moderate spatial congruence in both the IAV scenario (I = 0.790) and the EAV scenario (I = 0.737). Correspondingly, a visual comparison of the Heiltsuk and Archaeo predictive maps qualitatively indicates distinct similarities and dissimilarities (Figure 4a,b). For example, both maps show many productive old-growth watersheds as highly suitable for monumental redcedar, but the Archaeo model appears to generally predict higher suitability in forests closer to the shoreline than the Heiltsuk model. It is challenging to visually tease apart all the similarities and differences in spatial predictions between these maps without a more detailed examination of the individual variable layers as well as the covariate relationships and interactions outlined in section above. When focusing on areas that are predicted to have very suitable conditions for monumental redcedar (e.g., 90th percentile of probability of presence values), there is only 24% overlap between these maps (Figure 4c).

| Model validation
Evaluating the SDMs against the independent validation dataset from occurrences on Chatfield Island (EAV scenario) shows predictive performance highest when using the Combined model (AUC = 0.751) or the Archaeo model (AUC = 0.745). The Heiltsuk model on its own performs poorly (AUC = 0.594).

| Integrating cultural occurrence data into species distribution models
Integrating occurrences from community-based field surveys and archaeological records into species distribution models provides data that can support predictions of the distribution of monumental redcedar trees. When should indigenous communities consider using archaeological occurrence datasets in species distribution modelling and conservation planning? If communities are only concerned with mapping current presence distributions, then species inventories from large, rigorously designed surveys will probably be the most valuable data source for developing SDMs. In these situations, patterns of traditional harvesting sites across the landscape might be considered a "bias" that needs correcting. However, indigenous communities involved in spatial planning processes can face unique challenges and often have a broader set of considerations, including shifting environmental baselines, data limitations and distinct cultural objectives.
Triangulating results from field inventories with archaeological data has the benefit of extending the temporal resolution of species occurrences. The archaeological records of aboriginally logged trees used in our study spatially reference the location of traditional redcedar harvesting sites over a long time period-likely up to several centuries, given the slow decay rate of redcedar (Daniels, 2003). This time scale is relatively recent, however, compared to other applications of archaeological data in SDM that involve hindcasting over millennia to different climatic conditions (Franklin et al., 2015). Incorporating predictor variables into SDM that reflect past climate is an important approach for reconstructing resource   (Franklin et al., 2015), but is less critical in our study because the trees harvested in the aboriginally logged records survived in roughly the same climate as old monumental redcedar trees still present on the landscape.
Including archaeological information in SDM is especially important for resources with rapidly shifting baseline conditions such as monumental redcedar in the study area (Green, 2007) and other large old trees worldwide (Lindenmayer, Laurance, & Franklin, 2012).
This situation causes the modern distribution of remaining trees to be a censored dataset because they represent a non-random sample of the original distribution relative to various environmental gradients. The shift, due to industrial logging, in the locations where monumental redcedar trees remain indicates that simply focusing on living occurrences will result in a biased sample of suitable growing conditions. This type of human disturbance has a disproportionately strong influence on predictions of large old trees such as monumental redcedar because our results show its distribution correlated with site productivity, a variable that is also highly correlated with patterns of logging activity (Albert & Schoen, 2013;Pearson, 2010).  (Berkes et al., 1994), corresponding survey data related to these species' distributions are typically less common. Where such limitations exist, using proxies such as archaeological records or traditional use data (Tobias, 2009)

| Uncertainty arising from spatial datasets
Across SDM studies worldwide, data quality, grain and availability are often limiting factors in their application Franklin, 2009 (Table 1). Even the validation dataset, which is associated with the most random, independent survey design, is limited to one large island and thus does not cover the full range of environmental conditions across the larger territory. More research is needed to understand the extent to which these data limitations affect our findings.

| Predicting distributions
The spatial predictions from the Heiltsuk and Archaeo models used in this study have moderate overlap. A higher degree of congruence is not surprising given the survey biases and underlying differences in what the occurrence datasets represent. Each of the SDMs and scenarios in our study could be useful for understanding the distribution of monumental redcedar (see Appendix S1) and for conservation planning depending on specific community objectives. Local planners could prioritize the Archaeo model for its prediction of suitable conditions for recruitment, the Heiltsuk or Combined model for their prediction of the current distribution across the managed forestry land base, the IAV scenario for its prediction of traditional patterns of harvesting, or the EAV scenario for its prediction of suitable areas not constrained by access. If communities want to account for multiple data sources to address issues of survey bias, priority locations could be selected from portions of the landscape where highly suitable areas from different models overlap (Figure 4) or from a predictive map based on a single model that uses pooled occurrence data (e.g., Combined model).
To identify these highly suitable areas, a variety of methods could be used for selecting a specific threshold from the continuous Maxent predictions (Franklin, 2009). For instance, cells with values at the threshold of maximum sensitivity could be selected as priority sites for Cedar Stewardship Areas. At this threshold in the Archaeo model (sensitivity = 0.30), the top 30% of most suitable areas contain 82% of monumental redcedar occurrences within the validation study area on Chatfield Island. These highly suitable areas from the predicative maps could be translated directly into new legal land designations that support the conservation of monumental redcedar or they could be used as an input into conservation prioritization exercises that account for multiple landscape values (Moilanen et al., 2011;Whitehead et al., 2014). The predictive maps could also function as a guide for allocating survey effort during operational forestry planning.
In this study, we focus on novel ways to predict and conserve monumental redcedar, but the framework outlined here can extend to other important species and traditional resources globally. In particular, forests around the world that contain large old trees with cultural modifications have the potential to reveal information that span centuries and millennia. Forest management regimes need to better protect these types of biocultural features and the old-growth ecosystems in which they occur (Lindenmayer, Blanchard, Blair, & McBurney, 2018), due to their application in understanding cultural practices through time (Blicharska & Mikusiński, 2014;Turner et al., 2009). Ultimately, there is a suite of quantitative and qualitative methods available to examine the conditions that underpin distributions and prioritize locations for conservation areas (see Appendix S2). But the amount of resources and time required to pursue all this research can be an impediment to finalizing a conservation plan. So, whether it is the Heiltsuk Nation considering strategies to conserve and steward monumental redcedar trees or other groups trying to map and manage resources, communities will need to decide what mix of methods and tools are most appropriate for their objectives.
Whichever approach is chosen, it is important that communities consider and assess datasets that reveal past distributions and traditional patterns of resource use because industrial development has often shifted the current distribution away from many culturally important locations across the landscape.