The impact of strictly protected areas in a deforestation hotspot

Protected areas are often thought of as a key conservation strategy for avoiding deforestation and retaining biodiversity; therefore, it is crucial to know how effective they are at achieving this purpose. Using a case study from Queensland, Australia, we identified and controlled for bias in allocating strictly protected areas (IUCN Class I and II) and evaluated their impact (in terms of avoiding deforestation) using statistical matching methods. Over the 30 years between 1988 and 2018, approximately 70,481 km2 of native forest was cleared in the study region. Using statistical matching, we estimated that 10.5% (1,447 km2) of Category I and II (strict) protected areas would have been cleared in the absence of protection. Put differently, 89.5% of strictly protected areas are unlikely to have been cleared, even if they were never protected. While previous studies have used statistical matching at a country or state level, we conducted an analysis that allows regional comparison across a single State. Our research indicates that strictly protected areas are marginally effective at preventing deforestation, and this likely due to biases in establishing protected areas on unproductive land.

Despite these values, forests are imperiled by human activities such as agriculture, infrastructure and urbanization (Venter et al., 2016). Such actions, directly and indirectly, cause deforestation. The world has committed to both the sustainable use of natural resources and the expansion of protected area networks to mitigate the adverse impacts of extensive land clearing (Brooks et al., 2015;Díaz et al., 2015;Messerli et al., 2019). Global action to expand protected area networks (Dudley & Phillips, 2006) is underpinned by the assumption that, among other objectives, protected areas (such as national parks) will effectively abate deforestation.
In Australia, the coverage in protected areas afforded to native species by protected areas has been assessed (Barnes, Szabo, Morris, Possingham, & Rouget, 2015;Taylor et al., 2011), as has the protected area network's capacity to manage threats (Kearney, Adams, Fuller, Possingham, & Watson, 2018) and meet species or community representation targets (Barr, Watson, Possingham, Iwamura, & Fuller, 2016). Such studies have shown that protected areas in Australia tend to underperform, but the effects of protected areas on avoiding deforestation have not yet been carefully examined. As part of a broader movement towards evidence-based policymaking, a growing body of literature calls for robust impact evaluations (Gertler, Martinez, Premand, Rawlings, & Vermeersch, 2016). Robust impact evaluations attribute causality between an intervention (in this case, protection) and specific observable variables (in this case, the biophysical characteristics of land and deforestation). Recent literature has increased the prominence of rigorous impact evaluations (McKinnon, Mascia, Yang, Turner, & Bonham, 2015), yet they remain rare in conservation literature (Baylis et al., 2015;Ferraro, 2009;Pattanayak, Wunder, & Ferraro, 2010;Schleicher et al., 2019). There are efforts to improve evidence standards, but they are hindered by resourcing constraints (Curzon & Kontoleon, 2016), lack of technical capacity, perceived misalignment with core business (Craigie, Barnes, Geldmann, & Woodley, 2015), and the mistaken assumption that more straightforward approaches will yield sufficient evidence to support policy (Adams et al., 2019;Rose et al., 2019). Consequently, there has been limited uptake of robust impact evaluations of conservation interventions (Baylis et al., 2015). In this study, we assess strictly protected areas in terms of their impact on avoiding land clearing.
Queensland is Australia's second-largest state, and its diverse and iconic landscapes support globally significant biodiversity (Queensland Government, 2019). Queensland is home to 85% of Australia's native mammals, 72% of Australia's native birds, 50% of Australia's native frogs and reptiles and more than 11,000 plant species (Cresswell & Murphy, 2017). Queensland's rich biodiversity is also imperiled by the world's highest deforestation rates, averaging nearly 400,000 ha per year (Hudson, 2019). Despite a decline in global land clearing over the past 35 years (Song et al., 2018a;Song et al., 2018b), land clearing has been steadily increasing in Queensland over recent years (Evans, 2016;Queensland Department of Environment and Science, 2018;Reside et al., 2017). The Australian Federal Government and Queensland State Government have committed to acquiring areas under high threat of deforestation for protection (Commonwealth of Australia, 1997) by securing land from activities that conflict with nature conservation (Commonwealth of Australia, 2015). Still, the extent to which protected areas contribute to this commitment is unclear. Despite the globally significant values, a recent audit found there are no government strategies in place to systematically plan effective conservation actions (Queensland Audit Office, 2018), including protected areas.
Here we estimate the amount of clearing avoided due to strictly protected areas in Queensland by comparing two methods: statistical matching using biophysical characteristics and a naïve comparison using logistic regression without matching. We also investigate regional differences in the amount (in terms of percent and area) of the avoided clearing. The findings of this work have implications for the future management and conservation of Queensland's forests. Understanding impact in this context is critical to improving recommendations for new protected areas as networks continue to expand, not only in Queensland (Queensland Government, 2017) and Australia (Australian Government, 2016) but also globally as a result of international obligations (UNEP, 2016; United Nations, 2014).

| MATERIALS AND METHODS
The goal of this analysis is to measure the impact of the Queensland protected area network on deforestation. We measured impact (i.e., avoided loss) as the difference in deforestation between strictly protected areas and statistically comparable places (Gertler et al., 2016). We compared two types of evaluation to estimate impact; (1) using regression analysis on statistically unmatched data and (2) using regression analysis on statistically matched data.

| Study area
The study area (Queensland, Australia) is divided into 13 bioregions. Bioregions demarcate distinct regions based on climate, geology and biota (Thackway & Cresswell, 1997) and are the reporting unit for assessing the extent of protection of ecosystems in Australia's National Reserve System (Environment Australia, 2000). We excluded four grassland-dominated bioregions (390,000 km 2 or 22.2% of land area in the state) because such habitats are incompatible with the outcome, deforestation (described below).

| Strictly protected areas
Queensland's total protected area network covers 8.21% (130,493 km 2 ) of the total land area (1.85 million km 2 ) in the State (Figure 1). Each protected area has an IUCN classification (Dudley & Phillips, 2006) that specifies the protected area's management strategies. The strictest IUCN Management Categories (I and II) are specifically designated for biodiversity conservation, whereas other IUCN Management Categories allow for multiple uses, including sustainable resource harvest. The differences in these management purposes may influence where such protected areas are placed in the landscape. To avoid confounding across multiple drivers of protection, we constrained our analysis to "strictly" protected areas (IUCN Class I and II) established in 1988 or later (S1). The total extent of IUCN Management Categories I and II protected areas declared after 1988 was 49,536 km 2 or 38% of Queensland's total current protected area network and 2.9% of the total land area.

| Deforestation
Deforestation was defined as a change from forested landscapes (forests and woodlands) to a non-forested land cover. We used satellite data (based on Landsat 7) for tree canopy cover to assess deforestation (Dadhich & Hanaoka, 2010;Green, Kempka, & Lackey, 1994;Koh, Miettinen, Liew, & Ghazoul, 2011). This remotely sensed deforestation data combines a spectral clearing index derived from short wave infrared bands, tree foliage density, and an index of variability over time to calculate a "probability of woody vegetation clearing" index (Wedderburn-Bisshop, Walls, Senarath, & Stewart, 2002). Produced by the Queensland State Government under the "State-wide land and trees study" (SLATS), this data has a resolution of 30 m*30 m and was available from 1988. (Department of Science ITaI, 1988. We excluded areas attributed as "natural tree death" or "natural disaster damage" from further analysis. Thus, the outcome variable was binary, with a value of "1" indicating that a pixel contained woody vegetation before 1988 but was deforested between 1988 and 2018. Values of "0" indicate no change in forest cover. Areas that were deforested before 1988 were also given a value of "0".

| Quasi-experimental design
Quasi-experimental methods construct a plausible counterfactual comparison group with similar biophysical characteristics to treatment sites (i.e., strictly protected areas). Such methods are a robust approach for ex-poste policy evaluation or where on-ground experiments are not feasible (i.e., due to ethical constraints). Since it is impossible to observe what would have happened to strictly protected areas in the absence of protection (Holland, 1986), this approach allows us to mimic a randomized control trial within the context of an ex-post study (Blackman, 2013;Jusys, 2018;Kirk, 2007;Stuart & Rubin, 2008). Specifically, we utilized statistical matching (hereafter referred to as matching). Matching is a method of pre-processing data such that the effect of protection is decoupled from the influence of co-variates which also influence observed outcomes. Matching achieves this by producing a statistically reasonable counterfactual group (Stuart, 2010) (Figure 2). Counterfactual areas were then used as a proxy to estimate the otherwise unobservable conservation outcomes of strictly protected areas had they not been protected.

| Identification of relevant covariates
Protected areas are expected to retain habitat and secure biodiversity in the long term by preventing deforestation. Protection and deforestation are both predicted and influenced by biophysical and landscape characteristics. For example, clearing for pastoral production is Australia's primary driver of land clearing (Department of Climate Change and Energy Efficiency, 2017). In Queensland, more than 88% of the state is used for primary industry (86% for pastoral production, and 2% for broad-acre cropping) (Department of Agriculture and Fisheries, 2018). Land suitability in grass biomass is a predictor of deforestation and protection because areas with high grass biomass represent prime cattle country. Woodlands with a high capacity for grazing are more likely to be cleared for this purpose and are therefore unlikely to be protected. Using this well-established principle, we developed a theory of change to guide the selection of co-variates ( Figure S1) and identified candidate covariates known to predict land clearing in this context. These were: distance to population centers, distance to roads, distance to watercourses, grass biomass, land zone (geological information), rainfall, slope, shaded relief, temperature and vegetation type (Andam et al., 2008;Cuenca, Arriagada, & Echeverría, 2016;Laurance et al., 2002;Veldkamp & Lambin, 2001) (Table 1). All data were sourced from the Queensland Government's publicly available spatial catalogue-"QSpatial" (Queensland Spatial Catalogue -QSpatial, 2019). We performed data preparation and cleaning in ArcGIS 10.2.1 (ESRI, 2014). The data were rasterized into the same spatial extent and resolution (250*250 m pixel size) for analysis. A resolution of 250 m was chosen because it is sufficient to maintain mapping accuracy for use in predictive modeling (Hengl et al., 2015;Storlie, Phillips, VanDerWal, & Williams, 2013).
Once rasterized, spatial data frames representing each covariate value at the pixel centroid were created and data checks were performed (S1.2).

| Pixel matching
Following multiple trials, we selected a random sample of each bioregional dataset comprising 20% of the total pixels. The number of pixels assessed varied by bioregion with a maximum of 1.4 million pixels (Brigalow Belt) and a minimum of 22,727 pixels (New England Tablelands) (S2 ; Table S1). We then interrogated the data for unacceptably high levels of correlation and removed variables that had a correlation coefficient greater than 0.7 or a variance inflation factor (VIF) greater than 0.4 (Hair, Black, Babin, & Anderson, 2013) (S3). Next, we used the MatchIt package (Ho, Imai, King, Stuart, & Whitworth, 2018) in R Version 3.3.2 and RStudio 3.3.2 to match protected and unprotected pixels based on their co-variates. Exact matching was used for categorical covariates (vegetation and landzone), and nearest-neighbor based propensity score matching with replacement was used for all continuous variables (Table 1). Propensity scores are a pixel's probability of being treated (protected) based on the baseline characteristics of the co-variates estimated via logistic regression (Rosenbaum & Rubin, 1983). Thus, the propensity score model had the formula: F I G U R E 2 Flow chart for the design of the quasi-experimental approach used in this analysis: design (left) and analysis (right) T A B L E 1 Description of each co-variate, including the logic behind its inclusion, the dataset name, data authority, year published, and data type Higher costs are associated with extracting from lands that are further from current urban areas (Chomitz & Gray, 1999).
Minimize the mean standardized difference between protected and unprotected groups

Continuous
Distance To Major Roads (Department of Transport and Main Roads, 2018) Roads facilitate access and are a known correlate to deforestation (Chomitz & Gray, 1999 The vegetation type is an appraisal criterion for national park selection and specific vegetation categories are more attractive for clearing (Seabrook, McAlpine, & Fensham, 2008), and clearing is permissible on specific vegetation types (Queensland Government, 2018a Nearest-neighbor matching selects the most similar control (unprotected) pixel for each protected pixel. It does this by selecting pixels with the most negligible mean standardized difference from the protected pixel's propensity score. This matching method was selected based on data characteristics: the co-variate distribution was not normal; the sample size was large, the outcome variable (cleared/not cleared) was dichotomous (Ho, Imai, King, & Stuart, 2007;Imbens & Rubin, 2015;Rubin, 2006). All unmatched control pixels were discarded, allowing us to estimate the treatment effects on the counterfactual group. Matching with replacement was used to enable control pixels to be used as matches for more than one protected pixel and decrease bias in the estimates of impact (Stuart, 2010). Further details on model specifications are provided in S1.2.

| Quality checks: Co-variate balance
We created paired boxplots and used a Man-Whitney U Test to demonstrate the differences between protected and cleared pixels. We evaluated match performance (co-variate balance) for continuous co-variates using Mean Standardized Difference (MSD), variance ratios (V), Kolmogorov-Smirnov (KS) test-statistics and Love Plots in the Cobalt package v3.4.1 (Greifer, 2018). Using Love Plots, we visualized the MSD in co-variate values for each co-variate within each bioregion (Greifer, 2018) based on a random sample of the data before and after the data was matched. Post-matching, MSD should be as close to zero as possible. We considered balance acceptable if MSD was less than or equal to 0.25, (Austin, 2009;Stuart, Lee, & Leacy, 2013). For V and KS, we considered scores less than or equal to 2 and 0.1, respectively, to indicate acceptable balance (Austin, 2009;Stuart et al., 2013). We report MSD, V and KS for each bioregion (Table 2; S4: Table S2). We also compared the similarity of the likelihood of protection (propensity scores) by investigating the distributions of values for protected and matched unprotected (i.e., counterfactual) pixels (Imai & Ratkovic, 2014) for all bioregions (Table 2; S4: Table S2: Figures S4-S23). When distributions overlapped well visually, we inferred the matching method produced a comparable set of counterfactual pixels (Stuart, 2010). We tested for hidden bias using Rosenbaum's bounds (S8; Figure S34) and examined the influence of spatial autocorrelation (S7: Figures S25-S33).

| Estimating causal impact
To estimate the causal impact of protection on deforestation, we calculated the Average Treatment Effect on the Treated (ATT). This allowed us to assess the likelihood of clearing per pixel in the absence of protection by comparing the expected change in forest cover, based on each pixel's propensity for protection (propensity score) and their co-variate values, with the actual change in forest cover (Arriagada, Ferraro, Sills, Pattanayak, & Cordero-Sancho, 2012;Imbens & Rubin, 2015). The ATT was derived using doubly robust methods (Rubin, 1973;Stuart, 2010;Stuart & Rubin, 2008), which use the propensity scores derived from matching as a co-variate (Stuart, 2010;Stuart & Rubin, 2008). This controls for any remaining imbalance between the co-variates of matched treated and untreated pixels resulting in robust estimates of impact (Rubin, 1973) in a process called "regression adjustment" (Blackman, 2013;Imbens, 2015;Rosenbaum & Rubin, 1983). Regression adjustment is a statistical procedure that uses co-variates that are known to drive clearing and the propensity score derived from matching as predictors in a logistic model to estimate the probability of clearing. Doing so quantifies the relationship between the covariates and the outcome (i.e., cleared or not cleared) for each counterfactual and control pixel (Guo & Fraser, 2014;Rubin, 1973). Regression adjustments were conducted in Zelig v5.1.6 (Hair et al., 2013;Nori et al., 2013) by fitting a weighted logistic regression model to the matched dataset. This model has the formula clearedpropensity_score þ co À variate1 þ co À variate2… ð2Þ To capture any uncertainty in the overall ATT estimate, we computed 1,000 simulations of this model for each bioregion (Horton & Kleinman, 2007) (see S9; Figure S33 for further details). Finally, since matching with replacement was utilized, weights were incorporated into the regression to reflect the number of times each counterfactual pixel was used to match (Stuart, 2010). The average values from the above model were then used to estimate the ATT. The ATT is the mean difference in the expected outcomes (or the values derived from the model) between the protected and counterfactual pixels (see the example from the Brigalow Belt in Table 2). Negative ATT values suggest clearing would have occurred if protection was not present. The higher the negative value, the more likely the average pixel would have been cleared in the absence of protection.
The mean ATT (King, Tomz, & Wittenberg, 2000) for each bioregion was used to estimate the total area of avoided loss (km 2 ) attributable to strictly protected areas (Rasolofoson, Ferraro, Jenkins, & Jones, 2015) by multiplying the bioregion's average ATT (as a % of the probability of clearing) by the total area within each bioregion that was cleared between 1988 and 2018 (Jusys, 2018;Miteva, Ellis, Ellis, & Griscom, 2019).

| Un-matched (naïve) estimation of avoided deforestation
To assess the implications of not performing statistical matching when calculating impact, we used the same subset of randomly sampled pixels (i.e., 20% of a bioregion's total number of pixels) and replicated the approach described above to calculate ATT without statistically matching treated (protected) and control (unprotected) pixels. This generated a naïve (non-robust) estimate of the impact of protection on deforestation (Table 2).

| Characteristics of strictly protected areas
We found notable differences between cleared and protected pixels in their biophysical characteristics (co-variates). Specifically, protected pixels were further from built-up areas, had a lower grazing capacity (grass biomass), and occurred on steeper slopes. Cleared areas tended to be closer to roads, have a higher temperature and lower rainfall. Overall, we found that the majority of co-variates were statistically dissimilar between protected and cleared areas, and this trend was consistent across bioregions (p < .05) (Figure 3).

| Pixel matching and co-variate balance
Despite significant differences in protected and unprotected groups before matching (S4; Figure S4), we were able to match between 99 and 100% of protected pixels in all bioregions to equivalent unprotected pixels (S2-6). A minimum of one of the statistical balance thresholds was met for all co-variates and bioregions, but the majority met more than one. An in-text exemplar of tabulated balance metrics for the Brigalow Belt is shown in Table 3. Of the co-variates included in each bioregion, 27.7% did not meet a conservative threshold of 0.1 for mean standardized differences; however, 97% reached a less conservative limit of 0.25 (Figure 4; Table 3). Metrics for all other bioregions are provided in S4: Table S2.
Similarly, the variance ratio was less than two for the majority (97%) of co-variates. However, 58% of co-variates failed to meet the Kolmogorov-Smirnov threshold. We concluded that our matching algorithms performed well in eliminating non-comparable pixels, but we performed a regression adjustment given the poor performance against  ote: This table shows the co-variate name, the mean of the unprotected pixels from the random sample, the mean average of the protected pixels from the random sample, and their mean standardized difference (MSD). It then shows the mean average of the unprotected and protected pixels after matching, and their mean standardized differences. The values for the matched test statistics include variance ratios (V) and Kolmogorov-Smirnov (KS) thresholds. For each threshold, a "^" is given next to the value if it is acceptably balanced. the Kolmogorov-Smirnov threshold. We found that all our results were robust to hidden bias (Γ >1.2) (Rasolofoson et al., 2015), and we found clearing is spatially autocorrelated (see supporting information (Table 3; S7 and S8).

| Comparing measures of avoided deforestation (average treatment effect on the treated, ATT)
Without matching, the estimate of avoided deforestation across all bioregions was 25%-more than double the matched estimate (10.5% Table 4). We observed significant differences in the estimated avoided deforestation for individual bioregions when comparing unmatched and matched approaches. The mean ATT was almost always higher without matching (Cape York being an exception). Without matching, 7.32% of clearing between 1988 and 2018 was avoided because of protection in the Briagalow Belt. With matching, the highest mean ATT was also in the Brigalow Belt (2.60%) but was comparatively lower than the naïve estimate. The lowest ATT with matching was observed in the Wet Tropics bioregion (0.26%). Without matching, the lowest ATT was observed in Cape York (0.17%) ( Table 4).

| ATT estimates vary between bioregions
The overarching characteristic of the study region was that highly cleared areas tended to have minimal protection, and strictly protected areas tended to have minimal clearing (Table 4). The highest matched ATT estimates were observed in the Brigalow Belt (À2.60%), Southeast Queensland (À1.60%), and the Mulga Lands (À1.42%). The lowest estimated ATT was observed in the Wet Tropics (À0.26%). The mean ATT was less than 1% for five of the nine F I G U R E 4 Love plots of propensity scores for each bioregion. Love plots show the standardized mean difference of co-variates before and after matching (Matched and Unmatched). Love plots illustrate the standardized mean difference between protected and unprotected pixels before and after matching. The dotted line is a conservative mean differences conservative threshold of 0.1-though values up to 0.25 are acceptable. In these plots, the variable "distance" is the propensity score. Unadjusted values are the standardized mean differences before matching, and adjusted values are the standardized mean. BB, Brigalow Belt; CY, Cape York; CQC, Central Queensland Coast; DU, Desert Uplands; EU, Einasleigh Uplands; MU, Mulga Lands; NET, New England Tablelands; SEQ, Southeast Queensland; WET, Wet Tropics bioregions in the study area (Cape York, Central Queensland Coast, Einasleigh Uplands, and the Wet Tropics) ( Figure 5), indicating strictly protected areas had minimal impact on avoiding deforestation in these bioregions.
• Between 1988 and 2018, 70,190 km 2 of land was cleared in the study region. We estimated that 518 km 2 (or 96,800 football fields) of the clearing was avoided because of strictly protected areas across the study region (covering 974,907 km 2 or approximately 182,184,232 football fields). The majority of avoided deforestation was 1,075 km 2 (200,889 football fields) in the Brigalow Belt. The smallest area of avoided deforestation was approximately 0.96 km 2 in the Wet Tropics (Table 3). In total, this means that 10.5% of land in strictly protected areas would have been cleared in the absence of protection. Put differently, 89.5% of the strictly protected areas included in this study are unlikely to have been cleared, even if they were never protected.

| DISCUSSION
We evaluated the potential of strictly protected areas to avoid land clearing in Queensland, Australia. Since 1988, the strictly protected area network in the study area tripled in area (13,480 km 2 in 1988 to 46,611 km 2 in 2018), with the primary objective of conserving biodiversity by avoiding and managing threatening processes (Queensland Government, 2017). Total deforestation in the considered bioregions was 57,488 km2 or about 10.7 million football fields in this period. Despite this growing threat, we found that 89.5% of land in strictly protected areas are not suitable for clearing, even in the absence of protection. The estimated ATT was highly variable between bioregions. Highly developed regions tended to have a higher mean ATT and regions with moderate to low development had an ATT close to zero. These results demonstrate that strictly protected areas are not guarantees of effective reduction in deforestation because strictly protected areas are biased towards areas with a low propensity for clearing. Our analysis does not include the full suite of protected areas. IUCN Management Categories of protected areas have different management intents (i.e., resource harvesting and recreation) and different drivers for their gazettal (Nature Conservation Act, 1992). To reliably quantify the impact, we chose to constrain our analysis to strictly protected areas to avoid conflating several additional economic, political or social drivers. Nonetheless, the results of these analyses support recommendations for outcome-based targets (Visconti et al., 2019b) focused on preventing threatening processes (Sacre, Weeks, Note: Results are presented for unmatched and matched samples. We also show the area of avoided deforestation in km 2 as estimated with matching. Estimates of avoided clearing (km 2 ) were calculated by multiplying the total cleared area in the bioregion between 1988 and 2018? by the mean ATT (%). ‡ Signifies a significant difference in mean ATT of the matched and unmatched datasets at the 5% level (p-value ≤ .05).
The total cleared area is the product of the number of pixels by 0.0625 (or the area of one pixel in square kilometres).
Bode, & Pressey, 2020). Using rigorous evaluation measures for conservation interventions, we can quantify the impact of conservation interventions leading to measurable outcomes for biodiversity. While other studies have considered the impact of protected areas at a state or national scale, our study uniquely examines the ATT by bioregion. In Queensland, the extent of deforestation per bioregion is not uniform (with between 0.34% and 11.6% of each bioregion cleared). Performing a per bioregion analysis provides insights into the drivers of deforestation at socioeconomic and biologically relevant spatial scales that would have otherwise been unobserved. For example, we observed significant variation in ATT estimates across bioregions (Table 4; Figure 4). The highest ATT was observed in the Brigalow Belt (2.6%). The Brigalow Belt, named Australia's most ecologically transformed area (Ponce Reyes et al., 2016), is heavily impacted by grazing activities. With 11.6% of the bioregion cleared between 1988-2018 and over 30% cleared since European settlement (Neldner et al., 2017), this bioregion has experienced the highest clearing rates in recent years (Queensland Department of Environment and Science, 2018). Likewise, in South-east Queensland, where a long history of development has resulted in a profoundly transformed landscape (Neldner et al., 2017), our results demonstrate that less than 2% of strictly protected areas are likely to have been cleared in this bioregion in the absence of protection.
In contrast to the low ATT of strictly protected areas in highly cleared bioregions, we also found low ATT in relatively intact bioregions. For example, the Cape York Peninsula has had 0.38% of its area cleared since 1988 and received an ATT estimate of 0.78% ( Figure 5). This bioregion contains vast and relatively undisturbed landscapes which support extraordinary ecological and Traditional Owner heritage values (Hitchcock et al., 2013). Clearing, however, is an emerging threat to this bioregion as it is targeted for future development under a Federal commitment to increase agricultural outputs in Northern  Australia (Commonwealth of Australia, 2015;Taylor, Payer, & Brokensha, 2015). It is possible that incorporating the likelihood of land clearing into protected area selection will help future protected areas make measurable contributions toward achieving the globally agreed goal of halting deforestation (United Nations, 2018). The next few years present a new opportunity to acquire high-impact protected areas that mitigate likely deforestation in Cape York.
Strictly protected areas with low estimates of avoided deforestation in regions where deforestation rates are low may have a high impact on other metrics because there are other protected area objectives. For example, we observed the lowest ATT in the Wet Tropics bioregion (0.26%). This mountainous and species-rich bioregion has, by per cent of total area, the largest protected area network (44.13%), but has also experienced extensive land clearing for agriculture (Neldner et al., 2017). Large portions of the protected area estate in the Wet Tropics safeguard the remnant and topographically complex rainforest habitat and its highly endemic fauna (Commonwealth of Australia, 1986) as World Heritage Areas (Liburd & Becken, 2017). Establishing a World Heritage Area in the Wet Tropics has successfully prohibited selective logging (Laurance, 1994). Our analysis considers clearing outright and does not address the impact of strictly protected areas in avoiding selective logging. We could not estimate avoided selective harvest because the spatial resolution of the remotely sensed satellite data is insufficient to measure selective timber harvest. For these reasons, the Wet Tropics bioregion is expected to have a low estimate of avoided deforestation but may have considerable impacts when evaluating alternative outcome variables.
Confounding variables may mask conservation program failure or mimic conservation success, and it is essential to use approaches (such as statistical matching) to control for confounding. Scale is an important consideration for confounding variables because they are not uniformly distributed across landscapes (Joppa, Loarie, & Pimm, 2008). Indeed, we found that there were differences in the landscape characteristics between protected and cleared pixels within the Queensland context. Protected pixels tended to have less favorable characteristics for agricultural development (i.e., higher slope). This result is consistent with previous studies demonstrating the non-uniformity of protected pixels across landscapes (Joppa & Pfaff, 2010). A failure to control for confounding would have over-estimated the ATT of strictly protected areas by as much as 50%. This result is consistent with extensive literature regarding the use of statistical matching for estimating impact (Andam et al., 2008;Bruggeman, Meyfroidt, & Lambin, 2015;Nolte, Agrawal, & Barreto, 2013;Rasolofoson et al., 2015). To ensure resources directed at conservation initiatives are used to their maximum capacity, credible information regarding the effectiveness of conservation interventions is fundamental.
Australia is a relatively high-income nation with globally significant species. Australia was an early adopter of establishing a reserve system based on systematic conservation planning principles (Australian Government, n. d.). We found that strictly protected areas are perhaps underperforming to avoid threatening processes imminence and severity of current biodiversity declines. If they are ill-performing, then the strategies guiding their selection require a restructuring. For example, it may be helpful for Governments to employ robust impact evaluations when considering candidates to fulfill specific socioecological objectives. Such evaluations can be used as a negotiation tool for costing considerations (Adams, Game, & Bode, 2014). Evaluations can be used to assess a parcel's merit and potentially provide a cost-benefit analysis for including that parcel in the network by crucially answering the question: "What would happen if I did nothing?" Publically reporting on this evaluation can be an invaluable negotiation tool for future assets and resourcing and fill a fundamental knowledge gap concerning how well the current network fulfills its objectives.

| LIMITATIONS AND FUTURE WORK
Our analysis was constrained to strictly protected areas. This was necessary to avoid conflating economic, social and political drivers of gazettal for lower IUCN management categories. Further research should compare the avoided deforestation of multiple Management Categories of protected areas. Outliers or Average Treatment Effect on the Treated (ATT) estimates above the first and third quartile, were present for all bioregions. Significantly, the ATT estimates in the New England Tablelands (NET) ranged from À0.07% to À6.70%, giving this bioregion a more comprehensive range than others considered in this study. We attribute the cause of the outliers to the extensive clearing (McAlpine, Fensham, & Temple-Smith, 2002;Queensland Government, 2018b) and a small area under protection in the bioregion (28 km 2 ). We, therefore, presented the estimated ATT for NET, but caution that outliers influence the mean estimate. We also note that our study considers the 30 years from 1988 to 2018, corresponding with the development of high-resolution, highly accurate datasets on land clearing (Department of Science ITaI, 1988-2016). Whilst this does not provide an estimate of impact since European colonization, it does provide an estimate of impact which corresponds to both modern clearing (Evans, 2016) and the rapid expansion of the protected area network (Australian Government, 2016), however, these results may not reflect future deforestation trends.

ACKNOWLEDGMENTS
The authors gratefully thank two anonymous reviewers for their thorough and thoughtful comments on this manuscript. We thank Johan Oldekop for the R code used to produce Bubble and Love plots, Erin Graham for coding consultation, Anna Pintor for rainfall data, and Oyelola Adegboye and Rhonnda Jones for statistical consultation. We also acknowledge Kallum Jones for providing edits to the final manuscript. Stephanie Hernandez acknowledges financial support from the Queensland State Government. The authors also acknowledge James Cook University's eResearch centre for high-performance computational resources.

CONFLICT OF INTEREST
The authors have no conflict of interest to declare.

AUTHOR CONTRIBUTIONS
Project conception and oversight was provided by Vanessa Adams and Stephanie Duce. Analysis and writing was completed by Stephanie Hernandez with editorial support from all authors. Megan Barnes helped develop R Code, project oversight and writing support.

ETHICS STATEMENT
This manuscript is the authors' own original work and have not been previously published elsewhere. This paper reflects the author's own research and analysis.

DATA AVAILABILITY STATEMENT
Please contact the corresponding author for access to data used in this analysis.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of this article.