Using multivariate analysis to deliver conservation planning products that align with practitioner needs


  • Simon Linke,

  • Matthew Watts,

  • Romola Stewart,

  • Hugh P. Possingham

S. Linke, M. Watts, R. Stewart and H. P. Possingham, The Univ. of Queensland, The Ecology Centre, School of Biology, Brisbane, QLD 4072, Australia. SL also at: Australian Rivers Inst., Griffith Univ., Nathan, QLD 4111, Australia.


This software note describes an extension to the conservation planning package Marxan in which multiple solutions can be evaluated instead of only relying on the measures of best solution and irreplaceability. For this extension we coupled Marxan with the statistical software R. The pool of possible conservation plans is transferred from Marxan into R – which returns an ordination plot, as well as a cluster dendrogram that can be used to evaluate similarity of solutions. Also, the most efficient solutions per group are flagged. We believe that identification of alternative planning options facilitates review and implementation of Marxan solutions as negotiating parties have multiple alternative starting points.

A common problem in the field of systematic conservation planning is to find the best configuration of reserves or other conservation areas covering all features that are needed to fulfill biodiversity targets within a minimum cost. The conservation software package Marxan (Possingham et al. 2000, Ball et al. 2009) is spatial optimization software designed to solve this minimum set problem using a simulated annealing algorithm (Kirkpatrick 1983) that minimizes the objective function: total economic and social cost plus a weighted sum of all the unfulfilled connections (boundaries) plus a penalty for all unfulfilled targets. While simulated annealing uses random small steps to try to find good local minima of the objective function, we cannot guarantee that we have found a global minimum because practical reserve design problems are too large. Having many good solutions in place of a single optimal solution is generally regarded as preferred practice because it enables decision makers to consider the feasibility of alternative options. Typically, Marxan finds between 100 and 1000 good local minima to the minimum set problem and uses two outputs to communicate the results: 1) the single best solution across all runs, identified by the smallest value of the objective function; and 2) the selection frequency derived for each planning unit (PU), sometimes called “summed irreplaceability” (Ball et al. 2009). This metric is the fraction of all the good local minima in which an individual PU is present. PUs that appear in every solution are termed irreplaceable and are probably essential for a cost-effective reserve system, while PUs that are never selected are unlikely to make an efficient contribution towards the conservation objectives.
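The two standard outputs can be illustrated with a minimal sketch. Marxan computes these internally; Python is used here only for illustration. All names, the toy numbers and the simplified scoring function (including the two weights) are assumptions made for this example, not Marxan's actual implementation or data.

```python
# Toy illustration only: names, numbers and the simplified score are
# assumptions, not Marxan's actual code or data.

# Each row is one solution (run); each column is one planning unit (PU).
# 1 = PU selected in that solution, 0 = not selected.
solutions = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]
pu_cost = [2.0, 3.0, 4.0, 5.0]  # hypothetical cost of each PU


def objective(solution, boundary_length, target_shortfall,
              boundary_weight=1.0, penalty_weight=10.0):
    """Cost + weighted boundary + penalty for unmet targets (simplified)."""
    cost = sum(c for c, s in zip(pu_cost, solution) if s)
    return (cost
            + boundary_weight * boundary_length
            + penalty_weight * target_shortfall)


# Hypothetical boundary lengths and target shortfalls for the three solutions.
boundary = [4.0, 6.0, 6.0]
shortfall = [0.0, 0.0, 0.1]

scores = [objective(s, b, t) for s, b, t in zip(solutions, boundary, shortfall)]

# Output 1: the best solution is the run with the smallest objective value.
best_run = min(range(len(scores)), key=scores.__getitem__)

# Output 2: selection frequency = fraction of solutions containing each PU.
n_runs = len(solutions)
selection_frequency = [sum(column) / n_runs for column in zip(*solutions)]
```

In this toy case the first PU is selected in every run, so its selection frequency is 1.0 and it would be termed irreplaceable in the sense used above.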

We recognize that both of these outputs have limitations for planners and policy-makers. Not least is that many practitioners may have issues and constraints that are not included explicitly in the original formulation of the problem. For various reasons these may never be expressed. This decision-making environment typically proceeds by taking the solution generated for one problem and massaging it until it is satisfactory to all stakeholders and/or decision-makers. The danger is that tinkering with a best solution that is unacceptable for reasons not included in the problem may not be a good way of finding other good, acceptable solutions. Conversely, a map of selection frequencies does tell us which planning units are essential and which we may do well to ignore, yet there will be many planning units with selection frequencies ranging from 10 to 90% for which the appropriate action is unclear. Many of these planning units are required to solve the conservation planning problem and meet all the targets; however, which ones are needed is not clear from the output. The selection frequency map only highlights where key focal areas are located, as guidance on where the prioritization process could start.

Relating software outputs to practical planning efforts is crucial for effective conservation outcomes and planning products that better meet the requirements of the decision makers. The interpretation of prioritization outputs and mainstreaming products is regarded as a critically important stage for the implementation of conservation plans, yet it has so far received limited attention in the conservation planning literature (Knight et al. 2006, 2008). New research techniques that can deliver prioritization outputs as planning products for end-users are needed to ensure that conservation planning approaches remain relevant within a real-world context (Ferrier and Wintle 2009).

In this paper we present an extension of the Marxan software that identifies a portfolio of very good solutions that form a representative sample of the possible solutions. While we could deliver the top ten ranking solutions, there is a high chance that all of these may be quite similar (small variations on the best solution). Ideally, what managers and decision makers need is a choice of options that are genuinely different, yet all reasonably good, so that they can then canvass them in further consultation and negotiation.

The computational strategy presented here is to find a limited number of solutions that have the fewest planning units in common, while still achieving conservation targets in an efficient manner. This can be a computationally large problem. Recent analyses with Marxan have prioritized up to 70 000 planning units (Klein et al. 2009). Obviously, a manual evaluation of a problem that size is not tractable. A multivariate evaluation of this problem has already been proposed by Airame (2005), who used cluster analysis to group alternative solutions. In this paper, we apply multivariate classification and ordination techniques to group solutions that are most similar and to identify single solutions representative of these groups. In this way, we can identify a number of options that represent feasible solutions as the basis for ongoing discussions.

Methods and results

Example data used in this study

The dataset for the case study consisted of 28 coastal and marine biodiversity features and 22 recreational features occurring at Rottnest Island, Western Australia (the attached Supplementary material provides a more detailed description). The data were originally compiled by the Dept of Environment and Conservation for the Rottnest Island Authority to assist in the development of the Rottnest Island Marine Management Strategy. The 28 coastal and marine biodiversity features included benthic habitats, coastal landforms and marine species such as invertebrates. Shore-based fishing data were further delineated according to the type of fish species targeted by recreational anglers (16 different species) and the areas where they are targeted. We set a blanket target of 25% representation for all conservation features.

Overview of computational strategy

After Marxan passes a two-dimensional solutions matrix into the R software package (R Development Core Team 2008), an R script performs classification and ordination of the Marxan solutions. Following the calculation of a dissimilarity matrix based on the presence or absence of a planning unit within a solution, the first output is a hierarchical cluster dendrogram in which solutions are split into a user-definable number of groups. Based on a visual representation of the dissimilarities in ordination space (reduced to three dimensions using non-metric multidimensional scaling), the user can select solutions representative of each group. Alternatively, the user can set a fixed number of clusters and will receive a list of the cheapest solution in each cluster.

Configuration of R and the Marxan input.dat file

The first step in the analysis is running Marxan. While the script can handle an unlimited number of solutions, we recommend restricting the output to 100 solutions, as more will make the graphical outputs very hard to read. To be able to run the analysis, R needs to be installed. In addition, the packages rgl, vegan and labdsv will need to be installed. A few additional parameters have to be set in the Marxan input.dat file. These are documented in the manual (Supplementary material).

Calculation of dissimilarity matrix

When using this extension to the software, Marxan will log and write out every solution. The solution matrix is in the form of a binary matrix (Table 1).

Table 1.  Solutions matrix out of Marxan to be imported into R.

This matrix is passed to R, where the R script calculates a Bray–Curtis dissimilarity matrix from the solutions matrix. The Bray–Curtis dissimilarity measure is used because it ignores joint absences (Faith et al. 1987), which are the predominant case in a solutions matrix. The output matrix, which lists the dissimilarities between all pairs of solutions, is then used as input to the classification and ordination steps.
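The dissimilarity calculation can be sketched as follows. The extension performs this step in R; the Python version below is only an illustration, and the function names and the small solutions matrix are hypothetical. It also checks the property cited above: padding the matrix with planning units that no solution selects leaves the dissimilarities unchanged.

```python
# Illustrative only: the extension computes this in R; names and data here
# are assumptions made for the example.

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum|x_i - y_i| / sum(x_i + y_i)."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0


def dissimilarity_matrix(rows):
    """Square, symmetric matrix of pairwise dissimilarities between rows."""
    return [[bray_curtis(r1, r2) for r2 in rows] for r1 in rows]


# Hypothetical solutions matrix: rows = solutions, columns = planning units.
solutions = [
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
]
d = dissimilarity_matrix(solutions)

# Appending planning units that no solution selects (joint absences)
# leaves every dissimilarity unchanged.
padded = [row + [0, 0, 0] for row in solutions]
assert dissimilarity_matrix(padded) == d
```

Here the first two solutions share one of their two planning units and are 0.5 apart, while the third shares none with either and sits at the maximum dissimilarity of 1.0.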

Classification and ordination

Within R, the “hclust” procedure performs a hierarchical agglomerative classification using a complete linkage algorithm on the Bray–Curtis dissimilarity matrix. First, the two most similar solutions are selected and linked as a mini cluster (in the example S5 and S8, Fig. 1). The node that links them becomes a starting point for the next agglomeration. After the entire tree is constructed, the user selects a cutoff point (four groups in the example) to receive a complete list of groups. In our test run using the Rottnest Island data, we created 20 solutions. The 20 solutions are grouped into four clusters with the cutoff at a dissimilarity of 75% (Fig. 1).
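The agglomeration described above can be sketched in a few lines. This is a small pure-Python stand-in for illustration, not R's hclust implementation; the function name and the dissimilarity values are hypothetical, and the tree is cut by asking for a fixed number of groups rather than a dissimilarity height.

```python
# Illustrative only: the extension uses R's hclust; this is a small stand-in.

def complete_linkage(dist, n_groups):
    """Agglomerate items until n_groups clusters remain.

    dist is a square symmetric dissimilarity matrix (list of lists).
    The distance between two clusters is the LARGEST dissimilarity
    between any pair of their members (complete linkage).
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                link = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical dissimilarities between five solutions (indices 0 to 4).
dist = [
    [0.0, 0.1, 0.8, 0.9, 0.7],
    [0.1, 0.0, 0.7, 0.8, 0.9],
    [0.8, 0.7, 0.0, 0.2, 0.6],
    [0.9, 0.8, 0.2, 0.0, 0.5],
    [0.7, 0.9, 0.6, 0.5, 0.0],
]
groups = complete_linkage(dist, n_groups=2)
```

The two most similar solutions (0 and 1) are linked first, then 2 and 3, and solution 4 joins the latter pair because its largest dissimilarity to that cluster is smaller than to the other.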

Figure 1.

R output for 20 alternative solutions from Rottnest Island.

To pick a single solution from each of the classification groups, the user can then run a non-metric multidimensional scaling procedure (nMDS) using the nmds procedure in R. In contrast to eigenvector-based ordination techniques such as principal component analysis, multidimensional scaling is a randomization technique. The advantage of using nMDS over other techniques is that it is relatively space conserving, i.e. it portrays the multidimensional space so that it reflects the true dissimilarities.

A measure of goodness of fit is the stress level (Kruskal 1964) which is defined as the percentage of mismatch between the dissimilarity matrix and the visual representation. The widely accepted limit for stress is 25%, which indicates a 75% match between the underlying data and the representation. Based on stress levels or the number of observations, the user can choose between a two-dimensional and a three-dimensional representation.
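For reference, one commonly used definition of stress is Kruskal's stress formula 1, sketched below. The symbols are introduced here for illustration and do not appear in the original text: d_ij is the distance between solutions i and j in the ordination plot, and \hat{d}_ij is the corresponding disparity, i.e. the value fitted by a monotone regression of the ordination distances on the original dissimilarities. Whether the R routine called by the script reports exactly this form is an assumption.

```latex
S = \sqrt{\frac{\sum_{i<j} \left(d_{ij} - \hat{d}_{ij}\right)^{2}}
               {\sum_{i<j} d_{ij}^{2}}},
\qquad \text{stress (\%)} = 100 \times S
```

Under this definition, a reported stress of 17% corresponds to S = 0.17.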

In Fig. 2, the example for the Rottnest Island dataset is displayed. At a stress level of 17% this is an adequate representation of the data. Solutions 3, 16, 9 and 17 would be obvious choices for user selection if the aim was to cover the maximum amount of variation. If the stress level for the two-dimensional representation is too high, the user can run an R-script that delivers an interactive three-dimensional representation using the RGL package (Fig. 3).

Figure 2.

2-dimensional nMDS: direct output from the R scripts.

Figure 3.

These solutions (1, 2, 14 and 17) were chosen as the cheapest solutions from each cluster. Blue shaded planning units are required in the solution. Note that the very different spatial designs result in similar costs (Table 2).

Cost-based automatic selection

If the user deems the visual selection mode too subjective, or wants to run more alternative scenarios (in which case the nMDS plots will be harder to read), an automatic selection mode is available in which the cheapest solution in each cluster is automatically written out. As demonstrated in Table 2, the alternative solutions are not necessarily a lot more expensive. While the global optimum run (out of 10 000 solutions) incurred a cost penalty of 615.9, the best solutions out of 20 clustered runs range between 616.8 and 646.76; the most expensive alternative is only 5% more expensive than the global optimum.
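The automatic selection rule and the cost margin can be sketched as follows. This is a Python illustration rather than the R script itself; the function names, solution labels, cluster memberships and per-solution costs are hypothetical, and only the two figures used in the final line (646.76 and 615.9) are taken from the text.

```python
# Illustrative only: solution labels, cluster labels and per-solution costs
# are hypothetical; the extension performs this step in R.

def cheapest_per_cluster(costs, cluster_of):
    """Return {cluster: solution} with the lowest-cost solution per cluster."""
    chosen = {}
    for solution, cost in costs.items():
        cluster = cluster_of[solution]
        if cluster not in chosen or cost < costs[chosen[cluster]]:
            chosen[cluster] = solution
    return chosen


def percent_above(cost, reference):
    """Extra cost of a solution relative to a reference, in percent."""
    return 100.0 * (cost - reference) / reference


# Hypothetical costs and cluster memberships for six solutions.
costs = {"S1": 620.0, "S2": 634.5, "S3": 651.0,
         "S4": 640.2, "S5": 628.9, "S6": 660.3}
cluster_of = {"S1": 1, "S2": 2, "S3": 1, "S4": 2, "S5": 3, "S6": 3}

selected = cheapest_per_cluster(costs, cluster_of)

# Using the two figures reported in the text: 646.76 against a best of 615.9.
margin = percent_above(646.76, 615.9)
```

With the reported figures the margin evaluates to just over 5%, consistent with the statement above.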

Table 2.  Summary of the solutions by cluster. Cheapest solutions in each cluster are in bold characters.
Solution    Cost    No. planning units    Cluster

While the four candidate solutions share similar patterns, we found that very few patches are completely irreplaceable. Only a large patch on the north shore and a smaller patch on the western end appear in every solution (Fig. 3).


While optimization software is commonly used in systematic conservation planning (Fernandes et al. 2005), the success of conservation action often hinges on implementation issues (Knight et al. 2008). Selection frequency maps provide some guidance as to the importance of individual planning units, yet they do not ensure complete coverage of conservation targets. In contrast, the single optimum solution can often be rejected by the participating parties – sometimes leading to a complete breakdown in the conservation negotiation.

Delivering effective conservation planning products requires knowledge of local conservation contexts and can take considerable time and resources. We anticipate that increased institutional uptake of systematic conservation planning methods will lead to the development of more end-user products that can better support and inform decision-making. Our own experience in dealing with practitioners, and the experiences cited in the literature (Airame et al. 2003, Fernandes et al. 2005, Klein et al. 2008), have highlighted that the decision-making environment often seeks to consider options before engaging in a process of refinement that involves feedback from numerous stakeholders. Indeed, Airame's (2003) experience in the Channel Islands suggests that the exploration of alternative solutions is critically important to the planning process, more so even than optimality. While the best solution and the selection frequency map are useful indicators of priority areas, they often fail to indicate alternatives. Translating spatial prioritization outputs into products tailored to the planning framework in which practitioners operate clearly has considerable potential for real-world applications.

We have developed an extension to Marxan, translating the software outputs into a user-friendly product that delivers alternative candidate solutions that retain a high degree of optimality. Apart from always accepting “good” steps, the simulated annealing algorithm in Marxan also sometimes accepts “bad” moves in the optimization, resulting in a wider exploration of the solution space. Hence, the local minima found often capture the variation of the solution space and deliver real planning options for consideration and negotiation. Visual evaluation of the nMDS plots identifies the most distinct solutions; however, this will not necessarily deliver the most efficient solution in every cluster. In our example this did not really matter, as all solutions were within 5% of the value of the best solution. This very small margin demonstrates the usefulness of the application and of its output for negotiations amongst stakeholders. As we demonstrated, even the second or third ranked solutions can be very useful, especially when they add at most 5-10% to the total economic cost part of the objective function. Note that the margin will not be as small, nor the spread of solutions as large, in every case. In a scenario with higher targets or more endemic features, for example, more areas will be locked into every solution.

The flexible output from a dual approach using classification and ordination helps users of varied backgrounds, from managers to quantitative ecologists. The automated output (Table 2 in conjunction with a dendrogram) is an automatic decision aid for users who have little or no experience with multivariate statistics. As discussed above, this will already yield a good result and deliver a diverse portfolio of options whilst minimizing costs. The effectiveness of this “autopilot mode” is demonstrated in Fig. 3. Advanced users who are comfortable with multivariate statistics can evaluate the ordination plot (Fig. 2) in up to three dimensions simultaneously (or up to four dimensions on separate plots), thereby finding the most distinct solution in each group. This also avoids pitfalls in cluster analysis in which chaining effects cause pseudo-separation of sites that are fairly similar.

This paper demonstrates that multivariate analysis is a useful tool to distinguish amongst conservation planning options by identifying a range of candidate solutions that are of maximum dissimilarity, while still being efficient. The product is not intended to replace the traditional measures of summed solution and best solution; rather, it offers end-users a translation of Marxan outputs into alternative planning options and therefore facilitates uptake of Marxan solutions (Airame 2005). Ensuring effective products are delivered enhances the likelihood of successful mainstreaming (Knight et al. 2009).

Download the Supplementary material as file E6351 from <>.