Irreplaceability of river networks: towards catchment-based conservation planning

Authors

  • S. Linke,

    Corresponding author
    1. eWater CRC and Institute of Applied Ecology, Building 15, University of Canberra, Belconnen, ACT 2601, Australia; and
    2. The Ecology Centre, University of Queensland, St Lucia, Qld 4072, Australia
  • R. H. Norris,

1. eWater CRC and Institute of Applied Ecology, Building 15, University of Canberra, Belconnen, ACT 2601, Australia
  • R. L. Pressey

    1. The Ecology Centre, University of Queensland, St Lucia, Qld 4072, Australia
    • Present address: Australian Research Council Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Qld 4811, Australia.


*Correspondence author. E-mail: simon.linke@gmail.com

Summary

  1. This study has adapted a complementarity-based area-selection method to estimate conservation value/irreplaceability for river systems. Irreplaceability represents the likelihood that an area will be required as part of a conservation system that achieves all conservation targets. We adapt this measure, often used in marine or terrestrial planning, to consider whole-of-catchment protection in a riverine setting.
  2. After dividing the Australian state of Victoria into 1854 subcatchments, we successfully modelled distributions of 400 benthic macroinvertebrate taxa using generalized additive models. We calculated the minimum area required to protect all taxa using three different heuristic selection algorithms. The algorithms were modified to consider the entire upstream catchment for any subcatchment.
  3. A summed rarity algorithm, corrected for upstream area, proved to be the most efficient, requiring 100 000 hectares less total catchment area to represent all taxa than the second most efficient algorithm. We calculated irreplaceability by running the algorithm 1000 times and randomly removing 90% of the catchments in each run. From this analysis, we estimated two metrics: Fs (the frequency of selection) and average c (average contribution to conservation targets).
  4. Four groups of catchments were identified: (i) catchments that have high contributions and are always or very frequently selected; (ii) catchments that have high contributions and are infrequently selected; (iii) catchments that are always or very frequently selected but contribute few taxa; and (iv) catchments that are infrequently selected and contribute few taxa.
  5. Synthesis and applications. For the first time, a complementarity-based algorithm has been adapted to a riverine setting. This algorithm acknowledges the connected nature of rivers by considering not only the local assemblages, but also upstream areas that need to be protected. We demonstrated that using standard algorithms in these connected systems would lead to two mistakes, namely: (i) not all taxa would be covered by reserves that were buffered from potential human disturbances upstream; and (ii) the standard algorithms would not lead to the most efficient solution, potentially costing additional millions of dollars to any conservation scheme. We therefore recommend the use of our algorithm or a similar riverine adaptation of reserve design algorithms to ensure adequate and efficient conservation planning.

Introduction

As freshwater biodiversity is lost rapidly all over the globe (Abell 2002), calls for establishing protected freshwater areas are increasing (Kingsford et al. 2005; Abell, Allan & Lehner 2007). However, river conservation science still lags behind terrestrial assessments of conservation value and terrestrial applications of systematic conservation planning in both quality and quantity (Abell 2002; Sarkar et al. 2002; Cowling et al. 2003). While the Ramsar system has allowed wetlands to be listed on the basis of their aquatic features (although in practice, mainly birds), only a handful of rivers with reserve status were specifically designated as freshwater reserves. Most of these are protected within the framework of a terrestrial reserve system. One obvious limitation of this approach is that terrestrial conservation planning usually does not consider aquatic taxa as targets, which are thus under-represented (Nilsson & Gotmark 1992). Also, specific threats to freshwater ecosystems and their high levels of connectedness are not necessarily considered in the selection of terrestrial reserves (Filipe et al. 2004).

The most fundamental difference between terrestrial and riverine conservation planning is the required spatial configuration of potential protected areas: river networks are connected systems, laterally, longitudinally and vertically (Ward 1989). While terrestrial conservation planning is increasingly addressing issues of connectivity when dealing with metapopulations (Cabeza 2003), the nature and scale of connectivity are different in freshwater systems. Sections of a river can be affected by activities hundreds or even thousands of kilometres upstream. Therefore, in accordance with Hynes’ (1975) paradigm that ‘the valley rules the stream’, upstream areas – as well as the surrounding land – must be considered when estimating conservation value or developing a system of protected areas. While recent studies also note that influences across drainage basins including downstream influences should be considered (Pringle 2001), we propose here, for simplicity's sake, to only incorporate upstream protection into a potential measure of conservation value.

Unlike previous freshwater conservation studies, we do not use a richness or rarity metric, but introduce a method based on complementarity, the key principle underlying all systematic planning methods in the terrestrial and marine realms (Margules & Pressey 2000). More specifically, we adapt complementarity to recognize upstream connectivity of freshwater systems. Complementarity-based selection algorithms require quantitative conservation targets for biodiversity surrogates such as species or habitat types. The role of these methods is often to minimize the total cost or area of achieving all targets (minimum sets). However, near-minimum or true minimum (hereafter ‘minimum’) sets represent only single solutions to the problem of achieving targets for all features. To explore the options for achieving targets, Pressey et al. (1993) and Ferrier, Pressey & Barrett (2000) proposed a new measure: ‘irreplaceability’. The original operational definition of irreplaceability has two aspects (Pressey et al. 1993): (i) the likelihood that an area will be required as part of a conservation system that achieves all targets; and (ii) the extent to which the options for achieving all targets are reduced if the area is unavailable for conservation.

Because measurement of irreplaceability is a combinatorial problem, it cannot be calculated exactly for large regional data sets and has to be estimated – seemingly trivial, yet deceptively difficult (Jacobi et al. 2007). Only some areas will default to irreplaceabilities of 1 because they have unique features. Areas can have high irreplaceability values for other reasons, depending on the way targets are defined and features are distributed across regions. More generally, it is important to measure irreplaceability of all areas across the range of values from 1 to zero. Here we present a new estimation method using a bootstrapped heuristic algorithm.

The few systematic conservation planning approaches for rivers to date use landscape attributes or river classes as features to be represented (Roux et al. 2002; Fitzsimons & Robertson 2005; Thieme et al. 2007). This is appropriate in areas lacking detailed data on species distributions. Although we do not believe that perfect taxonomic data are necessary for conservation planning, we decided to set targets based on actual taxa, as the use of environmental surrogates continues to be heatedly debated (Brooks, da Fonseca & Rodrigues 2004; Pressey 2004). To avoid negative spatial bias in unsampled areas, we modelled distributions of our target organisms, in a manner analogous to Clark & Slusher (2000).

The main aim of this study is to develop a method to estimate the conservation value of river systems subject to the following requirements:

  1. The method is based on complementarity, identifies minimum sets, and estimates irreplaceability values.
  2. Upstream effects are recognized and accounted for by incorporating the need for protection of the entire upstream catchment area into the reserve selection algorithm.
  3. Taxa are the features to be targeted and represented.

We demonstrate our approach using data on benthic macroinvertebrates in the state of Victoria, in southern Australia.

Methods

study area

The Australian State of Victoria is about 227 600 km2 and covers a wide variety of landforms and climatic conditions. The western plains are a semi-arid region, the main natural land cover in southern coastal areas is temperate rainforest, and the Victorian part of the Snowy Mountains is an alpine region rising above 2000 m above sea level. The northwest corner of Victoria was omitted for two reasons. First, parts of it have no water at all. Secondly, the main stem of the River Murray has a distance to source of 1000 km, with headwaters also in other Australian states; the size of its catchment would have presented problems for our analyses.

planning units

We accounted for the connected nature of rivers and the natural boundaries of influence by delineating subcatchments as planning units. We used the SRTM (Shuttle Radar Topography Mission) 3 arc second DEM (van Zyl 2001) and patched holes smaller than 3 pixels using 3-DEM (Horne 2006). A total of 1854 subcatchments was then delineated using ArcHydro (Maidment 2002) within ArcGIS 9 (ESRI 2002). The size of these subcatchments ranged between 10 km2 and 50 km2. These entities are hereafter referred to as ‘subcatchments’, in contrast to ‘catchments’, which describes the entire area upstream of a subcatchment.

benthic macroinvertebrate data and environmental predictors

Benthic macroinvertebrates were used as targeted features to calculate minimum sets and irreplaceability of subcatchments. Invertebrates were collected using a D-net kick sample (250 µm mesh) and sorted in the field (see Metzeling et al. 2003) until at least 200 individual specimens were recovered. Invertebrates in 222 subcatchments were identified to species where possible and a total of 1065 taxa recorded (at genus or family level if species level was not possible). Distributions in the remaining catchments were modelled using the procedure described below.

Predictor variables for modelling benthic invertebrates in unsampled catchments can be categorized into four major groups.

  1. Location of the subcatchment in the landscape. These included latitude and longitude and distance from the river mouth at the sea. Total catchment area upstream was calculated using ArcHydro (Maidment 2002). We also calculated mean elevation and range in elevation of the whole catchment upstream, as well as mean elevation of the subcatchment and standard deviation of elevation as a measure of topography.
  2. Climate. Taken from the ANUCLIM model (ANU Climate Model, Houlder et al. 2000). Average rainfall was calculated for the subcatchment, but rainfall was also totalled across the whole catchment area. Temperature from ANUCLIM was averaged at the subcatchment level.
  3. Landform. Slope was derived from the SRTM DEM using Spatial Analyst in ArcGIS 9 (ESRI 2002). Mean and standard deviation of slope were summarized at both the catchment and the subcatchment level. More complex landform categories (ridge, ravine, concave slope, convex slope, and saddle) were calculated using the ‘toposhape’ module in IDRISI Kilimanjaro (Clark Labs 2004). Percentages were summarized by subcatchment (a zonal-summary sketch follows this list).
  4. Geology and vegetation. Vegetation growth category and vegetation density were calculated from Carnahan's spatial vegetation coverage (AUSLIG 1991). After converting the coverage to a 3 arc-second raster layer, we coded growth classes to 0 (grasses), 1 (low shrubs), 2 (tall shrubs/low trees) and 3 (medium/tall trees). Vegetation density classes were recoded to 0 (< 10% density), 1 (10–30%), 2 (30–70%) and 3 (> 70%). Percentages of sandstone, siltstone, limestone, acid volcanic and basic volcanic rocks were derived from the digital version of the 1:2·5 million Geology of Australia Map (BRS 1991).
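Many of these predictors are summaries of gridded data per planning unit. As a concrete illustration of the zonal-summary step referred to above, the following minimal Python sketch computes the mean of a gridded predictor (e.g. slope) for each subcatchment; the array-based inputs, the use of pandas, and the nodata convention are illustrative assumptions rather than the GIS workflow actually used.

```python
# Illustrative zonal summary: mean of a gridded predictor per subcatchment.
# Assumes two aligned numpy arrays: one of predictor values (e.g. slope) and
# one of subcatchment IDs, with 0 marking cells outside any subcatchment.
import numpy as np
import pandas as pd

def zonal_mean(values: np.ndarray, zones: np.ndarray) -> pd.Series:
    """Return the mean of `values` within each subcatchment ID in `zones`."""
    df = pd.DataFrame({"zone": zones.ravel(), "value": values.ravel()})
    df = df[df["zone"] > 0]                       # drop cells outside subcatchments
    return df.groupby("zone")["value"].mean()     # indexed by subcatchment ID

# Example with a toy 3 x 3 grid:
# slope = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
# ids   = np.array([[1, 1, 2], [1, 2, 2], [0, 2, 2]])
# zonal_mean(slope, ids)   ->  zone 1: 2.33..., zone 2: 6.2
```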

using generalized additive models to model distributions of taxa

We derived a predictive relationship for each taxon using Generalized Additive Models (GAMs, Hastie & Tibshirani 1999). The ability of GAMs to model both monotonic and Gaussian response curves makes them appropriate statistical models for ecologists (Yuan 2004). When building the GAMs, we decided not to withhold a validation data set, mainly because the rarer taxa would have been impossible to validate on only one or two observations. However, we took two measures to avoid overfitting. First, we removed taxa with fewer than 10 observations in the entire data set. Secondly, analogous to the modelling approach of Yuan (2004), we ran a principal components analysis (PCA) to eliminate correlated variables and reduce the number of predictors. The predictors that showed the highest correlation with the first six principal axes were used to build a stepwise GAM for every taxon (Table 1). The criterion used in the stepwise selection of predictors was corrected AIC (Akaike's Information Criterion). To reduce overfitting, the maximum number of variables was set to two and the spline smoothing was limited to 2 degrees of freedom, effectively allowing only one optimum.
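The predictor-reduction step can be sketched in a few lines. The code below is a minimal illustration only, assuming the environmental variables sit in a pandas DataFrame with one row per subcatchment; scikit-learn's PCA stands in for whatever statistical software was originally used, and the file and column names are hypothetical.

```python
# Minimal sketch of choosing one predictor per principal axis (illustrative).
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def select_predictors(env: pd.DataFrame, n_axes: int = 6) -> list:
    """Return, for each of the first n_axes components, the predictor with
    the highest absolute loading (duplicates skipped)."""
    X = StandardScaler().fit_transform(env)            # standardize predictors
    pca = PCA(n_components=n_axes).fit(X)
    chosen = []
    for loadings in pca.components_:                   # one row per component
        best = env.columns[abs(loadings).argmax()]
        if best not in chosen:
            chosen.append(best)
    return chosen

# Usage with hypothetical variable names:
# env = pd.read_csv("subcatchment_predictors.csv", index_col="subcatchment_id")
# predictors = select_predictors(env)   # e.g. ['mean_slope', 'upstream_area', ...]
```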

Table 1. Predictors with high loadings (> 0·30, shown in brackets) on the first six principal axes; * indicates the predictor used for modelling.
PC1: Average slope in the subcatchment* (0·31); Standard deviation of slope in the subcatchment (0·30); Average elevation in the subcatchment (0·30)
PC2: Catchment area upstream* (0·39); Total rainfall in the catchment (0·37); Range in catchment elevation (0·34)
PC3: Percentage ravines* (0·34); Percentage ridges (0·30)
PC4: Vegetation density category* (0·45); Percentage limestone (0·31)
PC5: Percentage limestone* (0·35); Local temperature (0·30)
PC6: Percentage sandstone* (0·55); Percentage acid volcanic (0·41)
Other predictors with high loadings: Latitude (0·35); Percentage sandstone (0·30); Catchment area upstream (0·35); Percentage convex hills (0·31)

Our measure of modelling success was the area under the curve (AUC) of the Receiver Operating Characteristic (ROC). AUC measures how well a model discriminates presences from absences across all classification thresholds, summarizing the trade-off between true-positive and false-positive rates. Based on the classification by Boyd et al. (2005), we set the cut-off for successful predictions to 0·6. Good predictions were defined as AUC > 0·8, very good predictions as AUC > 0·9. We predicted probabilities of occurrence of 400 successfully modelled taxa for all 1854 subcatchments. Probabilities were then converted to presence/absence at a threshold of 0·5.
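As a sketch of this evaluation and thresholding step, the snippet below computes AUC for one taxon and converts predicted probabilities to presence/absence; scikit-learn's roc_auc_score is used here purely for illustration, and the 0·6 and 0·5 cut-offs follow the values given above.

```python
# Sketch of per-taxon evaluation (AUC) and thresholding (illustrative).
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_and_threshold(observed, predicted_prob, min_auc=0.6, cutoff=0.5):
    """Return (auc, presence/absence array), or (auc, None) if the model
    falls below the AUC cut-off used for 'successful' predictions."""
    auc = roc_auc_score(observed, predicted_prob)
    if auc < min_auc:
        return auc, None                              # taxon excluded from planning
    return auc, (np.asarray(predicted_prob) >= cutoff).astype(int)
```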

minimum sets and irreplaceability

Our representation goal was to design a network capturing all 400 successfully modelled taxa at least once. Three heuristic approaches to estimating a minimum set were chosen because they consist of simple rules, which makes them easy to follow (Margules & Pressey 2000) and easy to modify to include rules for catchment connectivity.

All three algorithms were modified to account for the connected nature of rivers. To ensure that upstream disturbances would not affect potential aquatic reserves, we introduced a rule that the entire catchment upstream of any selected subcatchment had to be protected. In the reserve design algorithm, single subcatchments were forbidden configurations if they were not headwater catchments (Fig. 1). Selection of a non-headwater subcatchment for conservation also selected the subcatchments upstream (Fig. 1c). All taxa found upstream of a single subcatchment were determined using a propagation algorithm based on the ArcHydro network (Maidment 2002). This upstream taxa list was the input for the three heuristic algorithms. Additionally, because catchment size varies by a factor of 1000 from headwater catchments to the mouth of the Snowy River, the measures of number and rarity used in the selection process had to be corrected for total area.

Figure 1.

Forbidden and allowed configurations in the complementarity algorithm. Protected subcatchments are highlighted by shading. Isolated subcatchments that are not headwater catchments (a) are forbidden configurations. Single headwater subcatchments (b) are allowed configurations as are whole catchments that include the subcatchment of interest (c), in this case marked with an asterisk.
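The upstream propagation described above can be expressed compactly once the stream network is available as a simple adjacency structure. The sketch below assumes a dictionary mapping each subcatchment ID to its immediate upstream neighbours; this data structure and the function names are assumptions for illustration, not the ArcHydro representation actually used.

```python
# Sketch of the whole-of-catchment rule: collect all upstream subcatchments
# and the union of their predicted taxa (illustrative data structures).
from typing import Dict, List, Set

def whole_catchment(unit: int, upstream_of: Dict[int, List[int]]) -> Set[int]:
    """All subcatchments draining into `unit`, including `unit` itself."""
    stack, visited = [unit], set()
    while stack:
        sub = stack.pop()
        if sub not in visited:
            visited.add(sub)
            stack.extend(upstream_of.get(sub, []))    # headwaters have no entry
    return visited

def catchment_taxa(unit: int,
                   upstream_of: Dict[int, List[int]],
                   taxa_in: Dict[int, Set[str]]) -> Set[str]:
    """Union of predicted taxa over the whole catchment upstream of `unit`."""
    members = whole_catchment(unit, upstream_of)
    return set().union(*(taxa_in.get(s, set()) for s in members))
```

Selecting a non-headwater subcatchment then amounts to selecting whole_catchment(unit) as a block, which is how the forbidden configurations in Fig. 1 are avoided.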

The richness-based greedy heuristic algorithm (Kirkpatrick 1983) starts by finding the catchment with the highest number of taxa. The catchment and the associated taxa are then removed from the data set and richness is recalculated; hence, successive selections will find catchments with the highest number of previously unrepresented taxa. The procedure continues until all taxa are in at least one selected catchment. We corrected the richness measure by dividing it by the size of the catchment to incorporate one type of cost. The coefficient for selecting catchments was therefore:

c = n/area  (eqn 1)

where c is contribution to targets, n is number of taxa in the catchment, and area is hectares covered by the catchment. The different catchment sizes meant that the algorithm did not have to choose between catchments with equal values of c, making tie-breaking rules (Pressey et al. 1997) redundant.
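A minimal sketch of this area-corrected greedy selection is given below; `catchments` is assumed to map each candidate whole catchment to its area (ha) and the set of taxa it contains (already propagated upstream). This is an illustrative data structure, not the authors' implementation.

```python
# Sketch of the greedy heuristic with the richness-per-area score (eqn 1).
def greedy_minimum_set(catchments: dict) -> list:
    """catchments: {catchment_id: (area_ha, set_of_taxa)} -> ordered selections."""
    unrepresented = set().union(*(taxa for _, taxa in catchments.values()))
    selected = []
    while unrepresented:
        # c = n / area, where n counts only taxa not yet represented
        best = max(catchments,
                   key=lambda k: len(catchments[k][1] & unrepresented) / catchments[k][0])
        if not catchments[best][1] & unrepresented:
            break                                     # no catchment adds new taxa
        selected.append(best)
        unrepresented -= catchments[best][1]
    return selected
```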

Based on the findings of Csuti et al. (1997), we used a progressive rarity algorithm (Margules, Nicholls & Pressey 1988) as the second heuristic. In this method, also used in ResNet (Sarkar et al. 2002), the first catchment selected is the one containing the rarest taxon, with rarity here divided by the area of the catchment (equation 2):

c = 1/(f × area)  (eqn 2)

where c is the selection metric, f is the frequency of the taxon in the entire data set (recalculated after every iteration), and area is hectares covered by the catchment. As above, c is recalculated as selections proceed until all taxa are in at least one selected catchment.

The third algorithm is based on summed rarity, adjusted for area (equation 3), analogous to the algorithms used by Rebelo & Siegfried (1992):

c = Σ(1/f)/area  (eqn 3)

where c is contribution to targets, summed across all taxa in the catchment and corrected for area, f is frequency of the taxon in the entire data set, and area is hectares covered by the catchment. Again, c is recalculated until all taxa are in at least one selected catchment.
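The two rarity-based rules differ from the greedy heuristic only in the score used at each iteration. The sketch below expresses equations 2 and 3 as interchangeable scoring functions that could replace the richness-per-area score in the loop above; the frequency table `freq` (number of catchments containing each taxon, recalculated as taxa are covered) is an assumed structure for illustration.

```python
# Illustrative scoring functions for the rarity-based heuristics.
def progressive_rarity_score(area, taxa, unrepresented, freq):
    """Eqn 2: rarity (1/f) of the rarest still-unrepresented taxon, per hectare."""
    new = taxa & unrepresented
    if not new:
        return 0.0
    rarest_f = min(freq[t] for t in new)
    return 1.0 / (rarest_f * area)

def summed_rarity_score(area, taxa, unrepresented, freq):
    """Eqn 3: summed rarity (sum of 1/f) over still-unrepresented taxa, per hectare."""
    new = taxa & unrepresented
    return sum(1.0 / freq[t] for t in new) / area
```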

To calculate irreplaceability while still incorporating catchment area, we developed a new approach that is not based on a statistical estimator (Ferrier et al. 2000) but is closer to Rebelo & Siegfried's (1992) approach of re-running a heuristic selection multiple times. Each time we ran the algorithm, we removed a fixed percentage of catchments: a simulated analogy to the real-world scenario that drove the development of irreplaceability measures, in which a certain percentage of the area is made unavailable for conservation. We removed catchments randomly 1000 times, ran the selection algorithm each time and repeated this process for three levels of removal: 50%, 70% and 90%. For each level, we then derived two measures of conservation importance from the 1000 minimum sets. The first was Fs, the frequency of selection, indicating the percentage of the 1000 minimum sets in which a catchment was represented. The second was average c, the average contribution to targets of each catchment over the 1000 runs.
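The bootstrapped estimation can be summarized as below. This is a sketch only: `run_selection` is a hypothetical wrapper around the selection loop sketched earlier that returns (catchment, contribution c) pairs for one minimum set, and the removal fraction, run count and random seed are illustrative.

```python
# Sketch of the bootstrapped irreplaceability estimate (Fs, average c, summed c).
import random
from collections import defaultdict

def bootstrap_irreplaceability(catchments, run_selection, runs=1000, remove=0.9, seed=1):
    rng = random.Random(seed)
    times_selected = defaultdict(int)
    summed_c = defaultdict(float)
    ids = list(catchments)
    keep_n = int(len(ids) * (1.0 - remove))
    for _ in range(runs):
        kept = {k: catchments[k] for k in rng.sample(ids, keep_n)}
        for unit, contribution in run_selection(kept):        # one minimum set
            times_selected[unit] += 1
            summed_c[unit] += contribution
    fs = {k: times_selected[k] / runs for k in times_selected}          # frequency of selection
    avg_c = {k: summed_c[k] / times_selected[k] for k in times_selected}
    return fs, avg_c, dict(summed_c)        # summed c = irreplaceability (see eqn 4)
```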

Both properties are important to planning scenarios. For example, catchments with small values for average c could have high values for Fs because they contain endemic taxa. Catchments with low values of Fs could still have high values for average c because, when they are selected, they make very large contributions to targets. To combine both properties into a single index, we use summed c, the sum of c over all the runs in which a catchment was selected:

summed c = c1 + c2 + … + cn  (eqn 4)

where n is the number of minimum sets in which the catchment is present and ci is the contribution the catchment makes in the ith minimum set. This measure integrates both contribution to targets and frequency of selection and is hereafter termed ‘irreplaceability’.

To compare the new estimator against existing measures of irreplaceability, we ran the subcatchment-level data set through c-plan (Pressey et al. 2005) and marxan (Possingham, Ball & Andelman 2000). The subcatchment data set was used instead of the ‘whole-of-catchment’ data, because c-plan cannot currently accommodate cost or area and marxan cannot currently deal with planning units that are nested within each other. We compared summed c to c-plan's analogous summed irreplaceability. We also compared the selection frequency Fs of the bootstrapped heuristic to the analogous selection frequency in marxan.

Results

The first six principal components of the PCA on the predictor variables explained 71% of the total variation in the data set. The following six variables with the highest loadings (in brackets) were selected as predictors for the generalized additive models:

  1. Average slope in the subcatchment (0·31)
  2. Catchment area upstream (0·37)
  3. Percentage of ravines in the catchment (0·34)
  4. Vegetation density (0·44)
  5. Percentage of limestone in the catchment (0·35)
  6. Percentage of sandstone in the catchment (0·56)

Other correlated variables with high loadings (> 0·30) on the six main axes are listed in Table 1.

Table 2. Steps and areas required for minimum sets using the three different algorithms
Algorithm             Steps required    Percentage of total area
Greedy heuristic      35                10·6%
Progressive rarity    44                11·3%
Summed rarity         32                10·1%

Out of the 452 macroinvertebrate taxa with more than 10 occurrences, 400 taxa were successfully modelled. The other 52 taxa remained below the ROC AUC threshold of 0·6 or did not produce any predictions above 50% probability of occurrence. Of the 400 successfully modelled taxa, most predictions were good (37%) or very good (50%), with only 10% of models having an AUC below 0·7.

Vegetation density was the most common predictor, appearing in 38% of the models. This was followed by slope and percentage ravine (both 31%) and catchment area upstream (20%). Percentage sandstone and percentage limestone were the least common of the six predictors, appearing in only 10% and 8% of models, respectively.

When calculating the minimum sets, the summed rarity algorithm was slightly more efficient than the greedy heuristic and much more efficient than the progressive rarity algorithm (Table 2). Although the difference in percentage of total area (Table 2) might seem small, the 0·5% difference in required area between the best two algorithms equates to 1000 km2. The remaining results are reported only for the summed rarity algorithm.

Examination of the selection patterns shows that only a relatively small total area has to be protected to cover most taxa. Protection of only 2% of the area will represent 90% of the taxa in the data set (Fig. 2). Only the last 10% of taxa require an additional reserve area of 8% (Fig. 2).

Figure 2.

Accumulation curve for the summed rarity algorithm.

The first steps of the algorithm reveal an interesting pattern. First, small catchments with many rare taxa are selected. Then, slightly bigger catchments are chosen, still with a bias towards rare taxa, but with diminishing returns (Supplementary Material Figure S1). In step eight, the first catchment with predominantly common taxa is selected, adding 32 taxa to the protected list. The following stages consist of a mix of large and small catchments with rare and common taxa before the last 10 steps select very large catchments that are needed to protect the few remaining taxa (Fig. 3).

Figure 3.

Minimum set from the summed rarity algorithm. Darker colours indicate larger numbers of species protected by the selection step. Note that large catchments in the centre and east were needed to protect only two taxa each in the final steps.

As a comparison, we ran the same algorithm without the catchment restriction. Only 1% of the total area was required in this case, differing from the whole-of-catchment algorithm by a factor of 10. Many of the chosen subcatchments are in fact only the outflows of the larger catchments selected in Fig. 3, illustrating how a catchment rule changes the configuration and connectivity of the selected areas.

When estimating irreplaceability, we started by removing 50% of the catchments randomly. After 1000 repetitions, only 370 catchments had been selected in one or more sets and therefore had non-zero values of irreplaceability. Recalculating irreplaceability by removing 70% and 90% of catchments randomly increased numbers of catchments with non-zero irreplaceabilities to 558 and 908, respectively. Irreplaceability values with 50% and 90% removals were comparable (r2 = 0·73). The top 100 irreplaceable catchments were identical in both solutions.

A comparison of the three different algorithms using the 90% bootstrap demonstrates that the summed rarity algorithm is also more efficient than the other two algorithms when estimating irreplaceability, with only 24·7 steps required on average, compared to 30·4 and 33·2 (Table 3). The average number of steps is less than for the minimum set (Table 2) because not all taxa represented in the full data set are present in the bootstrapped versions.

Table 3. Steps required in 1000 runs of the 90% bootstrap for three different heuristic algorithms
Algorithm             Minimum    Average    Maximum
Greedy heuristic      25         30·4       37
Progressive rarity    25         33·2       51
Summed rarity         19         24·7       32

A closer look at the data reveals more about the effect of the estimation method on the irreplaceability metric. For example, subcatchments 1789 and 1788 are small, adjacent headwater catchments. Their irreplaceabilities are ranked 2 and 4 out of all catchments, respectively. Both predicted assemblages are identical but subcatchment 1789 is smaller (65 km2) than subcatchment 1788 (80 km2). This led to catchment 1789 being chosen first in the minimum set and in all sets where neither catchment was removed. In these cases, catchment 1788 was not chosen in later steps, because its taxa – identical to those in 1789 – were taken out of the algorithm. However, if 1789 was not present in the bootstrapped data set but 1788 remained, 1788 was always chosen and was therefore ranked highly irreplaceable. Related to these interactions, we recognized four groups of catchments corresponding to the four quadrants in Fig. 4:

Figure 4.

Categorization of the 908 catchments present in more than one solution using 90% removal. The numbers refer to the text in the Results section.

  1. Catchments like 1789 that were always or very frequently selected when they were in the data set (high Fs) and contributed many taxa (high average c);
  2. Catchments like 1788 that were not always selected when they were in the data set (mid- to low Fs), but contributed many taxa when they were selected (high average c);
  3. Catchments that were always or very frequently selected but did not contribute to targets for many taxa (high Fs, low average c);
  4. Catchments that were occasionally selected to provide a few additional taxa (low Fs, low average c).

We found that 84 catchments were present in over 80% of solutions. Of these, 33 catchments contributed highly in most cases (quadrant 1), whereas 51 contributed few taxa (quadrant 3). Twenty-five catchments were present in less than 80% of the solutions, but ranked highly if they were chosen (quadrant 2). The remaining 799 catchments were chosen less frequently and did not contribute significantly (quadrant 4). When summing the contributions over the 1000 bootstrap runs, we obtained an estimate of irreplaceability that integrates both frequency and contribution. It was strongly correlated to average c (r2 = 0·82, P < 0·001) but weakly to Fs (r2 = 0·27, P < 0·001). A map of irreplaceability across the study area reveals a strong bias of higher values towards coastal and mountain lower-order streams (Fig. 5).

Figure 5.

Irreplaceability map for the study area using 90% removal.

Correlation of summed c (our irreplaceability measure) for subcatchments with c-plan's summed irreplaceability was very high (r2 = 0·95, P < 0·0001). The correlation between summed c and marxan's selection frequency was still strong, but lower (r2 = 0·52, P < 0·001). However, all of the 33 subcatchments that showed very high selection frequency in marxan had a summed c > 75% in the bootstrapped heuristic, indicating agreement in selection frequency for highly irreplaceable subcatchments.

Discussion

prediction of invertebrates

After carrying out a PCA on the environmental variables, Axis 1 summarized local topographic variation and, with Axis 2 (upstream catchment area), reflected the position of subcatchments in their larger catchments. Axis 2 also illustrated how seemingly unrelated variables are highly correlated in Victoria: rainfall, latitude and position in the catchment are correlated, as the higher parts of catchments that receive most rain are often located in the north. All major groups of predictors were represented in the first six axes. Location descriptors were present in axes 1, 2 and 3, climatic descriptors in axes 2 and 5, and landform descriptors in axes 2, 3 and 4 (Table 1). Geological variables were included on four axes and vegetation density was present in axis 4. Considering that all major groups were covered on multiple axes and that 71% of the total variation was explained by these six axes, we concluded that most information is retained by selecting six variables while reducing the risk of overfitting (Yuan 2004).

Taxa modelling success was high, with 52% of taxa predicted at excellent or very good levels according to the classification by Boyd et al. (2005). Only 52 taxa (11·5%) could not be predicted. Surprisingly, upstream catchment area was not among the most important predictors with respect to selection frequency. Position in the catchment is the most commonly used predictor in predictive bio-assessment programmes, particularly for macroinvertebrates. However, our results indicate that local predictors are more important than positional variables in the study area. This agrees with recent studies partitioning the influence of local vs. regional scales (Stendera & Johnson 2005; Johnson et al. 2007).

The success of the predictions can be partly attributed to an unusual group of predictor variables: local variables that were not measured on the ground but derived in a GIS. Based on the high AUC for the invertebrate models, we suspect that these predictors, including landform predictors and detailed vegetation and climatic descriptors, are not only adequate replacements for on-ground observations, but might also enhance prediction in general. One possible reason is that remotely collected data provide a more objective approach to obtaining local information without being subject to the observer variation characteristic of site data (Hannaford, Barbour & Resh 1997).

The AUC values discussed above will have to be interpreted with caution because no validation data set was withheld (Yuan 2004). However, we do not consider slight overfitting as problematic for the purposes of our study because it was not the goal of the modelling exercise to predict the presence of single taxa in a catchment with 100% accuracy. Instead, the models were used to produce a more informed stratification and weighting of environmental surrogates, similar to the techniques described by Sarkar et al. (2005).

Also note that the distributions are hypothetical, based on models from sites in reference condition. This was a deliberate decision, in line with many existing terrestrial and marine planning exercises (Ferrier et al. 2000; Cowling et al. 2003). These hypothetical distributions and the minimum sets derived from them are then placed in a larger decision framework, like the irreplaceability–vulnerability framework (Margules & Pressey 2000) or an extension using condition (Linke et al. 2007).

minimum sets and choice of algorithms

The summed rarity algorithm produced the most efficient solutions, both in the minimum set (Table 2) and the bootstrapped runs (Table 3). This is initially surprising, considering that the progressive rarity algorithm was the most efficient in the comparison by Csuti et al. (1997) and is the standard procedure for minimum sets. However, many previous approaches have used planning units that are equal-sized grids, while we were dealing with catchments of different sizes, varying by a factor of 1000. Not only does the number of steps have to be optimized, but also the contribution to targets per unit of catchment area.

An unmodified progressive rarity algorithm selects the catchment with the single rarest feature. This potentially results in a very large selected catchment, requiring a correction for area. The far greater efficiency of the two algorithms that involved sums (i.e. richness and summed rarity) demonstrates that, where planning units vary greatly in size, combining a measure of richness with area leads to relatively efficient networks. This might vary between data sets, however, depending on the frequency of taxon occurrences. Considering the consistently better performance of the summed rarity algorithm, a provisional conclusion is that combining richness and a rarity weighting with area can give efficient results for riverine planning and in other situations where planning units vary widely in area.

Although the differences in total area between the minimum sets and bootstrapped analyses from the three algorithms might seem small at first glance, the 0·5% difference between the two best algorithms, at land prices of A$1500–8000 (US$1000–6000) per hectare, would translate to between A$15 million and A$80 million. This applies to acquisition cost, with stewardship costs potentially being higher in the long run. This demonstrates that even small differences in algorithm efficiency can make a large difference once the study area comprises an entire state.

The species accumulation curve for the minimum set looked similar to those observed in terrestrial studies (Csuti et al. 1997): characterized by a steep initial ascent, these curves typically deliver diminishing returns with later selections. Only 2% of the total area is needed to represent 90% of the 400 taxa, while representation of the remaining 40 taxa brings the required area to 10·3% of the total. Diminishing returns are exaggerated by the whole-of-catchment rule, under which all of the upstream catchment must be protected. Two very large catchments, shown in Fig. 3, had to be added to cover taxa endemic to lowland rivers with large upstream areas.

However, the total areas should not be considered reliable estimates. Only 400 species were successfully modelled for the entire area. While this is more real biological data than in any other systematic conservation exercise in freshwater systems, we had to leave out many rare species, because reliable models cannot be built on just a few records. We are nevertheless more comfortable with the existing results than with a conservation plan that represented all of the rare species: preliminary analysis showed that most of the subcatchments had one unique taxon and would therefore be included in a fully representative conservation system. This does not reflect reality, however, and is a product of sampling bias. A possible way forward would be a mix of real species data and surrogates that are informed by the distribution of rare taxa, such as Generalized Dissimilarity Modelling (GDM, Ferrier et al. 2007).

irreplaceability

Our study presents the first estimate of irreplaceability that incorporates connectivity of river systems. While not as elegant as the irreplaceability coefficient by Ferrier et al. (2000), a bootstrapped selection algorithm has the advantage of emulating real-life management situations. The outputs of a bootstrapped irreplaceability algorithm are therefore easily understandable: Fs is the percentage of times that a catchment appears in solutions derived from data sets in which it was not removed. This is a measure of how many spatial alternatives exist for a catchment if all targets are to be achieved. The other output is average c, the average contribution that a catchment made to the achievement of targets across all of the solutions – a measure of relative importance.

Both of these properties contain valuable information, as the example of the two neighbouring catchments (1788 and 1789) shows. While both have an almost equally high contribution when they are selected, one catchment is only needed if the other is unavailable. A hypothetical opposite example is two catchments that each contain unique taxa: if catchment A had one unique taxon but catchment B had five, both would have a selection rate of 100%, but the summed contribution to targets of catchment B would be five times greater.

The frequency of selection is crucial when completeness of target representation is required. In this case, even catchments with low average c, but high Fs will have to be selected. If maximum returns are to be achieved with only a few catchments, high average c could be more important. Figure 4 demonstrates that the two measures are not necessarily linked. Catchments with the highest contributions will also have the highest frequency of selection – whenever they remain in the data set, they will be selected first. As contributions decline, the measures become increasingly decoupled: while most catchments with low average c have low frequency of selection, some are needed to fulfil all targets. For the purpose of our study, we combined both measures by summing the contributions over the number of selections. Summed c is analogous to ‘summed irreplaceability across multiple features’, a statistical estimator described by Ferrier et al. (2000).

The analogy is highlighted by the very high correlation between summed c and summed irreplaceability (r2 = 0·95). This correlation and the agreement of selection frequencies between marxan and the bootstrapped heuristic highlight the compatibility of the new estimator with established measures of irreplaceability used worldwide. However, the new estimator adds the flexibility to include catchments with different areas and even nested catchments as units of replication (i.e. where one planning unit is nested within a larger planning unit).

We found that 33 catchments were present in almost every solution, each contributing a large number of taxa. These catchments are the focus catchments for conservation action – highly important to achieve representation targets. For all representation targets to be achieved, another 58 catchments with lower contributions but present in almost all solutions would have to be considered.

The spatial distribution of irreplaceability makes intuitive sense. Highly irreplaceable catchments are found from the dry western catchments to coastal temperate rainforest and the mountain regions. This is expected, because different taxa will be found exclusively in some of the ecotypes. There is a strong bias towards coastal catchments and inland headwater streams caused by the whole-of-catchment rule, making large lowland river catchments less likely to be selected in the algorithm. Species-area effects (studied in the benthic environment by Woodward & Hildrew 2002, amongst others) are negligible, as Marchant, Ryan & Metzeling (in press) found no real relation between the size of the catchment and species richness for the same data set.

Conclusion and scope for future research

This study illustrates the value of an approach to river conservation that does justice to the spatial configuration of connected river networks. While the whole-of-catchment rule is a step in the right direction, future applications could make two modifications to the system. The first relates to types of connectivity. Pringle (2001) suggested that upstream protection is necessary but not sufficient: disturbance across adjacent catchment boundaries can play a role, as can downstream connectivity and groundwater interactions. Additional rules similar to the whole-of-catchment rule may have to be created to accommodate these threats. Also, although beyond the scope of this project, protection for freshwater systems other than rivers will add further spatial constraints to a freshwater conservation problem. Again, vertical interactions with groundwater and lateral connectivity with wetlands and the floodplain need to be addressed.

With the second modification, the whole-of-catchment rule could be eased to recognize that not all stressors have an equal effect on downstream ecosystems. For example, mild organic pollution might be metabolized within 10 kilometres (Schwoerbel 1972; often much less, Storey & Cowley 1997) and recreational fishing could be allowed outside a core sanctuary. In these cases, protecting several links upstream and downstream of an irreplaceable subcatchment would be sufficient. At the other extreme, sedimentation potentially influences the entire area downstream, as does over-extraction of water. In these cases, whole-catchment protection or at least control of the pollutant might be necessary. Mixed protection schemes where statutory reserves go hand in hand with community efforts and other kinds of regulation (Cowling et al. 2003) will be needed to achieve full protection. These issues have also been addressed by Abell et al. (2007), who suggested a mixed portfolio of different protection categories, together with best-practice land management to achieve conservation outcomes in rivers. Linking threats directly to species and using this to set modified design rules would also help to capture the range of habitat requirements (and their connections) that more mobile taxa need over their life cycle.

Three further challenges to operationalize a complementarity-based assessment of conservation value are (i) setting realistic conservation targets; (ii) estimating real costs of conservation instead of just using catchment areas; and (iii) embedding the irreplaceability coefficient in an overarching framework similar to the one described by Margules & Pressey (2000). Target setting is crucial to the success of a conservation plan. For demonstration purposes, we used one occurrence per taxon as a representation target. To ensure persistence, especially of migratory taxa like fish, local managers, taxonomic experts, and policy-makers should be involved in a target setting process. Furthermore, modifying conservation design is only a part of the challenge to be addressed in conservation planning for rivers. If actual protection is to be realized, implementation issues affecting freshwater conservation are crucial (Nel et al. 2007).

For the present, the method presented here satisfies the three requirements we set out to achieve. First, it is based on complementarity, similar to the leading techniques in terrestrial and marine conservation planning, and thus achieves high efficiency in meeting conservation targets. Secondly, the whole-of-catchment rule promotes complete protection when designing river reserves. Thirdly, instead of the environmental surrogates used in other river planning exercises, our method relates directly to the biota. Meeting these three requirements makes the riverine irreplaceability index a significant step toward systematic conservation planning in freshwater systems.

Acknowledgements

The authors would like to thank Leon Metzeling from the Victorian EPA, as well as Richard Marchant and David Ryan from the Museum of Victoria, for data provision and stimulating discussion. Lester Yuan and James Mugodo taught us everything about Generalized Additive Models. The entire Canberra laboratory at the University of Canberra, as well as Bob Bailey, Sahotra Sarkar, Robin Abell and Peter Davies, engaged in stimulating discussions. Three anonymous reviewers gave us very constructive and insightful feedback. This work was supported by an IPRS scholarship from the University of Canberra and a PhD scholarship from the CRC for Freshwater Ecology.
