Predictions of beta diversity for reef macroalgae across southeastern Australia

We analyzed and predicted spatial patterns of turnover in macroalgal community composition (beta diversity) that accounted for broad-scale environmental gradients using two contrasting community modelling methods, Generalised Dissimilarity Modelling (GDM) and Gradient Forest Modelling (GFM). Percentage cover data from underwater macroalgal surveys of subtidal rocky reefs along the southeastern coastline of continental Australia and northern coastline of Tasmania were combined with 0.01°-resolution gridded environmental variables to develop statistical models of beta diversity. GDM, a statistical approach based on matrix regression, and GFM, a machine learning approach based on ensemble tree-based methods, were used to fit models and generate predictions of beta diversity within unsurveyed areas across the region of interest. Patterns of macroalgal beta diversity predicted by both methods were remarkably congruent and showed a similar and striking change in community composition from eastern South Australia to western Victoria and northern Tasmania. Macroalgal communities differed markedly in predicted composition between the open coast and inshore locations. A distinct algal community was predicted for the enclosed Port Phillip Bay in Victoria. Sea surface temperature standard deviation and average contributed most to changes in beta diversity for both the GDM and GFM models; changes in wave exposure and oxygen also influenced beta diversity in the GDM model, while salinity and exposure contributed substantially to the GFM model. The GDM and GFM analyses allowed us to model and predict spatial patterns of beta diversity in macroalgal communities comprising >180 species over 6600 km of coastline. These outputs advance regional-scale conservation management by allowing planners to interpolate from point source ecological data to assess the distribution of biodiversity across their full domain of interest.
The congruence between methods suggests that strong environmental gradients related to temperature and exposure are the common drivers of community change in this region.


INTRODUCTION
Conservation planning and management of marine biodiversity requires information on the spatial distribution of biodiversity attributes, but quantitative analyses of patterns of biodiversity are almost absent from marine systems. Here biological sampling is often sparse and habitats are out of sight and costly to survey. Alternatively, environmental (surrogate) data are often more readily available and cover significant geographical space (see review by McArthur et al. 2010), but have limited application when the congruence between mapped environment and biodiversity is weak or unknown (Faith 2003, Ferrier et al. 2007). Such lack of congruence can cause inefficiencies and, at worst, avoidable declines in biodiversity because of complacency that targets of protection are adequately safeguarded (Edgar et al. 2008). Increasingly detailed coastal planning frequently includes application of Marxan (Watts et al. 2009) and other methods to maximize representation of remotely-sensed habitat categories within highly protected zones (e.g., Fraschetti et al. 2008, Klein et al. 2008, Watts et al. 2009). However, mapped habitat categories lack information on within-habitat variation in community composition, potentially leading to key elements of biodiversity being overlooked in applications such as marine protected area networks (Harborne et al. 2008). Linking environmental data to spatial variation in community composition confers significant advantages over a surrogate-only approach and is a highly promising strategy for cost-effective conservation prioritization and management (Arponen et al. 2008).
Spatial variation in community composition lies at the heart of many ecological and biogeographical phenomena. Commonly described as beta diversity (sensu Whittaker 1972), it has been used to refer to a wide variety of ecological phenomena (Jurasinski et al. 2009, Tuomisto 2010a), motivating research in a wide range of fields and the development of ecological theory. There are numerous mathematical models to describe beta diversity based on either the presence or the abundance of species at the two sites of interest (see, e.g., Jost et al. 2010, Ricotta 2010, Wisley 2010) and a variety of statistical approaches with which to quantify beta diversity, especially in relation to testing hypotheses about the origin of beta diversity (see, e.g., Legendre et al. 2005, Tuomisto and Ruokolainen 2006). While these issues have been ably reviewed by Jurasinski et al. (2009), Anderson et al. (2009) and Tuomisto (2010a, b), what is clear is that the concept of beta diversity is ambiguous, a reflection of the large variety of methods available to investigate it, and debates persist on concepts, measures and methods. Despite these debates, studies of beta diversity in community ecology have contributed to a greater understanding of species change across environmental gradients (Harrison et al. 1992, Jankowski et al. 2008), and of similarities and differences in these environments (Buckley and Jetz 2008). In addition, as beta diversity can quantify the turnover in species across space, it has important applications to the scaling of diversity, the delineation of biotic regions, and conservation planning (McKnight et al. 2007).
Recent advances in community modelling have significantly enhanced the value of environmental surrogates such that they can be linked to biological information, and have become especially valuable when used in the prediction of biodiversity into unsurveyed regions (see, e.g., De'ath and Fabricius 2000, Leathwick et al. 2005, Dunstan et al. 2011). Here we analyse and predict spatial patterns of turnover in community composition (hereafter 'beta diversity') using two relatively new but quite contrasting community modelling methods for macroalgal species records from an extensive spatio-temporal quantitative survey of a rocky subtidal system on the southern coastline of continental Australia and the northern coastline of Tasmania (Barrett et al. 2009, Stuart-Smith et al. 2010). Specifically, we assess and quantify the congruence of predicted (mapped) beta diversity between models based on (1) Generalised Dissimilarity Modelling (GDM), a statistical approach based on modified generalized linear models (Ferrier et al. 2007) and (2) Gradient Forest Modelling (GFM), a statistical approach based on aggregation of tree split-point importances from random forest models (http://gradientforest.r-forge.r-project.org/). The southern coastline of continental Australia and the northern coastline of Tasmania are of particular interest globally due to exceptional macroalgal diversity and regional endemism, especially in South Australia (Bolton 1994, Phillips 2001, Kerswell 2006). Macroalgae contribute significantly to the biodiversity of coastal systems in this region (Smale 2010), and also provide and modify resources (both habitat and food) for other organisms (Connell 2003, Coleman et al. 2007, Wernberg and Goldberg 2008, Crawley et al. 2009, Vanderklift et al. 2009).
To our knowledge, the quantitative analysis and prediction of beta diversity (as opposed to other biodiversity attributes) has not yet been conducted extensively in any shallow rocky subtidal marine system. Both the GDM and GFM methods provide a transformation from environmental space into a space where distance represents compositional differences. While the methods are technically different, the predictive outputs of GDM and GFM (i.e., mapped beta diversity) are highly suitable as subjects for comparison. The GDM approach is now used in terrestrial ecological community modelling and biodiversity assessment (Ferrier et al. 2007), conservation planning (Ferrier et al. 2002, Marsh et al. 2010), regional scale survey design (Ashcroft et al. 2010), river classification (Leathwick et al. 2010) and marine environmental classification (Leathwick et al. 2009). GFM, in contrast, is a novel approach that has so far been applied for community modelling and biodiversity assessment to the continental shelves around Australia (Pitcher et al. 2011) and the Gulf of Maine and the Gulf of Mexico as part of the Census of Marine Life (CoML) project (unpublished). This is the first time that these two methods have been applied to a shallow subtidal rocky reef marine system to produce predictive maps of beta diversity for comparison on one of the largest sections of coast at a single latitude, worldwide.

Study reefs
The southern coast of continental Australia, which extends ~5500 km from Cape Naturaliste, Western Australia (33.32° S, 115.01° E), across South Australia and Victoria to the Victorian/New South Wales border (37.34° S, 149.45° E), is the longest west-east coastline in the world (Phillips 2001). The island of Tasmania is located to the south of Victoria, between 40 and 43° S latitude (Fig. 1), the northern coastline and islands in Bass Strait extending ~1100 km. Large sections of the coastline support extensive rocky substrata that provide suitable habitats for macroalgal attachment. Since 1992, rocky subtidal reefs have been monitored regularly by the Tasmanian Institute of Fisheries and Aquaculture (Barrett et al. 2009, Stuart-Smith et al. 2010), with major emphasis on reefs in South Australia and Tasmania, and including locations within four other recognized provincial bioregions (IMCRA 4.0, Commonwealth of Australia 2006). Here we analyse data for South Australia, Victoria and northern Tasmania, a region where remotely sensed environmental data have good geographical coverage.

Survey methods and data collection
Macroalgal data were obtained using underwater visual censuses (UVC) at 315 individual sites along ~6600 km coastal distance (where coastal distance is calculated as the linear distance along the relevant sections of coast using a 1:100,000 scale map (GDA94 Lambert Conical projection)). Data for each site comprise the mean of 20 equidistantly spaced 0.5 m × 0.5 m quadrats placed along a 200 m reef transect in either 5 m or 10 m water depth. Macroalgal cover under 50 points per quadrat was assessed, and data on mean density of different algal species per site used in analyses (Barrett et al. 2009). Divers recorded the cover of canopy species first; the canopy was then pushed aside and the points falling on understorey species were counted. Macroalgae were identified to the lowest possible taxonomic level in situ, usually species, otherwise to genus or, in a few cases, to broader functional groups (e.g., encrusting coralline algae). To reduce temporal influences on analytical results related to repeat sampling at individual sites, the dataset was reduced by including only the most recent survey at each site.

Environmental data
A large set of environmental covariate data was collated across a 0.01° grid at a national scale (Huang et al. 2010) as part of the Commonwealth of Australia's Environment Research Facility (CERF) Marine Biodiversity Hub (http://www.marinehub.org/). All data sources and references for spatial interpolation are annotated in Appendix A. Each site was assigned the environmental covariate values of the closest node on the 0.01° grid. We also included one variable describing wave exposure, openness (see, e.g., Hill et al. 2010). Wave exposure is a particularly important environmental covariate that structures shallow reef systems, and has recently been made available as different quantitative indices for much of southern Australia (Hill et al. 2010). The covariates used in this analysis represent a wide range of environmental descriptors, but we ensured that those found to be important for single species were included (unpublished).
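The assignment of each site to its closest grid node can be illustrated with a minimal sketch. It assumes that grid nodes sit at integer multiples of the grid resolution, which is an illustrative assumption and not the documented origin of the national grid; the function name and the example coordinates are hypothetical.

```python
def nearest_grid_node(lat, lon, resolution=0.01):
    """Return the (lat, lon) of the closest node on a regular grid.

    Assumption: nodes lie at integer multiples of `resolution` degrees.
    """
    def snap(value):
        # Round to the nearest node, then tidy floating-point noise.
        return round(round(value / resolution) * resolution, 6)

    return snap(lat), snap(lon)


# Hypothetical survey site coordinates (decimal degrees).
site_lat, site_lon = -38.1234, 144.8765
node = nearest_grid_node(site_lat, site_lon)
print(node)
```

With real data the covariate values stored at the returned node would then be attached to the site record.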

Modelling
Macroalgal beta diversity was analyzed and predicted using two different modelling methods. GDM, an extension of GLM, allows for predictive modelling of community dissimilarity against a set of environmental predictors while overcoming two major problems that commonly occur in broad-scale ecological data: non-linearity in the relationship between community dissimilarity and ecological distance between sites, and uneven rates of species turnover along environmental gradients (Ferrier et al. 2007). Uniquely, GDM models the pairwise community dissimilarity between sites as a function of the pairwise differences in the value of environmental covariates. Rather than using parametric transformations of the environmental covariates, GDM uses flexible splines that are constrained to be positively monotonic (Ferrier et al. 2002), thus capturing the manner in which biological differences between sites generally increase with increasing separation along environmental gradients (Leathwick et al. 2010).
We used a total of 19 candidate environmental covariates for modelling (see Appendix A). There is some debate as to whether some correlated predictors should be dropped or whether all predictors should be retained (see, e.g., Cutler et al. 2006, Knudby et al. 2010). Regardless of the numerical method employed, it is usually difficult to disentangle and interpret the independent effects of correlated predictors. For modelling purposes (both GDM and GFM) we prefer to retain all predictors. This is partly because most predictors are correlated to some extent, so that any particular choice of exclusion is hard to justify, and partly because we do not know a priori which are the most important predictors that are truly related to the response. We modeled beta diversity based on macroalgal occurrence (presence-absence) for a total of 185 taxa, where dissimilarities between macroalgal communities occurring at survey sites were calculated using the Bray-Curtis dissimilarity. In this paper we employ the presence-absence version of the Bray-Curtis dissimilarity index:

$$d_{ij} = 1 - \frac{2A}{2A + B + C}$$

where $d_{ij}$ is the dissimilarity between sites i and j; A is the number of species common to both sites i and j; B is the number of species present only at site i; and C is the number of species present only at site j. Assuming that n environmental variables ($x_1$ to $x_n$) have also been estimated at the set of biological survey sites, matrix regression can be formulated most simply as a multiple linear regression:

$$d_{ij} = a_0 + \sum_{p=1}^{n} a_p \left| x_{pi} - x_{pj} \right|$$

GDM was conducted using the .NET GDM Modeller, Version 1.0 (Ferrier et al. 2007) with the three I-spline basis functions option selected (see, e.g., Ferrier 2002, Ferrier et al. 2007). We assessed the potential influence of spatial processes by comparing the results of GDM explanatory models that accounted for spatial autocorrelation, by including geographic distance between pairs of sites as a predictor variable, against GDM models that did not. When included, geographic distance between sites had little effect on either model performance (less than 1% improvement in deviance explained) or model structure. Based on this, and on previous work on this dataset (unpublished) that demonstrates non-environmental spatial autocorrelation to be negligible in this case, we proceeded with an environmental response model only.
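As a minimal sketch of the response and predictors that enter a dissimilarity model of this kind, the following computes the presence-absence Bray-Curtis dissimilarity, 1 − 2A/(2A + B + C), from two sets of taxa, together with the absolute pairwise differences in covariates. The function names, site lists and covariate values are hypothetical and serve only as illustration; this is not the GDM Modeller implementation.

```python
def bray_curtis_pa(site_i, site_j):
    """Presence-absence Bray-Curtis dissimilarity between two sets of taxa."""
    a = len(site_i & site_j)  # A: taxa common to both sites
    b = len(site_i - site_j)  # B: taxa present only at site i
    c = len(site_j - site_i)  # C: taxa present only at site j
    if 2 * a + b + c == 0:
        raise ValueError("both sites are empty, dissimilarity is undefined")
    return 1.0 - (2.0 * a) / (2.0 * a + b + c)


def pairwise_abs_difference(x_i, x_j):
    """Absolute difference in each environmental covariate between two sites."""
    return [abs(p - q) for p, q in zip(x_i, x_j)]


# Hypothetical taxon lists for two sites (genera named only as examples).
site_i = {"Sargassum", "Caulerpa", "Phyllospora"}
site_j = {"Sargassum", "Durvillaea", "Phyllospora", "Carpoglossum"}
print(bray_curtis_pa(site_i, site_j))  # A=2, B=1, C=2 gives 3/7

# Hypothetical covariates: mean SST (deg C), wave exposure index.
print(pairwise_abs_difference([16.0, 0.8], [14.5, 0.3]))
```

Each pair of sites contributes one dissimilarity and one vector of covariate differences, so n sites yield n(n − 1)/2 rows for the regression.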
GFM modelling is based on random forest (RF) modelling (Breiman 2001), a 'machine learning' approach. It is an ensemble method based on regression trees that combines many decision trees to produce a distribution of splits (rather than a point estimate). A random forest consists of a compilation of classification or regression trees (e.g., typically ≥500 trees in a single random forest) where each tree is fit to a bootstrap sample of the biological data using a recursive partitioning procedure, and the splits of which are selected from a random subset of one-third of the environmental covariates at each node. A cross-validation is carried out where the 'out-of-bag' sample (made up of the observations that were not selected in the bootstrap sample for a given tree) is used to estimate the prediction error of the forest. We modeled beta diversity based on macroalgal percent cover for a total of 185 taxa using the 'extendedForest' and 'gradientForest' packages for R (http://gradientforest.r-forge.r-project.org/) following the procedures outlined in Pitcher et al. (2010). Outputs of the univariate extendedForest calculations include an R² measure of fit performance for each species, all (or optionally binned) split-point values and their impurities, and measures of conditional predictor importance, where out-of-bag permuting to assess predictor importance is done within partitions of variables having an absolute correlation greater than 0.5 (see Smith et al. 2011 for extendedForest's implementation of Strobl et al. 2008). These outputs are used by gradientForest, which provides overall predictor importance, impurity-weighted density of split values for each predictor, and importance-weighted cumulative distributions of the splits. We used a Box-Cox method to select an appropriate transformation for percent cover data for all taxa and generated 1000 trees for each taxon. We used a total of 19 candidate environmental covariates for modelling.
As with the GDM, we also assessed the potential influence of spatial processes on the GFM results by examining the extent of any residual spatial autocorrelation using the following procedure. Assuming it represented a suitable measure of raw compositional differences, the Bray-Curtis dissimilarity between sites was modeled as a function of the Euclidean distance between the biologically transformed (see the 'Prediction' section below) environmental variables at sites, both with and without geographic distance between sites, using GLM with an implementation of the Ferrier et al. (2007) GDM link function. The inclusion of geographic distance had little effect on either model performance (a change in fit of <0.7%) or structure, again demonstrating that residual spatial autocorrelation was negligible, and that modelling with environmental variables only was appropriate.
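The bootstrap and out-of-bag logic described above can be illustrated with a short sketch. It shows only how one tree's in-bag and out-of-bag observations, and the random subset of candidate covariates tried at a node, might be drawn; it is not the extendedForest implementation, and the function names and seed are assumptions made for illustration. The counts (315 sites, 19 covariates) are taken from the text.

```python
import random


def bootstrap_split(n_obs, rng):
    """Draw one bootstrap sample; return (in-bag, out-of-bag) index lists."""
    in_bag = [rng.randrange(n_obs) for _ in range(n_obs)]  # with replacement
    out_of_bag = sorted(set(range(n_obs)) - set(in_bag))   # never drawn
    return in_bag, out_of_bag


def candidate_predictors(n_predictors, rng):
    """Random subset (one-third) of predictors considered at a single node."""
    k = max(1, n_predictors // 3)
    return sorted(rng.sample(range(n_predictors), k))


rng = random.Random(0)  # arbitrary seed, for repeatability only
in_bag, oob = bootstrap_split(315, rng)
# On average roughly 37% of observations are left out of each bootstrap sample.
print(len(oob) / 315)
print(candidate_predictors(19, rng))
```

The out-of-bag observations of each tree are the ones used to estimate prediction error, because that tree never saw them during fitting.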

Prediction
Predictions of beta diversity were made for approximately 5500 km length of coastline of southern Australia and 1100 km of northern Tasmania (including the islands in Bass Strait) to a depth of 30 m, using relevant environmental covariates for each modelling approach. The northern parts of the South Australian Gulfs were excluded due to insufficient covariate data. In the GDM approach, the combination of spline functions fitted to the environmental covariates in the explanatory model best represents the community dissimilarity between paired sets of survey locations. For prediction we recreated these spline functions fitted by GDM as empirical functions in R. Specifically, we interpolated GDM's reported value for each function at every 1/1000th over the range of each environmental variable. These empirical functions were used to transform the environmental variables at all unsurveyed locations, which we then ordinated and mapped across the region following the procedures described below for GFM (see, e.g., Pitcher et al. 2010).
The GFM approach accumulates all split values, weighted by split impurities, predictor importance and R² for all included species and standardized by observation density of the gradient, to provide the overall cumulative importance distributions for each predictor. These cumulative importance distributions represent compositional turnover functions that best describe changes in species composition and abundance across the survey locations. Prediction into unsurveyed locations across the region is achieved by transforming the environmental covariates at all locations using the cumulative importance distributions as empirical functions, as implemented by the gradientForest predict() function. Thus, the predictions are biologically informed environmental covariates that are now all represented on the same scale.
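The idea of using a cumulative importance distribution as an empirical transformation can be sketched as follows. This simplified illustration accumulates importance over sorted split values and interpolates linearly between them; it omits the R² weighting and the standardization by observation density that gradientForest applies, and the function names, the `x_min` argument and the example numbers are all hypothetical.

```python
from bisect import bisect_right


def cumulative_importance(split_values, importances, x_min):
    """Knots of a cumulative importance curve, starting at (x_min, 0).

    Assumption: every split value is greater than or equal to x_min.
    """
    xs, cum, total = [x_min], [0.0], 0.0
    for value, weight in sorted(zip(split_values, importances)):
        total += weight
        xs.append(value)
        cum.append(total)
    return xs, cum


def transform(x, xs, cum):
    """Linearly interpolate the cumulative importance curve at x."""
    if x <= xs[0]:
        return cum[0]
    if x >= xs[-1]:
        return cum[-1]
    k = bisect_right(xs, x)  # guarantees xs[k - 1] <= x < xs[k]
    x0, x1 = xs[k - 1], xs[k]
    y0, y1 = cum[k - 1], cum[k]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical split values for mean sea surface temperature (deg C)
# and their importances; the steepest turnover lies between 14 and 15.
xs, cum = cumulative_importance([14.0, 15.0, 17.0], [0.1, 0.3, 0.1], x_min=12.0)
print([transform(t, xs, cum) for t in (13.0, 14.5, 16.0, 18.0)])
```

Applying one such function per covariate places every covariate on a common scale of compositional turnover, so steep parts of a curve stretch the gradient and flat parts compress it.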
For both modelling approaches we used Principal Component Analysis (PCA) to ordinate the biologically transformed environmental variables. Specifically, we represented beta diversity as the first three PCA axes, which in both cases captured >90% of the variation in the biologically transformed environmental covariates. For plotting, the first two principal components (PC) were rotated so that the Y-axis represented the greatest variation in sea surface temperature, and assigned a red color palette. The X-axis was flipped (if necessary) to give positive correlation with exposure, and then allocated the green color palette. The third PC was allocated the blue color palette. The most influential environmental variables were plotted as vectors, analogous to a conventional biplot. The RGB color palette in biologically transformed PCA space was then mapped into geographic space, representing predicted patterns of beta diversity.
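The final colouring step can be sketched as a simple rescaling of three ordination axes to the 0-255 range of an RGB triplet. This is only one plausible way to build such a palette, offered as an assumption and not as the procedure used for the published maps; the function name and the scores are hypothetical, and the PCA and rotation are taken as already done.

```python
def to_rgb(axis_scores):
    """Rescale three score columns to integer RGB values in 0-255.

    Each row is (temperature axis, exposure axis, third PC), which maps
    to (red, green, blue) in that order.
    """
    scaled_columns = []
    for column in zip(*axis_scores):
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1.0  # avoid division by zero on a flat axis
        scaled_columns.append([round(255 * (v - lo) / span) for v in column])
    return list(zip(*scaled_columns))


# Hypothetical rotated scores for three grid cells.
scores = [(-1.0, 0.0, 2.0), (0.0, 1.0, 0.0), (1.0, 2.0, 1.0)]
for cell, colour in zip(scores, to_rgb(scores)):
    print(cell, colour)
```

Grid cells with similar predicted composition receive similar colours, so abrupt colour changes on the map indicate high predicted turnover.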

RESULTS
Our community dataset consisted of 315 sites with 185 taxa, mostly identified to species level, from 54 families. The families containing the greatest number of recorded taxa were Sargassaceae (36 species), Caulerpaceae (16 species), Rhodomelaceae (12 species), Corallinaceae (11 species) and Dictyotales (11 species). On average, species were prevalent at 11% of sites, with the most prevalent species occurring at 83% of sites. Twenty species occurred at only one site.

Models of beta diversity
Macroalgal community composition showed a large amount of turnover, with sea surface temperature (both average and variability, i.e., standard deviation of annual temperature) playing the most important role in the GDM model, followed by wave exposure and average oxygen concentration (Fig. 2). The model explained 35.5% of the deviance in observed species turnover, where 16 of a possible 19 candidate environmental covariates were chosen by the modelling procedure to be informative. The average chlorophyll-a concentration, standard deviation of chlorophyll-a and standard deviation of nitrate were not included, as all three spline coefficients were zero for each, indicating a negligible contribution of these covariates. Appendix B shows the differences between environmental covariates with respect to their relative importance and the shape of each fitted response. Each environmental covariate is plotted against the additive exponential link function of community dissimilarity. The relative range for the response of each environmental covariate indicates its relative importance in determining turnover in community composition. The shape of the response indicates which portions of a gradient of that variable would have a steeper turnover in community composition. In our GDM model, turnover in community composition changed at a greater rate with increasing variance in sea surface temperature (Appendix B). For average sea surface temperature, exposure and oxygen, relationships were approximately linear (Appendix B).
Using GFM, macroalgal community composition also showed a large amount of turnover, with standard deviation and average of sea surface temperature, sand and salinity (Fig. 2) playing the most important role. While GFM does not have an equivalent to the measure of deviance explained in GDM, the mean R² for the fit of transformed species abundances to the environmental variables was 26.2%. The cumulative importance curves for the GFM model are shown in Appendix B. Taken overall, the relative importance of average and standard deviation of sea surface temperature and of exposure in the GDM and GFM explanatory models was largely consistent.

Patterns of predicted beta diversity
When mapped spatially, the pattern of macroalgal community composition predicted by both the GDM and GFM showed a broad longitudinal gradient in turnover from eastern South Australia to eastern Victoria and north-eastern Tasmania (Fig. 3). For both the GDM and GFM, gradients in average and standard deviation of sea surface temperature were associated with patterns of community composition in South Australia. In western South Australia, gradients in wave exposure were strongly associated with predicted turnover for both models.
For both GDM and GFM predictions, turnover in community composition differed markedly between open coast locations and embayment locations in South Australia and Victoria. In open coast locations (e.g., western South Australia, Victoria and northern Tasmania) gradients in wave exposure and oxygen were closely associated with predicted turnover for GDM and GFM. Both models predicted a distinct macroalgal community for Port Phillip Bay in Victoria, a region where variability in average sea surface
The Spearman's rank correlations between the GDM and GFM predictions on PCA axes one and two (here unrotated) provide an assessment of the congruence between the two sets of predictions (Fig. 4). Overall congruence was good, with a correlation coefficient of −0.858 for PCA axis one and −0.78 for PCA axis two. A substantial amount of variance is explained by the first two PCA axes: 63.9% and 29.8% respectively for GDM, and 52.4% and 23.7% respectively for GFM. The contributions of the environmental covariates to principal components one and two for GDM and GFM are also shown in Appendix A.
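For readers unfamiliar with the congruence measure, Spearman's rank correlation is the Pearson correlation of the ranks of two variables. The sketch below implements it with average ranks for ties; the function names and the example scores are hypothetical and are not values from this study. Because the sign of a PCA axis is arbitrary, a strong negative coefficient indicates the same degree of congruence as a strong positive one.

```python
def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation between two equal-length sequences."""
    return pearson(ranks(x), ranks(y))


# Hypothetical axis-one scores for five grid cells under two models.
model_a_axis1 = [0.1, 0.4, 0.2, 0.9, 0.5]
model_b_axis1 = [-0.2, -0.3, -0.1, -0.8, -0.6]
print(spearman(model_a_axis1, model_b_axis1))
```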

DISCUSSION
Beta diversity has been widely acknowledged as central to questions of what controls diversity in ecological systems (Whittaker 1960, 1972, Legendre et al. 2005), and how to adequately design strategies to protect this diversity (McKnight et al. 2007, Arponen et al. 2008). Community analyses with environmental variables (using methods such as constrained Canonical Correspondence Analysis, ter Braak (1986); Primer BIO ENV, Clarke and Gorley (2001); and others mentioned in the introduction) have quantified how biodiversity composition may be influenced by the environment, and recent availability of suitable environmental data layers has begun to enable biodiversity prediction in marine regions (e.g., Leathwick et al. 2009, Pitcher et al. 2011). However, full understanding and hypothesis testing of the drivers of beta diversity and the ability to predict beta diversity patterns into unsurveyed regions over large spatial scales has generally been lacking in the marine environment. Here we used two flexible community modelling approaches that combined the complementary strengths of quantitative, point-location field survey data and continuous environmental data to analyse and predict spatial patterns of turnover in community composition for temperate subtidal rocky reefs across southeastern Australia.

Predicted beta diversity in relation to environmental drivers
Our predictions showed that processes related to key oceanographic circulation features (i.e., mean temperature and variability in temperature, salinity and oxygen) were associated with patterns of turnover in community composition. Strong oceanographic gradients exist for the region of Australia examined, where the southern shelf and slope region hosts a complex circulation system that is strongly influenced by local winds, heating and evaporation (Middleton and Bye 2007). This creates a region that is oceanographically diverse, with varying gradients in water depth, movement, chemistry and temperature. Coastal South Australia and western Tasmania also experience some of the greatest exposure conditions in the world, with waves that originate from the south-west during storms in the Southern Ocean (Hemer and Bye 1999). Exposure declines in Bass Strait (between Tasmania and the mainland), and the South Australian Gulfs experience relatively sheltered conditions (Hemer and Bye 1999). An important regional oceanographic feature is the Flinders Current (FC), which flows westward along the edge of the continental shelf from Tasmania to Cape Leeuwin in Western Australia (Middleton and Cirano 2002). The FC is linked to cold water upwelling events that occur in the austral summer-autumn off the south and south-east of Kangaroo Island and the Bonney coast and are associated with low surface water temperatures and elevated concentrations of chlorophyll-a. However, this current is seasonal, again demonstrating the variability of oceanographic features along this coast. During winter the eastward-flowing Leeuwin Current (LC) brings warm waters from the Western Australian coast to the Great Australian Bight (Cirano and Middleton 2004), and as far east as Kangaroo Island (Middleton and Bye 2007) and western Tasmania (Crawford et al. 2000). Predicted turnover in community composition is quite different for these regions compared to non-upwelling regions, and algal communities are known to be especially rich and productive in this region (Department for Environment and Heritage 2009).
Gradients in oceanographic conditions are also strongly associated with changes in community composition in the South Australian Gulfs and in semi-enclosed bays in Victoria (e.g., Port Phillip Bay). The South Australian Gulfs represent an environment unlike that along the open coast. During summer, temperature and salinity fronts form at the mouths of the gulfs, restricting water exchange. Limited freshwater input and high levels of evaporation cause increased salinity in the northern gulfs in summer, with seasonal extremes unlike other regions of southern Australia. In winter, cold, salty, dense water formed in the Spencer Gulf can cascade out onto the shelf, flowing south-easterly past Kangaroo Island (Middleton and Bye 2007). A unique community was predicted to occur in the semi-enclosed bay of Port Phillip, where similarly there is relatively low wave exposure, and relatively high variability in sea surface temperature, average oxygen and average chlorophyll-a. These conditions reflect an environment where there is reduced water exchange compared to sites on the open coast. Taken overall, these predicted patterns of beta diversity show variation both with proximity to the open ocean and longitudinally along the coast in relation to these environmental drivers.
Fig. 4. Spearman's Rank correlation of the first two principal components for GDM and GFM predictions (scaled axes). Color palette as for the GDM prediction plot (Fig. 3).
The environmental drivers found here to be strongly associated with changes in macroalgal community composition are largely consistent with those identified in other studies. Temperature, in particular, is one of the strongest drivers of species distributions of marine organisms, including macroalgae (see, e.g., Adey and Steneck 2001, Anderson et al. 2009). Wave exposure is a physical measure that has also long been recognized as an important factor in structuring reef communities (Collings and Cheshire 1998, Lindegarth and Gamfeldt 2005, England et al. 2008). In a recent quantitative analysis using modeled wave exposure (GREMO, see Appendix A), Hill et al. (2010) found that many southern Australian macroalgal genera have distinct relationships with exposure; for example, Carpoglossum, Caulerpa and Sargassum tend to be more frequently present at sites where exposure is relatively low, whereas Durvillaea and Phyllospora prefer high levels of wave exposure.
While turnover in community composition of macroalgae relates in part to environmental processes and associated species adaptations, it also likely reflects ecological interactions (Wright et al. 2005, Connell and Vanderklift 2007), biogeographic history (Phillips 2001, Connell and Irving 2008), and extant vicariant processes, particularly barriers to dispersal (van den Hoek 1987, Smale et al. 2010). It is clear that the observed patterns of beta diversity shown here are not fully explained by environmental models alone at this scale, and some of these other processes are likely to play an important role. Our analysis was also restricted to the scale at which the environmental covariates were made available and will not capture turnover at smaller scales (~1 km²). A number of other environmental or habitat features, for which covariate data were not available here, may also influence species turnover. For example, it is well known that macroalgal assemblages are strongly affected by topographic features, where the availability of different microenvironments (i.e., different light, water motion and sediment regimes and substrate slopes) can cause differences in assemblage structure (Toohey 2007). That subtidal assemblages may reflect environmental heterogeneity at small scales has also been confirmed for temperate reef flora and fauna in south-eastern Australia (O'Hara 2001), as well as for macroalgal assemblages examined by Wernberg et al. (2003) and Toohey (2007) in Western Australia. Future work on predicting patterns in beta diversity would ideally incorporate multiple spatial scales, allowing small-scale habitat heterogeneity and species interactions to be accounted for wherever suitable physical data at matching scales are available.

Incorporating beta diversity into systematic conservation planning
The patterns of beta diversity predicted here provide a useful basis for conservation planning for temperate subtidal reef systems in Australia, where the national and state governments have committed to the development of a coordinated National Representative System of Marine Protected Areas (NRSMPA). GFM has already been used for national-scale management applications for the Australian Department of the Environment, Water, Heritage and the Arts (DEWHA) in the Southwest, Northwest, North and East Marine Regions (Pitcher et al. 2011). Biodiversity conservation has been formally recognized as the primary goal of the NRSMPA (ANZECC 1998), placing obligations on coastal managers to maximize the representation of ecosystem types as well as threatened species within regional MPA networks (Edgar et al. 2008). Outputs of our study provide a basis for gap analysis to identify community types missing from current MPAs, and to ensure the full variability in community types is included within the new MPA networks currently being developed in South Australia (Department for Environment and Heritage 2009). Because of the difficulty of mapping marine biodiversity across regional scales, MPA planning to date has tended to use remotely sensed habitat types as surrogates for biodiversity and as the basis for gap analyses (e.g., Williams et al. 2009). However, habitat types frequently include a variety of different community types, some of which may be omitted from MPA networks (Harborne et al. 2008).
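The gap analysis mentioned above can be sketched in a few lines. This is an illustration only, not a procedure used in this study: it assumes the continuous predicted beta diversity surface has first been classified into discrete community types, that each grid cell carries a flag for whether it lies inside an existing MPA, and that a hypothetical 10% representation target applies. The class labels and cell data are invented for the example.

```python
from collections import Counter

def gap_analysis(cells, target=0.10):
    """Summarize MPA coverage of each predicted community type.

    cells: iterable of (community_type, is_protected) pairs, one per grid cell.
    target: hypothetical minimum protected fraction per community type.
    Returns (share, gaps): protected fraction per type, and the sorted list
    of types whose protected fraction falls below the target.
    """
    cells = list(cells)  # allow generators; we iterate twice
    total = Counter(ctype for ctype, _ in cells)
    protected = Counter(ctype for ctype, inside in cells if inside)
    share = {ctype: protected[ctype] / n for ctype, n in total.items()}
    gaps = sorted(ctype for ctype, frac in share.items() if frac < target)
    return share, gaps

# Invented example: three community types across five grid cells.
example_cells = [
    ("A", True), ("A", False),
    ("B", False), ("B", False),
    ("C", True),
]
share, gaps = gap_analysis(example_cells)
```

In this invented example, type B has no cells inside an MPA and is the only type flagged as a gap. Weighting cells by reef area rather than counting them equally would be a natural refinement.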

Strengths and limitations of GDM and GFM
In community modelling, GDM has become a popular approach for analysing and predicting patterns of species turnover, mainly in terrestrial systems (see, e.g., Ferrier et al. 2002, Allnutt et al. 2008, Growns 2009, Marsh et al. 2010, Ashcroft et al. 2010, Overton et al. 2009), but has rarely been compared with alternative modelling approaches for community data. For the first time we have been able to assess the congruence of predictions generated from a GDM approach with those from a quite different and novel approach, GFM, at a regional scale for an area that has been identified as globally important for algal diversity (Kerswell 2006). Despite the substantial technical differences between the two approaches, GDM and GFM predicted patterns of beta diversity that were both striking and congruent, and captured a significant proportion of variation in turnover of community composition. The close congruence of the results of the two approaches used here provides strong support for the suggestion that environmental gradients related to temperature, exposure, water chemistry and nutrients are the common drivers of community change in this region.
However, it is also worth considering the modelling approaches used here for beta diversity prediction and how they can be further developed for the purposes of biodiversity modelling. Both GDM and GFM solve a number of problems that can arise with the use of environmental surrogates, including reconciling different measurement units, weighting variables to a common scale, and accommodating nonlinear rates of community turnover along different parts of a gradient. GDM also employs automated selection procedures for environmental covariates if used in conjunction with GIS layers, but cannot as yet compute confidence intervals for model parameters or for predictions and does not accommodate interactions between environmental covariates (Ferrier et al. 2007). Further development of the GDM approach to incorporate uncertainty would give clear indications of the reliability of predictions, an important factor in marine conservation management and planning.
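The way GDM places covariates on a common scale and allows nonlinear turnover can be sketched as follows. This is a minimal illustration, not the implementation used in this study. It assumes the general GDM form described by Ferrier et al. (2007), in which the predicted dissimilarity between two sites is 1 - exp(-eta), and eta is an intercept plus the sum over covariates of the absolute difference between monotonically transformed covariate values. The transform functions, their coefficients and the covariate names below are hypothetical stand-ins for fitted I-spline functions.

```python
import math

def gdm_dissimilarity(site_i, site_j, transforms, intercept=0.0):
    """Predicted compositional dissimilarity between two sites, in [0, 1).

    site_i, site_j: dicts mapping covariate name -> value.
    transforms: dict mapping covariate name -> monotone increasing function
    (hypothetical stand-ins for fitted I-spline functions).
    """
    eta = intercept + sum(
        abs(f(site_i[name]) - f(site_j[name])) for name, f in transforms.items()
    )
    return 1.0 - math.exp(-eta)

# Hypothetical transforms: each maps a covariate in its own units onto a
# shared, unitless ecological distance scale.
transforms = {
    # Saturating curve: turnover is rapid at the cool end, slow at the warm end.
    "sst_mean": lambda t: 2.0 * (1.0 - math.exp(-0.4 * (t - 12.0))),
    # Straight line: constant rate of turnover along the exposure index.
    "exposure": lambda e: 0.5 * e,
}

cool = {"sst_mean": 13.0, "exposure": 1.0}
mild = {"sst_mean": 15.0, "exposure": 1.0}
warm = {"sst_mean": 17.0, "exposure": 1.0}

d_cool_mild = gdm_dissimilarity(cool, mild, transforms)
d_mild_warm = gdm_dissimilarity(mild, warm, transforms)
```

With these invented transforms, the same 2 °C difference yields a larger predicted dissimilarity at the cool end of the gradient than at the warm end, which is the kind of nonlinear turnover the fitted functions are meant to capture. Because each covariate contributes a separate additive term, the sketch also shows why this form does not represent interactions between covariates.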
GFM, by growing large numbers of trees, allows for the modelling of a large number of (possibly collinear) explanatory variables, while random predictor selection keeps bias low. As GFM is a tree method, it is able to model interactions between variables but, unlike most other tree methods, it presents an estimate of error for each outcome fitted. The variable importance measure in GFM may be used to identify ecologically important variables for interpretation, although a subset of variables is not automatically selected in the way that it is by, for example, variable subset selection methods in regression. GFM is able to characterize and exploit structure in high-dimensional data for the purposes of prediction. As with GDM, further development of the GFM approach to incorporate uncertainty would give clear indications of the reliability of predictions.
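The cumulative importance curves that GFM produces for each covariate (Fig. B2) can be illustrated with a short sketch. It assumes that split locations and their importance values have already been pooled across the individual trees and species, and it omits any standardization by data density that a full gradient forest analysis would apply. The split values and importances below are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

def cumulative_importance(splits):
    """Build a step function F(x) for one covariate.

    splits: list of (split_value, importance) pairs pooled over trees.
    F(x) is the summed importance of all splits located at or below x, so
    the maximum of F is the covariate's overall importance and steep
    sections mark where composition changes fastest along the gradient.
    """
    ordered = sorted(splits)
    values = [value for value, _ in ordered]
    running = list(accumulate(weight for _, weight in ordered))

    def F(x):
        k = bisect_right(values, x)  # number of splits at or below x
        return running[k - 1] if k else 0.0

    return F

# Hypothetical pooled splits for mean sea surface temperature (deg C).
sst_splits = [(14.2, 0.05), (15.1, 0.30), (15.3, 0.25), (17.8, 0.04)]
F_sst = cumulative_importance(sst_splits)

# Evaluate the curve on a coarse grid of temperatures.
curve = [(t, F_sst(t)) for t in (14.0, 15.0, 15.5, 18.0)]
```

In this invented example most of the importance accumulates between 15.0 and 15.5 °C, which would be read as a zone of rapid compositional change, while the flat sections on either side indicate little turnover.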

CONCLUSION
Temperate subtidal reefs along the southeastern coast of continental Australia and northern Tasmania are characterized by high macroalgal diversity and regional endemism. Yet, because of a lack of information on the spatial distribution of biodiversity attributes (e.g., species richness, species distributions), it has been difficult to assess and plan for conservation of biodiversity in a region where such planning may be critical. We have applied recently developed community modelling methods to characterize and map the pattern of biodiversity composition for a large area, and have overcome data limitations by combining biological survey data with environmental data. The representation of ecological relationships by the GDM and GFM methods provides new avenues for exploring ecological relationships, both statistical and causal. With respect to the latter, and although not formally detailed here, further biological interpretation of beta diversity patterns is both possible and useful to pursue and will form the basis of future work. This work may also be advanced in the subtidal by similar analyses across different taxonomic groups, and by analyses at finer spatial scales with relevant environmental surrogates as they become available, both to increase our understanding of the whole ecological community and habitat and to advance regional-scale conservation management across the full domain of interest to planners. The approaches described here are not limited to the subtidal or to Australian marine systems, and the application of GDM and GFM in other marine systems can only help to develop these approaches further. They are useful in maximizing the value of existing data holdings, especially those for which biological data are sparse.

ACKNOWLEDGMENTS
This work has been funded through the Commonwealth Environment Research Facilities (CERF) program, an Australian Government initiative supporting world class, public-good research. The CERF Marine Biodiversity Hub is a collaborative partnership between the University of Tasmania, CSIRO Wealth from Oceans Flagship, Geoscience Australia, Australian Institute of Marine Science and Museum Victoria. We thank Piers Dunstan and Bill Venables for statistical advice, Glenn Manion for providing the GDM implementation and Scott Foster for valuable comments on earlier drafts of this manuscript. We also thank Bryan McDonald and Laurie Ferns for facilitating the South Australian and Victorian surveys respectively, and Carolina Zagal for database development.

Fig. 1. The south-eastern coastline of continental Australia and northern coastline of Tasmania, with the location of rocky subtidal reefs surveyed by field teams.

Fig. 2. Relative importance of the environmental covariates used in the GDM (left) and GFM (right) explanatory models.

Fig. 3. Predicted spatial pattern of beta diversity for macroalgal communities around south-eastern Australia based on (A) GDM and (B) GFM. Colors represent gradients in beta diversity derived from environmental gradients across the region. Vector plots indicate the environmental variables that contribute the most to predicted patterns of beta diversity.

Fig. B2. Cumulative importance distributions of splits for 8 of the 19 environmental variables, using GFM. The maximum value of the function for each environmental covariate indicates its relative importance. The shape of the function indicates the rate at which community composition changes across each environmental gradient.