Relationships between the species composition of forest field-layer vegetation and environmental drivers, assessed using a national scale survey


  • P. M. CORNEY,

    1. Applied Vegetation Dynamics Laboratory, School of Biological Sciences, University of Liverpool, PO Box 147, Liverpool, L69 7ZB, UK, Centre for Ecology and Hydrology, Lancaster Environment Centre, Library Avenue, Bailrigg, Lancaster, LA1 4AP, UK,
  • M. G. LE DUC,

    1. Applied Vegetation Dynamics Laboratory, School of Biological Sciences, University of Liverpool, PO Box 147, Liverpool, L69 7ZB, UK, Centre for Ecology and Hydrology, Lancaster Environment Centre, Library Avenue, Bailrigg, Lancaster, LA1 4AP, UK,
  • S. M. SMART,

    1. Applied Vegetation Dynamics Laboratory, School of Biological Sciences, University of Liverpool, PO Box 147, Liverpool, L69 7ZB, UK, Centre for Ecology and Hydrology, Lancaster Environment Centre, Library Avenue, Bailrigg, Lancaster, LA1 4AP, UK,
  • K. J. KIRBY,

    1. English Nature, Northminster House, Peterborough, PE1 1UA, UK, and
  • R. G. H. BUNCE,

    1. Alterra, Landscape and Spatial Planning Section, PO Box 47, 6700AA Wageningen, the Netherlands
  • R. H. MARRS

    1. Applied Vegetation Dynamics Laboratory, School of Biological Sciences, University of Liverpool, PO Box 147, Liverpool, L69 7ZB, UK, Centre for Ecology and Hydrology, Lancaster Environment Centre, Library Avenue, Bailrigg, Lancaster, LA1 4AP, UK,

Philip M. Corney (tel. +44 0151 795 5173; fax +44 0151 795 5171; e-mail


  • 1. Simulation models of forest stand dynamics have increased understanding of over-storey vegetation functioning, and have facilitated the development of tools capable of assessing possible successional trajectories. However, few models incorporate the response of the field layer vegetation despite it being another key component of forest ecosystems.
  • 2. Our main objective was to assess the degree to which field-layer vegetation composition in forests is determined by variables operating at different scales, from regional (e.g. climate, location) to local factors (e.g. basal area of canopy trees, management).
  • 3. We used data gathered during a nationwide forest survey to assess the relative effects of a broad spectrum of environmental variables on species composition. Variation partitioning was used to examine the relative contribution of subsets of environmental variables such as site spatial variation, boundary type and presence of herbivores.
  • 4. Ordination confirmed hypotheses that field layer vegetation is primarily structured by two composite geo-climatic gradients. However, variation partitioning demonstrated that site- and plot-scale management factors also strongly influence the floristic composition of forest patches.
  • 5. Disturbance variables (site boundary type/regional presence of deer) accounted for considerable species variation, exceeding that due to either site spatial variation or forest structure.
  • 6. This is the first time variation attributable to such a comprehensive range of environmental variables has been quantified for forests surveyed at a national scale. We thus provide a context within which regional studies, or analyses considering a more limited range of factors, can be viewed, and a framework from which robust models of floristic response to gradual and episodic natural and anthropogenic disturbances may be developed.
  • 7. The methodology we present, including a novel technique for the identification and removal of outliers in large data sets, provides a unique and standardized means of assessing the relative importance of diverse environmental drivers across a range of habitat types at the landscape scale, and is readily applicable elsewhere.


Forest biodiversity is affected by numerous factors that are scale dependent. At the site scale, these include climate (Williams et al. 2002), substrate (Weber-Blaschke et al. 2002) and site history (Graae et al. 2004), with successional development (He et al. 2002) and direct management, including coppicing (Mason & MacDonald 2002) and grazing by wild ungulates or livestock (McInnes et al. 1992; Garin et al. 2000; Kirby & Thomas 2000), important at the local scale. Knowledge of the relative importance of the main driving variables is essential to understanding ecosystem functioning (Jenny 1980; Kolb & Diekmann 2004), but few studies evaluate the interactions of these factors. Studies able to provide predictive information on the response of field layer species to environmental variables are rare (Weisberg et al. 2003) and often context specific, considering only single forest sites or groups of sites.

Variables that play a key role at the local patch scale may differ from those that are important at the site or landscape scales (Weiher & Howe 2003), so their relative contributions should be assessed at a range of scales. This is important for theoretical ecology, robust model development (e.g. Bailey et al. 2002; Bragg et al. 2004), applied resource management (Bunce et al. 1999; Skov & Svenning 2003), policy development, and the assessment of causes and consequences of change.

This paper describes the range of variation in the field layer vegetation of semi-natural forests, using a countrywide survey across Britain. Much plant biodiversity in temperate forest ecosystems is contained within, or depends upon, this layer, which also provides a valuable habitat for organisms such as invertebrates (e.g. Wardhaugh 1997; Jukes et al. 2001; Tudor et al. 2004). Data from 103 sites, selected using a stratified-random sampling approach from almost 2500 forest patches spread across Britain and surveyed in 1971 (Bunce & Shaw 1973), provide a much larger, representative sampling domain than considered hitherto (Kirby et al. 2005). Because 16 plots were sampled within each site, it was possible to assess the effects of environmental factors on field layer vegetation at both site and within-site (plot) scales.

We used multivariate analysis to describe the range of variation in field layer vegetation, and to relate community composition to environmental data, both gathered during the survey and derived subsequently from contemporary sources. Variation partitioning was then used to assess the relative contributions of environmental variables grouped into appropriate sets (e.g. climate, soils). The extended analysis and use of plot level species cover data provide an improved assessment of the floristic composition of field layer communities over our previous analyses of presence/absence data at the site level (Corney et al. 2004). We believe this analysis is the first to attempt such a comprehensive investigation using data for an ecosystem resource surveyed at a national scale.


national woodland survey (NWS) and derived data sets

Association analysis of vegetation data was used to classify 2463 forest sites into 103 groups (Bunce & Shaw 1975; Sheail & Bunce 2003). Analysis of topographical and climatic data was used to identify a representative site from each group. Plots (14.14 m × 14.14 m, 200 m², n = 16) were then positioned randomly within each site and surveyed for the NWS between June and October 1971 (details in Bunce & Shaw 1973). The median area of sampled forest sites was 20.4 ha, with lower and upper quartiles of 12.8 and 39.7 ha.

Variables measured in each plot (Table 1) included cover (%) of field layer species (for nomenclature see Stace 1997), cover (%) of six ground cover categories and diameter at breast height (d.b.h., ≥ 1.3 m) of all species normally capable of attaining a tree-like habit. Because measuring all stems in sites of even moderate regeneration would have required a disproportionate amount of time, diameter and counts of saplings (under 5 cm d.b.h.) and all shrub species stems (e.g. Corylus avellana, Sambucus nigra) were only recorded in opposing quarters of each plot (Bunce & Shaw 1973) and multiplied by 2 to derive a plot estimate. Dead stems and multiple stems originating from the same stool were also counted. Micro-climatic variables, plot descriptors and soil data were also recorded during the NWS. Soil pH (H₂O) and loss-on-ignition (LOI) were assessed from a soil sample (methods adapted from Allen 1989). Soil subgroup classes were assigned using Avery's (1980) classification, but were aggregated into the higher category of soil groups during exploratory analyses, to overcome problems of multicollinearity.

Table 1.  Description of environmental variables used in the multivariate analysis, along with data transformations and software packages used. These variables were either collected during NWS 1971 or generated for the present analysis. No. = the number of different categories comprising each variable group. For simple continuous variables, such as northing, each plot will have a single value. However, for categorical continuous variables, such as live tree species basal area, any given plot may be represented by up to the maximum number of classes (e.g. 75 species) for that group. Note that, for example, although 26 soil subgroup classes were recorded during the survey, each plot is represented by one or few soil subgroups. S = data source

Variable group | Description | No. | S
Site boundary type | Site level presence or absence of different boundary types | 16 | 1
Site form, shape | Site area (ha)*; perimeter (m)*; two shape indices: area index (Aw/Pw); perimeter index (Pw/Pc) | 4 | 1
Geographic | National grid easting and northing (km) of plot position*; average site altitude (m)^b; distance (km) from plot to nearest coast‡ | 4 | 1, 2
Forest climate | Forest wind climate; annual accumulated temperature; moisture deficit | 3 | 3
Deer census data | Deer species census data; presence of six species in 10-km grid squares | 6 | 4
Climatic seasonal 1961–90 LTA data^a | Mean daily max temp (°C); mean daily min temp (°C); days of ground frost; mean cloud cover (Oktas converted to percentage); total bright sunshine (hours); number of days per month having a rainfall ≥ 1 mm (rain days); number of days per month having a rainfall ≥ 10 mm (wet days); total precipitation (mm) | 32 | 5
Climatic yearly 1961–2000 LTA data | Annual extreme temp range (°C); number of growing degree days; growing season length; maximum number of consecutive dry days in a year; greatest 5-day precipitation total in a year (mm); mean rainfall amount (mm) on rain days | 6 | 5
Edaphic | Soil pH; LOI (%)^c; cumulative depth (cm) to bottom of five soil horizons (A00; A0; A1; A2; B) | 7 | 1
Micro-climatic | Plot slope (degree); southerly aspect^d; westerly aspect^e; plot altitude (m); distance (m) from plot to nearest site boundary* | 5 | 1, 2
Ground cover | Cover (%)^c of six ground cover types (including bryophyte cover, litter, dead wood and bare ground) | 6 | 1
Plot level descriptors | Including signs of management (7); habitats (46); signs of animal activity (13) | 66 | 1
Soil classification | Proportion of one of 26 soil subgroups present in each plot^c,f | 26 | 1
Woody spp. | | |
Species basal area | Live and dead (l/d) basal area (cm²) of: tree (≥ 5 cm diameter) (75/47), sapling (< 5 cm diameter) (64/42) and shrub (51/29) species present within each plot^g | (190/118) | 1
Total plot statistics | Total tree, sapling and shrub basal area (cm² plot⁻¹)^g, live and dead; density of tree, sapling and shrub stems (plot⁻¹), live and dead | 12 | 1
Date | Month during which the site was surveyed. Used as dummy variables. | 1 (5) | 1

^a Weighted arithmetic average of Meteorological Office monthly data to generate 32 seasonal variables.
^b Arithmetic average of plot data by site.
^c Arcsine transformation (Sokal & Rohlf 1995).
^d Transformation of aspect (a, degrees), s = sin((a × π/180)/2), giving a southerly aspect.
^e Transformation of aspect (a, degrees), w = |sin(((a × π/180) − π/2)/2)|, giving a westerly aspect.
^f Where more than one subgroup occurred in each plot, the complex was assumed to comprise equal amounts of each constituent subgroup.
^g d.b.h. (cm) data converted to basal area (πr²). Square root transformation of basal area data; 0.5 added to all variates (Sokal & Rohlf 1995).
* ERDAS IMAGINE version 8.5 (ERDAS 2001).
† MAP MANAGER version 6.2 (ESRI 2001), ARCVIEW GIS version 3.2a (ESRI 2000).
‡ MINDIST2 (Le Duc et al. 2000).

Data sources
1 National Woodland Survey 1971 data
2 Land-Form PANORAMA DTM contour maps; OS DIGIMAP service (EDINA 2002)
3 ESC decision support system (Ray 2001) 1961–90 LTA data
4 Biological Records Centre, Centre for Ecology and Hydrology, Monks Wood, 1911–80 data
5 UK Meteorological Office 5 km × 5 km grid baseline datasets
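
The transformations listed in the footnotes to Table 1 can be illustrated with a minimal Python sketch. It is not the code used in the study. The function names are hypothetical, the aspect formulas are read with the division by 2 applied to the angle inside the sine (an assumption, chosen because it gives a value of 1 for due south and due west, respectively), and the arcsine transformation is taken in its usual arcsine square-root form.

```python
import math


def southerly_aspect(a_deg):
    """Aspect index: 0 at north (0 degrees), 1 at due south (180 degrees).

    Assumes the /2 in the Table 1 footnote halves the angle.
    """
    return math.sin(math.radians(a_deg) / 2)


def westerly_aspect(a_deg):
    """Aspect index: 0 at due east (90 degrees), 1 at due west (270 degrees)."""
    return abs(math.sin((math.radians(a_deg) - math.pi / 2) / 2))


def arcsine_transform(percent):
    """Arcsine square-root transform of a percentage (result in radians)."""
    return math.asin(math.sqrt(percent / 100.0))


def basal_area_transformed(dbh_cm):
    """Basal area (pi * r^2, cm^2) from d.b.h., then sqrt(x + 0.5)."""
    basal_area = math.pi * (dbh_cm / 2) ** 2
    return math.sqrt(basal_area + 0.5)
```

Under this reading, a plot facing south-west (225°) scores about 0.92 on both indices, so the two variables together describe aspect without the discontinuity at 0°/360°.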

Additional variables (also in Table 1) were subsequently derived for each plot, using the methods described by Corney et al. (2004), and employing Ordnance Survey (OS) grid references, generated using the software package ERDAS IMAGINE v8.5 (ERDAS 2001). These included plot altitude data (extracted from Land-Form PANORAMA Digital Terrain Model contour maps held on the OS DIGIMAP service; EDINA 2002), monthly and yearly climatic data, acquired from long-term average (LTA) data sets (produced in association with the UK Climate Impacts Programme; Meteorological Office 2003), forest climate LTA data (obtained from the Forestry Commission's Ecological Site Classification system; Ray 2001) and deer census data (supplied by the Biological Records Centre, from information collated by the British Deer Society, the Mammal Society and others; Arnold 1993).

classification of plant communities

Before ordination analysis, all 1648 plots were classified using both the UK National Vegetation Classification (NVC, derived from sites of conservation importance; Rodwell et al. 1991) and the Countryside Vegetation System (CVS, derived from a national stratified-random sample of vegetation; Bunce et al. 1999) to allow identification of plot outliers. Plot classification was implemented using TABLEFIT (v1.0) for NVC (Hill 1996), and MAVIS Plot Analyser (v1.0) for CVS (Smart 2000), with classes assigned using total species presence (i.e. trees, saplings, shrubs, bryophytes and field layer species) within each plot, because cover data were not available for non-field layer species. For classification and summary environmental data, see Appendices S1 and S2 in Supplementary Material.

ordination methods

Ordination analyses were conducted using CANOCO for Windows version 4.5 (Ter Braak & Šmilauer 2002) to investigate composition of field layer communities in relation to environmental factors. Cover (%) of field layer species (including tree and shrub seedlings < 25 cm tall) was used as dependent data, with all other variables, including the basal area of both canopy and shrub layer species, regarded as independent. Environmental variables were transformed as shown in Table 1, and field-layer species cover data (y) were transformed using ln(10y + 1). Species identified only to genus level or as aggregates (e.g. Hieracium spp., Taraxacum agg.), and species identified as couplets (e.g. Poa nemoralis and P. trivialis), were down-weighted.
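
As a small illustration of the cover transformation, the sketch below applies ln(10y + 1) to a percentage cover value. The function name is hypothetical, and y is assumed to be cover expressed in percent.

```python
import math


def transform_cover(cover_percent):
    """ln(10y + 1) transform of species cover y (assumed to be in percent)."""
    return math.log(10 * cover_percent + 1)


# Absent species stay at zero, while the range 1-100% is strongly compressed.
low = transform_cover(1)     # ln(11)
high = transform_cover(100)  # ln(1001)
```

A hundred-fold difference in cover therefore becomes roughly a three-fold difference in the transformed value, which reduces the influence of a few dominant species on the ordination.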

Detrended correspondence analysis (DCA) was used for exploratory analyses of species-environment relations. Assessment of gradient lengths for the primary axes also determined later choice of either unimodal or linear models (Ter Braak 1995; Ter Braak & Šmilauer 2002).

The occurrence of non-forest plots and planted crops, which is inevitable in a large-scale survey based on stratified-random sampling, caused severe distortion in the initial DCA. Removal or retention of non-forest plots was validated using an iterative approach in which the data set was split randomly into two equal-sized blocks, which were assessed using DCA. Plot sample scores for the first four axes were averaged by site within each block and the deviation between blocks compared using fitted line regression (MINITAB Inc. 2000). The effect of removing outliers was assessed with the aim of retaining as many plots as possible per site. Including only those plots classified as forest by both CVS and NVC provided a final data set with minimal between-block deviation, comprising 1438 plots, or 87% of those surveyed in 1971 (Fig. 1; the classification and number of both retained and rejected plots are presented in Appendices S1 and S2).
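
The bookkeeping of this split-block check can be sketched as follows. The sketch does not perform the DCA itself: plot scores for each block are assumed to come from a separate ordination run, and a plain least-squares line stands in for the fitted line regression carried out in MINITAB. All function names are hypothetical.

```python
import random
from collections import defaultdict


def split_into_blocks(plot_ids, seed=0):
    """Randomly split plot identifiers into two equal-sized blocks."""
    ids = list(plot_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


def site_means(scores, site_of):
    """Average plot scores on one axis by site.

    scores:  dict plot_id -> ordination score for one axis
    site_of: dict plot_id -> site identifier
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for plot, score in scores.items():
        sums[site_of[plot]] += score
        counts[site_of[plot]] += 1
    return {site: sums[site] / counts[site] for site in sums}


def fitted_line(x, y):
    """Least-squares line y = intercept + slope * x, plus RMS residual."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rms = (sum(r * r for r in residuals) / n) ** 0.5
    return slope, intercept, rms
```

For each axis, the site means from the two blocks would be paired by site and passed to `fitted_line`; a small RMS residual indicates that both halves of the data place sites similarly, which is the sense in which between-block deviation is minimized.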

Figure 1.

Distribution of sites surveyed in the 1971 National Woodland Survey of Britain. Circle size represents the number of plots from each site included in the present analysis on the basis of vegetation classification.

DCA gradient lengths, after removal of the outliers, were > 3.5 standard deviations for the first four axes, suggesting that unimodal models were appropriate (Ter Braak & Šmilauer 2002); DCA and canonical correspondence analysis (CCA) were used subsequently (Ter Braak & Prentice 1988). Species occurring in only one plot were made supplemental (Ter Braak & Prentice 1988), and month surveyed was entered as a covariable in all further ordination analyses to remove effects of a staggered survey from June to October.

Interpretation of DCA/CCA biplots was assisted by reference to the forest specialist species list for Britain, derived by Kirby et al. (2006) from 12 regional lists of species indicative of ancient woodland and forest. Field layer specialists are species adapted to the relatively stable environmental conditions found beneath the canopy (Honnay et al. 2002a). They are more likely to have stress-tolerant traits and to be shade or semi-shade tolerant, compared with other forest plants (Hermy et al. 1999), and are important as indicators of the ecological value of forests (Honnay et al. 1998). Because species considered to be forest specialists in one region may not be classified so in another region (Hermy et al. 1999; Kirby et al. 2006), NWS species were classified into three broad groups: forest specialist species showing a strong association with the distribution of ancient forest patches across much of the country (present in seven or more lists), specialists showing an affinity only in some regions (present in six lists or less), and non-specialist (or not listed) species.

Several otherwise complex ordination diagrams have been simplified by plotting only species that are abundant in the data set. Selection of species was determined by N2 > 60. N2 (Hill 1973), a measure of species abundance, is related to Simpson's index, i.e. N2 = 1/(Σ p_i^2), where p_i is the proportional abundance in the i-th unit.
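
A minimal sketch of Hill's N2 is given below. It assumes that, for a species, the proportions are taken across plots (each plot's share of that species' total abundance), so that N2 behaves as an effective number of occurrences; the function name is hypothetical.

```python
def hills_n2(abundances):
    """Hill's N2 = 1 / sum(p_i^2), with p_i the share of total abundance in unit i."""
    total = sum(abundances)
    if total == 0:
        return 0.0
    return 1.0 / sum((a / total) ** 2 for a in abundances)


# A species spread evenly over four plots has N2 = 4;
# one confined to a single plot has N2 = 1, whatever its cover there.
even = hills_n2([1, 1, 1, 1])
single = hills_n2([5, 0, 0])
```

On this reading, the N2 > 60 criterion retains species whose abundance is spread over the equivalent of more than 60 equally occupied plots.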

Bivariate standard deviational (SD) ellipse analysis, implemented in EXCEL (Microsoft 2003), using a standard algorithm (Ricklefs & Nealen 1998; Milligan et al. 2004), was used to illustrate the position of NVC forest community types and forest specialist groupings along CCA biplot axes. The ellipses were generated using CCA sample or species scores (calculated by CANOCO) and enclose the area within 1 standard deviation of the centroid of each distribution.
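
One common way to construct such an ellipse is from the eigen-decomposition of the 2 × 2 covariance matrix of the scores. The sketch below follows that approach; it is an illustration under that assumption and is not claimed to reproduce the cited spreadsheet algorithm step by step. The function name is hypothetical.

```python
import math


def sd_ellipse(xs, ys):
    """Return centroid, semi-axes (1 SD) and orientation of a bivariate SD ellipse.

    The semi-axes are the square roots of the eigenvalues of the sample
    covariance matrix; the angle (radians) is that of the major axis
    relative to the x-axis.
    """
    n = len(xs)
    cx, cy = sum(xs) / n, sum(ys) / n
    sxx = sum((x - cx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - cy) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - cx) * (y - cy) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2 x 2 matrix.
    mean_var = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major = math.sqrt(mean_var + spread)
    minor = math.sqrt(max(mean_var - spread, 0.0))
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (cx, cy), major, minor, angle
```

Applied to the axis 1 and axis 2 scores of all plots assigned to one NVC community, the returned centroid, semi-axes and angle are sufficient to draw the ellipse on a biplot.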

variation partitioning

We wished to establish the total variation explained by different groups of variables. Environmental data for the 1438 plots were therefore divided into three initial groupings, corresponding to different spatial scales. Variables were then drawn from the relevant grouping to form each of three distinct explanatory sets, Site {S}, Plot {P} and Woody species {W} (Fig. 2), on the basis of a two-step, within-set, process. First, variables exhibiting the strongest multicollinearity (i.e. those almost perfectly correlated with one or more of the other variables in the set) were successively removed until all variable sets exhibited low inflation factors. Secondly, variables that did not contribute significantly to an explanation of variation were eliminated using the CCA forward selection process, in order to give ‘realistic estimates of variation’ (Borcard et al. 1992) explained by each set. Monte Carlo testing, with 499 permutations, was used to assess significance of each variable upon inclusion into set models (Økland & Eilertsen 1994). Although Borcard et al. (1992) suggest that each set should contain comparable numbers of explanatory variables, the number of variables included in each set is less important where, as in this case, forward selection is used before constrained ordination (Økland & Eilertsen 1994). The threshold limit for within- and between-set multicollinearity was set at 20.
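
The resolution of a Monte Carlo test depends on the number of permutations. A minimal sketch of the usual permutation P-value, (number of permuted statistics at least as large as the observed one + 1)/(number of permutations + 1), is shown below; the function name is hypothetical, and the permuted F-values are assumed to come from the ordination software.

```python
def permutation_p_value(observed, permuted_stats):
    """Permutation P-value: (n_ge + 1) / (n_permutations + 1)."""
    n_ge = sum(1 for s in permuted_stats if s >= observed)
    return (n_ge + 1) / (len(permuted_stats) + 1)


# With 499 permutations the smallest attainable P is 1/500 = 0.002,
# and a single permuted value reaching the observed F gives 2/500 = 0.004.
p_min = permutation_p_value(10.0, [0.0] * 499)
p_next = permutation_p_value(10.0, [11.0] + [0.0] * 498)
```

These two values appear consistent with the significance thresholds (P ≤ 0.002 and P ≤ 0.004) reported for the forward selection tests in Table 2.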

Figure 2.

Procedural scheme, demonstrating analysis approach and hierarchy of seven VP models and two additional CCA models. *Total tree, sapling and shrub basal area (live and dead); density of tree, sapling and shrub stems (live and dead). Biplots of CCA models (General, Plot management and Forest structure) are illustrated in grey.

The fraction of variation explained by a set of explanatory variables is the sum of all constrained eigenvalues divided by total inertia. The interaction of all variable sets within one group, e.g. {P∩S∩W} was calculated by subtraction. The significance of each model was assessed using Monte Carlo tests with 9999 permutations (Legendre & Legendre 1998). Because the significance of CCA analyses was determined by this randomization test, a model with low eigenvalues should be interpreted as one with a small, but significant, effect. As seven variation partitioning models and two additional CCA models were examined (see Fig. 2), the Bonferroni correction for significance (Legendre & Legendre 1998) was used to control for Type I error (i.e. reporting a false positive), with α (0.05), being replaced by an adjusted level of α′ = α/9 = 0.006.
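
The two calculations in this paragraph can be written out as a short sketch. The function names and the eigenvalues in the example are hypothetical; only α = 0.05 and the nine models are taken from the text.

```python
def fraction_explained(constrained_eigenvalues, total_inertia):
    """Sum of constrained eigenvalues divided by total inertia."""
    return sum(constrained_eigenvalues) / total_inertia


def bonferroni_alpha(alpha, n_tests):
    """Bonferroni-adjusted significance level."""
    return alpha / n_tests


# Nine models (seven VP models and two CCA models) at alpha = 0.05.
alpha_adjusted = bonferroni_alpha(0.05, 9)  # 0.00555..., reported as 0.006

# Hypothetical eigenvalues, for illustration only.
example_fraction = fraction_explained([0.3, 0.2], 2.5)
```

With 9999 permutations the smallest attainable P-value is 1/10 000 = 0.0001, so the test has ample resolution to meet the adjusted threshold.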

We first assessed the relative importance of the site {S}, plot {P} and woody spp. {W} sets (VP model I) using partial constrained ordination (pCCA). In VP model II, site {S} was broken down into Climatic {CS}, Spatial {SS} and Boundary/Grazing {BS} effects, with the last category further broken down into Deer {DB} and Boundary type {BB} (VP model III). Plot {P} (VP model IV) comprised Geo-spatial {GP}, Biotic {BP} and Management {MP}, the effects of which were then assessed in more detail with a pCCA model, while woody species {W} (VP model V) was decomposed into the effects of canopy tree species {TrW}, tree species saplings {SaW} and shrub species {ShW}.

Canopy structure and light availability are strongly associated with stem geometry (Weisberg et al. 2003; Dean 2004). In order to assess the relative contribution of woody structure as opposed to tree species composition, 12 variables (Table 1) describing basal area and density of woody stems were added to the plot model {P}, and assessment of inflation values and significance, using forward selection, was used to generate {P+W} (Fig. 2). Variation partitioning was then used to assess the relative importance of plot and forest structure in relation to site variables (VP model VI) and forest structure against plot variables only (VP model VII). Finally, a pCCA model was used to explore the effect of forest structure (stem area and density of trees, saplings and shrubs) on field layer species.

Unless stated otherwise, the percentage of variation explained by a given set includes any variation intersecting with that of an adjoining set. Variation intersections may be explained, either wholly or partially, by one or more sets and thus, while it is acceptable to say that the variation explained by a set {S} is that explained by the set alone, i.e. {S | P}, it is more accurate to say that it is ‘up to’ the amount described by both {S | P} and any intersections, i.e. {S∪(S∩P)}. Values of variation explained may therefore sum to more than 100% because of overlaps between sets.
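
For two sets, the partition by subtraction (Borcard et al. 1992) can be sketched as below. The inputs are the fractions of variation explained by ordinations constrained by {S} alone, by {P} alone and by both together; the function name and the numerical values are hypothetical.

```python
def partition_two_sets(explained_s, explained_p, explained_both):
    """Split explained variation into unique and shared fractions.

    explained_s:    variation explained when constraining by S only
    explained_p:    variation explained when constraining by P only
    explained_both: variation explained when constraining by S and P together
    """
    unique_s = explained_both - explained_p          # {S | P}
    unique_p = explained_both - explained_s          # {P | S}
    shared = explained_s + explained_p - explained_both  # {S ∩ P}
    return {"unique_s": unique_s, "unique_p": unique_p, "shared": shared}


# Hypothetical fractions, for illustration only.
parts = partition_two_sets(0.10, 0.12, 0.18)
```

In this example {S} explains 0.06 on its own and "up to" 0.10 once the shared fraction of 0.04 is included, which is the distinction drawn in the paragraph above.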


floristic variation

The geographical range of retained plots covered the entire British mainland (Fig. 1), with considerable physiographic variability between different vegetation community classes. Of 352 species included as active species, 101 (29%) were classed as forest specialists, similar to the proportions found by Hermy et al. (1999) from a wider assessment of forests across Europe (21%), and 34 of these had a strong affinity with ancient forest fragments in most regions (Kirby et al. 2006).

The final DCA produced eigenvalues (λ1 – λ4) of 0.528, 0.325, 0.247 and 0.226, with gradient lengths for the first four axes of 5.761, 5.028, 4.489 and 3.639 (SD units), respectively. A plot of the first two axes (Fig. 3) shows two major trends. First, species with low scores along axis 1 are shade-tolerant forest species, associated with fertile, base-rich soils (e.g. Glechoma hederacea, Circaea lutetiana and forest specialist species Mercurialis perennis and Galium odoratum), whereas species with larger scores associate strongly with acidic, infertile soils and well-lit habitats such as heathland communities and moorlands (e.g. Pteridium aquilinum, Vaccinium myrtillus). Secondly, low scores along axis 2 are exhibited by species often found on moderately damp soils (e.g. Hedera helix, and forest specialists Hyacinthoides non-scripta and Lamiastrum galeobdolon) and larger scores by species typical of water-saturated flushes, wet forests and poorly aerated soils such as fens, ditches and marshes (e.g. Chrysosplenium oppositifolium, Filipendula ulmaria).

Figure 3.

DCA plot produced from analysis of 1438 NWS plots. Ordination of 62 species (with N2 > 60) shown. Species are denoted thus: species considered to be forest specialists (squares) across much of the country, ▪; species considered to be forest specialists in some regions only, ░; other forest and non-forest species, ×. Codes Poa 1.2, Quer 1.2 and Viol 3.4, refer to the species couplets Poa nemoralis and P. trivialis, Quercus petraea and Q. robur (seedling) and Viola reichenbachiana and V. riviniana, respectively. A full species list is available in Appendix S3.

floristic variation in relation to environmental factors

CCA of model I {S∪P∪W} was significant (F = 3.137, P ≤ 0.001, λ1 = 0.395, λ2 = 0.223). Sets and subsets of significant variables are given in Table 2. Axis 1 of the summary biplot (Fig. 4) is distinguished by the geo-climatic variables soil pH, precipitation and temperature, while axis 2 demonstrates a moisture gradient; both are comparable with the first two DCA axes.

Table 2.  Sets and subsets of environmental variables; within-set order of selection is shown, along with the F-values of an unrestricted Monte Carlo test with 499 permutations conducted at the final (manual) forward selection stage

Site {S}
Boundary/grazing {BS} | Order selected | F | Climatic {CS} | Order selected | F | Spatial {SS} | Order selected | F
Road | 8 | 4.34** | Minimum temp summer | 1 | 13.96** | Easting | 2 | 10.48**
Fence intact | 10 | 4.34** | Cloud cover winter | 4 | 6.47** | Site area | 3 | 8.01**
Red deer (BRC) | 16 | 3.51** | Cloud cover spring | 6 | 5.03** | Sea distance | 5 | 5.20**
Hedge thick | 17 | 3.38** | Ground frost winter | 7 | 4.46** | Site altitude | 11 | 4.27**
Railway | 18 | 3.21** | Total precipitation summer | 14 | 3.74** | Shape function (P) | 12 | 4.07**
Roe deer (BRC) | 19 | 3.39** | Ground frost summer | 15 | 3.48** | Shape function (A) | 32 | 2.97**
Wall intact | 20 | 3.15** | Sunshine in winter | 21 | 3.35** | | |
Water | 22 | 3.24** | Extreme temperature range | 24 | 3.17** | | |
Fallow deer (BRC) | 25 | 3.15** | Maximum temperature winter | 30 | 2.87** | | |
Post and rail | 26 | 3.11** | Windiness score | 34 | 2.74** | | |
Fence derelict | 27 | 3.21** | | | | | |
Fence holes | 29 | 3.21** | | | | | |
Bank and ditch | 31 | 2.90** | | | | | |
Merging direct | 33 | 2.89** | | | | | |
Sika deer (BRC) | 35 | 2.74** | | | | | |
Wall derelict | 36 | 2.66** | | | | | |
Wall gaps | 37 | 2.77** | | | | | |
Hedge thin | 39 | 2.35** | | | | | |
Muntjac (BRC) | 40 | 2.37** | | | | | |

Plot {P}
Geo-spatial {GP} | Order selected | F | Biotic {BP} | Order selected | F | Plot management {MP} | Order selected | F
Soil pH | 1 | 18.81** | Marsh/bog | 2 | 7.65** | Old conifer stump | 6 | 5.05**
Plot slope | 4 | 6.68** | Cover of litter | 3 | 7.57** | Sheep | 12 | 3.77**
Loss on ignition | 5 | 5.97** | Glade > 12 m | 7 | 4.76** | Cattle | 19 | 2.78**
Brown alluvial soils | 9 | 4.41** | Aquatic vegetation | 8 | 4.55* | Rubbish (other) | 20 | 2.69**
Plot altitude | 11 | 3.89** | Cover of bryophytes | 10 | 4.24** | Ditch/drain wet | 26 | 2.38**
Brown calcareous earths | 15 | 3.35** | Anthill | 13 | 3.82** | Spent cartridges | 29 | 2.28*
Bryophyte covered rock | 16 | 3.15** | Bare ground | 14 | 3.67** | Coppice stool | 37 | 1.82**
Cambic gley soils | 18 | 2.98** | Red deer (NWS) | 17 | 2.96** | Ditch/drain dry | 48 | 1.64**
Depth to B horizon | 21 | 2.54** | Cover of dead wood | 22 | 2.56** | Stump > 10 cm | 51 | 1.56**
Rendzinas | 23 | 2.53** | Stream/river fast | 25 | 2.39** | Path < 5 m | 53 | 1.39**
Brown sands | 24 | 2.44** | Other deer (NWS) | 30 | 2.18** | | |
Southerly aspect | 27 | 2.39** | Glade 5–12 m | 39 | 1.89** | | |
Sand rankers | 28 | 2.37** | Log very rotten | 40 | 1.78** | | |
Westerly aspect | 31 | 2.18** | Fallen branch > 10 cm | 41 | 1.80** | | |
Stone < 5 cm | 32 | 2.11** | Mole | 42 | 1.80** | | |
Argillic brown earths | 33 | 2.08** | Squirrel | 43 | 1.79** | | |
Rock ledges | 34 | 2.00** | Rabbit | 49 | 2.07** | | |
Exposed gravel/sand | 35 | 2.02** | Fallen uprooted tree | 50 | 1.53* | | |
Distance plot to boundary | 36 | 2.02** | Fallen broken tree | 52 | 1.43** | | |
Rocks 5–50 cm | 38 | 1.87** | | | | | |
Depth to A1 horizon | 44 | 1.72** | | | | | |
Brown earths | 46 | 1.95** | | | | | |
Exposed mineral soil | 47 | 1.59** | | | | | |

Woody spp. {W} (basal area)
Tree spp. {TrW} | Order selected | F | Sapling spp. {SaW} | Order selected | F | Shrub spp. {ShW} | Order selected | F
Fraxinus excelsior | 1 | 7.00** | Populus tremula | 3 | 5.63** | Corylus avellana | 8 | 3.07**
Alnus glutinosa | 2 | 5.96** | Crataegus monogyna | 12 | 2.63* | Corylus spp. | 9 | 3.07**
Pinus sylvestris | 4 | 4.30** | | | | | |
Quercus spp. | 5 | 4.27** | | | | | |
Ulmus glabra | 6 | 3.06** | | | | | |
Ulmus spp. | 7 | 3.09** | | | | | |
Acer pseudoplatanus | 10 | 2.76** | | | | | |
Fagus sylvatica | 11 | 2.70** | | | | | |
Betula spp. | 13 | 2.42** | | | | | |
Crataegus monogyna | 14 | 2.21* | | | | | |
Acer campestre | 15 | 2.10* | | | | | |

Significance of the F-test: *P ≤ 0.004, **P ≤ 0.002. †Variable rejected during forward selection procedure to create PP+W.

Figure 4.

Model I {S∪P∪W}. CCA biplot illustrating significant environmental variables and plant species with N2 > 60. For clarity, only variables exhibiting strongest interset correlations (> ±0.2) with axes 1 or 2 are shown; full data are presented in Table 2. For species nomenclature, see Fig. 3. Inset: position of selected NVC community types, displayed as bivariate SD ellipses, for CCA ordination axes 1 and 2. NVC types illustrated are: Alnus glutinosa-Fraxinus excelsior-Lysimachia nemorum woodland, W7; Fraxinus excelsior-Acer campestre-Mercurialis perennis woodland, W8; Quercus robur-Pteridium aquilinum-Rubus fruticosus woodland, W10; Quercus petraea-Betula pubescens-Oxalis acetosella woodland, W11.

The first axis illustrates a gradient from forests in warmer, drier and more eastern areas, to cooler, wetter forests in the west (inter-set correlations of minimum summer temperature, total summer precipitation, summer ground frost, easting; r = −0.45, r = 0.36, r = 0.47, r = −0.39, respectively). Forests at the negative end of the axis tend to be found on deeper, less acidic, soils (depth to B horizon, r = −0.26; depth to A1 horizon, r = −0.20; pH, r = −0.61) and are dominated by Fraxinus excelsior (r = −0.34), Crataegus monogyna (r = −0.20) and Corylus avellana (r = −0.22). They are also associated with shade-tolerant field-layer forest specialist species of fertile and weakly basic soil conditions such as Mercurialis perennis and Galium odoratum, indicating NVC class W8 (Fig. 4, inset), diverse, base-rich lowland woodlands. The positive end of the axis illustrates forests at higher altitudes (r = 0.28), containing a greater proportion of large glades (r = 0.25), and a greater frequency of sheep grazing (r = 0.26). Field layer species associated with these open, moderately acid and intermediately fertile sites include Anthoxanthum odoratum, Agrostis capillaris, Galium saxatile and Potentilla erecta, typical of NVC W11, upland Quercus-Betula woods.

The negative end of axis 2 highlights drier communities, dominated by Quercus spp. (r = −0.22), with a greater cover of litter (r = −0.53) and associated with Pteridium aquilinum, suggesting W10 (the lowland analogue of W11, dominated by Q. robur, rather than Q. petraea). By contrast, species characteristic of wet habitats (marsh/bog, r = 0.36; stream/rivers, r = 0.24) dominated by Alnus glutinosa (r = 0.31), such as Filipendula ulmaria and Chrysosplenium oppositifolium, are found at the positive end of this axis, i.e. in the species-rich wooded stream-side habitats typical of W7.

contributions to overall effect

Variation partitioning (VP) model I {S∪P∪W} (Fig. 5a) explained 19.7% of variation within the data set. Plot- and site-scale variable sets accounted for 56.3% and 50.6%, respectively, but with a strong inter-set interaction (16.9%), suggesting that certain variables included in these sets may be important at both spatial scales. Together, site- and plot-scale variables can account for up to 90.0% of total variation explained (TVE). In contrast, woody species accounted for only 18.8% of the variation, of which 8.8% was shared with the other two sets.
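
The arithmetic linking these fractions can be checked with a short sketch. It assumes, as one reading of Fig. 5a rather than a description of the pCCA procedure itself, that every percentage is a share of TVE and that the 16.9% interaction is the overlap between the plot and site sets. The function and variable names are hypothetical and introduced only for illustration.

```python
# Illustrative check of the percentages reported for VP model I.
# Assumption: all values are percentages of total variation explained (TVE),
# and the 16.9% interaction is the overlap between the plot and site sets.

def union_share(set_a: float, set_b: float, shared: float) -> float:
    """Share of TVE covered by two overlapping sets (inclusion-exclusion)."""
    return set_a + set_b - shared

plot_share = 56.3         # plot-scale set, % of TVE
site_share = 50.6         # site-scale set, % of TVE
plot_site_overlap = 16.9  # plot/site interaction, % of TVE

woody_share = 18.8        # woody species set, % of TVE
woody_shared = 8.8        # part of the woody share overlapping the other sets

# Site and plot sets combined, counting the overlap only once.
site_plot_total = round(union_share(plot_share, site_share, plot_site_overlap), 1)
# Fraction attributable to woody species alone.
woody_unique = round(woody_share - woody_shared, 1)

print(site_plot_total)
print(woody_unique)
print(round(site_plot_total + woody_unique, 1))
```

Under these assumptions the combined site and plot share reproduces the 90.0% quoted above, and adding the unique woody species fraction accounts for the whole of TVE.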

Figure 5.

Procedural scheme of VP models, illustrated as Venn diagrams, showing proportion of total variation explained (TVE) in forest plant community composition (unique and shared components), attributable to nested sets of environmental variables, as summarized from pCCA: (a) general model, with CCA biplot (Figure 4) of combined variable sets; (b) site variables; (c) boundary/deer subsets; (d) plot variables, with CCA model and biplot (Figure 6) of management variables only; (e) woody species variables; (f) plot, including forest structure variables against site variables; and (g) plot against forest structure variables only, with CCA model and biplot (Figure 7) of woody structural variables only. Area not shown to scale.

contributions to site effect {S}

Within the site set (VP model II, F = 4.190, P≤ 0.001, λ1 = 0.304, λ2 = 0.154) climate, spatially structured variables and management of site accounted for 31.0%, 20.1% and 53.4%, respectively, of TVE (Fig. 5b).

Analysis of VP model III (F = 2.575, P≤ 0.001, λ1 = 0.062 and λ2 = 0.054) (Fig. 5c) showed that the boundary type subset {BB} could account for the majority (75.5%) of variation explained by site management, while presence of deer species {DB} may account for up to 26.9%. Total variation explained by boundary type (75.5% × TVE of VP model III) was found to be almost twice that explained by site spatial variables {SS} (20.1% × TVE of VP model II): 2.4% vs. 1.3%.

contributions to plot effect {P}

The geo-spatial subset made the largest contribution to plot level variation (VP model IV, F = 3.279, P≤ 0.001, λ1 = 0.341, λ2 = 0.193), but there was also an interaction (3.9%) with the biotic subset, because certain biotic variables have a geo-spatial component. For example, bryophytes, red deer and marsh/bogs tend to be more common in forests to the north and west of Britain. Management contributed 19.7% to TVE, including interactions with other sets (Fig. 5d).

Management variables were modelled using pCCA (F = 1.916, P≤ 0.001, λ1 = 0.046, λ2 = 0.031). The principal geo-spatial/moisture gradients (Figs 3 and 4) were largely removed as a result of employing the remaining variable sets as covariates. However, the ordination biplot (Fig. 6) and bivariate SD ellipse analysis (inset) illustrate correlations of plant species and forest specialist groupings with management. The size, shape and orientation of the SD ellipses suggest that species considered to be forest specialist species in most areas of the country are strongly associated with traditional forest management, such as coppicing. As a group, species considered to be specialists in only some regions have a broader floristic response and are more loosely associated with these variables, while the ellipse representing non-specialists encompasses all management types. Coppice stools, small forest paths, large stumps and dry ditches were associated with a flora more typical of shaded conditions, including several forest specialist species, such as Galium odoratum and Carex sylvatica (unlabelled, at the centre of the biplot). In contrast, grassland species (e.g. Carex caryophyllea, Phleum pratense, Galium pumilum) were strongly correlated with the presence of livestock, while forest specialists were negatively correlated with variables suggesting strong grazing pressure. Seedlings of the introduced shrub Prunus laurocerasus, planted widely for landscaping purposes and game cover, were strongly associated with the presence of spent cartridges. Other plants associated with this variable included the lowland species Centaurium erythraea and Atropa belladonna, often found in open or disturbed conditions, along with Hypericum androsaemum, a forest specialist in southern England, but also occasionally planted. Old coniferous stumps and seedlings of commercial softwood trees (pine and larch) were associated with a range of species tolerant of infertile and acid soils, such as Eriophorum vaginatum and Nardus stricta, but few forest species.

Figure 6.

CCA biplot showing significant management variables {((MP)∩GP′∩BP′)∩S′∩W′} (all nominal) and all plant species. Seventy species at the edge of cluster labelled. For species nomenclature, see Fig. 3. Full species list is available in Appendix S4. Inset: position of forest specialist groupings, displayed as bivariate SD ellipses, for CCA ordination axes 1 and 2. The groupings are: species considered to be forest specialists across much of the country, A; species considered to be forest specialists in some regions only, B; and other forest and non-forest species, C.

contributions to woody species effect {W}

Variation partitioning model V (F = 3.662, P≤ 0.001, λ1 = 0.163, λ2 = 0.097) (Fig. 5e) examined the influence of woody species composition. The greatest proportion of total variation explained, 69.9%, was accounted for by canopy tree species, rather than saplings or shrubs. Overall, the variation explained by tree species (69.9% × TVE of VP model V) was similar to that accounted for by site boundary type (75.5% × TVE of VP model III) and site spatial variables (20.1% × TVE of VP model II): 1.4%, 2.4% and 1.3%, respectively.
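
The conversion between a subset's share of a model's TVE and its absolute contribution can be sketched as below. This is an illustrative back-calculation and not part of the original analysis: it assumes that each absolute figure is simply the subset's share multiplied by the TVE of its model, so that dividing one by the other recovers an approximate model TVE. The function name and dictionary are hypothetical, and because the inputs are the rounded values quoted above, the results are approximate.

```python
# Illustrative back-calculation; not part of the original analysis.
# Assumption: absolute contribution = (share of model TVE) x (model TVE),
# so the model TVE can be approximated by dividing one by the other.
# Inputs are rounded values from the text, so results are approximate.

def implied_model_tve(share_of_tve_pct: float, absolute_pct: float) -> float:
    """Approximate model TVE (%) implied by a subset's share and its absolute contribution."""
    return absolute_pct / (share_of_tve_pct / 100.0)

# (share of model TVE in %, absolute variation explained in %)
reported = {
    "tree species (VP model V)": (69.9, 1.4),
    "boundary type (VP model III)": (75.5, 2.4),
    "site spatial variables (VP model II)": (20.1, 1.3),
}

implied = {
    name: implied_model_tve(share, absolute)
    for name, (share, absolute) in reported.items()
}

for name, tve in implied.items():
    print(f"{name}: implied model TVE of about {tve:.1f}%")
```

The sketch makes explicit that the three absolute values are comparable only because a large share of a small model TVE and a small share of a larger model TVE can yield similar products.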

contributions to forest structural effect {P+W}

A second suite of analyses considered total live and dead basal area and density of tree, sapling and shrub stems, as predictors of ground flora composition. Forward selection identified three significant stand structure variables {WP+W}: total number of live shrub stems (P≤ 0.004, F = 2.63), total tree live basal area (P≤ 0.002, F = 2.37) and total sapling live basal area (P≤ 0.004, F = 1.70). These were added to the significant plot variables remaining {PP+W}, to generate the set {P+W} (Fig. 2). This set was assessed alongside site {S}, in VP model VI (F = 3.220, P≤ 0.001, λ1 = 0.390, λ2 = 0.216). Total variation explained by {P+W∪S} (Fig. 5f) was 17.7%, slightly less (2.0 percentage points) than that explained by model I {S∪P∪W} (Fig. 5a). The difference between models I and VI suggests that the part played by individual tree, sapling and shrub species in determining the composition of forest ground flora is small, relative to their effect through bulk cover or number of stems per stand.

For VP model VII, the three variables comprising {WP+W} were compared against remaining plot variables, {PP+W} (Fig. 5g) (F = 2.379, P≤ 0.001, λ1 = 0.182, λ2 = 0.121). Plot variables accounted for 94.7% of total variation explained. Only 6.7% of variation explained by this model was attributable to stand structure, of which 1.4%, or one-fifth, was shared with {PP+W}.
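
As a consistency check on these fractions, and assuming that they combine by simple inclusion-exclusion (an interpretation of Fig. 5g, not a statement of how pCCA computes them), the shares can be written as follows. The symbols $f_{P}$, $f_{W}$ and $f_{P \cap W}$ are introduced here for illustration only and denote the plot share, the stand structure share and their shared component, each as a percentage of TVE.

```latex
\begin{align*}
f_{P} + f_{W} - f_{P \cap W} &= 94.7\% + 6.7\% - 1.4\% = 100.0\% \\
\frac{f_{P \cap W}}{f_{W}} &= \frac{1.4}{6.7} \approx 0.21 \approx \tfrac{1}{5}
\end{align*}
```

On this reading, the fraction unique to stand structure is $6.7\% - 1.4\% = 5.3\%$ of the variation explained by model VII.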

Ordination (pCCA) of stem area and density variables only (Fig. 7) (F = 2.262, P ≤ 0.001, λ1 = 0.032, λ2 = 0.029) illustrates two orthogonal gradients, suggesting a divergent response of field-layer species to shrub- and canopy-layer structure. Fraxinus excelsior seedlings are shown to be associated strongly with overstorey structure, while seedlings of Quercus spp. are found in more open habitats. In contrast, Acer pseudoplatanus seedlings appear to be correlated negatively with an increase in sapling basal area and shrub stem density, rather than overstorey structure. Field layer forest specialists showed little response to increasing understorey vegetation, but were more characteristic of forest patches with a larger basal area index, contrasting strongly with the grasses and associated species of more open woodland and grass/heath (e.g. Galium saxatile). However, although site and plot variables have been removed as covariates, a residual geo-spatial gradient may remain: the contrast between G. saxatile and Anthoxanthum odoratum on the one hand and F. excelsior on the other, may reflect an upland/lowland, acid to base-rich gradient, roughly paralleling increasing tree basal area.

Figure 7.

CCA biplot of stand structure and plant species {(WP+W∩PP+W′)∩S′}. Ordination of 62 species with N2 > 60. For vegetation nomenclature see Fig. 3.


Many semi-natural fragments of the formerly extensive British forests are relatively small (Spencer & Kirby 1992) and widely separated from one another (Kirby et al. 1994). They are frequently isolated within an agricultural matrix (Peterken 1993) and usually have a complex and intensive management history (Rackham 1980). The field layer tends to be more uniform in character in English lowland forest, where there is less geo-climatic variability than in the open forests of Scotland and western Wales. The imbalance in the geographical distribution of forest specialists, with a slight bias towards southern lowland forest types, may also reflect this, although it may be in part an artefact of the regions for which species lists were available (Kirby et al. 2006). As elsewhere (e.g. Hermy et al. 1999), many plant species found in the field layer were not forest specialists.

While British forests can differ greatly from those in North America or continental Europe (Bunce 1981), many of the factors found to be significant in this study (e.g. climatic, biotic, management, disturbance) also influence the distribution of specialist and non-specialist forest species elsewhere. Therefore, while the approach used here focused on those factors known to influence species composition in Britain, the methodology employed is of far wider applicability.

Variables that control species composition may operate at a range of scales considerably broader than the simple division of site and plot scales presented here. However, forward selection indicates the significance of any variable, regardless of the set into which it was placed, while intersecting variance regions illustrate where variables may operate at multiple levels, as suggested in Fig. 5(a). Our approach allows useful separation and quantification of responses with respect to the scale at which they are likely to exert greatest effect; this method can therefore be used to facilitate comparable assessments of variation in analogous ecosystems.

variation partitioning

The values of total variation explained (2.0–19.7%), derived from a data set containing a wide range of forest types sampled at a national level, are comparable with values reported elsewhere for smaller-scale analyses also based on unimodal response models, e.g. abundance of tree species in an area of c. 0.5 km² (sample area 40 000 m², TVE = 36.7%; Borcard et al. 1992) and seed bank data from a 2-ha forest stand (sample area 784 m², TVE = 8.0–20.6%; Olano et al. 2002). In addition, much of the apparent ‘unexplained variation’ is likely to be due to lack-of-fit of data to response models (Økland 1999).

A large pool of environmental variables was used because environmentally conditioned variation is underestimated if important variables are missing (Økland & Eilertsen 1994). To balance this and guard against overestimation of TVE, forward selection was employed to retain only significant and non-collinear variables. It is possible that the threshold of multicollinearity used was too generous and may have allowed inclusion of collinear climate/spatially structured variables, such as winter cloud cover with ground frost, or site altitude with easting. This may explain the apparent discrepancy in selection position and TVE between climate/spatial and boundary/grazing variables. Another possibility is that as gradient-based models fit data representing clear gradients (e.g. climate), before anthropogenically dependent and less consistent data (e.g. boundary structure), the latter may not produce a good fit along an early ordination axis, yet still explain much variation.

Nevertheless, the strength of this approach lies in provision of realistic relative values (Økland 2003), and these results provide the first quantitative assessment of factors strongly correlated with the floristic component of a range of forest types sampled across an entire country.

site-scale variables

Ordination of species within forest plots confirmed and elaborated the key ecological and geographical gradients described for Britain by Bunce (1981) and Rodwell et al. (1991). Forests with Fraxinus excelsior and an understorey of Corylus avellana and Crataegus monogyna displayed a strong association with deep, damp, base-rich soils, while those dominated by Alnus glutinosa were clearly associated with wet environments, such as streams and bogs.

There were also strong correlations between climate and vegetation composition. Precipitation and temperature gradients were found to be of particular importance, along with ground frost and cloud cover, again similar to the splits in the NVC between types typical of the south-east vs. the north-west.

Variation partitioning indicated that, at the site level, management factors, including herbivory by deer, boundary type and spatial variation, accounted for a large proportion of TVE. Wild ungulates are important in determining forest field layer structure across a wide range of ecosystems (e.g. McInnes et al. 1992; Augustine & McNaughton 1998). They can facilitate the long-distance dispersal of a large number of plant species (Vellend et al. 2003; Eycott et al. 2004), but are often associated with a decline of palatable species (Augustine & DeCalesta 2003; Kirby 2001). Forest plants are often poorly adapted to high grazing levels (Rackham 1980) compared with non-forest, particularly grassland, species and this was reflected in ordination analyses. However, while heavy grazing is often detrimental to growth of vascular plants, light grazing can be beneficial in controlling the spread of taller competitive or ruderal pioneer vegetation, thus reducing competition for regenerating seedling and floristic components (Kirby et al. 1994; Truscott et al. 2004). Oak (Q. robur, Q. petraea) seedlings in particular, are noted for their ability to establish in open grazed conditions (Vera 2000). In contrast, ash (Fraxinus excelsior) produces abundant seedlings and young saplings under canopy, but these are easily lost to grazing (e.g. Crampton et al. 1998).

Boundary type, which was also found to be a significant factor, can influence vegetation in a variety of ways, acting as a physical barrier to ungulate movement or, in the case of hedges, acting as refugia (McCollin et al. 2000; Smart et al. 2001) or dispersal pathways for forest herb species, as demonstrated in central New York, USA (Corbit et al. 1999). Adjacent water bodies such as streams and rivers aid dispersal of both terrestrial and riparian plant species (e.g. Johansson et al. 1996; Merritt & Wohl 2002), and can considerably extend the range of those with a limited terrestrial dispersal capacity (Boedeltje et al. 2003). However, in this analysis, the road variable was selected before all other boundary types. In addition to facilitating the spread of exotic species into forests (Watkins et al. 2003), roads are known to have associated effects that alter interior-forest conditions; petrol combustion and the application of salt can lead to an increase of N, Na, Mg and Ca, which significantly alters vegetation and soil composition (Bernhardt et al. 2004). Clearly, more research is required into the effects of this ‘sleeping giant’ (Forman & Alexander 1998).

Forest spatial variation (e.g. shape and size) affects both structure and dynamics of species assemblages in a number of ways, including invasion of forest edges by weeds (Honnay et al. 2002b), pesticide and fertiliser impacts (Gove et al. 2004) and area-dependent extinctions (Jacquemyn et al. 2002). The spatial component of surveyed forests was found to be important, despite the fact that the same area was sampled per site, regardless of site size, which may have resulted in an underestimation of the effect of spatial dynamics; perhaps as a consequence, total variation explained by this set was smaller than that of other site-scale sets.

Gradient analysis using ordination demonstrated that, at the time of the survey, fallow deer were a predominantly lowland species (Arnold 1993), whereas sheep were more typical of upland landscapes (Fuller & Gough 1999). Following roughly the same geographical pattern, thick hedges, banks and ditches were more frequent lowland boundaries, and water a more typical upland forest boundary. However, there was little interaction between the effects of spatial and boundary/grazing sets, suggesting that boundary type and grazing are not acting primarily as a surrogate for larger scale spatial variation, such as surrounding land use or geo-climatic variability, but are significant in their own right. Thus, while an understanding of forest spatial dynamics is important, buffer zones and provision of appropriate exclosures in areas of significant species loss (e.g. McInnes et al. 1992; Cooke 2002), or desired expansion (Romagosa & Robison 2003; Palmer et al. 2004; Truscott et al. 2004) may prove as, or more, significant to the delivery of conservation objectives.

plot-scale variables

Local factors, such as soil pH, slope and the presence of wet areas or large glades, accounted for a large fraction of plot level variation. Canopy gaps can play a major role in maintaining or enhancing field layer species diversity (Peterken & Francis 1999; Ott & Juday 2002; Rantis & Johnson 2002). The variable representing current large open areas within forests (glades > 12 m) was selected relatively early for the plot model, whereas those suggestive of recent gaps (fallen uprooted tree, fallen broken tree) entered very late. Small gaps may therefore have only a minor and transient effect (Reader 1987; Peterken & Francis 1999), promoting whatever species diversity is already locally present. They may therefore uncouple control of species richness and abundance from resource-based niches, as observed in the neo-tropical forests of Central America (Hubbell et al. 1999).

Plot level management accounted for a significant proportion of TVE, although other variables included in the biotic set (e.g. glades, dead wood, etc.), may also have been the result of human activity. Ordination and bivariate SD ellipse analysis suggested that forest specialist species were strongly associated with coppice stools and cut stumps, including vernal species (e.g. Hyacinthoides non-scripta), shade tolerators (e.g. Lonicera periclymenum), and species that can survive shade phases in the seed bank or on rides and clearings, such as Veronica montana and Lysimachia nemorum (Mason & MacDonald 2002). This may represent an historic association with the long tradition of coppicing in Britain, which persists despite the widespread abandonment of the practice that occurred during the 20th century (Peterken 1993). In contrast, old conifer stumps within surveyed forest patches were found to be associated with species of open, rather than forest interior habitats, reflecting the history of forestry practice in the UK. Since the mid-1900s many woodlands have been planted or extensively under-planted with non-native conifers such as Picea sitchensis, Larix spp. and Pinus sylvestris beyond its natural range (Peterken 1993; Truscott et al. 2004). Recovery of former native forest, through removal of introduced plantation species (as encouraged by a recent policy shift; The UK Steering Group 1995) may not therefore, at least in the short term, necessarily result in the expansion of desired native forest species (e.g. Meier et al. 1995). Outcomes will depend on factors such as productivity of the forest soil and composition of the site species pool (Romagosa & Robison 2003).

CCA illustrated that the presence of cattle and sheep was strongly correlated with non-forest field layer species of grass- or heath-land. Our findings are thus consistent with other research, which suggests that over-grazing by livestock can reduce survival and growth of forest field layer species (Gustafsson 2004) and heavily influence woodland regenerative success (Pigott 1983).

Spent cartridges, Prunus laurocerasus and, by implication, management for game such as the common pheasant (Phasianus colchicus), were not closely associated with any forest specialists, or with many forest species. Pheasant release pens have more bare ground, a reduced vegetation structure, and lower species diversity and cover of herbs, compared with control areas (Sage et al. 2005).

woody species and forest structure

Forest succession models (Shugart 1984) have developed considerably in recent years, particularly in the USA (e.g. Robinson & Ek 2003; Busing & Mailly 2004). However, successful linking of gap models with field layer dynamics will need to place the complex relationships between the light environment and herbaceous species within the wider context of other factors contributing to field layer composition (Weisberg et al. 2003). On any particular site, clear differences in the ground flora between tree species are apparent (e.g. Mitchell & Kirby 1989), but this study demonstrated that the association between field-layer vegetation composition and particular tree and shrub species may not be that strong, relative to other plot and site conditions. Stand-based variables (basal area and stem density) also explained only a relatively small amount of variation in the data set, compared with other plot scale variables. This finding has important implications for the development of models based on stem data.

However, the wide geographical spread of this study may have prejudiced the analysis against variables important at finer scales, e.g. light penetration and tree species composition (Graae et al. 2004). Thus the effects of the local structural environment may have been swamped by the stronger pull of countrywide gradients such as pH or climate. Moreover, NWS sampling, in common with other survey work, was optimized to capture floristic variation, rather than gradients of potential explanatory variables. Therefore some gradients may, by chance, be longer than others and thus able to explain more variation. In addition, direct factors such as heavy grazing by sheep can create ‘bowling green’ swards (Fuller & Gough 1999) under the tree canopy, composed of species more characteristic of open habitats. This may cause the response of field layer species to become statistically uncoupled from passive drivers such as relative light regime. Recent work on changes in the vegetation of these plots from 1971 to 2001 indicates that changes in nitrogen regime may also interact with responses to changed light conditions (Kirby et al. 2005). Further research is required to test these hypotheses.


This study has sought to apply a range of modern multivariate techniques to a large-scale survey of forest patches sampled at a national level. Using variation partitioning, we have considered the relative variation due to site, plot and woody species/forest structure variable sets. In doing so, we have provided a quantitative assessment of the factors affecting a range of forest types in a temperate-maritime climatic envelope. Applying this approach to analogous forest ecosystems found elsewhere in continental Europe, Asia and North America would improve our understanding of the dynamics of the environmental factors driving temperate forest field layer species composition and distribution in other regions of the world.


We thank Henry Arnold (Biological Records Centre, Monks Wood) and Hugh McAllister (University of Liverpool) for technical support. We are grateful for the valuable reviews of two anonymous referees and the BES editorial staff, who provided constructive comments on previous manuscript versions. P.M.C. received financial support from English Nature and CEH.