Concordance between original and rescaled values
Rescaling by regression produced small gains in accuracy over rescaling by a simple offset (Table 2), except for salt tolerance, for which there was a distinct loss of accuracy. The root mean squared errors (RMSE) were in the range 0·65–1·45 for rescaling by offset X(3) and 0·75–1·33 for rescaling by local regression X(R). The RMSE for salt tolerance was lower than that for the other variables because the great majority of species have no salt tolerance and were not predicted to have any. Expressed as RMSE, predictions of R-values were generally the worst.
Table 2. Errors and correlations between original (X(0)) and repredicted (X(3), X(R)) species indicator values. The scales are represented by their standard German abbreviations: L = light, T = temperature, K = Continentality (Kontinentalität), F = moisture (Feuchtigkeit), R = reaction, N = nitrogen, S = salt
|Scale|L|T|K|F|R|N|S|
|RMSE|
|Repredicted by offset only (X(3))|1·03|0·80|1·34|1·04|1·45|1·41|0·65|
|Repredicted by regression (X(R))|1·01|0·75|1·27|1·03|1·33|1·32|0·75|
|Correlation with original values|
|Repredicted by offset only (X(3))|0·71|0·75|0·39|0·91|0·68|0·77|0·88|
|Repredicted by regression (X(R))|0·72|0·79|0·49|0·91|0·74|0·81|0·85|
With the exception of continentality values, K, the correlations of X(3) and X(R) with the original variable X(0) were in the range 0·68–0·91 (Table 2). The best correlation was for moisture, F, whose scale runs from 1 to 12 and is longer than the others. For continentality, K, the correlation between rescaled and original variables was much lower, 0·39 for K(3) and 0·49 for K(R).
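The two summary statistics reported in Table 2 can be sketched as follows. This is a minimal illustration only: it assumes that the correlation is the ordinary Pearson coefficient and that both statistics are taken over species with a value on both scales, neither of which is stated explicitly here. The function names and the five pairs of values are hypothetical, not data from the study.

```python
import math

def rmse(original, repredicted):
    """Root mean squared error between paired indicator values."""
    squared = [(r - o) ** 2 for o, r in zip(original, repredicted)]
    return math.sqrt(sum(squared) / len(squared))

def correlation(original, repredicted):
    """Pearson correlation between paired indicator values (assumed form)."""
    n = len(original)
    mean_o = sum(original) / n
    mean_r = sum(repredicted) / n
    cov = sum((o - mean_o) * (r - mean_r)
              for o, r in zip(original, repredicted))
    ss_o = sum((o - mean_o) ** 2 for o in original)
    ss_r = sum((r - mean_r) ** 2 for r in repredicted)
    return cov / math.sqrt(ss_o * ss_r)

# Hypothetical values: X(0) on the original scale, X(R) after reprediction.
x0 = [2, 4, 5, 7, 8]
xr = [3.0, 4.5, 5.0, 6.5, 7.0]

print(round(rmse(x0, xr), 3), round(correlation(x0, xr), 3))
```

In this made-up example the repredicted values are compressed towards the middle of the scale, so the correlation is very high while the RMSE remains well above zero. The two statistics therefore measure different aspects of concordance.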
Much of the RMSE for all variables could be attributed to systematic error in the correspondence between rescaled values and original values (Table 3). For example, the mean for K(R) only rose from 2·7 when K(0) = 1 to 4·1 when K(0) = 7. The origin of both systematic error and individual discrepancies can be illustrated by considering the following particular cases.
Table 3. Numbers of species and their mean repredicted values in relation to original scale values X(0). The two repredictions are reprediction by offset only (X(3)) and reprediction by two-way averaging followed by local regression (X(R)). The standard abbreviations for the scales, L, T, K, F, R, N and S, are explained in Table 2. Note that the original scales are of differing length, the shortest being K and the longest F. The value * denotes a missing value, either × or ? in the original enumeration, or a species not included in the original list
|Value on original scale X(0)|
|Numbers in category|
|Mean values, reprediction by offset X(3)|
|Mean values, reprediction by regression X(R)|
The largest discrepancy for light was in the values for Oxalis acetosella. For this species, L(0) was 1 and the repredicted values L(3) and L(R) were 4·8 and 5·2, respectively. Stage 3 of the analysis is illustrated in the regression shown in Fig. 1. The rescaled light values from stage 2 are the x-axis and the original light values L(0) are the y-axis. It is clear that O. acetosella had an exceptionally low L(0) value. Indeed, for no other species was the original light value 1. In Britain, O. acetosella is undoubtedly a shade plant (Packham 1978), but it is no more extreme in this respect than many ferns. Even Epipogium aphyllum and Neottia nidus-avis, orchids that completely lack chlorophyll and live in the darkest parts of woods, were rated L(0) = 2. Thus there appears to be some inconsistency in the original values.
It is fairly clear from Fig. 1 why the repredicted L-value should be about 5. Of the 20 most similar species, only the ferns Athyrium filix-femina, Dryopteris dilatata and Polystichum aculeatum and the woodland herb Viola reichenbachiana had L(0) < 5. Several of the species with higher light values were trees. It could well be argued that the trees are not similar to a small herb and that they ought to have been left out of the analysis. If trees are omitted from the regression then L(R) = 4·9, a slightly lower value but not greatly different.
Other species for which L(R)–L(0) ≥ 3 were Brachypodium sylvaticum (L(0) = 3, L(R) = 6·0), Cardamine pratensis (4, 7·2), Epipactis helleborine (3, 6·5), Huperzia selago (4, 7·5), Lysimachia nemorum (2, 5·0), Moneses uniflora (4, 8·0), Orthilia secunda (4, 7·3), Pyrola rotundifolia (4, 7·4) and Phegopteris connectilis (2, 5·2); after the first species, values are given in the same order, with the names L(0) and L(R) omitted for brevity. Of these, B. sylvaticum is a plant of woodland margins, C. pratensis grows mainly in open pastures and marshes, and H. selago is a species of open moorland, while L. nemorum and P. connectilis are much like O. acetosella in their light requirements. The high L(R) for P. rotundifolia reflects the fact that it often occurs on dunes, parasitizing the mycorrhiza of Salix repens (as does Monotropa hypopitys, which was not included in our quadrat samples). On the other hand, the original values for E. helleborine, M. uniflora and O. secunda did seem more appropriate than the recalculated values. These discrepancies were probably due to quadrat size. Quadrats of at least 200 m² were used in woodland, and these may have included both dark and light parts.
There were only three species for which L(R)–L(0) ≤ −3, namely Juniperus communis (8, 4·7), Narcissus pseudonarcissus (8, 4·6) and Fallopia japonica (8, 5·0). Juniperus communis forms moderately dense scrub in a few localities in Scotland (McVean & Ratcliffe 1962) and these were perhaps disproportionately sampled. The low L-value was attributable to plants growing in its shade. The problem of layered communities with trees was noted in relation to O. acetosella; the phanerophytes Betula pendula (7, 5·1), B. pubescens (7, 5·3), Malus sylvestris (7, 4·2), Quercus robur (7, 5·2) and Rhamnus cathartica (7, 4·9) all showed the same phenomenon but to a lesser degree. The situation is further complicated by the fact that the L-values for trees were intended to describe the requirements of juveniles rather than mature individuals. Narcissus pseudonarcissus frequently occurs in woodland in Britain, but it is also found in the open, and the value of L(R) = 4·6 is undoubtedly too low. Unintentional recording bias is the explanation; all but two of the records were from British Plant Communities (Rodwell 1991a, 1991b, 1992, 1995) woodland tables (ashwoods of type W8 and oakwoods of type W10). Fallopia japonica, familiar to city dwellers as a denizen of dark, squalid, derelict land, is also likely to have been sampled in a biased way. It is mainly a plant of half-shaded river banks and roadsides. A value of L = 6 would be more appropriate.
These discrepancies were notable but not especially numerous. Discrepancies greater than 2 in either direction amounted to 64 out of the 1060 species for which comparisons were possible. The majority of these discrepancies had an obvious explanation. Where they were adjusted to a value that was more suitable for Britain, they were an intended consequence of the reprediction.
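The screening for discrepant species described above can be sketched as below. The (L(0), L(R)) pairs are those quoted in the text for a handful of species; the function name and the dictionary layout are hypothetical and serve only to illustrate the comparison against a threshold of 2.

```python
# (L(0), L(R)) pairs quoted in the text for a few species.
light = {
    "Oxalis acetosella": (1, 5.2),
    "Brachypodium sylvaticum": (3, 6.0),
    "Juniperus communis": (8, 4.7),
    "Betula pendula": (7, 5.1),
    "Quercus robur": (7, 5.2),
}

def discrepant(values, threshold=2.0):
    """Species whose repredicted value differs from the original
    by more than `threshold` in either direction."""
    return {
        name: round(repredicted - original, 1)
        for name, (original, repredicted) in values.items()
        if abs(repredicted - original) > threshold
    }

flagged = discrepant(light)
for name, difference in flagged.items():
    print(f"{name}: {difference:+.1f}")
```

With these five species, the three with differences of 3 or more in magnitude are flagged, while the two trees with differences just under 2 are not.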
Repredicted F-values were well correlated with original F-values F(0) (Table 2). Caltha palustris provides a second illustration of the use of regression for reprediction (Fig. 2). In this example, C. palustris occupies a rather extreme position on the x-axis, so that its repredicted value, F = 8·6, is appreciably larger than F = 7·4, the average moisture value of the 20 most similar species. The off-centre position of C. palustris reflects the fact that, although the species is always terrestrial, it is confined to localities close to the water table, while several similar species, such as Angelica sylvestris, Cardamine pratensis and Juncus effusus, occur in a wide range of other habitats, usually moist but not necessarily wet. The reprediction for C. palustris confirms the original Ellenberg value F = 9.
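The effect described for C. palustris, where a species at the edge of its group of similar species receives a reprediction that differs from the group mean, can be sketched with an ordinary least-squares line. This is an illustration under stated assumptions: a simple straight-line fit is assumed for the local regression, only five neighbours are used instead of 20, and all numbers and names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical neighbours: rescaled values (x-axis) and original values (y-axis).
rescaled = [1.0, 2.0, 3.0, 4.0, 5.0]
original = [5, 6, 6, 7, 8]

slope, intercept = fit_line(rescaled, original)

# A focal species lying at the upper end of the x-axis.
focal_rescaled = 5.0
repredicted = intercept + slope * focal_rescaled
neighbour_mean = sum(original) / len(original)

print(round(repredicted, 2), round(neighbour_mean, 2))
```

Because the focal species lies at the upper end of the x-axis, the fitted line gives it a value above the plain average of its neighbours. The text describes the same pattern for C. palustris.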
Of the 11 species for which F(R)–F(0) ≥ 3·0, only the reprediction for Agrostis vinealis (F(0) = 2, F(R) = 6·1) was perhaps correct. In Britain, A. vinealis is one of the most common species of heaths and moors, including wet moorland on peaty gleys; it is certainly not confined to dry ground. Repredictions for Chenopodium murale (4, 8·5), Elatine hydropiper (5, 11·0) and Platanthera bifolia (5, 8·0) correctly raised the F-values, but raised them too much. Repredictions for Eleocharis parvula (10, 15·1), Filago lutescens (3, 7·1), Hesperis matronalis (7, 10·0), Juncus filiformis (9, 13·9), Moneses uniflora (5, 8·6), Ophrys insectifera (4, 9·1) and Zostera noltii (12, 15·8) were definitely unsatisfactory; indeed, three of these values were larger than the scale maximum of 12.
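Repredictions that fall outside the range of the scale can be picked out with a simple bounds check, sketched below. The F(R) values are those quoted above for the seven unsatisfactory repredictions, and the limits are the F scale limits of 1 and 12. The check itself is an illustrative addition and is not described as part of the analysis.

```python
F_MIN, F_MAX = 1, 12  # limits of the moisture scale

# F(R) values quoted in the text for the unsatisfactory repredictions.
repredicted_f = {
    "Eleocharis parvula": 15.1,
    "Filago lutescens": 7.1,
    "Hesperis matronalis": 10.0,
    "Juncus filiformis": 13.9,
    "Moneses uniflora": 8.6,
    "Ophrys insectifera": 9.1,
    "Zostera noltii": 15.8,
}

# Species whose repredicted value lies outside the scale.
out_of_range = sorted(
    name for name, value in repredicted_f.items()
    if not F_MIN <= value <= F_MAX
)
print(out_of_range)
```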
There were only six species for which F(R)–F(0) ≤ −3·0. Brassica nigra (8, 4·9) and Carex vaginata (9, 5·6) were adequately repredicted; Carex divisa (9, 5·7), Ranunculus sardous (8, 4·6) and Sesleria caerulea (8, 4·1) were shifted in the right direction, but too far. Sesleria caerulea is abundant on limestone in northern England, but occurs in dry grassland as well as in mires. In any case, the discrepancy is basically taxonomic, because British authors unite S. caerulea (F = 8) and S. albicans (F = 4), which are distinguished in the list of Ellenberg et al. (1991). Crassula tillaea (7, 2·8) is a difficult case, growing where puddles form in winter but dry out in summer. It is often in close proximity to dry-ground therophytes, but at the scale of the individual puddle demands moisture.
As with the L-values, these discrepancies were very much the exception. Out of 1006 species for which a comparison was possible, only 65 had F(R) differing by 2 or more from F(0).
Repredicted R-values had the highest RMSE (Table 2). Even with this variable, most species were unproblematic. As an example, consider Nardus stricta (Fig. 3). Many of its typical associates in heathy grassland had R(0) = 1 or 2. However, it is also often associated with bryophytes and small herbs in closely grazed, base-enriched flushes and runnels. Its repredicted value R(R) = 3·0 was one unit larger than its original value, reflecting this additional range of habitats.
For 16 species, R(R)–R(0) ≥ 3. For Centaurea nigra (3, 6·5), Epilobium lanceolatum (3, 6·5), Filago pyramidata (4, 7·3), Juncus ambiguus (4, 7·0), Minuartia stricta (2, 8·0), Poa chaixii (3, 6·3) and Subularia aquatica (2, 5·0), the repredicted values seemed appropriate enough. Indeed, the most common of these species, C. nigra, is not markedly calcifuge in Britain and is often found on basic soil. For Asplenium adiantum-nigrum (2, 6·9), Coeloglossum viride (4, 7·0), Elatine hydropiper (2, 6·6), Holcus mollis (2, 5·0), Oxyria digyna (3, 6·2), Teesdalia nudicaulis (1, 4·0), Trifolium arvense (2, 6·2) and T. striatum (2, 7·3), the repredicted values were arguably closer to the true values than the original values, although they were probably too large. For Radiola linoides (3, 6·4), however, R(R) did appear to be a clear overestimate.
For 15 species, R(R)–R(0) ≤ −3. For Carex aquatilis (7, 3·8), Pinus nigra (9, 4·8), Utricularia intermedia (8, 4·3) and Viola lutea (8, 4·9), the repredicted values were credible. Pinus nigra is often planted on heathland in Britain and regenerates there from seed; V. lutea is characteristic of acid, but not heathy, grassland. With Mentha spicata (9, 5·3), Schoenus nigricans (9, 5·4) and Silene acaulis (8, 4·5), the true value probably lies between the original and the repredicted one. Schoenus nigricans is a marked calcicole in eastern Britain, but extends to heathland with basic flushing in the south and west, and in western Ireland also to blanket bog (Sparling 1967a, b). Silene acaulis is commonly found on acid soils but is not a strong calcifuge. For the remaining eight species, namely Astragalus danicus (9, 5·5), Carex appropinquata (9, 5·5), C. montana (6, 1·0), Ornithogalum angustifolium (7, 3·9), Pinguicula vulgaris (7, 3·8), Salix purpurea (8, 4·8), S. reticulata (9, 5·8) and Veronica spicata (7, 3·8), the reprediction was unsatisfactory. Pinguicula vulgaris, like S. nigricans, often occurs in an acid matrix but in sites with at least slight basic flushing. The other species are all uncommon or rare, some extremely so. They were not (except for four records of S. reticulata) found by the Countryside Survey 1990. Carex montana was recorded by the British Plant Communities surveyors from Ulex gallii heath, O. angustifolium from a weed community, S. purpurea from swamp woodland and V. spicata from sandy Breckland grassland (no doubt with small-scale variation in pH). There were relatively numerous records of A. danicus, C. appropinquata and S. reticulata. Each of these species was recorded from several communities with a wide range of associates. The discrepancies were therefore almost all the result of calcicolous species occurring in a matrix of calcifuge or tolerant plants.
In spite of these discrepancies, the majority of the repredicted values gave a better indication of British pH preferences than the original ones. Examples of improvements are Hyacinthoides non-scripta (7, 5·3), Saxifraga aizoides (8, 6·1) and Tussilago farfara (8, 6·3). These are not strong calcicoles; indeed H. non-scripta is one of the most common species of woodland on acid brown earths. Likewise Carex arenaria (2, 4·9), Polygala vulgaris (3, 5·5) and Teucrium scorodonia (2, 4·5) can all occur on basic soils.
Only 854 species were originally scored for reaction. In 123 of these, R(R) differed by 2 or more from R(0).
Hill & Carey (1997) and Ertsen, Alkemade & Wassen (1998) have pointed out that Ellenberg N-values are, in effect, indicators of general fertility rather than nitrogen in particular. Urtica dioica, well known as an indicator of fertile conditions and of soil phosphorus as much as nitrogen (Pigott & Taylor 1964), can serve as an example here (Fig. 4). Most of the 20 similar species were common associates of U. dioica, occurring in fertile woods and fields and on waysides and waste ground; Galium aparine was the most similar species. Agrimonia procera, Agrostis gigantea, Fallopia japonica and Laburnum anagyroides were relatively uncommon species that occupied outlying positions in the diagram. None of these was particularly similar to U. dioica in its habitat requirements, although A. gigantea frequently occurs on fertile roadsides and F. japonica grows beside streams.
The small tree Laburnum anagyroides was the least similar to U. dioica of the selected species, and was by far the most aberrant in its N-value. In Britain, it is an introduced species, commonly grown as an ornamental in gardens. It is frequent as a garden escapee on waste ground but does not occur in semi-natural habitats. The Ellenberg value N = 3 is based on its natural occurrence in the Alps and is wildly unrealistic for Britain. This one species had a distinct effect on the repredicted value of N for U. dioica. If it was included, N(R) = 7·4; without it, N(R) = 7·7. This value, 1·3 less than the scale maximum of 9, is in fact appropriate; U. dioica occurs too widely in the British uplands, in localities that are fertile but not extremely so, to qualify for the scale maximum.
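The sensitivity of a reprediction to a single aberrant neighbour, as with L. anagyroides here, can be explored by refitting the regression with that species left out. The sketch below assumes a straight-line least-squares fit and uses entirely hypothetical numbers; the last neighbour plays the part of a species with an unrealistically low original value.

```python
def predict(x, y, x_new):
    """Least-squares straight-line prediction at x_new."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y + slope * (x_new - mean_x)

# Hypothetical neighbours; the last one has an aberrantly low original value.
rescaled = [4.0, 5.0, 6.0, 7.0, 8.0, 7.0]
original = [5, 6, 7, 8, 9, 3]
focal = 7.5  # rescaled value of the focal species (hypothetical)

with_all = predict(rescaled, original, focal)
without_last = predict(rescaled[:-1], original[:-1], focal)

print(round(with_all, 2), round(without_last, 2))
```

In this made-up case, dropping the aberrant neighbour raises the prediction, which is the same direction of change as reported for U. dioica when L. anagyroides is excluded.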
For 15 species, N(R)–N(0) ≥ 3. For Bromus racemosus (5, 8·0), Euphrasia nemorosa (1, 4·5), Laburnum anagyroides (3, 6·7), Lathyrus japonicus (3, 6·3), Lepidium latifolium (5, 8·1) and Petasites albus (5, 8·0), the repredicted values were arguably correct. Lathyrus japonicus is a plant of maritime shingle that often grows with weedy species in a habitat with little soil, but the others undoubtedly grow on soils that are generally more fertile than the original values N(0) would suggest. Laburnum anagyroides and P. albus are introduced to Britain, where they occur mainly in gardens. For Artemisia campestris (2, 6·9), Epilobium lanceolatum (3, 7·5), Filago pyramidata (1, 4·3), Lactuca serriola (4, 7·7), Rosa mollis (1, 5·8) and Vulpia myuros (1, 5·0), the reprediction differed from N(0) in the right direction but exceeded the true value. In several cases, of which the most extreme was V. myuros, the reason was that the main associates are weeds, which tend to have high N-values but which can sometimes grow in low-fertility habitats such as bare sandy ground and railway sidings. These are analogous habitats to that of Lathyrus japonicus. For Allium carinatum (2, 6·3), Filago lutescens (2, 5·2) and Myosotis discolor (2, 5·0), the reprediction seemed to be unsatisfactory.
For 13 species, N(R)–N(0) ≤ −3. For six of these, Adoxa moschatellina (8, 5·0), Athyrium distentifolium (7, 3·7), Atriplex littoralis (9, 5·9), Centaurium erythraea (6, 2·6), Huperzia selago (5, 2·0) and Leontodon hispidus (6, 2·7), the reprediction was credible. For Alopecurus aequalis (9, 3·8), Beta vulgaris (9, 5·7), Listera ovata (7, 3·3), Ornithogalum angustifolium (7, 1·7) and Rubus caesius (7, 3·7), the reprediction was in the right direction but overshot. For only two species, Mentha spicata (7, 3·8) and Reseda luteola (6, 2·8), did the reprediction appear to be totally unsatisfactory.
The overall success of N repredictions was only moderate. In 131 out of 988 possible comparisons, N(R) differed by 2 or more from N(0).