Demography and management of the invasive plant species Hypericum perforatum. I. Using multi-level mixed-effects models for characterizing growth, survival and fecundity in a long-term data set


Yvonne M. Buckley, NERC Centre for Population Biology, Imperial College, Silwood Park, Ascot, Berkshire SL5 7PY, UK (fax 01344 873173; e-mail


  1. Hypericum perforatum, St John's wort, is an invasive perennial herb that is especially problematic on waste ground, roadsides, pastures and open woodland in south-eastern Australia. We use detailed data from a long-term observational study to develop quantitative models of the factors affecting growth, survival and fecundity of H. perforatum individuals.
  2. Multi-level or hierarchical mixed-effects statistical models are used to analyse how environmental and intrinsic plant variables affect growth and reproduction within a complex nested spatial and temporal context. These techniques are relatively underused in ecology, despite the prevalence of multi-level and repeated-measures data generated from ecological studies.
  3. We found that plant size (rosette or flowering stems) was strongly correlated with all life stages studied (growth, probability of flowering, asexual reproduction, survival and fruit production). Environmental variables such as herbivory, ground cover and rainfall had significant effects on several life stages.
  4. Significant spatial variation at the quadrat level was found in the probability of flowering, flowering stem growth and fruit production models; variation at all other spatial levels in all models was non-significant. Yearly temporal variation was significant in all models where multi-year data were available.
  5. Plants in shaded habitats were smaller but had higher survival probabilities than plants in open habitats. They are therefore likely to have slightly different population dynamics.
  6. Synthesis and applications. Analysis of these models for H. perforatum has provided insights into which plant traits and environmental factors determine how populations increase and persist in exotic ecosystems, enabling population management strategies to be most effectively targeted. Spatially and temporally correlated data are often collected in long-term ecological studies and multi-level models are a way in which we can fully exploit the wealth of data available. Without these tools data are either underexploited or crucial assumptions of independence on which many statistics are based are contravened.


Invasive plants share the ability to colonize new habitats and increase in numbers to troublesome levels, their deleterious effects on artificial and natural ecosystems bringing them to our attention. We can say little more, however, on general ecological correlates of invasiveness in plants (Crawley 1986; Sakai et al. 2001). This has made the search for an invasive plant blueprint for predictive purposes largely unsuccessful [but see a profitable new approach by Kolar & Lodge (2001) that changes the focus by breaking ‘invasion’ down into several component processes]. Due to this lack of generality, understanding the autecology and population dynamics of the species under investigation is often of vital importance for successful control. By developing a quantitative understanding of the factors affecting growth, survival and fecundity, insights can be gained into what factors determine how populations increase and persist in exotic ecosystems. Hypericum perforatum L. (Clusiaceae) is an invasive perennial herb that is particularly problematic on waste ground, roadsides, pastures and open woodland in south-eastern Australia (Campbell, Briese & Delfosse 1995). Previous studies have characterized the qualitative factors affecting growth, survival and fecundity (Briese 1997b) but an in-depth, quantitative analysis is needed to provide a scientifically sound basis for management strategies with the greatest potential for control of this weed.

Individual plant size is often reported as a key correlate of plant fate (see references in Rees et al. 1999). Additionally, the impact of age and other state variables, e.g. flowering status in the previous year, and external environmental inputs, such as rainfall and ground cover (of conspecifics, bare ground and other species), can also affect life-history processes. By modelling the impacts of these factors on individual plant development, we gain insights into how the population behaves as an aggregate of individuals. Using the individual as the unit of observation and analysis enables us to take into account the often highly variable distributions of size within populations (Weiner 1985; Hutchings 1997) and the differential responses of plants of different sizes to their environment. Individual-based data can pose problems for conventional general-linear modelling, however, as repeated-measures on the same individual and spatial and temporal autocorrelation contravene assumptions of independence of data points. For this reason we used multi-level mixed-effects models to account for the correlated error structures present in our data.

Multi-level or hierarchical mixed models have been used in the fields of social science (Duncan, Jones & Moon 1996), medicine (Beacon & Thompson 1996), genetics (Holsinger 1999) and agriculture (Green, Berriatua & Morgan 1998) but have only rarely appeared in the ecological literature (for an example of the use of single-level mixed-effects models see Rees et al. 1999). This is despite the prevalence of spatially hierarchical and repeated-measures data, for which these techniques are well suited, in ecological studies and the implementation of multi-level mixed effects methods in commercial and open-source statistical programs (Pinheiro & Bates 2000).

In this study, we combined fixed effects with multiple nested levels of random effects in general linear mixed-effects models and single-level random effects in generalized linear mixed-effects models, allowing the analysis of repeated-measures and spatially nested data without succumbing to the problems of non-independence and pseudoreplication. The complex nested hierarchy of spatial effects and repeated-measures inherent in a 7-year data set necessitated the use of these techniques. The model structure allowed us to investigate the relative importance of variation at different levels in the hierarchy from individual plants through intermediate spatial levels of quadrats and groupings of quadrats up to variation between years and sites. Knowing where stochasticity in the system lies enables us to make better estimates of the fixed effects and ultimately more informed predictions of how likely certain management strategies are to succeed. We used these statistical models to investigate the effects of environmental (e.g. rainfall, ground cover and herbivory), temporal and intrinsic plant traits (e.g. flowering history) on growth, survival, probability of flowering, fruit production and vegetative reproduction.

In a companion study (Buckley, Briese & Rees 2003, pp. 494–507 of this issue) we used these parameterized functions to build an individual-based simulation model to explore the effects of different management strategies on invasion dynamics. By building a realistic model that partitions stochasticity between different spatial and temporal levels we can generate probabilistic predictions about which management strategies are likely to succeed in the field.

study species

Hypericum perforatum, St John's wort, is a serious weed of pastoral and natural ecosystems in south-eastern Australia. In the 1970s it was estimated to occupy approximately 360 000 ha (Shepherd 1983) with more than 80% of infestations occurring under native forests. Hypericum perforatum was introduced from western Europe in the mid-nineteenth century and it spread quickly, becoming a serious weed in south-eastern Australia by the early twentieth century (Harris & Gill 1997). It grows in open forests or pastures in areas with rainfall > 760 mm per year. Hypericum perforatum is also established in North America, South America, New Zealand and South Africa (Campbell, Briese & Delfosse 1995). Recent interest in its medicinal properties has led to demand for large quantities of the herb and it is cultivated as a crop in parts of Europe and North America.

Hypericum perforatum is a phenotypically and genetically variable species with the variety in Australia most commonly referred to as var. angustifolium, although there is some evidence that at least two varieties have been introduced into Australia (Campbell & Delfosse 1984). Hypericum perforatum is a perennial herb that grows to a height of 30–120 cm and reproduces vegetatively from crowns and lateral roots as well as through seeds. However, in Europe seed reproduction of H. perforatum is 97% apomictic (pseudogamous) (Robson 1968).

known ecology

Briese (1997b) reviewed the known ecology of H. perforatum in Australia, the key features of which are summarized here. Hypericum perforatum can resist damage to above-ground parts by drought, fire (Briese 1996) and herbicides (Campbell, Dellow & Gilmour 1979) through resource storage in the tough, extensive root system. Large, persistent seed banks are maintained for what is estimated to be 30 years (Harris & Gill 1997), and vegetative reproduction can be of great importance in maintaining infestations. Hypericum perforatum infestations have been divided into two main types, labelled by Clark (1953) as type A and type B. Type A infestations occur on deep, fertile soil, as can be found in good pastureland (individual crowns are large). Type B infestations occur on poorer, shallow, stony soils, where spread is largely through vegetative reproduction (individual crowns are small). Hypericum perforatum exhibits genetic variation with respect to morphological variation (Campbell et al. 1997) and herbivore resistance (Mayo & Roush 1997).

Details of the insects known to attack H. perforatum in its exotic range are given in Campbell, Briese & Delfosse (1995). They fall into two groups: those that are polyphagous native or accidentally introduced species, and a group of introduced, host-specific biocontrol agents. Since the 1930s, 12 biocontrol agents have been released in Australia, although only the beetle Chrysolina quadrigemina (Suffr.) is considered to have had any significant impacts on the weed. Chrysolina quadrigemina sporadically reduces the extent of infestations in years where defoliation by the beetle is high, but it cannot reduce populations of H. perforatum to acceptable levels by itself (Briese 1985). It is also of limited use in natural ecosystems because it is intolerant of shade such as that found under native Eucalyptus woodland. Currently under observation are a species of mite Aculus hyperici (Liro) and the root borer Agrilus hyperici (Creutzer). Populations of the mite have established widely and cause some reduction in plant biomass, while the root borer has failed to establish in adequate numbers in Australia (Briese 1997a). Hypericum perforatum is poisonous to livestock, causing photosensitization, loss of condition and ultimately death if animals continue to feed on it (Bourke 1997). Careful grazing management, however, is often successfully used to control the weed (Campbell, Briese & Delfosse 1995) and goats are considered to be more tolerant of H. perforatum grazing than other stock (R. Arnott, personal communication).



The data used in the statistical analysis were collected from four sites over 6–7 years in south-eastern Australia. Full details of the design and data collection can be found in Briese (1997b) and are summarized here. Three of the four sites contained infestations in both open pasture and native Eucalyptus woodland areas (Queanbeyan and Adaminaby in New South Wales and Beechworth in Victoria), and one had infestations in open pasture areas only (Pierce's Creek in ACT). Site details are given in Table 1.

Table 1.  Details of sites from which H. perforatum data were collected
Site | Latitude and longitude | Altitude (m a.s.l.) | Habitat | Average rainfall per year (mm)
Adaminaby | 148°42′E 36°02′S | 1090 | Native grassland in clearing next to Eucalyptus spp. forest, lightly grazed by kangaroos Macropus giganteus | 550
Beechworth | 146°40′E 36°19′S | 310 | Open improved pasture next to Eucalyptus spp. forest (grazed by sheep Ovis aries and cattle Bos taurus) | 920
Pierce's Creek | 148°56′E 35°20′S | 720 | Open improved pasture in clearing surrounded by exotic pine plantations (lightly grazed by horses Equus caballus) | 650
Queanbeyan | 149°13′E 35°26′S | 750 | Improved pastures in open eucalyptus woodland (grazed by sheep O. aries and cattle B. taurus) | 630

At each site and treatment (wooded vs. open) two blocks 20 × 20 m were marked out and within each block five permanent 0·5 × 0·5-m quadrats were chosen. Three sets of data were collected from each block from 1981 to 1987. (i) History of individually tagged crowns. Five mature H. perforatum crowns were marked and data were collected four times a year on plant size, condition and phenology; the fates of 350 plants in total were followed. Due to the high degree of vegetative suckering in this species we do not know whether these individually tagged plants were in fact separate individuals. (ii) Population changes in permanent quadrats. Four times a year total H. perforatum density (divided into different size classes) and percentage cover of the weed and other classes of vegetation were determined. (iii) Destructive plots. Once a year all H. perforatum crowns were removed from five randomly selected quadrats per block with data taken on root size and origin (seedling or sucker) of the plant.
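The total of 350 tagged plants can be recovered from the design as a quick arithmetic check. The sketch below assumes that five crowns were tagged in each permanent quadrat, which the text does not state explicitly; the variable names are illustrative only.

```python
# Tally of tagged crowns implied by the sampling design.
# Assumption (not stated explicitly in the text): five crowns per quadrat.
sites_with_both_habitats = 3   # Queanbeyan, Adaminaby, Beechworth (open and wooded)
sites_open_only = 1            # Pierce's Creek (open pasture only)
blocks_per_combination = 2
quadrats_per_block = 5
crowns_per_quadrat = 5         # assumed, see lead-in

site_treatment_combinations = sites_with_both_habitats * 2 + sites_open_only
total_quadrats = (site_treatment_combinations
                  * blocks_per_combination
                  * quadrats_per_block)
total_crowns = total_quadrats * crowns_per_quadrat

print(site_treatment_combinations, total_quadrats, total_crowns)  # 7 70 350
```

Under this assumption the seven site-by-treatment combinations give 70 quadrats and 350 crowns, matching the total reported above.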

In addition, fieldwork was undertaken in 1999 at three sites to determine a suckering rate for mature plants. At each of the three sites H. perforatum plants were chosen at random and their root systems were excavated by hand in order to locate all attached suckers. The ‘parent’ plant and all suckers were measured and aged using the number of old flowering stems present. Good quality data on individual plants for all non-seed life-history stages were therefore available. Although seed germination experiments were also carried out, germination rates in the field were too low to enable formal analysis. In agreement with a previous study (Clark & Clark 1952), there was some indication that disturbance was important for germination success (Y. M. Buckley, unpublished data).

The year was split into two seasons; season 1 runs from April to September (austral autumn/winter), when rosette growth and feeding by C. quadrigemina larvae occur; season 2 runs from October to March (austral spring/summer), when erect stem (flowering and non-flowering) and fruit production occurs. There is minimal C. quadrigemina feeding in the early part of this season but intense feeding by adults can occur in November and December. Thus the ‘H. perforatum year’ runs from April to March the following year, and the year codes used in the analysis are based on this definition.
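The season and year coding can be illustrated with a short sketch. The function name is hypothetical, and labelling each ‘H. perforatum year’ by the calendar year in which it starts (April) is an assumption; the original year codes may have been numbered differently.

```python
from datetime import date
from typing import Tuple


def hp_year_and_season(d: date) -> Tuple[int, int]:
    """Return (H. perforatum year, season) for a calendar date.

    Season 1: April to September. Season 2: October to March.
    Assumption: the year is labelled by the calendar year in which it starts.
    """
    season = 1 if 4 <= d.month <= 9 else 2
    # January to March belong to the year that started the previous April.
    year = d.year if d.month >= 4 else d.year - 1
    return year, season


print(hp_year_and_season(date(1982, 5, 1)))    # (1982, 1)
print(hp_year_and_season(date(1983, 2, 1)))    # (1982, 2)
```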

As data were collected at monthly or 3-monthly periods within the season, it was necessary to summarize values of the data for analysis. Maximum seasonal values were taken for fruit production, vegetative stem length and flowering stem length. Percentage damage was assessed by eye throughout the season as the percentage defoliation due to herbivores or other loss of tissue (e.g. through drought or frost damage); these values were averaged over the season. Counts of C. quadrigemina were summed over the season, expressed per centimetre stem length and log-transformed. This was to allow for potentially damaging short-term build-ups of insects, which would not be obvious from an average value or from simply using presence/absence data. A natural log-transformation was used on all plant size and fruit production parameters, with 1 added to each value to include zero measures where necessary. As not all plants were aged at the time of data collection, the effects of age were determined from only two sites.
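A minimal sketch of the seasonal summaries described above follows. The function and field names are hypothetical. Two details are assumptions rather than statements from the text: the beetle count is scaled by the seasonal maximum stem length, and 1 is added before the log-transformation of the beetle density.

```python
import math


def summarise_season(stem_lengths_cm, damage_pct, beetle_counts):
    """Collapse within-season records for one plant into one value per variable."""
    max_stem = max(stem_lengths_cm)                   # maximum seasonal value
    mean_damage = sum(damage_pct) / len(damage_pct)   # averaged over the season
    total_beetles = sum(beetle_counts)                # summed over the season
    # Assumption: scale by the seasonal maximum stem length and add 1
    # before taking logs so that zero counts are retained.
    beetles_per_cm = total_beetles / max_stem if max_stem > 0 else 0.0
    return {
        "max_stem_cm": max_stem,
        "ln_stem": math.log(max_stem + 1.0),
        "mean_damage_pct": mean_damage,
        "ln_beetles_per_cm": math.log(beetles_per_cm + 1.0),
    }


summary = summarise_season([10.0, 25.0, 20.0], [0.0, 20.0, 40.0], [0, 5, 0])
print(summary)
```

Summing rather than averaging the beetle counts keeps a single heavy month visible in the seasonal value, which is the motivation given in the text.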

statistical analysis

The individual plant data set was used to construct a series of statistical models describing how vegetative growth, probability of flowering, growth of flowering stems, fruit production, sucker production and survival change with extrinsic and intrinsic explanatory variables. Extrinsic variables included whether the quadrat was in the shade or open (treatment), summed monthly rainfall, percentage bare ground, percentage grass cover, number of other H. perforatum plants present in the quadrat, year of observation, size-corrected C. quadrigemina count and a plant damage score (percentage damage arcsine transformed). Intrinsic variables included whether or not the plant flowered in the previous season, and the sizes of vegetative and flowering stems in the current or previous season.
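The arcsine transformation of the damage score can be sketched as below. The text does not give the exact form used, so the common arcsine square-root version applied to the proportion is assumed here.

```python
import math


def arcsine_transform(percent: float) -> float:
    """Arcsine square-root transform of a percentage (0 to 100), in radians.

    Assumption: the standard asin(sqrt(p)) form, with p the proportion damaged.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    return math.asin(math.sqrt(percent / 100.0))


for pct in (0.0, 25.0, 50.0, 100.0):
    print(pct, round(arcsine_transform(pct), 4))
```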

All of the explanatory variables above were put into the model as fixed effects. Fixed effects are those explanatory variables associated with an entire population or with certain repeatable experimental treatments; we therefore estimated from the data a mean for each level of the fixed effects. Random effects are associated with individual experimental units drawn at random from a population (Pinheiro & Bates 2000) and govern the variance–covariance structure of the response variable. For a random effect a mean and standard deviation (SD) are predicted for each factor, i.e. the observed group means are assumed to be drawn from a normal distribution defined by this mean and standard deviation. Mixed-effects models enable us to model correlations that often exist within grouped data, such as those found in ecological studies where data are grouped by individual (repeated-measures on the same individual over time), quadrat and various other levels of, often nested, spatial groupings. We designate the groups as random effects and can therefore model the covariance structure introduced by the grouping of the data. Treating variables as random effects also has the advantage that it uses up fewer degrees of freedom than treating variables as fixed effects with multiple levels. For example, if quadrat is treated as a random effect, individual quadrat intercepts are treated as random deviations (defined by the standard deviation for a normal distribution) from a mean population value.
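The idea of quadrat intercepts as random deviations from a population mean can be illustrated by simulation. This is a sketch only: the function name, the number of quadrats and plants, and the mean and standard deviations are all hypothetical values chosen for illustration, not estimates from the data.

```python
import random
import statistics


def simulate_random_intercepts(n_quadrats, n_plants, mean,
                               sd_quadrat, sd_individual, seed=1):
    """Simulate responses with a normally distributed quadrat-level intercept."""
    rng = random.Random(seed)
    data = {}
    for q in range(n_quadrats):
        # Each quadrat intercept is a random deviation from the population mean.
        intercept = mean + rng.gauss(0.0, sd_quadrat)
        # Plants within a quadrat share that intercept plus individual error.
        data[q] = [intercept + rng.gauss(0.0, sd_individual)
                   for _ in range(n_plants)]
    return data


data = simulate_random_intercepts(50, 20, mean=3.0,
                                  sd_quadrat=0.3, sd_individual=0.6)
quadrat_means = [statistics.mean(values) for values in data.values()]
print(round(statistics.mean(quadrat_means), 2),
      round(statistics.stdev(quadrat_means), 2))
```

Only two parameters (a mean and a standard deviation) describe all 50 quadrats here, whereas a fixed-effect treatment would estimate a separate intercept for each quadrat.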

Individual plants were repeatedly measured over several years in this study and there was also a hierarchical spatial structure to the observations; these features contravene assumptions of independence of a general linear model. Assuming independence when it is not true will inflate the error degrees of freedom and can lead to spurious significance (type I error). It was therefore necessary to use mixed-effects models in order to take account of the correlated measurements. Where the random effects proved to be non-significant, generalized linear models (GLM) incorporating only the fixed effects were used. The statistics program R 1·3·1 (copyright 2001, The R Development Core Team) was used for all analyses.

linear mixed-effects (lme) models

With LME the observational units are assumed to be collected into clusters over which the random effects (intercepts in all models presented here) vary. In addition, several random effects can be nested within each other. The spatial variables were nested in the order (from largest scale to smallest): site/treatment/block/quadrat/plant. Treatment was also tested in the models as a fixed effect. Temporal autocorrelation is often a feature of time-series data and can occur at different scales from repeated-measures on the same plant over time, to year or season effects on all plants in the population. As a plant effect was not significant in any of the models, independent errors at this scale were assumed.
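Nesting means that a lower-level label is only unique within its parent: block 1 at one site is not the same unit as block 1 at another. A minimal sketch of how unique identifiers for each nested level might be built is given below; the function name and the record field names are hypothetical.

```python
def nested_group_labels(record,
                        levels=("site", "treatment", "block", "quadrat", "plant")):
    """Build a unique label for each nested level from its full path of parents."""
    labels = {}
    path = []
    for level in levels:
        path.append(str(record[level]))
        labels[level] = "/".join(path)
    return labels


record = {"site": "Adaminaby", "treatment": "open",
          "block": 1, "quadrat": 3, "plant": 2}
labels = nested_group_labels(record)
print(labels["quadrat"])  # Adaminaby/open/1/3
```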

The minimal adequate model was arrived at by deletion of the explanatory variables one at a time from the full model. The depleted model was then compared with the full model using an F-test of the likelihood ratios for the linear mixed-effects models. Restricted maximum likelihood was used to compare nested models in which only the random effects differed, but maximum likelihood was used when comparing nested models where the fixed effects differed (Pinheiro & Bates 2000). Restricted maximum likelihood was used to calculate the estimates of coefficients for the minimal adequate model. Due to the complexity of the model structure and the relatively large number of potential explanatory variables, all possible combinations of interactions and polynomials could not be fitted. After initial model simplification, significant main effects were identified and subsequent graphical exploration of the data suggested likely interactions and polynomials to be fitted and tested. Where non-nested models were compared, Akaike's information criterion (AIC) was used to choose the best model. AIC is calculated for each model as:

−2(log-likelihood) + 2 × p

where p is the number of parameters estimated in the model. AIC therefore represents some measure of the explanatory power of a model discounted by the number of parameters that have gone into its construction; a lower value indicates the ‘better’ model.
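The AIC comparison can be written out directly from the formula above. The log-likelihoods and parameter counts in this sketch are hypothetical values chosen only to show the calculation.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's information criterion: -2 * log-likelihood + 2 * p."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical candidate models: (log-likelihood, number of parameters).
candidates = {"simpler": (-100.0, 5), "richer": (-98.5, 8)}
scores = {name: aic(ll, p) for name, (ll, p) in candidates.items()}
best = min(scores, key=scores.get)

print(scores)  # {'simpler': 210.0, 'richer': 213.0}
print(best)    # simpler
```

In this illustration the richer model fits slightly better but its three extra parameters cost more than the gain in likelihood, so the simpler model has the lower AIC.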

Growth of vegetative stems

In the full model, the natural log of size at time t + 1 (in centimetres of stem length) was designated as the response variable in a linear mixed-effects model, with the full range of nested spatial random effects and a selection of fixed effects fitted as explanatory variables. The random effects were deleted one at a time, and as none proved significant (P > 0·1 in all cases) the model was reconstructed using the fixed effects only as a general linear model.

Growth of flowering stem

The natural log of the size of the flowering stem in year t + 1 (in centimetres of stem length) was designated as the response variable and only plants that flowered were included in the linear mixed-effects model. Both random and fixed effects were tested by single term deletion with subsequent model comparison.

Fruit production

As the number of fruit produced was a count, we considered using Poisson errors in the analysis. However, this would have left us unable to use linear mixed-effects models with multiple nested random effects. We therefore linearized the data by log-transforming the number of fruit and, as all plants that flowered also produced fruit, there were no zero counts in the response. Finally, inspection of plots of the residuals and fitted values gave no reason to suspect that normal errors were inappropriate. Both random and fixed effects were tested by single term deletion with subsequent model comparison.

generalized linear mixed (glmm) models

For binary variables (such as flowering and survival) we used GLMM, which allows a logistic-normal mixture model incorporating fixed effects and one random effect (Lindsey 1999). Repeated-measures can be dealt with if plant number is declared as the random effect and consequently a within-plant error term is generated. GLMM, however, only allows for one level of nesting of random effects, and therefore all spatial variables could only be tested in isolation from each other. Significance tests were based on the change in deviance, which was compared to the χ2 distribution for the GLMM model.
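Comparing a change in deviance with the χ2 distribution can be sketched using only closed-form tail probabilities. The function below is an illustration, not the code used in the analysis; it covers 1 degree of freedom and even degrees of freedom, for which simple expressions exist.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of the chi-square distribution.

    Closed forms are used, so only df = 1 or even df are supported.
    """
    if x < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("only df = 1 or even df are handled in this sketch")


# Dropping a single term: change in deviance tested on 1 degree of freedom.
print(round(chi2_sf(3.84, 1), 3))
print(round(chi2_sf(9.72, 1), 4))
```

For example, a statistic of 9·72 on one degree of freedom corresponds to a P value close to 0·002.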

Probability of flowering

The binary response variable was the production, or not, of a flowering stem at time t. We initially tested the random effects in a generalized linear multi-level mixed-effects model using penalized quasi-likelihood (code supplied by B. Ripley) and found that the only spatial level likely to influence flowering was the quadrat. We subsequently tested the significance of quadrat using GLMM in the ‘repeated’ library for R 1·3·1 (Lindsey 1999). Using GLMM, only one level of random effects is permissible. Due to the low numbers of plants alive and flowering in years 6 and 7, the data for these years were combined. As the minimal adequate model arrived at using these variables did not include the previous season's parameters, we reformulated the data set to include data collected in 1981. These data had been left out of the first analysis as no measurements were made in 1980 and consequently no values existed for the previous year's variables. This second data set was analysed in the same way but without the previous year's variables, and the model presented is based on this second data set.

Probability of survival

We designated the response variable as the probability of survival of plant i in year y to year y + 1. As in the probability of flowering model, we tested random effects singly because the GLMM code does not allow for multiple levels of nested random effects. In order to deal with non-independence of within-plant data points (survival of a plant from year to year could be correlated), time until death of each individual is more conventionally used in survival analysis as the response variable (McCallum 2000). However, statistical methods incorporating random effects with survival analysis of this form are not currently implemented in R 1·3·1. We therefore dealt with the potential non-independence of within-plant errors by testing the significance of including a plant random effect, and found it to be non-significant. Therefore, successive observations on the same plant over time were treated as independent of each other. As a further test of whether the inclusion of random effects was necessary, we compared the mixed-effects models with a GLM incorporating just the fixed effects, and found that all models including random effects had higher AIC values than an equivalent GLM with no random effects. As a consequence the GLM was preferred as the simpler model.

generalized linear (glm) models

Sucker production

The number of suckers produced by a mature plant was the response variable in a GLM with a log-link and Poisson errors. These data were collected separately from the individual plant data used for the other functions; there was little spatial structure inherent in the data and data were collected for one year only. A GLM with site as a fixed effect was therefore used; site was treated as a fixed effect here as it was the only source of spatial autocorrelation and there were enough degrees of freedom to test it as a factor with four levels. The occurrence of overdispersion (scale parameter approximately 1·5) meant that we used the more conservative F-ratio tests instead of χ2 tests during the model simplification process (Crawley 1993). The minimal adequate model was then fit using the ‘quasi’ family; the log-link was retained but the variance was specified as a function of the mean. As a negative binomial model would not converge in R 1·3·1, we used the quasi-Poisson model to estimate k for the negative binomial function using the relationships:
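A scale (dispersion) parameter of the kind quoted above is often estimated as the Pearson χ2 statistic divided by the residual degrees of freedom. The sketch below uses that estimator as an assumption; the text does not say how the value of approximately 1·5 was obtained, and the counts used are hypothetical.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    Under Poisson errors the variance equals the mean, so values well
    above 1 indicate overdispersion.
    """
    resid_df = len(observed) - n_params
    if resid_df <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / resid_df


# Hypothetical sucker counts and fitted means from an intercept-only model.
observed = [0, 2, 4, 6]
fitted = [3.0, 3.0, 3.0, 3.0]
print(round(pearson_dispersion(observed, fitted, n_params=1), 3))
```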

[Equation image not recovered from source] (eqn 1)
[Equation image not recovered from source] (eqn 2)

(Crawley 1993), where λ is the scale parameter estimated using the quasi-Poisson model, k is the shape parameter in the negative binomial function and the remaining term is the fitted value from the quasi-Poisson model. This allowed the probability distribution of suckers to be specified in the simulation model described in Buckley et al. (2003).
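The two relationships are not legible in this copy. One standard way of linking the quantities named above is to equate the variance functions of the two distributions, sketched here with \(\hat{\mu}\) assumed as the symbol for the fitted value. This is offered as a plausible reading and may not reproduce eqns 1 and 2 exactly.

```latex
% Quasi-Poisson variance (scale parameter \lambda) set equal to the
% negative binomial variance (shape parameter k):
\lambda \hat{\mu} = \hat{\mu} + \frac{\hat{\mu}^{2}}{k}
% Solving for k:
k = \frac{\hat{\mu}}{\lambda - 1}
```

Under this reading, a scale parameter of approximately 1·5 would give k of roughly twice the fitted value.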


Observations on damage were undertaken as a further measure of the impact of C. quadrigemina on plant fate. Individual plant damage scores in season 1 (collected as percentage damage) were arcsine transformed and first modelled using a general linear mixed-effects model. When compared with an equivalent GLM, however, the GLM was preferred because of its lower AIC score.

other models

We attempted to fit models with grass and bare ground cover, and summed C. quadrigemina observations as the response, but these models had very little explanatory power and are therefore not considered here.

testing the assumptions

In all cases, plots of fitted and observed values and residuals were examined for deviations from the assumptions (Pinheiro & Bates 2000). The assumptions of mixed-effects models can be divided into two groups, those referring to the within-group error and those referring to the random effects. As there was no evidence of correlation of observations within groups in any of the models examined here, we assumed that within-group errors were normally distributed, centred at zero, and had constant variance. These assumptions were tested by looking at the distribution of within-group residuals (the estimated best linear unbiased predictors; BLUP), fitted values and observed values by grouping level, as described by Pinheiro & Bates (2000). These were satisfactory for all LME models. Diagnostic plots of this nature for binary response variables are not informative. A plot of fitted values against observed values for each model is given in Fig. 1. The second group of assumptions is that the random effects are normally distributed and that the random effects covariance matrix is as specified (e.g. constant). Normality plots of the random effects (BLUP) and scatter plots of the random effects were used to test these assumptions (Pinheiro & Bates 2000). These were acceptable for all models examined.

Figure 1.

These graphs show the fitted values from each model on the x-axis and the observed response variable on the y-axis along with a line showing the expected 1 : 1 relationship. For the binary response variables, proportion data and counts of suckers, mean values of small groups of data points are used. P (fl) = probability of flowering.
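The grouping used for binary responses, proportions and counts can be sketched as follows: observations are ordered by fitted value, split into small groups, and the group means of fitted and observed values are plotted. The function name and the group size are hypothetical; the caption does not state how large the groups were.

```python
def binned_means(fitted, observed, bin_size):
    """Mean fitted and mean observed value for consecutive groups of points."""
    pairs = sorted(zip(fitted, observed))   # order by fitted value
    result = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        mean_fitted = sum(f for f, _ in chunk) / len(chunk)
        mean_observed = sum(o for _, o in chunk) / len(chunk)
        result.append((mean_fitted, mean_observed))
    return result


# Hypothetical fitted probabilities and binary outcomes.
points = binned_means([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0], bin_size=2)
print(points)
```

Points from a well-calibrated model should scatter around the 1 : 1 line once grouped in this way.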


All life-history stages were positively affected by size of either the vegetative stems and/or the flowering stems, making size the most general determinant of life-history transitions in this species. To illustrate this, a plot comparing data with the model of probability of flowering against vegetative stem size is presented in Fig. 2; bigger plants are more likely to flower but this also depends on year. The goodness-of-fit of models to the data can be seen from Fig. 1, where fitted values are plotted against observed values for each model. All of the models showed a symmetrical 1 : 1 relationship between the fitted and observed values.

Figure 2.

Probability of flowering is plotted against ln(vegetative stem length) for each year of observation (years 6 and 7 were excluded from the analysis as there were too few data points). The lines represent the best fit model as parameterized from Table 3. Each year has a different intercept and slope; for all years except year 4, the model appears to be a good descriptor of the data.
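The fitted lines are logistic curves in ln(vegetative stem length). The sketch below uses the single intercept and slope listed in Table 3 purely as an illustration; the year-specific intercepts and slopes are not reproduced here, so the output should not be read as the fitted curve for any particular year. The function name is hypothetical, and ln(length + 1) is used to follow the transformation described in the methods.

```python
import math


def prob_flowering(ln_stem_length: float, intercept: float, slope: float) -> float:
    """Inverse-logit of a linear predictor in ln(vegetative stem length)."""
    eta = intercept + slope * ln_stem_length
    return 1.0 / (1.0 + math.exp(-eta))


# Illustrative values taken from the estimates listed in Table 3.
INTERCEPT = -7.3817
SLOPE = 2.0087

for length_cm in (10.0, 40.0, 160.0):
    p = prob_flowering(math.log(length_cm + 1.0), INTERCEPT, SLOPE)
    print(length_cm, round(p, 3))
```

With a positive slope the probability rises with size and passes 0·5 where the linear predictor equals zero, i.e. at ln(size) = −intercept/slope.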

The best-fit models for vegetative stem length, probability of flowering, flowering stem length, fruit production, sucker production and survival are given in Tables 2–7, with estimated parameters, standard errors (where estimated), P, F, likelihood ratio (LR) or deviance values and degrees of freedom. AIC is also reported for all models. These results are summarized in Fig. 3, which shows a schematic diagram of the life cycle of H. perforatum along with the significant variables (positive and negative) that affect each life-history stage.

Table 2.  The symbols are those used for specifying the individual-based model in Buckley et al. (2003). Vegetative growth general linear model, deviance explained = 54%, AIC = 1846. ln(total vegetative stem length), S_{y+1}, is the response variable. F-tests were carried out for all variables removed from the model one at a time; main effects included in a significant interaction were not tested
Variable | Symbol | Test | Estimate | SE
Year (y + 1) | a_{i,s} | F_{5,674} = 23·24, P < 0·0001 | 1·1010 | 0·2736
ln(vegetative stem length), S_y | c_s | F_{1,670} = 345·8, P < 0·0001 | 0·4886 | 0·0263
C. quadrigemina herbivory, H | d_s | F_{1,670} = 31·55, P < 0·0001 | −1·8840 | 0·3354
Rain (season 1), R_1 | f_s | Not tested, interaction significant | 0·00996 | 0·0002
Rain² (season 1), R_1² | g_s | Not tested, interaction significant | −0·00000754 | 0·00000214
% bare (season 2), B_{2,q} | h_s | F_{1,670} = 11·6, P = 0·0007 | −0·00601 | 0·00177
Treatment (shade) | b_{shade,s} | F_{1,670} = 14·86, P = 0·0001 | −0·2997 | 0·0777
Damage (season 1), D_1 | j_s | Not tested, interaction significant | −0·9674 | 0·5362
Damage × rain, D_1 R_1 | l_s | F_{1,670} = 6·97, P = 0·009 | 0·0117 | 0·0044
Damage × rain², D_1 R_1² | m_s | F_{1,670} = 11·04, P = 0·0009 | −0·0000278 | 0·0000084
Individual error, SD | e_s |  | 0·8493 | Not established

Table 3.  Probability of flowering generalized linear mixed-effects model, flowering (binary, 0 or 1) in year y + 1, Pfly+1, is the response variable, AIC = 735. The model containing the quadrat random effect was preferred to a GLM without quadrat as a random effect as it had a lower AIC value (735 vs. 770). The change in deviance on removal of the variable from the model is tested against the χ² distribution
Variable | Symbol | Test | Estimate | SE
Year (y + 1) | ai,Pfl | Not tested, interaction significant | −7·3817 | 1·3369
Year × ln(vegetative stem length), Sy+1 | ci,Pfl | Deviance = −26·71, P = 0·00002 | 2·0087 | 0·3488
Quadrat error, SD | Eq,Pfl | See AIC comparison above | 1·0374 | 0·1194
Table 4.  Flowering stem length general linear mixed-effects model. ln(flowering stem length) in year y + 1, Fly+1, is the response variable, AIC = 506. Likelihood ratio tests (LR) were used to assess significance; see Fig. 2 for a plot of observed and fitted values
Variable | Symbol | Test | Estimate | SE
Year (y + 1) | ai,F | LR4,10 = 23·7, P = 0·0001 | 2·8385 | 0·3397
ln(vegetative stem length), Sy+1 | ci,F | LR1,10 = 74·44, P < 0·0001 | 0·3837 | 0·0407
% grass (season 2), G2,q | dF | LR1,10 = 24·74, P < 0·0001 | −0·0174 | 0·0035
% bare (season 2), B2,q | fF | LR1,10 = 14·31, P = 0·0002 | −0·0102 | 0·0027
Quadrat error, SD | Eq,F | LR1,14 = 9·72, P = 0·002 | 0·3239 | Not established
Individual error, SD | eF |  | 0·5991 | Not established
Table 5.  Fruit production general linear mixed-effects model. ln(number of fruit) in year y + 1, Fry+1, is the response variable, AIC = 601. Likelihood ratio (LR) tests were used to assess significance
Variable | Symbol | Test | Estimate | SE
Year (y + 1) | ai,Fr | LR5,11 = 107·09, P < 0·0001 | −2·4547 | 0·2953
ln(flowering stem length), Fly+1 | dFr | LR1,11 = 352·7, P < 0·0001 | 1·2836 | 0·0483
Treatment (shade) | bshade,Fr | LR1,11 = 13·56, P = 0·0002 | −0·6288 | 0·1651
Rain (season 2), R2 | fFr | LR1,11 = 20·8, P < 0·0001 | 0·0027 | 0·0006
Quadrat error, SD | Eq,Fr | LR3,14 = 28·45, P < 0·0001 | 0·5024 | Not established
Individual error, SD | eFr |  | 0·5408 | Not established
Table 6.  Survival, GLM with binomial errors and logit link function. Survival (binary 0 or 1), Psy+1, is the response variable. AIC = 216, deviance explained = 45%. Significance was assessed on the change in deviance compared to the χ² distribution
Variable | Symbol | Test | Estimate | SE
Year (y + 1) | ai,Ps | Dev5,245 = −30·855, P = 0·00001 | −10·2681 | 4·6736
 |  |  | −13·1395 | 5·2627
 |  |  | −13·4904 | 5·2591
 |  |  | −14·4379 | 5·2799
 |  |  | −12·4153 | 5·3324
Treatment (shade) | bshade,Ps | Dev1,240 = 352·7, P < 0·0001 | 1·7915 | 0·4983
ln(vegetative stem size), Sy |  | Not tested, interaction significant | See age 1 | See age 1
Age | cage,Ps | Not tested, interaction significant | 8·2061 | 5·4304
 |  |  | 9·5009 | 5·3579
 |  |  | 7·6516 | 5·4725
 |  |  | 7·9766 | 7·7131
ln(vegetative stem size), Sy × age | dage,Ps | Dev4,243 = −12·38, P = 0·015 | 5·2304 | 2·1701
 |  |  | 1·4532 | 2·2109
 |  |  | 0·9592 | 2·1841
 |  |  | 1·3655 | 2·2006
 |  |  | 0·6121 | 2·3860
Table 7.  Sucker production, GLM; parameter estimates are given from a model with variance as a function of the mean and log-link function (quasi family), deviance explained = 19%. F-tests were done on changes in deviance as a result of deletions from the model with Poisson errors and a log-link function
Variable | Symbol | Test | Estimate | SE
Intercept | aSu | F1,88 = 28·4, P < 0·00001 | −4·4496 | 1·2192
ln(vegetative stem size), Sy+1 | cSu | F1,88 = 12·4, P = 0·0004 | 0·2199 | 0·0813
ln(maximum flowering stem size), Fly+1 | dSu | F1,88 = 27·5, P < 0·00001 | 1·0341 | 0·2884
Dispersion parameter | β | Not tested | 1·5633 | Not established
Figure 3.

The life-history of H. perforatum (adapted from Briese 1997b) with each life-history stage labelled with the positive, negative and random effect (R.E.) variables identified from the best-fit statistical models; y − 1 refers to the previous year's values and (1) and (2) refer to seasons 1 and 2, respectively. Veg., vegetative; Fl., flowering; P, probability.

Quadrat-to-quadrat variation was identified as the most important source of spatial variation in the probability of flowering, flowering stem growth and fruit production models (Fig. 3) but all spatial random effects were non-significant in the vegetative stem growth and survival models. The interaction of age and size was significant in the probability of survival model, which means that, for a given size, the probability of survival is lower for older plants, as evidenced by the decrease in the age intercept and the age × size slope with increasing age (Table 6). Year effects were significant for all models where multi-year data were available (sucker production had only one year of data), indicating the presence of unknown additional factors correlated with year of observation affecting life-history processes in this species (Fig. 3).

The damage score assigned to each plant was primarily an indication of C. quadrigemina damage. Both the direct C. quadrigemina herbivory score and damage were significant in the vegetative growth model (Table 2) but neither affected any of the other life-history processes directly. The main impact of the biocontrol agent is therefore a reduction in vegetative size, which in turn affects all other life-history stages indirectly because each depends on size. In the vegetative size model there was an interaction between damage and rainfall, indicating that the level of damage received by a plant modifies its growth response to rainfall (Table 2). Damage decreases the plant's growth response to rainfall; e.g. at 100% damage (equivalent to total defoliation but not death) rainfall levels > 500 mm in season 1 can actually reduce subsequent plant growth (Fig. 4). Damage itself was modelled as a complex combination of mostly fixed effects, quadrat- and block-level variables (percentage cover of grass and bare ground are quadrat-level variables and shade is a block-level variable) and a negative correlation with plant size (Table 8). The negative correlation of damage with plant size was to be expected, as herbivory (the primary determinant of the damage score) destroys vegetative stems.

Figure 4.

Effect of damage and rainfall on vegetative growth. This figure illustrates how differing levels of percentage damage, represented by the separate labelled lines, modify the plant's growth response to rainfall (lines are parameterized from the vegetative growth function with parameter values for rain, rain², damage, damage × rain and damage × rain² taken from Table 2, with damage back-transformed to a percentage value for ease of interpretation). The y-axis shows the damage and rainfall components of the linear predictor of vegetative size. High levels of damage do not adversely affect the plant response to low rainfall but damaged plants suffer a steep fall-off in contribution to growth at moderate to high rainfall values.
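The damage and rainfall component of the vegetative growth predictor described in this caption can be written out directly from the Table 2 estimates. The sketch below is illustrative only. It assumes that the damage covariate enters the model on an arcsine square-root scale in radians (the text says only that the damage score was arcsine transformed), and the function names are hypothetical.

```python
import math

# Estimates copied from Table 2 (vegetative growth model)
F_RAIN = 0.00996               # rain (season 1)
G_RAIN_SQ = -0.00000754        # rain squared (season 1)
J_DAMAGE = -0.9674             # damage (season 1)
L_DAMAGE_RAIN = 0.0117         # damage x rain
M_DAMAGE_RAIN_SQ = -0.0000278  # damage x rain squared


def arcsine_transform(percent_damage):
    """ASSUMPTION: damage is transformed as arcsin(sqrt(proportion)), in radians."""
    return math.asin(math.sqrt(percent_damage / 100.0))


def damage_rain_component(rain_mm, percent_damage):
    """Contribution of damage and rainfall to the linear predictor of ln(size)."""
    d = arcsine_transform(percent_damage)
    return (F_RAIN * rain_mm
            + G_RAIN_SQ * rain_mm ** 2
            + J_DAMAGE * d
            + L_DAMAGE_RAIN * d * rain_mm
            + M_DAMAGE_RAIN_SQ * d * rain_mm ** 2)


def rain_at_peak(percent_damage):
    """Rainfall (mm) at which the quadratic rainfall response is highest."""
    d = arcsine_transform(percent_damage)
    linear = F_RAIN + L_DAMAGE_RAIN * d
    quadratic = G_RAIN_SQ + M_DAMAGE_RAIN_SQ * d  # negative for all damage levels
    return -linear / (2.0 * quadratic)


if __name__ == "__main__":
    for damage in (0, 50, 100):
        values = [damage_rain_component(r, damage) for r in (100, 300, 500)]
        print(damage, [round(v, 2) for v in values], round(rain_at_peak(damage)))
```

Under these assumptions the curve for 100% damage peaks at a much lower rainfall than the curve for undamaged plants, and its net contribution is negative at 500 mm, which is consistent with the pattern described for Fig. 4.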

Table 8.  Plant damage score (arcsine transformed) was the response variable in a GLM, deviance explained = 50%, AIC = 637. Significance was assessed using F-tests of the change in deviance on deletion of the variable from the full model
Variable | Symbol | Test | Estimate | SE
Year (y) | ai,D | Not tested, interaction significant | 0·2725 | 0·1449
ln(vegetative stem size), Sy+1 | dD | Not tested, interaction significant | −0·0542 | 0·01371
Year × shade | bshade,i,D | F4,664 = 6·13, P = 0·00008 | 0·8882 | 0·1386
Year × % grass (season 1), G1,q | fD | F4,664 = 11·67, P < 0·00001 | 0·00252 | 0·00233
Year × % bare (season 1), B1,q | gD | F4,664 = 13·1, P < 0·00001 | 0·00861 | 0·00154
Shade × % bare, B1,q | jshade,D | F1,661 = 89·9, P < 0·00001 | −0·015 | 0·00158
Shade × % grass, G1,q | hshade,D | F1,661 = 39·8, P < 0·00001 | −0·01223 | 0·00194
% grass × % bare, G1,qB1,q | kDa | F1,661 = 22·68, P < 0·00001 | −0·000159 | 0·00003
Individual error, SD | eD | Not tested | 0·1432 | Not established

Shade had significant negative effects in the vegetative growth and fruit production functions but a significant positive effect on survival (Fig. 3); plants in shaded sites therefore grow more slowly and produce fewer fruit but live longer. Percentage bare ground cover had a negative impact on both vegetative and flowering stem growth, while percentage grass cover also had a negative impact on flowering stem growth (see Fig. 3 for these relationships).
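Because the growth and fruit responses are modelled on a log scale and survival on a logit scale, the shade coefficients can be back-transformed into multiplicative effects. The sketch below is a minimal illustration using the estimates in Tables 2, 5 and 6. It assumes all other covariates are held fixed and ignores the random effects and the uncertainty in the estimates; the function names are hypothetical.

```python
import math

# Shade coefficients copied from Tables 2, 5 and 6
SHADE_VEG_GROWTH = -0.2997  # response: ln(total vegetative stem length)
SHADE_FRUIT = -0.6288       # response: ln(number of fruit)
SHADE_SURVIVAL = 1.7915     # response: logit(probability of survival)

# Slope of ln(number of fruit) on ln(flowering stem length), Table 5
FRUIT_SIZE_SLOPE = 1.2836


def multiplier(coefficient):
    """Back-transform a log-scale (or logit-scale) coefficient.

    For a log response this is the ratio of predicted values (shade / open);
    for a logit response it is the odds ratio.
    """
    return math.exp(coefficient)


def fold_change_in_fruit(size_ratio):
    """Predicted fold change in fruit number when flowering stem length
    changes by the factor size_ratio, other covariates held fixed."""
    return size_ratio ** FRUIT_SIZE_SLOPE


if __name__ == "__main__":
    print("vegetative length ratio, shade/open:", round(multiplier(SHADE_VEG_GROWTH), 3))
    print("fruit number ratio, shade/open:", round(multiplier(SHADE_FRUIT), 3))
    print("survival odds ratio, shade/open:", round(multiplier(SHADE_SURVIVAL), 2))
    print("fruit fold change for doubled stem length:", round(fold_change_in_fruit(2.0), 2))
```

Under these assumptions shade multiplies predicted vegetative stem length by about 0·74 and fruit number by about 0·53, while multiplying the odds of survival roughly sixfold.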


Our study is unusual in that this relatively complex set of models has been fully parameterized, using data collected over the entire adult life span of 350 plants over 5–6 years and involving models that explicitly test the spatial and temporal context of these observations. The terms in each of the models are easily interpreted in the context of the known ecology of the plant but have the advantage of providing explicit quantitative estimates of impact on plant growth and fates that can be used in simulation models. These statistical models are used in an individual-based simulation model (Buckley et al. 2003) to make predictions about the invasion dynamics and the effects of management strategies on populations of H. perforatum.

Quadrat-to-quadrat variation was identified as the most important source of spatial variation in the probability of flowering, flowering stem growth and fruit production. All spatial random effects were non-significant in the vegetative stem growth and survival models. The over-riding importance of quadrat-to-quadrat variation, as opposed to variation at higher spatial scales, in these models indicates that local neighbourhood interactions and variation at a small spatial scale have the greatest impact on fecundity through the seed pathway. In this respect, plants within a quadrat are more similar to each other than to plants in other quadrats. This could be because vegetative suckering is common in maintaining populations, so that plants within a quadrat are likely to be connected to each other via a shared root system; the observed ‘individuals’ may then be ramets of the same genetic individual. We cannot easily determine in the field whether two apparently separate crowns are connected beneath the soil surface, which underlines the importance of using linear mixed-effects models to determine at what level data points can be treated as independent. The results presented here suggest that plants separated by about 1 m behave independently of each other. This will aid the future design of experiments looking at individual plant responses in this species.
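One way to quantify how similar plants within a quadrat are is the intraclass correlation implied by the variance components in Tables 4 and 5. The sketch below is a simplification: it treats the quadrat term in each model as a single random intercept with Gaussian individual-level error, which may not match the fitted random-effects structure exactly, and the function name is hypothetical.

```python
# Standard deviations copied from Tables 4 and 5: (quadrat SD, individual SD)
VARIANCE_COMPONENTS = {
    "flowering stem length (Table 4)": (0.3239, 0.5991),
    "fruit production (Table 5)": (0.5024, 0.5408),
}


def intraclass_correlation(group_sd, residual_sd):
    """Share of the unexplained variance attributable to the grouping level.

    ASSUMPTION: a random-intercept model with one grouping level (quadrat).
    """
    group_var = group_sd ** 2
    residual_var = residual_sd ** 2
    return group_var / (group_var + residual_var)


if __name__ == "__main__":
    for model, (quadrat_sd, individual_sd) in VARIANCE_COMPONENTS.items():
        icc = intraclass_correlation(quadrat_sd, individual_sd)
        print(f"{model}: intraclass correlation = {icc:.2f}")
```

Under these assumptions the intraclass correlation is about 0·23 for flowering stem length and about 0·46 for fruit production, so a substantial share of the variation not explained by the fixed effects lies between quadrats rather than between plants within a quadrat.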

There is little indication from these results that Clark's (1953) classification of populations into type A and type B infestations has any relevance for the populations studied here. At the outset of the study, the populations at Pierce's Creek and Beechworth could have been classified as type A and the populations at Queanbeyan and Adaminaby would have been classified as type B. From these few comparisons, where site effects were non-significant, we can suggest that site-to-site differences are not as important for individual plant life histories as the differences between plants in shaded and open sites and plants in different quadrats within a site.

Bare ground and grass cover both had negative impacts on flowering stem growth, and bare ground also negatively impacted on vegetative stem growth. Bare ground is either indicative of very resource-poor habitat (excessive amounts of stone, for example) or disturbance, either of which might limit the ability of plants to increase in size. The negative impact of grass cover may be indicative of a competitive interaction between H. perforatum and grass species. This suppression of H. perforatum by grasses has been documented previously (Willis, Groves & Ash 1998) and forms the basis of the successful management strategy of pasture improvement for eliminating the weed (Moore & Cashmore 1942, where grasses were sown in addition to subterranean clover).

Water stress has been found to suppress growth of H. perforatum in greenhouse experiments (Willis, Ash & Groves 1996). Our results support and expand on this observation. We found that rainfall increased both vegetative stem growth (rainfall in season 1, April–September; Table 2) and fruit production (rainfall in season 2; Table 5). However, the quadratic effect of rainfall on vegetative growth (Table 2 and Fig. 4) means that the contribution of rainfall to growth slows down at the highest levels. In addition there was an interaction between the plant damage score and rainfall on vegetative growth (Fig. 4), which leads us to conclude that the level of damage a plant is subjected to modifies its response to rainfall. At high levels of damage and moderate to high levels of rainfall, the total net contribution of damage and rainfall to vegetative growth is zero or negative.

We do well to heed the words of Volker Grimm (1999) that ‘statistics cannot supplant understanding’. We can, however, use these models as a basis for further refinement and research. Not only do they provide useful knowledge about important determinants of life-history processes, but the modelling procedure and results highlight and quantify gaps in our knowledge of the ecology of H. perforatum. More information is required for both vegetative and seedling recruitment into the population. The priorities for further research are as follows.

  • 1Survival and growth of sucker and seed recruits. It is very likely that sucker and seed recruits will have different survival and growth functions due to the persistent connection between parent and sucker. Only the growth of mature individuals was examined in this study.
  • 2In-depth study of the factors affecting sucker production, especially the influence of competition and plant size. A negative binomial model of sucker production did not converge. We therefore used a quasi-Poisson model in order to calculate parameter estimates for a negative binomial model. The use of quasi-Poisson models introduces a strict relationship between the mean and variance, an assumption that may not be supported by the underlying process (of which we have scant knowledge). Further study of the factors affecting sucker production might enable us to model the process better.
  • 3Impacts of control strategies on individual plant responses, especially regeneration from fire and biocontrol impacts.
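The mean and variance relationship referred to in the second priority can be made concrete with the estimates in Table 7. The sketch below is illustrative only. It assumes the quasi-family variance function is the dispersion parameter multiplied by the mean (the quasi-Poisson form), and both the function names and the example plant sizes are hypothetical.

```python
import math

# Estimates copied from Table 7 (sucker production, log link)
INTERCEPT = -4.4496
SLOPE_VEG = 0.2199      # ln(vegetative stem size)
SLOPE_FLOWER = 1.0341   # ln(maximum flowering stem size)
DISPERSION = 1.5633     # dispersion parameter


def expected_suckers(ln_veg_size, ln_flower_size):
    """Mean number of suckers under the log-link model."""
    eta = INTERCEPT + SLOPE_VEG * ln_veg_size + SLOPE_FLOWER * ln_flower_size
    return math.exp(eta)


def sucker_variance(mean):
    """ASSUMPTION: variance = dispersion x mean (quasi-Poisson form)."""
    return DISPERSION * mean


if __name__ == "__main__":
    # Hypothetical plant sizes on the log scale, chosen only for illustration
    for ln_veg, ln_fl in ((3.0, 3.0), (4.0, 4.0), (5.0, 4.5)):
        mean = expected_suckers(ln_veg, ln_fl)
        print(f"ln sizes ({ln_veg}, {ln_fl}): mean = {mean:.2f}, "
              f"variance = {sucker_variance(mean):.2f}")
```

Whatever the plant size, the variance in this form is always the same fixed multiple of the mean, which is the strict relationship that may not be supported by the underlying process.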

We also used these statistical models as the basis for constructing an individual-based model of populations of H. perforatum, which enables us to test control strategies directly on the virtual populations obtained (Buckley et al. 2003).

The importance of small-scale (quadrat) spatial variation underlying the reproduction functions in this species (probability of flowering, flowering stem growth and fruit production) highlights the danger of treating plants as independent units when making predictions about plant growth, survival and fecundity. These models form the basis for making predictions on how the population will respond to environmental conditions and control. The use of stochastic models including both temporal and spatial elements allows probabilistic predictions to be made. If our aim is to work with weed managers to control problem populations, it is necessary to present models with a realistic assessment of the probability of observing the predicted results.


We thank Kim Pullen and Paul Jupp, who helped to collect these data, and Brian Ripley for code and advice on GLMM-PQL. This work was supported by a NERC studentship to Y.M. Buckley; we also acknowledge the support of the CRC for Weed Management Systems and the Leverhulme Trust.