The form of the FSE null hypothesis dictated that treatments be chosen deliberately to represent a composite of agronomic effects, not a single ecological process. Any feature of the crop itself, such as a varietal trait, and any concomitant agronomic practice linked to the crop concerned, such as recommended herbicide usage, would contribute towards the potential treatment effect being measured. Such practices, tied to the crop, therefore had to be allocated to units as part of the same process by which the treatments were randomized. Composite null hypotheses are often used in initial studies, to demonstrate the existence and estimate the magnitude of effects and thereby to screen out those unworthy of further interest. In such experiments, the most important properties are realism and applicability, so that the results relate unequivocally to the system that is studied. The FSE was designed as a large-scale experiment of this kind.
half-fields vs. paired-fields: choice of experimental unit
An issue for discussion before the design was finalized concerned the size and location of experimental units. Specifically, should farms act as blocks and units be whole-fields, paired within the farm to be as alike in biodiversity as possible? Alternatively, should a single field be divided into two halves, again as alike as possible, defining a unit as a half-field? The arguments in favour of the alternative approaches involved scientific, statistical and practical issues.
A strong argument for the half-field design was the potential for reduction in variability. The two halves of a field are much more likely to be similar, in previous management, soil type and surrounding habitat, than two different fields. Residual variation is reduced by choosing blocks such that experimental units within them are matched, as far as practicable, for the measured variable (Perry 1997). Under this argument, halving fields should enhance the statistical power to detect differences between treatments, and increase the precision with which they are estimated.
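The gain from close matching can be made explicit with a short worked expression. The notation is an assumption introduced here for illustration and is not part of the FSE protocol: $y_{iA}$ and $y_{iB}$ are the responses on the two units of block $i$, each with variance $\sigma^2$; $\rho$ is the correlation between units in the same block; $n$ is the number of blocks, which are assumed independent of one another.

```latex
\[
  d_i = y_{iA} - y_{iB}, \qquad
  \operatorname{Var}(d_i) = 2\sigma^2(1-\rho), \qquad
  \operatorname{Var}(\bar{d}) = \frac{2\sigma^2(1-\rho)}{n}.
\]
```

Under these assumptions, the better matched the two units (the closer $\rho$ is to 1), the smaller the variance of the mean treatment difference $\bar{d}$ for a given number of blocks. This is the sense in which half-fields would be expected to give greater power and precision than less well-matched paired whole-fields.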
However, ecological relationships measured at one spatial scale may not have the same parameters or pertain at all at other scales (Heads & Lawton 1983; Norowi et al. 2000). Caution is required in extrapolating the results of a study on half-fields to a larger whole-field scale. Duffield & Aebischer (1994), Perry (1997) and Kennedy et al. (2001) have shown how the use of relatively small plots close to one another has affected the interpretation of experiments for relatively mobile species such as carabid beetles; this argument could favour the use of paired whole-fields. Indeed, birds and small mammals were excluded from comparison within the main FSE precisely because their territories and foraging areas often extend beyond half- or whole-fields (Firbank et al. 2003). More generally, tritrophic interactions between the chemical ecology of plants, herbivores and their natural enemies are subtle (Vet 1999) and Schuler et al. (1999) highlighted many potential indirect effects of GM plants on arthropod natural enemies.
Movement of individuals between the two halves of the same field might bias the estimated difference between treatments, especially if movement was related to the effect of crop management. For example, increased mortality on one half of the field could be compensated by density-dependent immigration from the other half. An individual carabid may easily travel the order of 300 m, the breadth of a square 10-ha field, in two nights (Kennedy 1994). Duffield & Aebischer (1994) noted that the recovery from pesticide application of invertebrate populations would proceed at a slower rate when entire fields were treated, compared with within-field plots of an identical size. Despite limited replication in the largest of their plots, they suggested that small-scale within-field trials to evaluate pesticides would in many cases fail to predict accurately the impact of commercial pesticide management.
Despite these caveats, useful information may still be obtained from half-fields for highly mobile species, such as bees and butterflies, as long as direct inferences concerning abundance are not made from counts. Instead, treatment differences relate to foraging preferences towards flowering plants. These problems of interpreting data concerning bees, butterflies and, to a lesser extent, some carabids must be seen in the context of the ecology of the taxa studied, relative to the treatments imposed. Direct effects of herbicide management regimes are most likely to impinge on vegetation; effects on invertebrates will probably be indirect.
Care must be taken to avoid interference between experimental units that are close together, for example from spray drift. Here, the separation distances, of 50 m for rape and maize and 6 m for beet, between half-field units help to minimize problems. Any chosen design would have to attempt to match field-margin biodiversity between experimental units. Such margins are important habitat in arable ecosystems as reservoirs for plants and overwintering sites for insects, cover and food for birds, and may affect invertebrate distributions (Lewis 1967).
The FSE aims to compare GMHT and conventional varieties of each of the four crops grown in realistic commercial conditions, which might favour the use of whole-fields. Against this was the practical issue that, in the pilot year, too few candidate fields were available to choose pairs sufficiently well matched for previous management and cropping history; this strongly favoured the use of half-fields. Half-fields also greatly reduce the sampling effort, as recorders travel less to collect data. Accuracy might be improved if there is less time pressure; experience during the pilot year revealed this as an important consideration at particular times of the year when sampling overlapped between taxa.
Unfortunately, very few data exist on the relative variability between whole-fields within farms and that between half-fields within whole-fields. Surveys have been used to assess the environmental effects of intensive agriculture within the UK for decades (Potts & Vickerman 1974) but designed experiments are relatively recent and lack adequate replication of realistic-sized units (Sotherton, Jepson & Pullen 1988; Aebischer 1990; Perry 1997; Moller & Raffaelli 1998; Raffaelli & Moller 2000). Lennon (1998) listed nine recent European projects on integrated pest management and noted that each suffered difficulties with inference that resulted from either inadequate replication or complications due to crop rotations. Unfortunately, the crops studied in the well-designed MAFF LINK Integrated Farming Systems Study (Ogilvy et al. 1995) were largely different to those of the FSE. However, some data from the Game Conservancy and Allerton Research and Educational Trusts (Boatman & Brockless 1998), from up to five winter oilseed rape fields on the demonstration farm at Loddington, Leicestershire, UK, provided information on components of variation (Perry 1989) within eight abundant suction-sampled invertebrate groups. Some fields were halved, yielding information from 1994 to 1996 on between- and within-field variation, that could be used to compare the likely efficiency of half-field and paired-field designs. The variability of paired-fields was often similar to that for half-fields, but sometimes, especially during 1995, was much greater (Fig. 1). It was not possible, due to constraints of proper randomization and insufficient replication, to use data from the FSE pilot year (1999) to inform the choice of design, although an informal inspection suggested that half-fields were inherently less variable than paired-fields.
Figure 1. Comparison of estimated coefficients of variation (CV) between half- and paired-fields from 1994 to 1996 from Loddington Farm. The data from the Allerton Project (Boatman & Brockless 1998), run by the Game Conservancy Trust for the Allerton Research and Educational Trust, were supplied by Dr Nicholas Aebischer (Game Conservancy Trust). Symbols represent annual values for the eight most abundant invertebrate groups in suction samples: Collembola (C), aphids (A), Homoptera (H), Thysanoptera (T), parasitoids (P), staphylinid larvae (S), Coleoptera adults (C) and Coleoptera larvae (L).
The final choice of a half-field design was based on the availability of fields, the associated difficulty of obtaining suitably matched paired fields, the probable major effect of herbicide being on weeds rather than invertebrates, the need to reduce variability, and the efficiency gains in sampling effort. The choice was made with the proviso that half-fields should fall within the range of field sizes used commonly for each crop, and should not compromise realistic growing conditions.
farm and field selection: representativeness and range
An important requirement of the FSE is that its results should apply to the British agricultural ecosystem and landscape as a whole. This raises the question of the representativeness of the farms included and the issue of farm and field selection. For example, it would be unsatisfactory if there were no fields within the FSE growing spring oilseed rape in Scotland, where a large acreage of the crop is grown. For the pilot study, fields came from a limited self-selected set of growers who were willing to grow GMHT crops. Within the FSE proper, the issue of representativeness was addressed by attempting to select fields that encompassed the full range of variation, in various variables, likely to be found in commercial practice. The current status within Britain for each crop was summarized with regard to its geographical distribution, usual agronomy, soil types and field sizes. This profile was then compared with more detailed information on specific candidate farms and fields, obtained from a questionnaire issued by the consortium to each grower who expressed interest in taking part (Firbank et al. 2003). Estimates were made of the intensiveness of the grower's inputs and the extent to which the farmers managed their land in ways that might favour biodiversity. Potential growers required early notification of whether their farm was selected, so a sequential approach was used to monitor the structure of the sample in terms of geographical spread, intensiveness and biodiversity, and to identify underrepresented strata.
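The sequential monitoring described above can be pictured as a running tally of selected farms by stratum, checked against a minimum wanted for each stratum. The following is a minimal sketch only: the stratum keys (`region`, `intensity`), the target numbers and the function name `underrepresented_strata` are all hypothetical and are not taken from the FSE selection procedure.

```python
from collections import Counter


def underrepresented_strata(selected_farms, target_minimum):
    """Return the shortfall for each stratum that is still below its target.

    selected_farms: list of dicts with hypothetical keys 'region' and 'intensity'.
    target_minimum: dict mapping (region, intensity) -> minimum number of farms wanted.
    """
    # Counter returns 0 for strata with no selected farms yet.
    counts = Counter((f["region"], f["intensity"]) for f in selected_farms)
    return {
        stratum: minimum - counts[stratum]
        for stratum, minimum in target_minimum.items()
        if counts[stratum] < minimum
    }


# Illustrative data only; these values are not from the FSE.
selected = [
    {"region": "England", "intensity": "high"},
    {"region": "England", "intensity": "high"},
    {"region": "Scotland", "intensity": "low"},
]
targets = {
    ("England", "high"): 2,
    ("England", "low"): 1,
    ("Scotland", "low"): 2,
}
shortfall = underrepresented_strata(selected, targets)
```

Re-running such a check each time a grower is accepted would show which strata still need farms, so that later offers could be directed towards them.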
The approach in the FSE was not to sample farms in proportion to their frequency of occurrence according to some factor. For example, low-intensity farms are relatively rare but they may contribute proportionately more to biodiversity than intensively managed farms (Watkinson et al. 2000). The consortium sought to include a disproportionately large sample of such low-intensity farms. Analyses will seek to identify a possible interaction between the treatment effect and intensity, for which there are ample degrees of freedom available.
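One simple way in which such an interaction could be examined in a paired design is to regress the within-field treatment difference on a measure of intensity. This is a sketch under assumed notation, not the analysis specified for the FSE: $d_i$ is the difference in response (GMHT minus conventional) between the two half-fields of field $i$, $x_i$ is a hypothetical intensity score for that farm, and $\varepsilon_i$ is residual error.

```latex
\[
  d_i = \alpha + \beta x_i + \varepsilon_i, \qquad i = 1, \dots, n .
\]
```

In this formulation $\beta = 0$ corresponds to no interaction between treatment and intensity. Because the variance of the least-squares estimate of $\beta$ is inversely proportional to $\sum_i (x_i - \bar{x})^2$, including a disproportionate number of rare low-intensity farms widens the spread of $x_i$ and would tend to improve the precision with which $\beta$ is estimated.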
Note that the randomization to half-fields within each field is distinct from the ability to scale-up from the experiment to some wider population, which requires that the experimental units within fields, and the fields themselves, must be representative. The larger the pool of farms, the more likely it was that a suitable set of farms could be selected. However, there is no requirement per se for such selection, either to ensure validity of the statistical test or for the ability to scale up.
Rather than the statistical tests of the null hypothesis, other approaches are to extrapolate the results of the FSE through explanatory, mechanistic modelling (Firbank & Forcella 2000; Watkinson et al. 2000) or multivariate community-based analysis; such work is not considered here.
choice of boundary to halve field and treatment randomization
Randomization of allocation of the GMHT and conventional varieties to the two halves of the field safeguarded against selection bias, for example GMHT crops being allocated to the weedier half of the field. It also provided statistical validity for the test of the null hypothesis, and for the estimates of the precision of the magnitude of any differences, and it allowed differences detected to be ascribed causally as treatment effects.
The randomization protocol for the trial required a structured dialogue between the recorder from the consortium and the grower, so that the choice of boundary line to halve the field for sowing was made on scientific grounds, not agronomic convenience. The optimum choice of boundary should result in two half-field units as alike as possible over the range of factors that contribute to the variability of wildlife within the field. The protocol also guarded against any preference a grower had for which side of the field should receive the GMHT treatment. Thus, treatment allocation was predetermined by project statisticians who assigned one treatment at random to the label ‘A’ and the other to label ‘B’. This allocation was provided to the recorder (but unknown to him or her) in a sealed envelope. After the boundary line was agreed between recorder and grower, the half-field unit towards the north (for an east–west boundary line) or towards the west (for a north–south boundary line) was labelled as ‘A’, and the other as ‘B’, and drawn on a rough map. The envelope was then opened and the treatments noted on this map. With this auditable procedure none of the recorder, statistician or grower could influence the randomization.
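The three steps of the protocol (sealed allocation, labelling by compass position, opening the envelope) can be sketched in code to show why no single party controls the outcome. This is an illustration under stated assumptions, not the consortium's actual procedure: the function names, the dictionary layout and the use of Python's `random` module are all hypothetical.

```python
import random

TREATMENTS = ("GMHT", "conventional")


def sealed_allocation(rng):
    """Statistician's step: assign the two treatments at random to labels A and B."""
    shuffled = list(TREATMENTS)
    rng.shuffle(shuffled)
    return dict(zip(("A", "B"), shuffled))


def label_halves(boundary_orientation):
    """Recorder's step: a fixed rule labels the halves once the boundary is agreed."""
    if boundary_orientation == "east-west":
        return {"north": "A", "south": "B"}
    if boundary_orientation == "north-south":
        return {"west": "A", "east": "B"}
    raise ValueError("boundary_orientation must be 'east-west' or 'north-south'")


def open_envelope(allocation, labels):
    """Final step: combine the sealed allocation with the labelled map."""
    return {half: allocation[label] for half, label in labels.items()}


# Hypothetical usage: the allocation is fixed before the boundary is discussed.
rng = random.Random()
allocation = sealed_allocation(rng)
labels = label_halves("east-west")
field_map = open_envelope(allocation, labels)
```

The labelling rule is deterministic and the allocation is drawn before the boundary is known, so the treatment received by each half depends only on the random draw and not on any choice made in the field.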
imposition of crop management by growers
In some respects, the FSE experimental design has much in common with on-farm trials carried out by farmers, on their own land, in studies on third-world agriculture (Buzzard 2000). The control crop variety was selected by the farmer according to local conditions, and varied between farms. Both GMHT and conventional systems were managed by growers as closely as possible according to their current commercial practice, although within this constraint management practices were kept as similar as possible. Any pesticide seed treatment was the same on both treatments at a farm. Where non-herbicide treatments were imposed on both GMHT and the conventional varieties, they were applied at the same time unless there was good agronomic reason to do otherwise, for example if there were more pests on one half-field than the other. Growers took usual decisions for weed control on the conventional variety; this might or might not involve the use of consultant agronomists. However, usual practice remains difficult to define for GMHT varieties, because none has yet been grown commercially within Britain. Procedures that ensured that the treatment applied within each management regime was applicable are outlined by Firbank et al. (2003). Such considerations are vital to enable valid inference, and are as important as biometrical issues of design and analysis; no treatment randomization can allow for biases arising from inappropriate management of the GMHT variety. Note that it was possible for there to be no herbicide applications to either half-field unit if, for example, there were no weeds to treat.
Some agronomic practices, such as the increased use of direct drilling or changes to normal rotations, might become associated with GMHT technology if it were commercialized. The FSE cannot, at this early stage in the use of GMHT, evaluate efficiently such events within an experimental framework of imposed treatments. However, the FSE will provide data to parameterize predictive models in which such scenarios may be studied.