1. Metrics have become a standard way of summarizing environmental monitoring results. Different metrics react differently to natural variations and human-induced stressors. We suggest that combined analysis of time trends in selected biological metrics allows identification of biological processes (e.g. individual growth, mortality or recruitment) that have changed (increased or decreased) persistently. Alternatively, time trends in the abundance of sensitive species could indicate changes in environmental stressors.
2. We calculate the joint likelihood of time trends in three metrics and use it to evaluate the evidence in the data for different combinations of metric time trends. A simulation study provides guidelines for interpreting log-likelihood differences.
3. We illustrate the approach for identifying biological process changes for three North Sea fish stocks (cod Gadus morhua, lesser-spotted dogfish Scyliorhinus canicula and whiting Merlangius merlangius) using metrics derived from international bottom-trawl survey data for the period 1997–2008. Over the period, a decrease in recruitment and several simultaneous process changes were most likely for cod, while a recruitment increase, mortality decrease and several process changes were most likely for lesser-spotted dogfish. No significant persistent process changes were found for whiting.
4. Synthesis and applications. The likelihood approach offers a way of combining monotonic time trends in multiple metrics for identifying persistent changes in exploited populations or environmental stressors, given suitable metric time series and tables for interpreting joint time trends. For data-rich fish populations, the proposed method can supplement analytical stock assessments. For many other populations with no fisheries-dependent data, it offers a way to identify population changes, which will be crucial for implementing the ecosystem approach to fisheries management and the European Marine Strategy Framework Directive.
Monitoring of the biotic conditions of ecosystems under human and environmental pressures has a long history both in aquatic and terrestrial systems. Biotic metrics and indices have become the standard approach for summarizing monitoring results on different organizational levels, e.g. de Heer, Kapos & ten Brink (2005), Loh et al. (2005). Temporal changes in these metrics point towards underlying driving stressors, such as productivity decreases leading to recruitment failure or the effects of contaminants and fishing (Sandström et al. 2005). It is important to be able to distinguish between natural environmental variations and human-induced stressors (e.g. chemical pollution, over-harvesting or habitat destruction). This requires using a combination of metrics expected to react differentially to various stressors.
Theoretical and empirical studies have been used for determining the sign and possibly shape of the relationship between biotic metrics and human or environmental stressors. Lenihan et al. (2003) derived the expected direction of change in the abundance of benthic fauna subject to organic enrichment (increased organic carbon) and/or toxic (copper) contamination using designed experiments. These authors found that arthropods and echinoderms decrease when toxic contamination or both stressors increase while annelids increase under organic enrichment on its own or in combination with toxic contamination. Other examples using a correlative empirical approach include studies of the relationships between freshwater fish functional metrics and a human pressure index (Hughes et al. 1998; Pont et al. 2006), benthic meiofauna diversity and heavy metal contamination (Moreno et al. 2008), and fish community indicators and measures of human impact, including fishing effort (Piet & Jennings 2005; Pont et al. 2006).
We reverse the reasoning used to establish these relationships in order to identify changes in environmental stressors or biological processes from time trends in multiple metrics, i.e. a certain observed combination of trends points towards a change in a particular stressor or process. For the benthic fauna example cited above (Lenihan et al. 2003), this means that an increase in organic enrichment over time would be inferred if annelids increased and arthropods and echinoderms showed no trend in a monitoring time series. This approach is referred to as the effects-based approach, in contrast to the stressor-based approach, where all potential stressors would be monitored, e.g. total organic carbon and copper concentration in the benthic fauna example. The question is then how to determine and combine trends for multiple metrics. A common approach is to test for significant linear (or loglinear) trends in each metric separately (e.g. Rochet et al. 2005) or to use correlation analysis for two metrics (Reid et al. 2005; Frank, Petrie & Shackell 2007). Both approaches raise statistical and conceptual concerns: multiple testing, no measure of uncertainty for the selected process changes and, for the regression approach, the assumption of linear trends. Solutions exist for each issue individually. Here, we propose an approach for resolving them coherently in the context of the use of metrics for monitoring.
Our approach for jointly assessing time trends in multiple metrics and comparing the evidence in the data for different trend combinations (models) is based on the likelihood principle (see overview in Pawitan 2001). We demonstrate the approach by determining whether the available bottom-trawl survey derived population metrics for cod Gadus morhua (Linnaeus), lesser-spotted dogfish Scyliorhinus canicula (Linnaeus) and whiting Merlangius merlangius (Linnaeus) in the North Sea indicate any persistent changes in population processes (individual growth, mortality or recruitment). We compare the results with the information on changes in population processes provided by the standard stock assessments for these stocks.
Materials and methods
Fish populations, data and metrics
Data for cod, lesser-spotted dogfish and whiting from the International Bottom Trawl Survey (IBTS) covering the whole North Sea in the first quarter of each year were extracted from the ICES Datras data base (http://datras.ices.dk/) for the period 1997–2008. The survey design is stratified random (Anonymous 2004) and the same Grande Ouverture Verticale trawl is used by all nations who participate in the survey. We selected a 12-year period to be able to assess whether there are persistent single or multiple process changes. The assumption of persistent changes, i.e. either an increase, decrease or no change of given processes, would not be reasonable for a longer period. The data were used to calculate three metrics: log-transformed abundance (lnN), mean length in the population (Lbar) and the 95th percentile of the length distribution (L0·95). Two length metrics are used because only one of the two metrics might be sensitive to a change in fishing mortality or recruitment. For example, Lbar is expected to track recruitment for a population dominated by recruits, while changes in fishing mortality might measurably affect only L0·95. For this study, we set the precision of all metric estimates equal to a coefficient of variation of 3%.
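As an illustration of the three metrics, the following minimal Python sketch computes lnN, Lbar and L0·95 for a single survey year. It is not the authors' code: the function name, the input layout (a plain list of individual lengths plus one abundance index) and the interpolation rule for the percentile are assumptions made for the example, and real survey data would first need raising to numbers-at-length per stratum.

```python
import math


def population_metrics(lengths_cm, abundance_index):
    """Return (lnN, Lbar, L0.95) for one survey year.

    lengths_cm: individual fish lengths (hypothetical input layout).
    abundance_index: positive survey abundance index for that year.
    """
    if not lengths_cm or abundance_index <= 0:
        raise ValueError("need at least one length and a positive abundance")

    ln_n = math.log(abundance_index)            # log-transformed abundance
    l_bar = sum(lengths_cm) / len(lengths_cm)   # mean length

    # 95th percentile by linear interpolation between order statistics
    # (assumed rule; the source does not say which estimator was used).
    ordered = sorted(lengths_cm)
    pos = 0.95 * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    l_95 = ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return ln_n, l_bar, l_95
```

Applying the function year by year would give the three time series that enter the trend analysis.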
Using basic population ecology, we first derived the expected directions of change in each of these metrics in response to changes in total mortality (Z), individual growth (g) and recruitment strength (R). We then reversed the interpretation and created a table in which a given process change is identified by unique combinations of the time trends of the three metrics (Table 1). This table will be used for interpreting joint metric trends in terms of underlying process changes. Note that certain time trend combinations cannot be linked to single process changes. In such cases multiple process changes, or other process changes, are assumed to have occurred.
Table 1. Combination of time trends in three population metrics corresponding to distinct models (Mx) suggesting changes in individual growth g, recruitment R, adult total mortality Z or multiple processes mu
↗↘, monotonic time trend in direction of arrow; ↔, no trend; i, increase; d, decrease. Metrics: L0·95, 95% length percentile; lnN, log-transformed population abundance; Lbar, mean length.
Method for calculating likelihood values
The most likely process change is taken to be the one with the highest support from the available data. The likelihood of a process change is the sum of the likelihood for all trend combinations indicating that process change in Table 1. For example, the top left hand cell in Table 1, which corresponds to an increase in all three metrics, lnN, Lbar and L0·95, is interpreted as providing evidence for a reduction in total mortality (Zd). The likelihood for this cell is the joint likelihood that each metric increased over the study period. Two additional cells also indicate a decrease in total mortality. Hence, the overall likelihood for a decrease in total mortality is the sum of the likelihood of the three cells containing Zd (cells M1, M2 and M10 in Table 1).
Four steps are required to calculate the likelihoods: (i) standardization of the metrics; (ii) time trend fitting; (iii) likelihood calculation for each time trend combination; and (iv) summation across combinations for overall evidence regarding persistent changes in processes.
Standardization of metrics. Each time series of estimates, I = (I1, …, IT) is standardized (normalized) by removing the mean and dividing by the empirical standard deviation of interannual variation to facilitate the contribution of the different metrics to the joint likelihood; metrics standard deviations s = (s1, …, sT) are standardized by dividing by the same interannual standard deviation.
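The standardization step can be sketched as follows. This is a minimal illustration assuming that the "empirical standard deviation" is the sample standard deviation with n − 1 in the denominator; the function and variable names are hypothetical.

```python
import statistics


def standardize(estimates, sds):
    """Standardize a metric time series and its standard deviations.

    estimates: yearly metric estimates I_1..I_T.
    sds: yearly standard deviations s_1..s_T of those estimates.
    Returns (standardized estimates, standardized standard deviations s*).
    """
    mean = statistics.fmean(estimates)
    # Interannual standard deviation (n - 1 denominator is an assumption).
    sd_interannual = statistics.stdev(estimates)
    if sd_interannual == 0:
        raise ValueError("series is constant; cannot standardize")

    z = [(x - mean) / sd_interannual for x in estimates]
    s_star = [s / sd_interannual for s in sds]
    return z, s_star
```

Dividing the standard deviations by the same interannual standard deviation keeps the relative precision of each yearly estimate, which the second fitting constraint later relies on.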
Time trend fitting. Three types of time trends need to be fitted to each of the standardized metrics: an increasing trend, a decreasing trend and no time trend. A horizontal line is taken to represent no time trend. As the metrics are standardized, this line has intercept and slope zero. Increasing and decreasing trends can have different shapes. As we are not interested in any particular shape, a monotonically increasing (decreasing) smooth function is fitted following the methods described in Wood (1994) and using the mgcv package in R (R Development Core Team 2008). In practice, we first fit an unconstrained generalized additive model (GAM), modelling I as a function of time t
$$I_t = f(t) + \varepsilon_t = \sum_i b_i(t)\,\beta_i + \varepsilon_t = X_t\beta + \varepsilon_t \qquad \text{(eqn 1)}$$

where f(t) is the smooth time trend, bi() are cubic regression splines, β are model parameters and Xt is the design matrix. The model is fitted using penalized quadratic least squares, minimizing

$$\lVert I - X\beta \rVert^2 + \lambda \int f''(t)^2\,dt \qquad \text{(eqn 2)}$$

The second term is the penalty for ensuring smoothness. Estimation of the optimal smoothness parameter λ is carried out by restricted maximum likelihood. In a second step, constrained re-estimation of β is carried out, keeping the value for the smoothness parameter λ as estimated before. Two linear inequality constraints are introduced:
1 Monotonicity: $f'(t) \ge 0$ (increasing) or $f'(t) \le 0$ (decreasing).
2 Significant time change: $f(T) - f(1) \ge a$ (increasing) or $f(1) - f(T) \ge a$ (decreasing), where a = 4 median(s*),
with s* the standardized standard deviations of metric estimates. The second constraint forces the fitted values for the final and first years of the time series to be significantly different if approximate 95% confidence intervals were constructed for each point estimate as $I_t \pm 2 s^*_t$. Quadratic programming is then used to minimize

$$\lVert I - X\beta \rVert^2 + \lambda \int f''(t)^2\,dt$$

with respect to β and subject to the constraints C. The time trend models are fitted assuming iid residuals; a Durbin–Watson test is used to check this assumption.
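To make the two constraints concrete, the sketch below swaps in a simpler technique than the one used in the paper: instead of a penalized spline fitted by quadratic programming, it uses isotonic regression (the pool-adjacent-violators algorithm), which gives the least-squares monotonic step function. The significant-change condition is only checked after fitting, not imposed during it, so this is an approximation of the idea rather than the authors' procedure. All function names are hypothetical.

```python
import statistics


def isotonic_increasing(y):
    """Least-squares non-decreasing fit via pool-adjacent-violators."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def isotonic_decreasing(y):
    """Least-squares non-increasing fit, obtained by mirroring the series."""
    return [-v for v in isotonic_increasing([-v for v in y])]


def has_significant_change(fitted, s_star):
    """Check constraint 2: first and last fitted values differ by at least a."""
    a = 4 * statistics.median(s_star)
    return abs(fitted[-1] - fitted[0]) >= a
```

A fit that fails `has_significant_change` would not qualify as an increasing or decreasing trend under the second constraint; in the paper this is handled inside the constrained estimation itself.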
Likelihood calculation for time trend combinations. The joint likelihood for each of the 27 combinations of metric time trends (denoted M1 to M27 in Table 1) needs to be calculated. For each metric time series, the likelihood of an increasing, decreasing and no time trend (using the three models fitted in the previous section) is calculated. For example, assuming a Gaussian distribution, the likelihood of a monotonically increasing time trend in the indicator time series I is calculated by
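$$L(\beta_{inc}, \gamma_{inc} \mid I) = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\hat\sigma^2_{inc}}} \exp\!\left(-\frac{(I_t - X_t\beta_{inc})^2}{2\hat\sigma^2_{inc}}\right)$$

where βinc and γinc are the parameters for a monotonically increasing time trend and $\hat\sigma^2_{inc}$ is the residual sum of squares of this fit divided by T. The joint likelihood is obtained as the product of the likelihoods for the three metrics assuming that they are independent, e.g.

$$L(M_1) = L_{inc}(\ln N) \times L_{inc}(L_{bar}) \times L_{inc}(L_{0\cdot95})$$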
Independence of residuals across metrics is checked using Pearson’s product moment correlation coefficient.
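The Gaussian likelihood of one fitted trend and the joint likelihood of a trend combination can be sketched as below. The code works on the log scale, so the product across independent metrics becomes a sum. With the variance estimated as the residual sum of squares divided by T, the log-likelihood reduces to −(T/2)(log(2πσ̂²) + 1). Function names are hypothetical.

```python
import math


def gaussian_loglik(observed, fitted):
    """Gaussian log-likelihood of a fitted trend, variance = RSS / T."""
    t_len = len(observed)
    rss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    sigma2 = rss / t_len
    if sigma2 <= 0:
        raise ValueError("perfect fit: variance estimate is zero")
    # -T/2 * log(2*pi*sigma2) - RSS / (2*sigma2), and RSS / sigma2 equals T.
    return -0.5 * t_len * (math.log(2 * math.pi * sigma2) + 1.0)


def joint_loglik(metric_logliks):
    """Joint log-likelihood of one trend combination (independent metrics)."""
    return sum(metric_logliks)
```

For the no-trend model the fitted values would simply be a list of zeros, because the standardized series has mean zero.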
Evidence for persistent process changes. The likelihood of a particular process change for a given population is calculated as the sum of the likelihood values for the appropriate time trend combinations. For example, a persistent decrease in total mortality (Zd) is indicated by M1, M2 or M10 (Table 1). Thus, the likelihood for Zd is estimated as

$$L(Z_d) = L(M_1) + L(M_2) + L(M_{10})$$
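Summing likelihoods that are stored as log-likelihoods is best done with the log-sum-exp trick to avoid numerical underflow. The sketch below shows this for the Zd example (cells M1, M2 and M10); the numeric values used in any call are hypothetical, and only the cell-to-process mapping for Zd is taken from the text.

```python
import math


def log_sum_exp(values):
    """Return log(sum(exp(v))) computed stably."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def process_loglik(cell_logliks, cells):
    """Log-likelihood of a process change as the sum over its table cells.

    cell_logliks: dict mapping cell name (e.g. "M1") to joint log-likelihood.
    cells: names of the cells that indicate the process change.
    """
    return log_sum_exp([cell_logliks[name] for name in cells])


# Cells indicating a decrease in total mortality, as stated in the text.
ZD_CELLS = ("M1", "M2", "M10")
```

Calling `process_loglik(cell_logliks, ZD_CELLS)` then gives log L(Zd) for a dictionary of cell log-likelihoods.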
Comparing model fits
The largest log-likelihood value, lmax = max(log L(Mx)), indicates the best model fit (Pawitan 2001) and the corresponding process change is considered the most likely. However, owing to uncertainty in metric estimates, models with log-likelihood values within a certain distance of the largest value, ΔM = lmax − l(Mx), provide basically the same evidence as the ‘best’ model. A simple simulation study was carried out to obtain guidelines for interpreting ΔM values, similar to what is available for interpreting differences in Akaike’s Information Criterion (e.g. Burnham & Anderson 2003, p. 70). Three 12-year time series were simulated by drawing values from standard normal distributions, linear regressions were fitted to each simulated series and the likelihood of each linear time trend was calculated. Multiplying these likelihood values across the three metrics provided the likelihood equivalent to an entry in Table 1. As generally three cells in the table indicate the same process change, their likelihood values were multiplied, taking into account that each pair of cells has two metrics trends in common. We repeated the process 10 000 times leading to the simulated distribution of likelihood values. A second similar simulation study was carried out to obtain the distribution of likelihood values in the case where only two metrics are used.
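The first part of the simulation study can be sketched as follows: stationary standard normal series are simulated, a linear regression is fitted to each, and the log-likelihoods are added across metrics to give the equivalent of one table cell. The step that combines three cells sharing metric trends is not reproduced here because the text does not specify it in enough detail, so the percentile range from this sketch should not be expected to match the published value. The function names, the seed and the simple percentile rule are assumptions.

```python
import math
import random


def linear_fit_loglik(y):
    """Gaussian log-likelihood of an ordinary least-squares line through y."""
    t_len = len(y)
    t_mean = (t_len - 1) / 2
    y_mean = sum(y) / t_len
    sxx = sum((t - t_mean) ** 2 for t in range(t_len))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((v - intercept - slope * t) ** 2 for t, v in enumerate(y))
    sigma2 = rss / t_len
    return -0.5 * t_len * (math.log(2 * math.pi * sigma2) + 1.0)


def simulate_cell_logliks(n_sim=10_000, n_years=12, n_metrics=3, seed=1):
    """Simulate joint log-likelihoods for independent stationary metrics."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sim):
        total = 0.0
        for _ in range(n_metrics):
            series = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
            total += linear_fit_loglik(series)
        results.append(total)
    return results


def percentile_range(values, lower=0.05, upper=0.95):
    """Distance between two empirical percentiles (nearest lower rank)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(upper * last)] - ordered[int(lower * last)]
```

Setting `n_metrics=2` gives the corresponding sketch for the two-metric case.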
Sensitivity to methodological choices
Several decisions are made when applying the method. The impact of two of these is investigated: (i) the selection of pertinent metrics and (ii) the type of monotonic function used for fitting time trends. The time trend interpretation table (Table 1) contains two length-based metrics for describing population demography, Lbar and L0·95. Whether these two metrics indeed carry independent information was explored by repeating the analysis for two metric couples (Lbar–lnN and L0·95–lnN), using for the interpretation the subsets of Table 1 corresponding to no change for the omitted metric. When fitting time trend models with GAMs, smoothness was imposed through the parameter λ (eqn 2), but otherwise no constraint was placed on the degree of nonlinearity (number of d.f.) of the relationship. To explore the impact of this, the analysis was repeated constraining time trends to be at most cubic (d.f. 3; a linear model has d.f. 1) in addition to being monotonic and having a significant time change (constraints 1 and 2 above).
Results
Interpreting log-likelihood values
The distribution of simulated log-likelihood values using three independent stationary metric time series was rather skewed (Fig. 1). The range between the 5th and the 95th percentiles was 4·8. Thus, it is considered that the data do not provide evidence in favour of a process change in addition to that with lmax unless ΔM < 5. In the case of only two metrics, the threshold is ΔM < 4.
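Applying the threshold amounts to computing ΔM for every candidate process change and keeping those below the cut-off. A minimal sketch, with hypothetical process labels and values in any call:

```python
def supported_changes(process_logliks, threshold=5.0):
    """Return (ΔM per process, processes with ΔM below the threshold).

    process_logliks: dict mapping a process-change label to its log-likelihood.
    threshold: 5 for three metrics, 4 for two metrics (values from the text).
    """
    l_max = max(process_logliks.values())
    deltas = {name: l_max - ll for name, ll in process_logliks.items()}
    supported = sorted(name for name, delta in deltas.items() if delta < threshold)
    return deltas, supported
```

The process with ΔM = 0 is always included; any other process in the returned list is regarded as equally supported by the data.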
Identification of changes in population processes in fish populations
The standardized time series of the three population metrics for North Sea fish populations with fitted time trend models are shown in Fig. 2. No significant first-order autocorrelations were detected in the residuals for cod and lesser-spotted dogfish using Durbin–Watson tests with test level 0·01/9 (with Bonferroni correction) in accordance with model assumptions. In contrast, residuals for whiting from the decreasing and no time trend model fitted to lnN were significantly autocorrelated. Thus, overall autocorrelation in residuals was not an important problem. Only one significant correlation among metrics was found for cod and none for the other two species, supporting the validity of the results.
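The two residual checks can be illustrated with the statistics themselves. The sketch below computes the Durbin–Watson statistic and Pearson's correlation coefficient; it does not compute p-values, which would need tabulated bounds or a statistics library. The reading of the divisor 9 as three metrics times three trend models per species is an interpretation, not something the text states.

```python
import math

# Test level with Bonferroni correction, as quoted in the text.
BONFERRONI_LEVEL = 0.01 / 9


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def pearson_r(x, y):
    """Pearson product-moment correlation between two residual series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

Statistics well below 2 indicate positive autocorrelation and values well above 2 indicate negative autocorrelation.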
The largest log-likelihood values for cod indicated decreased recruitment and multiple changes, and these values differed by <5 (bold entries in Table 2). The log-likelihood values for lesser-spotted dogfish pointed towards a persistent increase in recruitment, a decrease in mortality and multiple changes; these three processes had the largest log-likelihood values, which were about equal in size. The log-likelihood values for most process changes were similar for whiting, providing no evidence for any particular process change.
Table 2. Model fit comparison using log-likelihood differences ΔM = lmax − l(M) for various process changes over the period 1997–2008
Values in brackets are for time trends which are at most quadratic. Processes: g, growth; R, recruitment; Z, total mortality; i, increase; d, decrease. Values <5 are in bold as they provide equal evidence for process changes.
Sensitivity to methodological choices
The analysis was repeated using only one length metric at a time to investigate the impact of the choice of metric (Table 3). For cod, using only Lbar led to conclusions identical to the 3-metric results. In contrast, when L0·95 was used, multiple changes were unlikely for cod. The results for lesser-spotted dogfish and whiting were insensitive to which and how many length metrics were used.
Table 3. Model fit comparison using log-likelihood differences ΔM = lmax − l(M) for various process changes over the period 1997–2008
Processes: g, growth; R, recruitment; Z, total mortality; i, increase; d, decrease. Values <4 are in bold as they provide equal evidence for process changes.
The relative contributions of lnN and the length metrics to the total log-likelihood (Fig. 3) provide information on which metric was most important. The identified decrease in recruitment for cod was due to a large log-likelihood for a decrease in lnN. Similarly, the large log-likelihood for an increase in lnN for lesser-spotted dogfish led to the identification of an increase in recruitment and decrease in total mortality. Thus, lnN was the most important metric because its time trend signal was less noisy compared to the length metrics.
When the model fitting procedure was repeated constraining the nonlinear monotonic functions to be at most cubic, the resulting log-likelihood differences were similar to those obtained with a more flexible smooth (values in brackets in Table 2). This indicates that results were robust to the shape of the monotonic functions.
Comparison with stock assessment results
The assessed cod stock covers the Eastern English Channel and the Skagerrak in addition to the North Sea sampled by the IBTS survey (ICES 2009a). Consistent with the results of this study, estimates of recruitment from the stock assessment for cod showed a decrease at the beginning of the period (Fig. 4). In contrast, estimates of fishing mortality (averaged over ages 2–4, Fbar) and spawning stock biomass (SSB) from the stock assessment decreased steadily with a slight increase at the end, while the metric trend analysis provided no evidence for a decrease in total mortality.
No formal assessment of lesser-spotted dogfish in the North Sea exists owing to lack of data, and a common catch quota is set for skates and rays (ICES 2009b). The ICES (2009b) assessment is therefore simply that ‘abundance and area occupied [of lesser-spotted dogfish] are increasing’. In line with this, the metric trend analysis indicated that increased recruitment and decreased total mortality were likely to have occurred in recent years.
The assessed whiting stock covers the North Sea and the neighbouring Eastern English Channel (ICES 2009a). Stock assessment estimates of SSB decreased, while the estimates of recruitment and fishing mortality returned to their values at the start of the period by its end after having been first higher and then lower (Fig. 4). Concordant with the stock assessment results, the metric trend analysis did not identify any persistent process changes.
Discussion
Different process changes were identified for the three North Sea fish species for the period 1997–2008. The most likely changes for cod were a decrease in recruitment and multiple changes. The metric trends pointed at an increase in recruitment and a decrease in mortality for lesser-spotted dogfish. For both species, the results were rather robust to methodological choices and time trends in lnN with little noise had the greatest influence on the final conclusions. No significant process changes were identified for whiting. Formal stock assessments only exist for cod and whiting. However, survey data were available for lesser-spotted dogfish so that the method could be applied and process changes were identified.
All methods, including traditional stock assessments and the indicator-trend-based approach of this article, depend on data availability and quality and on the validity of model assumptions. In particular, discarding and misreporting can compromise the quality of fisheries-dependent data and consequently stock assessment results. Scientific surveys used for calculating population metrics suffer less from lack of data quality as they adhere to strict survey protocols. However, any changes in survey catchability over time or space can compromise the reliability of survey-derived metrics, as does partial coverage of the area inhabited by any population of interest (Trenkel & Cotter 2009). Subject to these conditions, the method of this article can provide insights into population dynamics in cases where reliable fisheries-dependent data do not exist, which is the case for the majority of fish species in the North Sea and elsewhere. Overall, it can provide an overview of potential process changes for a wide range of species. A potential application is to use the method as a first screening step, and to seek additional information to confirm the diagnosis and consider management actions only for those species where process changes are flagged up by the analysis, instead of having to consider a large number of species. This gain in efficiency will be essential for ecosystem-based management and, for example, for implementing the European Marine Strategy Framework Directive (directive 2008/56/EC). However, lack of evidence for change is not proof of a lack of change, in particular if metric time series are noisy.
A major characteristic of the method is its process orientated multivariate nature. The interpretation of joint time trends as process changes relies on the validity of the indirect inference, in particular the table used for interpretation. The table created here for identifying process changes (Table 1) did not explicitly incorporate well-known effects, such as density-dependent growth or correlations among biological processes. However, all joint and non-listed effects are part of the multiple processes column (Table 2). It might be possible to refine the interpretation of joint time trends and separate out some of these joint and other effects in future research.
The method provides likelihood values for different joint metric trends which are interpreted as process changes, and not only a single best explanation. The meaningfulness of time trends in the context of the measurement uncertainty of each metric is ensured by the second constraint used in the estimation procedure. This overcomes the shortcomings of normalization which standardizes the variance of all metrics series to 1 and thus removes relative differences in the precision of metrics estimates.
The threshold value used for interpreting the log-likelihood differences was derived from a simulation study. It was found that the threshold was somewhat sensitive to the number of metrics used and in particular to the length of the time series (results not shown). This is not surprising given that the time series are short, so sample sizes are small. Further, all metrics were considered independent in the simulations, which would provide conservative results if not true, although in the case study, there was little evidence of violation of the independence assumption (few residual correlations across metrics).
A crucial part of the method is the assumption of monotonic trends which expresses a persistent change in underlying processes and determines the relevant time window for the analysis. Metric trends with a single peak or trough, caused for example by an episodic event such as an exceptionally good recruitment, would indicate that the system returned to its point of departure at the end of the study period and thus no persistent changes occurred. In practice, however, an exceptional recruitment might take several years to work its way through the population, thus blurring the limits between persistent changes and episodic events in survey-based metrics. While time trends have to be monotonic, application of the method does not require them to be fitted using GAMs. In this study, similar results were obtained when constraining the functions to be at most cubic. Thus the actual choice of smoothing function is not crucial. However, it is connected to the time window used for the analysis. The shorter the period, the more reasonable it seems to assume linear trends.
In conclusion, the likelihood method offers a way of combining trends in multiple metrics. The method was demonstrated for identifying changes in population processes but it can just as well be applied to functional groups or communities or any other environmental process, such as the benthic example cited in the ‘Introduction’, as long as an agreed table for interpreting joint time trends and suitable metrics can be defined.
We thank our colleagues involved in the survey programme and Simon Wood for help with using the mgcv package. This work was partially funded by the EU project IMAGE (contract FP6 – 044227). Constructive comments by five referees and the associate editor Andre Punt are gratefully acknowledged.