The highs and lows of herring: A meta-analysis of patterns and factors in herring collapse and recovery

Pacific and Atlantic herring populations (genus Clupea ) commonly experience epi-sodic collapse and recovery. Recovery time durations are of great importance for the sustainability of fisheries and ecosystems. We collated information from 64 herring populations to characterize herring fluctuations and determine the time scales at low biomass and at high and low recruitment, and use generalized linear models and Random Survival Forests to identify the most important bottom-up, top-down and intrinsic factors influencing recovery times. Compared to non-forage fish taxa, herring decline to lower minima, recover to higher maxima and show larger changes in biomass, implying herring are more prone to booms and busts than non-forage fish species. Large year classes are more common in herring, but occur infrequently and are uncorrelated among regionally grouped stocks, implying local drivers of high recruitment. Management differs between Pacific and Atlantic herring fisheries, where at similarly low biomass, Pacific fisheries tend to be closed while Atlantic fisheries re-main open. This difference had no apparent effect on herring recovery times, which averaged 11 years, although most stocks with longer recovery periods had not yet recovered at the end of the observation period. Biomass recovery is best explained by median recruitment and variability in sea surface height anomalies and sea surface temperatures—higher variability leads to shorter recovery times. In addition, the duration of recruitment failure is closely linked with low biomass. While recovery times rely on the nature of the relationship between spawning biomass and recruitment, they are still largely governed by complex and uncertain processes.

or poor landings translate to widespread economic effects, including the collapse of markets based on herring .
Herring collapse may also lead to significant changes in key predator-prey interactions since herring are an important prey linking primary production to the highest level consumers including marine mammals, seabirds and larger fish (Ainsworth, Pitcher, Heymans, & Vasconcellos, 2008;Smith et al., 2011;, although how much predators are truly impacted is debated (Hilborn et al., 2017). The seasonal migration of Pacific herring in particular to spawn in intertidal and upper subtidal water provides an annual pulse of marine nutrients to marine and terrestrial predators (Fox, Paquet, & Reimchen, 2015Willson & Womble, 2006).
Most exploited herring populations collapsed in the 20th century, and overfishing was implicated as the most prevalent cause of collapse (Hay et al., 2001). Many herring fisheries recovered after fishing stopped, and while management approaches have improved, they still cannot anticipate and prevent all causes of collapse. Natural variability for herring is similar to that of other forage fish, which display large and irregular fluctuations attributed to a tighter coupling with environmental variability (i.e. through bottom-up forcing) and where collapses occur naturally (Checkley, Alheit, Oozeki, & Roy, 2009;Peck et al., 2014).
Explaining post-collapse population dynamics is important for understanding recovery. Across all fish taxa, intense overfishing that continues through collapse periods delays recovery (Neubauer, Jensen, Hutchings, & Baum, 2013), but for forage fish, intense overfishing does not explain differences in recovery time (Essington et al., 2015). Furthermore, while some argue that forage fish are more likely to recover than other fish taxa (Hutchings & Reynolds, 2004), others suggest they are more vulnerable to collapse because of schooling behaviour effects on catchability (Pitcher, 1995) and show enhanced sensitivity to environmental variability resulting from heavy exploitation (Essington et al., 2015;Pinsky & Byler, 2015;Pinsky, Jensen, Ricard, & Palumbi, 2011).
Several prominent examples of herring collapse (Hay et al., 2001;Pearson, Elston, Bienert, Drum, & Antrim, 1999)  These herring collapses have shown long recovery times even though fishing was drastically reduced or halted. Their occurrence begs questions regarding the expected recovery times for herring and the intrinsic and extrinsic factors that may control differences in these times amongst herring. Intrinsic factors relating to species biology, such as the age at maturity, growth and body size, are related to generation times (Bjørkvoll et al., 2012;Hsieh et al., 2006;Inchausti & Halley, 2003;Pinsky & Byler, 2015). Extrinsic factors include bottom-up influences on recruitment and growth originating from the physical environment Hay, Rose, Schweigert, & Megrey, 2008;Ito et al., 2015;Williams & Quinn, 2000). Strong top-down influences from predators have also been suggested as important for herring (Moran, Heintz, Straley, & Vollenweider, 2018;Read & Brownstein, 2003;Schweigert, Boldt, Flostrand, & Cleary, 2010;Surma & Pitcher, 2015;Tjelmeland & Lindstrøm, 2005). Finally, just like questions about whether chicken or the egg came first, there has been a long debate about whether low recruitment is a result of low spawning biomass, or low spawning biomass is a result of periods of low recruitment for herring (Gilbert, 1997;Myers & Barrowman, 1996), with more recent evidence backing recruitment-driven biomass in forage fish (Szuwalski et al., 2019;Szuwalski, Vert-Pre, Punt, Branch, & Hilborn, 2015). The nature and more specifically direction of this relationship will have significant implications for expected recovery times.
Here we investigate four key questions about herring collapse and recovery by adopting a comparative, cross-population perspective on herring dynamics. Across herring populations, we ask: (a) To what extent do herring biomass, recruitment, and catch dynamics fluctuate, and how does this compare to other fish species? (b) How often and for how long should we expect low biomass and recruitment to occur for herring? (c) What factors are most important in explaining the frequencies of low biomass and high recruitment? (d) What factors are most important in explaining the durations of periods of low biomass and recruitment? This study is based on the largest compilation of information on Atlantic and Pacific herring (n = 64 populations), which enables us to answer these questions.

| ME THODS AND MATERIAL S
We collected historical records of catches and time series of estimated adult biomass and recruitment for herring populations worldwide and defined collapse and recovery based on trends in biomass.
We used generalized linear models to predict the expected number of "collapsed" years from covariates representing stock-specific biology and population dynamics, environmental conditions, predator trends and fishing histories; and Random Survival Forests (Ishwaran, Kogalur, Blackstone, & Lauer, 2008) to explore whether these same covariates may explain differences in time to recovery.

| Data sources and types
We obtained spawning biomass, recruitment and catch time series from government agencies, public databases and the published literature (Table S1; Trochta & Branch, 2018). The collated data set included 54 spawning biomass, 46 recruitment and 64 catch time series (Figure 1). Data were not available for all fished herring stocks now and in the past (e.g. Barents and White Sea populations;Hay et al., 2001), although the data in this study still have comprehensive global coverage. Furthermore, not all herring stocks have formal stock assessments, and as a result, some biomass time series are raw population survey estimates (n = 27) and others are outputs from stock assessments (n = 27). Since variability characteristics, specifically the spectral frequencies and autocorrelation structure, of survey time series differed substantially from those for stock assessment outputs (see Figure S1), we applied a 3-year moving average to survey data so that survey and stock assessment information show more similar frequencies and autocorrelation for analysis (see Figure   S2). Eight time series had at least one missing year that we interpolated with the moving average. Nine time series missing more than two consecutive years were not interpolated. Assessment estimates were also derived from different modelling approaches (catch-atage analyses, state-space models or virtual population analyses), although there were no noticeable differences in variability of estimates amongst approaches. No catch data are available for Squaxin Pass. For all stocks and types of data, we calculated the minimum of each time series and the largest interannual changes across herring stocks.

| Identifying and characterizing collapse
Population collapses are generally recognized as substantial declines in abundance from some baseline. Fish population collapses are often defined in reference to the biomass at maximum sustainable yield (B MSY ), which provides a defensible theoretical basis and management relevance (Neubauer et al., 2013;Pinsky et al., 2011).
However, estimates of absolute abundance are unreliable (Hilborn, 2002), and when combined with the frequency of large-scale fluctuations and regime shifts in forage fish, make MSY and B MSY difficult to estimate (McClatchie, Hendy, Thompson, & Watson, 2017).
Instead, we focus on relative trends from surveys and stock assessments since these can be more accurately estimated and still provide information on the potential extent and duration of low abundance, assuming consistency in survey and assessment methodology within a time series.
To standardize relative biomass trends, we divided each time series by the mean values falling within the upper 90th percentile of data in that time series. This is preferable to a time series maximum or mean since those are sensitive to the value and frequency of outliers of both low and high abundance. The threshold for low abundance, or collapse, is defined to be 30% of the mean of those follow peak industrial fishing and the implementation of harvest control rules based on stock assessments (Hay et al., 2001). Biomass collapse duration is defined as the number of consecutive years in which biomass is below 30% of Mean High Biomass. Since these durations can be censored if the period of low biomass includes the starting or ending year of the time series, we use a Kaplan-Meier analysis (Kaplan & Meier, 1958) to produce a time-to-recovery curve (or the time to the end of observations, whichever comes first). In this analysis, recovery is when biomass exceeds the collapse threshold. The calculated recovery probability at each observed time interval is the cumulative proportion of stocks collapsed beyond the preceding intervals on the time-to-recovery curve. Only stocks with two or more years of low biomass or low recruitment are included in the analysis to limit the effect of high-frequency variability (i.e. measurement error). Collapse durations are also identified within the 30 most recent observations from each time series.
Recruitment dynamics may closely couple with population collapse and recovery. Consequently, low biomass provides an incomplete characterization of collapse because recruitment failure can underlie prolonged collapse while strong recruitment promotes recovery; low biomass is thus only a symptom. We use a threshold of 50% of Mean High Recruitment with which to determine the frequency of years above (moderate-to-strong recruitment) and the maximum duration below (recruitment failure) this threshold. A Kaplan-Meier analysis is also applied to low recruitment durations to calculate recruitment times to recovery. This Kaplan-Meier analysis is identical to the one outlined in the preceding paragraph.

| Statistically modelling collapse predictors
We used negative binomial linear mixed-effects (NBLME) models, either zero-inflated or not depending on the data (see Supporting Information for more details). Predictor variables are listed in the next section. Random intercepts in the NBLMEs that reflect regional groupings based on management definitions (groupings are colour coded in Figure 1) did not change the results, so we only present results from the NBLMEs without random effects models for simplicity. We test the significance of these NBLMEs using a parametric bootstrapping procedure that simulates zero-inflated counts to which an intercept-only model and each predictor model are fit (see Supporting Information). All models are coded in R 3.3.2 (R Core Team, 2016) using the glmmTMB package (Brooks et al., 2017).
These models were applied in the following ways: 1. To predict the number of years that biomass was below the collapsed threshold (30% of Mean High Biomass in the base case). The model was zero-inflated model.

2.
To predict the number of years of high recruitment (above 50% of Mean High Recruitment). The model was not zero-inflated.
We used Random Survival Forests (Ishwaran et al., 2008) to assess which factors best predict the number of years to recovery from low biomass or low recruitment (see Supporting Information).
Random Survival Forests ranked the importance of various effects and are an adaptation of random forest analysis for survival or event time data. Variable importance (VIMP) was assessed by calculating the out-of-bag prediction error, which in this context was measured using Harrell's concordance index (Harrell, Califf, Pryor, Lee, & Rosati, 1982). Partial plots of predictor effects were gener-

ated by inputting predictor values into the fitted Random Survival
Forest to obtain ensemble estimates (i.e. the average of 20,000 regression trees) of the expected numbers of biomass recoveries or high recruitment events, which is analogous to a cumulative hazard function in survival analysis (Ishwaran et al., 2008). Models were implemented using the randomSurvivalforest package in R (Ishwaran & Kogalur, 2016). These models were applied to the 30 most recent years for each population and address the following problems: 1. Predicting the number of years to recovery from low biomass (falling below 30% of Mean High Biomass).
2. Predicting the number of years to recovery from low recruitment (falling below 50% of Mean High Recruitment).

| Predictor descriptions
We developed and explored a suite of hypotheses for intrinsic and extrinsic factors that might influence the duration of low herring biomass or recruitment as measured by the number of collapsed years and time to recovery. Specifically, we evaluated effects from both bottom-up (oceanographic conditions) and top-down (trends in predator abundance), fishing, and stock-specific traits related to life history and population dynamics. The following factors (bold) were used to represent these effects (in no particular order).
Latitude (°N) of each herring stock's spawning location (Table S2) is included because it has previously explained differences in key population dynamics processes such as spawn timing, maturity and growth of eastern Pacific herring (Hay, 1985;Hay et al., 2008). We assume latitude is a proxy for the climatic gradient in the northern hemisphere.
Freshwater inputs characterize the physical processes and habitat quality of estuaries in which herring spawn and their progeny survive and grow to maturity (Fortier & Gagné, 1990;Hay & McCarter, 1997). Mean freshwater flux (km 3 /year) come from the Global Runoff F I G U R E 1 Herring time series from west (point 1) to east (point 64). Observations of each time series are divided by their respective historical maxima so they are visualized on the same scale. The filtered versions of the spawning biomass (Adults) series are shown for those stocks whose estimates came from surveys. The unique colour-shape combinations indicate the management area or stock complex to which each time series belongs in both the map and time series. The shaded regions represent a reference time frame , since some records start earlier and are "zoomed out" above to show their entirety. The select herring stocks and years used in the analyses of this paper are emboldened in black Data Centre's (GRDC) global hydrological model WaterGAP (Doll, Kaspar, & Lehner, 2003), based on river discharge measurements from the global network of GRDC stations (GRDC, 2014 and high recruits, this was the period following peak spawning at this same location (i.e. the egg-to-larval stages or the "critical period").
SST and SSHA for these locations and times are mapped in Figures   S4 and S5.

Sea surface height anomalies (cm) came from the JPL Physical
Oceanography DAAC (Boulder, 2013;Hamlington, Leben, Strassburg, & Kim, 2014) and are based on satellite altimetry measurements and historical tide gauge data. The resulting data products are weekly imagery composites from June 1950 through 2010 with 0.5° degree spatial resolution. SSHA time series were created by first identifying the 0.5° × 0.5° cell nearest each stock's spawning location (Table S2) (Tables S3 and S4).
Predator trends are included to determine their potential association with herring biomass collapse or recruitment failure.
Predation information (e.g. consumption rates) is sparse although there are some data for key marine mammal and fish predators of herring. Marine mammal trend data were obtained from Magera, Stock Assessment Database (Ricard, Minto, Baum, & Jensen, 2011) for stocks identified to cohabit areas with herring stocks based on area descriptions. Trends in predator populations (slope and 95% confidence intervals) were estimated using robust linear regression ("lmRob" function) in the robust library in R (Wang et al., 2017) with code provided by Magera et al. (2013). The three predictors Catch as a measure of the period of sustained high exploitation.
2. Mean of highest Catch/Biomass ratio is the mean of the three largest ratios of relative catch to relative biomass in each year for the most recent 30 years. A larger mean ratio indicates that catch was high when biomass was low, which is indicative of unsustainable fishing. Zero catch (no. years) is also included for an effect of fishery closures on collapse times, which could associate with more (i.e. low biomass drives decisions to close fishing) or less (i.e. more closures promotes recovery) years of low biomass or recruitment.

Years Catch increased while
First age at maturity is included to reflect differences in regeneration times that is hypothesized to impact the resilience of populations (Table S2).
Log(max catch) (metric tons) or the log of maximum catch is a measure of total population size, that has been shown to be strongly correlated with maximum sustainable yield (e.g. Srinivasan, Cheung, Watson, & Sumaila, 2010), since large populations are able to produce large catches. This is used in preference to estimates of absolute biomass since such estimates are highly uncertain and are unavailable for most populations. The expectation is that larger populations may be more resilient than smaller populations (Table S2).
Differences reflect somatic growth rate and asymptotic weight.
Age 5 is used to standardize mean weights for comparison amongst stocks. This measure may impact population trends through variability in individual body condition, intrinsic population growth rates and size selective predation (Table S2).
CV of biomass over the most recent 30 years is a measure of the variability of relative biomass (biomass divided by Mean High Biomass).

Median relative biomass over the most recent 30 years is included
in the NBLME models and Random Survival Forests predicting recruitment years to evaluate the importance of the relationship between biomass and recruitment. Median R/SSB is defined as the median R/SSB (after standardization as described above) is used to find whether average productivity is well below the peaks or close to the peaks for each population.

| To what extent do herring biomass, recruitment and catches dynamics fluctuate?
Our database reveals highly varied biomass dynamics with dramatic changes in abundance for most stocks (Figure 1). The average CV of relative biomass for 53 herring stocks with time series longer than 10 years was 0.58 (95% confidence interval, CI, 0.30-0.93), compared to a median CV of 0.44 (95% CI 0.11-1.10) for the 307 non-forage fish species in the RAM Legacy database (Table 1; Figure   S3). The average minimum relative biomass was 0.097 of Mean High

| How often should we expect low biomass and high recruitment to occur for herring?
For this analysis, we focused on the 30 stocks that have at least 30 years of biomass estimates and used the last 30 years of each time series (summarized in Figure 2). The median biomass of each stock is correlated with median recruitment (Spearman's ρ = 0.69, p < .001), but not with median relative productivity (R/SSB) (ρ = 0.19). Instead, CV of recruitment is correlated with relative productivity (ρ = 0.61, p < .001). Median catch is also correlated with median biomass, but less so (ρ = 0.40, p = .02), and the correlation is almost significant for Atlantic herring (ρ = 0.45, p = .06), but is significant for Pacific herring (ρ = 0.65, p = .02).
To address the frequency of collapse in biomass, we focused on the lower tails of this distribution (Figure 2). Collapse frequency is defined as the total number of years that biomass is below 30% of  Median time when catch = 0 9 (1-25) -Note: For values derived from distributions, the median is shown first followed by the 95% confidence intervals in parentheses. For entries indicating the number of stocks meeting a certain criterion (e.g. "No. stocks…"), this applies across all years available for a stock so that a single instance when a criterion is met qualifies that stock.

| What factors are most important in explaining the frequencies of low biomass and high recruitment?
The variability seen in biomass collapse frequencies is the result of a complex suite of environmental, fishing and biological factors (Figures 4 and 5). Since many factors are tested which increases the risk of Type I error, we applied the Holm-Bonferroni method (Holm, 1979) to adjust the significance associated with our bootstrapped p-values. Greater frequencies of low biomass were most associated (based on NBLMEs) with lower median recruitment (Holm-Bonferroni p < .01), and secondarily with lower SST standard deviation (Holm-Bonferroni p < .1). We found no other significant associations between low biomass and all other factors (Table 2).
Stock groupings (as colour coded in Figure 1) as a random effect did not impact estimates.
The frequency of high recruitment years was also analysed using NBLMEs (Table 3), which found that across stocks, the number of years with high recruitment is positively associated with median biomass (Holm-Bonferroni p < .01) and negatively associated with biomass CV (Holm-Bonferroni p < .01), and the mean of the highest catch-to-biomass ratio (Holm-Bonferroni p < .05).
These results were further checked with a suite of sensitivity tests to determine if any outlier values had a significant impact on the results (Supporting Information), which marginally changed significance of some of the minor predictors, but had no major influence on the key predictors.

| What factors are most important in explaining the durations of low biomass and recruitment?
For 23 stocks with low biomass (the other 7 of 30 stocks did not have low biomass persisting longer than 2 years), the most important predictors of the duration of low biomass were identified using VIMP in Random Survival Forests analysis (all predictors are shown in Figure 6). These predictors were, from most to least important, SD of SSHA, trend in SSHA, CV of recruitment, median relative recruitment and trend in SST (Figure 7). Other

F I G U R E 2 Box plots of time-series distributions for stocks used in our analysis (at least 30 years of observations). Each time series is
divided by its respective mean high value (i.e. mean of the values in the 90th percentile). Stocks are ordered by their median relative biomass from the largest median at the top to lowest median at the bottom and colour coded by species (Atlantic herring are blue and Pacific herring are orange). The 30% of Mean High Biomass and 50% of Mean High Recruit thresholds used to identify collapse levels are also shown by dashed lines in the relative biomass and relative recruits plots variables were unimportant (within or near the shaded region in Figure 7). The out-of-bag error rate using Harrell's concordance index ( Figure S6) was 0.32, where values <0.5 indicate better predictive accuracy.
The generated Random Survival Forests were then used to determine the partial dependencies, that is how each predictor affected the probability that biomass collapse would last a given number of years. In Figure S8, the probabilities that a stock would be collapsed for 5 or 10 years are shown as a function of each predictor, depicting that collapse probability declines with increasing SD of SSHA, increases with CV of recruitment, decreases with median relative recruitment and has non-linear dependencies with the other two predictors.
For 30 durations of low recruitment (i.e. times between high recruitment events; Figure 8), the most important predictors from the Random Survival Forests were median relative biomass, SD of SSHA, the highest relative catch-to-relative biomass ratio and mean predatory fish trend (Figure 9). The error rate was 0.23. The results were little changed when two outliers were removed.
Partial dependency plots revealed that higher proportions of stocks with low recruitment are associated with lower median relative biomass, lower SD of SSHA, higher catch-to-biomass ratio and lower rates of decline in predatory fish abundance ( Figure S9).
However, low recruitment only strongly depends on these predictors at the lower end of predictor ranges, suggesting that predictor importance relies on a small subset of observations (Fig. S9).

F I G U R E 3
Distributions of the number of years below or above a threshold amongst herring stocks (N = 30). Both the histogram (top) and empirical cumulative distribution (bottom) are shown for (a) the total number of years in which biomass falls below 30% of Mean High Biomass and (b) the total number of years recruits exceed 50% of Mean High Recruitment. Also shown are the frequencies of events with more than X consecutive years of (c) biomass below 30% of Mean High Biomass (N = 23) and (d) recruits below 50% of Mean High Recruitment (N = 30). The minimum period considered was more than 2 years. Since stocks may show multiple periods longer than 2 years below the stated thresholds, each stock's maximum period is considered. For all stocks, only the 30 most recent years of each stock's time series are used to determine the number of years on all plots Sensitivities in the Random Survival Forest analysis were also checked, including removing collinear predictors and re-running on times derived from different collapse thresholds (Supporting Information). While predictive accuracy of the analyses changed, ranking of the key predictors did not which largely upholds the key findings presented here.

| D ISCUSS I ON
Our study characterizes population collapse and subsequent recovery of herring stocks, finding that herring have more extreme interannual swings in biomass and recruitment than non-forage fish species.
For herring stocks, the average duration at low biomass is 11 years,

| Herring experience more population variability than non-forage taxa
Compared to non-forage fish species in the RAM Legacy database, herring biomass drops to lower minima, recovers to higher maxima and exhibits larger maximum interannual increases and decreases. Note: The estimated intercepts (Int.), effect coefficients (Eff.), and the lower (L 95% CI) and upper 95% confidence interval bounds (U 95% CI) on the effect are provided for each model. All variables are scaled by their mean and standard deviation. The p-values are empirically derived, based on the proportion of parametrically bootstrapped likelihood ratios between the full and null models (only the means in the conditional and zero-inflated models with a random-intercept on herring locale) that are as extreme as the observed likelihood ratio. These bootstrapped likelihood ratios derive from converged model fits from the null model (no. sims). Given the large number of predictors, we also apply the Holm-Bonferroni method on the empirical p-values to correct for Type I errors (***, .01; **, .05; *, .1). These observations are consistent with previous meta-analyses examining how clupeids decline and subsequently increase, which attributed their resilience to their short-lived, fast-growing life histories (Hutchings, 2001;Hutchings & Reynolds, 2004). However, these life-history traits were also shown to increase clupeid vulnerability to collapse (Pinsky et al., 2011) and to exacerbate their vulnerability to collapse when overfishing occurs (Essington et al., 2015;Pinsky & Byler, 2015).

F I G U R E 6
Herring are more likely to have very large recruitment events than non-forage fish and overall to display greater variability in recruitment. Strong cohorts (>1 unit of relative recruitment) occur

| Fisheries closed for Pacific herring, but not Atlantic herring
Nearly half of herring stocks experienced very low to no catches during the time frame of analysis, which is a greater proportion than observed in non-forage fish stocks; and half of these herring fisher-

| Half of herring stocks collapsed for a decade or more
Biomass in all herring stocks declined below 30% of Mean High  Note: The estimated intercepts (Int.), effect coefficients (Eff.), and the lower (L 95% CI) and upper 95% confidence interval bounds (U 95% CI) on the effect are provided for each model. All variables are scaled by their mean and standard deviation. Adjusted significance of the p-values using the Holm-Bonferroni method are also indicated (***, .01; **, .05; *, .1).
Abbreviations: SSHA, sea surface height anomalies; SST, sea surface temperature. set. This uncertainty is worrisome given that prolonged collapse times also may include severely contracted geographical ranges and the loss of spawning components (Hay et al., 2001;Melvin & Stephenson, 2006). These biological effects can have severe longterm consequences for fishing fleets , and cultures whose long-standing traditions rely on herring such as roe-on-kelp harvests (Gauvreau et al., 2017;Jones et al., 2017;Menzies, 2016;Thornton et al., 2010).

| Recovery hinges on the link between biomass and recruitment
Distributions of spawning biomass, recruitment, catch and productivity (defined as the relative recruits/spawning biomass, or R/SSB) across stocks reveal no relationship between median spawning biomass and median productivity. Furthermore, the number of low biomass years is best predicted by lower median recruitment and higher recruitment variability, and the duration of low recruitment by low median biomass, while the frequency of high recruitment is best predicted by high median biomass and high biomass CV.
Both high and low stock-recruitment associations have been previously shown for herring (Myers & Barrowman, 1996), and there is recent evidence for statistically significant stock-recruitment relationships in forage fish (Somarakis et al., 2018). Still, other recent studies have shown that cross-correlations are strongest when recruitment leads spawning biomass across stocks (Szuwalski et al., 2015(Szuwalski et al., , 2019 with arguments positing the overwhelming effects of environment and/or life history compared to spawner abundance (sensu Pepin, 2015). Our results do not robustly test one claim over the other; instead, our results suggest a strong association between recruitment and biomass at high and low levels as the primary determinant of collapse times and eventual recovery across herring stocks.

| Catch patterns and recovery
While catches generally track spawning biomass, catches alone are less useful than time series of fishing mortality. However, fishing mortality values are not available for many stocks. Therefore, we developed three proxies for the duration and magnitude of unsustainable exploitation (the number of years in which relative catch exceeded 0.75; years in which catch increased while biomass decreased; and mean of the highest catch/biomass). The linear mixed models and Random Survival Forest both found little relationship between times at low biomass and our exploitation proxies.
In contrast, results of the linear mixed models and Random Survival Forests found that stocks with larger maximum catch-tobiomass ratios in their record were likely to have fewer years of strong recruitment and longer recruitment failure. This connection could reflect that (a) recruitment failure precludes biomass recovery, which is exacerbated by increased exploitation, or that (b) this is the result of recruitment overfishing, as has been previously noted for F I G U R E 9 Metrics for selecting the most important variables from the Random Survival Forest regression on 30 low recruitment periods (low as recruits <50% of Mean High Recruitment). Variables are ordered by their importance in terms of minimal depth (top variables at the bottom) and displayed alongside their respective variable importance (VIMP). Variables near and within the shaded regions are considered unimportant, which is above the median of all minimal depth values and below the absolute value of the minimum VIMP score herring stocks (e.g. Cushing, 1971;Dickey-Collas et al., 2010;Hay et al., 2001). Fishing also mainly continues on Atlantic herring, but usually ceases on Pacific herring at the low relative levels we defined. These continuing catches of Atlantic herring imply a risk of increasing fishing mortality and overfishing especially given the density-dependent catchability that is characteristic of forage fish (Pitcher, 1995). Fishery closures that are more common among Pacific herring may negate this risk; however, closing does not seem to guarantee a speedy recovery since long durations of low biomass still exist for several stocks despite being closed to fishing.

| The importance of oceanographic variability
Variability both in SST and sea surface height anomaly (SSHA) were key predictors of low herring biomass. Greater environmental variability in SSHA and SST were associated with fewer years of low biomass. This makes sense given that higher spawning biomass is driven by occasional large recruitment events, which is in turn driven by recruitment variability. Variability in SSHA is correlated with periods of failed recruitment, and likely also adult mortality and individual growth, given the strong negative correlation between weight-atage 5 and SSHA variability. While the link between environmental and recruitment variability is implicit in hypotheses postulated by other authors (e.g. the "optimal stability window" hypothesis from Gargett, 1997), the correlation between body condition and variability has not been made before in the literature and warrants further investigation into the mechanisms controlling somatic growth and asymptotic size.

| Other factors
We found no significant effects of predator trends on the number of years of low biomass or high recruitment, but larger declines in Pacific fish predators were associated with shorter periods of low recruitment. Inference on this potential relationship is complicated by the spatial context: juvenile herring are generally found within coastally enclosed areas (e.g. sounds, bays, and straits), while our data on  (Pikitch et al., 2012(Pikitch et al., , 2014Smith et al., 2011). Herring have also been identified as a key prey item for pinniped populations throughout the Pacific and Atlantic, including Southeast Alaska (Gende & Sigler, 2006;Sigler, Gende, & Csepp, 2017;Womble & Sigler, 2006), British Columbia (Olesiuk, 1999(Olesiuk, , 2008, Puget Sound (Lance, Chang, Jeffries, Pearson, & Acevedo-Gutiérrez, 2012), the Southern Gulf of St. Lawrence (Hammill, Stenson, Proust, Carter, & McKinnon, 2007), the Gulf of Maine (Overholtz & Link, 2006;Read & Brownstein, 2003) and the North Sea (Sveegaard et al., 2012). A more detailed and thorough investigation of these relationships involving various herring predators is needed, as has been done by .
Factors that were not found to have any effects were latitude, freshwater influx, first age at maturity, age-5 weight and maximum catch. Given the importance of fishing and oceanographic predictors from our analyses, extrinsic factors may determine herring recovery times to a greater degree than the intrinsic stock-specific characteristics considered here.

| Challenges of approach
A variety of assumptions in our approach were explored in more detail in sensitivity analyses (Supporting Information). Chief among these was the choice of a threshold to define low or high levels in biomass or recruitment. Naturally, levels below 30% of the Mean High Biomass may or may not equate to true population collapse in some stocks. Other fisheries meta-analyses have taken a similar approach of defining collapse with a reference point determined from the time series themselves to standardize comparisons (e.g. Essington et al., 2015;McClatchie et al., 2017;Mullon, Fréon, & Cury, 2005;Pinsky et al., 2011;Worm et al., 2009). Applying alternative thresholds did not substantially change our conclusions except for the Random Survival Forest model of low biomass durations. Due to the loss of predictive accuracy, thresholds that were lower (20%) or higher (40%) modified the duration data in a way that made them less informative for our specific objective. For example, changing the threshold to 20% reduced the sample size from 23 to 16 since many stocks did not experience prolonged times below this threshold. Different thresholds in recruitment also did not alter our conclusions.
Testing of the various factors that may explain recovery times is also caveated by the scale with which we conducted this analysis. As with herring predators, ecosystem factors are particularly nuanced because of the time and spatial scales with which herring biological processes and oceanographic variables interact. For example, more localized oceanographic processes may influence herring biomass and/or recruitment than captured in our broad-scale factors so that even if such factors do impact some herring stocks, the cross-stock effect is absent or undetectable. This may explain why we do not find more significant predictors (for more discussion, see Supporting Information). However, small stocks for which only survey estimates are available comprise over half of the biomass data, and they lack stock-specific ecological information needed to develop more appropriate time series for analysis. As a result, our use of broader ecological factors is the best possible given available data, with the caveat that we can only detect effects from the variables for which data are available.

| CON CLUS IONS
The data presented bring together a wealth of herring knowledge in the most comprehensive compilation of herring population dynamic data sets to date, extending far beyond the data contained in the RAM Legacy database . Stock assessment models do not exist for many herring populations in the Pacific, and Pacific herring have much greater representation in our data set. Few other fish taxa have as many records across stocks and years, and thus, this analysis provides the largest available meta-analysis on a single marine fish group. This "treasure trove" of herring data was first glimpsed in Hay et al. (2001), which served as one of the foundations for this paper and its questions.
We found a wide variety in the extent and duration of collapse and time to recovery in herring, implying that any management strategies must be robust to this broad range of possibilities to avoid the risk of losing substocks and eroding the long-term resilience of stock complexes. Timely reductions in fishing effort may counteract initial rapid and drastic declines, allowing some leverage through robust fisheries management (Bakun & Broad, 2003;Essington et al., 2015).
Achieving this timeliness remains an obstacle and requires an ongoing investigation of biology and ecosystem interactions for each stock, since these factors likely change over time. One approach is to more specifically identify smaller scale environmental indicators of impending productivity changes that would allow prompt reductions in harvest rates (Lindegren, Checkley, Rouyer, MacCall, & Stenseth, 2013). However, the success of this approach relies on the accuracy of these indicators in predicting changes (Punt et al., 2013). Novel methods have demonstrated the promise of accurate predictions from environmental indicators (Deyle et al., 2013), and applying such methods to data sets such as ours along with more context-specific environmental variables would be useful.
Our analyses are correlative, highlighting links between factors and recovery times without offering further evidence for specific mechanisms (Pepin, 2015;Williams & Quinn, 2000). Distinguishing recovery probability based on biologically plausible factors is a useful assessment of the stocks most at risk for prolonged recovery when collapsed. Knowing which factors are important for distinguishing recoveries (e.g. "risk factors") amongst stocks can inform comparisons of management procedures amongst management areas. Identifying these risk factors also directs research to diagnose their causal pathways to longer or shorter collapses and recruitment failures amongst stocks. Our analyses demonstrate the potential of these data and similar data sets to address important questions on fish population dynamics. The outcome of such analyses has the potential to inform fisheries management on the types of policies that are best suited to promote faster potential recovery times after population collapse. Punt, T. Essington, and S. Dressel and three anonymous reviewers for invaluable comments on the manuscript.

DATA AVA I L A B I L I T Y S TAT E M E N T
Most of the data underlying this study are available at https://doi. org/10.24431 /rw1k32g. Restrictions apply to the availability of the remaining data, which were used under privacy agreements for this study, and those data may be requested from the data providers identified in Table S1 of the Supporting Information.