Annual variation in breeding success in boreal forest grouse: Four decades of monitoring reveals bottom‐up drivers to be more important than predation

Abstract Knowledge of the temporal variation in reproductive success and its key driving factors is crucial in predicting animal population persistence. Few studies have examined the effects of a range of explanatory factors operating simultaneously on the same population over a long period. Based on 41 years of monitoring (1979–2019), we tested prevailing hypotheses about drivers of annual variation in breeding success in two sympatric species of boreal forest grouse—the capercaillie (Tetrao urogallus) and the black grouse (T. tetrix)—in a 45 km2 boreal forest landscape. From counts in early August, we measured breeding success (chicks/hen) along with potential determining factors. We formulated five main hypotheses on causes of variation (hen condition, chick weather, chick food, predation, demographic characteristics) and derived 13 associated explanatory variables for analysis. We first tested the five hypotheses separately and then used model selection (AICc) to rank the best predictive models irrespective of hypotheses. Lastly, we used path analysis to illuminate potential causal relationships. Barring demographic characteristics, all hypotheses were supported, most strongly for chick food and predation. Among predictor variables, chick food (insect larvae and bilberry fruit crops), vole and fox abundances, the winter‐NAO index, and temperature after hatching, had the strongest effect sizes in both species. Precipitation after hatching had no detectable effect. Model selection indicated bottom‐up factors to be more important than predation, but confounding complicated interpretation. Path analysis suggested that the high explanatory power of bilberry fruiting was due not only to its direct positive effect on chick food quality but also to an indirect positive effect on vole abundance, which buffers predation. 
The two components of breeding success—proportion of hens with broods and number of chicks per brood—were uncorrelated, the former having the strongest effect. The two components had different ecological correlates that often varied asynchronously, resulting in overall breeding success fluctuating around low to moderate levels. Our study highlights the complexity of key explanatory drivers and the importance of considering multiple hypotheses of breeding success. Although chick food appeared to equal or surpass predation in explaining the annual variation in breeding success, predation may still be the overall limiting factor. Comparative and experimental studies of confounded variables (bilberry fruiting, voles, and larvae) are needed to disentangle causes of variation in breeding success of boreal forest grouse.


| INTRODUCTION
Boreal forest grouse (Tetraonidae) have declined through most of Western Europe (Storch, 2007) during recent decades, a decline commonly explained by poorer breeding success (Jahren et al., 2016).
Traditionally, population regulation has been discussed in the context of density dependence: as density increases, factors that depress vital rates become successively stronger, eventually stabilizing population size to fluctuate around an equilibrium level (Lack, 1954; Turchin, 1999; Wolff, 1997). Krebs (2002) argued that, to resolve the ongoing and rather fruitless debate about density-dependent regulation, research should take a "mechanistic" approach, whereby effects of explanatory factors on vital rates are more thoroughly examined. Our long-term study is an attempt to do so. Aiming to identify the main drivers of annual variation in reproduction, we examine how breeding success varied with a range of bottom-up and top-down factors in two species of Eurasian boreal forest grouse, the capercaillie (Tetrao urogallus) and the black grouse (T. tetrix).
Some long-term studies have examined aspects of breeding in birds (e.g., song sparrows (Arcese et al., 1992); Seychelles warblers (Brouwer et al., 2012); acorn woodpeckers (Koenig et al., 2011); white-throated dippers (Nilsson et al., 2011); eiders (Coulson, 2010; Morelli et al., 2021)), but causes of variation in avian breeding success have largely been inferred from short-term, single-factor correlations. In Eurasian boreal forest grouse, no previous study has looked at several potential factors operating simultaneously on the same population over a long period. Here, we examine several prevailing hypotheses, and their associated explanatory variables, based on quantitative measurements of environmental and demographic factors thought to influence breeding success of capercaillie and black grouse. Data were collected on sympatric populations of the two species in a boreal forest landscape in southeast Norway over a period of 41 years (1979-2019). To identify the main drivers, we first tested the predictions of each main hypothesis, after which we compared the best explanatory variables from each hypothesis by means of information-theoretical model selection (AICc) and path analysis.

| The hypotheses
Capercaillie and black grouse are large (hens weigh about 2.0 and 0.8 kg, respectively), ground-nesting birds, widely distributed across the extensive Eurasian boreal biome. Characteristic features of their breeding phenology include mating at leks in early spring, clutches of 7-9 eggs laid by well-camouflaged hens in well separated ground nests and incubated for 3.5-4 weeks. The hens alone rear broods of precocial chicks that feed largely on insect larvae for their first 3-4 weeks and fledge to independence 3 months after hatching.
During this long period of incubation and chick rearing, many environmental factors may influence the number of chicks reared to independence. From research on breeding ecology and population dynamics of boreal grouse, we formulated five main hypotheses (some including sub-hypotheses), deduced associated predictions, and selected potential explanatory variables to test the predictions (Table 1).

| Hen condition hypothesis
Capercaillie and black grouse evolved in northern boreal forests with a continental climate characterized by cold winters and dry snow. During winter, hens subsist mainly on low-quality, arboreal foods, and conserve energy by roosting in snow burrows whenever possible. The species mate in early spring when the ground is partly covered with snow. At this time, hens must rebuild their body reserves for laying eggs and for the energy-draining incubation period. Access to sprouting new ground vegetation is critical to their nutritional status. Poor condition or nutritional stress in this early period hampers breeding performance via low-quality eggs and reduced viability of chicks. It also forces hens to leave their nest more often, or to pursue longer feeding bouts during incubation, thereby exposing themselves and their eggs to predation (e.g., Brittas, 1988; Gregg & Crawford, 2009). We therefore hypothesized that cold, dry winters and early, warm springs should benefit hen condition and hence breeding success.
Large-scale climate indices sometimes predict ecological processes better than local weather statistics (Hallett et al., 2004). One such is the North Atlantic Oscillation (NAO), the difference in sea-level atmospheric pressure between the Azores and Iceland. This strongly influences winter climate over western Europe (Hurrell et al., 2003) and to a lesser extent in summer (Folland et al., 2009).
Effects of hen condition on egg quality and breeding success have received support from indirect evidence on several grouse species (Moss & Watson, 1984; Swenson et al., 1994; Zwickel & Bendell, 1967), plus strong empirical evidence on sage grouse (Centrocercus urophasianus) in USA (Gregg, 2006) and lapwing (Vanellus vanellus) in Sweden (Blomqvist et al., 1997).

KEYWORDS
bird breeding success, boreal forest, grouse, hypothesis testing, information theory, NAO, path analysis, Tetrao

The Hen Condition Hypothesis predicts that breeding success should benefit from cold winter weather with dry snow, and hence a negative winter NAO index, and that it should also benefit from warm weather and early snow-free ground before mating.

| Chick Weather Hypothesis
Newly hatched grouse chicks thermoregulate poorly (Marcström, 1960) and depend on warmth from the brooding hen.
During their first 3-4 days, a yolk sac provides supplemental nourishment that helps regulate body temperature; independent self-regulation is achieved when the chicks are about a week old. For the next month or so, they need especially nutritious food such as arthropods for rapid growth. During cold or wet weather, feeding bouts are interrupted by the need for brooding, which slows growth and weakens the chicks' physical condition, making them more susceptible to starvation and predation. This hypothesis has been supported in several studies (e.g., Ellison & Magnani, 1984; Erikstad & Spidsø, 1982; Marcström, 1960; Moss, 1986; Watson & Moss, 2008).
The Chick Weather Hypothesis predicts that breeding success should benefit from warm weather and suffer from precipitation following hatching.

| Chick Food Hypothesis
The Chick Food Hypothesis comprised two sub-hypotheses, one quantitative and one qualitative. The Food Quantity sub-hypothesis involves larvae of butterflies, moths, and sawflies (Lepidoptera and Hymenoptera) feeding on new leaves of bilberry (Vaccinium myrtillus), a crucial, protein-rich food for chicks during their first few weeks (Picozzi et al., 1999; Savory, 1989; Spidsø & Stuen, 1988; Wegge & Kastdalen, 2008). The abundance of larvae is known to fluctuate markedly between years, presumably due to weather (Reynolds et al., 2007). The idea that the quantity of larval food available to chicks should influence breeding success has previously been supported by both direct (Picozzi et al., 1999) and indirect evidence (Atlegrim & Sjöberg, 1995; Baines et al., 2017).
The qualitative Plant Stress sub-hypothesis stems from White (1984, 1993), who proposed that stressed plants reduce their chemical defenses and so become more susceptible to herbivory. This prompted Selås (1997) to put forward "the mast depression hypothesis," whereby high crops of bilberry fruit reduce chemical defense compounds in bilberry vegetation, making it more digestible for voles and grouse chicks in the following year. Selås, Sonerud, et al. (2011) later added that cold summer weather during a masting year and the year before should accentuate the stress on the plants, making them even more nutritious for voles and grouse in the post-masting year. Selås (2000) presented a positive correlation between bilberry fruit production and capercaillie abundance (but not breeding success) in autumn of the following year.
The Chick Food Hypothesis predicts that breeding success should be positively influenced by (1) abundance of larvae on bilberry plants after the chicks hatch (food quantity), and (2) bilberry fruit production the previous summer, along with a negative influence of summer temperature in the previous 2 years (food quality).

| Predation Hypothesis
Hence, we expected the growth rate (λ) of foxes to track vole abundance, thereby influencing grouse breeding success via a numerical effect (see below) in addition to the functional effect of a dietary shift.
Second, since red fox and pine marten are main predators of boreal forest grouse (Baines et al., 2016; Kauhala et al., 2000; Kurki et al., 1997; Lindström et al., 1994; Marcström et al., 1988), the breeding success of grouse should decrease with increasing densities of these two mesopredators. This sub-hypothesis has not been examined independently of the APH sub-hypothesis (above). We did not have reliable field data on marten abundance, and so our hypothesis refers only to the abundance of red fox (henceforth the Red Fox sub-hypothesis).
Third, Tornberg et al. (2005) predicted that grouse chicks should suffer higher mortality from increased numbers of breeding goshawks (Accipiter gentilis) following peak densities of the main goshawk prey species (grouse, hares, and squirrels [Sciurus vulgaris]); this is the Delayed Raptor sub-hypothesis. Goshawks typically do not breed until 2-3 years of age (Krüger, 2005), so this effect on grouse should be delayed by 2-3 years. We did not have reliable census data on breeding goshawks and so used total grouse density, a major prey group according to this hypothesis, as a surrogate for goshawks. Except for some indirect evidence (Selås & Kålås, 2007; Tornberg et al., 2012), the Delayed Raptor sub-hypothesis has not yet been explicitly tested.
The Predation Hypothesis predicts that breeding success should (1) fluctuate synchronously with the abundance of voles, while varying inversely with (2) the abundance of red fox and (3) the autumn density of grouse 2 and 3 years earlier. It also predicts that the growth rate of red fox should track vole numbers.

| Demography Hypothesis
The Demography Hypothesis comprised two sub-hypotheses.
First, juvenile grouse hens typically rear fewer chicks than older hens, apparently investing less in breeding because clutch sizes are smaller and, in large grouse species, some juveniles do not breed at all (Zwickel & Bendell, 2004). Hence, breeding success varies with the age composition of breeding hens in spring: the Age-dependence sub-hypothesis (Lindström et al., 1997). A large data set on black grouse reproduction has supported this hypothesis (Willebrand, 1992).
Second, in dense populations, individuals compete for resources to get access to optimum feeding and nesting sites. This depresses their physical condition and might also depress their breeding performance: the Density-dependence sub-hypothesis (Blomberg et al., 2017; Lindström et al., 1997). Although several studies have examined the potential role of density-dependence in avian breeding success (e.g., Lack, 1966; Saether et al., 2016), no such study has been reported on boreal forest grouse.
The Demography Hypothesis predicts that breeding success should vary inversely with (1) the proportion of young hens and (2) the density of hens in spring.

| Study area
The study was conducted at Varald State Forest, next to the Swedish border in southeast Norway (60°10′ N, 12°30′ E; Appendix S1: Figure 1). The terrain is gently undulating between 200 and 400 m a.s.l., comprising Norway spruce (Picea abies) and Scots pine

| Data collection
Demographic data were obtained by counting birds in 22 blocks of c. 2 km² each during August using trained pointing dogs (number of blocks, census effort, and sample sizes are given in Appendix S1: Figure 2).
Flushed, fully grown birds were classified to species and sex, while the number of chicks in their broods was counted. Numbers observed per 10 h of sampling provided indices of density, which for hens was used as a surrogate for breeding density in spring. This is reasonable because intervening summer mortality of hens has been negligible in the study area (Wegge & Rolstad, 2011). The proportion of juvenile hens in spring was estimated from the proportion of female chicks in the previous August count (Appendix S1: Sampling).
Any emigration of dispersing female chicks during autumn and early spring was assumed to be balanced by immigrants prior to breeding, as our study area was contiguous with similar habitats outside it.
Dates of peak mating and subsequent hatching were estimated from direct observations at leks, supplemented by monitoring 4-6 leks with remote cameras since 2015.
To test the Hen Condition and the Chick Weather Hypotheses, we downloaded local weather statistics from Kongsvinger meteorological station, at 150 m elevation 25 km from the study area. Initially, we assembled 31 weather variables, reducing these to five during preliminary analyses (Appendix S1: Weather, and Appendix S2: Table 1). Temperatures and snow depth were adjusted for differences in elevations (Wegge & Rolstad, 2017). We used winter NAO as a regional index of winter weather. Spring and summer NAO indices did not correlate at all with breeding success, nor did they explain any significant aspects of the local spring and winter weather data. Thus, they were not included in the final analyses. We used the Northern Hemisphere Temperature Index (NHT, Appendix S1: Weather) to account for a long-term increasing trend in temperature (see Section 2.3.3 and Figure 1d).
For the Predation Hypothesis, we sampled voles in late August to early September by baited snap traps along six transects in grassy and bilberry-dominated habitats. Yearly abundance indices, calculated for each habitat type separately, were expressed as the number of voles captured per 100 trap-nights. Unless specified, the variable Voles is the mean of the two indices (Appendix S1: Predation). Red fox abundance was estimated from different sources: counting of tracks in winter along forest roads and fixed snowmobile routes and ski tracks (19 years, >60 km/year), and local and regional hunting statistics (1979-2019; Appendix S1: Predation).
For the Chick Food Hypothesis, we sampled larvae in late May and June by sweep netting at 6-10 fixed stations in bilberry-rich sites within old spruce-dominated stands. Lepidoptera and Hymenoptera larvae were counted and grouped into three size classes: small (<5 mm in length), medium (5-12 mm), and large (>12 mm). The larval index was expressed as numbers of large and medium sized larvae per 10 sweeps. We assessed the abundance available to chicks by interpolating the indices to 8-10 days after hatching (Appendix S1: Chick food). During the last 17 years, we counted the number of bilberry fruits (berries) in randomly distributed circular plots within bilberry-rich, old forest at fixed sites in August. After correcting for ramet coverage, berry abundance was indexed as numbers per 1 m² of bilberry plants. We also had access to bilberry fruit indices in the study area for the whole period of 41 years, based on newspaper records (Selås et al., 2021). Parallel analyses for the final 17 years, using either our field measurements or the newspaper index, gave very similar results (Appendix S1: Chick food); we therefore used the latter here.

| Statistical analyses
Counts in August included the number of hens with or without broods plus the number of chicks in each brood, which were well-grown at the time of the counts. Our primary measure of breeding success, for each species separately, was the ratio of chicks to hens. This could be broken down into the proportion of hens with brood (brood frequency) multiplied by the ratio of chicks to hens with brood (brood size). In regression models, we used these ratios as response variables, rather than the number of chicks (or broods) with the number of hens (or broods) as offset: the latter would have favored years with larger sample sizes and also put more weight on the more numerous black grouse. There were no correlations between census effort and breeding success, brood frequency, or brood size (Appendix S1: Sampling, and Appendix S1: Figure 2).
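
As a minimal sketch of the decomposition just described, the following Python snippet computes breeding success, brood frequency, and brood size from one year's counts and checks that the first equals the product of the other two. The function name and the counts are hypothetical, chosen only for illustration; they are not data from this study.

```python
def breeding_components(hens_with_brood, hens_without_brood, chicks):
    """Return (breeding_success, brood_frequency, brood_size) for one year.

    breeding_success = chicks per hen
    brood_frequency  = proportion of hens with a brood
    brood_size       = chicks per hen with a brood
    """
    hens = hens_with_brood + hens_without_brood
    if hens == 0 or hens_with_brood == 0:
        raise ValueError("need at least one hen with a brood")
    brood_frequency = hens_with_brood / hens
    brood_size = chicks / hens_with_brood
    breeding_success = chicks / hens
    return breeding_success, brood_frequency, brood_size


# Hypothetical counts: 12 hens with broods, 8 without, 48 chicks in total.
success, frequency, size = breeding_components(12, 8, 48)

# The identity used in the text: chicks/hen = brood frequency x brood size.
assert abs(success - frequency * size) < 1e-12
```

Because the two components multiply, a year with many broods but few chicks per brood can yield the same overall breeding success as a year with few but large broods.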

| Modeling approach
First, we tested predictions from each hypothesis separately by means of linear multiple regressions, with breeding success or its components (brood frequency and brood size) as response variable and measures of each hypothesized cause as explanatory variables. The two grouse species were treated as subjects in a repeated measures analysis of deviance via SAS (ver. 9.1) Proc Mixed, specifying "species nested in year" in the REPEATED statement and "variance components" as the covariance structure. In further analyses, we investigated possible differences between species by including interactions between species (categorical) and explanatory (continuous) variables. Proc Mixed fits models via REML (Restricted Maximum Likelihood) assuming that the response variables, but not necessarily the explanatory variables, are normally distributed. Accordingly, we checked for each species that the distributions of breeding success and its two components did not depart significantly from normal.
Response and explanatory variables were standardized to Z-scores (subtracting the mean and dividing by the standard deviation) to facilitate direct comparisons of effect sizes. Predictions from each hypothesis were directional, and so we denote test results with one-sided type-I error rates <0.05 as statistically significant. Nondirectional hypotheses (e.g., possible differences between species) were tested two-sided. If not otherwise stated, we denote effect sizes β of 0.20-0.25 as weak, 0.25-0.35 as moderate, and ≥0.35 as strong. Throughout, we use effect size in the statistical sense of the standardized regression slope, which does not necessarily imply biological causation.
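
The standardization and the verbal effect-size labels can be sketched as follows. This is an illustrative Python sketch, not the SAS code used in the study. Two details are assumptions: the sample standard deviation (n − 1 denominator) is used, and a slope of exactly 0.25 is assigned to "moderate", since the text leaves both unspecified.

```python
import statistics


def z_scores(values):
    """Standardize a series: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD with n - 1 denominator
    return [(v - mean) / sd for v in values]


def effect_label(beta):
    """Verbal label for a standardized slope, using the thresholds in the text."""
    size = abs(beta)
    if size >= 0.35:
        return "strong"
    if size >= 0.25:  # assumption: the shared boundary 0.25 counts as moderate
        return "moderate"
    if size >= 0.20:
        return "weak"
    return "below the weak threshold"


# Hypothetical yearly values of one explanatory variable.
standardized = z_scores([1.0, 2.0, 3.0])
labels = [effect_label(b) for b in (0.22, -0.30, 0.41)]
```

The label depends on the magnitude of the slope only; whether the sign matches the prediction is handled separately by the one-sided tests.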
Second, we ranked the various hypotheses by combining their standardized slopes, adding those in the hypothesized direction and subtracting the very few that were not. This produced a composite "explanatory value" for each hypothesis. Standard deviations were combined via simulation: from each of the normal distributions described by an estimate and its standard error we picked a random sample and added these together to get an estimated explanatory value. We repeated this 10,000 times and calculated the standard deviation of the 10,000 estimates.
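
The simulation step can be illustrated with a short Python sketch. It assumes that each slope has already been oriented so that a positive value means "in the hypothesized direction", and that the draws for different slopes are independent; the function name, the seed, and the example estimates are hypothetical.

```python
import random
import statistics


def explanatory_value(oriented_estimates, n_draws=10_000, seed=1):
    """Simulate the composite explanatory value of one hypothesis.

    oriented_estimates: list of (slope, standard_error) pairs, where each
    slope is positive if it lies in the hypothesized direction and negative
    otherwise. Returns (mean, standard deviation) of the simulated sums.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        # One random draw per slope from Normal(estimate, SE), then sum them.
        total = sum(rng.gauss(slope, se) for slope, se in oriented_estimates)
        totals.append(total)
    return statistics.mean(totals), statistics.stdev(totals)


# Hypothetical hypothesis with two standardized slopes and their standard errors.
value, value_sd = explanatory_value([(0.40, 0.10), (0.30, 0.10)])
```

With independent draws, the simulated standard deviation approaches the square root of the summed squared standard errors (about 0.14 in this example), so the simulation mainly serves as a convenient, general way of propagating the uncertainty.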
Third, we sought the best predictive models for breeding success, brood frequency, and brood size, respectively, irrespective of hypotheses. Candidate models combined the best explanatory variables from each hypothesis, added in order of their slopes as discovered during hypothesis testing. We ranked models within a group according to AICc and selected the most parsimonious model (ΔAICc ≤2 for each extra parameter) while excluding "uninformative parameters" (Arnold, 2010). Confounding among explanatory variables from different hypotheses complicated the straightforward application of statistical criteria to model selection. Therefore, to disentangle direct from indirect causal relationships, we performed path analyses (Blums et al., 2002; Mitchell, 1992) for each species separately via the SAS Calis procedure. Path analysis required Pearson correlation coefficients as inputs, and these had to be estimated from normally distributed pairs of variables.
We therefore took natural logarithms of the three variables that included a few abnormally large values: voles (adding 0.1 because there were some zeros), foxes, and larvae. This resulted in path coefficients that were smaller than comparable standardized regression coefficients calculated during hypothesis testing. As an informal aid to assessing the relative strength of path coefficients, we estimated (by simulation) conditional one-tailed probabilities p_c that the normal distribution described by each coefficient and its standard error would include zero or a more extreme value.
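
For readers unfamiliar with the AICc ranking used in the model-selection step, the sketch below shows the standard small-sample correction, AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1), and the resulting ΔAICc values. The model names, log-likelihoods, parameter counts, and sample size are hypothetical and are not taken from this study; the sketch also ignores the repeated-measures structure of the actual analysis.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(models, n):
    """Return [(name, delta_aicc), ...] sorted from best (delta = 0) to worst.

    models: list of (name, log_likelihood, k) tuples.
    """
    scored = [(name, aicc(ll, k, n)) for name, ll, k in models]
    best = min(score for _, score in scored)
    return sorted(((name, score - best) for name, score in scored),
                  key=lambda item: item[1])


# Hypothetical candidate set fitted to n = 41 yearly observations.
candidates = [
    ("larvae + bilberry", -50.0, 3),
    ("larvae + bilberry + voles", -49.5, 4),
]
ranking = rank_models(candidates, n=41)
```

In this made-up example the extra parameter raises the log-likelihood only slightly, so the larger model ends up with a higher AICc than the smaller one; a parameter that improves fit this little is the kind of case the "uninformative parameters" criterion is meant to screen out.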

| Choice of variables
Initially we collected 45 possible explanatory variables of which weather data (31) were most numerous (Appendix S2: Table S1).
Some variables were essential to a particular hypothesis, and these were retained for further analysis. There were also subsets of similar variables that represented alternative versions of the same biological phenomenon (e.g., temperature for 2 or 4 weeks after hatching

| Trends and autocorrelations
Two response variables, breeding success and brood frequency, displayed long-term positive trends, especially in capercaillie. Such long-term trends were also present in some explanatory variables, most pronounced for summer temperature. As we were interested in teasing out year-to-year variation, we looked for potential detrending variables. Preliminary analyses showed that Northern Hemisphere Temperature (NHT) and a linear trend based on successive years gave very similar results when used as alternatives.
As they confounded each other when entered into the same model, we chose to use NHT as a detrending variable because it also represented gradually increasing temperature due to global warming (Figure 1d), thereby having physical and biological meaning.
When testing the APH, we detected evidence of a 3-4-year cycle in vole abundance. This raised the specter that correlations between vole abundance and grouse breeding success could be due to similar vole and grouse cycles coinciding by chance. However, the evidence for a similar cycle in grouse breeding success was negligibly small. In addition, APH regressions using AR1 residuals were virtually indistinguishable from ones using the original variables.
Another possible joint autocorrelation structure for response and explanatory variables occurred in the Demography Hypothesis.

| Breeding success, brood frequency, and brood size
During the 41-year study, the breeding success of the two species fluctuated synchronously in a quasiperiodic pattern (Figure 1).
Black grouse had higher proportions of hens with broods and also larger brood sizes, resulting in their average overall breeding success being 55% higher (1.9) than that of capercaillie (1.2) (Table 2a).
In each species, brood frequency and brood size correlated strongly with breeding success; notably, brood frequency explained about twice as much of the variation in breeding success as brood size (Appendix S3: Figure 1a-d). Surprisingly, and importantly, there was no correlation between brood frequency and brood size.
Brood frequency was determined by hens losing entire clutches or broods, and so we could infer the relative importance of complete loss of clutches and whole broods versus partial loss of chicks within hatched broods. The former predominantly represents nest loss and the latter chick mortality. Complete loss of clutches (and broods) contributed significantly more to the total loss than the partial loss of chicks within broods, especially in capercaillie (56 vs 28%) but also in black grouse (44 vs 34%; Table 2b), which is consistent with the finding that brood frequency was better than brood size at explaining variation in breeding success (Appendix S3: Figure 1a-d).
Comparison between and within species of total loss, loss due to complete loss of clutches/broods (i.e., mostly nest loss), loss due to chick mortality in broods with ≥1 chick, and chick mortality sequentially to loss of clutches/broods.
Because brood frequency and brood size were uncorrelated, models were tested with each as separate response variables.

| Hen condition
None of the local winter weather variables (temperature, precipitation, and snow depth) was useful in explaining variations in breeding success (Appendix S4: Table 1a). Even so, breeding success was clearly and negatively associated with winter NAO (Figure 4a), mostly via brood frequency (Table 3). This was quite surprising, because NAO correlated rather strongly with all three local winter weather variables: high NAO values were associated with mild and wet winters with little snow (Appendix S4: Table 1b). In capercaillie, but not in black grouse, warmer spring weather during the 8 weeks before hatching affected breeding success positively (Figure 4b), solely via a moderate effect on brood frequency (Appendix S5: Table 1). There was no discernible effect on breeding success of the date in spring when ground became snow-free (Table 3). Spring and summer NAO indices did not explain any variation in breeding success.

| Chick weather
Warm weather during the 4 weeks following hatching had a moderately positive effect on breeding success, mostly mediated through an effect on brood size (Table 3, Figure 4c). Surprisingly, we found no effect of precipitation on any aspect of breeding success, neither of the total amount of rain nor of the frequency of days with rain during 4 weeks after hatching (Table 3, Appendix S5: Table 1).
We also checked whether breeding success was related to rainfall only during colder weather. However, this was not confirmed, as there was no difference between the birds' response to rainfall at high and low temperatures (species combined, F < 0.01, p = .98), nor was there any difference between the two species in this response (F = 0.28, p = .60; Appendix S4: Figure 1, and Appendix S4: Table 2).
Thus, rainfall had no effect on breeding success irrespective of ambient temperature.

| Chick food
Tests of the chick food sub-hypotheses showed that the abundance of insect larvae influenced breeding success strongly and positively (Figure 4g). In the Plant Stress sub-hypothesis, the effect of bilberry fruiting the previous year was strongly positive (Table 3, Figure 4h), affecting both brood frequency (mostly in capercaillie) and brood size (mostly in black grouse; Appendix S5: Table 1). The effect of previous summer temperature was also in the predicted negative direction, although rather weak, and significant only for brood size (

| Predation
Among the three predation sub-hypotheses, the abundance of voles, a buffer against predation in the APH, had a moderately positive effect on breeding success (Table 3, Figure 4d). It was mainly mediated through a strong effect on brood size, especially in black grouse (Appendix S5: Table 1). As expected, the growth rate of red fox tracked vole abundance (ln Fox λ vs ln Voles[t−1]: r = 0.43, t = 2.95, p = .005). Secondly, the overall abundance of red fox had a moderately negative effect on breeding success, primarily [...] (Table 3, and Appendix S5: Table 1). Thirdly, the density of grouse 2-3 years earlier, a surrogate for the breeding density of goshawks, had a moderately negative effect on breeding success (Table 3, Figure 4f), affecting both the frequency and the size of broods (Appendix S5: Table 1).
Analyses of confounding variables modified these results, see 3.3 below.

FIGURE 3 Correlations between brood frequency (proportion of hens with brood) and brood size (ratio of chicks to hens with brood) in (a) capercaillie and (b) black grouse.

FIGURE 4 Partial residual plots showing the relationships between breeding success and explanatory variables given that other independent variables are controlled for in the model. Capercaillie (CAP: •) and black grouse (BG: ○). Variables controlled for are given in brackets. The Northern Hemisphere Temperature (NHT) is included as detrending covariate in all models. Estimated β-coefficients are shown to the right, with those significant at p < .05 (one-tailed type I-errors in predicted direction) boldfaced. See Table 3 and Supporting Information (Appendix S5: Table 1) for standard errors and more details and differences between species.

| Demography
Contrary to predictions, breeding success tended to increase both with the proportion of young hens and with the total density of hens in spring.

| Ranking of main hypotheses
To rank the hypotheses by importance, we calculated the "explanatory value" of each hypothesis as the accumulated partial regression slopes of the relevant variables (Table 4). The Chick Food and Predation Hypotheses explained breeding success 3-6 times better than the Chick Weather and Hen Condition Hypotheses. In capercaillie, this was mainly due to brood frequency, for which the Hen Condition hypothesis also attained a high value, whereas in black grouse, it was mainly due to brood size.

| Model selection via AICc
Our attempt to construct a set of predictive models, irrespective of hypotheses about biological causation, was bedeviled by confounding between variables, especially those from the Chick Food and Predation Hypotheses. Thus, the model that best predicted breeding success (lowest AICc) included larval abundance, bilberry fruit production the previous year, winter NAO, temperature after hatching, and density of grouse 2-3 years before (Table 5a, (Figure 5a, Appendix S6: Figure 1 and Table 1)

| Hypothesis testing
Our data, collected over 41 years, confirmed most of the predictions based on five current hypotheses about the determinants of breeding success in boreal forest grouse. Only the two sub-hypotheses comprising the Demography Hypothesis were refuted. One, the Age-dependence sub-hypothesis (Lindström et al., 1997), predicted low breeding success in years with a large proportion of first-year hens. In our study area, all radio-tagged capercaillie hens attempted breeding, and although the breeding output of first-year birds was lower than among older hens (Storaas et al., 2000), the difference was small and apparently had little influence on annual variations in average breeding success. In the smaller black grouse, two large datasets have reported contrasting results; whereas Marjakangas and Törmälä (1997) did not find differences in reproductive outputs between yearling and older hens, Willebrand (1992) found much lower breeding success among yearlings. Clearly, our data were not consistent with the prediction of the Age-dependence sub-hypothesis.
Neither was the prediction that high density of hens should lead to lower reproductive output verified, possibly because densities of the two grouse species were below the carrying capacities of their habitats. Studies in Finland (Kauhala & Helle, 2002), NW Russia (Borchtchevski et al., 2003), and Scotland (Summers et al., 2010) all reported slightly higher breeding densities than in our study area, but no study of forest grouse has yet examined the relationship between bird density and carrying capacity.
The Hen Condition Hypothesis gained some support, especially in capercaillie, which experienced higher brood frequencies in warmer springs, confirming the findings of Wegge and Rolstad (2017). In both species, poor breeding with low brood frequencies occurred after winters with high winter NAO indices, which went along with mild and moist local weather. Possibly, by preventing snow burrowing, such weather may have affected thermoregulation and hen condition negatively, thereby causing less investment in breeding. Notably, however, none of the local winter weather variables explained any variation in breeding success: Hallett et al. (2004) and Stenseth and Mysterud (2005) discussed similar discrepancies between effects of regional and local weather, concluding that large-scale weather indices may include biologically influential weather aspects not recorded by standard meteorological measurements.
A few studies have used the NAO index as an explanatory variable in their analyses. In line with our study, Vergara et al. (2012) suspected red grouse in Scotland to be in better condition after winters with low winter NAO indices, as they found males (but not females) to have larger combs the following spring. Barnagaud et al. (2011) found a nonlinear n-shaped relationship between winter NAO and the breeding success of black grouse in the French Alps. Finally, Kvasnes et al. (2014) found a positive correlation between NAO and the breeding success of willow ptarmigan in Norway, although this involved spring and summer NAO indices. Thus, interconnections among large-scale weather indices, local weather measurements, and grouse performance remain unresolved.
The date of snow-free ground, a proxy for the timing of new plant growth in spring, did not seem to influence breeding performance. We were surprised at this, expecting warm spring weather and access to bilberry and early sprouting bog cottongrass (Eriophorum vaginatum) (Odden et al., 2003; Pulliainen & Tunkkari, 1991) to increase chick production. The absence of detectable effects could be explained by NAO and time of snow melt being negatively correlated: positive NAO winters (warm and moist) often were followed by early springs (Appendix S2).

The Chick Weather Hypothesis received some support, but markedly less than Chick Food and Predation. As in other studies (Baines et al., 2016; Ludwig et al., 2010; Wegge & Rolstad, 2017), warm weather after hatching had a positive effect on breeding success. Although less influential than chick food, temperature after hatching was included in four of the six best AIC models of breeding success. In early life, chick foraging bouts are probably longer and more efficient during warm weather, when arthropod foods are more active and more readily detected. Contrary to some other studies (Baines et al., 2016; Coppes et al., 2021; Ellison & Magnani, 1984; Moss, 1986; Storch, 1994; Summers et al., 2004), rainy weather following hatching did not affect breeding success in our study. Nor was there any interaction between temperature and rain: rain had no detectable effect whether it was warm or cold. This was surprising and inexplicable, as the weather regime in our study area does not differ much from those in the aforementioned studies.
The Chick Food Hypothesis was well supported by our data.
Firstly, the prediction that breeding success should be positively related to insect larvae was clearly confirmed. Their wide and irregular fluctuations explained breeding success well in both species. We had expected larval abundance primarily to affect survival of chicks and were therefore surprised that it also affected brood frequency.
Secondly, the positive effect of bilberry fruiting the preceding year supported the Plant Stress sub-hypothesis (the mast depression hypothesis of Selås, 1997). His added prediction (Selås, Sonerud, et al., 2011), that cold summers in a bilberry masting year (and the year before) should accentuate the positive effects, also received some support via a weak effect on brood size. However, serious confounding between bilberry fruiting and vole abundance became apparent when the chick food and predation hypotheses were analyzed in concert (Table 6, see section 4.2). Finally, the positive confounding between bilberry and larvae [...]

Among the Predation sub-hypotheses, the Alternative Prey hypothesis (APH) was well supported by a positive effect of vole abundance on breeding success (but see Table 6 and section 4.2 on confounding). APH has received much support in Scandinavia (section 1.1.4). Secondly, as predicted, the growth rate of foxes tracked vole abundance, and this may also have been the case for other mesopredators such as pine marten and stoat. Hence, the increase in fox numbers mediated by high vole abundances contributed to an increased predation pressure during the low phase of the vole cycle. This moderately negative effect was mainly through effects on brood frequency, indicating more predation on eggs in nests than on chicks in broods, consistent with quite high nest losses recorded in the study area (Storaas & Wegge, 1987).
The Delayed Raptor sub-hypothesis (Tornberg et al., 2005) received some support in the hypothesis-testing, and Grouse (t−23), a surrogate for a delayed numerical response in goshawk, also figured in the three best AICc models with an effect size stronger than those of Foxes and Voles. Although trends and possible cyclicity in breeding success are beyond the scope of this study, we observed that chick production fluctuated in a quasiperiodic pattern with intervals longer than expected solely from the 3-4-year vole cycle (Figure 1a). Such prolonged quasi-cycles of 6-7 years are well documented from Finnish grouse populations (Lindström et al., 1995). Thus, a delayed effect of predation might be at work, resembling the classic Lotka-Volterra predator-prey model (Odum, 1953). Tornberg et al. (2012) provided some evidence of a 2-3-year numerical lag in goshawk territory occupancy from Finland, and Selås and Kålås (2007) reported a weak 2-year lag from southern Norway. Apart from this, no other studies have provided convincing evidence for such a delayed numerical response in goshawk (e.g., Lindén, 1988).
Summing up, we found evidence for several bottom-up and top-down factors influencing breeding success and noted that they fluctuated asynchronously. Furthermore, the two components of breeding success (brood frequency and brood size) were completely uncorrelated and affected differently by the explanatory variables. This resulted in overall breeding success fluctuating around low to moderate levels.

| Confounding effects and biological causation
Notable confounding of key variables became apparent when we tried to select the best predictive models from the full set of explanatory variables via AICc. Foremost was the confounding of Voles by BB (t−1) and Larvae (Table 6). Consistent with APH, voles alone provided the best predictive model for brood size, but confounding eliminated voles from the best predictive models for annual variation in brood frequency and total breeding success.
Naïve reliance on AICc while taking no account of biological insights can lead to misleading inferences about causal relationships (Cade, 2015). Hypothesis-testing relies on biological insight, but heedless reliance on it can also be misleading. For example, the result from model selection that brood frequency was affected more by larval abundance (after hatching) than by predation (largely on nests) makes little biological sense.
Likewise, while the prediction of the Plant Stress sub-hypothesis that previous year's bilberry production should be related to breeding success was clearly verified, path analysis showed that this was due in part to an indirect relationship via voles. It may have been that high bilberry fruit crops increased the overwinter survival of bank voles (Selås et al., 2021), which subsist mainly on bilberry shoots during winter (Hansson, 1983), thereby buffering predation on nests and chicks via APH. Finally, Selås, Holand, and Ohlson (2011) reported no effect of fruiting on the digestibility or N-content in shoots in bilberry plants.
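
The distinction between a direct effect of bilberry and an indirect effect via voles can be illustrated with the usual path-analysis bookkeeping, in which an indirect effect is the product of the standardized coefficients along a path and the total effect is the direct effect plus all indirect effects. The sketch below is not the authors' analysis: the function names and all coefficient values are hypothetical.

```python
def indirect_effect(*path_coefficients):
    """Product of standardized coefficients along one causal path."""
    product = 1.0
    for coefficient in path_coefficients:
        product *= coefficient
    return product


def total_effect(direct, indirect_paths):
    """Direct effect plus the sum of indirect effects over all listed paths."""
    return direct + sum(indirect_effect(*path) for path in indirect_paths)


# Hypothetical standardized path coefficients (placeholders, not study estimates).
bilberry_to_success = 0.30   # direct path: bilberry(t-1) -> breeding success
bilberry_to_voles = 0.50     # bilberry(t-1) -> vole abundance
voles_to_success = 0.40      # vole abundance -> breeding success (APH buffering)

via_voles = indirect_effect(bilberry_to_voles, voles_to_success)
total = total_effect(bilberry_to_success, [(bilberry_to_voles, voles_to_success)])
print(round(via_voles, 2), round(total, 2))
```

With these placeholder values, a sizeable share of the total bilberry effect runs through voles. This illustrates how a variable can predict breeding success well in model selection even when part of its effect is indirect.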

| Predation
The mean number of chicks reared per hen was 1.2 in capercaillie and 1.9 in black grouse. These had mean clutch sizes of 7.1 and 8.2, respectively, so that about 80% of potential recruits were lost during the 4-month nesting and brood rearing period (Table 2b). Breeding success was determined partly by brood size but more by brood frequency, determined largely by red foxes and pine martens taking nests. Although incubating hens are well camouflaged and expose their eggs for only short periods during recesses, more than half of all nesting attempts fail (Wegge & Storaas, 1990). The generally low breeding success in our study area compared with other studies (Baines et al., 2016; Borchtchevski et al., 2003; Kurki et al., 1997; Marcström et al., 1988; Saniga, 2011) went along with rather dense populations of predators, especially red fox.
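
The "about 80%" loss quoted above can be checked from the stated means as 1 − (chicks per hen)/(clutch size). The short sketch below uses only the figures given in the text; it assumes, as a simplification, that every hen lays exactly one clutch of mean size, and `proportion_lost` is a hypothetical helper name.

```python
def proportion_lost(chicks_per_hen, mean_clutch_size):
    """Share of potential recruits (eggs laid) not present as chicks in August.

    Simplifying assumption: each hen lays one clutch of mean size.
    """
    return 1.0 - chicks_per_hen / mean_clutch_size


# Means taken from the text.
capercaillie_loss = proportion_lost(1.2, 7.1)
black_grouse_loss = proportion_lost(1.9, 8.2)
print(round(capercaillie_loss, 2), round(black_grouse_loss, 2))
```

Both species come out within a few percentage points of 80%, with the loss somewhat higher in capercaillie than in black grouse.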
Although we found predation to have a marked impact on breeding success, the importance of APH via Voles was lower than expected from previous studies in Scandinavia. Possible reasons are (1) the outbreak of sarcoptic mange in red fox in the late 1980s, which depressed the abundance of foxes for a few years (Lindström et al., 1994; our study), and (2) low abundance of voles and absence of the regular 3-5-year vole cycle during a long period in the early 2000s (Ims et al., 2008; our study).
In Fennoscandia, predation has long been considered an important cause of mortality in boreal forest grouse. Voiced largely by hunter organizations, this view has also been tested in field experiments via intensive predator control (Kauhala et al., 2000; Marcström et al., 1988) or provision of supplemental food (Finne et al., 2019; Lindström et al., 1987), all in the presence of vole cycles. In all four studies, breeding success (but not adult numbers) increased with treatments. Notably, these positive effects correlated with vole abundance, but only when predators were present: when main predators were scarce or absent, breeding success did not decline during the low phase of the vole cycle. Besides providing clear support for the APH mechanism, these studies document the strong effect of mammalian predation on breeding success in forest grouse. Its impact may well be stronger in more northern parts of Fennoscandian boreal forests because vole cycles are more pronounced there, showing larger amplitudes than at lower latitudes where prey are more diverse and abundant (Korpimäki et al., 2005).
It should be noted that, in the present study, we have inferred possible causes of annual variations in breeding success from correlations with year-to-year variations in explanatory variables. This implies that variables with little yearly variation have less explanatory power than variables with large annual amplitudes. Predictions from both the Red Fox and the Delayed Raptor Hypothesis were tested with variables (Foxes and Grouse (t−23)) that varied relatively little over the years of the study, thereby reducing their statistical power to detect possible effects (see Figure 2c and d). Thus, although the role of predation in explaining annual variation in breeding success appeared rather modest, predation may still be the overall most important factor reducing nesting success and chick survival, as previously documented in our study area (Wegge & Kastdalen, 2007; Wegge & Storaas, 1990).

| Breeding success and population regulation
We detected no density-dependent effect on breeding success.
The two grouse populations may have been regulated by density-dependent losses of adults or juveniles rather than of eggs and chicks. If so, this is in line with the general conclusion reached by Saether et al. (2016) from the population dynamics of 13 bird populations, supporting the population regulation hypothesis launched in the 1960s by David Lack (1966). Another generalization is that, although environmental stochasticity (mainly in food and weather) impacts all vital rates, it most strongly affects temporal variation in breeding success and recruitment (White, 2008). Hence, variation in breeding success should be determined primarily by environmental stochasticity rather than density dependence.
In populations where immigration and emigration cancel each other, variations in adult population size depend on adult mortality plus juvenile recruitment, the latter varying with breeding success and overwinter survival of fledged chicks (Moss & Watson, 2001).
As a general rule, avian breeding success varies independently of density. However, exceptions have been reported, even within the same species. In song sparrows (Melospiza melodia), for example, Arcese et al. (1992) showed that density-dependent reproductive success and density-dependent recruitment of juveniles each operated to regulate population size in a sequential manner. In a different population of the same species, Chase et al. (2005) found that adult density was related to rainfall-associated, density-independent variations in reproductive success, plus density-dependent losses of adults in the previous year. In our study area, despite a long-term increase in density-independent breeding success, adult numbers of the two grouse species have changed very little (Wegge & Rolstad, 2011). This suggests that losses during juvenile recruitment and among adults may have increased in a density-dependent fashion. However, since this scenario has not yet been examined statistically, we present no direct evidence of this.

[...] relates to the rather strong and consistent effect size of the winter NAO index. Since we did not find any direct effects of local winter weather variables, the ecological mechanism explaining this correlation remains unknown. Finally, a possible delayed predation effect from goshawks is at best tentative, since we did not have reliable data on goshawk numbers.

ACKNOWLEDGMENTS
In the early years, numerous students and volunteers assisted in various phases of the fieldwork, especially with radio tracking, bird censusing, and sampling of voles and insect larvae. Kongsvinger Hunting and Fishing Club participated in the annual bird counts during the last two decades; special thanks to the late Bjørn Bergersen for organizing this locally and recruiting members with good pointing dogs. We thank Vidar Selås for sharing his time series of bilberry masting derived from newspaper reports. Thanks also to Thomas Riecke and three anonymous referees for commenting on various drafts of the paper. Funding was provided by the Norwegian Ministry of Climate and the Environment, the Norwegian Research Council, the Norwegian Environment Agency, and T. Gotaas Fund.

DATA AVAILABILITY STATEMENT
Relevant data files are archived in the Dryad Digital Repository: https://doi.org/10.5061/dryad.931zcrjpb.